Go Gin Source Learning

Posted by dannyz on Fri, 17 May 2019 03:01:50 +0200

Radix tree

This time we look at routing in Gin. When reading the source code, routing stands out as one of Gin's signature features, and the radix tree underneath it is what keeps route lookup fast. Because the routing code is relatively independent and self-contained, it is worth studying on its own.
The first thing we need to understand is the radix tree. Baidu Encyclopedia has an entry explaining it, and one of its diagrams gives a more intuitive view of how the data is stored.

A radix tree is a compressed prefix tree. Any node that is the only child of its parent is merged with that parent. A radix tree can be used to build an associative array.
As the diagram shows, the data structure extracts each shared prefix into a parent node and stores the remaining suffixes as its children.
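
To make the idea of "extracting the shared prefix" concrete, here is a minimal sketch of the comparison a radix tree performs when it decides where to split. The function name commonPrefixLen is my own and does not appear in Gin.

```go
package main

import "fmt"

// commonPrefixLen returns the number of leading bytes shared by a and b.
// The name is hypothetical; it only mirrors the comparison loop a radix
// tree runs when it looks for the point at which two keys diverge.
func commonPrefixLen(a, b string) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	i := 0
	for i < n && a[i] == b[i] {
		i++
	}
	return i
}

func main() {
	a, b := "/support", "/search"
	i := commonPrefixLen(a, b)
	fmt.Printf("shared prefix: %q\n", a[:i])               // "/s"
	fmt.Printf("remaining: %q and %q\n", a[i:], b[i:]) // "upport" and "earch"
}
```

The shared part becomes the parent node, and the two remainders become its children.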

Application of the radix tree in Gin

As described above, the radix tree is a prefix tree, and the diagram shows how it stores data. How does Gin use it? An example makes this concrete:
router.GET("/support", handler1)
router.GET("/search", handler2)
router.GET("/contact", handler3)
router.GET("/group/user/", handler4)
router.GET("/group/user/test", handler5)
The final memory structure is:

/ (handler = nil, indices = "scg")
    s (handler = nil, indices = "ue")
        upport (handler = handler1, indices = "")
        earch (handler = handler2, indices = "")
    contact (handler = handler3, indices = "")
    group/user/ (handler = handler4, indices = "t")
        test (handler = handler5, indices = "")

The router registered five routes with the GET method, and the tree above shows how they are actually stored. I deliberately listed the handler and indices of each node. indices is a string that holds the first character of every child node, in the same order as the children. Why highlight this field? Because to find out whether a child matches the next part of a path, you do not need to inspect each child node; you only need to scan this string. This also improves efficiency somewhat.
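
The lookup through indices can be sketched roughly as follows. miniNode and childFor are names I made up for this illustration; Gin does the same scan inline inside its own node methods.

```go
package main

import "fmt"

// miniNode is a hypothetical, stripped-down node used only for this sketch.
type miniNode struct {
	path     string
	indices  string // first byte of each child's path, in child order
	children []*miniNode
}

// childFor returns the child whose path starts with c, or nil if there is none.
// It scans the indices string instead of inspecting every child node.
func (n *miniNode) childFor(c byte) *miniNode {
	for i := 0; i < len(n.indices); i++ {
		if n.indices[i] == c {
			return n.children[i]
		}
	}
	return nil
}

func main() {
	// The "s" node from the tree above, with its two children.
	s := &miniNode{
		path:     "s",
		indices:  "ue",
		children: []*miniNode{{path: "upport"}, {path: "earch"}},
	}
	if child := s.childFor('e'); child != nil {
		fmt.Println(child.path) // earch
	}
	fmt.Println(s.childFor('x') == nil) // true
}
```

Because indices[i] always describes children[i], the two must be kept in the same order whenever a child is added or moved.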

Source Viewing

Let's first look at the definition of the node struct and how routes are registered. Note the indices field, which was already mentioned above.

type node struct {
    // The URL path fragment stored on this node
    // For example, for search and support above, the common parent node has path = "s"
    // and the two child nodes have the paths "earch" and "upport"
    path string
    // Whether the child of this node is a wildcard node (for example a ":post" segment)
    wildChild bool
    // Node type: static, root, param or catchAll
    // static: an ordinary static node (the default), such as the "s" parent split out above
    // root: the root node of a tree, that is, the first node inserted
    // param: a named parameter node, such as ":name"
    // catchAll: a node that matches everything after "*"
    nType nodeType
    // Maximum number of parameters on the paths through this node
    maxParams uint8
    // Parallel to children: stores the first character of each child's path
    // For example, for search and support, the indices of the "s" node is "ue",
    // meaning two branches whose first letters are u and e
    indices string
    // Child nodes
    children []*node
    // Handler chain registered for this node
    handlers HandlersChain
    // Priority, incremented for every route registered through this node
    priority uint32
}

// GET, implemented by RouterGroup, delegates to handle
func (group *RouterGroup) GET(relativePath string, handlers ...HandlerFunc) IRoutes {
    return group.handle("GET", relativePath, handlers)
}

func (group *RouterGroup) handle(httpMethod, relativePath string, handlers HandlersChain) IRoutes {
    // Compute the absolute path by joining the group's basePath with relativePath
    absolutePath := group.calculateAbsolutePath(relativePath)
    // Merge the middleware already added to the group with the handlers passed in
    handlers = group.combineHandlers(handlers)
    // Call addRoute to register the route
    group.engine.addRoute(httpMethod, absolutePath, handlers)
    return group.returnObj()
}
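
The two preparation steps in handle, joining the paths and merging the handler chains, can be approximated like this. It is only a sketch of the idea: joinPath, combine and the Handler type are hypothetical stand-ins, and Gin's own calculateAbsolutePath and combineHandlers may differ in detail.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Handler is a stand-in for gin.HandlerFunc in this sketch.
type Handler func() string

// joinPath joins a group's base path with a relative path.
// path.Join cleans the result and drops a trailing slash, so the slash is
// restored when the relative path ended with one (assumed behaviour).
func joinPath(base, relative string) string {
	if relative == "" {
		return base
	}
	joined := path.Join(base, relative)
	if strings.HasSuffix(relative, "/") && !strings.HasSuffix(joined, "/") {
		joined += "/"
	}
	return joined
}

// combine builds a new chain: group middleware first, then the route handlers.
// Copying into a fresh slice keeps the group's own slice untouched.
func combine(middleware, handlers []Handler) []Handler {
	merged := make([]Handler, 0, len(middleware)+len(handlers))
	merged = append(merged, middleware...)
	merged = append(merged, handlers...)
	return merged
}

func main() {
	fmt.Println(joinPath("/group", "user/"))      // /group/user/
	fmt.Println(joinPath("/group", "/user/test")) // /group/user/test

	logger := func() string { return "logger" }
	handler4 := func() string { return "handler4" }
	for _, h := range combine([]Handler{logger}, []Handler{handler4}) {
		fmt.Println(h()) // logger, then handler4
	}
}
```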

Next comes the addRoute method, which has a long body. Most of that logic deals with nodes that carry parameters; the real core is fairly small. I have annotated the main logic, which should make it easier to follow. If it still isn't clear, stepping through it in a debugger a few times will help.

func (engine *Engine) addRoute(method, path string, handlers HandlersChain) {
    assert1(path[0] == '/', "path must begin with '/'")
    assert1(method != "", "HTTP method can not be empty")
    assert1(len(handlers) > 0, "there must be at least one handler")

    debugPrintRoute(method, path, handlers)
    // Get the root node of this method's tree; each HTTP method (GET, POST, ...) maintains its own tree
    root := engine.trees.get(method)
    // If there is none yet, create one
    if root == nil {
        root = new(node)
        engine.trees = append(engine.trees, methodTree{method: method, root: root})
    }
    // Actually add the route
    root.addRoute(path, handlers)
}

func (n *node) addRoute(path string, handlers HandlersChain) {
    // Keep the original full path
    fullPath := path
    n.priority++
    // Count the parameters in the path, that is, the occurrences of ':' and '*' (capped at 255)
    numParams := countParams(path)

    // Check whether the tree is non-empty
    if len(n.path) > 0 || len(n.children) > 0 {
    walk:
        for {
            // Update maximum number of parameters
            if numParams > n.maxParams {
                n.maxParams = numParams
            }

            // Find the longest common prefix; the loop is bounded by the shorter of len(path) and len(n.path)
            i := 0
            max := min(len(path), len(n.path))
            // Advance i while the characters match
            for i < max && path[i] == n.path[i] {
                i++
            }

            // If the common prefix is shorter than n.path, split n: the remainder of n.path
            // moves into a new child node, and n keeps only the common prefix.
            // For example, n.path = "romaned" and path = "romanus" share the prefix "roman".
            // The steps are:
            // 1. Create a new child node with path "ed" that copies all attributes of the original n
            // 2. Change the path of n to the common prefix "roman" and set its indices to the first character of the child, "e"
            if i < len(n.path) {
                child := node{
                    path:      n.path[i:],
                    wildChild: n.wildChild,
                    indices:   n.indices,
                    children:  n.children,
                    handlers:  n.handlers,
                    priority:  n.priority - 1,
                }

                // Update maxParams (max of all children)
                for i := range child.children {
                    if child.children[i].maxParams > child.maxParams {
                        child.maxParams = child.children[i].maxParams
                    }
                }

                n.children = []*node{&child}
                // []byte for proper unicode char conversion, see #65
                n.indices = string([]byte{n.path[i]})
                n.path = path[:i]
                n.handlers = nil
                n.wildChild = false
            }

            // The original node n has now been split into two nodes:
            // roman        parent node
            //     ed       child node [0]
            // The incoming route still has to be added under this parent node.
            // The final structure is:
            // roman        parent node
            //     ed       child node [0]
            //     us       child node [1]
            // In some cases the loop has to run again, which is equivalent to recursion:
            // roman
            //     ed
            //     uie
            // If the parent node n already has a child "uie", then "uie" and "us" share the prefix "u".
            // That "u" has to be extracted as a new parent node, so the walk loop runs again on that child.
            // The end result is a three-level structure:
            // roman
            //     ed
            //     u
            //         ie
            //         s
            // The walk loop also runs again when the route contains parameters.
            if i < len(path) {
                path = path[i:]

                if n.wildChild {
                    n = n.children[0]
                    n.priority++

                    // Update maxParams of the child node
                    if numParams > n.maxParams {
                        n.maxParams = numParams
                    }
                    numParams--

                    // Check if the wildcard matches
                    if len(path) >= len(n.path) && n.path == path[:len(n.path)] {
                        // check for longer wildcard, e.g. :name and :names
                        if len(n.path) >= len(path) || path[len(n.path)] == '/' {
                            continue walk
                        }
                    }

                    panic("path segment '" + path +
                        "' conflicts with existing wildcard '" + n.path +
                        "' in path '" + fullPath + "'")
                }

                c := path[0]

                // slash after param
                if n.nType == param && c == '/' && len(n.children) == 1 {
                    n = n.children[0]
                    n.priority++
                    continue walk
                }

                // Check if a child with the next path byte exists
                for i := 0; i < len(n.indices); i++ {
                    if c == n.indices[i] {
                        i = n.incrementChildPrio(i)
                        n = n.children[i]
                        continue walk
                    }
                }

                // Otherwise insert it
                if c != ':' && c != '*' {
                    // []byte for proper unicode char conversion, see #65
                    n.indices += string([]byte{c})
                    child := &node{
                        maxParams: numParams,
                    }
                    n.children = append(n.children, child)
                    n.incrementChildPrio(len(n.indices) - 1)
                    n = child
                }
                n.insertChild(numParams, path, fullPath, handlers)
                return

            } else if i == len(path) {
                if n.handlers != nil {
                    panic("handlers are already registered for path '" + fullPath + "'")
                }
                n.handlers = handlers
            }
            return
        }
    } else { // The tree is empty, so insert the route directly
        n.insertChild(numParams, path, fullPath, handlers)
        n.nType = root
    }
}
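
With the parameter handling and priorities stripped away, the core of addRoute comes down to roughly the following. This is a simplified sketch under my own assumptions: staticNode, insert and dump are hypothetical names, handlers are plain strings, and a duplicate route silently overwrites the old handler, whereas Gin panics.

```go
package main

import (
	"fmt"
	"strings"
)

// staticNode is a hypothetical node for static routes only.
type staticNode struct {
	path     string
	indices  string // first byte of each child's path, in child order
	children []*staticNode
	handler  string // handler name, "" if no route ends here
}

// insert adds path to the tree rooted at n, splitting nodes where needed.
func (n *staticNode) insert(path, handler string) {
	// Empty tree: store the whole path on the root.
	if n.path == "" && len(n.children) == 0 {
		n.path, n.handler = path, handler
		return
	}
	for {
		// Length of the common prefix of path and n.path.
		i := 0
		for i < len(path) && i < len(n.path) && path[i] == n.path[i] {
			i++
		}
		// Split n: the old suffix becomes a child that inherits n's state.
		if i < len(n.path) {
			child := &staticNode{
				path:     n.path[i:],
				indices:  n.indices,
				children: n.children,
				handler:  n.handler,
			}
			n.children = []*staticNode{child}
			n.indices = n.path[i : i+1]
			n.path = n.path[:i]
			n.handler = ""
		}
		// Nothing left of path: the route ends exactly on n.
		if i == len(path) {
			n.handler = handler
			return
		}
		// Descend into the child that starts with the next byte, or add one.
		path = path[i:]
		next := strings.IndexByte(n.indices, path[0])
		if next < 0 {
			n.indices += path[:1]
			n.children = append(n.children, &staticNode{path: path, handler: handler})
			return
		}
		n = n.children[next]
	}
}

// dump prints the tree with one level of indentation per depth.
func (n *staticNode) dump(depth int) {
	fmt.Printf("%s%s (handler = %q, indices = %q)\n",
		strings.Repeat("    ", depth), n.path, n.handler, n.indices)
	for _, c := range n.children {
		c.dump(depth + 1)
	}
}

func main() {
	root := &staticNode{}
	root.insert("/support", "handler1")
	root.insert("/search", "handler2")
	root.insert("/contact", "handler3")
	root.insert("/group/user/", "handler4")
	root.insert("/group/user/test", "handler5")
	root.dump(0)
}
```

Running it with the five routes from the earlier example should print a tree with the same shape as the memory structure listed there.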

// insertChild adds a node; it mainly deals with nodes that carry parameters
func (n *node) insertChild(numParams uint8, path string, fullPath string, handlers HandlersChain) {
    var offset int // already handled bytes of the path

    // Scan the path for segments that start with ':' or '*'
    for i, max := 0, len(path); numParams > 0; i++ {
        c := path[i]
        if c != ':' && c != '*' {
            continue
        }

        // Find the end of the wildcard (the next '/' or the end of the path); a second ':' or '*' inside the same segment is an error
        end := i + 1
        for end < max && path[end] != '/' {
            switch path[end] {
            // the wildcard name must not contain ':' and '*'
            case ':', '*':
                panic("only one wildcard per path segment is allowed, has: '" +
                    path[i:] + "' in path '" + fullPath + "'")
            default:
                end++
            }
        }

        // Check whether this node already has children; inserting a wildcard here would make them unreachable
        if len(n.children) > 0 {
            panic("wildcard route '" + path[i:end] +
                "' conflicts with existing children in path '" + fullPath + "'")
        }

        // check if the wildcard has a name
        if end-i < 2 {
            panic("wildcards must be named with a non-empty name in path '" + fullPath + "'")
        }

        // Named parameter: the route was registered with ':'
        if c == ':' {
            // split path at the beginning of the wildcard
            if i > 0 {
                n.path = path[offset:i]
                offset = i
            }

            child := &node{
                nType:     param,
                maxParams: numParams,
            }
            n.children = []*node{child}
            n.wildChild = true
            n = child
            n.priority++
            numParams--

            if end < max {
                n.path = path[offset:end]
                offset = end

                child := &node{
                    maxParams: numParams,
                    priority:  1,
                }
                n.children = []*node{child}
                n = child
            }

        } else {
            // Catch-all: the route was registered with '*'
            if end != max || numParams > 1 {
                panic("catch-all routes are only allowed at the end of the path in path '" + fullPath + "'")
            }

            if len(n.path) > 0 && n.path[len(n.path)-1] == '/' {
                panic("catch-all conflicts with existing handle for the path segment root in path '" + fullPath + "'")
            }

            // currently fixed width 1 for '/'
            i--
            if path[i] != '/' {
                panic("no / before catch-all in path '" + fullPath + "'")
            }

            n.path = path[offset:i]

            // first node: catchAll node with empty path
            child := &node{
                wildChild: true,
                nType:     catchAll,
                maxParams: 1,
            }
            n.children = []*node{child}
            n.indices = string(path[i])
            n = child
            n.priority++

            // second node: node holding the variable
            child = &node{
                path:      path[i:],
                nType:     catchAll,
                maxParams: 1,
                handlers:  handlers,
                priority:  1,
            }
            n.children = []*node{child}

            return
        }
    }

    // Insert the remaining part of the path and the handlers; if the path had no parameters, offset is still 0
    n.path = path[offset:]
    n.handlers = handlers
}
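
The wildcard scanning at the start of insertChild can be isolated into a small helper. The sketch below is my own simplification: scanWildcard is a hypothetical name, and it only reports whether the segment is well formed instead of panicking the way the listing above does.

```go
package main

import "fmt"

// scanWildcard looks for the first ':' or '*' segment in path.
// It returns the segment, its start index (-1 if there is none) and
// whether the segment is valid: it must have a name and must not
// contain a second ':' or '*' before the next '/'.
func scanWildcard(path string) (wildcard string, start int, valid bool) {
	for i := 0; i < len(path); i++ {
		c := path[i]
		if c != ':' && c != '*' {
			continue
		}
		// Extend the segment to the next '/' or the end of the path.
		valid = true
		end := i + 1
		for end < len(path) && path[end] != '/' {
			if path[end] == ':' || path[end] == '*' {
				valid = false // only one wildcard per path segment
			}
			end++
		}
		if end-i < 2 {
			valid = false // the wildcard has no name
		}
		return path[i:end], i, valid
	}
	return "", -1, false
}

func main() {
	for _, p := range []string{"/user/:name/profile", "/files/*filepath", "/a/:x:y", "/static"} {
		w, start, ok := scanWildcard(p)
		fmt.Printf("%-22s wildcard=%q start=%d valid=%v\n", p, w, start, ok)
	}
}
```

For "/user/:name/profile" this finds ":name" at index 6, which is the point where insertChild cuts the static prefix "/user/" off into its own node.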

Finally, let's look at getValue, the method that finds the handlers for a request path. It is relatively simple, and the comments should be enough to follow it.

// getValue looks up the route that matches path
func (n *node) getValue(path string, po Params, unescape bool) (handlers HandlersChain, p Params, tsr bool) {
    p = po
walk:
    for {
        if len(path) > len(n.path) {
            if path[:len(n.path)] == n.path {
                path = path[len(n.path):]
                // If this node has no wildcard child,
                // look up the first character of the remaining path in indices to find the matching child
                if !n.wildChild {
                    c := path[0]
                    for i := 0; i < len(n.indices); i++ {
                        if c == n.indices[i] {
                            n = n.children[i]
                            continue walk
                        }
                    }

                    tsr = path == "/" && n.handlers != nil
                    return
                }

                // handle wildcard child
                n = n.children[0]
                switch n.nType {
                case param:
                    // An ordinary ':' parameter: find the next '/' or the end of the path to get the parameter value
                    end := 0
                    for end < len(path) && path[end] != '/' {
                        end++
                    }

                    // save param value
                    if cap(p) < int(n.maxParams) {
                        p = make(Params, 0, n.maxParams)
                    }
                    i := len(p)
                    p = p[:i+1] // expand slice within preallocated capacity
                    p[i].Key = n.path[1:]
                    val := path[:end]
                    if unescape {
                        var err error
                        if p[i].Value, err = url.QueryUnescape(val); err != nil {
                            p[i].Value = val // fallback, in case of error
                        }
                    } else {
                        p[i].Value = val
                    }

                    // If the path has not been fully consumed, keep walking
                    if end < len(path) {
                        if len(n.children) > 0 {
                            path = path[end:]
                            n = n.children[0]
                            continue walk
                        }

                        // ... but we can't
                        tsr = len(path) == end+1
                        return
                    }
                    // Otherwise the path is fully consumed: return the handlers if any are registered
                    if handlers = n.handlers; handlers != nil {
                        return
                    }
                    if len(n.children) == 1 {
                        // No handle found. Check if a handle for this path + a
                        // trailing slash exists for TSR recommendation
                        n = n.children[0]
                        tsr = n.path == "/" && n.handlers != nil
                    }

                    return

                case catchAll:
                    // '*' matches the whole remaining path
                    if cap(p) < int(n.maxParams) {
                        p = make(Params, 0, n.maxParams)
                    }
                    i := len(p)
                    p = p[:i+1] // expand slice within preallocated capacity
                    p[i].Key = n.path[2:]
                    if unescape {
                        var err error
                        if p[i].Value, err = url.QueryUnescape(path); err != nil {
                            p[i].Value = path // fallback, in case of error
                        }
                    } else {
                        p[i].Value = path
                    }

                    handlers = n.handlers
                    return

                default:
                    panic("invalid node type")
                }
            }
        } else if path == n.path {
            // We should have reached the node containing the handle.
            // Check if this node has a handle registered.
            if handlers = n.handlers; handlers != nil {
                return
            }

            if path == "/" && n.wildChild && n.nType != root {
                tsr = true
                return
            }

            // No handle found. Check if a handle for this path + a
            // trailing slash exists for trailing slash recommendation
            for i := 0; i < len(n.indices); i++ {
                if n.indices[i] == '/' {
                    n = n.children[i]
                    tsr = (len(n.path) == 1 && n.handlers != nil) ||
                        (n.nType == catchAll && n.children[0].handlers != nil)
                    return
                }
            }

            return
        }

        // Nothing found. We can recommend to redirect to the same URL with an
        // extra trailing slash if a leaf exists for that path
        tsr = (path == "/") ||
            (len(n.path) == len(path)+1 && n.path[len(path)] == '/' &&
                path == n.path[:len(n.path)-1] && n.handlers != nil)
        return
    }
}
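
The two parameter branches of getValue boil down to slicing the request path. Here is a minimal sketch of just that step; Param is redefined locally so the example is self-contained, and matchParam and matchCatchAll are hypothetical helper names, not Gin functions.

```go
package main

import "fmt"

// Param is a local copy of the Key/Value pair used in the listing above.
type Param struct {
	Key   string
	Value string
}

// matchParam handles a ":name" node. nodePath is the node's path (":name"),
// path is the part of the request path that is still unmatched.
// The value runs up to the next '/' or the end of the path.
func matchParam(nodePath, path string) (p Param, rest string) {
	end := 0
	for end < len(path) && path[end] != '/' {
		end++
	}
	return Param{Key: nodePath[1:], Value: path[:end]}, path[end:]
}

// matchCatchAll handles a "/*name" node: the whole remaining path is the value.
// The key skips the two leading bytes "/*", as in the listing above.
func matchCatchAll(nodePath, path string) Param {
	return Param{Key: nodePath[2:], Value: path}
}

func main() {
	p, rest := matchParam(":name", "john/profile")
	fmt.Println(p.Key, p.Value, rest) // name john /profile

	c := matchCatchAll("/*filepath", "/css/site.css")
	fmt.Println(c.Key, c.Value) // filepath /css/site.css
}
```

In getValue, the rest returned by the first helper is what the walk loop continues with when the param node has a child.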

Summary

Gin's routing is distinctive because of its storage structure. With a radix tree, a request path can be matched to its route and handlers quickly, without looping over every registered route on each request, which improves Gin's overall performance. Imagine a large project with 100 GET routes: comparing against up to 100 routes on every request would perform poorly, whereas a radix tree may need only a handful of node comparisons.

Gin's routing code is long, and most of it deals with nodes that carry parameters. Next time, as usual, we will write our own imitation: a radix tree storage structure with route lookup. With the parameter logic removed, only the core logic remains.

Topics: Go REST IE
