Radix tree
This time we look at routing in Gin. Reading the source code shows that routing is one of Gin's distinguishing features, and that the radix tree used as its underlying data structure is what keeps it fast. Because the routing code is relatively independent and self-contained, it is worth studying on its own.
The first thing we need to understand is the radix tree. The explanation in Baidu Encyclopedia:
A diagram gives a more intuitive view of how the data is stored.
A radix tree is a space-optimized prefix tree. Any node that is the only child of its parent is merged with that parent. A radix tree can be used to build an associative array.
As you can see in the figure above, the data structure extracts every shared prefix into a parent node and stores the remaining parts as its children.
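The prefix extraction described above can be sketched in a few lines of Go. This is only an illustration, not code from Gin: the function name longestCommonPrefix and the sample words "romane" and "romanus" are assumptions chosen for the example.

```go
package main

import "fmt"

// longestCommonPrefix returns how many leading bytes a and b share.
// (Hypothetical helper for illustration; not part of Gin.)
func longestCommonPrefix(a, b string) int {
    n := len(a)
    if len(b) < n {
        n = len(b)
    }
    i := 0
    for i < n && a[i] == b[i] {
        i++
    }
    return i
}

func main() {
    a, b := "romane", "romanus"
    i := longestCommonPrefix(a, b)
    // The shared prefix becomes the parent; the remainders become children.
    fmt.Printf("parent=%q children=%q,%q\n", a[:i], a[i:], b[i:])
}
```

Here the shared prefix "roman" would become the parent node, and "e" and "us" would be stored as its children.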
Application of the radix tree in Gin
From the above, you can see that a radix tree is a prefix tree, and the diagram shows how its data is laid out. How is the radix tree used in Gin? Take the following routes as an example:
router.GET("/support", handler1)
router.GET("/search", handler2)
router.GET("/contact", handler3)
router.GET("/group/user/", handler4)
router.GET("/group/user/test", handler5)
The final memory structure is:
/ (handler = nil, indices = "scg")
    s (handler = nil, indices = "ue")
        upport (handler = handler1, indices = "")
        earch (handler = handler2, indices = "")
    contact (handler = handler3, indices = "")
    group/user/ (handler = handler4, indices = "t")
        test (handler = handler5, indices = "")
The router registered five routes with the GET method, and the structure actually stored is shown above. I deliberately appended the handler and indices of each node. indices is a string that stores, in order, the first character of every child node. This field is highlighted because, to find out whether a child can contain the rest of a path, you do not need to compare against each child's full path; you only need to scan this string for the next character. This makes lookups somewhat more efficient.
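The indices lookup described above can be sketched as follows. The node type here is a simplified, hypothetical stand-in that keeps only the fields needed for the illustration, and childFor is an assumed helper name, not a Gin function.

```go
package main

import "fmt"

// node is a simplified, hypothetical stand-in for Gin's node.
type node struct {
    path     string
    indices  string // first byte of each child's path, in child order
    children []*node
}

// childFor returns the child whose path starts with byte c, or nil.
// It scans indices (one byte per child) instead of each child's path.
func (n *node) childFor(c byte) *node {
    for i := 0; i < len(n.indices); i++ {
        if n.indices[i] == c {
            return n.children[i]
        }
    }
    return nil
}

func main() {
    // The "s" node from the structure above, with children upport and earch.
    s := &node{
        path:     "s",
        indices:  "ue",
        children: []*node{{path: "upport"}, {path: "earch"}},
    }
    fmt.Println(s.childFor('e').path)
    fmt.Println(s.childFor('x') == nil)
}
```

Because indices[i] always corresponds to children[i], a single pass over a short string is enough to pick the branch.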
Reading the source
Let's first look at the definition of the node struct and how routes get registered. Note the indices field, which was already mentioned above.
type node struct {
    // URL path fragment stored on this node.
    // For example, for search and support above, the shared parent node has path = "s"
    // and the two child nodes have paths "earch" and "upport".
    path string
    // Whether the child of this node is a parameter node (for example, a :post segment)
    wildChild bool
    // Node type: static, root, param or catchAll
    // static: a static node, such as the parent "s" produced by the split above
    // root: the first node inserted becomes the root node
    // catchAll: a node matched by *
    // param: a parameter node matched by :
    nType nodeType
    // Maximum number of parameters recorded on the path
    maxParams uint8
    // Corresponds to children[]; stores the first character of each branch.
    // For example, for search and support, the indices of the s node is "ue",
    // meaning two branches whose first letters are u and e respectively.
    indices string
    // Child nodes
    children []*node
    // Handlers of the current node
    handlers HandlersChain
    // Priority
    priority uint32
}

// The GET method of RouterGroup calls handle
func (group *RouterGroup) GET(relativePath string, handlers ...HandlerFunc) IRoutes {
    return group.handle("GET", relativePath, handlers)
}

func (group *RouterGroup) handle(httpMethod, relativePath string, handlers HandlersChain) IRoutes {
    // Compute the absolute path by joining the group's basePath and relativePath
    absolutePath := group.calculateAbsolutePath(relativePath)
    // Merge the middleware already added to the group with the incoming handlers
    handlers = group.combineHandlers(handlers)
    // Call addRoute to register the route
    group.engine.addRoute(httpMethod, absolutePath, handlers)
    return group.returnObj()
}
Next comes the addRoute method, which has a long body. Most of it deals with nodes that carry parameters; the real core logic is fairly small. I have commented the main logic, which should make it easier to follow. If it is still unclear, stepping through it a few times in a debugger helps.
func (engine *Engine) addRoute(method, path string, handlers HandlersChain) {
    assert1(path[0] == '/', "path must begin with '/'")
    assert1(method != "", "HTTP method can not be empty")
    assert1(len(handlers) > 0, "there must be at least one handler")

    debugPrintRoute(method, path, handlers)
    // Get the root node of this method's tree; each HTTP method (GET, POST, ...) maintains its own root
    root := engine.trees.get(method)
    // If there is none yet, create it
    if root == nil {
        root = new(node)
        engine.trees = append(engine.trees, methodTree{method: method, root: root})
    }
    // Actually add the route
    root.addRoute(path, handlers)
}

func (n *node) addRoute(path string, handlers HandlersChain) {
    // Keep the original path
    fullPath := path
    n.priority++
    // Count the parameters in the path, i.e. the ':' and '*' characters, up to 255
    numParams := countParams(path)

    // Check whether the tree is non-empty
    if len(n.path) > 0 || len(n.children) > 0 {
    walk:
        for {
            // Update the maximum number of parameters
            if numParams > n.maxParams {
                n.maxParams = numParams
            }

            // Find the longest common prefix; the loop bound is the shorter of len(path) and len(n.path)
            i := 0
            max := min(len(path), len(n.path))
            // Advance i while the characters match
            for i < max && path[i] == n.path[i] {
                i++
            }

            // If only part of n.path matches, split the current node: its remainder becomes
            // a child node and the shared prefix stays as the parent.
            // For example, n.path = romaned and path = romanus share the prefix roman.
            // The steps are:
            // 1. Extract ed into a new child node that copies all attributes of the original n
            // 2. Change the path of the original n to the shared prefix roman and set its
            //    indices to the first character of the child node
            if i < len(n.path) {
                child := node{
                    path:      n.path[i:],
                    wildChild: n.wildChild,
                    indices:   n.indices,
                    children:  n.children,
                    handlers:  n.handlers,
                    priority:  n.priority - 1,
                }

                // Update maxParams (max of all children)
                for i := range child.children {
                    if child.children[i].maxParams > child.maxParams {
                        child.maxParams = child.children[i].maxParams
                    }
                }

                n.children = []*node{&child}
                // []byte for proper unicode char conversion, see #65
                n.indices = string([]byte{n.path[i]})
                n.path = path[:i]
                n.handlers = nil
                n.wildChild = false
            }

            // The original node n has now been split into two nodes:
            // roman      parent node
            //     ed     child node [0]
            // Next, the incoming route must be added under this parent node.
            // The final structure is:
            // roman      parent node
            //     ed     child node [0]
            //     us     child node [1]
            // In some cases walk runs again, which is equivalent to recursion:
            // roman
            //     ed
            //     uie
            // When the parent node n already has a child uie, uie and us share the prefix u,
            // so u must be extracted again as a parent node and walk runs once more.
            // The end result is a three-level structure:
            // roman
            //     ed
            //     u
            //         ie
            //         s
            // walk also runs again when the route contains parameters.
            if i < len(path) {
                path = path[i:]

                if n.wildChild {
                    n = n.children[0]
                    n.priority++

                    // Update maxParams of the child node
                    if numParams > n.maxParams {
                        n.maxParams = numParams
                    }
                    numParams--

                    // Check if the wildcard matches
                    if len(path) >= len(n.path) && n.path == path[:len(n.path)] {
                        // check for longer wildcard, e.g. :name and :names
                        if len(n.path) >= len(path) || path[len(n.path)] == '/' {
                            continue walk
                        }
                    }

                    panic("path segment '" + path +
                        "' conflicts with existing wildcard '" + n.path +
                        "' in path '" + fullPath + "'")
                }

                c := path[0]

                // slash after param
                if n.nType == param && c == '/' && len(n.children) == 1 {
                    n = n.children[0]
                    n.priority++
                    continue walk
                }

                // Check if a child with the next path byte exists
                for i := 0; i < len(n.indices); i++ {
                    if c == n.indices[i] {
                        i = n.incrementChildPrio(i)
                        n = n.children[i]
                        continue walk
                    }
                }

                // Otherwise insert it
                if c != ':' && c != '*' {
                    // []byte for proper unicode char conversion, see #65
                    n.indices += string([]byte{c})
                    child := &node{
                        maxParams: numParams,
                    }
                    n.children = append(n.children, child)
                    n.incrementChildPrio(len(n.indices) - 1)
                    n = child
                }
                n.insertChild(numParams, path, fullPath, handlers)
                return

            } else if i == len(path) {
                if n.handlers != nil {
                    panic("handlers are already registered for path '" + fullPath + "'")
                }
                n.handlers = handlers
            }
            return
        }
    } else { // The tree is empty, insert the route directly
        n.insertChild(numParams, path, fullPath, handlers)
        n.nType = root
    }
}

// insertChild adds a node; it deals primarily with nodes that carry parameters
func (n *node) insertChild(numParams uint8, path string, fullPath string, handlers HandlersChain) {
    var offset int // already handled bytes of the path

    // Scan the path for segments that start with ':' or '*'
    for i, max := 0, len(path); numParams > 0; i++ {
        c := path[i]
        if c != ':' && c != '*' {
            continue
        }

        // Find the end of the wildcard (the next '/' or the end of the path);
        // a second ':' or '*' inside the same segment is an error
        end := i + 1
        for end < max && path[end] != '/' {
            switch path[end] {
            // the wildcard name must not contain ':' and '*'
            case ':', '*':
                panic("only one wildcard per path segment is allowed, has: '" +
                    path[i:] + "' in path '" + fullPath + "'")
            default:
                end++
            }
        }

        // Check whether this node already has children; inserting a wildcard here
        // would make those children unreachable
        if len(n.children) > 0 {
            panic("wildcard route '" + path[i:end] +
                "' conflicts with existing children in path '" + fullPath + "'")
        }

        // check if the wildcard has a name
        if end-i < 2 {
            panic("wildcards must be named with a non-empty name in path '" + fullPath + "'")
        }

        // param type, i.e. the route was registered with ':'
        if c == ':' {
            // split path at the beginning of the wildcard
            if i > 0 {
                n.path = path[offset:i]
                offset = i
            }

            child := &node{
                nType:     param,
                maxParams: numParams,
            }
            n.children = []*node{child}
            n.wildChild = true
            n = child
            n.priority++
            numParams--

            if end < max {
                n.path = path[offset:end]
                offset = end

                child := &node{
                    maxParams: numParams,
                    priority:  1,
                }
                n.children = []*node{child}
                n = child
            }

        } else { // catch-all type, i.e. the route was registered with '*'
            if end != max || numParams > 1 {
                panic("catch-all routes are only allowed at the end of the path in path '" + fullPath + "'")
            }

            if len(n.path) > 0 && n.path[len(n.path)-1] == '/' {
                panic("catch-all conflicts with existing handle for the path segment root in path '" + fullPath + "'")
            }

            // currently fixed width 1 for '/'
            i--
            if path[i] != '/' {
                panic("no / before catch-all in path '" + fullPath + "'")
            }

            n.path = path[offset:i]

            // first node: catchAll node with empty path
            child := &node{
                wildChild: true,
                nType:     catchAll,
                maxParams: 1,
            }
            n.children = []*node{child}
            n.indices = string(path[i])
            n = child
            n.priority++

            // second node: node holding the variable
            child = &node{
                path:      path[i:],
                nType:     catchAll,
                maxParams: 1,
                handlers:  handlers,
                priority:  1,
            }
            n.children = []*node{child}

            return
        }
    }

    // Store the rest of the path and the handlers; for a path without parameters, offset is 0
    n.path = path[offset:]
    n.handlers = handlers
}
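To isolate the core of the code above, here is a minimal sketch of the split-and-insert step for static paths only. It is not Gin's implementation: the node type is a hypothetical simplification, handler is a plain string standing in for a HandlersChain, and parameters (':' and '*'), priorities and conflict panics are all left out.

```go
package main

import "fmt"

// node is a hypothetical, static-only simplification of Gin's node.
type node struct {
    path     string
    indices  string
    children []*node
    handler  string // stands in for a HandlersChain
}

// insert adds a static path to the tree rooted at n.
// Assumption: no ':' or '*' segments; a duplicate path overwrites the handler.
func (n *node) insert(path, handler string) {
    if n.path == "" && len(n.children) == 0 {
        n.path, n.handler = path, handler // empty tree
        return
    }
    for {
        // Length of the common prefix of path and n.path.
        i := 0
        for i < len(path) && i < len(n.path) && path[i] == n.path[i] {
            i++
        }
        // Split n: the unmatched tail of n.path becomes a child.
        if i < len(n.path) {
            child := &node{
                path:     n.path[i:],
                indices:  n.indices,
                children: n.children,
                handler:  n.handler,
            }
            n.children = []*node{child}
            n.indices = string([]byte{n.path[i]})
            n.path = n.path[:i]
            n.handler = ""
        }
        if i == len(path) {
            n.handler = handler // the path ends exactly at this node
            return
        }
        path = path[i:]
        // Descend if a child already starts with the next byte.
        var next *node
        for j := 0; j < len(n.indices); j++ {
            if n.indices[j] == path[0] {
                next = n.children[j]
                break
            }
        }
        if next == nil {
            n.indices += string([]byte{path[0]})
            n.children = append(n.children, &node{path: path, handler: handler})
            return
        }
        n = next
    }
}

func main() {
    root := &node{}
    root.insert("/support", "handler1")
    root.insert("/search", "handler2")
    root.insert("/contact", "handler3")
    fmt.Println(root.path, root.indices, root.children[0].indices)
}
```

Inserting the first three example routes in this order produces a root "/" with indices "sc" and an "s" child with indices "ue", matching the top of the structure shown earlier.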
Finally, let's look at the getValue method, which finds the handlers for a given path. This method is relatively simple, and the comments should be enough to follow it.
// Find the route handlers for a path
func (n *node) getValue(path string, po Params, unescape bool) (handlers HandlersChain, p Params, tsr bool) {
    p = po
walk:
    for {
        if len(path) > len(n.path) {
            if path[:len(n.path)] == n.path {
                path = path[len(n.path):]
                // If this node has no parameter child, scan indices for the first
                // character of the remaining path to find the child node
                if !n.wildChild {
                    c := path[0]
                    for i := 0; i < len(n.indices); i++ {
                        if c == n.indices[i] {
                            n = n.children[i]
                            continue walk
                        }
                    }

                    tsr = path == "/" && n.handlers != nil
                    return
                }

                // handle wildcard child
                n = n.children[0]
                switch n.nType {
                case param:
                    // For a normal ':' node, find the next '/' or the end of the path
                    // to get the parameter value
                    end := 0
                    for end < len(path) && path[end] != '/' {
                        end++
                    }

                    // save param value
                    if cap(p) < int(n.maxParams) {
                        p = make(Params, 0, n.maxParams)
                    }
                    i := len(p)
                    p = p[:i+1] // expand slice within preallocated capacity
                    p[i].Key = n.path[1:]
                    val := path[:end]
                    if unescape {
                        var err error
                        if p[i].Value, err = url.QueryUnescape(val); err != nil {
                            p[i].Value = val // fallback, in case of error
                        }
                    } else {
                        p[i].Value = val
                    }

                    // If the path has not been fully consumed, keep walking
                    if end < len(path) {
                        if len(n.children) > 0 {
                            path = path[end:]
                            n = n.children[0]
                            continue walk
                        }

                        // ... but we can't
                        tsr = len(path) == end+1
                        return
                    }

                    // Otherwise return the handlers if there are any
                    if handlers = n.handlers; handlers != nil {
                        return
                    }
                    if len(n.children) == 1 {
                        // No handle found. Check if a handle for this path + a
                        // trailing slash exists for TSR recommendation
                        n = n.children[0]
                        tsr = n.path == "/" && n.handlers != nil
                    }

                    return

                case catchAll:
                    // '*' matches the whole rest of the path
                    if cap(p) < int(n.maxParams) {
                        p = make(Params, 0, n.maxParams)
                    }
                    i := len(p)
                    p = p[:i+1] // expand slice within preallocated capacity
                    p[i].Key = n.path[2:]
                    if unescape {
                        var err error
                        if p[i].Value, err = url.QueryUnescape(path); err != nil {
                            p[i].Value = path // fallback, in case of error
                        }
                    } else {
                        p[i].Value = path
                    }

                    handlers = n.handlers
                    return

                default:
                    panic("invalid node type")
                }
            }
        } else if path == n.path {
            // We should have reached the node containing the handle.
            // Check if this node has a handle registered.
            if handlers = n.handlers; handlers != nil {
                return
            }

            if path == "/" && n.wildChild && n.nType != root {
                tsr = true
                return
            }

            // No handle found. Check if a handle for this path + a
            // trailing slash exists for trailing slash recommendation
            for i := 0; i < len(n.indices); i++ {
                if n.indices[i] == '/' {
                    n = n.children[i]
                    tsr = (len(n.path) == 1 && n.handlers != nil) ||
                        (n.nType == catchAll && n.children[0].handlers != nil)
                    return
                }
            }

            return
        }

        // Nothing found. We can recommend to redirect to the same URL with an
        // extra trailing slash if a leaf exists for that path
        tsr = (path == "/") ||
            (len(n.path) == len(path)+1 && n.path[len(path)] == '/' &&
                path == n.path[:len(n.path)-1] && n.handlers != nil)
        return
    }
}
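Stripped of parameters and trailing-slash recommendations, the lookup above reduces to a short loop. The following is a minimal sketch under those assumptions, not Gin's code: node is a hypothetical static-only type, handler is a string standing in for a HandlersChain, and the tree is built by hand to mirror the earlier example.

```go
package main

import "fmt"

// node is a hypothetical, static-only simplification of Gin's node.
type node struct {
    path     string
    indices  string
    children []*node
    handler  string // stands in for a HandlersChain
}

// lookup walks the tree, consuming one node's path per step.
// Assumption: static paths only, no parameters, no trailing-slash redirect.
func (n *node) lookup(path string) (string, bool) {
    for {
        if len(path) < len(n.path) || path[:len(n.path)] != n.path {
            return "", false // n.path is not a prefix of what remains
        }
        path = path[len(n.path):]
        if path == "" {
            return n.handler, n.handler != ""
        }
        // Pick the branch by scanning indices for the next byte.
        var next *node
        for i := 0; i < len(n.indices); i++ {
            if n.indices[i] == path[0] {
                next = n.children[i]
                break
            }
        }
        if next == nil {
            return "", false
        }
        n = next
    }
}

// buildExample hand-builds the tree for the five example routes.
func buildExample() *node {
    return &node{path: "/", indices: "scg", children: []*node{
        {path: "s", indices: "ue", children: []*node{
            {path: "upport", handler: "handler1"},
            {path: "earch", handler: "handler2"},
        }},
        {path: "contact", handler: "handler3"},
        {path: "group/user/", handler: "handler4", indices: "t", children: []*node{
            {path: "test", handler: "handler5"},
        }},
    }}
}

func main() {
    root := buildExample()
    for _, p := range []string{"/search", "/group/user/test", "/s", "/nope"} {
        h, ok := root.lookup(p)
        fmt.Printf("%s -> %q %v\n", p, h, ok)
    }
}
```

Note that "/s" reaches a real node but has no handler, so it is reported as not found, just as a node with nil handlers would be in the full version.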
Summary
Gin's routing stands out because of its storage structure. With a radix tree, the matching route can be found quickly and its handlers executed. This avoids looping over every route on each request and improves Gin's overall performance. Imagine a large project with 100 GET routes: if every request looped through up to 100 comparisons, performance would suffer, whereas with a radix tree a lookup may need only a few steps.
Gin's routing code is long, but most of it deals with nodes that carry parameters. Next time, as usual, we will write our own imitation: a radix tree storage structure with the routing lookup logic, with the parameter handling removed so that only the core logic remains.