This article analyzes Go's slice growth mechanism. Environment: a 64-bit CentOS Docker image with Go 1.12.1.
Routine operation
When append is called and the slice's cap cannot accommodate the new elements, the runtime calls growslice.
For example, take the following code:
package main

import "fmt"

func main() {
	slice1 := make([]int, 1)
	fmt.Println("cap of slice1", cap(slice1))
	slice1 = append(slice1, 1)
	fmt.Println("cap of slice1", cap(slice1))
	slice1 = append(slice1, 2)
	fmt.Println("cap of slice1", cap(slice1))
	fmt.Println()

	slice1024 := make([]int, 1024)
	fmt.Println("cap of slice1024", cap(slice1024))
	slice1024 = append(slice1024, 1)
	fmt.Println("cap of slice1024", cap(slice1024))
	slice1024 = append(slice1024, 2)
	fmt.Println("cap of slice1024", cap(slice1024))
}
output
cap of slice1 1
cap of slice1 2
cap of slice1 4

cap of slice1024 1024
cap of slice1024 1280
cap of slice1024 1280
Many blog posts describe the growth rule like this: when cap is less than 1024, the capacity simply doubles; once it exceeds 1024, the new cap is 1.25 times the old one.
The source code behind this claim is in $GOROOT/src/runtime/slice.go:
func growslice(et *_type, old slice, cap int) slice {
	// Some checks elided...
	newcap := old.cap
	doublecap := newcap + newcap
	if cap > doublecap {
		newcap = cap
	} else {
		if old.len < 1024 {
			newcap = doublecap
		} else {
			// Check 0 < newcap to detect overflow
			// and prevent an infinite loop.
			for 0 < newcap && newcap < cap {
				newcap += newcap / 4
			}
			// Set newcap to the requested cap when
			// the newcap calculation overflowed.
			if newcap <= 0 {
				newcap = cap
			}
		}
	}
	// Follow-up elided...
A sharp-eyed reader may already see the problem: the growth rule described above corresponds to only one branch of the source code. In other words, growth does not always work that way. So how does it work? Let's move to the next section with that question in mind.
Unconventional operation
The operations above append one element at a time. Consider another situation: what happens when a single append adds several elements? For example, what are the capacities produced by the following code?
package main

import "fmt"

func main() {
	a := []byte{1, 0}
	a = append(a, 1, 1, 1)
	fmt.Println("cap of a is ", cap(a))

	b := []int{23, 51}
	b = append(b, 4, 5, 6)
	fmt.Println("cap of b is ", cap(b))

	c := []int32{1, 23}
	c = append(c, 2, 5, 6)
	fmt.Println("cap of c is ", cap(c))

	type D struct {
		age  byte
		name string
	}
	d := []D{
		{1, "123"},
		{2, "234"},
	}
	d = append(d, D{4, "456"}, D{5, "567"}, D{6, "678"})
	fmt.Println("cap of d is ", cap(d))
}
Should all four be 8? That guess follows from the doubling idea: cap goes 2 -> 4 -> 8.
Or should all four be 5? That guess rests on the following assumption: when a single append adds several elements and one doubling is not enough to hold them, then, if I were the designer, I would first estimate how much capacity the elements need and grow to that in one step. The advantage is that there is no need to allocate new underlying arrays, or copy data, repeatedly.
But the results were somewhat unexpected.
cap of a is  8
cap of b is  6
cap of c is  8
cap of d is  5
Confused? If your reaction is "No, I already knew all this," then, my brilliant friend, feel free to close this article now.
Why does this odd behavior occur? On with the analysis.
gdb analysis
Reading the source alone did not get me far, so I turned to a tool to watch the code actually run and guide the source reading. GDB is the right tool for the job.
Using the same code as above, let's build it and load it into GDB.
[root@a385d77a9056 jack]# go build -o jack
[root@a385d77a9056 jack]# ls
jack  main.go
[root@a385d77a9056 jack]# gdb jack
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/goblog/src/jack/jack...done.
Loading Go Runtime support.
(gdb)
For a better illustration, we set the breakpoint on the line with the append of the []int slice b, the one whose capacity is 6 after growing, and then run the program.
(gdb) l 10
5
6
7	func main() {
8
9		a := []byte{1, 0}
10		a = append(a, 1, 1, 1)
11		fmt.Println("cap of a is ", cap(a))
12
13		b := []int{23, 51}
14		b = append(b, 4, 5, 6)
(gdb) b 14
Breakpoint 2 at 0x4872d5: file /home/goblog/src/jack/main.go, line 14.
(gdb) r
Starting program: /home/goblog/src/jack/jack
cap of a is  8

Breakpoint 2, main.main () at /home/goblog/src/jack/main.go:14
14		b = append(b, 4, 5, 6)
Step into the call and see what happens.
(gdb) s
runtime.growslice (et=0x497dc0, old=..., cap=5, ~r3=...) at /usr/local/src/go/src/runtime/slice.go:76
76	func growslice(et *_type, old slice, cap int) slice {
(gdb) p *et
$1 = {size = 8, ptrdata = 0, hash = 4149441018, tflag = 7 '\a', align = 8 '\b', fieldalign = 8 '\b',
  kind = 130 '\202', alg = 0x555df0 <runtime.algarray+80>,
  gcdata = 0x4ce4f8 "\001\002\003\004\005\006\a\b\t\n\v\f\r\016\017\020\022\024\025\026\027\030\031\033\036\037\"%&,2568<BQUX\216\231\330\335\345\377",
  str = 987, ptrToThis = 45312}
(gdb) p old
$2 = {array = 0xc000074ec8, len = 2, cap = 2}
The output looks complicated, but two things stand out right away.

First, the cap passed in is 5. In other words, the idea above holds so far: when append adds several elements, the runtime first estimates the needed capacity and then grows.

Second, slice is a struct, and a struct is a value type.
Only after getting a general feel for the flow did I learn that et is metadata describing the element type of the slice. For our analysis, knowing et.size is enough: it is the element's size in bytes. I am on a 64-bit CentOS image, so int is int64, which is 8 bytes.
Moving on: this part of the analysis involves another piece of the source, so let's paste it first.
	switch {
	case et.size == 1:
		lenmem = uintptr(old.len)
		newlenmem = uintptr(cap)
		capmem = roundupsize(uintptr(newcap))
		overflow = uintptr(newcap) > maxAlloc
		newcap = int(capmem)
	case et.size == sys.PtrSize:
		lenmem = uintptr(old.len) * sys.PtrSize
		newlenmem = uintptr(cap) * sys.PtrSize
		capmem = roundupsize(uintptr(newcap) * sys.PtrSize)
		overflow = uintptr(newcap) > maxAlloc/sys.PtrSize
		newcap = int(capmem / sys.PtrSize)
	case isPowerOfTwo(et.size):
		var shift uintptr
		if sys.PtrSize == 8 {
			// Mask shift for better code generation.
			shift = uintptr(sys.Ctz64(uint64(et.size))) & 63
		} else {
			shift = uintptr(sys.Ctz32(uint32(et.size))) & 31
		}
		lenmem = uintptr(old.len) << shift
		newlenmem = uintptr(cap) << shift
		capmem = roundupsize(uintptr(newcap) << shift)
		overflow = uintptr(newcap) > (maxAlloc >> shift)
		newcap = int(capmem >> shift)
	default:
		lenmem = uintptr(old.len) * et.size
		newlenmem = uintptr(cap) * et.size
		capmem, overflow = math.MulUintptr(et.size, uintptr(newcap))
		capmem = roundupsize(capmem)
		newcap = int(capmem / et.size)
	}
The GDB session below omits minor details and keeps only the important steps.
(gdb) n
96		doublecap := newcap + newcap   // As in the source analyzed in the routine section, newcap is initialized to old.cap, i.e. 2, so doublecap is 4.
(gdb) n
97		if cap > doublecap {           // cap is the incoming parameter, 5, larger than doublecap = 4.
(gdb) n
98			newcap = cap           // So newcap takes the estimated capacity 5, and the old.len < 1024 branch is never entered.
(gdb) n
123		case et.size == 1:
(gdb) disp newcap                      // Print the value of newcap.
3: newcap = 5
(gdb) n
129		case et.size == sys.PtrSize:   // et.size is 8 bytes, exactly the pointer size on a 64-bit system.
3: newcap = 5
(gdb) n
132		capmem = roundupsize(uintptr(newcap) * sys.PtrSize)   // capmem is the memory needed for this capacity. This is the core step, analyzed below.
3: newcap = 5
(gdb) disp capmem                      // Print capmem; as the later output shows, it is 48.
4: capmem = <optimized out>
(gdb) n
134		newcap = int(capmem / sys.PtrSize)   // Compute the new capacity.
4: capmem = 48
3: newcap = 5
(gdb) n
122		switch {
4: capmem = <optimized out>
3: newcap = 5
(gdb) n
169		if overflow || capmem > maxAlloc {   // This is just jumping out of the switch block, but we already have the result we wanted: newcap is 6, exactly the cap(b) printed earlier.
4: capmem = 48
3: newcap = 6
The code that follows uses capmem for the memory allocation and newcap as the cap of the new slice. Let's analyze the key step: capmem = roundupsize(uintptr(newcap) * sys.PtrSize).
roundupsize means "round the size up". The product uintptr(newcap) * sys.PtrSize is 5 * 8 = 40. After rounding up, the memory actually required is capmem = 48. Dividing that memory by the type size, int(capmem / sys.PtrSize), yields the new capacity, 6.
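The rounding step can be sketched in a few lines. The roundupsize below is a deliberately simplified stand-in written for illustration, not the runtime's real implementation (which uses precomputed lookup tables rather than a linear scan), and it only covers the first few size classes:

```go
package main

import "fmt"

// The first ten small-object size classes, as listed in sizeclasses.go.
var sizeClasses = []uintptr{8, 16, 32, 48, 64, 80, 96, 112, 128, 144}

// roundupsize (simplified): return the smallest size class that can hold
// size bytes. Larger sizes are out of scope for this sketch.
func roundupsize(size uintptr) uintptr {
	for _, c := range sizeClasses {
		if size <= c {
			return c
		}
	}
	return size
}

func main() {
	const ptrSize = 8 // sys.PtrSize on a 64-bit system
	newcap := uintptr(5)
	capmem := roundupsize(newcap * ptrSize) // 5*8 = 40 rounds up to 48
	fmt.Println("capmem:", capmem)          // capmem: 48
	fmt.Println("newcap:", capmem/ptrSize)  // newcap: 6
}
```

This reproduces exactly the 40 -> 48 -> 6 sequence observed in the GDB session.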
To understand why roundupsize turns 40 into 48, we need a brief detour into Go's memory management. Trace into the roundupsize method and from there into the sizeclasses.go file. At the top of that file is Go's object size table, roughly as follows:
// class  bytes/obj  bytes/span  objects  tail waste  max waste
//     1          8        8192     1024           0     87.50%
//     2         16        8192      512           0     43.75%
//     3         32        8192      256           0     46.88%
//     4         48        8192      170          32     31.52%
//     5         64        8192      128           0     23.44%
//     6         80        8192      102          32     19.07%
//     7         96        8192       85          32     15.95%
//     8        112        8192       73          16     13.56%
//     9        128        8192       64           0     11.72%
//    10        144        8192       56         128     11.82%
// ...
//    65      28672       57344        2           0      4.91%
//    66      32768       32768        1           0     12.50%
Ignore the other columns for now and look at bytes/obj. These are Go's predefined object sizes: the smallest is 8 bytes, the largest is 32 KB, and anything beyond 32 KB forms its own category, giving 67 classes in total (the over-32 KB case is not listed in this file; 66 + 1 = 67). As you can see, there is no class of size 40, so 40 is rounded up to 48. That is exactly what happens inside roundupsize. The professional term for this is memory alignment (to size classes). Why is it designed this way? Interested readers can study Go's memory management in depth; space is limited, so I won't expand on it here.
The other element types appended in the unconventional examples are not traced with GDB here; they also go through roundupsize and behave much the same way. Interested readers can experiment on their own.
Doubt
Since roundupsize is not confined to some special branch of append, it seemed impossible that growth is always exactly double or 1.25 times, and I began to suspect that what many blogs say is simply wrong.
So I tested it again.
e := []int32{1, 2, 3}
fmt.Println("cap of e before:", cap(e))
e = append(e, 4)
fmt.Println("cap of e after:", cap(e))

f := []int{1, 2, 3}
fmt.Println("cap of f before:", cap(f))
f = append(f, 4)
fmt.Println("cap of f after:", cap(f))

Output:

cap of e before: 3
cap of e after: 8
cap of f before: 3
cap of f after: 6
Sure enough, pure doubling does not hold: the grown capacity also depends on the element type.
Summary
The content has jumped around a bit, so let's summarize what happens when append triggers growth.
1. Appending a single element, or a few elements, where "a few" means the doubled capacity can hold them, follows the well-known rule: when old.len is less than 1024, the capacity doubles; beyond 1024, it grows by 1.25 times per iteration.
2. When append adds several elements and even the doubled capacity cannot hold them, the estimated capacity (the old length plus the number of new elements) is used directly.
Key point! In both branches, the required memory capmem is then computed from newcap and the slice's element size, capmem is rounded up to a size class, and dividing the rounded memory by the element size yields the real final capacity of the new slice.
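To check this summary, here is a hypothetical model of the whole path. It is only valid for small slices (old.len < 1024) with element sizes up to 144 bytes, and the names grow and roundup are my own, not the runtime's. It reproduces all four "unconventional" capacities from earlier:

```go
package main

import "fmt"

// The first ten small-object size classes from sizeclasses.go.
var classes = []uintptr{8, 16, 32, 48, 64, 80, 96, 112, 128, 144}

// roundup returns the smallest size class holding n bytes (sketch only).
func roundup(n uintptr) uintptr {
	for _, c := range classes {
		if n <= c {
			return c
		}
	}
	return n
}

// grow predicts the capacity after appending `added` elements to a slice
// with the given length, capacity, and element size.
func grow(oldLen, oldCap, added int, elemSize uintptr) int {
	want := oldLen + added // the cap argument passed to growslice
	newcap := oldCap * 2   // doubling branch (old.len < 1024)
	if want > newcap {
		newcap = want // doubling is not enough: take the estimate
	}
	// Round the byte size up to a size class, then convert back to elements.
	return int(roundup(uintptr(newcap)*elemSize) / elemSize)
}

func main() {
	fmt.Println("cap of a is ", grow(2, 2, 3, 1))  // []byte  -> 8
	fmt.Println("cap of b is ", grow(2, 2, 3, 8))  // []int   -> 6
	fmt.Println("cap of c is ", grow(2, 2, 3, 4))  // []int32 -> 8
	fmt.Println("cap of d is ", grow(2, 2, 3, 24)) // []D (24 bytes) -> 5
}
```

The same model also predicts the single-element cases from the "Doubt" section: grow(3, 3, 1, 4) gives 8 for the []int32 slice e, and grow(3, 3, 1, 8) gives 6 for the []int slice f.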
Author: l_sivan
Link: https://www.jianshu.com/p/303daad705a3
Source: Jianshu