After reading this article, can you answer the following high-frequency interview questions?
- Underlying implementation principle of Go slice
- The difference between Go array and slice
- Go slice deep and shallow copies
- What is the Go slice capacity expansion mechanism?
- Why is Go slice not thread-safe?
Implementation principle
A slice is a variable-length view over an array. Under the hood it is a struct with the following three fields
On a 64-bit platform, a slice header occupies 24 bytes
type slice struct {
	array unsafe.Pointer
	len   int
	cap   int
}
array: a pointer to the underlying array, where the data is actually stored; occupies 8 bytes
len: the number of elements currently in the slice; occupies 8 bytes
cap: the capacity of the slice, that is, the number of elements of the underlying array available from the slice's first element onward; occupies 8 bytes
A slice is not a true dynamic array but a reference type: it always points to an underlying array. A slice is declared like an array, except that no length is given and its length can change. Thanks to Go's syntactic sugar, the slice struct is created automatically when we declare a slice much as we would an array
When an element is read by index, the valid index range is 0 to len(slice)-1. Printing a slice prints slice[0:len(slice)], that is, the values of the underlying array that those indexes point to
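The layout described above can be checked with a small sketch. headerSize is a hypothetical helper introduced only for this illustration, and the 24-byte figure assumes a 64-bit platform.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerSize reports the size in bytes of a []int slice header
// (pointer + len + cap), not of the data it points to.
func headerSize() uintptr {
	var s []int
	return unsafe.Sizeof(s)
}

func main() {
	s := make([]int, 3, 10)
	// On a 64-bit platform: 8 (pointer) + 8 (len) + 8 (cap) = 24.
	fmt.Println(headerSize(), len(s), cap(s))
	// Valid indexes are 0..len(s)-1; s[3] would panic even though cap is 10.
	fmt.Println(s[0], s[len(s)-1])
}
```

The header size does not depend on how many elements the slice holds, which is why passing a slice to a function is cheap.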
Main characteristics
reference type
Go has three commonly used built-in types that behave as reference types: slice, map and channel. When one of them is passed as a function parameter, the function may modify the original data.
func sliceModify(s []int) {
	s[0] = 100
}

func sliceAppend(s []int) []int {
	s = append(s, 100)
	return s
}

func sliceAppendPtr(s *[]int) {
	*s = append(*s, 100)
}

// Note: all parameters in Go are passed by value, that is, as a copy.
// If the copied content is a non-reference type (int, string, struct, etc.), the function cannot modify the original data;
// if the copied content is a reference type (interface, pointer, map, slice, chan, etc.), the function can modify the original data.
func TestSliceFn(t *testing.T) {
	// The parameter is the reference type slice: len/cap of the outer slice do not change, but the underlying array it points to may change
	s := []int{1, 1, 1}
	newS := sliceAppend(s) // capacity expansion occurs inside the function
	t.Log(s, len(s), cap(s))          // [1 1 1] 3 3
	t.Log(newS, len(newS), cap(newS)) // [1 1 1 100] 4 6

	s2 := make([]int, 0, 5)
	newS = sliceAppend(s2) // no capacity expansion inside the function
	t.Log(s2, s2[0:5], len(s2), cap(s2))         // [] [100 0 0 0 0] 0 5
	t.Log(newS, newS[0:5], len(newS), cap(newS)) // [100] [100 0 0 0 0] 1 5

	// The parameter is a pointer to the slice: len/cap of the outer slice change, and so does the underlying array it points to
	sliceAppendPtr(&s)
	t.Log(s, len(s), cap(s)) // [1 1 1 100] 4 6
	sliceModify(s)
	t.Log(s, len(s), cap(s)) // [100 1 1 100] 4 6
}
Slice status
A slice has three special states: zero slice, empty slice and nil slice
func TestSliceEmptyOrNil(t *testing.T) {
	var slice1 []int           // slice1 is a nil slice
	slice2 := make([]int, 0)   // slice2 is an empty slice
	var slice3 = make([]int, 2) // slice3 is a zero slice: all elements hold the zero value
	if slice1 == nil {
		t.Log("slice1 is nil.") // This line will be output
	}
	if slice2 == nil {
		t.Log("slice2 is nil.") // This line will not be output
	}
	t.Log(slice3) // [0 0]
}
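One place where the nil/empty distinction becomes visible is JSON encoding. The sketch below uses the standard encoding/json package; describe is a hypothetical helper name chosen for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// describe returns the JSON encoding of s, which differs for nil and empty slices.
func describe(s []int) string {
	b, err := json.Marshal(s)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	var nilSlice []int
	emptySlice := []int{}

	fmt.Println(nilSlice == nil, len(nilSlice), describe(nilSlice))       // true 0 null
	fmt.Println(emptySlice == nil, len(emptySlice), describe(emptySlice)) // false 0 []

	// append works on a nil slice; no explicit initialization is needed.
	nilSlice = append(nilSlice, 1)
	fmt.Println(nilSlice) // [1]
}
```

Both states have length 0 and can be ranged over or appended to, so checking len(s) == 0 is usually the safer test for "no elements".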
Non thread safe
A slice does not support concurrent reads and writes, so it is not thread-safe. When multiple goroutines operate on the same slice variable, the result will most likely differ from run to run and will not match the expected value. Concurrent appends do not report an error, but data is lost
/**
 * Slices are not safe for concurrent use
 * Run this several times: the result differs each time
 * Consider using the blocking behaviour of a channel to achieve safe concurrent reads and writes
 */
func TestSliceConcurrencySafe(t *testing.T) {
	a := make([]int, 0)
	var wg sync.WaitGroup
	for i := 0; i < 10000; i++ {
		wg.Add(1)
		go func(i int) {
			a = append(a, i) // unsynchronized write: data race
			wg.Done()
		}(i)
	}
	wg.Wait()
	t.Log(len(a)) // not equal 10000
}
There are two ways to make slice access thread-safe:
Method 1: protect the slice with a lock. This is simple and suitable for scenarios with low performance requirements.
func TestSliceConcurrencySafeByMutex(t *testing.T) {
	var lock sync.Mutex // mutex guarding a
	a := make([]int, 0)
	var wg sync.WaitGroup
	for i := 0; i < 10000; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			lock.Lock()
			defer lock.Unlock()
			a = append(a, i)
		}(i)
	}
	wg.Wait()
	t.Log(len(a)) // equal 10000
}
Method 2: funnel all writes through a channel so that a single consumer goroutine owns the slice. The author recommends this for scenarios with high performance requirements, though the actual gain over a mutex should be benchmarked. The consumer must be allowed to finish before the result is read, otherwise the last append may still be in flight.
func TestSliceConcurrencySafeByChannel(t *testing.T) {
	buffer := make(chan int)
	done := make(chan struct{})
	a := make([]int, 0)
	// consumer: the only goroutine that touches a
	go func() {
		defer close(done)
		for v := range buffer {
			a = append(a, v)
		}
	}()
	// producers
	var wg sync.WaitGroup
	for i := 0; i < 10000; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			buffer <- i
		}(i)
	}
	wg.Wait()
	close(buffer) // no more values: lets the consumer loop end
	<-done        // wait until the consumer has appended everything
	t.Log(len(a)) // equal 10000
}
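The locking approach is often packaged into a small type so that callers cannot forget the lock. The following is a minimal sketch; SafeSlice and fill are hypothetical names, not part of any library.

```go
package main

import (
	"fmt"
	"sync"
)

// SafeSlice is a hypothetical wrapper that guards an []int with a mutex.
type SafeSlice struct {
	mu    sync.Mutex
	items []int
}

// Append adds v while holding the lock.
func (s *SafeSlice) Append(v int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items = append(s.items, v)
}

// Len returns the current length while holding the lock.
func (s *SafeSlice) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.items)
}

// fill appends n values from n goroutines and returns the final length.
func fill(n int) int {
	var s SafeSlice
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			s.Append(v)
		}(i)
	}
	wg.Wait()
	return s.Len()
}

func main() {
	fmt.Println(fill(10000)) // 10000
}
```

Keeping the slice unexported inside the struct means every access goes through a method that takes the lock.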
Shared storage
If multiple slices share the same underlying array, changes to one slice or the underlying array will affect other slices
/**
 * Slice shared storage
 */
func TestSliceShareMemory(t *testing.T) {
	slice1 := []string{"1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12"}
	Q2 := slice1[3:6]
	t.Log(Q2, len(Q2), cap(Q2)) // [4 5 6] 3 9
	Q3 := slice1[5:8]
	t.Log(Q3, len(Q3), cap(Q3)) // [6 7 8] 3 7
	Q3[0] = "Unknown"
	t.Log(Q2, Q3) // [4 5 Unknown] [Unknown 7 8]

	a := []int{1, 2, 3, 4, 5}
	shadow := a[1:3]
	t.Log(shadow, a) // [2 3] [1 2 3 4 5]
	// shadow has spare capacity, so append writes into a's array: every slice pointing to it sees the change
	shadow = append(shadow, 100)
	t.Log(shadow, a) // [2 3 100] [1 2 3 100 5]
}
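If the overwrite caused by append in the last example is not wanted, the three-index slice expression can cap the sub-slice's capacity so that append has to allocate a new array. A sketch follows; appendIsolated is a hypothetical helper name.

```go
package main

import "fmt"

// appendIsolated appends v to a[lo:hi] without touching the rest of a.
func appendIsolated(a []int, lo, hi, v int) []int {
	// a[lo:hi:hi] has cap == len, so append must copy into a new array.
	sub := a[lo:hi:hi]
	return append(sub, v)
}

func main() {
	a := []int{1, 2, 3, 4, 5}
	shadow := appendIsolated(a, 1, 3, 100)
	fmt.Println(shadow, a) // [2 3 100] [1 2 3 4 5]
}
```

Writes through sub before the append would still be visible in a; only the appended elements are isolated.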
Common operations
create
A slice can be created in four ways, as follows:
func TestSliceInit(t *testing.T) {
	// Initialization method 1: direct declaration
	var slice1 []int
	t.Log(len(slice1), cap(slice1)) // 0 0
	slice1 = append(slice1, 1)
	t.Log(len(slice1), cap(slice1)) // 1 1

	// Initialization method 2: use a literal
	slice2 := []int{1, 2, 3, 4}
	t.Log(len(slice2), cap(slice2)) // 4 4

	// Initialization method 3: create the slice with make
	slice3 := make([]int, 3, 5) // make([]T, len, cap); if cap is omitted it equals len
	t.Log(len(slice3), cap(slice3))        // 3 5
	t.Log(slice3[0], slice3[1], slice3[2]) // 0 0 0
	// t.Log(slice3[3], slice3[4]) // panic: runtime error: index out of range [3] with length 3
	slice3 = append(slice3, 1)
	t.Log(len(slice3), cap(slice3)) // 4 5

	// Initialization method 4: slice an existing slice or array
	arr := [100]int{}
	for i := range arr {
		arr[i] = i
	}
	slice4 := arr[1:3]
	slice5 := make([]int, len(slice4))
	copy(slice5, slice4)
	t.Log(len(slice4), cap(slice4), unsafe.Sizeof(slice4)) // 2 99 24
	t.Log(len(slice5), cap(slice5), unsafe.Sizeof(slice5)) // 2 2 24
}
append
func TestSliceGrowing(t *testing.T) {
	slice1 := []int{}
	for i := 0; i < 10; i++ {
		slice1 = append(slice1, i)
		t.Log(len(slice1), cap(slice1))
	}
	// 1 1
	// 2 2
	// 3 4
	// 4 4
	// 5 8
	// 6 8
	// 7 8
	// 8 8
	// 9 16
	// 10 16
}
delete
func TestSliceDelete(t *testing.T) {
	slice1 := []int{1, 2, 3, 4, 5}
	var x int
	// Delete the last element
	x, slice1 = slice1[len(slice1)-1], slice1[:len(slice1)-1]
	t.Log(x, slice1, len(slice1), cap(slice1)) // 5 [1 2 3 4] 4 5

	// Delete the element at index 2 (the third element)
	slice1 = append(slice1[:2], slice1[3:]...)
	t.Log(slice1, len(slice1), cap(slice1)) // [1 2 4] 3 5
}
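The append-based delete above shifts elements inside the existing underlying array, so any other slice sharing that array sees the shift too. When that is undesirable, a non-mutating variant can be sketched as follows; removeAt is a hypothetical helper and assumes 0 <= i < len(s).

```go
package main

import "fmt"

// removeAt returns a new slice without the element at index i.
// s and its underlying array are left unchanged.
// Assumption: 0 <= i < len(s); otherwise the function panics.
func removeAt(s []int, i int) []int {
	out := make([]int, 0, len(s)-1)
	out = append(out, s[:i]...)
	return append(out, s[i+1:]...)
}

func main() {
	s := []int{1, 2, 3, 4}
	r := removeAt(s, 2)
	fmt.Println(r, s) // [1 2 4] [1 2 3 4]
}
```

The price is one allocation and a full copy per delete, so the in-place form remains preferable when the slice is not shared.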
lookup
v := s[i] // Subscript access
modify
s[i] = 5 // Subscript modification
slicing
/**
 * Slicing a slice
 */
func TestSliceSubstr(t *testing.T) {
	slice1 := []int{1, 2, 3, 4, 5}
	slice2 := slice1[:]
	// Slice expression: slice[left:right:max]
	// left: defaults to 0 when omitted
	// right: defaults to len(slice1) when omitted
	// max: defaults to cap(slice1) when omitted
	// len = right-left
	// cap = max-left
	t.Log(slice2, len(slice2), cap(slice2)) // [1 2 3 4 5] 5 5
	slice3 := slice1[1:]
	t.Log(slice3, len(slice3), cap(slice3)) // [2 3 4 5] 4 4
	slice4 := slice1[:2]
	t.Log(slice4, len(slice4), cap(slice4)) // [1 2] 2 5
	slice5 := slice1[1:2]
	t.Log(slice5, len(slice5), cap(slice5)) // [2] 1 4
	slice6 := slice1[:2:5]
	t.Log(slice6, len(slice6), cap(slice6)) // [1 2] 2 5
	slice7 := slice1[1:2:2]
	t.Log(slice7, len(slice7), cap(slice7)) // [2] 1 1
}
traversal
There are three ways to traverse a slice
/**
 * Slice traversal
 */
func TestSliceTravel(t *testing.T) {
	slice1 := []int{1, 2, 3, 4}
	// 1. classic index loop
	for i := 0; i < len(slice1); i++ {
		t.Log(slice1[i])
	}
	// 2. range with index and value
	for idx, e := range slice1 {
		t.Log(idx, e)
	}
	// 3. range with value only
	for _, e := range slice1 {
		t.Log(e)
	}
}
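One detail of the range forms above is worth illustrating: the loop value is a copy of the element, so assigning to it does not change the slice. A small sketch with two hypothetical helpers:

```go
package main

import "fmt"

// doubleByValue tries to double every element through the range value.
// e is a copy of the element, so s is left unchanged.
func doubleByValue(s []int) {
	for _, e := range s {
		e *= 2
		_ = e // the write never reaches s
	}
}

// doubleByIndex doubles every element through its index,
// which writes to the underlying array.
func doubleByIndex(s []int) {
	for i := range s {
		s[i] *= 2
	}
}

func main() {
	s := []int{1, 2, 3, 4}
	doubleByValue(s)
	fmt.Println(s) // [1 2 3 4]
	doubleByIndex(s)
	fmt.Println(s) // [2 4 6 8]
}
```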
reversal
func TestSliceReverse(t *testing.T) {
	a := []int{1, 2, 3, 4, 5}
	// swap from both ends toward the middle
	for left, right := 0, len(a)-1; left < right; left, right = left+1, right-1 {
		a[left], a[right] = a[right], a[left]
	}
	t.Log(a, len(a), cap(a)) // [5 4 3 2 1] 5 5
}
Copy
During development we often copy one variable to another. Depending on the type, this is either a deep copy or a shallow copy. This section explains the difference between the two
Deep copy
A deep copy copies the data itself into a new object. The new object occupies newly allocated memory and shares nothing with the original, so modifying the new object does not affect the original. Because their memory addresses differ, the two can also be released independently
For value types such as arrays, ints, strings, structs, floats and bools, plain assignment is already a deep copy (a struct that contains reference-type fields still shares the data those fields point to). To deep-copy reference-type data, you need a helper function
For example, Go's built-in copy function copies the elements of the source slice into the destination slice and returns the number of elements copied. Both slices must have the same element type. The number of elements copied is the smaller of the two lengths: copying stops as soon as the shorter slice has been filled or exhausted
/**
 * Deep copy
 */
func TestSliceDeepCopy(t *testing.T) {
	slice1 := []int{1, 2, 3, 4, 5}
	slice2 := make([]int, 5, 5)
	// Deep copy
	copy(slice2, slice1)
	t.Log(slice1, len(slice1), cap(slice1)) // [1 2 3 4 5] 5 5
	t.Log(slice2, len(slice2), cap(slice2)) // [1 2 3 4 5] 5 5
	slice1[1] = 100
	t.Log(slice1, len(slice1), cap(slice1)) // [1 100 3 4 5] 5 5
	t.Log(slice2, len(slice2), cap(slice2)) // [1 2 3 4 5] 5 5
}
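The rule that copy is bounded by the shorter slice is a common source of bugs, for instance when the destination was created with length 0. The sketch below shows the returned count; copyCount is a hypothetical helper.

```go
package main

import "fmt"

// copyCount copies src into a fresh destination of length dstLen and
// returns how many elements copy reported, together with the destination.
func copyCount(dstLen int, src []int) (int, []int) {
	dst := make([]int, dstLen)
	n := copy(dst, src)
	return n, dst
}

func main() {
	src := []int{1, 2, 3}
	fmt.Println(copyCount(2, src)) // 2 [1 2]
	fmt.Println(copyCount(5, src)) // 3 [1 2 3 0 0]
	fmt.Println(copyCount(0, src)) // 0 []
}
```

copy looks at len, not cap: a destination made with make([]int, 0, 5) receives nothing.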
Shallow copy
A shallow copy copies only the address of the data, that is, the pointer to the object. The new object and the old object then point to the same memory, so modifying the value through the new object also changes the old one, and the shared memory can only be released once neither object references it.
All data of reference type is shallow-copied by default, such as slice, map, etc
After a shallow copy, the target slice and the source slice point to the same underlying array. Any change to an array element is visible through both slices.
/**
 * Shallow copy
 */
func TestSliceShallowCopy(t *testing.T) {
	slice1 := []int{1, 2, 3, 4, 5}
	// Shallow copy (Note: = is a shallow copy for reference types and a deep copy for value types)
	slice2 := slice1
	t.Logf("%p", slice1) // 0xc00001c120
	t.Logf("%p", slice2) // 0xc00001c120
	// Both slices change together, so this is a shallow copy. Before any capacity expansion, modifying an element of slice1 also modifies slice2
	slice1[0] = 10
	t.Log(slice1, len(slice1), cap(slice1)) // [10 2 3 4 5] 5 5
	t.Log(slice2, len(slice2), cap(slice2)) // [10 2 3 4 5] 5 5
	// Note: after capacity expansion, slice1 and slice2 no longer point to the same array. Modifying slice1 no longer modifies slice2
	slice1 = append(slice1, 5, 6, 7, 8)
	slice1[0] = 11 // slice1[0] is changed to 11, while slice2[0] is still 10
	t.Log(slice1, len(slice1), cap(slice1)) // [11 2 3 4 5 5 6 7 8] 9 10
	t.Log(slice2, len(slice2), cap(slice2)) // [10 2 3 4 5] 5 5
}
When a slice is copied by assignment, the pointer to its array is copied as well. Until an append triggers the expansion logic, the two slices point to the same array; after expansion they point to different arrays
Capacity expansion
Capacity expansion happens inside append: when the slice's cap is insufficient to hold the new elements, a larger underlying array is allocated and the existing elements are copied into it
Source code: https://github.com/golang/go/...
func growslice(et *_type, old slice, cap int) slice {
	// Omit some judgment
	newcap := old.cap
	doublecap := newcap + newcap
	if cap > doublecap {
		newcap = cap
	} else {
		if old.len < 1024 {
			newcap = doublecap
		} else {
			// Check 0 < newcap to detect overflow
			// and prevent an infinite loop.
			for 0 < newcap && newcap < cap {
				newcap += newcap / 4
			}
			// Set newcap to the requested cap when
			// the newcap calculation overflowed.
			if newcap <= 0 {
				newcap = cap
			}
		}
	}
	// Omit some follow-up
}
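- If the requested capacity is more than twice the original capacity, the new capacity equals the requested capacity
- Otherwise, if the original slice length is less than 1024, the capacity is doubled
- Otherwise (length greater than or equal to 1024), the capacity grows by a factor of 1.25 at a time until it is large enough
These rules describe the growslice version quoted above, which predates Go 1.18. Since Go 1.18 the threshold is 256 rather than 1024 and the growth factor decreases gradually from 2 toward 1.25. In every version the result is afterwards rounded up to an allocator size class, so the observed capacity can be slightly larger than the rule predicts.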
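To make the rule concrete, the branch structure of the quoted growslice can be transcribed into a standalone function. This is only a sketch of the older rule shown above: nextCap is a hypothetical name, newer Go releases differ in the details, and the rounding to allocator size classes that the runtime applies afterwards is deliberately left out.

```go
package main

import "fmt"

// nextCap mirrors the capacity calculation of the growslice excerpt above.
// Assumption: it models only that older rule and ignores size-class rounding,
// so real capacities reported by cap() can be larger.
func nextCap(oldLen, oldCap, needed int) int {
	newcap := oldCap
	doublecap := newcap + newcap
	if needed > doublecap {
		return needed
	}
	if oldLen < 1024 {
		return doublecap
	}
	// Grow by 25% at a time; 0 < newcap guards against overflow.
	for 0 < newcap && newcap < needed {
		newcap += newcap / 4
	}
	if newcap <= 0 {
		return needed
	}
	return newcap
}

func main() {
	fmt.Println(nextCap(4, 4, 5))          // 8: doubled
	fmt.Println(nextCap(3, 3, 10))         // 10: request exceeds double
	fmt.Println(nextCap(1024, 1024, 1025)) // 1280: grown by 1.25x
}
```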
Memory leak
Because a slice is backed by an array, it often happens that the array is large while the slice references only a few of its elements, so most of the space occupied by the array is wasted
Case1:
For example, in the following code, if the incoming slice b is large and a small part of it is assigned to the global variable a, the part of b that a does not reference (the data from index 1 onward) cannot be released, resulting in a so-called memory leak.
var a []int

func test(b []int) {
	a = b[:1] // a and b share one underlying array
}
As long as the global variable a is alive, the underlying array of b will not be garbage collected.
How to avoid?
Note that if we use only a small part of a slice, the entire underlying array still stays in memory. When the underlying array is large, or when this pattern occurs in many places, memory usage can rise sharply and the program may crash.
Therefore, in such a scenario, we can copy the required elements to a new slice to reduce memory usage
var a []int

func test(b []int) {
	a = make([]int, 1)
	copy(a, b[:1]) // a gets its own one-element array
}
Case2:
For example, the following function returns only a small part of a slice, so the large underlying array cannot be reclaimed after the function exits
func test2() []int {
	s := make([]int, 0, 10000)
	for i := 0; i < 10000; i++ {
		s = append(s, i) // stand-in for some computed value
	}
	s2 := s[100:102] // still references the 10000-element array
	return s2
}
How to avoid?
Copy the required elements to a new slice to reduce memory usage
func test2() []int {
	s := make([]int, 0, 10000)
	for i := 0; i < 10000; i++ {
		// Some calculations
		s = append(s, i) // stand-in for some computed value
	}
	s2 := make([]int, 2)
	copy(s2, s[100:102]) // s2 owns a 2-element array; s can be reclaimed
	return s2
}
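The make-plus-copy pattern can also be written as a single append onto a nil slice. The sketch below wraps it in a hypothetical helper, cloneRange, and checks that the result no longer shares memory with the large slice.

```go
package main

import "fmt"

// cloneRange returns a copy of s[lo:hi] backed by its own small array,
// so the caller no longer keeps s's underlying array alive.
func cloneRange(s []int, lo, hi int) []int {
	return append([]int(nil), s[lo:hi]...)
}

func main() {
	big := make([]int, 10000)
	big[100] = 7

	small := cloneRange(big, 100, 102)
	big[100] = 9 // does not affect the clone

	fmt.Println(small, len(small), cap(small) < len(big)) // [7 0] 2 true
}
```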
Slice vs. array
An array has a fixed length, which must be specified at initialization. If no length is specified, the value is a slice
An array is a value type. Assigning an array to another array is a deep copy: assignment and function parameter passing copy the entire array data and occupy additional memory. A slice is a reference type. Assigning a slice to another slice is a shallow copy: assignment and function parameter passing copy only the slice header (the array pointer, len and cap), while the underlying array is shared and no additional memory is needed for the elements.
// a is an array. Note that an array has a fixed length, which must be specified at initialization; without a length it is a slice
a := [3]int{1, 2, 3}
// b is an array, a deep copy of a
b := a
// c is a slice, a reference type, and its underlying array is a
c := a[:]
for i := 0; i < len(a); i++ {
	a[i] = a[i] + 1
}
// After changing a: b is a copy of a and remains unchanged; c refers to a, so c changes
fmt.Println(a) // [2 3 4]
fmt.Println(b) // [1 2 3]
fmt.Println(c) // [2 3 4]

// a is a slice, because no length is specified
a := []int{1, 2, 3}
// b is a slice, a shallow copy of a
b := a
// c is a slice and a reference type
c := a[:]
for i := 0; i < len(a); i++ {
	a[i] = a[i] + 1
}
// After changing a: b is a shallow copy of a, so b changes; c refers to the same array, so c changes too
fmt.Println(a) // [2 3 4]
fmt.Println(b) // [2 3 4]
fmt.Println(c) // [2 3 4]
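The same difference shows up when arrays and slices are passed to functions. A minimal sketch with three hypothetical helpers:

```go
package main

import "fmt"

// setFirstArray receives a copy of the whole array; the caller's array is untouched.
func setFirstArray(a [3]int) { a[0] = 100 }

// setFirstSlice receives a copy of the slice header; the underlying array is shared.
func setFirstSlice(s []int) { s[0] = 100 }

// setFirstArrayPtr receives a pointer, so it modifies the caller's array without copying it.
func setFirstArrayPtr(a *[3]int) { a[0] = 200 }

func main() {
	arr := [3]int{1, 2, 3}

	setFirstArray(arr)
	fmt.Println(arr) // [1 2 3]

	setFirstSlice(arr[:])
	fmt.Println(arr) // [100 2 3]

	setFirstArrayPtr(&arr)
	fmt.Println(arr) // [200 2 3]
}
```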
summary
- When creating a slice, pre-allocate the capacity according to actual needs to avoid expansion during appends as far as possible, which improves performance
- Using append() to add elements to a slice may trigger capacity expansion; after expansion the returned slice points to a new underlying array
- len() and cap() run in O(1) time: they read fields of the slice header and do not traverse the slice
- Slices are not thread-safe. To achieve thread safety, use a lock or a channel
- When a large array is used as a function parameter, the entire array is copied, which consumes a lot of memory. Use a slice or a pointer instead
- When a slice is used as a function parameter, the function can change the array the slice points to, but not the caller's len and cap. To change the slice itself, return the changed slice or pass a pointer to the slice.
- If only a small part of the large slice is used, it is recommended to copy the required slice to a new slice to reduce the memory occupation
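The first point of the summary can be observed by counting how often the capacity changes while appending. This is a rough sketch; countReallocs is a hypothetical helper, and the exact count for the non-preallocated case depends on the Go version.

```go
package main

import "fmt"

// countReallocs appends n ints to s and counts how often the capacity
// changes, which corresponds to how often append had to grow the array.
func countReallocs(s []int, n int) int {
	changes := 0
	prev := cap(s)
	for i := 0; i < n; i++ {
		s = append(s, i)
		if cap(s) != prev {
			changes++
			prev = cap(s)
		}
	}
	return changes
}

func main() {
	fmt.Println(countReallocs(nil, 1000))                    // several expansions
	fmt.Println(countReallocs(make([]int, 0, 1000), 1000))   // 0
}
```

Each expansion allocates a new array and copies all existing elements, so pre-allocating pays off most for slices whose final size is known in advance.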