for & range performance comparison

Posted by lip9000 on Thu, 10 Feb 2022 08:56:34 +0100

Background

I have recently been working through LeetCode problems in Go.
The first problem is Two Sum. I solved it by traversing the input, and happened to notice that the "beats" percentage of my submission differed depending on whether I used for or range. In the spirit of digging deeper (I am no expert), I decided to compare the performance of the two.
This article draws on an original post by Geek rabbit.

Exploration

Since we want to compare, let the data speak.
The go test command not only runs unit tests but also supports benchmarks. The details of using it are left to the reader; this article does not cover them in depth.
Basic command:

go test -bench .
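For readers who have not written a benchmark before, here is a minimal sketch of the shape one takes. It is not part of the measurements in this article: sumSlice and benchSum are names made up for illustration, and the program calls testing.Benchmark directly so that it can be tried with go run instead of living in a _test.go file.

```go
package main

import (
	"fmt"
	"testing"
)

// sumSlice is a hypothetical function under test.
func sumSlice(nums []int) int {
	total := 0
	for _, n := range nums {
		total += n
	}
	return total
}

// benchSum has the same shape as a BenchmarkXxx function in a _test.go file.
func benchSum(b *testing.B) {
	nums := make([]int, 1024)
	for i := range nums {
		nums[i] = i
	}
	b.ResetTimer() // keep the setup above out of the measurement
	for i := 0; i < b.N; i++ {
		_ = sumSlice(nums)
	}
}

func main() {
	// testing.Benchmark chooses b.N on its own, as go test -bench does.
	res := testing.Benchmark(benchSum)
	fmt.Println(res.N, "iterations,", res.NsPerOp(), "ns/op")
}
```

In a real project the same function would be named BenchmarkSum, placed in a file ending in _test.go, and run with the command above.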

First, benchmark the traversal of a slice of int:

import (
    "math/rand"
    "testing"
    "time"
)

// genIntSlice returns a slice of n random ints.
func genIntSlice(n int) []int {
    rand.Seed(time.Now().UnixNano())
    nums := make([]int, 0, n)
    for i := 0; i < n; i++ {
        nums = append(nums, rand.Int())
    }
    return nums
}

func BenchmarkForIntSlice(b *testing.B) {
    nums := genIntSlice(1024 * 1024)
    b.ResetTimer() // exclude slice generation from the measurement
    for i := 0; i < b.N; i++ {
        length := len(nums)
        var tmp int
        for k := 0; k < length; k++ {
            tmp = nums[k]
        }
        _ = tmp
    }
}

func BenchmarkRangeIntSlice(b *testing.B) {
    nums := genIntSlice(1024 * 1024)
    b.ResetTimer() // exclude slice generation from the measurement
    for i := 0; i < b.N; i++ {
        var tmp int
        for _, num := range nums {
            tmp = num
        }
        _ = tmp
    }
}

The results are as follows:


There is no meaningful difference. Does that mean for and range always perform the same? Let's test again with a more complex type.

Going deeper

Let's see how the two compare when traversing an array of structs:

type Item struct {
    id int
    val [4096]byte
}

func BenchmarkForStruct(b *testing.B) {
    var items [1024]Item
    for i := 0; i < b.N; i++ {
        length := len(items)
        var tmp int
        for k := 0; k < length; k++ {
            tmp = items[k].id
        }
        _ = tmp
    }
}

func BenchmarkRangeIndexStruct(b *testing.B) {
    var items [1024]Item
    for i := 0; i < b.N; i++ {
        var tmp int
        for k := range items {
            tmp = items[k].id
        }
        _ = tmp
    }
}

func BenchmarkRangeStruct(b *testing.B) {
    var items [1024]Item
    for i := 0; i < b.N; i++ {
        var tmp int
        for _, item := range items {
            tmp = item.id
        }
        _ = tmp
    }
}

The results are as follows:

The difference is obvious, isn't it!

  • When only the index is iterated, for and range perform almost the same.
  • Each element of items is of the struct type Item, which has two fields: an int and a [4096]byte. In other words, each Item value occupies about 4 KB of memory.
  • In this example, for is about 2000 times faster than range when range iterates over both the index and the value.
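The "about 4 KB" figure can be checked with unsafe.Sizeof. The sketch below repeats the Item definition so that it runs on its own; itemSize and pointerSize are helper names introduced here for illustration. On a typical 64-bit platform one would expect 4104 bytes for an Item (an 8-byte int plus the 4096-byte array) and 8 bytes for a *Item, but the exact numbers depend on the platform.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Item mirrors the struct used in the benchmarks above.
type Item struct {
	id  int
	val [4096]byte
}

// itemSize reports how many bytes one Item value occupies.
func itemSize() uintptr {
	return unsafe.Sizeof(Item{})
}

// pointerSize reports how many bytes one *Item occupies.
func pointerSize() uintptr {
	var p *Item
	return unsafe.Sizeof(p)
}

func main() {
	fmt.Println("Item: ", itemSize(), "bytes")
	fmt.Println("*Item:", pointerSize(), "bytes")
}
```

This is the amount of data that has to be copied into the loop variable on every iteration when range yields the value.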

Going one step further, what happens if the elements are pointers?

func generateItems(n int) []*Item {
    items := make([]*Item, 0, n)
    for i := 0; i < n; i++ {
        items = append(items, &Item{id: i})
    }
    return items
}

func BenchmarkForPointer(b *testing.B) {
    items := generateItems(1024)
    b.ResetTimer() // exclude item allocation from the measurement
    for i := 0; i < b.N; i++ {
        length := len(items)
        var tmp int
        for k := 0; k < length; k++ {
            tmp = items[k].id
        }
        _ = tmp
    }
}

func BenchmarkRangePointer(b *testing.B) {
    items := generateItems(1024)
    b.ResetTimer() // exclude item allocation from the measurement
    for i := 0; i < b.N; i++ {
        var tmp int
        for _, item := range items {
            tmp = item.id
        }
        _ = tmp
    }
}

The results are as follows:

As you can see, there is little difference.

The reason is that range returns a copy of each element during iteration. With a slice of pointers, only the pointer is copied, not the 4 KB struct it points to. The copying behavior is easy to verify:

persons := []struct{ no int }{{no: 1}, {no: 2}, {no: 3}}
for _, s := range persons {
    s.no += 10
}
for i := 0; i < len(persons); i++ {
    persons[i].no += 100
}
fmt.Println(persons) // [{101} {102} {103}]
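The same behavior can be spelled out as three small functions. This is only a sketch: the person type and the bumpBy... names are made up here, and the third function assumes the slice holds pointers rather than values.

```go
package main

import "fmt"

type person struct{ no int }

// bumpByValue updates the range value, which is only a copy of each element.
func bumpByValue(ps []person) {
	for _, p := range ps {
		p.no += 10 // changes the copy, not ps[i]
	}
}

// bumpByIndex updates the elements themselves through the index.
func bumpByIndex(ps []person) {
	for i := range ps {
		ps[i].no += 10
	}
}

// bumpByPointer ranges over pointers: the copy made by range is a copy of
// the pointer, which still refers to the original struct.
func bumpByPointer(ps []*person) {
	for _, p := range ps {
		p.no += 10
	}
}

func main() {
	vals := []person{{no: 1}, {no: 2}, {no: 3}}
	bumpByValue(vals)
	fmt.Println(vals) // unchanged: [{1} {2} {3}]
	bumpByIndex(vals)
	fmt.Println(vals) // [{11} {12} {13}]

	ptrs := []*person{{no: 1}, {no: 2}}
	bumpByPointer(ptrs)
	fmt.Println(ptrs[0].no, ptrs[1].no) // 11 12
}
```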

Conclusion

range returns a copy of each element during iteration. If each element is small, as in []int, for and range perform almost identically. If each element is large, for example a struct with many fields, for can be significantly faster than range, sometimes by a factor of thousands. In that case, prefer for. If you do use range, iterate over the index only and access each element through it; this form performs no differently from for. If you want range to yield both the index and the value, change the elements of the slice or array to pointers so that performance is not affected.
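As a rough sketch of these recommendations, the three functions below compute the same sum over the large Item struct. The function names are made up for illustration. sumByValue is the form to avoid for large elements, sumByIndex ranges over the index only and takes a pointer to the element, and sumByPointer assumes the slice already stores pointers.

```go
package main

import "fmt"

// Item mirrors the large struct from the benchmarks.
type Item struct {
	id  int
	val [4096]byte
}

// sumByValue lets range copy each Item into the loop variable.
func sumByValue(items []Item) int {
	total := 0
	for _, item := range items {
		total += item.id
	}
	return total
}

// sumByIndex ranges over the index only and reads the element in place.
func sumByIndex(items []Item) int {
	total := 0
	for i := range items {
		item := &items[i] // pointer to the element instead of a copy of it
		total += item.id
	}
	return total
}

// sumByPointer ranges over a slice of pointers, so only pointers are copied.
func sumByPointer(items []*Item) int {
	total := 0
	for _, item := range items {
		total += item.id
	}
	return total
}

func main() {
	items := make([]Item, 4)
	ptrs := make([]*Item, len(items))
	for i := range items {
		items[i].id = i + 1
		ptrs[i] = &items[i]
	}
	fmt.Println(sumByValue(items), sumByIndex(items), sumByPointer(ptrs)) // 10 10 10
}
```

All three return the same result; they differ only in how much data the loop variable receives on each iteration.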
