The strongest code self-testing method ever, bar none!

Posted by egg82 on Mon, 07 Mar 2022 09:00:58 +0100

A special note: this really is not clickbait. I have been writing code for more than 20 years, and I truly think go fuzzing is the most awesome code self-testing method I have ever seen. When I used the Aho-Corasick (AC) automaton algorithm to improve keyword filtering efficiency (by ~50%), and when I reworked the way mapreduce handles panics, go fuzzing found edge-case bugs in both. So I firmly believe this is one of the most awesome code self-testing methods I have ever seen!

So far, go fuzzing has found more than 200 bugs in the Go standard library, even though its code quality is very high. See: github.com/dvyukov/go-fuzz#trophie...

A common Spring Festival blessing for programmers is "may your code never have bugs!" It is a joke, but the truth is that every programmer writes bugs every day. The claim that code has no bugs can only be falsified, never proven. The upcoming Go 1.18 officially ships a great tool to help us do the falsifying: go fuzzing.

What people care about most in Go 1.18 is generics. However, I really think go fuzzing is one of the most useful features of Go 1.18!

Let's take a closer look at fuzzing in this article:

What is it?
How to use it?
What are the best practices?

First, you need to upgrade to Go 1.18

Although Go 1.18 has not been officially released yet, you can download the RC version. Even if you run an earlier version of Go in production, you can still use go fuzzing in your development environment to find bugs.

What is go fuzzing

According to the official documentation, go fuzzing automatically tests a program by continuously feeding it different inputs, and intelligently looks for failing cases by analyzing code coverage. This approach covers edge cases as thoroughly as possible. In my own hands-on testing, it really did find problems that are hard to find in everyday work.

How to use go fuzzing

The official documentation gives some rules for writing fuzz tests:

The function name must start with Fuzz, its only parameter is *testing.F, and it has no return value
Fuzz tests must be placed in *_test.go files
The fuzz target is the function passed to (*testing.F).Fuzz. Its first parameter is *testing.T, followed by the parameters called fuzzing arguments, and it has no return value
There can only be one fuzz target in each fuzz test
When calling f.Add(...), the arguments must match the fuzzing arguments in order and type
Fuzzing arguments only support the following types:
- string, []byte
- int, int8, int16, int32/rune, int64
- uint, uint8/byte, uint16, uint32, uint64
- float32, float64
- bool
The fuzz target should not depend on global state, because it will be run in parallel.
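
To make these rules concrete, here is a minimal sketch. The truncate function and the FuzzTruncate test are hypothetical and made up only for illustration, and the package name is a placeholder. It shows the Fuzz prefix, the single *testing.F parameter, an f.Add call whose arguments match the fuzzing arguments in order and type, and a single fuzz target that keeps no global state.

```go
// Save in a file whose name ends in _test.go, for example truncate_test.go.
package main

import (
	"strings"
	"testing"
)

// truncate is a hypothetical function under test: it returns at most
// limit bytes of s, treating a negative limit as zero.
func truncate(s string, limit int) string {
	if limit < 0 {
		limit = 0
	}
	if len(s) > limit {
		return s[:limit]
	}
	return s
}

// FuzzTruncate starts with "Fuzz", takes only *testing.F, and returns nothing.
func FuzzTruncate(f *testing.F) {
	// Seed value: (string, int) matches the fuzzing arguments below in order and type.
	f.Add("hello", 3)

	// The one and only fuzz target: *testing.T first, then the fuzzing arguments.
	f.Fuzz(func(t *testing.T, s string, limit int) {
		got := truncate(s, limit)
		if !strings.HasPrefix(s, got) {
			t.Errorf("truncate(%q, %d) = %q, which is not a prefix of the input", s, limit, got)
		}
		if limit >= 0 && len(got) > limit {
			t.Errorf("truncate(%q, %d) returned %d bytes", s, limit, len(got))
		}
	})
}
```

The checks inside the target are properties that should hold for any input, which is what lets go fuzzing pick the arguments freely.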

Running fuzzing tests

If I write a fuzzing test, for example:

// See for specific code https://github.com/zeromicro/go-zero/blob/master/core/mr/mapreduce_fuzz_test.go
func FuzzMapReduce(f *testing.F) {
  ...
}

Then we can do this:

go test -fuzz=MapReduce

We will get similar results as follows:

fuzz: elapsed: 0s, gathering baseline coverage: 0/2 completed
fuzz: elapsed: 0s, gathering baseline coverage: 2/2 completed, now fuzzing with 10 workers
fuzz: elapsed: 3s, execs: 3338 (1112/sec), new interesting: 56 (total: 57)
fuzz: elapsed: 6s, execs: 6770 (1144/sec), new interesting: 62 (total: 63)
fuzz: elapsed: 9s, execs: 10157 (1129/sec), new interesting: 69 (total: 70)
fuzz: elapsed: 12s, execs: 13586 (1143/sec), new interesting: 72 (total: 73)
^Cfuzz: elapsed: 13s, execs: 14031 (1084/sec), new interesting: 72 (total: 73)
PASS
ok    github.com/zeromicro/go-zero/core/mr  13.169s

The ^C is where I pressed Ctrl-C to stop the test. For a detailed explanation of the output, please refer to the official documentation.

Best practices from go-zero

Based on my experience so far, I have tentatively summarized the best practices into the following four steps:

Define the fuzzing arguments. You first have to work out which fuzzing arguments to define, and then write the fuzz target based on them
Think about how to write the fuzz target. The key point is how to verify that the result is correct. Because the fuzzing arguments are given "randomly", you need a general way to verify the result
Think about how to print the input of failing cases, so that a new unit test can be generated from it
Write a new unit test from the printed output of the failed fuzz test. This unit test is used to debug and fix the problem found by fuzzing, and it stays in the code base for CI

Next, we will walk through these steps with the simplest possible function, an array summation. The actual go-zero cases are slightly more complex. At the end of the article, I will list the real-world go-zero cases for your reference.

Here is a summation implementation with a bug injected on purpose:

func Sum(vals []int64) int64 {
  var total int64

  for _, val := range vals {
    // Injected bug: values that are multiples of 100000 are skipped.
    if val%1e5 != 0 {
      total += val
    }
  }

  return total
}

1. Define fuzzing arguments

You need at least one fuzzing argument, otherwise go fuzzing has nothing to generate test inputs for. So even when there is no natural input, we need to define a fuzzing argument that affects the result. Here we use the number of slice elements as the fuzzing argument, and go fuzzing will automatically generate different values for it, guided by code coverage, to drive the test.

func FuzzSum(f *testing.F) {
  f.Add(10)
  f.Fuzz(func(t *testing.T, n int) {
    n %= 20
    ...
  })
}

Here n lets go fuzzing simulate the number of slice elements. To keep the number of elements from getting too large, we limit it to fewer than 20 (0 is fine too), and we add a seed value of 10 (in go fuzzing, the seed values are called the corpus). This value just gives go fuzzing a cold start. The specific number is not important.
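
One detail worth knowing here: in Go, the remainder takes the sign of the dividend, so n %= 20 yields values between -19 and 19. For a negative n the loop in the later steps simply does not run, so those inputs all test the empty slice. That is harmless, but if you would rather use every input, a small helper can fold n into a non-negative range. The boundedLen name below is hypothetical; it is only a sketch of the idea.

```go
package main

// boundedLen is a hypothetical helper that maps any fuzzer-chosen int
// into the range [0, limit). It assumes limit is greater than zero.
func boundedLen(n, limit int) int {
	n %= limit // result has the sign of n, so it may be negative
	if n < 0 {
		n += limit // shift negative remainders into [0, limit)
	}
	return n
}
```

Inside the fuzz target, n = boundedLen(n, 20) would then take the place of n %= 20.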

2. How to write the fuzz target

The key point of this step is writing a fuzz target whose result can be verified. While writing the test code from the given fuzzing arguments, we also need to produce the data that lets us check the result.

For our Sum function this is fairly simple: randomly generate a slice of n elements and add the elements up ourselves to get the expected result. As follows:

func FuzzSum(f *testing.F) {
  rand.Seed(time.Now().UnixNano())

  f.Add(10)
  f.Fuzz(func(t *testing.T, n int) {
    n %= 20
    var vals []int64
    var expect int64
    for i := 0; i < n; i++ {
      val := rand.Int63() % 1e6
      vals = append(vals, val)
      expect += val
    }

    assert.Equal(t, expect, Sum(vals))
  })
}

This code is easy to understand: it computes the sum itself and compares it with the result of Sum, so I will not explain it in detail. In complex scenarios you need to think carefully about how to write the verification code, but it should not be too difficult. If it is, you probably do not understand the function under test well enough, or have not simplified it enough.
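
As a side note, the slice values in this version come from a time-seeded generator, so the only thing go fuzzing knows about (and can save to its corpus) is n. One possible variation, sketched below, is to make the seed a fuzzing argument too, so that the same fuzzing arguments always rebuild the same slice. The genCase helper is hypothetical and not part of the original code.

```go
package main

import "math/rand"

// genCase is a hypothetical helper. It derives both the input slice and the
// expected sum from seed and n alone, so identical arguments always produce
// an identical test case.
func genCase(seed int64, n int) (vals []int64, expect int64) {
	r := rand.New(rand.NewSource(seed)) // local generator, no global state
	for i := 0; i < n; i++ {
		val := r.Int63() % 1e6
		vals = append(vals, val)
		expect += val
	}
	return vals, expect
}
```

With this helper, the fuzz test would call f.Add(int64(1), 10), and the fuzz target would take (t *testing.T, seed int64, n int) and compare Sum(vals) against expect from genCase(seed, n). The local generator also avoids sharing the global random source between parallel runs.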

With FuzzSum written, you can run the fuzz test with the following command. The result looks similar to this:

$ go test -fuzz=Sum
fuzz: elapsed: 0s, gathering baseline coverage: 0/2 completed
fuzz: elapsed: 0s, gathering baseline coverage: 2/2 completed, now fuzzing with 10 workers
fuzz: elapsed: 0s, execs: 6672 (33646/sec), new interesting: 7 (total: 6)
--- FAIL: FuzzSum (0.21s)
    --- FAIL: FuzzSum (0.00s)
        sum_fuzz_test.go:34:
              Error Trace:  sum_fuzz_test.go:34
                                  value.go:556
                                  value.go:339
                                  fuzz.go:334
              Error:        Not equal:
                            expected: 8736932
                            actual  : 8636932
              Test:         FuzzSum

    Failing input written to testdata/fuzz/FuzzSum/739002313aceff0ff5ef993030bbde9115541cabee2554e6c9f3faaf581f2004
    To re-run:
    go test -run=FuzzSum/739002313aceff0ff5ef993030bbde9115541cabee2554e6c9f3faaf581f2004
FAIL
exit status 1
FAIL  github.com/kevwan/fuzzing  0.614s

So here comes the problem! We can see that the result is wrong, but it is hard to tell why. Take a close look at the output above: how would you analyze it?

3. How to print the input of failing cases

For the failed test above, if we could print the input and turn it into a simple test case, we could debug it directly. Ideally the printed input can be copied and pasted straight into the new test case. If the format is off, you have to fix up that many lines of input one by one, which is tedious, and there may well be more than one failing case.

So we changed the code to the following:

func FuzzSum(f *testing.F) {
  rand.Seed(time.Now().UnixNano())

  f.Add(10)
  f.Fuzz(func(t *testing.T, n int) {
    n %= 20
    var vals []int64
    var expect int64
    var buf strings.Builder
    buf.WriteString("\n")
    for i := 0; i < n; i++ {
      val := rand.Int63() % 1e6
      vals = append(vals, val)
      expect += val
      buf.WriteString(fmt.Sprintf("%d,\n", val))
    }

    assert.Equal(t, expect, Sum(vals), buf.String())
  })
}

Run the command again and get the following results:

$ go test -fuzz=Sum
fuzz: elapsed: 0s, gathering baseline coverage: 0/2 completed
fuzz: elapsed: 0s, gathering baseline coverage: 2/2 completed, now fuzzing with 10 workers
fuzz: elapsed: 0s, execs: 1402 (10028/sec), new interesting: 10 (total: 8)
--- FAIL: FuzzSum (0.16s)
    --- FAIL: FuzzSum (0.00s)
        sum_fuzz_test.go:34:
              Error Trace:  sum_fuzz_test.go:34
                                  value.go:556
                                  value.go:339
                                  fuzz.go:334
              Error:        Not equal:
                            expected: 5823336
                            actual  : 5623336
              Test:         FuzzSum
              Messages:
                            799023,
                            110387,
                            811082,
                            115543,
                            859422,
                            997646,
                            200000,
                            399008,
                            7905,
                            931332,
                            591988,

    Failing input written to testdata/fuzz/FuzzSum/26d024acf85aae88f3291bf7e1c6f473eab8b051f2adb1bf05d4491bc49f5767
    To re-run:
    go test -run=FuzzSum/26d024acf85aae88f3291bf7e1c6f473eab8b051f2adb1bf05d4491bc49f5767
FAIL
exit status 1
FAIL  github.com/kevwan/fuzzing  0.602s
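
The printing can go one step further and emit a complete Go slice literal, so that nothing has to be typed around the pasted numbers. This is only a sketch; formatVals is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// formatVals is a hypothetical helper that renders vals as a Go slice
// literal, one value per line, ready to paste into a unit test.
func formatVals(vals []int64) string {
	var buf strings.Builder
	buf.WriteString("[]int64{\n")
	for _, v := range vals {
		fmt.Fprintf(&buf, "\t%d,\n", v)
	}
	buf.WriteString("}")
	return buf.String()
}
```

The result would be passed as the message argument of assert.Equal in place of buf.String(). For short inputs, fmt.Sprintf("%#v", vals) gives a similar single-line literal.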

4. Write new test cases

From the output of the failing case above, we can copy and paste our way to the following code. Of course, you write the test scaffold yourself. The input values can be copied directly.

func TestSumFuzzCase1(t *testing.T) {
  vals := []int64{
    799023,
    110387,
    811082,
    115543,
    859422,
    997646,
    200000,
    399008,
    7905,
    931332,
    591988,
  }
  assert.Equal(t, int64(5823336), Sum(vals))
}

In this way, we can easily debug and add an effective unit test to ensure that the bug will never appear again.
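
Since there may be more than one failing case, it can help to keep the captured inputs in a table instead of writing one function per case. The sketch below is one possible layout; fuzzCase, capturedCases and checkCases are hypothetical names, and the single entry is the failure shown above.

```go
package main

// fuzzCase is a hypothetical record of one failure captured from fuzzing output.
type fuzzCase struct {
	name   string
	vals   []int64
	expect int64
}

// capturedCases collects inputs printed by failed fuzzing runs.
// New failures are appended here as they are found.
var capturedCases = []fuzzCase{
	{
		name: "FuzzCase1",
		vals: []int64{
			799023, 110387, 811082, 115543, 859422, 997646,
			200000, 399008, 7905, 931332, 591988,
		},
		expect: 5823336,
	},
}

// checkCases runs sum over every captured case and returns the names of
// the cases whose result does not match the expected value.
func checkCases(sum func([]int64) int64) []string {
	var failed []string
	for _, c := range capturedCases {
		if sum(c.vals) != c.expect {
			failed = append(failed, c.name)
		}
	}
	return failed
}
```

A regular unit test would then call checkCases(Sum) and report each returned name with t.Errorf. Looping over capturedCases with t.Run subtests is another common way to get the same effect.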

More go fuzzing experience

Go version problem

I believe that after Go 1.18 is released, most projects will not upgrade their production code to 1.18 right away. So what do we do when testing.F, which go fuzzing introduces, cannot be used?

Even if production (go.mod) is not upgraded to Go 1.18, we fully recommend upgrading your local machine. Then we only need to put the FuzzSum above into a file with a name like sum_fuzz_test.go and add the following directives at the top of the file:

//go:build go1.18
// +build go1.18

Note: the third line must be blank, otherwise the directives become part of the package comment.

This way, no matter which Go version production uses, the build will not report errors. Since we generally run fuzz tests locally, they are not affected either.

go fuzzing failures that cannot be reproduced

The steps above cover the simple cases. Sometimes, though, we build a new unit test from the input of a failing case and the problem does not reproduce (goroutine deadlocks in particular), and then things get complicated. Have a look at the following output to get a feel for it:

go test -fuzz=MapReduce
fuzz: elapsed: 0s, gathering baseline coverage: 0/2 completed
fuzz: elapsed: 0s, gathering baseline coverage: 2/2 completed, now fuzzing with 10 workers
fuzz: elapsed: 3s, execs: 3681 (1227/sec), new interesting: 54 (total: 55)
...
fuzz: elapsed: 1m21s, execs: 92705 (1101/sec), new interesting: 85 (total: 86)
--- FAIL: FuzzMapReduce (80.96s)
    fuzzing process hung or terminated unexpectedly: exit status 2
    Failing input written to testdata/fuzz/FuzzMapReduce/ee6a61e8c968adad2e629fba11984532cac5d177c4899d3e0b7c2949a0a3d840
    To re-run:
    go test -run=FuzzMapReduce/ee6a61e8c968adad2e629fba11984532cac5d177c4899d3e0b7c2949a0a3d840
FAIL
exit status 1
FAIL  github.com/zeromicro/go-zero/core/mr  81.471s

In this case, it only tells us that the fuzzing process hung or terminated unexpectedly, with status code 2. Re-running generally does not reproduce the failure. Why does it only return error code 2? I read the go fuzzing source code carefully. Each fuzz test is run by a separate process, and go fuzzing throws away the output of that process and only displays the status code. So how can we solve this problem?

After careful analysis, I decided to write a regular unit test that resembles the fuzz test. This guarantees that the failure happens in the same process and that the error information is printed to standard output. The code is roughly as follows:

func TestSumFuzzRandom(t *testing.T) {
  const times = 100000
  rand.Seed(time.Now().UnixNano())

  for i := 0; i < times; i++ {
    n := rand.Intn(20)
    var vals []int64
    var expect int64
    var buf strings.Builder
    buf.WriteString("\n")
    for j := 0; j < n; j++ {
      val := rand.Int63() % 1e6
      vals = append(vals, val)
      expect += val
      buf.WriteString(fmt.Sprintf("%d,\n", val))
    }

    assert.Equal(t, expect, Sum(vals), buf.String())
  }
}

This way we simulate go fuzzing ourselves in a simple form, and we get clear output for any error. Maybe I have not studied go fuzzing thoroughly enough and there are other ways to control this. If you know one, please tell me.

However, a simulation like this takes a long time to run, and we do not want it executed on every CI run. So I put it in a separate file with a name like sum_fuzzcase_test.go and add the following directives at the top of the file:

//go:build fuzz
// +build fuzz

With this in place, we need to add -tags fuzz when we want to run the simulation case, for example:

go test -tags fuzz ./...

Complex usage examples

The example above is a simple one. If you are not sure how to write fuzz tests for complex scenarios, you can first look at how go-zero uses go fuzzing, as shown below:

MapReduce - github.com/zeromicro/go-zero/tree/...
- Fuzz tests for deadlocks and goroutine leaks. A useful reference for complex scenarios involving chan + goroutine
stringx - github.com/zeromicro/go-zero/tree/...
- Fuzz tests for a conventional algorithm implementation. A useful reference for algorithm scenarios

Project address

github.com/zeromicro/go-zero

You are welcome to use go-zero and to star the project to support us!

WeChat discussion group

Follow the official account "Microservice Practice" and click "Exchange Group" to get the QR code of the community group.

Topics: Go Framework Microservices go-zero
