Go: determining whether an element is in a slice

Posted by khanuja.sunpreet on Mon, 18 Oct 2021 20:14:25 +0200


1. Problem

How can we check whether an element is in a slice? At the time of writing, Go's standard library does not provide a function for this (since Go 1.21, the standard slices package offers slices.Contains). The simplest implementation is a linear traversal.

2. Traversal query

Taking a string slice as an example, the following function checks whether a given string is contained in the slice.

// ContainsInSlice determines whether the string is in the slice
func ContainsInSlice(items []string, item string) bool {
	for _, eachItem := range items {
		if eachItem == item {
			return true
		}
	}
	return false
}

The time complexity of this implementation is O(n), where n is the number of elements in the slice.

If the slice is short (within 10 elements) or the function is not called frequently, this performance is acceptable. However, if the slice is long and the function is called frequently, the linear scan becomes a bottleneck. We can optimize it with the help of a map.
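
Since Go 1.18, the same traversal can be written once for any comparable element type by using generics. The following is a minimal sketch and not part of the original code; the name Contains is chosen only for this illustration.

```go
package main

import "fmt"

// Contains reports whether item is present in items.
// It is a generic variant of ContainsInSlice and still runs in O(n).
func Contains[T comparable](items []T, item T) bool {
	for _, v := range items {
		if v == item {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(Contains([]string{"a", "b", "c"}, "b")) // true
	fmt.Println(Contains([]int{1, 2, 3}, 4))            // false
}
```

The comparable constraint restricts T to types that support ==, which is all the loop needs.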

3. Map query

The idea is to first convert the slice into a map, and then check whether an element exists in the slice by looking it up in the map.

// ConvertStrSlice2Map converts the string slice to map[string]struct{}
func ConvertStrSlice2Map(sl []string) map[string]struct{} {
	set := make(map[string]struct{}, len(sl))
	for _, v := range sl {
		set[v] = struct{}{}
	}
	return set
}

// ContainsInMap determines whether the string is in the map
func ContainsInMap(m map[string]struct{}, s string) bool {
	_, ok := m[s]
	return ok
}

Note: the empty struct type struct{} is used as the map's value type because a value of type struct{} occupies no memory, whereas a bool occupies one byte.

fmt.Println(unsafe.Sizeof(bool(false))) // 1
fmt.Println(unsafe.Sizeof(struct{}{}))  // 0

Although converting the slice to a map costs O(n), the conversion is performed only once, so its cost can be ignored when many queries follow. Querying whether an element is in the map takes O(1) time on average.
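
To make the "convert once, query many times" pattern concrete, the two functions above can be combined into a small runnable program. This is only a sketch: the three-element slice and the queried strings are example data chosen for illustration.

```go
package main

import "fmt"

// ConvertStrSlice2Map converts the string slice to map[string]struct{}.
func ConvertStrSlice2Map(sl []string) map[string]struct{} {
	set := make(map[string]struct{}, len(sl))
	for _, v := range sl {
		set[v] = struct{}{}
	}
	return set
}

// ContainsInMap determines whether the string is in the map.
func ContainsInMap(m map[string]struct{}, s string) bool {
	_, ok := m[s]
	return ok
}

func main() {
	sl := []string{"a", "b", "c"}

	// The O(n) conversion is paid once, before any query.
	set := ConvertStrSlice2Map(sl)

	// Every later query is a single map lookup.
	for _, q := range []string{"b", "x"} {
		fmt.Println(q, ContainsInMap(set, q))
	}
	// Output:
	// b true
	// x false
}
```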

4. Performance comparison

To compare the query performance of the two approaches, we benchmark a lookup of the median element "m" in a slice of 26 elements. Here sl is a package-level slice holding the letters "a" to "z", the same data used in section 5.

func BenchmarkContainsInSlice(b *testing.B) {
	for i := 0; i < b.N; i++ {
		ContainsInSlice(sl, "m")
	}
}

func BenchmarkContainsInMap(b *testing.B) {
	m := ConvertStrSlice2Map(sl)
	for i := 0; i < b.N; i++ {
		ContainsInMap(m, "m")
	}
}

Running the benchmark produces the following output:

D:\code\gotest\contain>go test -bench=.
goos: windows
goarch: amd64
pkg: main/contain
cpu: Intel(R) Core(TM) i7-9700 CPU @ 3.00GHz
BenchmarkContainsInSlice-8      30564058                38.35 ns/op
BenchmarkContainsInMap-8        134556465                8.846 ns/op
PASS
ok      main/contain    3.479s

In the results, the -8 suffix after each function name is the value of GOMAXPROCS at run time. The large number that follows is the number of loop iterations, that is, how many times the code under test was called. The last column, such as 38.35 ns/op, is the average time per call: 38.35 nanoseconds.
The 3.479s on the last line is the total test time. By default, each benchmark runs for about 1 second. To make the benchmarks run longer, set the -benchtime flag, for example -benchtime=5s.

Performance comparison: with 26 elements, the map query (8.846 ns/op) is about 4.3 times faster than the traversal query (38.35 ns/op).

As the slice length increases, the performance gap can be expected to grow.
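
One way to check this expectation is to repeat the measurement for several slice lengths. The sketch below uses testing.Benchmark from the standard library so that it runs as an ordinary program rather than through go test. The helper names (makeKeys, containsInSlice, toSet), the chosen sizes, and the sink variable are assumptions made for this example. Timings depend on the machine, so no output is shown.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// sink receives every lookup result so the compiler cannot discard the calls.
var sink bool

// makeKeys returns n distinct strings "0", "1", ..., "n-1".
func makeKeys(n int) []string {
	keys := make([]string, n)
	for i := range keys {
		keys[i] = strconv.Itoa(i)
	}
	return keys
}

// containsInSlice is the linear traversal query.
func containsInSlice(items []string, item string) bool {
	for _, v := range items {
		if v == item {
			return true
		}
	}
	return false
}

// toSet converts a string slice into map[string]struct{}.
func toSet(items []string) map[string]struct{} {
	set := make(map[string]struct{}, len(items))
	for _, v := range items {
		set[v] = struct{}{}
	}
	return set
}

func main() {
	for _, n := range []int{10, 1000} {
		keys := makeKeys(n)
		set := toSet(keys) // built once, outside the timed loops
		target := keys[n/2] // an element from the middle of the slice

		sliceRes := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				sink = containsInSlice(keys, target)
			}
		})
		mapRes := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				_, sink = set[target]
			}
		})
		fmt.Printf("n=%d slice=%d ns/op map=%d ns/op\n",
			n, sliceRes.NsPerOp(), mapRes.NsPerOp())
	}
}
```

The traversal cost should grow roughly in proportion to n, while the map lookup should stay roughly constant.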

5. Conversion and generalization

We can use the empty interface interface{} to convert a slice of any type into a map, which is convenient for the caller.

// ToMapSetE converts a slice or array to map[interface{}]struct{} with error
func ToMapSetE(i interface{}) (map[interface{}]struct{}, error) {
	// validate the input
	if i == nil {
		return nil, fmt.Errorf("unable to convert %#v of type %T to map[interface{}]struct{}", i, i)
	}
	kind := reflect.TypeOf(i).Kind()
	if kind != reflect.Slice && kind != reflect.Array {
		return nil, fmt.Errorf("the input %#v of type %T isn't a slice or array", i, i)
	}

	// perform the conversion
	v := reflect.ValueOf(i)
	m := make(map[interface{}]struct{}, v.Len())
	for j := 0; j < v.Len(); j++ {
		m[v.Index(j).Interface()] = struct{}{}
	}
	return m, nil
}

func main() {
	var sl = []string{"a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"}
	m, _ := ToMapSetE(sl)
	if _, ok := m["m"]; ok {
		fmt.Println("in")
	}
	if _, ok := m["mm"]; !ok {
		fmt.Println("not in")
	}
}

Run output:

in
not in

The conversion function ToMapSetE() above is available in the open source Go utility library go-huge-util, which can be added to a project with go mod and imported directly.

import (
	huge "github.com/dablelv/go-huge-util"
)

// Using go-huge-util
m, _ := huge.ToMapSetE(sl)
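
As an alternative to the reflection-based conversion, Go 1.18 and later allow the same idea to be expressed with generics. The sketch below is not part of go-huge-util; the name ToSet is hypothetical. One practical difference: with map[interface{}]struct{}, an unhashable element type such as []int is only detected at run time, as a panic on map insertion, whereas the comparable constraint rejects such types at compile time.

```go
package main

import "fmt"

// ToSet converts a slice into a set represented as map[T]struct{}.
// The comparable constraint lets the compiler reject element types
// that cannot be used as map keys.
func ToSet[T comparable](items []T) map[T]struct{} {
	set := make(map[T]struct{}, len(items))
	for _, v := range items {
		set[v] = struct{}{}
	}
	return set
}

func main() {
	set := ToSet([]string{"a", "b", "c", "a"})
	_, in := set["b"]
	_, missing := set["mm"]
	fmt.Println(len(set), in, missing) // 3 true false
}
```

The keys keep their static type, so callers look up set["b"] without going through interface{}.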

6. Using the open source library golang-set

What we did above is use a map to implement a set (a collection with no duplicate elements) and then check whether the set contains an element. Go's standard library does not provide a set type, but a map can be used to implement one indirectly, as shown above.

If you need the full functionality of a set, such as initialization, Add, Remove, Clear, and Contains, the mature open source package golang-set on GitHub is recommended. Its description says that Docker uses it as well. The package provides two set implementations: a thread-safe set and a non-thread-safe set.

golang-set provides five functions for creating sets:

// NewSet creates and returns a reference to an empty set.  Operations
// on the resulting set are thread-safe.
func NewSet(s ...interface{}) Set {}

// NewSetWith creates and returns a new set with the given elements.
// Operations on the resulting set are thread-safe.
func NewSetWith(elts ...interface{}) Set {}

// NewSetFromSlice creates and returns a reference to a set from an
// existing slice.  Operations on the resulting set are thread-safe.
func NewSetFromSlice(s []interface{}) Set {}

// NewThreadUnsafeSet creates and returns a reference to an empty set.
// Operations on the resulting set are not thread-safe.
func NewThreadUnsafeSet() Set {}

// NewThreadUnsafeSetFromSlice creates and returns a reference to a
// set from an existing slice.  Operations on the resulting set are
// not thread-safe.
func NewThreadUnsafeSetFromSlice(s []interface{}) Set {}
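
To illustrate what separates the thread-safe variant from the non-thread-safe one, here is a minimal sketch of a set that guards its map with a sync.RWMutex. It uses only the standard library and is not golang-set's actual implementation; the names SafeSet and NewSafeSet are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// SafeSet is a hypothetical string set guarded by a read-write mutex.
type SafeSet struct {
	mu sync.RWMutex
	m  map[string]struct{}
}

// NewSafeSet returns an empty set that is ready to use.
func NewSafeSet() *SafeSet {
	return &SafeSet{m: make(map[string]struct{})}
}

// Add inserts v while holding the write lock.
func (s *SafeSet) Add(v string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[v] = struct{}{}
}

// Contains reports whether v is in the set while holding the read lock.
func (s *SafeSet) Contains(v string) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	_, ok := s.m[v]
	return ok
}

func main() {
	set := NewSafeSet()
	var wg sync.WaitGroup
	for _, v := range []string{"a", "b", "c"} {
		wg.Add(1)
		go func(v string) {
			defer wg.Done()
			set.Add(v) // concurrent writes are serialized by the mutex
		}(v)
	}
	wg.Wait()
	fmt.Println(set.Contains("b"), set.Contains("mm")) // true false
}
```

Without the mutex, concurrent writes to a plain Go map are a data race. The locking adds overhead, so the non-thread-safe variant is the cheaper choice when a set is used from a single goroutine.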

Now let's use golang-set to check whether an element exists in a slice.

package main

import (
	"fmt"

	mapset "github.com/deckarep/golang-set"
)

func main() {
	var sl = []interface{}{"a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
		"n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"}
	s := mapset.NewSetFromSlice(sl)
	fmt.Println(s.Contains("m"))	// true
	fmt.Println(s.Contains("mm"))	// false
}

7. Summary

This article starts from the problem of checking whether an element is in a slice and presents several implementations. The problem can be generalized to "how to turn a slice into a set with no duplicate elements", for which the general conversion function ToMapSetE() from go-huge-util is provided.

Of course, there are many mature, high-quality libraries available for direct use, such as golang-set. Interested readers can study its usage and implementation in more depth.

