Source code interpretation
The README of concurrent-map describes it as a high-performance, concurrency-safe map. Let's look at the source code to see how it achieves that performance.
https://github.com/orcaman/concurrent-map/blob/master/concurrent_map.go
The source code is quite concise, only 343 lines. Let's first look at how the data structure is designed:
var SHARD_COUNT = 32

// A "thread" safe map of type string:Anything.
// To avoid lock bottlenecks this map is dived to several (SHARD_COUNT) map shards.
type ConcurrentMap []*ConcurrentMapShared

// A "thread" safe string to anything map.
type ConcurrentMapShared struct {
    items        map[string]interface{}
    sync.RWMutex // Read Write mutex, guards access to internal map.
}

// Creates a new concurrent map.
func New() ConcurrentMap {
    m := make(ConcurrentMap, SHARD_COUNT)
    for i := 0; i < SHARD_COUNT; i++ {
        m[i] = &ConcurrentMapShared{items: make(map[string]interface{})}
    }
    return m
}
ConcurrentMap is the structure exposed to callers. Internally it is a slice of 32 *ConcurrentMapShared elements, and each ConcurrentMapShared wraps a native map together with an embedded sync.RWMutex.
From this alone we can roughly guess how it achieves high performance under concurrency. A native map is not safe for concurrent use, so normally a single global lock is needed to make access safe. Here, 32 maps and 32 locks are used instead: shrinking the granularity of each lock reduces the time goroutines spend waiting for one.
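Before looking at the individual methods, here is a minimal usage sketch. It assumes the package is imported under the alias cmap from the repository linked above; the keys and values are only for illustration.

package main

import (
    "fmt"

    cmap "github.com/orcaman/concurrent-map"
)

func main() {
    m := cmap.New() // 32 shards, each guarding its own native map with an RWMutex

    // Set and Get only lock the single shard that the key hashes to,
    // so goroutines working on keys in different shards never contend.
    m.Set("foo", 1)
    if v, ok := m.Get("foo"); ok {
        fmt.Println(v) // 1
    }
    fmt.Println(m.Count()) // 1
}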
Basic interface
ConcurrentMap provides the basic interfaces you would expect from a map:
// Get the shard for a key
func (m ConcurrentMap) GetShard(key string) *ConcurrentMapShared

// Merge a map in
func (m ConcurrentMap) MSet(data map[string]interface{})

// Add an element
func (m ConcurrentMap) Set(key string, value interface{})

// Get an element
func (m ConcurrentMap) Get(key string) (interface{}, bool)

// Count the elements
func (m ConcurrentMap) Count() int

// Check whether an element exists
func (m ConcurrentMap) Has(key string) bool

// Remove the specified element
func (m ConcurrentMap) Remove(key string)

// Get and remove the specified element
func (m ConcurrentMap) Pop(key string) (v interface{}, exists bool)

// Check whether the map is empty
func (m ConcurrentMap) IsEmpty() bool

// Empty the map
func (m ConcurrentMap) Clear()
set interface
Let's first look at the set interface:
func (m ConcurrentMap) Set(key string, value interface{}) {
    // Get map shard.
    shard := m.GetShard(key)
    shard.Lock()
    shard.items[key] = value
    shard.Unlock()
}
Based on the key, find the shard it belongs to, take that shard's write lock, set the value, and finally unlock. By hashing the key, what would otherwise be a single global lock is spread across 32 fine-grained shard locks, lowering the probability of waiting for a lock and therefore improving concurrency.
// GetShard returns shard under given key
func (m ConcurrentMap) GetShard(key string) *ConcurrentMapShared {
    return m[uint(fnv32(key))%uint(SHARD_COUNT)]
}
Shard selection is also straightforward: hash the key with fnv32, take the result modulo SHARD_COUNT, and index into the corresponding shard.
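The hash itself is a 32-bit FNV function. A minimal sketch of what fnv32 looks like, using the standard FNV-1 constants (treat this as an illustration of the idea rather than a verbatim copy of the repository code):

// fnv32 hashes a string with the 32-bit FNV-1 algorithm:
// multiply by the prime, then XOR in the next byte.
func fnv32(key string) uint32 {
    hash := uint32(2166136261) // FNV offset basis
    const prime32 = uint32(16777619)
    for i := 0; i < len(key); i++ {
        hash *= prime32
        hash ^= uint32(key[i])
    }
    return hash
}

Because the hash depends only on the key, the same key always lands in the same shard, which is what makes per-shard locking correct.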
get interface
// Get retrieves an element from map under given key.
func (m ConcurrentMap) Get(key string) (interface{}, bool) {
    // Get shard
    shard := m.GetShard(key)
    shard.RLock()
    // Get item from shard.
    val, ok := shard.items[key]
    shard.RUnlock()
    return val, ok
}
The get interface is essentially the same as set, except for the kind of lock it takes. Set takes the write lock so that writes are serialized; Get takes the read lock, so as long as no write is in progress, any number of goroutines can hold the read lock and read data at the same time.
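A small sketch of this behavior, written as if it lived in the same package as the code above (the function name and reader count are just for illustration):

// readConcurrently spawns several goroutines that all Get the same key.
// Each only takes the shard's read lock, so they do not block one another;
// a concurrent Set on the same shard would take the write lock and briefly
// serialize them.
func readConcurrently(m ConcurrentMap, key string, readers int) {
    var wg sync.WaitGroup
    for i := 0; i < readers; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            m.Get(key)
        }()
    }
    wg.Wait()
}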
count interface
func (m ConcurrentMap) Count() int {
    count := 0
    for i := 0; i < SHARD_COUNT; i++ {
        shard := m[i]
        shard.RLock()
        count += len(shard.items)
        shard.RUnlock()
    }
    return count
}
Count takes the read lock on each shard in turn and accumulates len(shard.items). Personally, I think this value can be inaccurate under high concurrency: the real number of elements in the map may already differ from the returned value by the time the call finishes. Because locking is per shard, a new element can be written to shard 1 while Count is traversing shard 2; that write does not touch shard 2, so the total already accumulated for shard 1 no longer matches shard 1's real size. Of course, in high-concurrency scenarios you usually don't need Count to be exact.
The other basic interfaces are similar: read-type interfaces take a read lock after locating the shard, so reads never block each other, while write-type operations take a write lock after locating the shard to guarantee consistency, as the sketch of Has and Remove below illustrates.
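This sketch is reconstructed from the pattern just described; the repository versions may differ in detail:

// Has is a read-type operation, so it only takes the shard's read lock.
func (m ConcurrentMap) Has(key string) bool {
    shard := m.GetShard(key)
    shard.RLock()
    _, ok := shard.items[key]
    shard.RUnlock()
    return ok
}

// Remove is a write-type operation, so it takes the shard's write lock.
func (m ConcurrentMap) Remove(key string) {
    shard := m.GetShard(key)
    shard.Lock()
    delete(shard.items, key)
    shard.Unlock()
}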
Advanced interfaces
It also provides some more advanced, callback-based interfaces.
1. Insert/update callback
type UpsertCb func(exist bool, valueInMap interface{}, newValue interface{}) interface{}

// Insert or Update - updates existing element or inserts a new one using UpsertCb
func (m ConcurrentMap) Upsert(key string, value interface{}, cb UpsertCb) (res interface{}) {
    shard := m.GetShard(key)
    shard.Lock()
    v, ok := shard.items[key]
    res = cb(ok, v, value)
    shard.items[key] = res
    shard.Unlock()
    return res
}
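A typical use of Upsert is a read-modify-write that must happen atomically, for example a counter. The helper below is hypothetical and assumes int values:

// incr increments the counter stored under key. The callback runs while the
// shard's write lock is held, so the read-modify-write cannot race with other
// goroutines.
func incr(m ConcurrentMap, key string) int {
    res := m.Upsert(key, 1, func(exist bool, valueInMap interface{}, newValue interface{}) interface{} {
        if !exist {
            return newValue
        }
        return valueInMap.(int) + newValue.(int)
    })
    return res.(int)
}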
2. Remove callback
// RemoveCb is a callback executed in a map.RemoveCb() call, while Lock is held
// If returns true, the element will be removed from the map
type RemoveCb func(key string, v interface{}, exists bool) bool

// RemoveCb locks the shard containing the key, retrieves its current value and calls the callback with those params
// If callback returns true and element exists, it will remove it from the map
// Returns the value returned by the callback (even if element was not present in the map)
func (m ConcurrentMap) RemoveCb(key string, cb RemoveCb) bool {
    // Try to get shard.
    shard := m.GetShard(key)
    shard.Lock()
    v, ok := shard.items[key]
    remove := cb(key, v, ok)
    if remove && ok {
        delete(shard.items, key)
    }
    shard.Unlock()
    return remove
}
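RemoveCb makes it possible to delete an entry only if a condition on its current value holds, with the check performed while the lock is held. A hypothetical example, assuming int values:

// removeIfNegative deletes key only when its current value is a negative int.
// Because the callback runs under the shard's write lock, no other goroutine
// can modify the entry between the check and the delete.
func removeIfNegative(m ConcurrentMap, key string) bool {
    return m.RemoveCb(key, func(key string, v interface{}, exists bool) bool {
        if !exists {
            return false
        }
        n, isInt := v.(int)
        return isInt && n < 0
    })
}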
3. Iterative callback
// Iterator callback,called for every key,value found in
// maps. RLock is held for all calls for a given shard
// therefore callback sess consistent view of a shard,
// but not across the shards
type IterCb func(key string, v interface{})

// Callback based iterator, cheapest way to read
// all elements in a map.
func (m ConcurrentMap) IterCb(fn IterCb) {
    for idx := range m {
        shard := (m)[idx]
        shard.RLock()
        for key, value := range shard.items {
            fn(key, value)
        }
        shard.RUnlock()
    }
}
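Usage is just a closure. Note that the callback runs while a shard's read lock is held, so it must not call back into the same map. A hypothetical example:

// countStrings walks every entry and counts how many values are strings.
// The callback only reads its arguments; calling m.Set (or even m.Get) from
// inside it could deadlock because a shard's read lock is already held.
func countStrings(m ConcurrentMap) int {
    n := 0
    m.IterCb(func(key string, v interface{}) {
        if _, ok := v.(string); ok {
            n++
        }
    })
    return n
}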
Benchmarking
The repository's benchmarks compare it against the sync.Map provided by the Go standard library.
func BenchmarkSingleInsertPresent(b *testing.B) {
    m := New()
    m.Set("key", "value")
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        m.Set("key", "value")
    }
}

func BenchmarkSingleInsertPresentSyncMap(b *testing.B) {
    var m sync.Map
    m.Store("key", "value")
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        m.Store("key", "value")
    }
}
go test -bench=InsertPresent -benchtime 5s
goos: linux
goarch: amd64
pkg: concurrent-map
BenchmarkSingleInsertPresent-8           172822759    34.9 ns/op
BenchmarkSingleInsertPresentSyncMap-8     65351324    92.9 ns/op
In this benchmark, Set keeps overwriting the same fixed key; concurrent-map is about three times faster than sync.Map.
Now let's look at the performance when inserting different keys.
func benchmarkMultiInsertDifferent(b *testing.B) {
    m := New()
    finished := make(chan struct{}, b.N)
    _, set := GetSet(m, finished)
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        go set(strconv.Itoa(i), "value")
    }
    for i := 0; i < b.N; i++ {
        <-finished
    }
}

func BenchmarkMultiInsertDifferentSyncMap(b *testing.B) {
    var m sync.Map
    finished := make(chan struct{}, b.N)
    _, set := GetSetSyncMap(&m, finished)
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        go set(strconv.Itoa(i), "value")
    }
    for i := 0; i < b.N; i++ {
        <-finished
    }
}

func BenchmarkMultiInsertDifferent_1_Shard(b *testing.B) {
    runWithShards(benchmarkMultiInsertDifferent, b, 1)
}

func BenchmarkMultiInsertDifferent_16_Shard(b *testing.B) {
    runWithShards(benchmarkMultiInsertDifferent, b, 16)
}

func BenchmarkMultiInsertDifferent_32_Shard(b *testing.B) {
    runWithShards(benchmarkMultiInsertDifferent, b, 32)
}

func BenchmarkMultiInsertDifferent_256_Shard(b *testing.B) {
    runWithShards(benchmarkMultiGetSetDifferent, b, 256)
}

func runWithShards(bench func(b *testing.B), b *testing.B, shardsCount int) {
    oldShardsCount := SHARD_COUNT
    SHARD_COUNT = shardsCount
    bench(b)
    SHARD_COUNT = oldShardsCount
}
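The GetSet and GetSetSyncMap helpers are not shown above. Roughly, they return a get closure and a set closure that perform the operation and then signal on the finished channel; the sketch below is reconstructed from how the benchmarks call them, and the repository versions may differ:

// GetSet returns a get closure and a set closure over m; each performs its
// operation and then signals completion on the finished channel.
func GetSet(m ConcurrentMap, finished chan struct{}) (get func(key, value string), set func(key, value string)) {
    get = func(key, value string) {
        m.Get(key)
        finished <- struct{}{}
    }
    set = func(key, value string) {
        m.Set(key, value)
        finished <- struct{}{}
    }
    return get, set
}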
go test -bench=InsertDifferent -benchtime 5s
goos: linux
goarch: amd64
pkg: concurrent-map
BenchmarkMultiInsertDifferentSyncMap-8         560900    11996 ns/op
BenchmarkMultiInsertDifferent_1_Shard-8       1000000     7499 ns/op
BenchmarkMultiInsertDifferent_16_Shard-8     10377100      662 ns/op
BenchmarkMultiInsertDifferent_32_Shard-8     10511775      603 ns/op
BenchmarkMultiInsertDifferent_64_Shard-8     11624546      590 ns/op
BenchmarkMultiInsertDifferent_128_Shard-8    11773946      578 ns/op
BenchmarkMultiInsertDifferent_256_Shard-8     7914397      912 ns/op
sync.Map seems to perform worst when inserting different keys. With the shard count set to 1, concurrent-map is effectively a single map guarded by one global read-write lock, and in this run it beats sync.Map; across repeated runs, though, the two differ a lot, with sync.Map faster sometimes and the 1-shard concurrent-map faster other times. Neither comes close to a concurrent-map with 16 or more shards, and performance keeps improving as the shard count grows, up to a point: at 256 shards execution starts to slow down again.