unordered_map multi thread crash in find

Posted by aconite on Fri, 17 Dec 2021 11:37:00 +0100

Recently, one of our programs crashed. Investigation showed that the crash occurred during a lookup in an STL container.

The screenshot of the crash is as follows:

About unordered_map
Unlike the map container, the unordered_map container does not sort the data it stores.

Under the hood, unordered_map uses a hash table. A hash table has no notion of ordering, so the stored key-value pairs are not kept in sorted order.
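
The difference in ordering can be sketched as follows. This is a minimal illustration, not part of the crashed program: the helper names keys_in_iteration_order and make_sample, and the sample stock codes, are made up for this example.

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Collect the keys of a map-like container in the order iteration visits them.
template <typename Map>
std::vector<std::string> keys_in_iteration_order(const Map& m) {
    std::vector<std::string> keys;
    for (const auto& entry : m) {
        keys.push_back(entry.first);
    }
    return keys;
}

// Hypothetical sample data, deliberately inserted out of order.
template <typename Map>
Map make_sample() {
    Map m;
    m["600000.SH"] = 2.0;
    m["000001.SZ"] = 3.0;
    m["300001.SZ"] = 1.0;
    return m;
}
```

Calling keys_in_iteration_order(make_sample<std::map<std::string, double>>()) yields the keys in ascending order. With std::unordered_map the same three keys come back in an order that depends on the hash function and the bucket count, so code must not rely on it.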

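When an element is erased from an associative container, only iterators to the erased element are invalidated; other iterators remain valid. Inserting into a map never invalidates iterators. Inserting into an unordered_map invalidates all iterators if the insertion triggers a rehash, although pointers and references to existing elements remain valid.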
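
For unordered_map specifically, an insertion can trigger a rehash: the bucket array is rebuilt, existing iterators become invalid, but the elements themselves do not move. The sketch below makes that visible. RehashReport and grow_and_report are hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Bucket counts before and after the insertions, and whether the address of
// the first element stayed the same.
struct RehashReport {
    std::size_t buckets_before;
    std::size_t buckets_after;
    bool address_stable;
};

RehashReport grow_and_report(int extra_elements) {
    std::unordered_map<std::string, double> m;
    m["000001.SZ"] = 3.0;
    const double* first = &m.at("000001.SZ");
    const std::size_t before = m.bucket_count();

    // Enough insertions push the load factor past max_load_factor(),
    // which forces the table to rehash into more buckets.
    for (int i = 0; i < extra_elements; ++i) {
        m["key" + std::to_string(i)] = i;
    }

    // Iterators taken before the loop must not be used any more, but the
    // element is still at the same address.
    return {before, m.bucket_count(), first == &m.at("000001.SZ")};
}
```

With a large enough extra_elements (for example 1000), buckets_after is larger than buckets_before while address_stable stays true. A reader thread that walks the bucket array during such a rehash is exactly the kind of access that can crash.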

Thread safety guarantees:

  • Multiple threads may read at the same time
  • A single thread may write, as long as no other thread accesses the container meanwhile

That is, the container does not guarantee thread safety for concurrent reads and writes.

If one thread writes while other threads read, that is a data race (undefined behavior), which may lead to a crash.

Test code
You can simply write the test program:

#include <chrono>
#include <iostream>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>
#include <unistd.h>
#include <sys/syscall.h>                 /* For SYS_xxx definitions */

using namespace std;

unordered_map<string, double> test_map;

string code = "000001.SZ";

// Writer: modifies test_map with no synchronization (deliberate data race).
void write_map()
{
    cout << "write_map start, threadid: " << syscall(SYS_gettid) << endl;

    while (true) {
        test_map[code] = 3;
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
}

// Reader: calls find() while the writer may be modifying the hash table.
void read_map()
{
    cout << "read_map start, threadid: " << syscall(SYS_gettid) << endl;
    double price;
    while (true) {
        auto iter = test_map.find(code);
        if (iter == test_map.end()) {
            cout << "not found|" << endl;
            continue;
        }
        price = iter->second;
    }
}

int main()
{
    vector<thread> threads_;
    // One writer thread.
    threads_.emplace_back(write_map);
    // Ten reader threads.
    for (int i = 0; i < 10; ++i) {
        threads_.emplace_back(read_map);
    }

    for (std::thread& t : threads_) {
        if (t.joinable()) {
            t.join();
        }
    }

    return 0;
}

Compile the program and run it with core dumps enabled. It may take several runs before it crashes and produces a core file.

Opening the core file in gdb shows the debugging information from the beginning of this article.

std::shared_mutex
Using a mutex solves the data race, but it can hurt performance, because readers then block each other as well as the writer.
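
As a rough sketch of the plain mutex approach, the test program's map can be guarded like this. The names guarded_map, guarded_map_mutex, guarded_code, write_once, read_once and run_bounded are made up for this example, and the loops are bounded so that the function returns instead of running forever.

```cpp
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

std::unordered_map<std::string, double> guarded_map;
std::mutex guarded_map_mutex;  // hypothetical name; guards every access to guarded_map
const std::string guarded_code = "000001.SZ";

void write_once(double price) {
    std::lock_guard<std::mutex> lock(guarded_map_mutex);
    guarded_map[guarded_code] = price;
}

// Returns false if the key is absent; otherwise copies the value out
// while the lock is still held.
bool read_once(double& price) {
    std::lock_guard<std::mutex> lock(guarded_map_mutex);
    auto iter = guarded_map.find(guarded_code);
    if (iter == guarded_map.end()) {
        return false;
    }
    price = iter->second;
    return true;
}

// Bounded variant of the test: one writer and reader_count readers.
// Returns the value stored under guarded_code once all threads finish.
double run_bounded(int reader_count, int iterations) {
    std::vector<std::thread> threads;
    threads.emplace_back([iterations] {
        for (int i = 0; i < iterations; ++i) {
            write_once(3.0);
        }
    });
    for (int r = 0; r < reader_count; ++r) {
        threads.emplace_back([iterations] {
            double price = 0.0;
            for (int i = 0; i < iterations; ++i) {
                read_once(price);
            }
        });
    }
    for (std::thread& t : threads) {
        t.join();
    }
    double price = 0.0;
    read_once(price);
    return price;
}
```

The important detail is that the reader copies iter->second before the lock is released. Returning the iterator itself would let it be used after the writer has modified the map again.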

C++17 introduced std::shared_mutex, a mutex that can be held either in shared mode (by several threads at once) or in exclusive mode (by a single thread).

It suits one specific scenario: one or more reader threads read a shared resource at the same time, while only one writer thread modifies it. In this case, shared_mutex can offer a performance advantage.

Example code using std::shared_mutex:

// from https://en.cppreference.com/w/cpp/thread/shared_mutex
#include <iostream>
#include <mutex>
#include <shared_mutex>
#include <thread>
 
class ThreadSafeCounter {
 public:
  ThreadSafeCounter() = default;
 
  // Multiple threads/readers can read the counter's value at the same time.
  unsigned int get() const {
    std::shared_lock lock(mutex_);
    return value_;
  }
 
  // Only one thread/writer can increment/write the counter's value.
  unsigned int increment() {
    std::unique_lock lock(mutex_);
    return ++value_;
  }
 
  // Only one thread/writer can reset/write the counter's value.
  void reset() {
    std::unique_lock lock(mutex_);
    value_ = 0;
  }
 
 private:
  mutable std::shared_mutex mutex_;
  unsigned int value_ = 0;
};
 
int main() {
  ThreadSafeCounter counter;
 
  auto increment_and_print = [&counter]() {
    for (int i = 0; i < 3; i++) {
      std::cout << std::this_thread::get_id() << ' ' << counter.increment() << '\n';
 
      // Note: Writing to std::cout actually needs to be synchronized as well
      // by another std::mutex. This has been omitted to keep the example small.
    }
  };
 
  std::thread thread1(increment_and_print);
  std::thread thread2(increment_and_print);
 
  thread1.join();
  thread2.join();
}
 
// Explanation: increment() holds the exclusive lock, so each of the six calls
// returns a distinct value from 1 to 6. The order in which the two threads
// print their lines depends on the scheduler, and because std::cout is not
// synchronized here, output from the two threads may interleave.
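
The same pattern can be applied to the map from the test program. The following is a sketch under assumptions, not the fix used in the crashed program: SharedPriceMap and count_bad_lookups are hypothetical names, and the loops are bounded so the function returns. It needs C++17.

```cpp
#include <atomic>
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical wrapper: readers share the lock, the writer takes it exclusively.
class SharedPriceMap {
 public:
  void set(const std::string& code, double price) {
    std::unique_lock<std::shared_mutex> lock(mutex_);
    map_[code] = price;
  }

  // Returns a copy of the value, so nothing refers into the map once the
  // lock is released.
  std::optional<double> find(const std::string& code) const {
    std::shared_lock<std::shared_mutex> lock(mutex_);
    auto iter = map_.find(code);
    if (iter == map_.end()) {
      return std::nullopt;
    }
    return iter->second;
  }

 private:
  mutable std::shared_mutex mutex_;
  std::unordered_map<std::string, double> map_;
};

// One writer keeps inserting new keys (forcing rehashes) while reader_count
// readers look up an existing key. Returns how many lookups did not see the
// expected value.
int count_bad_lookups(int reader_count, int iterations) {
  SharedPriceMap prices;
  prices.set("000001.SZ", 3.0);
  std::atomic<int> bad{0};

  std::vector<std::thread> threads;
  threads.emplace_back([&prices, iterations] {
    for (int i = 0; i < iterations; ++i) {
      prices.set("key" + std::to_string(i), i);
    }
  });
  for (int r = 0; r < reader_count; ++r) {
    threads.emplace_back([&prices, &bad, iterations] {
      for (int i = 0; i < iterations; ++i) {
        std::optional<double> price = prices.find("000001.SZ");
        if (!price || *price != 3.0) {
          ++bad;
        }
      }
    });
  }
  for (std::thread& t : threads) {
    t.join();
  }
  return bad.load();
}
```

Keeping the map private inside the wrapper means callers cannot reach it without going through a lock. Whether shared_mutex actually beats a plain mutex depends on the read/write ratio and on how long each lock is held, so it is worth measuring for the real workload.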

Summary
When STL containers are shared between threads, thread safety needs careful consideration.

A lock is a simple way to solve this kind of data race.

You can also consider other approaches, such as a thread-safe STL implementation or a thread-safe container from Boost.

Topics: C++ Container