[Learning Notes - Java Collection - 8] Map - ConcurrentHashMap Source Code Analysis

Posted by JackW on Sat, 17 Aug 2019 11:12:25 +0200

Delete elements

As with insertion, deletion first locates the bucket that holds the element, then locks that bucket by synchronizing on its head node (the per-bucket form of segmented locking), and performs the operation under the lock.


public V remove(Object key) {
    // Delegate to replaceNode(): value == null means delete, cv == null means unconditional
    return replaceNode(key, null, null);
}

final V replaceNode(Object key, V value, Object cv) {
    // Compute the hash
    int hash = spread(key.hashCode());
    // Spin: retry until the operation completes
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0 ||
                (f = tabAt(tab, i = (n - 1) & hash)) == null)
            // The table is uninitialized or the target bucket is empty: break and return null
            break;
        else if ((fh = f.hash) == MOVED)
            // A resize is in progress: help with the transfer, then retry on the new table
            tab = helpTransfer(tab, f);
        else {
            V oldVal = null;
            // Whether the bucket was actually processed under the lock
            boolean validated = false;
            synchronized (f) {
                // Re-check that the head node of the bucket has not changed
                if (tabAt(tab, i) == f) {
                    if (fh >= 0) {
                        // fh >= 0 means the bucket holds a linked list
                        validated = true;
                        // Traverse the list to find the target node
                        for (Node<K,V> e = f, pred = null;;) {
                            K ek;
                            if (e.hash == hash &&
                                    ((ek = e.key) == key ||
                                            (ek != null && key.equals(ek)))) {
                                // Target node found
                                V ev = e.val;
                                // Proceed only if cv is null (unconditional) or equals the current value
                                if (cv == null || cv == ev ||
                                        (ev != null && cv.equals(ev))) {
                                    oldVal = ev;
                                    if (value != null)
                                        // A non-null value means replace: overwrite the old value
                                        e.val = value;
                                    else if (pred != null)
                                        // The node has a predecessor:
                                        // unlink the node from the list
                                        pred.next = e.next;
                                    else
                                        // No predecessor, so the node is the head of the bucket:
                                        // make its successor the new head
                                        setTabAt(tab, i, e.next);
                                }
                                break;
                            }
                            pred = e;
                            // Reached the end of the list without finding the key: break
                            if ((e = e.next) == null)
                                break;
                        }
                    }
                    else if (f instanceof TreeBin) {
                        // The bucket holds a red-black tree
                        validated = true;
                        TreeBin<K,V> t = (TreeBin<K,V>)f;
                        TreeNode<K,V> r, p;
                        // Search the tree for the target node
                        if ((r = t.root) != null &&
                                (p = r.findTreeNode(hash, key, null)) != null) {
                            V pv = p.val;
                            // Proceed only if cv is null (unconditional) or equals the current value
                            if (cv == null || cv == pv ||
                                    (pv != null && cv.equals(pv))) {
                                oldVal = pv;
                                if (value != null)
                                    // A non-null value means replace: overwrite the old value
                                    p.val = value;
                                else if (t.removeTreeNode(p))
                                    // value is null, so the node is removed.
                                    // removeTreeNode(p) returns true when the tree has become too small,
                                    // in which case the bin is converted back to a linked list.
                                    setTabAt(tab, i, untreeify(t.first));
                            }
                        }
                    }
                }
            }
            // If the bucket was processed, finish here whether or not the key was found
            if (validated) {
                // A match was found: return its old value
                if (oldVal != null) {
                    // For a removal (value == null), decrement the element count
                    if (value == null)
                        addCount(-1L, -1);
                    return oldVal;
                }
                break;
            }
        }
    }
    // No matching element: return null
    return null;
}
  1. Compute the hash.
  2. If the target bucket does not exist, the element is not present: return null.
  3. If a resize is in progress, help with the resize first, then retry the deletion.
  4. If the bucket is a linked list, traverse the list to find the element and remove it.
  5. If the bucket is a tree, search the tree to find the element and remove it.
  6. If the tree becomes too small after the removal, convert it back to a linked list.
  7. If an element was removed, decrement the map's element count by 1 and return the old value.
  8. If nothing was removed, return null.
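
The observable behaviour of these steps can be checked through the two public removal methods. In the JDK 8 source both delegate to replaceNode(): remove(key) passes cv = null (unconditional), while remove(key, value) passes the expected value as cv. The following is a minimal sketch; the class name RemoveDemo and the sample keys are made up for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;

class RemoveDemo {
    // Builds a small map for the demonstration; keys and values are arbitrary.
    static ConcurrentHashMap<String, Integer> sample() {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        return map;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = sample();

        // Unconditional removal: returns the old value, or null if the key is absent.
        System.out.println(map.remove("a"));
        System.out.println(map.remove("missing"));

        // Conditional removal: removes only when the current value equals the argument.
        System.out.println(map.remove("b", 99)); // value differs, nothing removed
        System.out.println(map.remove("b", 2));  // value matches, entry removed
        System.out.println(map.size());
    }
}
```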

Get elements

Reads take no lock. The key point is the find() method, which each Node subclass overrides.


public V get(Object key) {
    Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
    // Compute the hash
    int h = spread(key.hashCode());
    // Proceed only if the table exists and the target bucket is non-empty
    if ((tab = table) != null && (n = tab.length) > 0 &&
            (e = tabAt(tab, (n - 1) & h)) != null) {
        // If the head node is the one we are looking for, return its value directly
        if ((eh = e.hash) == h) {
            if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                return e.val;
        }
        else if (eh < 0)
            // A negative hash marks a special node (a tree bin, or a forwarding node during a resize).
            // find() is overridden by each Node subclass, so the lookup is done in the appropriate way.
            return (p = e.find(h, key)) != null ? p.val : null;

        // Otherwise traverse the linked list
        while ((e = e.next) != null) {
            if (e.hash == h &&
                    ((ek = e.key) == key || (ek != null && key.equals(ek))))
                return e.val;
        }
    }
    return null;
}
  1. Hash the key to locate the bucket that holds the element.
  2. If the first node in the bucket is the element being looked for, return it directly.
  3. If the bucket is a tree or is being migrated, call the find() method of the corresponding Node subclass to look up the element.
  4. If the bucket is a linked list, traverse the list to find the element.
  5. Reading an element takes no lock.
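
Step 1 can be sketched in isolation. The spread() body and the HASH_BITS constant below are copied from the JDK 8 source as I understand it; the class name BucketIndexDemo and the helper indexFor() are hypothetical and exist only for this illustration.

```java
class BucketIndexDemo {
    // Clears the sign bit, so ordinary nodes never have a negative hash.
    // Negative hashes stay reserved for special nodes such as forwarding nodes and tree bins.
    static final int HASH_BITS = 0x7fffffff;

    // Folds the high 16 bits of the hash code into the low 16 bits.
    static int spread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }

    // Bucket index for a table of length n, where n is assumed to be a power of two.
    static int indexFor(Object key, int n) {
        return (n - 1) & spread(key.hashCode());
    }

    public static void main(String[] args) {
        System.out.println(indexFor("hello", 16));
        System.out.println(indexFor(17, 16));
        System.out.println(indexFor(-1, 16));
    }
}
```

Because the table length is a power of two, (n - 1) & hash selects the low bits of the spread hash, which is why the high bits are folded in first.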

Get the number of elements

The element count is also stored in segments. To obtain the total, all segments have to be added together.


public int size() {
    // Call sumCount() to calculate the number of elements
    long n = sumCount();
    return ((n < 0L) ? 0 :
            (n > (long)Integer.MAX_VALUE) ? Integer.MAX_VALUE :
                    (int)n);
}

final long sumCount() {
    // Sum baseCount and the value of every CounterCell
    CounterCell[] as = counterCells; CounterCell a;
    long sum = baseCount;
    if (as != null) {
        for (int i = 0; i < as.length; ++i) {
            if ((a = as[i]) != null)
                sum += a.value;
        }
    }
    return sum;
}
  1. The element count is spread over different segments depending on the updating thread (see the addCount() analysis).
  2. The total is the sum of baseCount and all CounterCell values.
  3. Reading the element count takes no lock.
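
The idea behind baseCount plus CounterCell can be illustrated with a much simplified striped counter. This is a sketch under stated assumptions, not the JDK implementation: the class StripedCounterSketch is hypothetical, the cell array has a fixed size, and the cell is picked at random instead of through the per-thread probe that the real code uses.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicLongArray;

class StripedCounterSketch {
    private final AtomicLong base = new AtomicLong();             // plays the role of baseCount
    private final AtomicLongArray cells = new AtomicLongArray(8); // plays the role of counterCells

    // Try one CAS on the base; if it fails because of contention, add to a random cell instead.
    void add(long delta) {
        long b = base.get();
        if (!base.compareAndSet(b, b + delta)) {
            int i = ThreadLocalRandom.current().nextInt(cells.length());
            cells.addAndGet(i, delta);
        }
    }

    // Like sumCount(): no lock, just the base plus every cell.
    long sum() {
        long sum = base.get();
        for (int i = 0; i < cells.length(); i++) {
            sum += cells.get(i);
        }
        return sum;
    }

    // Starts the given number of threads, each adding 1 perThread times, and returns the final sum.
    static long runDemo(int threads, int perThread) {
        StripedCounterSketch counter = new StripedCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.add(1);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.sum();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 100_000));
    }
}
```

The sum is exact once all writers have finished. While writers are still running, sum() may miss in-flight updates, which is the same trade-off size() makes.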

Summary

(1) ConcurrentHashMap is the thread-safe counterpart of HashMap;

(2) ConcurrentHashMap stores elements in an array + linked list + red-black tree structure;

(3) ConcurrentHashMap is much more efficient than Hashtable, which is also thread-safe;

(4) ConcurrentHashMap relies on synchronized, CAS, spinning, segmented (per-bucket) locking, and volatile;

(5) ConcurrentHashMap has no threshold or loadFactor fields; sizeCtl is used for control instead;

(6) sizeCtl = -1 means that initialization is in progress;

(7) sizeCtl = 0 is the default value and means that the default capacity will be used when the table is actually initialized;

(8) sizeCtl > 0 holds the requested capacity before initialization, and the next resize threshold after initialization or a resize;

(9) sizeCtl = (resizeStamp << 16) + (1 + nThreads) means that a resize is in progress. The high 16 bits hold the resize stamp and the low 16 bits hold the number of resizing threads plus 1;

(10) If a resize is in progress during an update operation, the current thread helps with the resize;

(11) Update operations use synchronized to lock the first node of the current bucket, which is the idea of segmented locking;

(12) The whole resize process is coordinated through CAS operations on the sizeCtl field, which makes that field very important;

(13) After a bucket has been migrated, a ForwardingNode is placed in it to mark the migration of that bucket as complete;

(14) The element count is also stored in segments, similar to the implementation of LongAdder;

(15) Updates to the element count hash different threads to different segments, reducing contention;

(16) If multiple threads update the same segment at the same time, the CounterCell array is expanded;

(17) The element count is the sum of all segments (baseCount plus every CounterCell);

(18) Reads take no lock, so ConcurrentHashMap is weakly consistent rather than strongly consistent;

(19) ConcurrentHashMap cannot store elements whose key or value is null;
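
The sizeCtl encoding in point (9) can be worked through with a small sketch. The resizeStamp() body and the two constants follow the JDK 8 source as I understand it; the class SizeCtlSketch and the helpers encode(), stampOf() and threadsOf() are hypothetical names added for illustration.

```java
class SizeCtlSketch {
    static final int RESIZE_STAMP_BITS = 16;
    static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;

    // Stamp for a table of length n; bit 15 is always set,
    // so the shifted value has its sign bit set and sizeCtl becomes negative.
    static int resizeStamp(int n) {
        return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
    }

    // sizeCtl while nThreads threads are resizing a table of length n.
    static int encode(int n, int nThreads) {
        return (resizeStamp(n) << RESIZE_STAMP_SHIFT) + (1 + nThreads);
    }

    // High 16 bits: the resize stamp.
    static int stampOf(int sizeCtl) {
        return sizeCtl >>> RESIZE_STAMP_SHIFT;
    }

    // Low 16 bits: the number of resizing threads plus 1.
    static int threadsOf(int sizeCtl) {
        return (sizeCtl & 0xFFFF) - 1;
    }

    public static void main(String[] args) {
        int sc = encode(16, 1);
        System.out.println(Integer.toHexString(sc));
        System.out.println(sc < 0);
        System.out.println(stampOf(sc) == resizeStamp(16));
        System.out.println(threadsOf(sc));
    }
}
```

For n = 16, numberOfLeadingZeros is 27, so the stamp is 0x801B and the first resizing thread sets sizeCtl to 0x801B0002, a negative int.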

Key Learning Points

Techniques worth learning from ConcurrentHashMap

(1) CAS + spinning, an optimistic locking approach, reduces the cost of thread context switches;

(2) Segmented locking reduces the inefficiency caused by contention on a single lock;

(3) CounterCell stores the element count in segments, reducing the inefficiency caused by many threads updating one field at the same time;

(4) @sun.misc.Contended (the annotation on CounterCell) avoids false sharing;

(5) Multiple threads cooperate on resizing;
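
Point (1) can be illustrated outside of ConcurrentHashMap with a generic CAS retry loop. This is a minimal sketch, not code from the JDK: the class CasSpinSketch and the method updateMax() are made up for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasSpinSketch {
    // Raises target to at least candidate without taking a lock.
    // Returns the value held by target once this call is done with it.
    static int updateMax(AtomicInteger target, int candidate) {
        for (;;) { // spin until the CAS succeeds or no update is needed
            int current = target.get();
            if (candidate <= current) {
                return current; // nothing to do
            }
            if (target.compareAndSet(current, candidate)) {
                return candidate; // this thread won the race
            }
            // Another thread changed the value in between: re-read and retry.
        }
    }

    public static void main(String[] args) {
        AtomicInteger max = new AtomicInteger(5);
        System.out.println(updateMax(max, 3));
        System.out.println(updateMax(max, 9));
    }
}
```

The thread never blocks, so there is no context switch. The price is wasted CPU when contention is high, which is why ConcurrentHashMap combines CAS with synchronized on the bucket head.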

Pitfalls of Using ConcurrentHashMap Concurrently

Consider the following example:

private static final Map<Integer, Integer> map = new ConcurrentHashMap<>();

public void unsafeUpdate(Integer key, Integer value) {
    Integer oldValue = map.get(key);
    if (oldValue == null) {
        map.put(key, value);
    }
}

If multiple threads call unsafeUpdate() at the same time, ConcurrentHashMap cannot guarantee thread safety, even though each individual call on the map is thread-safe.

The reason is that another thread may put() the same key between this thread's get() and put(), in which case this thread's put() overwrites the value that the other thread stored.

So how can this be fixed?

Use the putIfAbsent() method, which guarantees that the element is inserted only if the key is not already present:

public void safeUpdate(Integer key, Integer value) {
    map.putIfAbsent(key, value);
}
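
The return value of putIfAbsent() tells the caller who won the race: it returns null when this call inserted the value, and the existing value otherwise. A small sketch, where the class PutIfAbsentDemo and the helper insertOnce() are hypothetical:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class PutIfAbsentDemo {
    static final Map<Integer, Integer> map = new ConcurrentHashMap<>();

    // Returns true if this call inserted the value, false if the key was already present.
    static boolean insertOnce(Integer key, Integer value) {
        return map.putIfAbsent(key, value) == null;
    }

    public static void main(String[] args) {
        System.out.println(insertOnce(1, 10)); // first writer inserts
        System.out.println(insertOnce(1, 20)); // second writer is rejected
        System.out.println(map.get(1));        // the first value is kept
    }
}
```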

What if oldValue is compared not with null but with a specific value such as 1, as in the following?


public void unsafeUpdate(Integer key, Integer value) {
    Integer oldValue = map.get(key);
    if (oldValue != null && oldValue == 1) {
        map.put(key, value);
    }
}

In this case putIfAbsent() cannot be used.

ConcurrentHashMap provides another method for this situation: replace(K key, V oldValue, V newValue).

Note that replace(K key, V oldValue, V newValue) does not accept null: the public method throws NullPointerException if key, oldValue or newValue is null. Only the internal replaceNode() treats a null value as a deletion.

public void safeUpdate(Integer key, Integer value) {
    map.replace(key, 1, value);
}
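
The behaviour of replace(K, V, V) can be checked with a short sketch. The class ReplaceDemo and the helper replaceIfOne() are hypothetical names; the map contents are arbitrary.

```java
import java.util.concurrent.ConcurrentHashMap;

class ReplaceDemo {
    // Atomically sets key to value, but only if key is currently mapped to 1.
    static boolean replaceIfOne(ConcurrentHashMap<Integer, Integer> map, Integer key, Integer value) {
        return map.replace(key, 1, value);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>();
        map.put(1, 1);
        map.put(2, 7);

        System.out.println(replaceIfOne(map, 1, 5)); // current value is 1: replaced
        System.out.println(replaceIfOne(map, 2, 5)); // current value is 7: untouched
        System.out.println(replaceIfOne(map, 3, 5)); // key absent: nothing happens

        try {
            map.replace(1, 5, null); // null arguments are rejected
        } catch (NullPointerException e) {
            System.out.println("null newValue is not allowed");
        }
    }
}
```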

What if the if block contains not just a simple put() but other business operations followed by put(), as in the following?

public void unsafeUpdate(Integer key, Integer value) {
    Integer oldValue = map.get(key);
    if (oldValue != null && oldValue == 1) {
        System.out.println(System.currentTimeMillis());
        /**
         * Other business operations
         */
        System.out.println(System.currentTimeMillis());

        map.put(key, value);
    }
}

putIfAbsent() and replace() do not fit this case. One option is for the business code to guarantee thread safety itself, for example:


public void safeUpdate(Integer key, Integer value) {
    synchronized (map) {
        Integer oldValue = map.get(key);
        if (oldValue != null && oldValue == 1) {
            System.out.println(System.currentTimeMillis());
            /**
             * Other business operations
             */
            System.out.println(System.currentTimeMillis());

            map.put(key, value);
        }
    }
}

This is not elegant, but it does keep the business logic correct, provided that every thread that updates the map goes through the same lock.
Of course, once all access is serialized this way, ConcurrentHashMap adds little and an ordinary HashMap would do.
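
Another option worth considering is compute(), which ConcurrentHashMap documents as atomic for the whole invocation. The sketch below is one possible way to express the check-then-act logic with it, not a drop-in replacement: the class ComputeUpdateSketch is hypothetical, and the business operations would have to be short and must not update other mappings of the same map, because other updates to the same bin may be blocked while the function runs.

```java
import java.util.concurrent.ConcurrentHashMap;

class ComputeUpdateSketch {
    static final ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>();

    // Replaces the value for key with value, but only if the current value is 1.
    // The check and the write happen atomically inside compute().
    static void safeUpdate(Integer key, Integer value) {
        map.compute(key, (k, oldValue) -> {
            if (oldValue != null && oldValue == 1) {
                // Other business operations would go here (assumed to be short).
                return value;
            }
            // Keep the current mapping; returning null for an absent key leaves it absent.
            return oldValue;
        });
    }

    public static void main(String[] args) {
        map.put(1, 1);
        map.put(2, 7);
        safeUpdate(1, 5); // value was 1: updated
        safeUpdate(2, 9); // value was 7: unchanged
        safeUpdate(3, 9); // key absent: still absent
        System.out.println(map);
    }
}
```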
