HashMap source code learning

Posted by Cailean on Wed, 15 Dec 2021 03:59:33 +0100

Brief introduction

HashMap is implemented with an array plus linked lists (and, since JDK 8, red-black trees). It stores key/value pairs, and each key maps to exactly one value. Lookups and modifications are fast, with an average time complexity of O(1). It is not thread-safe and does not guarantee the iteration order of its elements.
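
As a quick illustration of the properties above, here is a minimal sketch using only the public java.util.HashMap API. The class name BasicUsageDemo and the sample keys are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

class BasicUsageDemo {
    // Builds a small map; the keys and values are arbitrary sample data.
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("apple", 1);
        map.put("banana", 2);
        map.put(null, 0); // a single null key is allowed
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = sample();
        // Putting an existing key replaces the value and returns the old one.
        Integer previous = map.put("apple", 10);
        System.out.println(previous);         // 1
        System.out.println(map.get("apple")); // 10
        System.out.println(map.get(null));    // 0
        System.out.println(map.size());       // 3
        // Iteration order is unspecified, so do not rely on it.
        System.out.println(map.keySet());
    }
}
```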

Storage structure

An array plus linked lists is used; hash collisions are resolved by chaining the colliding entries in a linked list. HashMap declares the array field transient Node<K,V>[] table; Node is a static nested class that implements Map.Entry<K,V> and uses its next field to point to the next node in the list.

        final int hash;
        final K key;
        V value;
        Node<K,V> next;


Since JDK 8, HashMap uses array + linked list + red-black tree, and each array slot (bucket) stores the head of a Node linked list. When an element is added, a hash value is computed from the key and then used to compute the array index.
When an element is added to a bucket whose linked list already holds 8 nodes, the list is converted to a red-black tree, provided the array length is at least 64 (otherwise the array is resized instead). When a tree bucket shrinks to 6 or fewer nodes, for example when it is split during a resize, it is converted back to a linked list. Locating a bucket by array index is O(1), searching a linked list is O(n), and searching a red-black tree is O(log n).

Attribute Variable

/**
 * The default initial capacity (array length) is 16. It is used on the first resize when the map was created with the no-argument constructor
 */
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;

/**
 * The maximum capacity is 2^30, which is the largest allowed array length
 */
static final int MAXIMUM_CAPACITY = 1 << 30;

/**
 * The default load factor. The table is resized once the number of elements exceeds capacity * load factor.
 */
static final float DEFAULT_LOAD_FACTOR = 0.75f;

/**
 * When an element is added to a bucket whose linked list already has at least 8 nodes, the list is transformed into a red-black tree
 */
static final int TREEIFY_THRESHOLD = 8;

/**
 * When a red-black tree bucket is left with 6 or fewer nodes (checked when the bucket is split during a resize), it is converted back to an ordinary linked list
 */
static final int UNTREEIFY_THRESHOLD = 6;

/**
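 * Minimum array length for treeification: a bucket is converted to a tree only when the array length is at least 64; below that, the array is resized instead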
 */
static final int MIN_TREEIFY_CAPACITY = 64;

/**
 * Array, also known as bucket
 */
transient Node<K,V>[] table;

/**
 * Cached result of entrySet()
 */
transient Set<Map.Entry<K,V>> entrySet;

/**
 * Number of elements
 */
transient int size;

/**
 * Number of structural modifications, used to implement fail-fast behavior during iteration
 */
transient int modCount;

/**
 * The table is expanded when the number of elements exceeds this value; threshold = capacity * loadFactor
 */
int threshold;

/**
 * Load factor
 */
final float loadFactor;
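
To make the role of modCount more concrete, the following sketch triggers the fail-fast check from a single thread by adding a new key while iterating. The class and method names (FailFastDemo, modifiedDuringIteration) are made up for this example, and the Javadoc describes fail-fast detection as best effort, so treat this as an illustration rather than a guarantee.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

class FailFastDemo {
    // Returns true if the iterator notices a structural change made during iteration.
    static boolean modifiedDuringIteration() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                // Adding a new key is a structural modification, so modCount changes
                // and no longer matches the value the iterator recorded at creation.
                map.put(key + "-copy", 0);
            }
        } catch (ConcurrentModificationException ex) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(modifiedDuringIteration());
    }
}
```

Replacing the value of an existing key does not count as a structural modification, which matches putVal returning before ++modCount in that case.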

Inner class

The array of HashMap is a Node array. Node is a typical singly linked list node. Its hash field stores the hash value computed from the key.

    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

As mentioned above, a linked list may be transformed into a red-black tree. For this purpose HashMap defines the static nested class TreeNode, which is a tree node structure, defined as follows:

    static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
        TreeNode<K,V> parent;  // red-black tree links
        TreeNode<K,V> left;
        TreeNode<K,V> right;
        TreeNode<K,V> prev;    // needed to unlink next upon deletion
        boolean red;
    }

Construction method

No-argument constructor; all other properties use their default values

    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR; //Use the default load factor, 0.75
    }

The constructor that specifies the initial capacity and load factor. HashMap(int initialCapacity) also calls this constructor

    public HashMap(int initialCapacity, float loadFactor) {
        // Check whether the incoming initial capacity is legal
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        // Check whether the loading factor is legal
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
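        // tableSizeFor rounds the capacity up to a power of two; the result is kept in threshold until the first resize
        this.threshold = tableSizeFor(initialCapacity);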
    }
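
The constructor above relies on tableSizeFor, which is not listed in this post. The sketch below shows the behavior it is expected to have (round up to the nearest power of two, clamped to the maximum capacity). It is written with Integer.highestOneBit for readability and is an assumption about the behavior, not the JDK's own code; the class name TableSizeSketch is made up.

```java
class TableSizeSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two that is >= cap, clamped to the range [1, MAXIMUM_CAPACITY].
    static int tableSizeFor(int cap) {
        if (cap <= 1) return 1;
        if (cap >= MAXIMUM_CAPACITY) return MAXIMUM_CAPACITY;
        // highestOneBit(cap - 1) is the top set bit of cap - 1;
        // doubling it gives the next power of two at or above cap.
        return Integer.highestOneBit(cap - 1) << 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(10)); // 16
        System.out.println(tableSizeFor(16)); // 16
        System.out.println(tableSizeFor(17)); // 32
    }
}
```

Keeping the array length a power of two is what allows the index to be computed with (n - 1) & hash instead of a modulo operation.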

put(K key, V value)

Add element

    public V put(K key, V value) {
        //Call hash(key) to calculate the hash value of key
        return putVal(hash(key), key, value, false, true);
    }
    static final int hash(Object key) {
        int h;
        //If the key is null, the hash is 0; otherwise take the key's hashCode() and XOR it with itself shifted right by 16 bits
        //An int has 4 bytes (32 bits), so this XORs the high 16 bits into the low 16 bits, which spreads the hashes more evenly across buckets.
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
    //Add element to container
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        //tab: local reference to the table; p: node at the computed index; n: table length; i: computed index
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        //If the table is null or its length is 0, it has not been allocated yet
        if ((tab = table) == null || (n = tab.length) == 0)
            //Call resize to allocate (or expand) the table
            n = (tab = resize()).length;
        //(n - 1) & hash computes the bucket index of the key. If that bucket is empty, place the new node there directly
        if ((p = tab[i = (n - 1) & hash]) == null)
            //Assign a new node to the bucket
            tab[i] = newNode(hash, key, value, null);
        //Otherwise the bucket is occupied (hash collision), so the existing list or tree must be searched
        else {
            Node<K,V> e; K k;
            //If the first node of the bucket has the same hash and an equal key, record it as e so its value can be overwritten later
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            //If the first element is a tree node, the putTreeVal of the tree node is called to insert the element
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                //Traverse the bucket's linked list; binCount counts the nodes visited so far
                for (int binCount = 0; ; ++binCount) {
                    //If the next node is null, append a new node after p (tail insertion) and leave the loop
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        //After appending, if the list already held at least 8 nodes (binCount >= 7), try to convert it to a red-black tree
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            //treeifyBin logic: if the array length is less than 64, call resize to expand instead;
                            //otherwise convert the linked list into a red-black tree.
                            treeifyBin(tab, hash);
                        break;
                    }
                    //If a node with an equal key already exists in the list, stop here; e refers to that node
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            //When the same key value exists in the container, e is not null
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                //Replace the old value unless onlyIfAbsent is set and the old value is non-null
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                //Do something after the node is accessed, which is used in LinkedHashMap
                afterNodeAccess(e);
                return oldValue;
            }
        }
        //Reaching here means no existing mapping was found for the key. Increment the modification count
        ++modCount;
        //Increment the element count and resize if it now exceeds the threshold
        if (++size > threshold)
            resize();
        //Callback after insertion; a no-op in HashMap, overridden by LinkedHashMap
        afterNodeInsertion(evict);
        return null;
    }
    //Called from putVal when an element is added to a bucket whose list already holds at least 8 nodes.
    final void treeifyBin(Node<K,V>[] tab, int hash) {
        int n, index; Node<K,V> e;
        //Check whether the table is null or shorter than MIN_TREEIFY_CAPACITY (64)
        if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
            resize(); //When the array length is less than 64, resize instead of treeifying
        //Otherwise, if the bucket at this index is not empty, convert it to a red-black tree
        else if ((e = tab[index = (n - 1) & hash]) != null) {
            TreeNode<K,V> hd = null, tl = null;
            do {
                //Replace each Node with a TreeNode, keeping the nodes linked in their original order; hd.treeify(tab) below builds the actual tree.
                TreeNode<K,V> p = replacementTreeNode(e, null);
                if (tl == null)
                    hd = p;
                else {
                    p.prev = tl;
                    tl.next = p;
                }
                tl = p;
            } while ((e = e.next) != null);
            if ((tab[index] = hd) != null)
                hd.treeify(tab);
        }
    }

Summary of how HashMap adds an element:

  1. The hash value is obtained by calling hash(key), which XORs the high 16 bits of the key's hashCode() into its low 16 bits.
  2. Call the putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) method.
  3. First, check whether the array is null or has length 0. If so, call resize() to allocate it.
  4. Compute the array index as (n - 1) & hash, where n is the array length. If that slot is empty, store the new node there directly.
  5. If the slot is occupied, check whether the same key already exists. If it does, record the existing node and replace its value; if not, append a new node at the tail of the linked list (or insert it into the tree), then check the list length to decide whether it needs to be converted to a red-black tree.
  6. After a new node is added, increment the element count and compare it with the threshold to decide whether to resize.
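
Steps 1 and 4 can be tried out in isolation. The sketch below copies the hash function shown earlier and the index expression from putVal into a small class; the names SpreadDemo, spread and indexFor are made up for this example, and the two Integer keys were chosen because Integer.hashCode() returns the int value itself.

```java
class SpreadDemo {
    // Same computation as HashMap.hash(key) shown above.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Same index expression as in putVal: (n - 1) & hash, where n is a power of two.
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        Integer a = 0x10000, b = 0x20000; // the keys differ only above bit 15
        // Without spreading, both raw hash codes fall into bucket 0 of a 16-slot table.
        System.out.println(indexFor(a.hashCode(), 16)); // 0
        System.out.println(indexFor(b.hashCode(), 16)); // 0
        // With spreading, the high bits influence the index.
        System.out.println(indexFor(spread(a), 16));    // 1
        System.out.println(indexFor(spread(b), 16));    // 2
    }
}
```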

resize method

HashMap expands its capacity automatically. From put() we know that a resize happens when the array is null or has length 0, or when the number of elements exceeds the threshold.

    final Node<K,V>[] resize() {
        //Array before capacity expansion, old array
        Node<K,V>[] oldTab = table;
        //Old array length size
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        //The old resize threshold (or the initial capacity stored by the constructor if the table is not allocated yet)
        int oldThr = threshold;
        int newCap, newThr = 0;
        //If the length of the old array is greater than 0
        if (oldCap > 0) {
            //If the old array length is already at least the maximum capacity (1 << 30), do not expand; only raise the threshold
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            //Otherwise the new length is twice the old one. If it is below the maximum capacity and the old length is at least 16, the threshold is doubled too.
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            //A map created with a constructor that takes an initial capacity reaches this branch when the first element is inserted
            //The old capacity is 0 and the old threshold is greater than 0, so the new capacity is set to the old threshold
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            //A map created with the no-argument constructor reaches this branch when the first element is added
            //The new array length is set to the default of 16 and the threshold to array length * load factor, which is 16 * 0.75 = 12 by default
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            // If the new threshold is still 0, compute it as capacity * load factor, falling back to Integer.MAX_VALUE once the maximum capacity is reached
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        //Set the new capacity expansion threshold to the hashmap attribute
        threshold = newThr;
        //Create the new array with the new capacity
        @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        //Assign the new array to the table field
        table = newTab;
        //If the old array is not empty, start array migration
        if (oldTab != null) {
            //Traversing the old array
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                //If bucket j of the old array is not empty, take its first node as e
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    //If the array subscript has only one node, the node is relocated and migrated to the new array.
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode)
                        // If the first node is a tree node, split the tree into two and place them in the new table
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        //Here the bucket holds a linked list with more than one node (not a red-black tree)
                        //Because the capacity doubles, the list is split into two lists that are placed in the new table.
                        //For example, if the old array length is 4, nodes with hash values 3, 7, 11 and 15 all sit at index 3.
                        //After expanding to length 8, 3 and 11 stay at index 3, while 7 and 15 move to index 7
                        //Four references track the head and tail of each list; each head is later assigned directly to its array slot
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        //The next temporary variable is used to traverse the linked list
                        Node<K,V> next;
                        do {
                            next = e.next;
                            //Nodes with (e.hash & oldCap) == 0 go to the low list, e.g. 3 & 4 == 0
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {//Nodes with (e.hash & oldCap) != 0 go to the high list, e.g. 7 & 4 == 4
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        //After the traversal, if the low list is not empty, place it at the same index as in the old array
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        //The high list is placed at old index + old array length (in the example, index 3 moves to index 7)
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }

Summary of capacity expansion mechanism

1. A resize is performed when the number of elements exceeds the threshold.
2. If the no-argument constructor was used, the table is initialized with default values when the first element is inserted: a capacity of 16 and a threshold of 12.
3. If a constructor with an initial capacity was used, the capacity on first insertion equals the threshold stored by the constructor, which is the smallest power of 2 greater than or equal to the requested capacity; the threshold is then recomputed as capacity * load factor.
4. If the old array length is greater than 0, the new array is twice as long as the old one, up to the maximum length of 2^30, and the new threshold is normally twice the old threshold.
5. Create a new array whose length is the capacity after expansion.
6. Migrate the data from the old array to the new one. Each linked list is split in two: nodes with (hash & old array length) == 0 stay at their original index, and nodes where the result is not 0 move to original index + old array length.
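
The split rule in step 6 can be checked with the example used in the resize comments (old length 4, hashes 3, 7, 11 and 15). This is a small standalone sketch working on plain int hashes rather than real nodes; the class and method names (SplitDemo, select) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class SplitDemo {
    // Returns the hashes that go to the low list (high == false) or the high list (high == true).
    static List<Integer> select(int[] hashes, int oldCap, boolean high) {
        List<Integer> out = new ArrayList<>();
        for (int h : hashes) {
            // oldCap is a power of two, so this tests the single extra bit
            // that the doubled index mask takes into account.
            boolean movesUp = (h & oldCap) != 0;
            if (movesUp == high) out.add(h);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] hashes = {3, 7, 11, 15};
        int oldCap = 4, newCap = oldCap << 1;
        System.out.println("low  list: " + select(hashes, oldCap, false)); // [3, 11]
        System.out.println("high list: " + select(hashes, oldCap, true));  // [7, 15]
        for (int h : hashes) {
            // The new index is either the old index (3) or old index + oldCap (7).
            System.out.println(h + " -> index " + (h & (newCap - 1)));
        }
    }
}
```

Because only one bit decides the destination, no hash has to be recomputed during migration and the relative order of nodes within each list is preserved.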

get(Object key)

Get the value for a given key; the average time complexity is O(1).

    public V get(Object key) {
        Node<K,V> e;
        //As in put, hash(key) is called to compute the hash of the key
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        //Compute the index as (n - 1) & hash and check that the bucket is not empty
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            // Check whether the first node is the one being looked up. If it is, return it directly
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            //If the first element is not the element to be found and the linked list has more than one element
            if ((e = first.next) != null) {
                //When the first node is a tree node, it is searched according to the tree structure
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                //Otherwise, it will traverse the whole linked list until it finds the value to be found. If not, it will return null.
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }
final TreeNode<K, V> getTreeNode(int h, Object k) {
    // Find from the root node of the tree
    return ((parent != null) ? root() : this).find(h, k, null);
}

final TreeNode<K, V> find(int h, Object k, Class<?> kc) {
    TreeNode<K, V> p = this;
    do {
        int ph, dir;
        K pk;
        //pl: left child, pr: right child
        TreeNode<K, V> pl = p.left, pr = p.right, q;
        //If the hash of node p is greater than h, continue in the left subtree
        if ((ph = p.hash) > h)
            p = pl;
        //If the hash of node p is less than h, continue in the right subtree
        else if (ph < h)
            p = pr;
        //The hashes are equal and the keys are equal: found, return directly
        else if ((pk = p.key) == k || (k != null && k.equals(pk)))
            return p;
        // The hashes are equal but the keys differ. The left subtree is empty, so check the right subtree
        else if (pl == null)
            p = pr;
        //The right subtree is empty. Check the left subtree
        else if (pr == null)
            p = pl;
        else if ((kc != null ||
                (kc = comparableClassFor(k)) != null) &&
                (dir = compareComparables(kc, k, pk)) != 0)
            // If the key class implements Comparable, use the comparison result to choose the left or right subtree
            p = (dir < 0) ? pl : pr;
        else if ((q = pr.find(h, k, kc)) != null)
            // If no ordering can be established, search the right subtree recursively first
            return q;
        else
            // If it is not found there, continue in the left subtree
            p = pl;
    } while (p != null);
    return null;
}
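
To exercise the collision path of get, the sketch below forces every key into the same bucket by returning a constant hash code. The Key class, the constant 42 and the method name lookup are all made up for this example. Whether the bucket is actually held as a red-black tree is an internal detail that the code does not observe; it only shows that lookups stay correct when all hashes collide, and that implementing Comparable gives find an ordering to work with.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // A key whose hash code is deliberately constant, so all instances collide.
    static final class Key implements Comparable<Key> {
        final int id;
        Key(int id) { this.id = id; }
        @Override public int hashCode() { return 42; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }
        @Override public int compareTo(Key other) {
            return Integer.compare(id, other.id);
        }
    }

    // Inserts count colliding keys (id -> id) and looks one of them up.
    static Integer lookup(int count, int wanted) {
        Map<Key, Integer> map = new HashMap<>();
        for (int i = 0; i < count; i++) {
            map.put(new Key(i), i);
        }
        return map.get(new Key(wanted));
    }

    public static void main(String[] args) {
        System.out.println(lookup(100, 57));  // 57
        System.out.println(lookup(100, 500)); // null
    }
}
```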

remove(Object key)

    public V remove(Object key) {
        Node<K,V> e;
        //Call the hash method to calculate the hash value of the key
        return (e = removeNode(hash(key), key, null, false, true)) == null ?
            null : e.value;
    }
    final Node<K,V> removeNode(int hash, Object key, Object value,
                               boolean matchValue, boolean movable) {
        Node<K,V>[] tab; Node<K,V> p; int n, index;
        //The table is not empty and the bucket for the key to be deleted has a first node
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (p = tab[index = (n - 1) & hash]) != null) {
            Node<K,V> node = null, e; K k; V v;
            //The first node is exactly the one being looked for. Assign it to node for the deletion below
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                node = p;
            //Otherwise, if the bucket holds more than one node
            else if ((e = p.next) != null) {
                //If the first element is a tree node, the node is found as a tree
                if (p instanceof TreeNode)
                    node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
                else {
                    // Traverse the entire linked list to find nodes
                    do {
                        if (e.hash == hash &&
                            ((k = e.key) == key ||
                             (key != null && key.equals(k)))) {
                            node = e;
                            break;
                        }
                        //Remember the predecessor so the matched node can be unlinked later
                        p = e;
                    } while ((e = e.next) != null);
                }
            }
            //If the node was found, check matchValue: when no value match is required, delete it directly; otherwise delete only if the node's value equals the value passed in
            if (node != null && (!matchValue || (v = node.value) == value ||
                                 (value != null && value.equals(v)))) {
                // If it is a tree node, call the delete method of the tree
                if (node instanceof TreeNode)
                    ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
                //If the element to be deleted is the first element, move the second element to the first position
                else if (node == p)
                    tab[index] = node.next;
                else
                    //Unlink the node: make p (its predecessor) point to the node's successor.
                    p.next = node.next;
                ++modCount;
                --size;
                // Post processing after deleting nodes
                afterNodeRemoval(node);
                return node;
            }
        }
        return null;
    }

Logic of removing an entry from HashMap by key

1. Compute the hash of the key and, from it, the index of the key in the array.
2. Find the node for the key in that bucket. A tree bucket is searched through the TreeNode class; a linked list is searched with a do-while loop.
3. Unlink the node and decrement the size by 1.
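
The matchValue parameter of removeNode is visible through the public API: remove(key) deletes regardless of the stored value, while the two-argument Map.remove(key, value) available since Java 8 deletes only when the stored value matches. The sketch below shows both; the class name RemoveDemo and the sample data are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

class RemoveDemo {
    // Builds a small map with arbitrary sample data.
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = sample();
        System.out.println(map.remove("a"));       // 1 (the removed value)
        System.out.println(map.remove("missing")); // null (no such key)
        System.out.println(map.remove("b", 99));   // false (value does not match, nothing removed)
        System.out.println(map.remove("b", 2));    // true (key and value both match)
        System.out.println(map.isEmpty());         // true
    }
}
```

Note that a null return from remove(key) is ambiguous when null values are stored; containsKey can be used beforehand if the distinction matters.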

Other common methods

    //Remove all entries
    public void clear() {
        Node<K,V>[] tab;
        modCount++;
        if ((tab = table) != null && size > 0) {
            size = 0;
            //Traverse the array and directly assign null
            for (int i = 0; i < tab.length; ++i)
                tab[i] = null;
        }
    }
    //Whether the map contains the specified key; uses the same getNode lookup as get(key)
    public boolean containsKey(Object key) {
        return getNode(hash(key), key) != null;
    }
    //Whether the map contains the given value: a nested loop over every bucket and every node in it. Values are not hashed, so each node has to be compared
    public boolean containsValue(Object value) {
        Node<K,V>[] tab; V v;
        if ((tab = table) != null && size > 0) {
            for (int i = 0; i < tab.length; ++i) {
                for (Node<K,V> e = tab[i]; e != null; e = e.next) {
                    if ((v = e.value) == value ||
                        (value != null && value.equals(v)))
                        return true;
                }
            }
        }
        return false;
    }

Summary

  1. HashMap is a hash table that uses an array + linked list + red-black tree storage structure;
  2. The default initial capacity of HashMap is 16 (1 << 4), the default load factor is 0.75f, and the capacity doubles on each expansion;
  3. When the array length is at least 64 and an element is added to a bucket whose linked list already holds 8 nodes, the list is converted to a tree;
  4. When the array length is less than 64 in the same situation, the list is not converted to a tree; the array is expanded instead;
  5. When a tree bucket is left with 6 or fewer nodes after being split during a resize, it is converted back to a linked list;
  6. HashMap is not a thread-safe container.

I hope you guys can help point out mistakes and learn and make progress together

Topics: Java linked list