put and resize of HashMap

Posted by chawezul on Sun, 02 Jan 2022 19:32:06 +0100

Brief introduction
Map is a collection type that we use often. Let's take a look at its inheritance diagram:

HashMap stores data according to the hashCode value of the key. In most cases it can locate a value directly, so access is fast, but the traversal order is not guaranteed. HashMap allows at most one null key and any number of null values. HashMap is not thread safe: if multiple threads write to the same HashMap at the same time without synchronization, the data may become inconsistent. If you need thread safety, you can wrap the map with the Collections.synchronizedMap method, or use ConcurrentHashMap.
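The null-handling and thread-safety points above can be checked with a short sketch. The class and method names here (NullAndThreadSafetyDemo, withNulls, concurrentMapRejectsNullKey) are made up for illustration; only the java.util classes are real.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; not part of the JDK.
class NullAndThreadSafetyDemo {
    // Builds a HashMap holding one null key and two null values.
    static Map<String, String> withNulls() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "first");
        map.put(null, "second"); // same (null) key: the earlier value is overwritten
        map.put("a", null);
        map.put("b", null);
        return map;
    }

    // Returns true if ConcurrentHashMap refuses a null key.
    static boolean concurrentMapRejectsNullKey() {
        Map<String, String> chm = new ConcurrentHashMap<>();
        try {
            chm.put(null, "x");
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> map = withNulls();
        System.out.println(map.size());    // 3
        System.out.println(map.get(null)); // second

        // Every call on the wrapper is synchronized on the wrapper object.
        Map<String, String> syncMap = Collections.synchronizedMap(map);
        syncMap.put("c", "value");
        System.out.println(syncMap.size()); // 4

        System.out.println(concurrentMapRejectsNullKey()); // true
    }
}
```

Note that ConcurrentHashMap, unlike HashMap, accepts neither null keys nor null values, so it is not a drop-in replacement for a map that relies on them.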
Storage structure:
The storage structure of HashMap is array + linked list + red-black tree, as shown in the figure:

Node[] is what we often call the bucket array. Let's see what each Node contains:

hash: the hash value used to locate the index in the array
key: the key in the map
value: the value in the map
next: a reference to the next Node in the same bucket
Node is a static nested class of HashMap that implements the Map.Entry interface; it is essentially a mapping (key-value pair). Each black dot in the figure above is a Node object.
HashMap is implemented as a hash table. Hash collisions can be resolved with either open addressing or separate chaining (the chain address method). HashMap in Java uses separate chaining. In short, separate chaining combines an array with linked lists: each array element holds a linked list. When a key is hashed, the array index is computed and the entry is placed on the linked list at that index. Open addressing (also known as closed hashing) works differently: on a collision, if the hash table is not full there must be an empty slot somewhere in the table, so the key is stored in the "next" empty slot after the colliding position.
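To make separate chaining concrete, here is a minimal sketch of a chained table with a fixed 16-slot array. ChainedTable and its Node are hypothetical names for this illustration, not the JDK classes, and the sketch leaves out resizing and treeification. It inserts new nodes at the head of the chain for brevity, whereas the JDK 1.8 code appends at the tail.

```java
import java.util.Objects;

// Hypothetical, simplified chained hash table; not the JDK implementation.
class ChainedTable {
    // Mirrors the hash / key / value / next fields described above.
    static final class Node {
        final int hash;
        final String key;
        String value;
        Node next;

        Node(int hash, String key, String value, Node next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node[] buckets = new Node[16]; // length is a power of two
    private int size;

    // Returns the previous value for the key, or null if there was none.
    String put(String key, String value) {
        int hash = Objects.hashCode(key);        // 0 for a null key
        int index = (buckets.length - 1) & hash; // array subscript
        for (Node n = buckets[index]; n != null; n = n.next) {
            if (n.hash == hash && Objects.equals(n.key, key)) {
                String old = n.value;
                n.value = value;                 // key exists: overwrite
                return old;
            }
        }
        // Collision or empty slot: link the new node in front of the chain.
        buckets[index] = new Node(hash, key, value, buckets[index]);
        size++;
        return null;
    }

    String get(String key) {
        int hash = Objects.hashCode(key);
        for (Node n = buckets[(buckets.length - 1) & hash]; n != null; n = n.next) {
            if (n.hash == hash && Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        ChainedTable t = new ChainedTable();
        t.put("Aa", "1");
        t.put("BB", "2"); // "Aa" and "BB" both have hashCode 2112, so they share a bucket
        System.out.println(t.get("Aa"));      // 1
        System.out.println(t.get("BB"));      // 2
        System.out.println(t.put("Aa", "3")); // 1 (the old value)
        System.out.println(t.size());         // 2
    }
}
```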
Before looking at hashing and the resize process, we first need to understand several fields of HashMap. The following fields are declared in the HashMap source code:
int threshold; // Limit of key value pairs that can be accommodated
final float loadFactor; // Load factor
int modCount; // Modification times
int size; // Number of key value pairs stored in map
First, the initial length of the Node[] table is length (default 16), loadFactor is the load factor (default 0.75), and threshold is the maximum number of nodes (key-value pairs) the HashMap can hold before it resizes: threshold = length * loadFactor. In other words, for a given array length, the larger the load factor, the more key-value pairs the map can hold.
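As a quick worked example of the formula, the sketch below computes the threshold for a few capacities. ThresholdDemo and its threshold helper are illustrative names, not JDK API; the cast to int follows the same truncation used in the resize code.

```java
// Hypothetical helper for illustration; HashMap computes this internally.
class ThresholdDemo {
    // threshold = length * loadFactor, truncated to an int
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f)); // 12
        System.out.println(threshold(32, 0.75f)); // 24
        System.out.println(threshold(16, 1.0f));  // 16
    }
}
```

With the defaults, a table of length 16 is resized when the 13th key-value pair is inserted, because the size then exceeds the threshold of 12.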
put method in Map:

The source code of the JDK 1.8 HashMap put method (the putVal method that put delegates to) is as follows:

final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // If the table is empty it must be created first; this happens when
    // the first key-value pair is put (you can verify it in a debugger)
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // Calculate the index; if the bucket is empty, place a new node there
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        // The first node has the same key: its value will be overwritten below
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode) // The bucket is a red-black tree
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else { // The bucket is a linked list
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    // Reached the tail: append the new node
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break; // Found an existing node with the same key
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    // Resize when the number of key-value pairs exceeds the threshold
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
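The index calculation (n - 1) & hash in the code above only works as a modulo because the table length n is always a power of two. The sketch below illustrates this. IndexDemo and indexFor are names made up for this example; spread reproduces the bit-mixing step that JDK 1.8 applies to hashCode() before the value reaches putVal.

```java
// Hypothetical demo class; the two helpers mirror expressions used inside HashMap.
class IndexDemo {
    // XOR the high 16 bits into the low 16 bits so that small tables
    // still see the influence of the upper bits of hashCode().
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table of length n (n must be a power of two).
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        for (int hash : new int[] {0, 5, 21, 2112, -7, Integer.MIN_VALUE}) {
            // For a power-of-two n the mask matches the non-negative modulo.
            System.out.println(hash + " -> " + indexFor(hash, n)
                    + " (floorMod: " + Math.floorMod(hash, n) + ")");
        }
        System.out.println(indexFor(spread("Aa"), n)); // 0, since spread("Aa") is 2112
    }
}
```

The mask is cheaper than the % operator and never yields a negative index, which is one reason the table length is kept at a power of two.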

resize() of Map:
Resizing means recalculating the capacity. As elements keep being added to a HashMap, the internal array eventually cannot hold more elements efficiently, so the object must grow the array. Arrays in Java cannot grow in place, so the approach is to replace the existing small array with a new, larger one, just as when a small bucket cannot hold enough water, we switch to a larger bucket.
For ease of understanding, we use the JDK 1.7 code, which is simpler to read; in essence the difference is small.

void resize(int newCapacity) {   // Incoming new capacity
    Entry[] oldTable = table;    // Reference the Entry array before resizing
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {  // The array has already reached the maximum size (2^30)
        threshold = Integer.MAX_VALUE;      // Set the threshold to the int maximum (2^31 - 1) so that no further resize is triggered
        return;
    }
    Entry[] newTable = new Entry[newCapacity];   // Initialize a new Entry array
    transfer(newTable);                          // !! Transfer the data to the new Entry array
    table = newTable;                            // The table field of HashMap now refers to the new Entry array
    threshold = (int)(newCapacity * loadFactor); // Update the threshold
}
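The transfer call above is not shown in this article. As a rough sketch of what such a step has to do, the code below walks every old bucket and relinks each entry into the bucket given by its hash and the new length. ResizeSketch, its Entry type, and the add and bucketSize helpers are assumptions made for this illustration, not the JDK source.

```java
// Hypothetical, simplified model of resize + transfer; not the JDK implementation.
class ResizeSketch {
    static final class Entry {
        final int hash;
        final String key;
        Entry next;

        Entry(int hash, String key, Entry next) {
            this.hash = hash;
            this.key = key;
            this.next = next;
        }
    }

    Entry[] table = new Entry[4]; // small power-of-two capacity for the demo

    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    // Adds an entry with an explicit hash so that collisions are easy to stage.
    void add(int hash, String key) {
        int i = indexFor(hash, table.length);
        table[i] = new Entry(hash, key, table[i]);
    }

    void resize(int newCapacity) {
        Entry[] newTable = new Entry[newCapacity];
        transfer(newTable);
        table = newTable;
    }

    // Moves every entry from the old array into newTable.
    void transfer(Entry[] newTable) {
        for (int j = 0; j < table.length; j++) {
            Entry e = table[j];
            table[j] = null;                 // release the old bucket
            while (e != null) {
                Entry next = e.next;         // remember the rest of the old chain
                int i = indexFor(e.hash, newTable.length);
                e.next = newTable[i];        // head insertion into the new bucket
                newTable[i] = e;
                e = next;
            }
        }
    }

    int bucketSize(int index) {
        int count = 0;
        for (Entry e = table[index]; e != null; e = e.next) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        ResizeSketch map = new ResizeSketch();
        for (int hash : new int[] {1, 5, 9, 13}) {
            map.add(hash, "k" + hash);       // all four land in bucket 1 of 4
        }
        System.out.println(map.bucketSize(1)); // 4
        map.resize(8);
        System.out.println(map.bucketSize(1)); // 2 (hashes 1 and 9)
        System.out.println(map.bucketSize(5)); // 2 (hashes 5 and 13)
    }
}
```

In this sketch, doubling the capacity leaves each entry either at its old index or at the old index plus the old capacity, so one long chain is split across two buckets.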