[source code] HashMap source code reading

Posted by richinsc on Mon, 24 Jan 2022 16:09:56 +0100

2020-02-02

The following is a source code analysis of HashMap in JDK 11. Most of the analysis is given as comments in the code.

1 Overview

1.1 main concepts of HashMap

  1. HashMap is a hash table based implementation of the Map interface and provides all of the Map operations. HashMap permits a null key and null values, whereas Hashtable permits neither. HashMap makes no guarantee about the order of its elements; in particular, it does not guarantee that the order stays the same over time.

  2. HashMap provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets. Iterating over the collection views takes time proportional to the "capacity" of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). Therefore, if iteration performance matters, do not set the initial capacity too high (or the load factor too low).

  3. HashMap is not thread safe. If several threads access a HashMap concurrently and at least one of them modifies it structurally, the access must be synchronized externally; otherwise the internal structure can be corrupted. The iterators are fail-fast: if the map is structurally modified after an iterator has been created (other than through the iterator's own remove method), the iterator throws a ConcurrentModificationException. This detection is best effort and is not guaranteed.
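
The fail-fast behaviour from item 3 can be observed even with a single thread, which keeps the example deterministic. This is only a small sketch: the class and method names (FailFastDemo, modifiesDuringIteration) are made up for illustration, and only the java.util calls are real API.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

final class FailFastDemo {
    // Returns true if the iterator detected a structural modification.
    static boolean modifiesDuringIteration() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                // Adding a new key is a structural modification and bumps modCount,
                // so the next call to the iterator's next() fails.
                map.put(key + "-copy", 0);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(modifiesDuringIteration()); // true
    }
}
```

With several threads the same check may or may not fire, which is why the exception cannot be relied on for correctness.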

1.2 basic implementation of HashMap

HashMap is backed by a hash table (an array of singly linked lists). When a linked list becomes too long, it is converted into a red-black tree so that lookups in that bucket take O(log n) time.
HashMap is declared as class HashMap<K, V> extends AbstractMap<K, V> implements Map<K, V>, Cloneable, Serializable

1.3 HashMap internal class

1.Node
2.KeySet
3.Values
4.EntrySet
5.HashIterator
6.KeyIterator
7.ValueIterator
8.EntryIterator
9.HashMapSpliterator
10.KeySpliterator
11.ValueSpliterator
12.EntrySpliterator
13.TreeNode, which represents a red-black tree node. All of the red-black tree operations in HashMap are implemented in this class

1.4 capacity expansion principle

HashMap doubles its capacity on every expansion. As a result, each entry either stays at its original index or moves to the index (original index + original capacity) in the newly expanded array.
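
The "same index or index + old capacity" rule follows from the index being computed as (capacity - 1) & hash. The sketch below checks it for a few hash values; ResizeIndexDemo, indexFor and staysOrMovesByOldCap are illustrative names, not part of the JDK.

```java
final class ResizeIndexDemo {
    // Bucket index for a power-of-two capacity, as in (n - 1) & hash.
    static int indexFor(int hash, int capacity) {
        return (capacity - 1) & hash;
    }

    // True if every hash lands at its old index or at old index + oldCap after doubling.
    static boolean staysOrMovesByOldCap(int[] hashes, int oldCap) {
        int newCap = oldCap << 1;
        for (int h : hashes) {
            int oldIdx = indexFor(h, oldCap);
            int newIdx = indexFor(h, newCap);
            // The single extra bit examined after doubling is exactly the oldCap bit.
            int expected = (h & oldCap) == 0 ? oldIdx : oldIdx + oldCap;
            if (newIdx != expected) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(indexFor(5, 16));   // 5
        System.out.println(indexFor(5, 32));   // 5  (bit 16 is clear: stays)
        System.out.println(indexFor(21, 16));  // 5
        System.out.println(indexFor(21, 32));  // 21 (bit 16 is set: 5 + 16)
    }
}
```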

1.5 hash calculation

HashMap computes the hash value with the hash() function (also known as the "perturbation function"). The expression is key == null ? 0 : (h = key.hashCode()) ^ (h >>> 16), and the computed value is stored in Node.hash.

The hash computation XORs the high 16 bits of the hash code into its low 16 bits. The high 16 bits of the result are unchanged, and the low 16 bits become the XOR of the two halves. Why do this? The default table size is only 16, and the bucket index is the hash modulo the table length, which for a power-of-two length keeps only the lowest bits of the hash. Without the perturbation, the high-order bits would never influence the index, and hash codes that differ only in their high bits would always collide. XORing the high 16 bits into the low 16 bits carries information from the high bits into the low bits and makes the low bits more random.

After the high and low halves have been XORed, the result is reduced modulo the table length to obtain the bucket index; because the length is a power of two, this is computed as (n - 1) & hash. For details, see: What is the hash method principle of HashMap in JDK source code - Pangjun's answer - Zhihu
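
A small sketch of the effect, using two hash codes that differ only above bit 15. SpreadDemo, rawIndex and spreadIndex are names invented for this example; spread() uses the same expression as hash() for a non-null key.

```java
final class SpreadDemo {
    // Same expression as HashMap.hash(), applied directly to an int hash code.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Index when the raw hash code is masked without perturbation.
    static int rawIndex(int h, int tableLength) {
        return (tableLength - 1) & h;
    }

    // Index when the perturbed hash is masked, as HashMap does.
    static int spreadIndex(int h, int tableLength) {
        return (tableLength - 1) & spread(h);
    }

    public static void main(String[] args) {
        int a = 0x10000; // only bit 16 set
        int b = 0x20000; // only bit 17 set
        System.out.println(rawIndex(a, 16) + " " + rawIndex(b, 16));       // 0 0
        System.out.println(spreadIndex(a, 16) + " " + spreadIndex(b, 16)); // 1 2
    }
}
```

Without the perturbation both keys fall into bucket 0 of a 16-slot table; with it they are separated.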

1.6 insertion principle

In the hash calculation, a null key gets the hash value 0. It is then inserted by putVal() like any other key, so it always lands in bucket 0.
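
The following sketch shows the visible consequence: a null key behaves like any other key in HashMap, while Hashtable rejects it. NullKeyDemo and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

final class NullKeyDemo {
    // Stores two values under the null key and reports the final value and size.
    static String putAndGetNullKey() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "first");
        map.put(null, "second"); // same key, so the value is overwritten
        return map.get(null) + ":" + map.size();
    }

    // Hashtable calls key.hashCode() without a null check, hence the exception.
    static boolean hashtableRejectsNullKey() {
        try {
            new Hashtable<String, String>().put(null, "x");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(putAndGetNullKey());        // second:1
        System.out.println(hashtableRejectsNullKey()); // true
    }
}
```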

1.7 new HashMap()

From the source code (see the constructors below), we can see that new HashMap() is very cheap: it only sets the load factor. Creating the table is deferred for as long as possible, which is why many operations in HashMap have to check whether the table has been initialized. One advantage of this design is that a large number of HashMaps can be created at once without worrying about the allocation overhead of each one.

2 member variables of HashMap

public class HashMap<K, V> extends AbstractMap<K, V> implements Map<K, V>, Cloneable, Serializable {
	//For serialization
	private static final long serialVersionUID = 362498820763181265L;
	//The default capacity of HashMap is 16
	static final int DEFAULT_INITIAL_CAPACITY = 16;
	//The maximum capacity is 1073741824 (2 to the power of 30, i.e. 1 << 30)
	static final int MAXIMUM_CAPACITY = 1073741824;
	//The default load factor is 0.75f
	static final float DEFAULT_LOAD_FACTOR = 0.75F;
	/*
	The threshold for converting a linked list into a red-black tree is 8: when an element is added to a bucket that already holds at least 8 nodes, the list is converted into a red-black tree (treeification).
	Why treeify? We use HashMap because, in the ideal case, the hash lets us find an element in O(1) time. That only holds in the ideal case. When hash collisions are frequent, get() first locates the bucket in the array and then has to traverse the linked list in that bucket to find the element, which takes O(n) time for a list of n nodes and defeats the purpose of HashMap.
	A red-black tree solves the long-list problem: a lookup in the tree takes O(log n) time instead of O(n).
	*/
	static final int TREEIFY_THRESHOLD = 8;
	//The threshold for converting a red-black tree back into a linked list is 6: a tree bin left with at most 6 nodes is converted back. This is done in TreeNode.split() during resize()
	static final int UNTREEIFY_THRESHOLD = 6;
	/*
	The minimum table capacity for treeification. To treeify a bucket it is not enough to exceed TREEIFY_THRESHOLD; the table capacity must also be at least MIN_TREEIFY_CAPACITY. If only TREEIFY_THRESHOLD is exceeded, the table is expanded instead (resize() is called). Why expand rather than treeify in this case?
	A list becomes long either because many keys really collide or because the array (the bucket table) is too short, that is, the capacity is too small. For example, if the array length is 1, all elements are crowded into index 0. Treeifying at that point only treats the symptom, because the root cause of the long list is that the array is too short.
	Therefore, before treeifying, the array length is checked. If it is less than 64, the array is expanded instead of treeifying the bucket.
	*/
	static final int MIN_TREEIFY_CAPACITY = 64;
	/*
	The bucket array of the hash table. It is not allocated in the constructor, so the various operations always have to check whether table is null.
	*/
	transient HashMap.Node<K, V>[] table;
	/*
	Cache for entrySet(). entrySet() first checks whether this field is null. If it is not null, the cached set is returned; otherwise a new EntrySet is created and cached here.
	*/
	transient Set<Entry<K, V>> entrySet;
	//Number of entries in HashMap
	transient int size;
	/*
	Counts the structural modifications of the map and is used to implement fail-fast iterators. A ConcurrentModificationException is thrown when a change of this counter is detected.
	*/
	transient int modCount;
	//Equal to capacity * loadFactor once the table is allocated (before that it holds the initial capacity, or 0 for the default). When size exceeds the threshold, the table is expanded
	int threshold;
	//Loading factor
	final float loadFactor;

	......
}
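
To make the relationship between size, threshold and capacity concrete, here is a simplified model of a map created with the parameterless constructor. It is a sketch under stated assumptions, not the JDK code: it assumes the default capacity 16 and load factor 0.75, and it ignores the extra resizes that treeification can trigger for tables smaller than 64. ThresholdModel and capacityAfter are illustrative names.

```java
final class ThresholdModel {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Modelled table capacity after inserting n distinct keys one by one.
    static int capacityAfter(int n) {
        if (n == 0) {
            return 0; // the table is allocated lazily on the first put
        }
        int cap = 16;
        float loadFactor = 0.75f;
        // The table doubles whenever size exceeds threshold = capacity * loadFactor.
        while (cap < MAXIMUM_CAPACITY && n > (int) (cap * loadFactor)) {
            cap <<= 1;
        }
        return cap;
    }

    public static void main(String[] args) {
        System.out.println(capacityAfter(12)); // 16 (12 does not exceed the threshold 12)
        System.out.println(capacityAfter(13)); // 32
        System.out.println(capacityAfter(25)); // 64
    }
}
```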

3 HashMap method

3.1 constructor

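  1. This constructor validates the initial capacity and the load factor. It does not store the capacity in a field of its own. Instead, the capacity rounded up to a power of two by tableSizeFor() is kept temporarily in threshold until the table is allocated by the first resize().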
public HashMap(int initialCapacity, float loadFactor) {
	if (initialCapacity < 0) {
		throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
	} else {
		if (initialCapacity > 1073741824) {
			initialCapacity = 1073741824;
		}

		if (loadFactor > 0.0F && !Float.isNaN(loadFactor)) {
			this.loadFactor = loadFactor;
			this.threshold = tableSizeFor(initialCapacity);
		} else {
			throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
		}
	}
}
  2. Takes only the initial capacity and uses the default load factor
public HashMap(int initialCapacity) {
	this(initialCapacity, 0.75F);
}
  3. The parameterless constructor only sets the load factor
public HashMap() {
	this.loadFactor = 0.75F;
}
  4. Constructs a HashMap from a Map: uses the default load factor and calls putMapEntries to copy the entries of the Map into the HashMap
	public HashMap(Map<? extends K, ? extends V> m) {
		this.loadFactor = 0.75F;
		this.putMapEntries(m, false);
	}

	final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {
		//Number of mappings in the source map
	    int s = m.size();
	    if (s > 0) {
	    	//If the table has not been allocated yet, only pre-size the threshold
	        if (this.table == null) {
	        	/**Compute the capacity needed to hold s entries without resizing. Since threshold = capacity * loadFactor, the capacity must be at least s / loadFactor.
				The division usually yields a fraction and the cast to int truncates it, so 1 is added first. For example, with s = 22 and loadFactor = 0.75: 22 / 0.75 + 1 = 30.33, which is truncated to 30
	        	*/
	            float ft = (float)s / this.loadFactor + 1.0F;
	            //Cap the result at the maximum capacity
	            int t = ft < 1.07374182E9F ? (int)ft : 1073741824;
	            //Store the capacity in threshold. tableSizeFor(t) returns the smallest power of two greater than or equal to t; for example, for t = 29 it returns 32
	            if (t > this.threshold) {
	                this.threshold = tableSizeFor(t);
	            }
	        } else if (s > this.threshold) {  //The table is already allocated and the number of incoming entries exceeds the threshold, so expand in advance. resize() performs the expansion
	            this.resize();
	        }
	            this.resize();
	        }

	        Iterator var8 = m.entrySet().iterator();

	        //Traverse and transfer the data in the map to hashmap
	        while(var8.hasNext()) {
	            Entry<? extends K, ? extends V> e = (Entry)var8.next();
	            K key = e.getKey();
	            V value = e.getValue();
	            this.putVal(hash(key), key, value, false, evict);
	        }
	    }

	}

This constructor takes a Map and copies its entries into the new HashMap. The resize method is explained in detail in section 3.7. The entrySet method used above returns a Set<Map.Entry<K, V>>. Its element type is Entry, an interface nested in Map whose instances each hold one key-value pair; in other words, every key-value pair in the Map is one Entry instance. Why traverse with entrySet? Because it is efficient: each Entry yields both the key and the value without an additional lookup. The putVal method then stores each extracted key-value pair in the HashMap.
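
The two steps of putMapEntries, the pre-sizing arithmetic and the copy loop over entrySet(), can be sketched with public API only. This is an illustration, not the JDK implementation: PutMapEntriesSketch, requiredCapacity and copy are made-up names, and copy() uses put() where the real code calls putVal().

```java
import java.util.HashMap;
import java.util.Map;

final class PutMapEntriesSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Capacity needed for s entries before rounding up to a power of two.
    static int requiredCapacity(int s, float loadFactor) {
        float ft = (float) s / loadFactor + 1.0F;
        return ft < (float) MAXIMUM_CAPACITY ? (int) ft : MAXIMUM_CAPACITY;
    }

    // Copies all mappings by walking the entry set once.
    static <K, V> Map<K, V> copy(Map<? extends K, ? extends V> source) {
        Map<K, V> target = new HashMap<>();
        for (Map.Entry<? extends K, ? extends V> e : source.entrySet()) {
            // Key and value come from the same Entry, so no extra get() is needed.
            target.put(e.getKey(), e.getValue());
        }
        return target;
    }

    public static void main(String[] args) {
        System.out.println(requiredCapacity(22, 0.75F)); // 30, later rounded up to 32
        System.out.println(copy(Map.of("a", 1)));        // {a=1}
    }
}
```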

3.2 hash(Object key)

The hash function computes the hash that the table uses. For a null key it returns 0. Otherwise it returns the key's hashCode() XORed with its own high 16 bits shifted down into the low 16 bits.

static final int hash(Object key) {
	int h;
	return key == null ? 0 : (h = key.hashCode()) ^ h >>> 16;
}

3.3 comparableClassFor(Object x)

This method checks whether the class of x has the form "class C implements Comparable<C>" and returns that class if so, otherwise null. If x is a String, the answer is known immediately and String.class is returned directly. For other classes, for example a class we wrote ourselves and use as a key, HashMap does not know whether the class implements Comparable, whether it supplies a type argument to Comparable, or which class that type argument is. The method therefore inspects the generic interfaces of the class through reflection.

static Class<?> comparableClassFor(Object x) {
	if (x instanceof Comparable) {
		Class c;
		if ((c = x.getClass()) == String.class) {
			return c;
		}

		Type[] ts;
		if ((ts = c.getGenericInterfaces()) != null) {
			Type[] var5 = ts;
			int var6 = ts.length;

			for(int var7 = 0; var7 < var6; ++var7) {
				Type t = var5[var7];
				Type[] as;
				ParameterizedType p;
				if (t instanceof ParameterizedType && (p = (ParameterizedType)t).getRawType() == Comparable.class && (as = p.getActualTypeArguments()) != null && as.length == 1 && as[0] == c) {
					return c;
				}
			}
		}
	}
	return null;
}
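
A cleaned-up version of the same logic, with three sample inputs, may make the reflection check easier to follow. This is a sketch for illustration: the key classes Version and Odd are hypothetical, and the method is a rewrite of the decompiled code above rather than the exact JDK source.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

final class ComparableClassDemo {
    // Returns x's class only if it is declared as "class C implements Comparable<C>".
    static Class<?> comparableClassFor(Object x) {
        if (x instanceof Comparable) {
            Class<?> c = x.getClass();
            if (c == String.class) {
                return c; // shortcut for the most common key type
            }
            for (Type t : c.getGenericInterfaces()) {
                if (t instanceof ParameterizedType) {
                    ParameterizedType p = (ParameterizedType) t;
                    Type[] as = p.getActualTypeArguments();
                    if (p.getRawType() == Comparable.class
                            && as != null && as.length == 1 && as[0] == c) {
                        return c;
                    }
                }
            }
        }
        return null;
    }

    // Hypothetical key type that is comparable to itself.
    static final class Version implements Comparable<Version> {
        final int n;
        Version(int n) { this.n = n; }
        @Override public int compareTo(Version o) { return Integer.compare(n, o.n); }
    }

    // Hypothetical key type that is comparable to a different class.
    static final class Odd implements Comparable<String> {
        @Override public int compareTo(String o) { return 0; }
    }

    public static void main(String[] args) {
        System.out.println(comparableClassFor("text") == String.class);          // true
        System.out.println(comparableClassFor(new Version(1)) == Version.class); // true
        System.out.println(comparableClassFor(new Odd()));                       // null
    }
}
```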

3.4 compareComparables(Class<?> kc, Object k, Object x)

Returns 0 if x is null or if the class of x is not kc; otherwise returns k.compareTo(x).

static int compareComparables(Class<?> kc, Object k, Object x) {
	return x != null && x.getClass() == kc ? ((Comparable)k).compareTo(x) : 0;
}

3.5 tableSizeFor(int cap)

This function returns the smallest power of two that is greater than or equal to cap; the result is used as the length of the table. Integer.numberOfLeadingZeros(i) returns the number of zero bits preceding the highest-order one bit in the two's complement binary representation of i, counting from the sign bit; it returns 32 when i is 0.

static final int tableSizeFor(int cap) {
	int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
	return n < 0 ? 1 : (n >= 1073741824 ? 1073741824 : n + 1);
}
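
A few sample values show how the rounding behaves. The method body below is the one listed above with the constant named; the wrapper class TableSizeDemo exists only for this example.

```java
final class TableSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two >= cap, clamped to the range [1, MAXIMUM_CAPACITY].
    static int tableSizeFor(int cap) {
        // For cap - 1 = 0b11100 (28) there are 27 leading zeros, so -1 >>> 27 = 0b11111 (31).
        int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
        return n < 0 ? 1 : (n >= MAXIMUM_CAPACITY ? MAXIMUM_CAPACITY : n + 1);
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(0));   // 1
        System.out.println(tableSizeFor(1));   // 1
        System.out.println(tableSizeFor(16));  // 16
        System.out.println(tableSizeFor(17));  // 32
        System.out.println(tableSizeFor(29));  // 32
    }
}
```

Subtracting 1 before counting the leading zeros is what keeps an exact power of two such as 16 from being rounded up to 32.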

3.6 put(K key, V value)

	public V put(K key, V value) {
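		//Five parameters: the hash, the key, the value, onlyIfAbsent and evict. onlyIfAbsent = false means that an existing value for the key is overwritten; if it were true, the value would be replaced only when the existing value is null. evict is only passed on to afterNodeInsertion(), which does nothing in HashMap, so it can be ignored here.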
        return this.putVal(hash(key), key, value, false, true);
    }

    final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    	//tab is the bucket array, p is the first node of the target bucket, n is the length of the table, and i is the computed array index
        HashMap.Node[] tab;
        int n;
        //The table is allocated lazily: it is not created up front but by resize() on the first put
        if ((tab = this.table) == null || (n = tab.length) == 0) {
            n = (tab = this.resize()).length;
        }

        Object p;
        int i;
        //If the computed bucket is empty, the new key-value pair is placed there directly. In either case p is assigned the first node of the bucket, which is needed when there is a hash collision
        if ((p = tab[i = n - 1 & hash]) == null) {
            tab[i] = this.newNode(hash, key, value, (HashMap.Node)null);
        } else {  //Several situations of hash conflict
        	//e is a temporary node that ends up as the existing node with the same key, if there is one. k holds the key of the node being examined
            Object e;
            Object k;
            //Case 1: the first node has the same hash and an equal key, so e = p, the first node
            if (((HashMap.Node)p).hash == hash && ((k = ((HashMap.Node)p).key) == key || key != null && key.equals(k))) {
                e = p;
            }
            //Case 2: the first node does not match; check whether p is a red-black tree node
            else if (p instanceof HashMap.TreeNode) {
            	/*
            	If the bucket is a red-black tree, insert into the tree. If the key already exists, putTreeVal returns the existing node (not null);
            	if a new node was added, it returns null. This return value decides whether put added a new mapping or has to replace a value
            	*/
                e = ((HashMap.TreeNode)p).putTreeVal(this, tab, hash, key, value);
            }
            //Third, the hash value is not equal to the first node. If it is not a red black tree node, it is a node of the linked list
            else {
            	//Traverse the linked list
                int binCount = 0;
                while(true) {
                	//If the tail is found, it indicates that the added key value is not repeated, and it is added in the tail.
                    if ((e = ((HashMap.Node)p).next) == null) {
                        ((HashMap.Node)p).next = this.newNode(hash, key, value, (HashMap.Node)null);
                        //binCount >= TREEIFY_THRESHOLD - 1 means the list already held at least 8 nodes before this insertion, so try to convert the bucket into a red-black tree
                        if (binCount >= 7) {
                            this.treeifyBin(tab, hash);
                        }
                        break;
                    }

                    //If the linked list contains a node with an equal key, e is that node and the loop ends.
                    if (((HashMap.Node)e).hash == hash && ((k = ((HashMap.Node)e).key) == key || key != null && key.equals(k))) {
                        break;
                    }

                    p = e;
                    ++binCount;
                }
            }
            //If e is not null, the key already exists: overwrite its value (unless onlyIfAbsent is true and the old value is not null) and return the old value.
            if (e != null) {
                V oldValue = ((HashMap.Node)e).value;
                if (!onlyIfAbsent || oldValue == null) {
                    ((HashMap.Node)e).value = value;
                }

                this.afterNodeAccess((HashMap.Node)e);
                return oldValue;
            }
        }

        /*
        Reaching this point means that the key was not present and a new node was inserted (e was null).
        Increment the modification count
        */
        ++this.modCount;
        //Increment the size and check whether it exceeds the threshold; if it does, expand the table
        if (++this.size > this.threshold) {
            this.resize();
        }

        this.afterNodeInsertion(evict);
        //Added successfully
        return null;
    }
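
The effect of the return value and of the onlyIfAbsent flag can be observed through the public API: put() passes onlyIfAbsent = false and putIfAbsent() passes true. The class PutReturnDemo below is only an illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class PutReturnDemo {
    static String run() {
        Map<String, Integer> map = new HashMap<>();
        Integer first = map.put("k", 1);          // no previous mapping: returns null
        Integer second = map.put("k", 2);         // overwrites and returns the old value 1
        Integer third = map.putIfAbsent("k", 3);  // key present with non-null value: keeps 2, returns 2
        return first + "," + second + "," + third + "," + map.get("k");
    }

    public static void main(String[] args) {
        System.out.println(run()); // null,1,2,2
    }
}
```

A null return from put() therefore means either that the key was new or that its previous value was null.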

3.7 resize()

Capacity expansion method (resize)

    final HashMap.Node<K, V>[] resize() {
    	//Keep a reference to the table as it is before resizing, called oldTab
        HashMap.Node<K, V>[] oldTab = this.table;
        //Length of oldTab
        int oldCap = oldTab == null ? 0 : oldTab.length;
        //Threshold before resizing
        int oldThr = this.threshold;
        //Capacity and threshold after resizing
        int newThr = 0;
        int newCap;
        //oldCap > 0 means that the table has already been allocated (HashMap allocates it lazily)
        if (oldCap > 0) {
        	//If the capacity has already reached the maximum
            if (oldCap >= 1073741824) {
            	//Set the threshold to Integer.MAX_VALUE so that no further resize is triggered
                this.threshold = 2147483647;
                return oldTab;
            }
            //Location *. Otherwise the capacity is doubled. The threshold is doubled here as well only if the new capacity is below the maximum and the old capacity is at least 16
            if ((newCap = oldCap << 1) < 1073741824 && oldCap >= 16) {
            	//The threshold is also doubled
                newThr = oldThr << 1;
            }
        }
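        /*
        Here oldCap == 0, so the table has not been allocated yet. If oldThr > 0, a constructor stored the initial capacity in threshold, and that value becomes the new capacity.
        If the map was created with the parameterless constructor, threshold is 0 and the defaults below are used
        */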
        else if (oldThr > 0) {
            newCap = oldThr;
        }
        //First initialization, give default value
        else {
            newCap = 16;
            newThr = 12;  //Critical value equals capacity * 0.75
        }
        //Completes location *: if newThr has not been assigned yet (the old capacity was less than 16, or the capacity was taken from threshold), compute it here
        if (newThr == 0) {
        	//New threshold = new capacity * load factor
            float ft = (float)newCap * this.loadFactor;
            //If the new capacity or the computed threshold reaches the maximum capacity, use Integer.MAX_VALUE instead
            newThr = newCap < 1073741824 && ft < 1.07374182E9F ? (int)ft : 2147483647;
        }

        //The threshold determined by the cases above is actually stored here
        this.threshold = newThr;
        //Allocate the new table
        HashMap.Node<K, V>[] newTab = new HashMap.Node[newCap];
        //Install it as the current table
        this.table = newTab;
        //Move the elements of the old table into the new one
        if (oldTab != null) {
            for(int j = 0; j < oldCap; ++j) {
            	//Temporary variable
                HashMap.Node e;
                //The bucket at index j is not empty
                if ((e = oldTab[j]) != null) {
                	//Clear the old slot so that the old array can be garbage collected
                    oldTab[j] = null;
                    //If the bucket holds a single node
                    if (e.next == null) {
                    	//Place it directly in newTab. e.hash & (newCap - 1) is either j or j + oldCap
                        newTab[e.hash & newCap - 1] = e;
                    }
                    //If the bucket is a red-black tree, that is, it holds many colliding elements
                    else if (e instanceof HashMap.TreeNode) {
                    	//Split the tree into the new table
                        ((HashMap.TreeNode)e).split(this, newTab, j, oldCap);
                    }
                    /*
                    The bucket is a linked list. Split it into a "lo" list of the nodes with (e.hash & oldCap) == 0, which stays at index j, and a "hi" list of the remaining nodes, which moves to index j + oldCap. The relative order of the nodes is preserved
                    */
                    else {
                        HashMap.Node<K, V> loHead = null;
                        HashMap.Node<K, V> loTail = null;
                        HashMap.Node<K, V> hiHead = null;
                        HashMap.Node hiTail = null;

                        HashMap.Node next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null) {
                                    loHead = e;
                                } else {
                                    loTail.next = e;
                                }

                                loTail = e;
                            } else {
                                if (hiTail == null) {
                                    hiHead = e;
                                } else {
                                    hiTail.next = e;
                                }

                                hiTail = e;
                            }

                            e = next;
                        } while(next != null);

                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }

                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        //Return the new table
        return newTab;
    }
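
The lo/hi split of a linked-list bucket can be isolated into a small stand-alone sketch. The Node class here is a stripped-down stand-in that carries only a hash, and SplitDemo, listOf and render are helper names invented for the example; the split loop itself follows the structure of the loop in resize().

```java
final class SplitDemo {
    // Minimal stand-in for HashMap.Node: only the hash and the next pointer.
    static final class Node {
        final int hash;
        Node next;
        Node(int hash, Node next) { this.hash = hash; this.next = next; }
    }

    // Splits one bucket's list into {loHead, hiHead}, preserving relative order.
    static Node[] split(Node head, int oldCap) {
        Node loHead = null, loTail = null, hiHead = null, hiTail = null;
        Node e = head;
        while (e != null) {
            Node next = e.next;
            if ((e.hash & oldCap) == 0) {      // stays at index j
                if (loTail == null) loHead = e; else loTail.next = e;
                loTail = e;
            } else {                           // moves to index j + oldCap
                if (hiTail == null) hiHead = e; else hiTail.next = e;
                hiTail = e;
            }
            e = next;
        }
        if (loTail != null) loTail.next = null;
        if (hiTail != null) hiTail.next = null;
        return new Node[] {loHead, hiHead};
    }

    // Builds a list whose nodes carry the given hashes, in order.
    static Node listOf(int... hashes) {
        Node head = null;
        for (int i = hashes.length - 1; i >= 0; i--) {
            head = new Node(hashes[i], head);
        }
        return head;
    }

    // Renders the hashes of a list as a comma-separated string.
    static String render(Node n) {
        StringBuilder sb = new StringBuilder();
        for (; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(',');
            sb.append(n.hash);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 5, 21, 37 and 53 all share bucket 5 in a table of length 16.
        Node[] parts = split(listOf(5, 21, 37, 53), 16);
        System.out.println(render(parts[0])); // 5,37  (stay in bucket 5)
        System.out.println(render(parts[1])); // 21,53 (move to bucket 21)
    }
}
```

Because only one bit of each hash is tested, no hash has to be recomputed and no key has to be compared during a resize.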

3.8 remove(Object key)

Delete element

    public V remove(Object key) {
    	//Temporary variable
        HashMap.Node e;
        /*
        Call removeNode. Passing null as the value and false as matchValue means that the node for the key is removed regardless of its value.
        If matchValue were true, the node would be removed only if its value equals the given value.
        */
        return (e = this.removeNode(hash(key), key, (Object)null, false, true)) == null ? null : e.value;
    }

    /*
    The parameters are the hash, the key, the value, matchValue and movable. If matchValue is true, the node is removed only if its value equals the given value.
    If movable is false, other nodes are not moved while the node is removed (this only matters for red-black tree bins).
    */
    final HashMap.Node<K, V> removeNode(int hash, Object key, Object value, boolean matchValue, boolean movable) {
    	//tab is the bucket array, p is the first node of the bucket, n is the table length, index is the array index of the bucket
        HashMap.Node[] tab;
        HashMap.Node p;
        int n;
        int index;
        //The table must be non-null and non-empty, and the bucket that the key maps to must not be empty
        if ((tab = this.table) != null && (n = tab.length) > 0 && (p = tab[index = n - 1 & hash]) != null) {
        	//node is the node to remove, e is a temporary variable, k is the key of the node being examined, v is the value of the node
            HashMap.Node<K, V> node = null;
            Object k;
            //If the first node of the bucket is the node to remove, remember it
            if (p.hash == hash && ((k = p.key) == key || key != null && key.equals(k))) {
                node = p;
            }
            //Otherwise search the rest of the bucket, which is either a red-black tree or a linked list
            else {
                HashMap.Node e;
                if ((e = p.next) != null) {
                    if (p instanceof HashMap.TreeNode) {
                    	//Traverse the red black tree, find the node and return
                        node = ((HashMap.TreeNode)p).getTreeNode(hash, key);
                    }
                    //If it is a linked list node, traverse to find the node
                    else {
                        label88: {
                            while(e.hash != hash || (k = e.key) != key && (key == null || !key.equals(k))) {
                            	//p ends up as the predecessor of the node to remove
                                p = e;
                                if ((e = e.next) == null) {
                                    break label88;
                                }
                            }

                            //Node is the node to be deleted
                            node = e;
                        }
                    }
                }
            }

            Object v;
            /*
            After the node has been found, check matchValue. For a normal remove(key), matchValue is false, so !matchValue is true and the value is not compared
            */
            if (node != null && (!matchValue || (v = ((HashMap.Node)node).value) == value || value != null && value.equals(v))) {
                //If the deleted node is a red black tree node, it is deleted from the red black tree
                if (node instanceof HashMap.TreeNode) {
                    ((HashMap.TreeNode)node).removeTreeNode(this, tab, movable);
                }
                //If it is a linked list node and the deleted node is an array subscript node, that is, the head node, let the next node be the head directly.
                else if (node == p) {
                    tab[index] = ((HashMap.Node)node).next;
                }
                //For the linked list structure, the deleted node is in the linked list, and the deleted next node should be set as the next node of the previous node.
                else {
                    p.next = ((HashMap.Node)node).next;
                }

                //Modify counter
                ++this.modCount;
                //Length minus 1
                --this.size;
                this.afterNodeRemoval((HashMap.Node)node);
                //Returns the deleted node
                return (HashMap.Node)node;
            }
        }

        return null;
    }

The clear() method removes all mappings by setting every slot of the array to null and resetting size to 0.
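
The matchValue flag is what separates remove(key) from the two-argument remove(key, value), which passes matchValue = true. A short sketch with the public API (RemoveDemo is an illustrative name):

```java
import java.util.HashMap;
import java.util.Map;

final class RemoveDemo {
    static String run() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        Integer removed = map.remove("a");         // returns the removed value 1
        Integer missing = map.remove("zzz");       // no such key: returns null
        boolean wrongValue = map.remove("b", 99);  // value does not match: nothing removed
        boolean rightValue = map.remove("b", 2);   // value matches: removed
        map.put("c", 3);
        map.clear();                               // drops every remaining mapping
        return removed + "," + missing + "," + wrongValue + "," + rightValue + "," + map.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(run()); // 1,null,false,true,true
    }
}
```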

3.9 get()

Next, let's look at get(), which is comparatively simple.

    public V get(Object key) {
        HashMap.Node e;
        //It is also done by calling the getNode method
        return (e = this.getNode(hash(key), key)) == null ? null : e.value;
    }

    final HashMap.Node<K, V> getNode(int hash, Object key) {
    	//tab is the bucket array, first is the head node of the bucket, n is the table length, k is the key of the node being examined
        HashMap.Node[] tab;
        HashMap.Node first;
        int n;
        //If the hash array is not null and the length is greater than 0, get the chain header where the key value is located and assign it to first
        if ((tab = this.table) != null && (n = tab.length) > 0 && (first = tab[n - 1 & hash]) != null) {
            Object k;
            //If it is a header node, the header node is returned directly.
            if (first.hash == hash && ((k = first.key) == key || key != null && key.equals(k))) {
                return first;
            }

            HashMap.Node e;
            //The head node does not match; search the rest of the bucket
            if ((e = first.next) != null) {
            	//Judge whether it is a red black tree structure
                if (first instanceof HashMap.TreeNode) {
                	//Search the red-black tree and return the result
                    return ((HashMap.TreeNode)first).getTreeNode(hash, key);
                }

                //It belongs to the linked list node. Traverse the linked list, find the node and return
                do {
                    if (e.hash == hash && ((k = e.key) == key || key != null && key.equals(k))) {
                        return e;
                    }
                } while((e = e.next) != null);
            }
        }

        return null;
    }
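
Since get() returns null both when getNode() finds nothing and when the found node stores a null value, the two cases cannot be told apart by get() alone. The sketch below shows how containsKey() and getOrDefault() distinguish them; GetDemo is an illustrative name.

```java
import java.util.HashMap;
import java.util.Map;

final class GetDemo {
    static String run() {
        Map<String, String> map = new HashMap<>();
        map.put("present", null); // a mapping whose value is null
        return map.get("present") + ","                 // null: node found, value is null
                + map.get("absent") + ","               // null: no node found
                + map.containsKey("present") + ","      // true
                + map.containsKey("absent") + ","       // false
                + map.getOrDefault("absent", "fallback");
    }

    public static void main(String[] args) {
        System.out.println(run()); // null,null,true,false,fallback
    }
}
```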

This concludes the analysis of the HashMap source code for now. My knowledge is limited, so if you find errors in the content, please point them out.
