JDK source util package analysis -- HashMap source code

Posted by rubberjohn on Thu, 03 Mar 2022 04:34:01 +0100

HashMap source code analysis

Structure diagram of HashMap

Introduction to HashMap principle

Array: stores data in contiguous memory. Looking up an element by index takes O(1). Searching for a given value requires traversing the array and comparing each element against it, which takes O(n). For a sorted array, binary search, interpolation search, Fibonacci search and similar methods reduce the search cost to O(log n). General insertion and deletion involve shifting array elements, so their average complexity is also O(n).
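
To make the three array costs concrete, here is a minimal sketch. It relies on the JDK's `Arrays.binarySearch`; the class name `ArraySearchDemo` and the helper `linearSearch` are made up for this illustration.

```java
import java.util.Arrays;

class ArraySearchDemo {
    // O(n): compare the target with each element in turn.
    static int linearSearch(int[] a, int target) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == target) return i;
        }
        return -1; // not found
    }

    public static void main(String[] args) {
        int[] sorted = {2, 3, 5, 7, 11, 13};
        System.out.println(sorted[3]);                       // O(1) access by index: 7
        System.out.println(linearSearch(sorted, 11));        // O(n) scan: 4
        System.out.println(Arrays.binarySearch(sorted, 11)); // O(log n), needs sorted input: 4
    }
}
```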

Linear linked list: once the position has been found, inserting or deleting a node only requires updating the references between nodes, which takes O(1). Searching requires traversing the list and comparing node by node, which takes O(n).

Binary tree: on a reasonably balanced ordered binary tree, insertion, search and deletion all have an average complexity of O(log n).

Hash table: compared with the structures above, a hash table is very fast at insertion, deletion and lookup. Ignoring hash collisions, each operation needs only a single positioning step, so the time complexity is O(1). Next, let's see how the hash table achieves constant-time O(1) operations. The hash table is built from an array and linked lists.

HashMap is implemented with an array + linked lists + red-black trees (the red-black tree was added in JDK 1.8).

Data structure of HashMap

Linked list

Node is a static nested class of HashMap that implements the Map.Entry interface. It is essentially a mapping (a key-value pair).

 static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;    // Cached hash, used to locate the index in the array
        final K key;       // The key
        V value;           // The value mapped to the key
        Node<K,V> next;    // Reference to the next node in the same bucket

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        public final K getKey()        { return key; }
        public final V getValue()      { return value; }
        public final String toString() { return key + "=" + value; }
        // The entry's hash code is the key's hash code XOR the value's hash code
        public final int hashCode() {
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }

        public final V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }

        public final boolean equals(Object o) {
            if (o == this) // o is this very object
                return true;
            if (o instanceof Map.Entry) { // o is a Map.Entry (any implementation)
                Map.Entry<?,?> e = (Map.Entry<?,?>)o;
                if (Objects.equals(key, e.getKey()) &&
                    Objects.equals(value, e.getValue())) // Both key and value must be equal
                    return true;
            }
            return false;
        }
    }
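
The `hashCode` and `equals` methods above follow the general `Map.Entry` contract, so a `HashMap` entry compares equal to any other `Map.Entry` implementation holding the same key and value. The sketch below checks this against the JDK's `AbstractMap.SimpleEntry`; the class name `EntryContractDemo` and the helper `entryHash` are made up for this illustration.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class EntryContractDemo {
    // Same formula as Node.hashCode(): hash of the key XOR hash of the value.
    static int entryHash(Object key, Object value) {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("one", 1);
        // The only entry in the map; its concrete type is HashMap's internal node class.
        Map.Entry<String, Integer> fromMap = map.entrySet().iterator().next();
        Map.Entry<String, Integer> simple = new AbstractMap.SimpleEntry<>("one", 1);

        System.out.println(fromMap.equals(simple));                     // true
        System.out.println(fromMap.hashCode() == entryHash("one", 1));  // true
        System.out.println(entryHash(null, null));                      // 0
    }
}
```

`Objects.hashCode` returns 0 for `null`, which is why an entry with a null key and a null value hashes to 0.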

Red black tree

Only part of the red-black tree code is shown here. Compared with a linked-list node, a tree node has four more references: the parent node, the left child, the right child, and prev, the previous node in the bucket's linked order. It also carries a boolean flag for the node color.

// Red-black tree node
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    TreeNode<K,V> parent;  // Parent node
    TreeNode<K,V> left;    // Left child
    TreeNode<K,V> right;   // Right child
    TreeNode<K,V> prev;    // Needed to unlink next upon deletion
    boolean red;           // Node color: true if red
    TreeNode(int hash, K key, V val, Node<K,V> next) {
        super(hash, key, val, next);
    }

    // Returns the root of the tree that contains this node
    final TreeNode<K,V> root() {
        for (TreeNode<K,V> r = this, p;;) {
            if ((p = r.parent) == null)
                return r;
            r = p;
        }
    }
}

Bit bucket

A very important field in the HashMap class is Node[] table, the hash bucket array. As the type shows, it is an array of Node objects.

transient Node<K,V>[] table; // The array of buckets (bins)

The underlying implementation of HashMap

At its core there is an array in which each slot holds a linked list. When a key-value pair is added, the hash of the key is computed first to decide which slot of the array the entry goes into. Other elements whose hashes map to the same slot may already be stored there; in that case the new element is appended after them. These elements share one array slot but form a linked list, so each array slot stores a list. When a list grows too long, it is converted into a red-black tree, which greatly improves lookup efficiency.
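
The array-of-lists idea can be sketched in a few dozen lines. This is a deliberately simplified illustration, not HashMap's actual code: it never resizes, never converts a list into a tree, picks the slot with `Math.floorMod` instead of a bit mask, and prepends new entries instead of appending them. All names (`TinyChainedMap`, `Entry`, `indexFor`) are made up for this example.

```java
import java.util.Objects;

class TinyChainedMap<K, V> {
    // One entry in a bucket; `next` links entries that landed in the same slot.
    private static final class Entry<K, V> {
        final K key;
        V value;
        Entry<K, V> next;
        Entry(K key, V value, Entry<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Entry<K, V>[] buckets;

    @SuppressWarnings("unchecked")
    TinyChainedMap(int capacity) {
        buckets = (Entry<K, V>[]) new Entry[capacity];
    }

    // Map a key to a slot; floorMod keeps the index non-negative.
    private int indexFor(Object key) {
        return Math.floorMod(Objects.hashCode(key), buckets.length);
    }

    // Returns the previous value for the key, or null if there was none.
    V put(K key, V value) {
        int i = indexFor(key);
        for (Entry<K, V> e = buckets[i]; e != null; e = e.next) {
            if (Objects.equals(e.key, key)) { // key already present: overwrite
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        buckets[i] = new Entry<>(key, value, buckets[i]); // chain onto the slot
        return null;
    }

    V get(Object key) {
        for (Entry<K, V> e = buckets[indexFor(key)]; e != null; e = e.next) {
            if (Objects.equals(e.key, key)) return e.value;
        }
        return null;
    }

    public static void main(String[] args) {
        TinyChainedMap<Integer, String> m = new TinyChainedMap<>(4);
        m.put(1, "a");
        m.put(5, "b"); // 5 mod 4 == 1, so this shares a slot with key 1
        System.out.println(m.get(1) + " " + m.get(5) + " " + m.get(9)); // a b null
    }
}
```

Keys 1 and 5 collide in a table of length 4, yet both remain retrievable because the slot holds a chain rather than a single entry.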

HashMap static utility

//Get the hash value of the key
static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
// If x's class is of the form "class C implements Comparable<C>", returns that class; otherwise returns null
 static Class<?> comparableClassFor(Object x) {
        if (x instanceof Comparable) {  // Does x implement Comparable?
            Class<?> c; Type[] ts, as; Type t; ParameterizedType p;
            if ((c = x.getClass()) == String.class)
                return c;   // String is handled directly: return String.class
            if ((ts = c.getGenericInterfaces()) != null) {  // Interfaces directly implemented by c
                for (int i = 0; i < ts.length; ++i) {   // Check each directly implemented interface
                    if (((t = ts[i]) instanceof ParameterizedType) &&   // The interface is parameterized
                        ((p = (ParameterizedType)t).getRawType() == // Raw type of the interface, without type arguments
                         Comparable.class) &&   // The raw type is Comparable
                        (as = p.getActualTypeArguments()) != null &&    // Actual type arguments
                        as.length == 1 && as[0] == c)   // Exactly one type argument, and it is c itself
                        return c;   // c implements Comparable<c>
                }
            }
        }
        return null;
    }
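
Why does `hash()` XOR the high 16 bits into the low 16 bits? A small sketch may help, assuming the bucket index is taken from the low bits of the hash with the mask `(n - 1) & hash` for a power-of-two table length `n`. The class name `HashSpreadDemo` and the method names `spread` and `indexFor` are made up for this illustration.

```java
class HashSpreadDemo {
    // Same expression as HashMap.hash(): fold the high 16 bits into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table whose length n is a power of two.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        // Integer.hashCode() is the int value itself; these two differ only in high bits.
        Integer a = 0x10000, b = 0x20000;

        // Raw hash codes: the low 4 bits are all zero, so both keys land in bucket 0.
        System.out.println(indexFor(a.hashCode(), n) + " " + indexFor(b.hashCode(), n)); // 0 0

        // After spreading, the high bits influence the index and the keys separate.
        System.out.println(indexFor(spread(a), n) + " " + indexFor(spread(b), n));       // 1 2
    }
}
```

With a small table only the lowest bits select the bucket, so keys that differ only in their upper bits would always collide if the raw `hashCode()` were used directly.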

Parameters of HashMap

    // Serialization version UID
    private static final long serialVersionUID = 362498820763181265L;
    // The default initial capacity is 16
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;
    // Maximum capacity
    static final int MAXIMUM_CAPACITY = 1 << 30;
    // Default load factor
    static final float DEFAULT_LOAD_FACTOR = 0.75f;
    // When the number of nodes in a bucket's linked list exceeds 8: if the table length is less than MIN_TREEIFY_CAPACITY (64), the table is resized; otherwise the list is converted into a red-black tree
    static final int TREEIFY_THRESHOLD = 8;
    // When a resize leaves a tree bucket with this many nodes or fewer, the tree is converted back into a linked list
    static final int UNTREEIFY_THRESHOLD = 6;
    // Minimum table capacity at which a bucket may be converted into a red-black tree
    static final int MIN_TREEIFY_CAPACITY = 64;
    // The array that stores the elements; its length is always a power of 2
    transient Node<K,V>[] table;
    // Cached set view of the entries
    transient Set<Map.Entry<K,V>> entrySet;
    // size is the number of key-value pairs actually stored in the HashMap
    transient int size;
    // Number of structural modifications made to the map (used by iterators to fail fast)
    transient int modCount;
    // threshold = capacity * load factor. When size exceeds this value, the table is resized to twice its capacity
    int threshold;
    // Load factor
    final float loadFactor;
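
The relationship between capacity, load factor and threshold can be checked with a little arithmetic. The sketch below is an illustration only: `thresholdFor` restates the capacity * load factor rule from the comments above, and `roundUpToPowerOfTwo` is a hypothetical helper showing one way to keep the table length a power of two. It is not the method HashMap itself uses for that rounding.

```java
class CapacityDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Hypothetical helper: smallest power of two >= cap, clamped to [1, MAXIMUM_CAPACITY].
    static int roundUpToPowerOfTwo(int cap) {
        if (cap <= 1) return 1;
        if (cap >= MAXIMUM_CAPACITY) return MAXIMUM_CAPACITY;
        return Integer.highestOneBit(cap - 1) << 1;
    }

    // Resize threshold: capacity * load factor, truncated to an int.
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(thresholdFor(16, 0.75f));  // 12: default capacity and load factor
        System.out.println(thresholdFor(32, 0.75f));  // 24: after one doubling
        System.out.println(roundUpToPowerOfTwo(17));  // 32
        System.out.println(roundUpToPowerOfTwo(64));  // 64
    }
}
```

With the defaults, a map therefore holds up to 12 entries in a 16-slot table, and adding a 13th pushes size past the threshold and triggers a doubling to 32 slots.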

Constructor of HashMap
