Detailed interpretation of the JDK 1.7 HashMap source code

Posted by Redneckheaven on Tue, 18 Jan 2022 19:01:33 +0100

(The source code analysis in this article lives in the comments of the code blocks, so please read them patiently.)

Before we start, let's briefly review HashMap. The backbone of HashMap is an Entry array. Entry is the basic unit of HashMap, and each Entry holds one key-value pair. (In fact, a Map is simply a collection that stores the mapping between two objects.)

Consider the basic idea. Suppose I have an array and I want to store a key-value pair in it, such as ("xmc", "handsome"). A hash code is calculated from the key, "xmc", and the pair is placed in the array at a position derived from that hash code. This way, even if I later store hundreds of entries, looking up "xmc" only requires calculating its hash code again to locate the array index quickly and get back "handsome". However, hash codes may conflict. For example, the hash code of "aaa" might map to the same position as that of "xmc". In this case, the conflict is resolved by adding another node at that array position.
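A small sketch may make the collision scenario above concrete. It relies on the fact that the strings "Aa" and "BB" share the same hash code; the `slotFor` helper and the array length of 16 are illustrative assumptions and are not part of HashMap.

```java
import java.util.HashMap;
import java.util.Map;

class HashCollisionDemo {
    // Hypothetical helper: maps a key's hash code to a slot in an array of the given length.
    static int slotFor(Object key, int arrayLength) {
        return Math.floorMod(key.hashCode(), arrayLength);
    }

    public static void main(String[] args) {
        // "Aa" and "BB" are different strings with the same hashCode (2112).
        System.out.println("Aa".hashCode() == "BB".hashCode());     // true
        System.out.println(slotFor("Aa", 16) == slotFor("BB", 16)); // true

        Map<String, String> map = new HashMap<>();
        map.put("Aa", "first");
        map.put("BB", "second");
        // Both entries survive the collision because equals() tells the keys apart.
        System.out.println(map.get("Aa") + " " + map.get("BB"));    // first second
    }
}
```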

In short, HashMap is composed of an array plus linked lists. The array is the main body of HashMap, and the linked lists exist mainly to resolve hash conflicts. If the located array slot holds a single entry (the next of that entry points to null), searching and adding are fast and need only one addressing step. If the located slot holds a linked list, adding costs O(n) in the length of that list: the list is traversed first, the value is overwritten if the key already exists, and a new node is added otherwise. Searching also has to traverse the list, comparing keys one by one with the equals method of the key object. Therefore, for the sake of performance, the fewer and shorter the linked lists in a HashMap, the better.
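The array-plus-linked-list structure described above can be sketched in a few lines. `TinyChainedMap` is a hypothetical teaching class, not the JDK implementation: it assumes non-null String keys, a fixed array of 16 buckets, and no resizing.

```java
class TinyChainedMap {
    // One node of a bucket's singly linked list (plays the role of Entry).
    static final class Node {
        final String key;
        String value;
        Node next;

        Node(String key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node[] buckets = new Node[16];

    private int indexOf(String key) {
        return Math.floorMod(key.hashCode(), buckets.length);
    }

    // Returns the previous value for the key, or null if the key was absent.
    String put(String key, String value) {
        int i = indexOf(key);
        for (Node n = buckets[i]; n != null; n = n.next) { // traverse the chain
            if (n.key.equals(key)) {
                String old = n.value;
                n.value = value; // key exists: overwrite
                return old;
            }
        }
        buckets[i] = new Node(key, value, buckets[i]); // key absent: add a node
        return null;
    }

    String get(String key) {
        for (Node n = buckets[indexOf(key)]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TinyChainedMap map = new TinyChainedMap();
        map.put("Aa", "first");
        map.put("BB", "second"); // same hash code as "Aa", so it joins the same chain
        System.out.println(map.get("Aa"));            // first
        System.out.println(map.get("BB"));            // second
        System.out.println(map.put("Aa", "updated")); // first
    }
}
```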

The Entry class nested inside HashMap is as follows

    static class Entry<K,V> implements Map.Entry<K,V> {
        final K key;
        V value;
        Entry<K,V> next; // Reference to the next Entry in the same bucket (singly linked list)
        int hash; // Hash computed from the key's hashCode, cached in the Entry to avoid recomputing it

        /**
         * Creates new entry.
         */
        Entry(int h, K k, V v, Entry<K,V> n) {
            value = v;
            next = n;
            key = k;
            hash = h;
        }

        // ... remaining methods omitted
    }

Some important parameters must be read before looking at the code.

/** Number of key-value pairs actually stored */
transient int size;

/** Threshold. While table == {} (not yet allocated), this value holds the initial capacity (16 by default).
Once memory has been allocated for the table, threshold is generally capacity * loadFactor.
HashMap consults threshold when deciding whether to expand, which is discussed in detail later. */
int threshold;

/** The load factor describes how full the table is allowed to get. The default is 0.75.
The load factor exists to reduce hash conflicts. If the table had 16 buckets and only expanded once it held 16 elements, some buckets would probably already hold more than one element.
With the default load factor of 0.75, a HashMap of capacity 16 has a threshold of 12, so it becomes eligible to expand to 32 when the 13th element is added.
*/
final float loadFactor;

/** Number of times this HashMap has been structurally modified. Because HashMap is not thread safe, if the structure of the map changes
while it is being iterated (for example a put of a new key or a remove, from another thread or from the iterating code itself),
the iterator throws ConcurrentModificationException. */
transient int modCount;
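The effect of modCount can be observed from outside the class. The sketch below, using only the public HashMap API, changes the map structurally while iterating over it; the class and method names are illustrative. Note that a single thread is enough to trigger the exception.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

class FailFastDemo {
    // Returns true if modifying the map during iteration was detected.
    static boolean modifyWhileIterating() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                // Adding a new key is a structural change, so modCount is incremented
                // and no longer matches the count the iterator remembered.
                map.put(key + "-copy", 0);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(modifyWhileIterating()); // true
    }
}
```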

    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR); // Constructor with an explicit initial capacity
    }


    public HashMap() {
        // No-argument constructor: delegates to the two-argument constructor with
        // DEFAULT_INITIAL_CAPACITY = 16 and DEFAULT_LOAD_FACTOR = 0.75
        this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);
    }
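To make the relationship between capacity, load factor and threshold concrete, here is a small arithmetic sketch. The formula is the one that appears in the inflateTable and resize methods of this article; the class name and the `thresholdFor` helper are assumptions made for illustration.

```java
class ThresholdDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // threshold = capacity * loadFactor, capped at MAXIMUM_CAPACITY + 1
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    public static void main(String[] args) {
        System.out.println(thresholdFor(16, 0.75f)); // 12
        System.out.println(thresholdFor(32, 0.75f)); // 24
        System.out.println(thresholdFor(16, 1.0f));  // 16
    }
}
```

With the defaults, the threshold is 12, which is why adding the 13th element is the first point at which a capacity-16 table can expand.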

Both constructors above delegate to the following two-argument constructor of HashMap

public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0) // A negative capacity is rejected
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY) // Capacities above MAXIMUM_CAPACITY (1 << 30) are clamped to it
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            // A load factor that is not positive, or that is NaN, is rejected
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);

        this.loadFactor = loadFactor;
        threshold = initialCapacity; // Until the table is allocated, threshold holds the initial capacity
        init(); // Empty method in HashMap (a hook for subclasses)
    }
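The argument checks in this constructor are easy to confirm through the public API. The following sketch uses a hypothetical `rejects` helper to report whether a given combination of arguments is refused with IllegalArgumentException.

```java
import java.util.HashMap;

class ConstructorChecksDemo {
    // Returns true if HashMap's constructor refuses the given arguments.
    static boolean rejects(int initialCapacity, float loadFactor) {
        try {
            new HashMap<String, String>(initialCapacity, loadFactor);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects(-1, 0.75f));     // true: negative capacity
        System.out.println(rejects(16, 0f));        // true: load factor must be positive
        System.out.println(rejects(16, Float.NaN)); // true: NaN load factor
        System.out.println(rejects(16, 0.75f));     // false: valid arguments
    }
}
```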

Next, let's look at the source code of the most commonly used method, put. It is worth noting that HashMap allocates its table array only when data is put in for the first time.

 public V put(K key, V value) {
        if (table == EMPTY_TABLE) { // The table has not been allocated yet; inflateTable is shown below
            inflateTable(threshold);
        }
        if (key == null)
            return putForNullKey(value); // HashMap allows a null key; it is stored in bucket 0 (table[0])
        int hash = hash(key); // Compute the hash of the key (the details are not important here)
        int i = indexFor(hash, table.length); // Locate the bucket index; this is the key step, see the source below
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            // An entry with the same hash and an equal key already exists in this bucket: overwrite its value
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue; // Note: put returns the previous value when the key already existed
            }
        }

        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }
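The return value of put and the handling of a null key can be checked with the public API. This is a minimal usage sketch; the class name, the `run` helper and the sample values are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PutReturnDemo {
    // Collects what each call returns so the behavior is easy to inspect.
    static List<String> run() {
        Map<String, String> map = new HashMap<>();
        List<String> results = new ArrayList<>();
        results.add(String.valueOf(map.put("xmc", "handsome"))); // new key: returns null
        results.add(String.valueOf(map.put("xmc", "smart")));    // existing key: returns the old value
        map.put(null, "no key");                                 // a null key is allowed
        results.add(map.get(null));
        results.add(String.valueOf(map.size()));                 // overwriting did not add an entry
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [null, handsome, no key, 2]
    }
}
```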


First, let's look at the case where the table is empty. Focus on the following two methods. They also explain why the capacity of a HashMap in Java must be a power of 2: the length of the Entry array is combined with the hash code of the key to compute the bucket index.

private void inflateTable(int toSize) {
        // Find a power of 2 >= toSize
        int capacity = roundUpToPowerOf2(toSize);
            // roundUpToPowerOf2 rounds a number up to the nearest power of 2. For example, 7 becomes 8 and 10 becomes 16

        threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
        table = new Entry[capacity]; // Allocate the Entry array with the capacity computed above
        initHashSeedAsNeeded(capacity);
    }

 static int indexFor(int h, int length) {
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
        return h & (length-1); // The bucket index is h AND (length - 1). This can reach every slot only when length is a power of 2, which inflateTable above guarantees
    }
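The following sketch illustrates why the power-of-2 requirement matters for indexFor. `roundUpToPowerOf2` here is one way to get the rounding behavior described in the comments above and is not quoted from the article; `distinctSlots` is a hypothetical helper that counts how many different indices a range of hash values can reach.

```java
class IndexForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Rounds a non-negative number up to a power of 2 (sketch).
    static int roundUpToPowerOf2(int number) {
        if (number >= MAXIMUM_CAPACITY) {
            return MAXIMUM_CAPACITY;
        }
        return (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Counts how many different slots the hashes 0..hashesToTry-1 can reach.
    static int distinctSlots(int length, int hashesToTry) {
        boolean[] used = new boolean[length];
        int count = 0;
        for (int h = 0; h < hashesToTry; h++) {
            int i = indexFor(h, length);
            if (!used[i]) {
                used[i] = true;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOf2(7));   // 8
        System.out.println(roundUpToPowerOf2(10));  // 16
        System.out.println(distinctSlots(16, 100)); // 16: every slot is reachable
        System.out.println(distinctSlots(10, 100)); // 4: only slots 0, 1, 8 and 9 are reachable
    }
}
```

With a length of 16, length - 1 is 1111 in binary, so the AND keeps the low four bits of the hash and behaves like a modulo. With a length of 10, length - 1 is 1001, so two of the bits are always masked away and most slots can never be used.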

Next, look at the addEntry(hash, key, value, i) method

    void addEntry(int hash, K key, V value, int bucketIndex) {
        if ((size >= threshold) && (null != table[bucketIndex])) { // If size has reached the threshold and the target bucket is already occupied, double the array length (it stays a power of 2)
            resize(2 * table.length);
            hash = (null != key) ? hash(key) : 0;
            bucketIndex = indexFor(hash, table.length);
        }

        createEntry(hash, key, value, bucketIndex); // The new node is inserted at the head of the bucket's list
    }
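The head insertion mentioned in the last comment can be sketched on its own. The `Node` class and the helper names below are hypothetical; the point is only that each new node becomes the head of the chain and points to the previous head, so the most recently added key comes first.

```java
class HeadInsertDemo {
    static final class Node {
        final String key;
        final Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // The new node becomes the head and links to the old head.
    static Node insertAtHead(Node head, String key) {
        return new Node(key, head);
    }

    static String describe(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node head = null;
        for (String key : new String[] {"a", "b", "c"}) {
            head = insertAtHead(head, key);
        }
        System.out.println(describe(head)); // c -> b -> a
    }
}
```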

When a hash conflict occurs at the target bucket and size has reached the threshold, the array needs to be expanded. Expansion creates a new array twice as long as the previous one and then transfers every element of the current Entry array into it, so it is a relatively resource-consuming operation. Note that both conditions must hold. For example, suppose the Entry array has length 4 and the threshold is 2, and entries a and b occupy the first two slots. The size has reached the threshold, but addEntry expands the array (to length 8) only if the new key lands in one of the two occupied slots. If it lands in an empty slot, the new node is simply added without expansion.
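The two-part condition from addEntry can be isolated as a tiny predicate. This is only a restatement of the `if` test in the code above; the class and method names are illustrative.

```java
class ResizeConditionDemo {
    // Mirrors (size >= threshold) && (null != table[bucketIndex]) from addEntry.
    static boolean needsResize(int size, int threshold, boolean bucketOccupied) {
        return size >= threshold && bucketOccupied;
    }

    public static void main(String[] args) {
        // Two entries stored, threshold 2.
        System.out.println(needsResize(2, 2, true));  // true: threshold reached and the bucket is occupied
        System.out.println(needsResize(2, 2, false)); // false: the new key lands in an empty bucket
        System.out.println(needsResize(1, 2, true));  // false: threshold not reached yet
    }
}
```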

The resize method and the transfer method it calls are as follows

void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        } // If the capacity is already at the maximum, no further expansion is possible, so return directly

        Entry[] newTable = new Entry[newCapacity];
        transfer(newTable, initHashSeedAsNeeded(newCapacity)); // The key step; see the source below
        table = newTable;
        threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

// transfer moves every entry from the old table into the new one
void transfer(Entry[] newTable, boolean rehash) {
        int newCapacity = newTable.length;
        for (Entry<K,V> e : table) { // Traverse the original Entry array and move its contents into the new array
            while(null != e) {
                Entry<K,V> next = e.next;
                if (rehash) { // Recalculate the hash before placing the entry in the new array
                    e.hash = null == e.key ? 0 : hash(e.key);
                }
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i]; // Head insertion again. This is the widely criticized part of 1.7: under concurrent resizing it can easily form a cycle in the list
                newTable[i] = e;
                e = next;
            }
        }
    }
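The behavior of transfer can be traced on a small example. The sketch below copies the loop from the method above onto a hypothetical `Node` type that stores only a hash, skips the rehash step, and moves one bucket from a table of length 4 to a table of length 8.

```java
class TransferDemo {
    static final class Node {
        final int hash;
        Node next;

        Node(int hash, Node next) {
            this.hash = hash;
            this.next = next;
        }
    }

    // Same loop structure as transfer, without the optional rehash.
    static Node[] transfer(Node[] oldTable, int newCapacity) {
        Node[] newTable = new Node[newCapacity];
        for (Node e : oldTable) {
            while (e != null) {
                Node next = e.next;
                int i = e.hash & (newCapacity - 1);
                e.next = newTable[i]; // head insertion into the new bucket
                newTable[i] = e;
                e = next;
            }
        }
        return newTable;
    }

    static String chain(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.hash);
        }
        return sb.toString();
    }

    // Old table of length 4 whose bucket 1 holds the chain 9 -> 5 -> 1.
    static Node[] sampleTable() {
        Node[] table = new Node[4];
        table[1] = new Node(9, new Node(5, new Node(1, null)));
        return table;
    }

    public static void main(String[] args) {
        Node[] newTable = transfer(sampleTable(), 8);
        System.out.println(chain(newTable[1])); // 1 -> 9
        System.out.println(chain(newTable[5])); // 5
    }
}
```

Two things stand out. Entries from one old bucket end up either at the same index or at the index plus the old capacity (1 or 5 here), and entries that stay together have their relative order reversed (9 came before 1, now 1 comes before 9). That reversal is what allows two threads resizing at the same time to link nodes into a cycle.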

Finally, let's take a look at the get method of HashMap. It is relatively simple, so we will not analyze it in detail.

  public V get(Object key) {
        if (key == null)
            return getForNullKey();
        Entry<K,V> entry = getEntry(key);

        return null == entry ? null : entry.getValue();
    }




final Entry<K,V> getEntry(Object key) {
        if (size == 0) {
            return null;
        }

        int hash = (key == null) ? 0 : hash(key);
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k))))
                return e;
        }
        return null;
    }
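One consequence of the get code above is that a null result is ambiguous: it is returned both when the key is absent and when the key is mapped to null. A short usage sketch with the public API shows the difference; the class name and sample data are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

class GetDemo {
    static Map<String, String> sample() {
        Map<String, String> map = new HashMap<>();
        map.put("xmc", "handsome");
        map.put("empty", null);    // a key may be mapped to null
        map.put(null, "null key"); // the null key is looked up via getForNullKey
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = sample();
        System.out.println(map.get("xmc"));             // handsome
        System.out.println(map.get("missing"));         // null (key absent)
        System.out.println(map.get("empty"));           // null (key present, value null)
        System.out.println(map.containsKey("empty"));   // true
        System.out.println(map.containsKey("missing")); // false
        System.out.println(map.get(null));              // null key
    }
}
```

When the distinction matters, containsKey is the way to tell the two cases apart.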
