A Deep Understanding of ThreadLocal

Posted by twinzen on Thu, 02 Jan 2020 06:25:07 +0100

Preface

Concurrency is an unavoidable topic in Java development. Modern processors are multi-core, and multi-threaded programming is essential for getting the most out of the hardware, so thread safety is required knowledge for every Java engineer.

There are two general ways to address thread safety issues:

Synchronization: use the synchronized keyword or the java.util.concurrent.locks.Lock utility classes to lock critical resources.
Avoid resource contention: keep shared resources in ThreadLocal variables so that each thread works on its own copy and no concurrent access occurs.

This article describes the second way: how ThreadLocal is implemented and why it guarantees thread safety.

ThreadLocal

Here is a common use scenario for ThreadLocal:

import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class ThreadLocalTest {
    // ThreadLocal is generally defined as a static variable
    private static final ThreadLocal<DateFormat> format = new ThreadLocal<DateFormat>() {
        // Supply the initial value; called once per thread on first access
        @Override
        protected DateFormat initialValue() {
            return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        }
    };

    public static void main(String[] args) {
        // Start 20 threads
        for (int i = 0; i < 20; i++) {
            new Thread(() -> {
                try {
                    // Get this thread's own copy of SimpleDateFormat
                    DateFormat localFormat = format.get();
                    // Parse the date; no error occurs here
                    Date date = localFormat.parse("2000-11-11 11:11:11");
                    System.out.println(date);
                } catch (ParseException e) {
                    e.printStackTrace();
                }
            }).start();
        }
    }
}

As you probably know, SimpleDateFormat in Java is not thread safe. However, the code above runs without errors, because each thread parses with its own SimpleDateFormat instance. ThreadLocal does guarantee thread safety here.
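
Since Java 8, the same per-thread formatter can also be declared with ThreadLocal.withInitial. The following is only a sketch to make the "one instance per thread" behaviour visible; the class PerThreadFormat and the helper formatSeenByNewThread are names invented for this illustration and do not appear in the original example.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;

// Hypothetical demo class: one SimpleDateFormat per thread via ThreadLocal.withInitial (Java 8+).
final class PerThreadFormat {
    static final ThreadLocal<DateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

    // Calls FORMAT.get() on a fresh thread and returns the instance that thread saw.
    static DateFormat formatSeenByNewThread() {
        DateFormat[] seen = new DateFormat[1];
        Thread worker = new Thread(() -> seen[0] = FORMAT.get());
        worker.start();
        try {
            worker.join(); // join() makes the write to seen[0] visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

Calling FORMAT.get() twice on one thread returns the same object, while formatSeenByNewThread() returns a different one, which is why the formatters never race with each other.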

Source Parsing

Overview of ThreadLocal

In the example above, we used the initialValue and get methods of ThreadLocal. Let's first look at the implementation of the get method:

// The authors of this class are two masters: the former wrote Effective Java, and the latter is the concurrency expert behind the java.util.concurrent package.
/*
 * @author  Josh Bloch and Doug Lea
 * @since   1.2
 */
public class ThreadLocal<T> {
    public T get() {
        // Get the current thread
        Thread t = Thread.currentThread();
        // Get the Map that belongs to the current thread; for now, think of it as a HashMap of key-value pairs
        // Note that this Map is tied to the thread
        ThreadLocalMap map = getMap(t);
        if (map != null) {
            // Look up the Entry with this as the key: the key in the Map is the ThreadLocal variable itself,
            // and the value is the object stored in the ThreadLocal
            ThreadLocalMap.Entry e = map.getEntry(this);
            if (e != null) {
                @SuppressWarnings("unchecked")
                T result = (T)e.value;
                return result;
            }
        }
        // If the Map is not initialized (map == null), or the current ThreadLocal variable is not initialized (e == null)
        // Then call this method to complete the initialization
        return setInitialValue();
    }

    // It turns out that this ThreadLocalMap is simply a member field of Thread!
    ThreadLocalMap getMap(Thread t) {
        return t.threadLocals;
    }
}

public class Thread implements Runnable {
    // The Thread class declares a ThreadLocalMap instance field
    // that stores all ThreadLocal values of this thread; its initial value is null
    ThreadLocal.ThreadLocalMap threadLocals = null;
}

From the get method, we can draw the following conclusions:

1. ThreadLocal values are stored in a Map, which is a member field of the Thread class. This is the key to ThreadLocal's thread safety: each thread has its own Map and operates on its own copy of each ThreadLocal value, without affecting other threads.
2. ThreadLocalMap holds all ThreadLocal values of the thread: the ThreadLocal variable is the key, and the object stored in it is the value. (In the example at the beginning of this article, the key is the format variable and the value is what initialValue returns, namely new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").)
3. ThreadLocal is lazily initialized. When the ThreadLocalMap or the entry for the current ThreadLocal variable turns out to be uninitialized, the setInitialValue method is called to initialize it.
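
The points about per-thread copies and lazy initialization can be observed from the outside. The sketch below is an illustration only: LazyInitDemo, INIT_CALLS, COUNTER and the two helper methods are hypothetical names. It counts how often initialValue runs and shows that two threads never share a counter.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: observes lazy initialization and per-thread isolation.
final class LazyInitDemo {
    // Counts how many times initialValue() has run, across all threads.
    static final AtomicInteger INIT_CALLS = new AtomicInteger();

    static final ThreadLocal<int[]> COUNTER = new ThreadLocal<int[]>() {
        @Override
        protected int[] initialValue() {
            INIT_CALLS.incrementAndGet();
            return new int[1]; // each thread gets its own one-element counter
        }
    };

    // Increments the calling thread's counter and returns the new value.
    static int increment() {
        return ++COUNTER.get()[0];
    }

    // Calls increment() the given number of times on a new thread and returns that thread's last count.
    static int incrementOnNewThread(int times) {
        int[] result = new int[1];
        Thread worker = new Thread(() -> {
            for (int i = 0; i < times; i++) {
                result[0] = increment();
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }
}
```

initialValue is not called when COUNTER is created, only on the first get() of each thread, and the worker thread's count starts from zero regardless of what the calling thread has done.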

ThreadLocal Other Methods

Next, let's see what the setInitialValue method does:

    private T setInitialValue() {
        // Initialize by calling the initialValue method
        // This method is what we override when defining the ThreadLocal variable
        T value = initialValue();
        Thread t = Thread.currentThread();
        // Gets the ThreadLocalMap of the current thread
        ThreadLocalMap map = getMap(t);
        if (map != null)
            // If the Map is already initialized, initialize the current ThreadLocal variable directly:
            // set yourself (the current ThreadLocal variable) into Map as the key, the saved value as the value
            map.set(this, value);
        else
            // Initialize the Map if it is not already initialized
            createMap(t, value);
        return value;
    }

    // The default initialValue method is protected so that we can override it
    protected T initialValue() {
        return null;
    }

    void createMap(Thread t, T firstValue) {
        // Create a new ThreadLocalMap and assign it to the current Thread
        t.threadLocals = new ThreadLocalMap(this, firstValue);
    }

There are also set and remove methods, which are simple and not explained here.
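
Although the source of set and remove is not shown, their visible behaviour is easy to check. The sketch below is an illustration (SetRemoveDemo and USER are made-up names): a value stored with set is returned by get, and after remove the next get falls back to the initial value again.

```java
// Hypothetical demo class: visible behaviour of set(), get() and remove() on one thread.
final class SetRemoveDemo {
    static final ThreadLocal<String> USER = ThreadLocal.withInitial(() -> "anonymous");

    // Records what get() returns before set(), after set(), and after remove().
    static String demo() {
        StringBuilder log = new StringBuilder();
        log.append(USER.get()).append(',');   // initial value from the supplier
        USER.set("alice");
        log.append(USER.get()).append(',');   // value stored by set()
        USER.remove();
        log.append(USER.get());               // entry was removed, so the supplier runs again
        return log.toString();
    }
}
```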

Is that all there is to ThreadLocal? Is it really that simple? Of course not. Because ThreadLocalMap is a member field of Thread, its lifecycle is as long as the thread's. That is, as long as the thread has not been destroyed, the Map stays in memory and cannot be garbage collected, which can easily cause memory leaks. So how does ThreadLocal deal with this?

The answer is weak references, one of the reference types in Java.

ThreadLocalMap

ThreadLocalMap is a static nested class of ThreadLocal. Java already has a ready-made HashMap class, yet ThreadLocal goes to the trouble of implementing its own ThreadLocalMap, precisely to prevent memory leaks.

Let's take a closer look at how ThreadLocalMap differs from a normal HashMap.

Data structure of ThreadLocalMap

static class ThreadLocalMap {

    // The nested class Entry extends WeakReference
    static class Entry extends WeakReference<ThreadLocal<?>> {
        // Value saved in ThreadLocal variable
        Object value;

        // As you can see, Entry is just a simple key-value pair, with no linked list as in HashMap
        Entry(ThreadLocal<?> k, Object v) {
            super(k);
            value = v;
        }
    }
    // ThreadLocalMap default size
    private static final int INITIAL_CAPACITY = 16;
    // This Entry array is where all ThreadLocals are stored
    private Entry[] table;
}

ThreadLocalMap maintains an Entry array (with no linked lists, unlike HashMap), which holds all ThreadLocal values of a thread. Entry extends WeakReference and refers to its ThreadLocal key weakly. When no strong reference outside the map points to a ThreadLocal object, that object is reclaimed at the next GC, that is, the key of the Entry becomes null, while the Entry and its value remain in the table. Below you will see how Entries whose key is null are cleaned up.
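
To see what such an Entry looks like once its key is gone, here is a minimal sketch. It is not the JDK class: StaleEntryDemo and its Entry use a plain Object key, and the call to clear() merely simulates what the garbage collector does to a weakly referenced key, so that the result does not depend on GC timing.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo class mirroring the shape of ThreadLocalMap.Entry: weak key, strong value.
final class StaleEntryDemo {
    static final class Entry extends WeakReference<Object> {
        Object value;

        Entry(Object key, Object value) {
            super(key);          // the key is only weakly referenced
            this.value = value;  // the value is strongly referenced by the entry
        }
    }

    // Returns { key reachable before clearing, key gone after clearing, value still held }.
    static boolean[] demo() {
        Object key = new Object();
        Entry entry = new Entry(key, "payload");
        boolean liveBefore = entry.get() == key;      // key is still strongly reachable here
        entry.clear();                                // simulates the collector clearing the weak key
        boolean staleAfter = entry.get() == null;     // this is how the map detects an invalid Entry
        boolean valueStillHeld = entry.value != null; // the value stays until the Entry is expunged
        return new boolean[] { liveBefore, staleAfter, valueStillHeld };
    }
}
```

The last flag is the reason the cleanup code exists: a cleared key does not free the value by itself.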

Set Operation

When HashMap encounters a hash collision, it resolves it by chaining the entries of the same bucket in a linked list. Since ThreadLocalMap maintains only a single Entry array, how does it resolve hash collisions? Let's look at the source code of the set method:

    private void set(ThreadLocal<?> key, Object value) {
        Entry[] tab = table;
        int len = tab.length;
        // Compute the slot (index) in the table from the ThreadLocal's hash code
        int i = key.threadLocalHashCode & (len-1);
        // Starting at position i, probe forward one slot at a time until the first empty slot (e == null)
        for (Entry e = tab[i]; e != null; e = tab[i = nextIndex(i, len)]) {
            ThreadLocal<?> k = e.get();
            // If the keys are equal, simply overwrite the old value with the new one
            if (k == key) {
                // Replace the old value and return
                e.value = value;
                return;
            }
            // k == null means the weakly referenced key has been reclaimed, so reuse this slot for the new value
            if (k == null) {
                // This method is analyzed later
                replaceStaleEntry(key, value, i);
                return;
            }
        }

        // At this point, i is the first empty slot at or after the slot the key hashes to
        // Wrap the value in an Entry and place it at position i
        tab[i] = new Entry(key, value);
        int sz = ++size;
        // Try to clean up Entries whose keys have been reclaimed
        // If nothing was cleaned up and the size has reached the threshold, perform the rehash operation
        if (!cleanSomeSlots(i, sz) && sz >= threshold)
            rehash();
    }

    // Get the next index: i + 1, wrapping around to 0 at the end of the table
    private static int nextIndex(int i, int len) {
        return ((i + 1 < len) ? i + 1 : 0);
    }

ThreadLocalMap resolves hash collisions with open addressing (linear probing). If the target slot is already occupied, it first checks whether the key in the slot is the key being set. If so, the old value is replaced. If not, it checks whether the Entry in the slot is still valid (that is, whether the ThreadLocal key of the slot has been garbage collected); if it is invalid, the new value is placed in that slot and some cleanup is performed. If the slot holds a valid Entry, the search continues with the next slot until an empty slot is found.
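
The probing itself can be sketched without weak references. The class below is a simplified illustration, not the JDK implementation: ProbingTable is a made-up name, keys are strings, the hash is passed in explicitly so that collisions can be forced, and stale entries and resizing are left out (so the table must never become full).

```java
// Hypothetical demo class: open addressing with linear probing over two parallel arrays.
final class ProbingTable {
    private final String[] keys;
    private final String[] values;

    // capacity must be a power of two so that (hash & (capacity - 1)) is a valid index
    ProbingTable(int capacity) {
        keys = new String[capacity];
        values = new String[capacity];
    }

    private int nextIndex(int i) {
        return (i + 1 < keys.length) ? i + 1 : 0; // wrap around at the end of the table
    }

    // Stores the value and returns the slot that was used.
    int put(String key, int hash, String value) {
        int i = hash & (keys.length - 1);
        while (keys[i] != null) {            // occupied: same key or a collision
            if (keys[i].equals(key)) {
                values[i] = value;           // same key: overwrite in place
                return i;
            }
            i = nextIndex(i);                // collision: try the next slot
        }
        keys[i] = key;                       // first empty slot at or after the home slot
        values[i] = value;
        return i;
    }

    // Probes from the home slot until the key or an empty slot is found.
    String get(String key, int hash) {
        int i = hash & (keys.length - 1);
        while (keys[i] != null) {
            if (keys[i].equals(key)) {
                return values[i];
            }
            i = nextIndex(i);
        }
        return null;                         // an empty slot ends the search
    }
}
```

With a capacity of 8, three keys that all hash to slot 7 end up in slots 7, 0 and 1, and a lookup for an absent key stops at the first empty slot.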

Clean up invalid Entry

At this point a question should arise: the weak reference only clears the key of an Entry, that is, the ThreadLocal variable, while the Entry itself still occupies its slot in the table. Where does the code clean up these invalid Entries? Let's focus on the two methods that were not analyzed above: replaceStaleEntry and cleanSomeSlots.

cleanSomeSlots

    // As the name implies, clean some slots; by default about log2(n) slots are scanned
    private boolean cleanSomeSlots(int i, int n) {
        boolean removed = false;
        Entry[] tab = table;
        int len = tab.length;
        do {
            i = nextIndex(i, len);
            Entry e = tab[i];
            // Note that an Entry is considered invalid when e.get() == null,
            // i.e. its weakly referenced key has already been collected, so the Entry must be cleared
            if (e != null && e.get() == null) {
                // If an invalid Entry is found, n is reset to the length of the table,
                // so roughly another log2(len) slots are scanned for further invalid Entries
                n = len;
                removed = true;
                // Call expungeStaleEntry to clear the slot at position i
                i = expungeStaleEntry(i);
            }
        // Each pass shifts n right by one bit (halving it), so by default the loop runs about log2(n) times
        } while ( (n >>>= 1) != 0);
        // Return true if any slot was cleared
        return removed;
    }

    private int expungeStaleEntry(int staleSlot) {
        Entry[] tab = table;
        int len = tab.length;
        // Clear the Entry at staleSlot
        tab[staleSlot].value = null;
        tab[staleSlot] = null;
        size--;

        Entry e;
        int i;
        // Keep checking the following slots for invalid Entries until an empty slot (tab[i] == null) is reached
        for (i = nextIndex(staleSlot, len); (e = tab[i]) != null; i = nextIndex(i, len)) {
            ThreadLocal<?> k = e.get();
            // If the Entry is invalid, clear it
            if (k == null) {
                e.value = null;
                tab[i] = null;
                size--;
            } else {
                // Recompute the slot h that the key hashes to
                int h = k.threadLocalHashCode & (len - 1);
                // If h differs from the current position i, the Entry was displaced by a collision and is rehashed:
                // find a better slot for the e currently at position i
                if (h != i) {
                    // Empty position i
                    tab[i] = null;
                    // Probe forward from position h to the first empty slot
                    while (tab[h] != null)
                        h = nextIndex(h, len);
                    // Place e in that slot
                    tab[h] = e;
                }
            }
        }
        // Return the index of the first empty slot after staleSlot
        return i;
    }

The cleanSomeSlots method scans part of the table for invalid Entries. If no invalid Entry is found, only about log2(n) slots are scanned; if one is found, it is cleared and roughly another log2(len) slots are scanned, and so on. The actual clearing is done by the expungeStaleEntry method, which, in addition to clearing the Entry at the given position, walks through the following non-empty slots and clears any invalid Entries among them. It also rehashes the valid Entries it passes. Why rehash here? Because invalid Entries have just been cleared, an Entry that was pushed further back by a hash collision can now be moved into a freed slot closer to its home position. This improves lookup efficiency and keeps the Entry reachable by lookups, which stop at the first empty slot.
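
How many slots "about log2(n)" actually means can be read off the loop condition. The helper below is a sketch for illustration (ScanBudget and slotsScanned are made-up names): it replays only the do/while structure of cleanSomeSlots for the case where no invalid Entry is found.

```java
// Hypothetical helper: replays the loop structure of cleanSomeSlots when nothing stale is found.
final class ScanBudget {
    // Returns how many slots are examined for a given starting value of n.
    static int slotsScanned(int n) {
        int scanned = 0;
        do {
            scanned++;                // one slot is examined per pass
        } while ((n >>>= 1) != 0);    // n is halved after every pass
        return scanned;
    }
}
```

For n = 16 this gives 5 passes (n goes 16, 8, 4, 2, 1, 0), which is the number of steps used in the example that follows.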

cleanSomeSlots example

Consider a table of length 16 in which tab[0] = (k1,v1), tab[1] = (k2,v2), tab[2] is empty, tab[3] = (k3,v3), tab[11] = (null,v7), tab[12] = (null,v8), tab[13] = (k9,v9), tab[14] is empty and tab[15] = (k10,v10). Note that the second argument of nextIndex is always the table length (16), not n. We discuss two cases, both with n starting at 16:

If the scan starts by examining tab[2]:
1. tab[2] is null; n becomes 8, continue with i=nextIndex(i, len)=nextIndex(2, 16)=3
2. The location of tab[3] (k3,v3) is valid; n becomes 4, continue with i=nextIndex(i, len)=nextIndex(3, 16)=4
3. tab[4] is examined; provided it holds no invalid Entry, n becomes 2, continue with i=nextIndex(i, len)=nextIndex(4, 16)=5
4. tab[5] is examined in the same way; n becomes 1, continue with i=nextIndex(i, len)=nextIndex(5, 16)=6
5. tab[6] is examined in the same way; n becomes 0 and the loop ends
If the scan starts by examining tab[11]:
1. The location of tab[11] (null,v7) is invalid, so expungeStaleEntry is called. It empties tab[11] and keeps walking forward. Because tab[12] (null,v8) is also invalid, it is emptied as well. tab[13] (k9,v9) is valid, so the method checks whether it needs to be relocated: if k9 still hashes to slot 13, nothing is done; if k9 hashes to slot 11, the element had only landed in slot 13 because of hash collisions, and it is moved to tab[11]. expungeStaleEntry returns the first null index, 14; n is reset to 16 and then halved to 8; continue with i=nextIndex(i, len)=nextIndex(14, 16)=15
2. The location of tab[15] (k10,v10) is valid; n becomes 4, continue with i=nextIndex(i, len)=nextIndex(15, 16)=0
3. The location of tab[0] (k1,v1) is valid; n becomes 2, continue with i=nextIndex(i, len)=nextIndex(0, 16)=1
4. The location of tab[1] (k2,v2) is valid; n becomes 1, continue with i=nextIndex(i, len)=nextIndex(1, 16)=2
5. tab[2] is null; n becomes 0 and the loop ends
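
The expungeStaleEntry step of the second case can be replayed on a simplified table. This is a sketch under stated assumptions, not the JDK code: ExpungeSketch and Slot are made-up names, a boolean stale flag stands in for a key cleared by the GC, each slot carries its hash explicitly, and the value and size bookkeeping is omitted.

```java
// Hypothetical demo class: the relocation logic of expungeStaleEntry on a simplified table.
final class ExpungeSketch {
    static final class Slot {
        final String key;
        final int hash;
        boolean stale; // stands in for "the weak key was cleared by the GC"

        Slot(String key, int hash) {
            this.key = key;
            this.hash = hash;
        }
    }

    final Slot[] tab;

    ExpungeSketch(int len) { // len must be a power of two
        tab = new Slot[len];
    }

    int nextIndex(int i) {
        return (i + 1 < tab.length) ? i + 1 : 0;
    }

    // Clears staleSlot, then cleans and relocates entries up to the next empty slot, whose index is returned.
    int expunge(int staleSlot) {
        int len = tab.length;
        tab[staleSlot] = null;
        int i;
        for (i = nextIndex(staleSlot); tab[i] != null; i = nextIndex(i)) {
            Slot e = tab[i];
            if (e.stale) {
                tab[i] = null;                // another invalid entry: clear it too
            } else {
                int h = e.hash & (len - 1);   // home slot of this entry
                if (h != i) {                 // displaced by an earlier collision
                    tab[i] = null;
                    while (tab[h] != null) {
                        h = nextIndex(h);     // first free slot at or after the home slot
                    }
                    tab[h] = e;
                }
            }
        }
        return i;
    }
}
```

With stale entries in slots 11 and 12 and an entry k9 in slot 13 whose home slot is 11, expunge(11) moves k9 to slot 11, leaves slots 12 and 13 empty, and returns 14.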

replaceStaleEntry method

    private void replaceStaleEntry(ThreadLocal<?> key, Object value, int staleSlot) {
        Entry[] tab = table;
        int len = tab.length;
        Entry e;
        // slotToExpunge records the index of the first invalid Entry in the contiguous run containing staleSlot
        int slotToExpunge = staleSlot;
        // Scan backward over the non-empty slots to find the earliest invalid Entry, recorded as slotToExpunge
        for (int i = prevIndex(staleSlot, len); (e = tab[i]) != null; i = prevIndex(i, len))
            if (e.get() == null)
                slotToExpunge = i;
        // Scan forward over the non-empty slots to find the key, i.e. check whether the key has been added before
        // Why stop at tab[i] == null? Because with open addressing the key cannot be stored beyond the first empty slot
        for (int i = nextIndex(staleSlot, len); (e = tab[i]) != null; i = nextIndex(i, len)) {
            ThreadLocal<?> k = e.get();
            if (k == key) {
                // The key was found, so it has been added before: overwrite the old value directly
                // Because staleSlot comes before i in probe order, the two slots are swapped to improve lookup efficiency
                // The invalid Entry now at i is cleared by the cleanup below
                e.value = value;
                tab[i] = tab[staleSlot];
                tab[staleSlot] = e;
                // If slotToExpunge is unchanged, no invalid Entry was found during the backward scan,
                // so cleanup starts from the current position i
                if (slotToExpunge == staleSlot)
                    slotToExpunge = i;
                // Both methods were analyzed above: invalid Entries are cleaned up starting at slotToExpunge
                cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
                return;
            }

            // If the backward scan found no invalid Entry and the Entry here is invalid (k == null),
            // then i is the first invalid Entry after staleSlot, so record it as slotToExpunge
            if (k == null && slotToExpunge == staleSlot)
                slotToExpunge = i;
        }
        // The key was not found, so this is a new Entry: create it and place it in the staleSlot position
        tab[staleSlot].value = null;
        tab[staleSlot] = new Entry(key, value);
        if (slotToExpunge != staleSlot)
            // If any other invalid Entry was seen in the run, clean up starting at slotToExpunge
            cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
    }

This method works in three steps:

1. Scan forward to check whether the key already exists in the table. If it does, the key has been set before, so overwrite the old value and move the Entry holding the key to the staleSlot position. (Why move it? Because its original position i is necessarily after staleSlot in probe order, moving it forward to staleSlot improves lookup efficiency and avoids a later rehash.)
2. If the key does not exist, this is a new set operation: create a new Entry and place it in the staleSlot position.
3. Call the cleanSomeSlots method to clear invalid Entries.
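
The first two steps can be sketched on a simplified table. This is an illustration only and not the JDK method: ReplaceStaleSketch and Slot are made-up names, a null key stands in for a key cleared by the GC, and the cleanup of the third step is deliberately left out. The table is assumed to contain at least one empty slot.

```java
// Hypothetical demo class: the "swap forward or insert" part of replaceStaleEntry, without cleanup.
final class ReplaceStaleSketch {
    static final class Slot {
        String key;   // null stands in for a weak key cleared by the GC
        String value;

        Slot(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // tab[staleSlot] must hold an invalid entry, and tab must contain at least one empty slot.
    static void replaceStale(Slot[] tab, String key, String value, int staleSlot) {
        int len = tab.length;
        // Step 1: scan forward through the run of non-empty slots, looking for the key
        for (int i = (staleSlot + 1) % len; tab[i] != null; i = (i + 1) % len) {
            Slot e = tab[i];
            if (key.equals(e.key)) {
                e.value = value;
                tab[i] = tab[staleSlot]; // the invalid entry moves back to position i
                tab[staleSlot] = e;      // the live entry moves forward to staleSlot
                return;
            }
        }
        // Step 2: the key is not in the run, so the new entry takes over the stale slot
        tab[staleSlot] = new Slot(key, value);
    }
}
```

After the swap the live entry sits closer to its home slot, and the invalid entry left at position i is what the omitted cleanup step would remove.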

Other methods

The remaining methods are simpler; see the comments in the source for details. The get method:

    // Lookup used by the get operation
    private Entry getEntry(ThreadLocal<?> key) {
        int i = key.threadLocalHashCode & (table.length - 1);
        Entry e = table[i];
        // The element at position i is the one we are looking for; return it directly
        if (e != null && e.get() == key)
            return e;
        else
            // Otherwise call the getEntryAfterMiss method
            return getEntryAfterMiss(key, i, e);
    }

    private Entry getEntryAfterMiss(ThreadLocal<?> key, int i, Entry e) {
        Entry[] tab = table;
        int len = tab.length;
        // Start at position i and probe forward until an empty slot. Why stop at an empty slot?
        // With open addressing, the key cannot be stored beyond the first empty slot, so there is no need to continue
        while (e != null) {
            ThreadLocal<?> k = e.get();
            // The keys are equal: this is the target element, return it directly
            if (k == key)
                return e;
            // The key is null: this is an invalid element, call expungeStaleEntry to clear the element at position i
            if (k == null)
                expungeStaleEntry(i);
            else
                // Continue with the next element
                i = nextIndex(i, len);
            e = tab[i];
        }
        // No target element found, return null
        return null;
    }

remove method:

    private void remove(ThreadLocal<?> key) {
        Entry[] tab = table;
        int len = tab.length;
        int i = key.threadLocalHashCode & (len-1);
        // The same probing logic as before
        for (Entry e = tab[i]; e != null; e = tab[i = nextIndex(i, len)]) {
            // Find the target element
            if (e.get() == key) {
                e.clear();
                // Call expungeStaleEntry method to clear element at i position
                expungeStaleEntry(i);
                return;
            }
        }
    }

resize method

    // resize is called from rehash when the table is still too full after invalid Entries have been removed
    // (the threshold defaults to 2/3 of the table length)
    private void resize() {
        Entry[] oldTab = table;
        int oldLen = oldTab.length;
        // The new table is twice as long as the old one
        int newLen = oldLen * 2;
        Entry[] newTab = new Entry[newLen];
        int count = 0;
        // Traverse the old table
        for (int j = 0; j < oldLen; ++j) {
            Entry e = oldTab[j];
            if (e != null) {
                ThreadLocal<?> k = e.get();
                // If the key is null, this is an invalid Entry: skip it (setting the value to null helps the GC)
                if (k == null) {
                    e.value = null; // Help the GC
                } else {
                    // Recompute the slot based on the length of the new table
                    int h = k.threadLocalHashCode & (newLen - 1);
                    // Probe from h to the first empty slot, following open addressing
                    while (newTab[h] != null)
                        h = nextIndex(h, newLen);
                    // Put the value at that location
                    newTab[h] = e;
                    count++;
                }
            }
        }
        // Set some parameters for the new table
        setThreshold(newLen);
        size = count;
        table = newTab;
    }
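
The re-slotting done by resize can be tried on plain hash values. The class below is a simplified sketch, not the JDK code: ResizeSketch is a made-up name, each non-null array element is just the hash of a live entry, and invalid entries are not modelled. The threshold method mirrors the 2/3 rule mentioned in the comment above.

```java
// Hypothetical demo class: doubling a linear-probing table and re-slotting its entries.
final class ResizeSketch {
    // Two thirds of the table length, as described for the default threshold.
    static int threshold(int len) {
        return len * 2 / 3;
    }

    // Each non-null element is the hash of a live entry; oldTab.length must be a power of two.
    static Integer[] resize(Integer[] oldTab) {
        int newLen = oldTab.length * 2;
        Integer[] newTab = new Integer[newLen];
        for (Integer hash : oldTab) {
            if (hash != null) {
                int h = hash & (newLen - 1);           // home slot in the larger table
                while (newTab[h] != null) {
                    h = (h + 1 < newLen) ? h + 1 : 0;  // linear probing on collision
                }
                newTab[h] = hash;
            }
        }
        return newTab;
    }
}
```

In a table of length 4, the hashes 1, 5 and 9 all compete for slot 1; after doubling to length 8, hash 5 gets slot 5 to itself and only 1 and 9 still collide.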

Summary

This article has walked through the implementation of ThreadLocal at the code level. ThreadLocal keeps access thread-safe because each thread gets its own copy of the variable. Each thread accesses only its own copy, so there are no concurrency conflicts. As a thread-internal variable, how does it differ from a local variable? Typically, a ThreadLocal is declared static, so only one copy of the value is created per thread, and that copy lives as long as the thread. The lifecycle of a local variable, by contrast, is tied to a method call: each time the method is called the variable is created, and when the method ends it goes out of scope. ThreadLocal therefore avoids repeatedly creating and destroying large objects.

ThreadLocalMap's Entry extends WeakReference, so a ThreadLocal key is reclaimed at the next GC once no strong reference points to the ThreadLocal variable. Reclaimed ThreadLocal variables are not cleaned up eagerly; instead, the Entries that held them are detected and removed during later get, set and remove operations, to prevent possible memory leaks.
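
Because a value lives as long as its thread unless it is removed, long-lived threads deserve some care. The sketch below is an illustration under assumptions that go beyond the text above: it uses a single-thread executor as an example of a reused thread, and PooledThreadDemo, CONTEXT and secondTaskSees are made-up names. It compares what a second task sees with and without an explicit remove() in a finally block.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class: a ThreadLocal value outliving the task that set it on a reused thread.
final class PooledThreadDemo {
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Runs two tasks one after the other on the same pooled thread and returns what the second one reads.
    static String secondTaskSees(boolean removeInFinally) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable first = () -> {
                try {
                    CONTEXT.set("request-1");
                } finally {
                    if (removeInFinally) {
                        CONTEXT.remove(); // explicit cleanup before the thread is reused
                    }
                }
            };
            Callable<String> second = CONTEXT::get;
            pool.submit(first).get();         // wait for the first task to finish
            return pool.submit(second).get(); // runs on the same worker thread
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Without the remove() call, the second task reads the value left behind by the first one; with it, the second task sees null, the default initial value.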
