Deep understanding of ThreadLocal (source + memory)

Posted by gigya on Sat, 11 May 2019 03:30:54 +0200

Deep Understanding of ThreadLocal

purpose

We usually use ThreadLocal to provide thread local variables. Thread local variables have a copy within each Thread, and Thread can only access its own copy. Text interpretation is always obscure. Let's take an example.

public class Test {

    private static ThreadLocal<String> threadLocal = new ThreadLocal<>();

    public static void main(String[] args) {
        Thread thread1 = new MyThread("lucy");
        Thread thread2 = new MyThread("lily");
        thread1.start();
        thread2.start();
    }

    private static class MyThread extends Thread {

        MyThread(String name) {
            super(name);
        }

        @Override
        public void run() {
            Thread thread = Thread.currentThread();
            threadLocal.set("i am " + thread.getName());
            try {
                //Sleep for two seconds to make sure that both thread lucy and thread lily call threadLocal's set method.
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            System.out.println(thread.getName() + " say: " + threadLocal.get());
        }
    }
}

This example is very simple, creating two threads, lucy and lily. Inside the thread, the set method of threadLocal is called to store a string, and after 2 seconds of sleep, the thread name and the string in threadLocal are output. Let's run this code and take a look at the output.

lucy say: i am lucy
lily say: i am lily

principle

The example above explains the role of ThreadLocal very well. Next, let's analyze how it works.

We locate the set method of ThreadLocal. The set method in source code is divided into several methods. In order to express conveniently, the author integrates these methods.

public void set(T value) {
    //Get the current thread
    Thread t = Thread.currentThread();
    //Get the ThreadLocalMap for the current thread
    ThreadLocalMap map = t.threadLocals;
    if (map != null)
        //Put the data into the ThreadLocalMap. The key is the current ThreadLocal object, and the value is the value we pass in.
        map.set(this, value);
    else
        //Initialize the ThreadLocalMap and store it in the map with the current ThreadLocal object as the Key and value as the value.
        t.threadLocals = new ThreadLocalMap(this, value);
}

As you can see from the above code, the set method of ThreadLocal is mainly implemented through the ThreadLocalMap of the current thread. ThreadLocalMap is a Map whose key is ThreadLoacl and value is Object.

I will not post the source code of the get method of TreadLocal. Generally, it is similar to the set method. First, I get the ThreadLocal Map of the current thread, and then I get the value with this as the key.

Here we basically understand the working principle of ThreadLocal. Let's summarize it.

There is a ThreadLocal Map inside each Thread instance. ThreadLocal Map is a Map whose key is ThreadLocal and value is Object.
The set method of ThreadLocal actually stores data into the ThreadLocalMap of the current thread, whose key is the current ThreadLocal object and value is the value passed in from the set method.
When using data, take the current ThreadLocal as the key and extract the data from the ThreadLocalMap of the current thread.

ThreadLocalMap

Above, we introduced that ThreadLocal is mainly implemented through ThreadLocalMap of threads.

    static class ThreadLocalMap {
        private ThreadLocal.ThreadLocalMap.Entry[] table;

        static class Entry extends WeakReference<ThreadLocal<?>> {
            Object value;

            Entry(ThreadLocal<?> var1, Object var2) {
                super(var1);
                this.value = var2;
            }
        }
    }

ThreadLocalMap is a Map that maintains an Entry [].

ThreadLocalMap actually wraps Key and Value as Entry and puts them into Entry arrays. Let's look at its set method.

        private void set(ThreadLocal<?> key, Object value) {
            Entry[] tab = table;
            int len = tab.length;
            int i = key.threadLocalHashCode & (len-1);

            for (Entry e = tab[i];
                 e != null;
                 e = tab[i = nextIndex(i, len)]) {
                ThreadLocal<?> k = e.get();
              
                if (k == key) {
                  //If it already exists, replace value directly
                    e.value = value;
                    return;
                }

                if (k == null) {//If the key ThreadLocal at the current location is empty, replace the key and value. The following ThreadLocal memory analysis shows why this code exists.
                    replaceStaleEntry(key, value, i);
                    return;
                }
            }

            tab[i] = new Entry(key, value);//The location has no data and is stored directly.
            int sz = ++size;
            if (!cleanSomeSlots(i, sz) && sz >= threshold) //Check for expansion
                rehash();
        }

        private static int nextIndex(int i, int len) {
            return ((i + 1 < len) ? i + 1 : 0);
        }

Here, if you know HashMap, you can see that ThreadLocalMap is a HashMap. However, it does not solve Hash conflicts by using array + linked list in java.util.HashMap, but by index backward.

Let's briefly analyze this code:

The array subscript i is calculated by ThreadLocal threadLocalHashCode and the length of the current Map.
Starting with i, traversing Entry arrays can be done in three ways:
- Entry's key is the ThreadLocal we want to set, which directly replaces the value in Entry.
- Entry's key is empty, replacing key and value directly.
- Hash collision occurred, the current location has data, look for the next available space.
Find the location where there is no data, and put the key and value in.
Check for expansion.

As we know, HashMap is a very efficient set of get and set, and its time complexity is only O(1). But if there are serious Hash conflicts, the efficiency of HashMap will be much lower. As we know from the previous code, ThreadLocalMap calculates Entry to store index by key. threadLocalHashCode & (len-1). Len is the current length of Entry [], and there's nothing to say. That seems to be the secret in threadLocal HashCode. Let's see how threadLocal HashCode came into being.

public class ThreadLocal<T> {
  
    private final int threadLocalHashCode = nextHashCode();
    
    private static AtomicInteger nextHashCode = new AtomicInteger();

    private static final int HASH_INCREMENT = 0x61c88647;

    private static int nextHashCode() {
        return nextHashCode.getAndAdd(HASH_INCREMENT);
    }
}

This code is very simple. There is a global counter called nextHashCode, which adds 0x61c88647 for every ThreadLocal generated counter, and assigns the current value to threadLocalHashCode. For the magic constant 0x61c88647, you can point Here.

ThreadLocal Memory Analysis

I don't know when the ThreadLocal memory leak began to circulate on the Internet. Let's start with the memory of ThreadLocal and analyze whether this statement is correct. Don't say much, just go to the picture.

Now let's assume that ThreadLocal has fulfilled its mission and has disconnected its reference from ThreadLocalRef. At this point, the memory graph becomes like this.

When system GC occurs, ThreadLocal memory is reclaimed because only weak references from key s are available in Heap.

At this point, value is left in Heap, and we can't access it by reference. Value will last until the end of the thread. If you don't want to depend on the lifetime of a thread, call the remove method to release the value's memory. Personally, this kind of design should also be the helpless action of JDK development bosses. From the source code, we can feel the efforts of these big guys to minimize the risk of memory leaks.

ThreadLocalMap.Entry refers to ThreadLocal softly, avoiding memory leaks from ThreadLocal.

Remember this code in the ThreadLocalMap set method?

private void set(ThreadLocal<?> key, Object value) {
         
    for (Entry e = tab[i];
         e != null;
         e = tab[i = nextIndex(i, len)]) {
        ThreadLocal<?> k = e.get();
      
           ...
      
        if (k == null) {
            replaceStaleEntry(key, value, i);
            return;
        }
    }
}

When the ThreadLocal get method is retrieved, there is a section of code that removes Entry and Entry.value if Entry's key is null.

private int expungeStaleEntry(int staleSlot) {
    Entry[] tab = table;
    int len = tab.length;

    // expunge entry at staleSlot
    tab[staleSlot].value = null;
    tab[staleSlot] = null;
    size--;

    // Rehash until we encounter null
    Entry e;
    int i;
    for (i = nextIndex(staleSlot, len);
         (e = tab[i]) != null;
         i = nextIndex(i, len)) {
        ThreadLocal<?> k = e.get();
      //
        if (k == null) {
            e.value = null;
            tab[i] = null;
            size--;
        }
      ...
    }
    return i;
}

Topics: Java JDK

Programmer Think

Deep understanding of ThreadLocal (source + memory)