Application and principle of ThreadLocal

Posted by LennyG on Sun, 16 Jan 2022 08:12:15 +0100

What is ThreadLocal

As its name suggests, the ThreadLocal class can be understood as a thread-local variable. If a ThreadLocal is defined, each thread's reads and writes to it are isolated from every other thread and do not affect one another. It achieves thread confinement by giving each thread its own independent copy of the variable's data.
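
The isolation described above can be sketched in a few lines. The class name IsolationDemo and the helper run() are made up for this illustration; only ThreadLocal and Thread come from the JDK.

```java
public class IsolationDemo {
    private static final ThreadLocal<String> NAME = new ThreadLocal<>();

    // Returns { what the worker thread saw, what the calling thread saw afterwards }.
    static String[] run() {
        NAME.set("main");
        String[] seen = new String[2];
        Thread worker = new Thread(() -> {
            String before = NAME.get();          // null: the worker has no value of its own yet
            NAME.set("worker");                  // only changes the worker's copy
            seen[0] = before + "/" + NAME.get();
        });
        worker.start();
        try {
            worker.join();                       // wait so that seen[0] is safely visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        seen[1] = NAME.get();                    // the caller's value is untouched
        NAME.remove();
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println(seen[0]); // null/worker
        System.out.println(seen[1]); // main
    }
}
```

The worker never sees "main", and the calling thread never sees "worker", even though both use the same ThreadLocal object.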

Practical application

In day-to-day development there are few scenarios where we use ThreadLocal directly; most uses are inside frameworks. The most common scenario is managing database connections and sessions, so that all code running in one thread uses the same database connection. Another common scenario is working around the thread-unsafety of SimpleDateFormat, although Java 8 now provides DateTimeFormatter, which is thread safe; interested readers can take a look. You can also use ThreadLocal to pass parameters gracefully without threading them through every method signature. Note, however, that values set by a parent thread in a ThreadLocal are not visible to a child thread, so parameters passed this way are lost across threads. A different ThreadLocal variant that specifically solves this problem is introduced later.
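
As a small illustration of the SimpleDateFormat scenario, here is a minimal sketch. The class name PerThreadDateFormat, the date pattern, and the fixed UTC time zone are assumptions chosen for the example, not something the scenario requires.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class PerThreadDateFormat {
    // Each thread lazily gets its own SimpleDateFormat, so no instance is ever shared.
    private static final ThreadLocal<SimpleDateFormat> FORMAT = ThreadLocal.withInitial(() -> {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        format.setTimeZone(TimeZone.getTimeZone("UTC")); // fixed zone keeps the output predictable
        return format;
    });

    static String format(Date date) {
        return FORMAT.get().format(date);
    }

    public static void main(String[] args) {
        System.out.println(format(new Date(0L))); // 1970-01-01
    }
}
```

On Java 8 and later, a shared DateTimeFormatter constant is usually the simpler choice, as noted above.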

Introduction to the ThreadLocal API

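ThreadLocal has only a handful of APIs: get(), set(T value), remove(), the static factory withInitial(Supplier), and the protected initialValue() hook.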

Let's take a look at the use of these APIs, which are also super simple to use

private static ThreadLocal<String> threadLocal = ThreadLocal.withInitial(()->"java finance");
public static void main(String[] args) {
    System.out.println("Get initial value:"+threadLocal.get());
    threadLocal.set("Attention:[ java Finance]");
    System.out.println("Get the modified value:"+threadLocal.get());
    threadLocal.remove();
}

Output results:

Get initial value:java finance
Get the modified value:Attention:[ java Finance]

Super simple, isn't it? Just a few lines of code cover all the APIs. Let's take a brief look at the source code behind them.

Member variables of ThreadLocalMap

    /**
     * The initial capacity -- MUST be a power of two.
     */
    private static final int INITIAL_CAPACITY = 16;

    /**
     * The table, resized as necessary.
     * table.length MUST always be a power of two.
     */
    private Entry[] table;

    /**
     * The number of entries in the table.
     */
    private int size = 0;

    /**
     * The next size value at which to resize.
     */
    private int threshold; // Default to 0

Here is a question often asked in interviews: why must the initial capacity and the length of the Entry array be powers of 2? The slot for the first key is computed as firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1), and much of the JDK source uses hashCode & ($2^n$ - 1) instead of hashCode % $2^n$.
The advantages of writing it this way are as follows:

  • A bit operation replaces the modulo operation, which improves computing efficiency.
  • When the length is $2^n$, every low-order bit of length - 1 is 1, so all of the low n bits of the hash take part in choosing the slot. This reduces the probability of collisions between different hash values and spreads the elements across the table as evenly as possible.
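
To make the equivalence concrete, the following sketch compares the mask with a true modulo. PowerOfTwoIndex and its two helper methods are names invented for this example; Math.floorMod is used rather than % because % can return negative results for negative hash codes.

```java
public class PowerOfTwoIndex {
    // Slot index the way ThreadLocalMap computes it; length must be a power of two.
    static int indexByMask(int hash, int length) {
        return hash & (length - 1);
    }

    // Mathematical modulo, always in the range [0, length).
    static int indexByMod(int hash, int length) {
        return Math.floorMod(hash, length);
    }

    public static void main(String[] args) {
        int[] hashes = {0, 1, 17, 0x61c88647, -1, Integer.MIN_VALUE};
        for (int hash : hashes) {
            // Both columns match for every hash when length is a power of two.
            System.out.println(indexByMask(hash, 16) + " " + indexByMod(hash, 16));
        }
    }
}
```

The two results only agree when the length is a power of two, which is one reason the table is always resized by doubling.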

set method

public void set(T value) {

    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
}

The set method is relatively simple. The part to focus on is the ThreadLocalMap. Since it is a map (a map in the conceptual sense; do not confuse it with java.util.Map), it must have keys and values. From the source code we can see that the key is the ThreadLocal itself, or more precisely, the Entry holds only a weak reference to the ThreadLocal, and the value is the value we actually set.

static class Entry extends WeakReference<ThreadLocal<?>> {

    /** The value associated with this ThreadLocal. */
    Object value; // Actual stored value

    Entry(ThreadLocal<?> k, Object v) {
        super(k);
        value = v;
    }
}

Entry is the node type defined in ThreadLocalMap. It extends WeakReference and declares a field of type Object that stores the value put into the ThreadLocal. So where does the ThreadLocalMap itself live? It is a field of Thread (threadLocals), and our values are stored in that per-thread map. This is how isolation between threads is achieved. The structure of ThreadLocal is clearly introduced in the following two figures.


Next, let's look at the data structure inside ThreadLocalMap. We know that HashMap resolves hash collisions with linked lists and red-black trees (JDK 1.8), but ThreadLocalMap has only an array. How does it resolve collisions? ThreadLocalMap uses linear probing. The position of an element in the table array is first determined from the hash code of its key. If that position is already occupied by an entry with a different key, a fixed rule is used to step to the next position, and the check is repeated until a free position is found. ThreadLocalMap simply uses a step of 1, moving forward or backward to the adjacent slot and wrapping around at the ends of the array.

    /**
     * Increment i modulo len.
     */
    private static int nextIndex(int i, int len) {
        return ((i + 1 < len) ? i + 1 : 0);
    }

    /**
     * Decrement i modulo len.
     */
    private static int prevIndex(int i, int len) {
        return ((i - 1 >= 0) ? i - 1 : len - 1);
    }
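
To show how nextIndex drives collision handling, here is a deliberately simplified open-addressing table. LinearProbingSketch is a hypothetical class written for this article, not the JDK implementation: it compares keys with equals (ThreadLocalMap compares by identity), takes the hash as a parameter, never resizes, and assumes the table never becomes full.

```java
public class LinearProbingSketch {
    private final String[] keys;
    private final String[] values;

    LinearProbingSketch(int capacity) { // capacity is assumed to be a power of two
        keys = new String[capacity];
        values = new String[capacity];
    }

    private static int nextIndex(int i, int len) {
        return (i + 1 < len) ? i + 1 : 0;
    }

    // Stores the pair and returns the slot that was finally used.
    int put(String key, int hash, String value) {
        int len = keys.length;
        int i = hash & (len - 1);
        while (keys[i] != null && !keys[i].equals(key)) {
            i = nextIndex(i, len); // occupied by another key: try the adjacent slot
        }
        keys[i] = key;
        values[i] = value;
        return i;
    }

    // Probes from the home slot until the key or an empty slot is found.
    String get(String key, int hash) {
        int len = keys.length;
        for (int i = hash & (len - 1); keys[i] != null; i = nextIndex(i, len)) {
            if (keys[i].equals(key)) {
                return values[i];
            }
        }
        return null;
    }

    static String demo() {
        LinearProbingSketch table = new LinearProbingSketch(4);
        int first = table.put("a", 3, "A");  // home slot 3 is free
        int second = table.put("b", 3, "B"); // slot 3 is taken, so probing wraps around to slot 0
        return first + " " + second + " " + table.get("b", 3);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 3 0 B
    }
}
```

The more keys collide, the longer these probe sequences become, which is why lookup cost grows with the number of entries in the table.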

As a result, if a thread holds a large number of ThreadLocal instances, performance can suffer: collisions lengthen the probe sequences, and the table has to be scanned to clear stale entries. Therefore we should use as few ThreadLocal instances as possible and avoid creating large numbers of them in a thread. If we need to store values of different types, we can keep a single Map<String, Object> in one ThreadLocal, which greatly reduces the number of ThreadLocal instances created.
The pseudo code is as follows:

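import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public final class HttpContext {

    // One map per thread; all per-request values share this single ThreadLocal.
    private static final ThreadLocal<Map<String, Object>> CONTEXT =
            ThreadLocal.withInitial(() -> new ConcurrentHashMap<>(64));

    private HttpContext() {
    }

    public static <T> void add(String key, T value) {
        if (key == null || key.isEmpty() || value == null) {
            throw new IllegalArgumentException("key is empty or value is null");
        }
        CONTEXT.get().put(key, value);
    }

    @SuppressWarnings("unchecked")
    public static <T> T get(String key) {
        // The caller is responsible for asking for the type that was stored.
        return (T) get().get(key);
    }

    public static Map<String, Object> get() {
        return CONTEXT.get();
    }

    // Call when the request finishes so that a pooled thread does not keep stale data.
    public static void remove() {
        CONTEXT.remove();
    }
}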

With this approach, when we need to pass several different parameters, a single ThreadLocal is enough instead of many.
What if I don't want to do it this way and really do need many ThreadLocal instances, with better performance as well? Is that possible? Netty's FastThreadLocal addresses this, but it is only faster when it is used from a FastThreadLocalThread or one of its subclasses. You can consult the Netty documentation for details on its use.

Now let's look at the hash function of ThreadLocal.

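// Each ThreadLocal takes its hash code from a shared counter when it is constructed.
private final int threadLocalHashCode = nextHashCode();

private static AtomicInteger nextHashCode = new AtomicInteger();

// The gap between successively generated hash codes is this magic number, so that the
// generated values (the ThreadLocal IDs) are evenly distributed in an array whose length is a power of 2.
private static final int HASH_INCREMENT = 0x61c88647;

/**
 * Returns the next hash code.
 */
private static int nextHashCode() {
    return nextHashCode.getAndAdd(HASH_INCREMENT);
}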

As we can see, each newly constructed ThreadLocal takes as its ID (threadLocalHashCode) the previous one plus the magic number 0x61c88647. The choice of this magic number is related to Fibonacci hashing: 0x61c88647 is 1640531527 in decimal, which equals $2^{32}$ - 0x9e3779b9, where 0x9e3779b9 is $2^{32}$ divided by the golden ratio, truncated to an integer. When we keep adding 0x61c88647 to assign each ThreadLocal its own ID and then take the result modulo a power of 2 (the length of the array), the results are distributed very evenly. We can demonstrate this with the magic number:

public class MagicHashCode {

    private static final int HASH_INCREMENT = 0x61c88647;

    public static void main(String[] args) {
        printSlots(16); // Initial capacity 16
        printSlots(32); // After doubling the capacity
        printSlots(64);
    }

    // Prints the slot chosen for each of the first `length` hash codes.
    private static void printSlots(int length) {
        for (int i = 0; i < length; i++) {
            // Same as adding HASH_INCREMENT once per iteration; int overflow wraps around.
            int hashCode = i * HASH_INCREMENT + HASH_INCREMENT;
            System.out.print(hashCode & (length - 1));
            System.out.print(" ");
        }
        System.out.println();
    }
}

Operation results:

7 14 5 12 3 10 1 8 15 6 13 4 11 2 9 0
7 14 21 28 3 10 17 24 31 6 13 20 27 2 9 16 23 30 5 12 19 26 1 8 15 22 29 4 11 18 25 0
7 14 21 28 35 42 49 56 63 6 13 20 27 34 41 48 55 62 5 12 19 26 33 40 47 54 61 4 11 18 25 32 39 46 53 60 3 10 17 24 31 38 45 52 59 2 9 16 23 30 37 44 51 58 1 8 15 22 29 36 43 50 57 0
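
The relation between the magic number and the golden ratio can also be checked numerically. The sketch below is only a sanity check under the usual Fibonacci-hashing reading of the constant; MagicNumberCheck and fromGoldenRatio are names made up for this example.

```java
public class MagicNumberCheck {
    // Recomputes the constant as 2^32 - floor(2^32 / phi).
    static int fromGoldenRatio() {
        double phi = (1 + Math.sqrt(5)) / 2;
        long scaled = (long) (4294967296L / phi); // 2654435769, i.e. 0x9e3779b9
        return (int) (4294967296L - scaled);      // 1640531527, i.e. 0x61c88647
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(fromGoldenRatio())); // 61c88647
    }
}
```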

I have to admire the author for using Fibonacci hashing to keep the entries dispersed across the hash table and make the results so uniform. Clearly, mathematics is still indispensable for writing good code. The rest of the source code will not be analyzed here; if you are interested, you can read it yourself.

ThreadLocal memory leak

Whether ThreadLocal causes memory leaks is a controversial issue. First, we need to know what a memory leak is.

In Java, a memory leak means that some allocated objects have the following two characteristics. First, they are reachable: in the directed graph of object references there is a path from a GC root to them. Second, they are useless: the program will never use them again. Objects that meet both conditions can be regarded as memory leaks in Java. They will not be reclaimed by the GC, yet they still occupy memory.

Memory leaks with ThreadLocal:

  • The thread lives for a very long time. When the ThreadLocal is no longer strongly referenced from outside, it is reclaimed by the GC, and the Entry's weak reference to it is cleared. An Entry with a null key then appears in the ThreadLocalMap, and its value can never be accessed again (set, get and other methods can no longer reach it). As long as the thread has not ended, the value is still strongly referenced by the Entry (Entry.value), and the Entry is strongly referenced by the current thread's ThreadLocalMap (Entry[] table), so the value can never be collected, causing a memory leak.
    Let's demonstrate this scenario:

public static void main(String[] args) throws InterruptedException {

    ThreadLocal<Long []> threadLocal = new ThreadLocal<>();
    for (int i = 0; i < 50; i++) {
        run(threadLocal);
    }
    Thread.sleep(50000);
    // Remove strong references
    threadLocal = null;
    System.gc();
    System.runFinalization();
    System.gc();
}

private static void run(ThreadLocal<Long []> threadLocal) {
    new Thread(() -> {
        threadLocal.set(new Long[1024 * 1024 *10]);
        try {
            Thread.sleep(1000000000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }).start();
}

Using jconsole.exe, a tool that ships with the JDK, you will find that memory does not drop even after GC has run, because the key is still strongly referenced by the threads (each thread's Runnable captured the ThreadLocal) and the threads are still alive. The result is as follows:

  • Handling this situation
    The design of ThreadLocalMap takes this situation into account. Calls to set(), get(), and remove() can trigger cleanSomeSlots() and expungeStaleEntry(), which clear the values of entries whose key is null. This cleanup is passive: if set(), get(), or remove() is never called on the thread again, the value still leaks. The documentation also recommends declaring ThreadLocal instances as static fields, which makes the ThreadLocal live as long as the class that holds it. Because the ThreadLocal is then always strongly referenced, it will never be collected. In that case, unless we remove the value manually, the key of the Entry never becomes null and the weak reference loses its purpose. Therefore we should make it a habit to call remove() manually after use. In real production environments, most manual remove() calls are not about avoiding null keys but about keeping the business logic correct. For example, after an order is placed, we build the order's request context in a ThreadLocal and then update the user's points asynchronously on a thread pool. If remove() is not called once the update completes, the pooled thread keeps the old context; even though the next order will usually overwrite it, any code that reads the context before it is overwritten sees stale data, which can cause business problems.
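
The habit of calling remove() is usually written as a try/finally block. The sketch below is one way to show the effect on a pooled thread; the class RemoveInFinally, the ORDER variable, and the value "order-1" are hypothetical names for this illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class RemoveInFinally {
    private static final ThreadLocal<String> ORDER = new ThreadLocal<>();

    // Runs two tasks on the same pooled thread and returns what the second task reads.
    static String secondTaskSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Runnable first = () -> {
            ORDER.set("order-1");
            try {
                // ... business logic that reads ORDER.get() would go here ...
            } finally {
                if (cleanUp) {
                    ORDER.remove(); // leave the pooled thread clean for the next task
                }
            }
        };
        Callable<String> second = ORDER::get;
        try {
            pool.submit(first).get();
            return pool.submit(second).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTaskSees(false)); // order-1 (stale value from the first task)
        System.out.println(secondTaskSees(true));  // null
    }
}
```

Without the remove() call, the second task silently inherits the first task's order, which is exactly the kind of business error described above.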

If you don't want to clean up manually, is there another way to solve this problem?
FastThreadLocal provides an automatic cleanup mechanism.

  • In a thread pool, threads are reused for as long as the program runs and are never destroyed, so their ThreadLocalMap is never released. The essence is the same as in the example above. If a thread is not reused, it is destroyed after use and there is no leak, because the JVM calls the thread's exit method to clean up when the thread ends.

    /**
     * This method is called by the system to give a Thread
     * a chance to clean up before it actually exits.
     */
    private void exit() {
        if (group != null) {
            group.threadTerminated(this);
            group = null;
        }
        /* Aggressively null out all reference fields: see bug 4006245 */
        target = null;
        /* Speed the release of some of these resources */
        threadLocals = null;
        inheritableThreadLocals = null;
        inheritedAccessControlContext = null;
        blocker = null;
        uncaughtExceptionHandler = null;
    }

InheritableThreadLocal

At the beginning of the article we mentioned that values are lost when they are passed from a parent thread to a child thread. InheritableThreadLocal provides a mechanism for sharing data between parent and child threads and solves this problem.

static ThreadLocal<String> threadLocal = new ThreadLocal<>();

static InheritableThreadLocal<String> inheritableThreadLocal = new InheritableThreadLocal<>();

public static void main(String[] args) throws InterruptedException {
    threadLocal.set("threadLocal value of the main thread");
    Thread.sleep(100);
    new Thread(() -> System.out.println("Child thread reads threadLocal: " + threadLocal.get())).start();
    Thread.sleep(100);
    inheritableThreadLocal.set("inheritableThreadLocal value of the main thread");
    new Thread(() -> System.out.println("Child thread reads inheritableThreadLocal: " + inheritableThreadLocal.get())).start();
}

Output results:

Child thread reads threadLocal: null
Child thread reads inheritableThreadLocal: inheritableThreadLocal value of the main thread

However, problems arise when InheritableThreadLocal is combined with a thread pool. A child thread copies the data in the parent thread's inheritableThreadLocals into its own inheritableThreadLocals only once, when the Thread object is created; this is how context is passed from parent to child. In a thread pool, threads are reused rather than created for each task, so a task may see the values captured when its worker thread was created instead of the values of the thread that submitted it. What can be done to solve this? You can think about it or leave a message below. If you really don't want to think, you can refer to Alibaba's TransmittableThreadLocal.
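
The thread pool problem can be reproduced with a single-thread executor. This is a minimal sketch; InheritablePoolDemo, CTX, and the request names are invented for the example, and it relies on the executor creating its worker thread on the first submit.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class InheritablePoolDemo {
    private static final InheritableThreadLocal<String> CTX = new InheritableThreadLocal<>();

    // Returns what the pooled worker reads for two consecutive "requests".
    static String[] run() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Callable<String> read = CTX::get;
        try {
            CTX.set("request-A");
            String first = pool.submit(read).get();  // worker is created here and copies "request-A"
            CTX.set("request-B");
            String second = pool.submit(read).get(); // reused worker still holds its old copy
            return new String[] {first, second};
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
            CTX.remove();
        }
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println(seen[0]); // request-A
        System.out.println(seen[1]); // request-A, not request-B
    }
}
```

The second task runs on behalf of "request-B" but still reads "request-A", because the copy happened only once, when the worker thread was constructed.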

Summary

  • This article briefly introduces the common usage of ThreadLocal, its general implementation principle, its memory leak problem, the points to watch out for when using it, and how to pass values between parent and child threads.
  • It also touches on the usage scenarios of ThreadLocal, InheritableThreadLocal, FastThreadLocal, and TransmittableThreadLocal, along with their caveats.
  • The focus is on ThreadLocal itself. Once it is clear, the other variants are much easier to understand.

Topics: Java