An In-Depth Analysis of Java Thread-Safety Principles

Posted by Chinese on Sun, 06 Mar 2022 12:06:25 +0100

Most Java programmers, novice or experienced, know that accessing a shared variable from multiple threads without synchronization is not thread-safe. Here, "not thread-safe" means that a modification of a shared variable by thread A may not be visible to thread B. Here is a direct example:

// Tip: keep a screenshot of this code at hand and read it alongside the text
public class Test {
    static long a= 0; //Shared variable
    //  volatile static long a= 0; 
    public static void main(String[] args) throws Exception {
        Thread work = new Thread(() -> {
            while (a== 0) {
            }
//            System.out.println("work finish");
        });
        work.start(); // Start the worker thread
        Thread.sleep(100);
        a= 7;
        work.join();
    }
}

I did not print anything in this example and commented out the print statement. Why? Some readers may have learned a little about multithreading: they know that println contains a synchronized block, but not the details. They only know that the latest value of a shared variable can be read inside a synchronized block, or that a synchronized block adds read/write barriers. To keep those details out of the picture, I removed the print statement.

Run the program and you will find that it never stops. Analyze it: the main thread has certainly finished its own work (it is only waiting in join), so it must be the work thread that cannot stop. Why does the work thread not leave its endless loop when it checks the loop condition after I have modified the value of a? Some readers may have added the volatile keyword to the shared variable and found that the program then stops. This article does not discuss that keyword; it only analyzes why the program cannot stop.
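
For reference only, the volatile variant mentioned above can be sketched as follows. This is a minimal sketch, not part of the original experiment: the class name VolatileFlagDemo and the method runOnce are names I made up, and the two-second join timeout is an arbitrary choice so that the check cannot hang.

```java
import java.util.concurrent.TimeUnit;

// Sketch only: VolatileFlagDemo and runOnce are illustrative names, not from the article.
class VolatileFlagDemo {
    // volatile: a write by one thread is visible to subsequent reads by other threads
    static volatile long a = 0;

    // Starts a spinning worker, writes a = 7 after sleepMillis, and reports
    // whether the worker left its loop within two seconds.
    static boolean runOnce(long sleepMillis) {
        a = 0;
        Thread work = new Thread(() -> {
            while (a == 0) {
                // busy-wait; every iteration re-reads the volatile field
            }
        });
        work.setDaemon(true); // do not keep the JVM alive if the worker never stops
        work.start();
        try {
            Thread.sleep(sleepMillis);
            a = 7;
            work.join(TimeUnit.SECONDS.toMillis(2)); // bounded wait instead of join()
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !work.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + runOnce(100));
    }
}
```

The only difference from the first listing that matters is the volatile modifier; the bounded join and the daemon flag are just there so the sketch always returns.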

After changing the sleep time of the main thread to 1 or 0 milliseconds, the program runs normally and terminates:

  • sleep(1): the program stops, and the work thread sees the main thread's change to a
  • sleep(100): the program does not stop, and the work thread does not see the main thread's change to a

The different sleep times lead to different degrees of optimization:

  • sleep(1): the JVM has not yet started JIT optimization of the code executed in the work thread
  • sleep(100): the JVM has started JIT optimization of the code executed in the work thread

"Nonsense! How do you know whether this optimization has kicked in? Are you the JIT, or the JIT's father?" some readers may ask. Run the program with the virtual machine parameter -Xint, which sets the JVM execution engine to pure interpretation. Now the program ends normally even if main sleeps for 10000 seconds. This means that when the execution engine is purely interpreting, the shared variable is, in practice, visible across threads: the main thread's change to the shared variable is visible to the work thread, so the work thread reads a as 7, the loop condition no longer holds, the work thread leaves the loop, and the program ends.

The JVM execution engine can run in three modes: pure interpretation, pure compilation, and mixed mode. By default, a 64-bit JVM runs in mixed mode. That is why the program can stop after sleeping for 1 millisecond: the code executed by the work thread has not yet been compiled by the JIT. After sleeping for 100 milliseconds, the loop executed by the work thread has been fully compiled and runs as JIT-compiled code. (The JIT has the C1 and C2 compilers, and in the default tiered compilation mode a piece of code may be compiled several times. Here we assume that JIT compilation is atomic: either the code has not been compiled at all, or the compiled code is already in use.) Conclusion: the invisibility of the shared variable between threads in this example is a side effect of the JIT compiler.
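
To check which of the three modes a given run is actually using, something like the following sketch can help. It only looks at the JVM flags the process was started with; ExecutionModeCheck and modeFromFlags are names I made up, and the rule "the last of -Xint / -Xcomp / -Xmixed wins" is my assumption, not something the experiments above verify. The java.vm.info property is HotSpot-specific and may be missing on other JVMs.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Sketch only: ExecutionModeCheck and modeFromFlags are illustrative names.
class ExecutionModeCheck {
    // Returns "interpreted", "compiled" or "mixed" based on the given JVM flags.
    // Assumption: only -Xint / -Xcomp / -Xmixed are considered, and the last one wins.
    static String modeFromFlags(List<String> jvmArgs) {
        String mode = "mixed"; // default when no mode flag is given
        for (String arg : jvmArgs) {
            if (arg.equals("-Xint")) {
                mode = "interpreted";
            } else if (arg.equals("-Xcomp")) {
                mode = "compiled";
            } else if (arg.equals("-Xmixed")) {
                mode = "mixed";
            }
        }
        return mode;
    }

    public static void main(String[] args) {
        // Flags passed to this JVM on startup, e.g. [-Xint]
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("mode from flags: " + modeFromFlags(jvmArgs));
        // HotSpot-specific description of the mode; may print null on other JVMs
        System.out.println("java.vm.info:    " + System.getProperty("java.vm.info"));
    }
}
```

Running it once with -Xint and once without any VM parameter should make the difference between the two experiments visible.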

If you are interested in the underlying principles, read on.

public class Test {
    static long a = 0;//Shared variable
    public static void main(String[] args) throws Exception {
        Thread work = new Thread(() -> a++);
        work.start();
        a++;
    }
}

According to the JMM abstraction, a lives in main memory, and the working memory of the work thread and the working memory of the main thread each hold a copy of a. Each thread can only access its own working memory. The model is abstracted as shown in the figure below. It is recommended to keep a screenshot of the figure at hand while reading the following description, which uses these abbreviations:

  • main: short for the main thread (work likewise for the work thread)
  • memory 1: short for working memory 1 (memory 2 likewise)
  • a1: short for copy a1 (a2 likewise)

In the normal state, work reads a1, performs the change, and returns the value to memory 1. Memory 1 receives the value returned by work, completes the change to a1, and flushes the value of a1 to a. Main memory then pushes the value to memory 2, so main sees the modification of shared variable a by the other thread. Therefore, when two threads each execute a++ on the shared variable, the value that main reads after its own a++ may be 1 or 2. However, the probability of observing 2 when running the above code by hand is very small, so I set up the test program as follows:

// For the sake of experimental rigor, turn off the JIT with a virtual machine parameter.
// VM parameters: -Xint
public class Test {
    static long a = 0;
    public static void main(String[] args) throws Exception {
        while (true) {
            Thread work = new Thread(() -> a++);
            work.start();
            a++;
            if (a == 2) break;
            work.join();
            a = 0; // Reset
        }
        System.out.println("end");
    }
}
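
The two outcomes described above (1 or 2) can also be reproduced deterministically by writing out the load, increment and flush steps by hand. The following is a single-threaded simulation of the abstract model, not real concurrency; WorkingMemorySimulation and its method names are made up for illustration.

```java
// Sketch only: a single-threaded simulation of the abstract main-memory / working-memory model.
class WorkingMemorySimulation {
    // Both "threads" load a into their working copy before either one flushes.
    static long interleaved() {
        long a = 0;      // shared variable in main memory
        long a1 = a;     // work loads a into memory 1
        long a2 = a;     // main loads a into memory 2
        a1 = a1 + 1;     // work increments its copy
        a2 = a2 + 1;     // main increments its copy
        a = a1;          // memory 1 is flushed to main memory
        a = a2;          // memory 2 is flushed and overwrites the same value
        return a;
    }

    // work completes load, increment and flush before main loads.
    static long sequential() {
        long a = 0;      // shared variable in main memory
        long a1 = a;     // work loads a into memory 1
        a1 = a1 + 1;     // work increments its copy
        a = a1;          // memory 1 is flushed to main memory
        long a2 = a;     // main loads the updated value into memory 2
        a2 = a2 + 1;     // main increments its copy
        a = a2;          // memory 2 is flushed to main memory
        return a;
    }

    public static void main(String[] args) {
        System.out.println(interleaved()); // prints 1: one increment is lost
        System.out.println(sequential());  // prints 2: main saw the change made by work
    }
}
```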

Abnormal state: this refers to code that has been optimized by the JIT (repeatedly executed hot code is translated into machine code), where the key point is how the shared variables used in that code are treated. Code optimization itself is easy to understand: bytecode is compiled to machine code and cached, and machine code generally runs faster than interpreted bytecode. Suppose the JIT optimizes a piece of code run by the work thread (here, the while loop). The code references shared variable a, which must have a copy a1 in the working memory of the work thread. Originally, main memory can update this copy: as long as the main thread changes a2 and synchronizes it to a in main memory, main memory can in turn update a1. However, once the work thread's code has been optimized by the JIT, main memory can no longer update copy a1 in memory 1 (although, conversely, memory 1 can still update shared variable a in main memory), as shown in the figure below.

// This is the same code as the first example above, repeated here for convenience
public class Test {
    static long a= 0; //Shared variable
//  volatile static long a= 0; 
    public static void main(String[] args) throws Exception {
        Thread work = new Thread(() -> {
            while (a== 0) {
            }
//            System.out.println("work finish");
        });
        work.start(); // Start the worker thread
        Thread.sleep(100);
        a= 7;
        work.join();
    }
}
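
One way to picture "main memory can no longer update a1" is to imagine that the compiled loop reads a once and then keeps reusing that value. The sketch below is a hand-written, single-threaded analogy of that behavior, not actual JIT output; HoistingAnalogy, the iteration limit and the writeAt parameter (which stands in for the main thread's write) are all made up for illustration.

```java
// Sketch only: a hand-written analogy of the two behaviors, not what the JIT really emits.
class HoistingAnalogy {
    static long a = 0;

    // Like the interpreted loop: the field is re-read on every iteration.
    // Returns the iteration at which the loop exits, or -1 if it never exits.
    static int rereadEveryIteration(int maxIterations, int writeAt) {
        a = 0;
        for (int i = 0; i < maxIterations; i++) {
            if (i == writeAt) a = 7;   // stands in for the main thread's a = 7
            if (a != 0) return i;      // fresh read of the field sees the write
        }
        return -1;
    }

    // Like the optimized loop in the analogy: the field is read once before the loop.
    static int readOnceBeforeLoop(int maxIterations, int writeAt) {
        a = 0;
        long cached = a;               // single read, reused by every iteration
        for (int i = 0; i < maxIterations; i++) {
            if (i == writeAt) a = 7;   // the write happens, but the loop never looks again
            if (cached != 0) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(rereadEveryIteration(1000, 10)); // prints 10
        System.out.println(readOnceBeforeLoop(1000, 10));   // prints -1
    }
}
```

In the real program there is no iteration limit, so the second behavior shows up as a loop that never ends.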

Let's do another experiment to verify this conclusion. The code is as follows:

// First take a screenshot of the code, then read it together with the text that follows
// VM parameters: none. We want to test the JIT here, so any VM parameters must be cleared
public class Test {
    static long a = 0; //Shared variable
    static long h = 0; //Shared variable

    public static void main(String[] args) throws Exception {
        new Thread(() -> {
            while (a <= 0x7ffffffffL) { h = a; }
        }).start();
        Thread work = new Thread(() -> {
            while (a <= 0x7ffffffffL) { a++; }
        });
        work.start();

        Thread.sleep(1000);
        // Sleeping for one second ensures that both threads have been aggressively optimized by the JIT
        // (a loop body like this is usually JIT-compiled within a few milliseconds)
        System.out.println(h);//127914
        System.out.println(a);//1551985624
        Thread.sleep(100);
        System.out.println(h);//127914
        System.out.println(a);//1689629292

        work.join(); // work needs about 10 seconds to finish, so wait for it here
        System.out.println("work Thread end");
    }
}

First, clear the VM parameters so that the virtual machine runs in mixed mode and the JIT can optimize the code.

Within that one second, the anonymous thread and the work thread are both JIT-optimized. From then on, the work thread keeps updating copy a in its own working memory, and that copy is synchronized to a in main memory, which in turn is synchronized to copy a of the main thread, which has not been JIT-optimized. So the main thread prints different values of a.

The anonymous thread is JIT-optimized at some point within the first second. From then on, shared variable a in main memory is no longer synchronized to the anonymous thread, so the anonymous thread's copy of a never changes again. The value it assigns to h is therefore always the same stale copy of a from its own working memory. The printed value of h is thus the last value of a that was synchronized from main memory before the anonymous thread was JIT-optimized.

This program cannot end normally, because the anonymous thread neither updates its copy of a itself nor receives updates from main memory. The work thread's copy of a is not synchronized from main memory either, but the work thread changes its own copy on every iteration, so it leaves the loop after a certain number of iterations.

Let's look at one more experiment.

// First take a screenshot of the code, then read it together with the text that follows
// VM parameters: none. We want to test the JIT here, so any VM parameters must be cleared
public class Test {
    static long a= 0; //Shared variable
    static long h= 0; //Shared variable
    public static void main(String[] args) throws Exception {
        Thread work = new Thread(() -> {
            while (a <= 0x7ffffffffL) { a++; }
        });
        work.start();
        Thread.sleep(1000); // By now the JIT has aggressively optimized the work loop

        new Thread(() -> {
            long now = a;
            long now1 = a;
            while (now == now1) { //Make sure that the update of a is visible in this newly opened thread
                now = a;
                now1 = a;
            }
            System.out.println("End of inner thread");
        }).start();
        
        long var = a;
        long var1 = a;
        while (var == var1) {//Make sure that the update of a is visible in the main thread
            var = a;
            var1 = a;
        }
        System.out.println("Externally visible");

        work.join();
        System.out.println("end!");
    }
}

By the same logic, after one second the work thread has been JIT-optimized. The work thread keeps updating copy a in its working memory, and the updated copy is synchronized to a in main memory (only the reverse direction is blocked: main memory can no longer assign to the work thread's copy). At this point an anonymous thread is started. Because it has only just started, it has not been JIT-optimized, so its copy of a stays synchronized with main memory. When it reads its copy of a twice, the two results may therefore differ: the work thread keeps updating a in main memory, and the anonymous thread's copy keeps pace with main memory. So the anonymous thread can end normally.

Although the main thread was created long ago, this loop has to execute many times before it is optimized, so at this point it has not been JIT-optimized. The main thread's copy of a is synchronized with main memory, so its loop can stop as well.

Although the work thread's copy of a cannot be assigned from main memory, the work thread updates it itself. Therefore, after some time (about 10 seconds on my machine), work ends and the whole program terminates!
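
When repeating these experiments, it can be convenient to bound the "read twice and compare" loop with a timeout, so that an invisible update shows up as false instead of a hang. The following is a minimal sketch under my own assumptions: ChangeProbe and observesChange are made-up names, and the extra calls inside the loop may themselves change how the JIT treats it, so the result is only an indication.

```java
import java.util.function.LongSupplier;

// Sketch only: ChangeProbe and observesChange are illustrative names.
class ChangeProbe {
    // Reads the value twice per round until the two reads differ or the timeout expires.
    // Returns true if a change was observed, false if the timeout was reached first.
    static boolean observesChange(LongSupplier read, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        long first = read.getAsLong();
        long second = read.getAsLong();
        while (first == second) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: no change became visible in time
            }
            first = read.getAsLong();
            second = read.getAsLong();
        }
        return true;
    }

    public static void main(String[] args) {
        long[] counter = {0};
        // The value changes on every read, so a change is seen immediately
        System.out.println(observesChange(() -> counter[0]++, 100)); // prints true
        // The value never changes, so the probe times out
        System.out.println(observesChange(() -> 42L, 100));          // prints false
    }
}
```

In the experiment above, the probe would be called as observesChange(() -> a, ...) from the anonymous thread and from main in place of the two while loops.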

       

See the figure below.

This article deliberately does not cover cache coherence, JIT tiered compilation, read-write locks, or other low-level topics. They are not needed here; the conclusion above is enough for now.

If anything is still unclear, feel free to ask questions so we can discuss it.

 

 
