Java Concurrent Programming I (theoretical basis)

Posted by Zephyris on Sat, 04 Dec 2021 04:44:39 +0100

Background on threads from basic computer science (operating systems) is omitted here

Thread safety

Relevant definitions are as follows:

A class is called thread safe if it behaves correctly when accessed by multiple threads, no matter how the runtime environment schedules or interleaves the execution of those threads, and with no additional synchronization or coordination required in the calling code

Common thread safe objects:

Stateless object

Stateless objects are always thread safe: they have no fields to access, and the local variables of each method call live on the calling thread's own stack, so calls made from different threads cannot interfere with each other

import java.util.Random;

// A simple stateless object: its single instance method generates a random number from the given seed
public class StateSafeClass {
    public long random(long seed) {
        Random random = new Random(seed);
        return random.nextLong();
    }
}

Atomic operation object

If every access to an object's state is atomic, the object is "thread safe" when accessed by multiple threads at the same time

Common atomic operation objects include AtomicInteger, AtomicLong, AtomicBoolean, etc
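
As a minimal sketch of an object whose only state is a single atomic variable, the counter below uses AtomicInteger.incrementAndGet(), so each increment is one atomic read-modify-write. The class name AtomicCounter and the helper runDemo are hypothetical names chosen for this illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class: its entire state is one AtomicInteger
class AtomicCounter {
    private final AtomicInteger count = new AtomicInteger();

    // incrementAndGet() performs the read-modify-write as a single atomic step
    int increment() {
        return count.incrementAndGet();
    }

    int get() {
        return count.get();
    }

    // Starts `threads` workers that each increment `perThread` times,
    // waits for all of them, and returns the final count
    static int runDemo(int threads, int perThread) {
        AtomicCounter counter = new AtomicCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }
}
```

Calling AtomicCounter.runDemo(4, 10000) should return 40000 on every run, because no increment can be lost. With a plain int field and cnt++ the result could be lower.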

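When a "thread safe" object is composed from several atomic objects, the resulting class is "thread safe" only if all of the related state is updated together in a single atomic operation

Immutable object

An immutable object cannot change after it is constructed, so it can always be shared safely between threads; see the "Immutable object" section below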

Locking mechanism

Thread safety of code blocks can be achieved by locking. This works because the Java Memory Model (JMM) defines a partial order ("happens-before") rule for locks:

Monitor lock rule: an unlock operation on a monitor lock happens-before every subsequent lock operation on the same monitor lock (explicit locks and built-in locks have the same memory semantics for locking and unlocking)

This rule is the fundamental reason why locking makes code thread safe: everything a thread did before releasing a lock is visible to the next thread that acquires the same lock

Built in lock

The built-in lock is used through the Java keyword synchronized, which surrounds a code block so that only one thread at a time can execute it while holding the same lock

An example of using the synchronized keyword is as follows:

public class SynchronizedExample {
    // The object used to obtain the monitor lock. For details, see the MarkWord section in the structure of Java objects
    final Object object = new Object();
    
    // The state of the current object
    int cnt = 0;
    
    // The synchronized block keeps updates to this object's state thread safe
    public void plus() {
        // object acts as the "key" of the lock: only one thread can hold it at any time
        synchronized (object) {
            cnt++;
        }
    }
}

The built-in lock in Java is "reentrant": a thread that already holds a built-in lock can acquire the same lock again, for as long as it still holds it, and enter the corresponding code block without blocking.

Reentrancy is a necessary part of the design of the built-in lock. See the following example:

public class Parent {
    // When synchronized modifies an instance method, the "key" that is acquired is this, i.e. the current object
    public synchronized void parentDo() {
        // do something.......
    }
}

/* 
	Child inherits from Parent, and both methods lock on the same object (this).
	If the built-in lock were not reentrant, the call to the parent class's parentDo()
	from inside childDo() would deadlock, because the thread would wait for a lock it already holds.
*/
public class Child extends Parent {
    public synchronized void childDo() {
        // child do something.....
        super.parentDo();
    }
}

Different JVMs may implement the reentrancy of built-in locks differently. A common approach is to associate each lock with an acquisition count and an owner thread. A count of 0 means the lock is not held by any thread. Each time the owning thread acquires the lock again, the count is incremented by 1; each time it exits a synchronized block, the count is decremented by 1. When the count returns to 0, the lock is released
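
The count of a built-in lock cannot be observed from Java code, so the sketch below uses java.util.concurrent.locks.ReentrantLock as a stand-in (an assumption made for illustration only). Its getHoldCount() method reports how many times the current thread holds the lock, which makes the counting scheme described above visible. The class name HoldCountDemo is hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: records the hold count at four points
class HoldCountDemo {
    private final ReentrantLock lock = new ReentrantLock();

    // Returns the hold counts observed after: the first lock(), the nested lock(),
    // the inner unlock(), and the outer unlock()
    int[] observe() {
        int[] seen = new int[4];
        lock.lock();
        try {
            seen[0] = lock.getHoldCount();      // first acquisition
            lock.lock();                        // reentrant acquisition by the same thread
            try {
                seen[1] = lock.getHoldCount();
            } finally {
                lock.unlock();                  // count goes down, lock is still held
            }
            seen[2] = lock.getHoldCount();
        } finally {
            lock.unlock();                      // count reaches 0, lock is released
        }
        seen[3] = lock.getHoldCount();
        return seen;
    }
}
```

new HoldCountDemo().observe() returns {1, 2, 1, 0}: the nested acquisition raises the count instead of blocking, and the lock is only released when the count falls back to 0.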

Explicit lock

ReentrantLock is a concrete implementation of java.util.concurrent.locks.Lock built on AQS (AbstractQueuedSynchronizer). Unlike the built-in lock used through the synchronized keyword, an explicit lock offers unconditional, pollable, timed and interruptible lock acquisition. Acquiring and releasing the lock are both explicit method calls, hence the name "explicit lock"

Explicit locks provide the same mutual exclusion and visibility guarantees (memory semantics) as built-in locks
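
A minimal sketch of unconditional and timed acquisition with ReentrantLock follows. The class TimedCounter and its method names are hypothetical. The pattern to note is that unlock() always sits in a finally block, because nothing releases an explicit lock automatically.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class guarding an int with an explicit lock
class TimedCounter {
    private final Lock lock = new ReentrantLock();
    private int cnt = 0;

    // Unconditional acquisition: blocks until the lock is available
    void plus() {
        lock.lock();
        try {
            cnt++;
        } finally {
            lock.unlock();   // always release in finally
        }
    }

    // Timed, interruptible acquisition: gives up after the timeout instead of waiting forever
    boolean tryPlus(long timeout, TimeUnit unit) {
        try {
            if (!lock.tryLock(timeout, unit)) {
                return false;                      // lock not obtained in time
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();    // restore the interrupt status
            return false;
        }
        try {
            cnt++;
            return true;
        } finally {
            lock.unlock();
        }
    }

    int get() {
        lock.lock();
        try {
            return cnt;
        } finally {
            lock.unlock();
        }
    }
}
```

A caller can use tryPlus(100, TimeUnit.MILLISECONDS) to back off when the lock is contended, something a synchronized block cannot express.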

Object sharing

Visibility

Visibility is also governed by the partial order ("happens-before") relationships defined in the JMM:

Program order rule: if operation A comes before operation B in the program order of a thread, then A happens-before B within that thread

Monitor lock rule: an unlock operation on a monitor lock happens-before every subsequent lock operation on the same monitor lock

Volatile variable rule: a write to a volatile variable happens-before every subsequent read of that variable (atomic variables have the same read and write semantics as volatile variables)

Thread start rule: a call to Thread.start() on a thread happens-before every operation in the started thread

Thread end rule: every operation in a thread happens-before any other thread detects that the thread has ended, either by returning successfully from Thread.join() or by Thread.isAlive() returning false

Interrupt rule: a call to interrupt() on another thread happens-before the interrupted thread detects the interrupt (by having InterruptedException thrown, or by calling isInterrupted() or interrupted())

Finalizer rule: the end of an object's constructor happens-before the start of that object's finalizer

Transitivity: if operation A happens-before operation B and operation B happens-before operation C, then A happens-before C
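
The sketch below combines three of these rules (program order, volatile variable, transitivity) in a simple hand-off. The class VolatileHandoff and its members are hypothetical names. The payload field is deliberately not volatile: its visibility is assumed to come only from the volatile flag written after it.

```java
// Hypothetical example: a plain field published through a volatile flag
class VolatileHandoff {
    private int payload;              // plain field, not volatile
    private volatile boolean ready;   // volatile flag that publishes payload

    void write(int value) {
        payload = value;   // (1) ordinary write
        ready = true;      // (2) volatile write; program order puts (1) before (2)
    }

    int awaitAndRead() {
        while (!ready) {   // (3) volatile read; loops until it observes the write in (2)
            Thread.yield();
        }
        return payload;    // (4) program order puts (3) before (4)
    }

    // Starts a writer thread and reads the value in the calling thread
    static int runDemo() {
        VolatileHandoff handoff = new VolatileHandoff();
        Thread writer = new Thread(() -> handoff.write(42));
        writer.start();
        return handoff.awaitAndRead();
    }
}
```

Because (1) happens-before (2), (2) happens-before the read in (3) that sees true, and (3) happens-before (4), transitivity guarantees that runDemo() returns 42. If ready were not volatile, the reader could loop forever or return a stale payload.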

Where none of these rules applies, the compiler, the JVM runtime and the processor are free to change the execution order of operations in unexpected ways, which is known as "instruction reordering". A program has to rely on the partial order rules defined by the JMM to make sure it runs according to its intended logic

Minimum security

When a thread reads a variable without synchronization, it may see a stale value, but that value was at least written by some thread earlier and is not a random value. This guarantee is known as "minimum security" (Java Concurrency in Practice calls it "out-of-thin-air safety")

"Minimum security" applies to all variables with one exception: 64-bit numeric variables (double and long) that are not declared volatile

The JMM requires reads and writes of variables to be atomic. However, for non-volatile long and double variables, the JVM is allowed to split a 64-bit read or write into two 32-bit operations. A thread can then read the high 32 bits of one value and the low 32 bits of another, so "minimum security" no longer holds

To solve this problem, either declare the variable volatile or protect it with a lock. A write to a volatile variable happens-before every subsequent read of it by another thread, and releasing a lock happens-before the next acquisition of the same lock. In terms of the JMM partial order rules, locking provides everything volatile provides (visibility) plus mutual exclusion
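
Both remedies can be sketched in a few lines. The class Timestamps and its fields are hypothetical. Note that volatile only makes individual reads and writes atomic; a compound update such as += still needs a lock.

```java
// Hypothetical example: two ways to protect a 64-bit value
class Timestamps {
    // Option 1: volatile guarantees atomic reads and writes of the long
    private volatile long lastSeen;

    // Option 2: a plain long guarded by this object's monitor lock
    private long total;

    void record(long millis) {
        lastSeen = millis;          // single atomic 64-bit write
    }

    long lastSeen() {
        return lastSeen;            // single atomic 64-bit read
    }

    synchronized void add(long delta) {
        total += delta;             // read-modify-write, so it needs the lock
    }

    synchronized long total() {
        return total;               // reads must use the same lock to see the latest value
    }
}
```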

Object publishing and escaping

"Object publishing" means making an object available to code outside its current scope, for example by storing a reference to it where other code can reach it, returning the reference from a non-private method, or passing the reference to a method of another class.

"Object escape" means that an object is published when it should not have been

Take the following example:

// An example of unsafe publication
public class UnsafePublisher {
    private String[] states = new String[] {
        "Apple", "Orange", "Strawberry", "Watermelon"
    };
    
    /* 
    	getStates() hands out the internal states array itself. Any caller of this method
    	can directly modify the contents of states, which is not thread safe
    */
    public String[] getStates() {return this.states;}
}

The UnsafePublisher example is a case of explicit escape. It is also worth noting that the this object (the current instance) can be published implicitly:

public class ThisEscape {
    public ThisEscape(EventSource source) {
        /*
        	Publishing the EventListener from the constructor implicitly publishes this as well,
        	because the anonymous inner EventListener holds a reference to the enclosing ThisEscape object
        	(an instance of a non-static inner class carries a reference to its outer this. See Effective Java for more details)
        */
        
        /* 
        	The problem is that the ThisEscape object escapes while it is still being constructed,
        	so outside code may access a this object that has not been fully initialized
        */
        source.registerListener(new EventListener() {
            public void onEvent(Event e) {
                doSomething(e);
            }
        });
    }

    void doSomething(Event e) {}
    interface EventSource {void registerListener(EventListener e);}
    interface EventListener {void onEvent(Event e);}
    interface Event {}
}

A common error is to start a thread in the constructor. The this object is then shared with the new thread, which may see an instance that is not yet fully constructed.

Similarly, calling an overridable instance method (one that is neither private nor final) from the constructor lets the this object escape.
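
The effect of calling an overridable method from a constructor can be reproduced in a single thread. In the sketch below (EscapingBase and EscapingChild are hypothetical names), the override runs before the subclass constructor has assigned its field, so it observes the default value null.

```java
// Hypothetical base class that calls an overridable method from its constructor
class EscapingBase {
    EscapingBase() {
        describe();   // this is handed to subclass code before construction is complete
    }

    void describe() {
    }
}

class EscapingChild extends EscapingBase {
    private final String name;

    // No initializer on purpose: a field initializer would run after EscapingBase()
    // returns and overwrite the value recorded by describe()
    String seenInConstructor;

    EscapingChild(String name) {
        super();             // describe() runs here, before the assignment below
        this.name = name;
    }

    @Override
    void describe() {
        // name has not been assigned yet, so this records the text "null"
        seenInConstructor = String.valueOf(name);
    }

    String name() {
        return name;
    }
}
```

After new EscapingChild("ark"), seenInConstructor holds "null" while name() returns "ark". Even a final field was observed in its uninitialized state, which is the same hazard another thread faces when this escapes during construction.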

If a constructor needs to start a thread or register an event listener, the best solution is to make the constructor private and expose a factory method that creates the instance, so that the thread is started or the listener is registered only after construction has finished:

public class SafeListener {
    private final EventListener listener;

    private SafeListener() {
        listener = new EventListener() {
            public void onEvent(Event e) {
                doSomething(e);
            }
        };
    }
    
    // In this way, the half constructed instance object will not be published elsewhere
    public static SafeListener newInstance(EventSource source) {
        SafeListener safe = new SafeListener();
        source.registerListener(safe.listener);
        return safe;
    }
    
    // Omit the definition of some interfaces....
}

Thread closure

Access to shared mutable data is usually synchronized with locks, so that changes to the data are visible to other threads. Another way to protect mutable data is not to share it at all: if each thread only accesses its own data, unsafe access between threads cannot occur in the first place. This technique is called "thread closure" (usually known in English as "thread confinement") and is one of the simplest ways to achieve thread safety

When an object is confined to one thread, its use is automatically thread safe, even if the confined object itself is not thread safe

Example: the JDBC Connection object. The JDBC specification does not require Connection objects to be thread safe. A thread obtains a Connection, uses it to process one task and then releases it, and no other thread is given the same Connection in the meantime. Only one thread takes part in the whole process, so the Connection implementation does not need to be thread safe. If a connection pool is used, however, the pool itself must be thread safe, because it is always accessed by multiple threads at the same time

Ad hoc thread closure

Ad hoc thread closure means that the responsibility for maintaining thread closure falls entirely on the program implementation

Ad hoc thread closure is generally fragile, because no language feature confines the object to the target thread. It is typically used when a subsystem is designed to be single-threaded; the simplicity that a single-threaded subsystem provides can outweigh the fragility of ad hoc thread closure

Thread closure of volatile variables: if you can guarantee that only a single thread ever writes to a shared volatile variable, it is safe to perform "read-modify-write" operations on that variable. Recall the partial order rules provided by the JMM: a write to a volatile variable happens-before subsequent reads of that variable, so other threads always see the latest value. Confining the modification to a single thread in this way is a form of "thread closure"
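
A minimal sketch of the single-writer pattern, under the assumption that increment() is only ever called from one thread. The class SingleWriterCounter is hypothetical, and nothing in the code enforces the single-writer convention, which is exactly why this form of closure is fragile.

```java
// Hypothetical example: a volatile counter with exactly one writer thread
class SingleWriterCounter {
    private volatile int value;

    // Convention (not enforced): only the single writer thread may call this.
    // value = value + 1 is a read-modify-write and would lose updates with two writers.
    void increment() {
        value = value + 1;
    }

    // Any thread may read; volatile makes the latest write visible
    int get() {
        return value;
    }

    // Runs one writer thread that performs `increments` updates and returns the final value
    static int runDemo(int increments) {
        SingleWriterCounter counter = new SingleWriterCounter();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < increments; i++) {
                counter.increment();
            }
        });
        writer.start();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```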

Because ad hoc thread closure is so fragile, it should be used sparingly

Stack closure

Stack closure is a special case of thread closure in which an object can only be reached through local variables. Local variables are inherently confined to the executing thread: they live on that thread's stack, which other threads cannot access

A specific example is as follows:

// Source: "Java Concurrent Programming Practice" (Java Concurrency in Practice), listing 3-9

/*
	All local variables are confined to the method: every thread that calls it runs on its own stack.
	The elements of the parameter collection are copied into a new local container, so later changes that
	other threads make to the parameter collection do not affect the iteration below
*/
public int loadTheArk(Collection<Animal> candidates) {
    SortedSet<Animal> animals;
    int numPairs = 0;
    Animal candidate = null;
    
    /*
    	Thanks to stack closure, loadTheArk is still thread safe with respect to animals even though TreeSet is not thread safe,
    	because only the thread executing the current call can reach this object.
    	If a reference to animals escaped the method, the stack closure would be broken and the thread safety guarantee lost
    */
    animals = new TreeSet<>(new SpeciesGenderComparator());
    
    /*
    	Copy the passed-in elements into the new local container, so the loop works on the confined
    	container rather than on the shared parameter collection, which keeps the method thread safe
    */
    animals.addAll(candidates);
    for (Animal a : animals) {
        if (candidate == null || !candidate.isPotentialMate(a))
            candidate = a;
        else {
            ark.load(new AnimalPair(candidate, a));
            ++numPairs;
            candidate = null;
        }
    }
    return numPairs;
}

ThreadLocal object

A more formal way to maintain thread closure is to use a ThreadLocal object, which associates a value with each thread that uses it. ThreadLocal provides get and set methods that keep an independent copy of the value for every thread, so get always returns the most recent value that the currently executing thread passed to set

Usage scenario:

Prevent sharing of mutable singleton objects or local variables
When frequent operations require a temporary object, and you want to avoid re creating an instance object every time
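
The per-thread copies can be observed directly. In the sketch below (ThreadLocalDemo and NAME are hypothetical names), a worker thread sets its own value without affecting the value held by the calling thread.

```java
// Hypothetical demo: each thread sees its own copy of a ThreadLocal value
class ThreadLocalDemo {
    private static final ThreadLocal<String> NAME = ThreadLocal.withInitial(() -> "unset");

    // Returns {worker's value before set, worker's value after set,
    //          caller's value after the worker has finished}
    static String[] runDemo() {
        String[] seen = new String[3];
        NAME.set("caller");
        Thread worker = new Thread(() -> {
            seen[0] = NAME.get();   // the worker starts with its own initial value
            NAME.set("worker");
            seen[1] = NAME.get();
        });
        worker.start();
        try {
            worker.join();          // join makes the worker's writes to seen visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        seen[2] = NAME.get();       // unaffected by the worker's set
        NAME.remove();              // clean up the calling thread's copy
        return seen;
    }
}
```

ThreadLocalDemo.runDemo() returns {"unset", "worker", "caller"}. Calling remove() when the value is no longer needed matters most in thread pools, where threads outlive individual tasks.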

Immutable object

Another way to meet synchronization requirements is to use immutable objects, also known as value objects in domain driven design

If an object cannot be modified after being created, it is called an "immutable object"

An object is immutable only if all three of the following conditions hold:

Its state cannot be modified after it is created
All of its fields are declared final (String is a well-known exception: its cached hash code field is not final, yet String is still immutable)
It is properly constructed (the this reference does not escape during construction)
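
A minimal sketch of a class that meets the three conditions follows. FruitBasket and its members are hypothetical names. The defensive copy in the constructor matters: a final field that points at a list the caller can still change would not make the object immutable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable class: final class, final fields, no setters, no escaping this
final class FruitBasket {
    private final String owner;
    private final List<String> fruits;

    FruitBasket(String owner, List<String> fruits) {
        this.owner = owner;
        // Defensive copy wrapped in an unmodifiable view, so neither the caller's
        // list nor the returned list can be used to change this object's state
        this.fruits = Collections.unmodifiableList(new ArrayList<>(fruits));
    }

    String owner() {
        return owner;
    }

    List<String> fruits() {
        return fruits;   // mutator calls on this view throw UnsupportedOperationException
    }

    // A "modification" returns a new object and leaves this one untouched
    FruitBasket with(String fruit) {
        List<String> copy = new ArrayList<>(fruits);
        copy.add(fruit);
        return new FruitBasket(owner, copy);
    }
}
```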

Publishing objects safely

An example of a class that can fail when it is published unsafely:

public class Holder {
    private int n;

    public Holder(int n) {
        this.n = n;
    }

    public void assertSanity() {
        if (n != n)
            throw new AssertionError("This statement is false.");
    }
}

At first glance there is nothing wrong with this class, but if a Holder is published without synchronization, a thread other than the publishing thread may get an AssertionError when it calls assertSanity(). Without a sufficient visibility mechanism, that thread can see the reference to the Holder before it sees the write to n made in the constructor. The first read of n in n != n may then return the default value while the second read returns the value set by the constructor, so the two reads differ and the exception is thrown

If you want to publish an object safely, you can consider the following options:

Initialize the object reference in a static initializer (executed when the JVM initializes the class)
Store the reference in a volatile field or in an AtomicReference object
Store the reference in a final field of a properly constructed object
Store the reference in a field that is protected by a lock

All of these options except the static initializer rely on the JMM partial order rules to make the instantiated object visible. With a static initializer, visibility is guaranteed by the JVM itself, which synchronizes class initialization internally.
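
The three rule-based options might look as follows. SafeHolder and Publications are hypothetical names introduced for this sketch; SafeHolder also declares its field final, which is an additional assumption beyond the Holder class shown earlier.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical value class with a final field
class SafeHolder {
    private final int n;

    SafeHolder(int n) {
        this.n = n;
    }

    int value() {
        return n;
    }
}

// Hypothetical class showing three ways to publish a SafeHolder reference
class Publications {
    // Option: volatile reference (volatile variable rule)
    private volatile SafeHolder viaVolatile;

    // Option: AtomicReference, which has the same memory semantics as volatile
    private final AtomicReference<SafeHolder> viaAtomic = new AtomicReference<>();

    // Option: plain field, every access guarded by this object's lock (monitor lock rule)
    private SafeHolder viaLock;

    void publish(int n) {
        viaVolatile = new SafeHolder(n);
        viaAtomic.set(new SafeHolder(n));
        synchronized (this) {
            viaLock = new SafeHolder(n);
        }
    }

    SafeHolder readVolatile() {
        return viaVolatile;
    }

    SafeHolder readAtomic() {
        return viaAtomic.get();
    }

    synchronized SafeHolder readLocked() {
        return viaLock;   // readers must take the same lock as the writer
    }
}
```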

An example that uses static initialization is as follows:

public class Holder {
    private final static Object object = new Object();
}

The example above achieves visibility by initializing the object during static initialization of the class. The corresponding bytecode (as printed by javap -c) shows the assignment inside the static initializer:

Compiled from "Holder.java"
public class Holder {
  public Holder();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  static {};
    Code:
       0: new           #2                  // class java/lang/Object
       3: dup
       4: invokespecial #1                  // Method java/lang/Object."<init>":()V
       7: putstatic     #3                  // Field object:Ljava/lang/Object;
      10: return
}

References:

[1] Java Concurrency in Practice, Brian Goetz, Tim Peierls, et al.
