This is the third article in the deep-dive series on the underlying implementation of synchronized; it covers the heavyweight lock.
This series comprehensively analyzes HotSpot's synchronized implementation, covering biased locks, lightweight locks, and heavyweight locks: locking, unlocking, the lock-upgrade process, and the relevant source code, in the hope of helping readers on their way to mastering synchronized.
Heavyweight Lock Inflation and Locking Process
When multiple threads compete for the lock at the same time, execution enters synchronizer.cpp#slow_enter:
void ObjectSynchronizer::slow_enter(Handle obj, BasicLock* lock, TRAPS) {
  markOop mark = obj->mark();
  assert(!mark->has_bias_pattern(), "should not see bias pattern here");

  // If the lock is unlocked (neutral)
  if (mark->is_neutral()) {
    lock->set_displaced_header(mark);
    if (mark == (markOop) Atomic::cmpxchg_ptr(lock, obj()->mark_addr(), mark)) {
      TEVENT (slow_enter: release stacklock) ;
      return ;
    }
    // Fall through to inflate() ...
  } else
  // Lightweight lock reentry
  if (mark->has_locker() && THREAD->is_lock_owned((address)mark->locker())) {
    assert(lock != mark->locker(), "must not re-lock the same lock");
    assert(lock != (BasicLock*)obj->mark(), "don't relock with same BasicLock");
    lock->set_displaced_header(NULL);
    return;
  }

  ...
  // At this point we need to inflate to a heavyweight lock. Before inflating, set the
  // Displaced Mark Word to a special value to indicate that this lock uses a heavyweight monitor.
  lock->set_displaced_header(markOopDesc::unused_mark());
  // Inflate to a heavyweight lock; inflate() returns an ObjectMonitor, then call its enter method
  ObjectSynchronizer::inflate(THREAD, obj())->enter(THREAD);
}
The inflation itself happens in inflate:
ObjectMonitor * ATTR ObjectSynchronizer::inflate (Thread * Self, oop object) {
  ...
  for (;;) {
    const markOop mark = object->mark() ;
    assert (!mark->has_bias_pattern(), "invariant") ;

    // mark is in one of the following states:
    // *  Inflated (heavyweight lock)     - return directly
    // *  Stack-locked (lightweight lock) - inflate
    // *  INFLATING                       - busy-wait until inflation completes
    // *  Neutral (unlocked)              - inflate
    // *  BIASED (biased lock)            - illegal state, cannot occur here

    // CASE: inflated
    if (mark->has_monitor()) {
      // Already in the heavyweight lock state, return directly
      ObjectMonitor * inf = mark->monitor() ;
      ...
      return inf ;
    }

    // CASE: inflation in progress
    if (mark == markOopDesc::INFLATING()) {
      // Another thread is inflating this lock, retry
      TEVENT (Inflate: spin while INFLATING) ;
      // ReadStableMark spins/yields/parks until inflation completes
      ReadStableMark(object) ;
      continue ;
    }

    if (mark->has_locker()) {
      // Currently stack-locked (lightweight lock): first allocate an ObjectMonitor and initialize it
      ObjectMonitor * m = omAlloc (Self) ;
      m->Recycle();
      m->_Responsible  = NULL ;
      m->OwnerIsThread = 0 ;
      m->_recursions   = 0 ;
      m->_SpinDuration = ObjectMonitor::Knob_SpinLimit ;   // Consider: maintain by type/class

      // Set the mark word of the lock object to the INFLATING (0) state
      markOop cmp = (markOop) Atomic::cmpxchg_ptr (markOopDesc::INFLATING(), object->mark_addr(), mark) ;
      if (cmp != mark) {
        omRelease (Self, m, true) ;
        continue ;       // Interference -- just retry
      }

      // Displaced mark word stored on the owner's stack
      markOop dmw = mark->displaced_mark_helper() ;
      assert (dmw->is_neutral(), "invariant") ;

      // Set the monitor's fields
      m->set_header(dmw) ;
      // owner is the Lock Record
      m->set_owner(mark->locker());
      m->set_object(object);
      ...
      // Set the lock object's header to the heavyweight lock state
      object->release_set_mark(markOopDesc::encode(m));
      ...
      return m ;
    }

    // CASE: neutral
    // Allocate and initialize an ObjectMonitor object
    ObjectMonitor * m = omAlloc (Self) ;
    // prepare m for installation - set monitor to initial state
    m->Recycle();
    m->set_header(mark);
    // owner is NULL
    m->set_owner(NULL);
    m->set_object(object);
    m->OwnerIsThread = 1 ;
    m->_recursions   = 0 ;
    m->_Responsible  = NULL ;
    m->_SpinDuration = ObjectMonitor::Knob_SpinLimit ;       // consider: keep metastats by type/class

    // CAS the object header's mark word to the heavyweight lock state
    if (Atomic::cmpxchg_ptr (markOopDesc::encode(m), object->mark_addr(), mark) != mark) {
      // Failure means another thread executed inflate first; release the monitor object
      m->set_object (NULL) ;
      m->set_owner  (NULL) ;
      m->OwnerIsThread = 0 ;
      m->Recycle() ;
      omRelease (Self, m, true) ;
      m = NULL ;
      continue ;
      // interference - the markword changed - just retry.
      // The state-transitions are one-way, so there's no chance of
      // live-lock -- "Inflated" is an absorbing state.
    }
    ...
    return m ;
  }
}
inflate is built around a for loop, mainly to handle multiple threads calling inflate at the same time. It then branches on the state of the lock object:
1. Already a heavyweight lock: inflation has finished, return the monitor directly.
2. Lightweight lock: needs to be inflated.
3. Inflation in progress: busy-wait until it finishes.
4. Unlocked: needs to be inflated.
Both the lightweight-lock state and the unlocked state require inflation. The inflation process for a lightweight lock is as follows:
1. Call omAlloc to allocate an ObjectMonitor object (monitor for short). omAlloc first allocates from the thread-private free list omFreeList; if omFreeList is empty, it takes a batch of monitors from the JVM's global gFreeList and puts them on omFreeList (a minimal sketch of this two-level free list follows after this list).
2. Initialize the monitor object.
3. CAS the lock object's mark word to the INFLATING (0) state.
4. Set the monitor's header field to the displaced mark word, its owner field to the Lock Record, and its obj field to the lock object.
5. Set the lock object's mark word to the heavyweight-lock state, pointing to the monitor allocated in step 1.
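To make step 1 more concrete, here is a minimal, hypothetical Java sketch of the two-level free-list idea behind omAlloc: a thread takes monitors from its own free list first and only refills from a shared global list (in batches) when the local list is empty. The class and field names are invented for illustration; HotSpot's real omAlloc/gFreeList code is C++ and considerably more involved.

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ConcurrentLinkedDeque;

// Hypothetical sketch of a two-level free list (not HotSpot code).
class MonitorPool {
    static class Monitor { /* header, owner, object ... */ }

    // Global free list shared by all threads (stand-in for gFreeList)
    private static final ConcurrentLinkedDeque<Monitor> globalFree = new ConcurrentLinkedDeque<>();

    // Per-thread free list (stand-in for omFreeList)
    private static final ThreadLocal<Deque<Monitor>> localFree =
            ThreadLocal.withInitial(ArrayDeque::new);

    private static final int BATCH = 32;

    static Monitor allocate() {
        Deque<Monitor> local = localFree.get();
        Monitor m = local.poll();
        if (m != null) return m;               // fast path: thread-private list

        // Slow path: refill the local list with a batch from the global list,
        // allocating new monitors if the global list is also empty.
        for (int i = 0; i < BATCH; i++) {
            Monitor g = globalFree.poll();
            local.push(g != null ? g : new Monitor());
        }
        return local.poll();
    }

    static void release(Monitor m) {
        localFree.get().push(m);               // return to the thread-private list
    }
}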
The inflation process in the unlocked state is as follows:
1. Call omAlloc to allocate an ObjectMonitor object (monitor for short).
2. Initialize the monitor object.
3. Set the monitor's header field to the mark word, its owner field to null, and its obj field to the lock object.
4. CAS the lock object's mark word to the heavyweight-lock state, pointing to the monitor allocated in step 1 (a conceptual sketch of this final CAS follows below).
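The final step of both paths is the same: a single CAS swings the object's mark word to the heavyweight state, with the rest of the word pointing to the monitor. A rough conceptual model of that step in Java, treating the mark word as an AtomicLong and using invented tag bits (HotSpot's real encoding differs by platform and version), might look like this:

import java.util.concurrent.atomic.AtomicLong;

// Conceptual model only: the real mark word lives in the object header and is manipulated in C++.
class MarkWordModel {
    // Invented tag bits for illustration; not HotSpot's actual layout.
    static final long UNLOCKED  = 0b01;   // neutral
    static final long MONITOR   = 0b10;   // heavyweight: rest of the word "points" to the monitor
    static final long INFLATING = 0b00;   // busy marker used during inflation

    final AtomicLong markWord = new AtomicLong(UNLOCKED);

    // Try to install a monitor: succeeds only if the mark word is still the value we observed.
    boolean installMonitor(long observedMark, long monitorId) {
        long inflated = (monitorId << 2) | MONITOR;
        return markWord.compareAndSet(observedMark, inflated);
    }
}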
As for why the lightweight-lock path needs the intermediate INFLATING state, the comment in the source explains:
// Why do we CAS a 0 into the mark-word instead of just CASing the
// mark-word from the stack-locked value directly to the new inflated state?
// Consider what happens when a thread unlocks a stack-locked object.
// It attempts to use CAS to swing the displaced header value from the
// on-stack basiclock back into the object header. Recall also that the
// header value (hashcode, etc) can reside in (a) the object header, or
// (b) a displaced header associated with the stack-lock, or (c) a displaced
// header in an objectMonitor. The inflate() routine must copy the header
// value from the basiclock on the owner's stack to the objectMonitor, all
// the while preserving the hashCode stability invariants. If the owner
// decides to release the lock while the value is 0, the unlock will fail
// and control will eventually pass from slow_exit() to inflate. The owner
// will then spin, waiting for the 0 value to disappear. Put another way,
// the 0 causes the owner to stall if the owner happens to try to
// drop the lock (restoring the header from the basiclock to the object)
// while inflation is in-progress. This protocol avoids races that might
// would otherwise permit hashCode values to change or "flicker" for an object.
// Critically, while object->mark is 0 mark->displaced_mark_helper() is stable.
// 0 serves as a "BUSY" inflate-in-progress indicator.
I don't fully understand this part; readers who do are welcome to point it out. Roughly, while inflation is in progress the mark word is held at 0 as a "busy" marker, so an owner that tries to unlock at that moment fails its CAS and spins until inflation finishes, which keeps the displaced header (hash code, etc.) stable.
Once inflation is complete, the monitor's enter method is called to acquire the lock:
void ATTR ObjectMonitor::enter(TRAPS) {
  Thread * const Self = THREAD ;
  void * cur ;
  // A NULL owner means the monitor is unlocked; if the CAS succeeds the current thread acquires the lock directly
  cur = Atomic::cmpxchg_ptr (Self, &_owner, NULL) ;
  if (cur == NULL) {
     ...
     return ;
  }

  // Reentry
  if (cur == Self) {
     // TODO-FIXME: check for integer overflow!  BUGID 6557169.
     _recursions ++ ;
     return ;
  }

  // The current thread is the thread that previously held the lightweight lock.
  // On the first call to enter after inflating from a lightweight lock, cur is a pointer to the Lock Record
  if (Self->is_lock_owned ((address)cur)) {
    assert (_recursions == 0, "internal state error");
    // Reset the reentry count to 1
    _recursions = 1 ;
    // Set the owner field to the current thread (previously owner was a pointer to the Lock Record)
    _owner = Self ;
    OwnerIsThread = 1 ;
    return ;
  }

  ...

  // Before falling back to OS-level synchronization, try spinning to acquire the lock
  if (Knob_SpinEarly && TrySpin (Self) > 0) {
     ...
     // If the lock was acquired while spinning, return directly
     Self->_Stalled = 0 ;
     return ;
  }

  ...

  {
    ...
    for (;;) {
      jt->set_suspend_equivalent();
      // EnterI performs the OS-level synchronization
      EnterI (THREAD) ;
      ...
    }
    Self->set_current_pending_monitor(NULL);
  }
  ...
}
If the monitor is unlocked, if this is a reentrant acquisition, or if the current thread is the one that previously held the lightweight lock, enter returns after a simple operation.
Otherwise the thread first spins trying to acquire the lock, to avoid the cost of an OS-level synchronization call.
Finally it calls EnterI to acquire the lock or block.
The EnterI method is rather long; before reading it, let's go over the general principles:
An ObjectMonitor has several key fields: cxq (also called the ContentionList), EntryList, WaitSet, and owner.
cxq, EntryList, and WaitSet are all linked lists of ObjectWaiter nodes, and owner points to the thread holding the lock.
When a thread tries to acquire the lock and the lock is already held, the thread is wrapped in an ObjectWaiter object, inserted at the head of the cxq queue, and then the park function is called to suspend it. On Linux, park ultimately calls pthread_cond_wait from glibc, which is also how the JDK's ReentrantLock suspends threads. More details can be found in my two earlier articles: Consideration on Synchronization and linux kernel-level synchronization mechanism-futex.
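As a JDK-level illustration of the same suspend/resume primitive, LockSupport.park/unpark, which ReentrantLock uses via AQS, shows the shape of what the monitor's park event does for a blocked thread:

import java.util.concurrent.locks.LockSupport;

public class ParkDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread waiter = new Thread(() -> {
            System.out.println("waiter: about to park");
            LockSupport.park();                       // suspend until unparked
            System.out.println("waiter: resumed");
        }, "waiter");

        waiter.start();
        Thread.sleep(200);                            // give the waiter time to park
        System.out.println("main: unparking waiter");
        LockSupport.unpark(waiter);                   // wake it up
        waiter.join();
    }
}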
When a thread releases the lock, it picks one thread from cxq or EntryList to wake up. The chosen thread is called the heir presumptive: it will try to acquire the lock after being woken, but because synchronized is unfair, the heir presumptive is not guaranteed to get the lock (which is why it is only a "presumed" heir).
If a thread calls Object.wait after acquiring the lock, it is added to the WaitSet. When it is later woken by Object.notify, it is moved from the WaitSet to cxq or EntryList. Note that when wait or notify is called on a lock object that is currently biased or lightweight-locked, the lock is first inflated to a heavyweight lock.
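A small wait/notify demo makes the WaitSet behaviour concrete; note that simply calling lock.wait() is enough to inflate the lock to a heavyweight monitor, even without contention:

public class WaitNotifyDemo {
    private static final Object lock = new Object();

    public static void main(String[] args) throws InterruptedException {
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                try {
                    System.out.println("waiter: calling wait(), moving into the WaitSet");
                    lock.wait();                       // releases the monitor and waits
                    System.out.println("waiter: notified, monitor re-acquired");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "waiter");

        waiter.start();
        Thread.sleep(200);                             // give the waiter time to call wait()

        synchronized (lock) {
            System.out.println("main: calling notify(), moving the waiter back to cxq/EntryList");
            lock.notify();
        }
        waiter.join();
    }
}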
The synchronized monitor mechanism is very similar to the JDK's ReentrantLock and Condition: ReentrantLock also keeps a queue of threads waiting to acquire the lock, and Condition keeps a WaitSet-like collection of threads that have called await. If you already understand ReentrantLock well, understanding the monitor is easy.
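For comparison, the JDK-level counterpart looks like this: the lock's own wait queue plays roughly the role of cxq/EntryList, and the Condition's queue plays roughly the role of the WaitSet (an analogy, not a one-to-one mapping):

import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class ConditionDemo {
    private static final ReentrantLock lock = new ReentrantLock();
    private static final Condition condition = lock.newCondition();

    public static void main(String[] args) throws InterruptedException {
        Thread waiter = new Thread(() -> {
            lock.lock();
            try {
                System.out.println("waiter: awaiting on the condition (like entering the WaitSet)");
                condition.await();                  // releases the lock and waits
                System.out.println("waiter: signalled and lock re-acquired");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        }, "waiter");

        waiter.start();
        Thread.sleep(200);

        lock.lock();
        try {
            System.out.println("main: signalling (like notify moving a thread back to the entry queue)");
            condition.signal();
        } finally {
            lock.unlock();
        }
        waiter.join();
    }
}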
Back to the code; let's analyze EnterI:
void ATTR ObjectMonitor::EnterI (TRAPS) {
    Thread * Self = THREAD ;
    ...
    // Try to acquire the lock
    if (TryLock (Self) > 0) {
        ...
        return ;
    }

    DeferredInitialize () ;

    // Spin
    if (TrySpin (Self) > 0) {
        ...
        return ;
    }

    ...

    // Wrap the current thread in an ObjectWaiter node
    ObjectWaiter node(Self) ;
    Self->_ParkEvent->reset() ;
    node._prev   = (ObjectWaiter *) 0xBAD ;
    node.TState  = ObjectWaiter::TS_CXQ ;

    // Push the node onto the head of the _cxq queue; _cxq is a singly linked list
    ObjectWaiter * nxt ;
    for (;;) {
        node._next = nxt = _cxq ;
        if (Atomic::cmpxchg_ptr (&node, &_cxq, nxt) == nxt) break ;

        // If the CAS fails, try to acquire the lock again; this reduces the frequency of inserts into _cxq
        if (TryLock (Self) > 0) {
            ...
            return ;
        }
    }

    // SyncFlags defaults to 0; if there are no other waiting threads, set _Responsible to the current thread
    if ((SyncFlags & 16) == 0 && nxt == NULL && _EntryList == NULL) {
        Atomic::cmpxchg_ptr (Self, &_Responsible, NULL) ;
    }

    TEVENT (Inflated enter - Contention) ;
    int nWakeups = 0 ;
    int RecheckInterval = 1 ;

    for (;;) {
        if (TryLock (Self) > 0) break ;
        assert (_owner != Self, "invariant") ;
        ...

        // park self
        if (_Responsible == Self || (SyncFlags & 1)) {
            // If the current thread is _Responsible, call park with a timeout
            TEVENT (Inflated enter - park TIMED) ;
            Self->_ParkEvent->park ((jlong) RecheckInterval) ;
            // Increase the RecheckInterval, but clamp the value.
            RecheckInterval *= 8 ;
            if (RecheckInterval > 1000) RecheckInterval = 1000 ;
        } else {
            // Otherwise, call park directly to suspend the current thread
            TEVENT (Inflated enter - park UNTIMED) ;
            Self->_ParkEvent->park() ;
        }

        if (TryLock(Self) > 0) break ;

        ...

        if ((Knob_SpinAfterFutile & 1) && TrySpin (Self) > 0) break ;

        ...
        // When the lock is released, _succ is set to one of the threads in _EntryList or _cxq
        if (_succ == Self) _succ = NULL ;

        // Invariant: after clearing _succ a thread *must* retry _owner before parking.
        OrderAccess::fence() ;
    }

    // Reaching this point means the lock has been acquired
    assert (_owner == Self      , "invariant") ;
    assert (object() != NULL    , "invariant") ;

    // Remove the current thread's node from _cxq or _EntryList
    UnlinkAfterAcquire (Self, &node) ;
    if (_succ == Self) _succ = NULL ;
    if (_Responsible == Self) {
        _Responsible = NULL ;
        OrderAccess::fence();
    }
    ...
    return ;
}
The main steps are three:
1. Insert the current thread at the head of the cxq queue.
2. Park the current thread.
3. When woken, try to acquire the lock again.
A minimal sketch of this enqueue-park-retry pattern follows below.
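Here is a hypothetical Java sketch of that enqueue-park-retry shape, using AtomicReference for the CAS push and LockSupport for parking. It is a conceptual model only, not how HotSpot implements EnterI, and it omits details such as unlinking the node and the _Responsible/_succ machinery:

import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

// Conceptual sketch (not HotSpot code) of the "push onto cxq, park, retry" shape of EnterI.
class TinyMonitor {
    private static final class Waiter {
        final Thread thread = Thread.currentThread();
        volatile boolean acquired;     // set once this thread has obtained the lock
        Waiter next;
    }

    private final AtomicReference<Thread> owner = new AtomicReference<>();
    private final AtomicReference<Waiter> cxq = new AtomicReference<>();   // head of a LIFO waiter list

    private boolean tryLock() {
        return owner.get() == null && owner.compareAndSet(null, Thread.currentThread());
    }

    void enter() {
        if (tryLock()) return;                        // fast path (spinning would happen here)

        Waiter node = new Waiter();
        for (;;) {                                    // CAS the node onto the head of cxq
            node.next = cxq.get();
            if (cxq.compareAndSet(node.next, node)) break;
            if (tryLock()) return;                    // retry the lock when the CAS fails
        }

        while (!tryLock()) {
            LockSupport.park(this);                   // suspend; the releasing thread unparks us
        }
        node.acquired = true;                         // real code unlinks the node (UnlinkAfterAcquire)
    }

    void exit() {
        owner.set(null);                              // drop the lock first (unfair: barging is possible)
        for (Waiter w = cxq.get(); w != null; w = w.next) {
            if (!w.acquired) {                        // wake the first thread still waiting ("presumed heir")
                LockSupport.unpark(w.thread);
                return;
            }
        }
    }
}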
The roles of the _Responsible and _succ fields deserve special mention:
When contention occurs, one thread is designated _Responsible, and that thread parks with a timeout, as a safeguard against stranding (a lost wakeup leaving every waiter parked forever).
_succ is set by the thread releasing the lock; it marks the heir presumptive, the presumed heir mentioned above.
Release of the Heavyweight Lock
The heavyweight-lock release code is in ObjectMonitor::exit:
void ATTR ObjectMonitor::exit(bool not_suspended, TRAPS) {
   Thread * Self = THREAD ;
   // If _owner is not the current thread
   if (THREAD != _owner) {
     // The current thread is the thread that previously held the lightweight lock. Since enter has not
     // been called since the lightweight lock was inflated, _owner is still a pointer to the Lock Record.
     if (THREAD->is_lock_owned((address) _owner)) {
       assert (_recursions == 0, "invariant") ;
       _owner = THREAD ;
       _recursions = 0 ;
       OwnerIsThread = 1 ;
     } else {
       // Error: the current thread does not hold the lock
       TEVENT (Exit - Throw IMSX) ;
       assert(false, "Non-balanced monitor enter/exit!");
       if (false) {
          THROW(vmSymbols::java_lang_IllegalMonitorStateException());
       }
       return;
     }
   }

   // If the reentry counter is not zero, decrement it and return
   if (_recursions != 0) {
     _recursions--;        // this is simple recursive enter
     TEVENT (Inflated exit - recursive) ;
     return ;
   }

   // Set _Responsible to NULL
   if ((SyncFlags & 4) == 0) {
      _Responsible = NULL ;
   }

   ...

   for (;;) {
      assert (THREAD == _owner, "invariant") ;

      // Knob_ExitPolicy defaults to 0
      if (Knob_ExitPolicy == 0) {
         // code 1: release the lock first; other threads entering the synchronized block can now acquire it
         OrderAccess::release_store_ptr (&_owner, NULL) ;   // drop the lock
         OrderAccess::storeload() ;                         // See if we need to wake a successor
         // code 2: if there are no waiting threads, or there is already a presumed heir
         if ((intptr_t(_EntryList)|intptr_t(_cxq)) == 0 || _succ != NULL) {
            TEVENT (Inflated exit - simple egress) ;
            return ;
         }
         TEVENT (Inflated exit - complex egress) ;

         // code 3: the following operations require the lock to be re-acquired, i.e. _owner set back to the current thread
         if (Atomic::cmpxchg_ptr (THREAD, &_owner, NULL) != NULL) {
            return ;
         }
         TEVENT (Exit - Reacquired) ;
      }
      ...

      ObjectWaiter * w = NULL ;
      // code 4: different wake-up strategies depending on QMode; the default is 0
      int QMode = Knob_QMode ;

      if (QMode == 2 && _cxq != NULL) {
          // QMode == 2: threads in cxq have higher priority; wake the head of cxq directly
          w = _cxq ;
          assert (w != NULL, "invariant") ;
          assert (w->TState == ObjectWaiter::TS_CXQ, "Invariant") ;
          ExitEpilog (Self, w) ;
          return ;
      }

      if (QMode == 3 && _cxq != NULL) {
          // Append the elements of cxq to the tail of EntryList
          w = _cxq ;
          for (;;) {
             assert (w != NULL, "Invariant") ;
             ObjectWaiter * u = (ObjectWaiter *) Atomic::cmpxchg_ptr (NULL, &_cxq, w) ;
             if (u == w) break ;
             w = u ;
          }
          assert (w != NULL, "invariant") ;

          ObjectWaiter * q = NULL ;
          ObjectWaiter * p ;
          for (p = w ; p != NULL ; p = p->_next) {
              guarantee (p->TState == ObjectWaiter::TS_CXQ, "Invariant") ;
              p->TState = ObjectWaiter::TS_ENTER ;
              p->_prev = q ;
              q = p ;
          }

          // Append the RATs to the EntryList
          // TODO: organize EntryList as a CDLL so we can locate the tail in constant-time.
          ObjectWaiter * Tail ;
          for (Tail = _EntryList ; Tail != NULL && Tail->_next != NULL ; Tail = Tail->_next) ;
          if (Tail == NULL) {
              _EntryList = w ;
          } else {
              Tail->_next = w ;
              w->_prev = Tail ;
          }

          // Fall thru into code that tries to wake a successor from EntryList
      }

      if (QMode == 4 && _cxq != NULL) {
          // Prepend the elements of cxq to the head of EntryList
          w = _cxq ;
          for (;;) {
             assert (w != NULL, "Invariant") ;
             ObjectWaiter * u = (ObjectWaiter *) Atomic::cmpxchg_ptr (NULL, &_cxq, w) ;
             if (u == w) break ;
             w = u ;
          }
          assert (w != NULL, "invariant") ;

          ObjectWaiter * q = NULL ;
          ObjectWaiter * p ;
          for (p = w ; p != NULL ; p = p->_next) {
              guarantee (p->TState == ObjectWaiter::TS_CXQ, "Invariant") ;
              p->TState = ObjectWaiter::TS_ENTER ;
              p->_prev = q ;
              q = p ;
          }

          // Prepend the RATs to the EntryList
          if (_EntryList != NULL) {
              q->_next = _EntryList ;
              _EntryList->_prev = q ;
          }
          _EntryList = w ;

          // Fall thru into code that tries to wake a successor from EntryList
      }

      w = _EntryList ;
      if (w != NULL) {
          // If EntryList is not empty, wake its head element directly
          assert (w->TState == ObjectWaiter::TS_ENTER, "invariant") ;
          ExitEpilog (Self, w) ;
          return ;
      }

      // EntryList is null; process the elements in cxq
      w = _cxq ;
      if (w == NULL) continue ;

      // The elements of cxq will be moved into EntryList below, so detach cxq (set it to null) here
      for (;;) {
          assert (w != NULL, "Invariant") ;
          ObjectWaiter * u = (ObjectWaiter *) Atomic::cmpxchg_ptr (NULL, &_cxq, w) ;
          if (u == w) break ;
          w = u ;
      }
      TEVENT (Inflated exit - drain cxq into EntryList) ;

      assert (w != NULL, "invariant") ;
      assert (_EntryList == NULL, "invariant") ;

      if (QMode == 1) {
         // QMode == 1: move the elements of cxq into EntryList in reverse order
         ObjectWaiter * s = NULL ;
         ObjectWaiter * t = w ;
         ObjectWaiter * u = NULL ;
         while (t != NULL) {
             guarantee (t->TState == ObjectWaiter::TS_CXQ, "invariant") ;
             t->TState = ObjectWaiter::TS_ENTER ;
             u = t->_next ;
             t->_prev = u ;
             t->_next = s ;
             s = t;
             t = u ;
         }
         _EntryList = s ;
         assert (s != NULL, "invariant") ;
      } else {
         // QMode == 0 or QMode == 2
         // Move the elements of cxq into EntryList in their original order
         _EntryList = w ;
         ObjectWaiter * q = NULL ;
         ObjectWaiter * p ;
         for (p = w ; p != NULL ; p = p->_next) {
             guarantee (p->TState == ObjectWaiter::TS_CXQ, "Invariant") ;
             p->TState = ObjectWaiter::TS_ENTER ;
             p->_prev = q ;
             q = p ;
         }
      }

      // If _succ is not null there is already a presumed heir, so the current thread does not need
      // to wake anyone; this reduces the context-switch rate
      if (_succ != NULL) continue;

      w = _EntryList ;
      // Wake the head element of EntryList
      if (w != NULL) {
          guarantee (w->TState == ObjectWaiter::TS_ENTER, "invariant") ;
          ExitEpilog (Self, w) ;
          return ;
      }
   }
}
After the owner fix-up and the reentry check, we reach the main logic:
code 1 sets _owner to null, which releases the lock; from that moment any other thread entering the synchronized block can acquire it. This is part of the unfair-lock behaviour.
code 2: if there are no waiting threads, return directly, since nobody needs to be woken; likewise, if _succ is not null there is already a successor being woken, so the current thread does not need to wake anyone.
code 3: the current thread re-acquires the lock (sets _owner back to itself), because it is about to manipulate the cxq and EntryList queues and wake a thread.
code 4: different wake-up strategies are applied depending on QMode.
Depending on QMode the handling differs:
QMode == 2 and cxq is not empty: take the ObjectWaiter at the head of cxq and call ExitEpilog, which wakes that thread and returns immediately; the code that follows is not executed.
QMode == 3 and cxq is not empty: append the cxq queue to the tail of EntryList;
QMode == 4 and cxq is not empty: prepend the cxq queue to the head of EntryList;
QMode == 0: do nothing for now, keep reading.
Only QMode == 2 returns early; with 0, 3, and 4 execution continues:
1. If EntryList is not empty, take its head element and call ExitEpilog, which wakes that thread and returns immediately.
2. If EntryList is empty, move all elements of cxq into EntryList, then take the head of EntryList, call ExitEpilog on it, and return immediately.
QMode defaults to 0. With the above process in mind, consider this demo:
public class SyncDemo {

    public static void main(String[] args) {
        SyncDemo syncDemo1 = new SyncDemo();
        syncDemo1.startThreadA();
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        syncDemo1.startThreadB();
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        syncDemo1.startThreadC();
    }

    final Object lock = new Object();

    public void startThreadA() {
        new Thread(() -> {
            synchronized (lock) {
                System.out.println("A get lock");
                try {
                    Thread.sleep(500);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                System.out.println("A release lock");
            }
        }, "thread-A").start();
    }

    public void startThreadB() {
        new Thread(() -> {
            synchronized (lock) {
                System.out.println("B get lock");
            }
        }, "thread-B").start();
    }

    public void startThreadC() {
        new Thread(() -> {
            synchronized (lock) {
                System.out.println("C get lock");
            }
        }, "thread-C").start();
    }
}
By default, after A releases the lock, thread C acquires it before B. When acquiring the lock, a thread is inserted at the head of cxq; when releasing, the default policy is: if EntryList is empty, move the cxq elements into EntryList in their existing order and wake the first thread. In other words, when EntryList is empty, the most recently arrived thread gets the lock first, so the expected print order here is A get lock, A release lock, C get lock, B get lock. The Lock mechanism in the JDK behaves differently.
Differences between synchronized and ReentrantLock
Now that the principle is clear, let's summarize the differences between synchronized and ReentrantLock.
synchronized is a JVM-level lock implementation; ReentrantLock is a JDK-level (library) implementation.
The lock state of synchronized cannot be queried directly in code, whereas ReentrantLock exposes it via ReentrantLock#isLocked.
synchronized is an unfair lock; ReentrantLock can be either fair or unfair.
A thread blocked on synchronized cannot be interrupted, while ReentrantLock#lockInterruptibly supports interruptible acquisition.
synchronized releases the lock automatically when an exception is thrown (javac generates the necessary bytecode at compile time), while ReentrantLock requires the developer to explicitly release the lock in a finally block.
ReentrantLock offers more ways to acquire a lock: tryLock() returns immediately, and the timed variant waits at most the specified duration, which is more flexible (see the example after this list).
With synchronized, the thread that started waiting last tends to get the lock first (as shown above); with ReentrantLock, queued threads are granted the lock in the order they began waiting.
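A short example of the ReentrantLock features mentioned above (explicit release in finally, timed tryLock, and interruptible acquisition), none of which synchronized offers directly:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class ReentrantLockFeatures {
    // Passing true requests a fair lock; synchronized offers no such option.
    private static final ReentrantLock lock = new ReentrantLock(true);

    public static void main(String[] args) throws InterruptedException {
        // Timed acquisition: give up after 500 ms instead of blocking indefinitely.
        if (lock.tryLock(500, TimeUnit.MILLISECONDS)) {
            try {
                System.out.println("lock held? " + lock.isLocked());   // lock state can be inspected

                // Interruptible acquisition: a thread blocked on the lock can be interrupted.
                Thread t = new Thread(() -> {
                    try {
                        lock.lockInterruptibly();
                        try {
                            System.out.println("t got the lock");
                        } finally {
                            lock.unlock();
                        }
                    } catch (InterruptedException e) {
                        System.out.println("t interrupted while waiting for the lock");
                    }
                });
                t.start();
                Thread.sleep(100);        // let t block on the lock held by main
                t.interrupt();            // breaks t out of its wait
                t.join();
            } finally {
                lock.unlock();            // must be released explicitly, typically in finally
            }
        }
    }
}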
End
Overall, the heavyweight-lock implementation of synchronized and the implementation of ReentrantLock are quite similar, in their data structures, the way they suspend threads, and so on. For everyday use, synchronized is sufficient unless you have special requirements. Once you understand one of the two implementations well, the other (and other locking mechanisms) becomes much easier to grasp; this is the kind of transferable understanding we often talk about.