Questions
(1) Why was LongAdder added in Java 8?
(2) How is LongAdder implemented?
(3) How does LongAdder compare with AtomicLong?
Brief introduction
LongAdder is an atomic class added in Java 8. When many threads contend for the same counter, LongAdder performs much better than AtomicLong, especially in write-heavy scenarios.
How does it achieve that? Let's study it together.
Principle
The idea behind LongAdder is segmentation. While there is no contention, only the base value is updated. Once several threads contend, different threads update different segments (Cells). The value held by a LongAdder is obtained by adding base and all the segments together.
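The segmentation idea can be sketched with the public atomic classes. This is only an illustration under simplifying assumptions, not the JDK implementation: the class name StripedCounterSketch is hypothetical, the number of segments is fixed at 8, and the segment is chosen from the thread's hash code instead of a probe value.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical illustration of "base plus segments"; not the JDK code.
class StripedCounterSketch {
    private final AtomicLong base = new AtomicLong();
    // Assumption: a fixed power-of-two number of segments.
    // The real class creates and grows its array lazily.
    private final AtomicLongArray cells = new AtomicLongArray(8);

    void add(long x) {
        long b = base.get();
        // Fast path: one CAS on base, which succeeds when no other thread races.
        if (base.compareAndSet(b, b + x)) {
            return;
        }
        // Contended path: add to a segment picked from the thread's hash code.
        int index = Thread.currentThread().hashCode() & (cells.length() - 1);
        cells.addAndGet(index, x);
    }

    long sum() {
        // The stored value is base plus every segment.
        long total = base.get();
        for (int i = 0; i < cells.length(); i++) {
            total += cells.get(i);
        }
        return total;
    }

    // Runs the given number of threads, each adding 1 perThread times.
    static long demo(int threads, int perThread) {
        StripedCounterSketch counter = new StripedCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.add(1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.sum();
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 100_000));
    }
}
```

Every call to add applies x exactly once, either to base or to one segment, so the sum read after all threads have finished equals the number of increments.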
Source code analysis
LongAdder extends the abstract class Striped64, which defines the inner class Cell and several important fields.
Main inner class
    // Inner class of Striped64. The @sun.misc.Contended annotation
    // keeps Cells from suffering false sharing
    @sun.misc.Contended static final class Cell {
        // The stored value; volatile guarantees visibility
        volatile long value;
        Cell(long x) { value = x; }
        // Update value with CAS
        final boolean cas(long cmp, long val) {
            return UNSAFE.compareAndSwapLong(this, valueOffset, cmp, val);
        }

        // Unsafe instance
        private static final sun.misc.Unsafe UNSAFE;
        // Offset of the value field
        private static final long valueOffset;
        static {
            try {
                UNSAFE = sun.misc.Unsafe.getUnsafe();
                Class<?> ak = Cell.class;
                valueOffset = UNSAFE.objectFieldOffset
                    (ak.getDeclaredField("value"));
            } catch (Exception e) {
                throw new Error(e);
            }
        }
    }
The Cell class uses the @sun.misc.Contended annotation to avoid false sharing.
The value is updated with Unsafe CAS, and the value field is declared volatile to guarantee visibility.
For an introduction to Unsafe, see [Unsafe Analysis of Dead java Magic].
For an introduction to false sharing, see [What is false sharing?].
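A CAS update like the one in Cell.cas() can be tried out with the public API. The sketch below uses AtomicLong.compareAndSet as a stand-in for the Unsafe call, and the class and method names are hypothetical. It shows the usual read, compute, compare-and-set retry loop.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: a CAS retry loop built on AtomicLong instead of Unsafe.
class CasLoopSketch {
    // Adds delta to value and returns the new value.
    static long addWithCas(AtomicLong value, long delta) {
        for (;;) {
            long current = value.get();      // read the current value
            long next = current + delta;     // compute the new value
            // Succeeds only if nobody changed the value in between;
            // otherwise loop and retry with a fresh read.
            if (value.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    public static void main(String[] args) {
        AtomicLong v = new AtomicLong(5);
        System.out.println(addWithCas(v, 3));       // prints 8
        // A CAS with a stale expected value fails and changes nothing.
        System.out.println(v.compareAndSet(5, 100)); // prints false
    }
}
```

Note that Cell.cas() does not loop by itself: it attempts the CAS once, and the caller decides what to do when it fails.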
Main attributes
    // All three fields are declared in Striped64
    // The cells array, which stores the value of each segment
    transient volatile Cell[] cells;
    // Used at first, while there is no contention; effectively a special segment
    transient volatile long base;
    // Marks whether a thread is currently creating or expanding cells, or creating a Cell.
    // It is updated with CAS, so it works like a lock
    transient volatile int cellsBusy;
While there is no contention, or while another thread is busy creating the cells array, values are added to base. Once contention has occurred, values are added to cells.
"No contention" means that threads do not collide on base. There may well be several threads, as long as they do not update base at the same moment.
Once contention has occurred, cells is used from then on, whether or not later updates actually collide. The rule is that different threads update different Cells as far as possible, which reduces contention.
add(x) method
The add(x) method is the main method of LongAdder. It increases the value stored in LongAdder by x, where x may be positive or negative.
    public void add(long x) {
        // as: the cells field of Striped64
        // b:  the base field of Striped64
        // v:  the value stored in the Cell the current thread hashes to
        // m:  length of cells minus 1, used as the hash mask
        // a:  the Cell the current thread hashes to
        Cell[] as; long b, v; int m; Cell a;
        // Condition 1: cells is not null, so contention has occurred and cells has been created
        // Condition 2: the CAS on base failed, so another thread modified base first (contention)
        if ((as = cells) != null || !casBase(b = base, b + x)) {
            // true:  no CAS on a Cell has failed yet
            // false: the CAS on a Cell failed; several threads hash to the same Cell,
            //        and the array may need to be expanded
            boolean uncontended = true;
            // Condition 1: cells is null, so we got here through condition 2 above
            // Condition 2: cells has length 0; should not happen
            // Condition 3: the Cell the current thread hashes to is null, so a Cell must be created
            // Condition 4: the CAS on that Cell failed, so several threads are hitting the same Cell
            if (as == null || (m = as.length - 1) < 0 ||
                // getProbe() returns the thread's threadLocalRandomProbe field,
                // a randomly generated value that stays fixed for a given thread
                // unless it is deliberately changed
                (a = as[getProbe() & m]) == null ||
                !(uncontended = a.cas(v = a.value, v + x)))
                // Delegate to the method in Striped64
                longAccumulate(x, null, uncontended);
        }
    }
(1) While there is no contention, only base is updated;
(2) The cells array is created only after a CAS on base has failed;
(3) When several threads contend heavily for the same Cell, the array may need to be expanded.
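The expression getProbe() & m works because the length of cells is always a power of two, so length minus 1 is a mask of low bits. A minimal sketch of that index calculation follows. The class and method names are hypothetical, and a plain int stands in for the thread's probe.

```java
// Hypothetical sketch of the "probe & (length - 1)" slot calculation.
class ProbeIndexSketch {
    // tableLength is assumed to be a power of two, as in Striped64.
    static int indexFor(int probe, int tableLength) {
        return probe & (tableLength - 1);
    }

    public static void main(String[] args) {
        // 13 is 1101 in binary; masking with 7 (0111) keeps the low three bits.
        System.out.println(indexFor(13, 8));   // prints 5
        // After the table doubles, one more bit of the probe is used.
        System.out.println(indexFor(13, 16));  // prints 13
        // Negative probes are fine: the mask always yields a valid index.
        System.out.println(indexFor(-1, 4));   // prints 3
    }
}
```

The same probe can therefore map to a different slot after an expansion, which is one reason the code re-reads cells and recomputes the index after taking the lock.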
longAccumulate() method
    final void longAccumulate(long x, LongBinaryOperator fn,
                              boolean wasUncontended) {
        // The thread's probe value
        int h;
        // A probe of 0 means the thread's random state has not been initialized
        if ((h = getProbe()) == 0) {
            // Force initialization
            ThreadLocalRandom.current(); // force initialization
            // Read the probe again
            h = getProbe();
            // It was only just initialized, so treat this call as uncontended
            wasUncontended = true;
        }
        // Whether a collision was recorded on the previous pass
        boolean collide = false;                // True if last slot nonempty
        for (;;) {
            Cell[] as; Cell a; int n; long v;
            // cells has been initialized
            if ((as = cells) != null && (n = as.length) > 0) {
                // The Cell the current thread hashes to has not been created
                if ((a = as[(n - 1) & h]) == null) {
                    // No thread is creating or expanding cells, or creating a Cell
                    if (cellsBusy == 0) {       // Try to attach new Cell
                        // Create a new Cell holding the value to be added
                        Cell r = new Cell(x);   // Optimistically create
                        // Check cellsBusy again and try to CAS it to 1,
                        // which is equivalent to acquiring the lock
                        if (cellsBusy == 0 && casCellsBusy()) {
                            // Whether the Cell was installed
                            boolean created = false;
                            try {               // Recheck under lock
                                Cell[] rs; int m, j;
                                // Re-read cells and recompute the slot: as was read
                                // without the lock, so the array may have been expanded
                                if ((rs = cells) != null &&
                                    (m = rs.length) > 0 &&
                                    rs[j = (m - 1) & h] == null) {
                                    // Install the new Cell at index j
                                    rs[j] = r;
                                    // Created successfully
                                    created = true;
                                }
                            } finally {
                                // Release the lock
                                cellsBusy = 0;
                            }
                            // Installed: x is already stored in the new Cell, so we are done
                            if (created)
                                break;
                            continue;           // Slot is now non-empty
                        }
                    }
                    // No collision recorded for this pass
                    collide = false;
                }
                // The Cell exists, but the CAS in add() already failed.
                // Just set the flag to true (in effect one spin),
                // then rehash the probe below and retry
                else if (!wasUncontended)       // CAS already known to fail
                    wasUncontended = true;      // Continue after rehash
                // Try the CAS on the current thread's Cell again; done if it succeeds
                else if (a.cas(v = a.value, ((fn == null) ? v + x :
                                             fn.applyAsLong(v, x))))
                    break;
                // The array length has reached the number of CPU cores, or another thread
                // has replaced cells: set collide to false, rehash the probe below and retry
                else if (n >= NCPU || cells != as)
                    collide = false;            // At max size or stale
                // The CAS above failed and the previous condition did not hold, so there is
                // a collision: record it, rehash the probe below and retry once more
                else if (!collide)
                    collide = true;
                // The collision is confirmed: try to take the lock and expand
                else if (cellsBusy == 0 && casCellsBusy()) {
                    try {
                        // Check that no other thread has already expanded the array
                        if (cells == as) {      // Expand table unless stale
                            // The new array is twice the size of the old one
                            Cell[] rs = new Cell[n << 1];
                            // Copy the old elements into the new array
                            for (int i = 0; i < n; ++i)
                                rs[i] = as[i];
                            // Point cells at the new array
                            cells = rs;
                        }
                    } finally {
                        // Release the lock
                        cellsBusy = 0;
                    }
                    // Collision resolved
                    collide = false;
                    // Retry with the expanded array
                    continue;                   // Retry with expanded table
                }
                // The update failed or the array is at its maximum size:
                // rehash the probe and retry
                h = advanceProbe(h);
            }
            // cells is not initialized: try to take the lock and initialize it
            else if (cellsBusy == 0 && cells == as && casCellsBusy()) {
                // Whether initialization succeeded
                boolean init = false;
                try {                           // Initialize table
                    // Check that no other thread has already initialized cells
                    if (cells == as) {
                        // Create a Cell array of size 2
                        Cell[] rs = new Cell[2];
                        // Create the Cell for the slot the current thread hashes to
                        rs[h & 1] = new Cell(x);
                        // Assign the array to cells
                        cells = rs;
                        // Initialized successfully
                        init = true;
                    }
                } finally {
                    // Release the lock
                    cellsBusy = 0;
                }
                // Initialized: x was stored in the new Cell at the same time, so we are done
                if (init)
                    break;
            }
            // Another thread is initializing cells: try to update base instead,
            // and return if that succeeds
            else if (casBase(v = base, ((fn == null) ? v + x :
                                        fn.applyAsLong(v, x))))
                break;                          // Fall back on using base
        }
    }
(1) If the cells array is not initialized, the current thread tries to take the cellsBusy lock and create the cells array;
(2) If the current thread finds that another thread is already creating the cells array, it tries to update base instead and returns if that succeeds;
(3) The thread's probe value determines which Cell in the cells array the thread should update;
(4) If that Cell has not been created, the thread takes the cellsBusy lock and creates a Cell in the corresponding slot;
(5) The thread tries a CAS on its Cell and returns if it succeeds. A failure indicates a collision;
(6) After a failed update the thread does not expand the array immediately. It rehashes its probe value and tries again;
(7) If the retry also fails, the array is expanded;
(8) To expand, the current thread takes the cellsBusy lock, doubles the capacity of the array, and copies the elements of the old cells array into the new one;
(9) cellsBusy is used in three places: creating the cells array, creating a Cell, and expanding the cells array.
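The way cellsBusy guards the expansion can be sketched as a small CAS-based try-lock. This is a simplified illustration, not the JDK code: the class BusyFlagSketch is hypothetical, an AtomicInteger stands in for the Unsafe CAS on cellsBusy, and a long[] stands in for the Cell[] array.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a cellsBusy-style flag guarding a table expansion.
class BusyFlagSketch {
    // 0 = free, 1 = held. Package-private so the demo can simulate a held lock.
    final AtomicInteger busy = new AtomicInteger();
    private volatile long[] table = new long[2];

    // Tries once to double the table. Returns false if the flag is held
    // or if another thread already replaced the table.
    boolean tryExpand() {
        long[] seen = table;                      // read without the lock
        if (busy.get() == 0 && busy.compareAndSet(0, 1)) {
            try {
                if (table == seen) {              // recheck under the lock
                    long[] bigger = new long[seen.length << 1];
                    System.arraycopy(seen, 0, bigger, 0, seen.length);
                    table = bigger;
                    return true;
                }
            } finally {
                busy.set(0);                      // release the lock
            }
        }
        return false;
    }

    int length() {
        return table.length;
    }

    public static void main(String[] args) {
        BusyFlagSketch s = new BusyFlagSketch();
        System.out.println(s.tryExpand() + " " + s.length()); // prints true 4
        s.busy.set(1);                            // pretend another thread holds the flag
        System.out.println(s.tryExpand() + " " + s.length()); // prints false 4
    }
}
```

A thread that fails to take the flag never blocks. In longAccumulate() it simply goes around the loop again, which is why the flag behaves like a try-lock and not like a mutex.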
sum() method
The sum() method returns the value actually stored in LongAdder by adding base and all the segments together.
    public long sum() {
        Cell[] as = cells; Cell a;
        // sum starts out equal to base
        long sum = base;
        // If cells is not null
        if (as != null) {
            // Iterate over all Cells
            for (int i = 0; i < as.length; ++i) {
                // If the Cell is not null, add its value to sum
                if ((a = as[i]) != null)
                    sum += a.value;
            }
        }
        // Return sum
        return sum;
    }
As you can see, the sum() method adds base and the values of all segments. That raises a question: if a Cell is modified after its value has already been added to sum, is that modification missed?
Yes, it is. sum() is not an atomic snapshot, so LongAdder is not strongly consistent, only eventually consistent.
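In practice this means sum() is exact only when no updates are in flight. A minimal sketch, with a hypothetical class name, that reads the sum after all writer threads have been joined:

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: sum() read after all writers have finished is exact.
class SumAfterJoinSketch {
    static long countWith(int threads, int perThread) {
        LongAdder adder = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    adder.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();   // after join, all of this thread's updates are visible
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // No concurrent updates remain, so this is the exact total.
        return adder.sum();
    }

    public static void main(String[] args) {
        System.out.println(countWith(4, 100_000));
    }
}
```

If sum() were called while the workers were still running, it would return some value between 0 and the final total, which is usually acceptable for statistics and metrics but not for logic that needs an exact value at a given instant.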
LongAdder VS AtomicLong
Here is the code:
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.atomic.AtomicLong;
    import java.util.concurrent.atomic.LongAdder;

    public class LongAdderVSAtomicLongTest {
        public static void main(String[] args){
            testAtomicLongVSLongAdder(1, 10000000);
            testAtomicLongVSLongAdder(10, 10000000);
            testAtomicLongVSLongAdder(20, 10000000);
            testAtomicLongVSLongAdder(40, 10000000);
            testAtomicLongVSLongAdder(80, 10000000);
        }

        // Runs both counters with threadCount threads, each doing `times` increments
        static void testAtomicLongVSLongAdder(final int threadCount, final int times){
            try {
                System.out.println("threadCount: " + threadCount + ", times: " + times);
                long start = System.currentTimeMillis();
                testLongAdder(threadCount, times);
                System.out.println("LongAdder elapse: " + (System.currentTimeMillis() - start) + "ms");

                long start2 = System.currentTimeMillis();
                testAtomicLong(threadCount, times);
                System.out.println("AtomicLong elapse: " + (System.currentTimeMillis() - start2) + "ms");
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }

        static void testAtomicLong(final int threadCount, final int times) throws InterruptedException {
            AtomicLong atomicLong = new AtomicLong();
            List<Thread> list = new ArrayList<>();
            for (int i = 0; i < threadCount; i++) {
                list.add(new Thread(() -> {
                    for (int j = 0; j < times; j++) {
                        atomicLong.incrementAndGet();
                    }
                }));
            }

            for (Thread thread : list) {
                thread.start();
            }

            for (Thread thread : list) {
                thread.join();
            }
        }

        static void testLongAdder(final int threadCount, final int times) throws InterruptedException {
            LongAdder longAdder = new LongAdder();
            List<Thread> list = new ArrayList<>();
            for (int i = 0; i < threadCount; i++) {
                list.add(new Thread(() -> {
                    for (int j = 0; j < times; j++) {
                        longAdder.add(1);
                    }
                }));
            }

            for (Thread thread : list) {
                thread.start();
            }

            for (Thread thread : list) {
                thread.join();
            }
        }
    }
The results are as follows:
    threadCount: 1, times: 10000000
    LongAdder elapse: 158ms
    AtomicLong elapse: 64ms
    threadCount: 10, times: 10000000
    LongAdder elapse: 206ms
    AtomicLong elapse: 2449ms
    threadCount: 20, times: 10000000
    LongAdder elapse: 429ms
    AtomicLong elapse: 5142ms
    threadCount: 40, times: 10000000
    LongAdder elapse: 840ms
    AtomicLong elapse: 10506ms
    threadCount: 80, times: 10000000
    LongAdder elapse: 1369ms
    AtomicLong elapse: 20482ms
You can see that with a single thread AtomicLong is actually faster (64 ms against 158 ms). As the number of threads grows, AtomicLong slows down dramatically, from 64 ms to about 20 s at 80 threads, while LongAdder is affected far less, going from 158 ms to about 1.4 s.
Summary
(1) LongAdder stores its value in base and the cells array;
(2) Different threads hash to different Cells and update those, which reduces contention;
(3) LongAdder performs very well under contention: as probes are rehashed and cells grows, threads eventually settle on different Cells and stop competing.
Easter egg
In the longAccumulate() method, the condition n >= NCPU prevents the expansion logic from running, and n is always a power of 2. Does that mean the cells array can grow at most to the smallest power of 2 that is greater than or equal to NCPU?
Yes. A CPU core runs only one thread at a time, so a failed update means that threads on two different cores updated the same Cell. The thread whose update failed rehashes its probe value, so it will most likely land on a different Cell next time. If the program runs long enough, threads running on the same core tend to end up updating the same Cell (this is probabilistic, and they are not guaranteed to share a single Cell). The cells array therefore does not need to be very long. A length that reaches the number of CPU cores is enough.
For example, the author's computer has 8 cores, so the cells array grows to at most 8 elements and is not expanded further.
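The upper bound can be worked out by replaying the growth rule: the array starts at length 2 and doubles only while its length is below NCPU. A small sketch follows. The class and method names are hypothetical, and ncpu is passed in as a parameter so different machines can be tried.

```java
// Hypothetical sketch: largest cells length reachable for a given core count.
class MaxCellsSketch {
    static int maxCellsLength(int ncpu) {
        int n = 2;              // cells is created with length 2
        while (n < ncpu) {      // expansion is skipped once n >= NCPU
            n <<= 1;            // each expansion doubles the array
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(maxCellsLength(8));   // prints 8
        System.out.println(maxCellsLength(6));   // prints 8
        System.out.println(maxCellsLength(12));  // prints 16
        // Value for the machine this runs on:
        System.out.println(maxCellsLength(Runtime.getRuntime().availableProcessors()));
    }
}
```

For a core count that is not a power of 2, such as 6 or 12, the array can end up slightly longer than the number of cores.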
Follow my official account "Tong Ge Reads Source Code" for more articles in the source code series, and explore the sea of source code with Tong Ge.