Java CAS Principle Analysis, Share Two Ali P7 Extremely Difficult Algorithmic Questions

Posted by lynn897 on Sun, 12 Sep 2021 08:28:59 +0200

The CAS flow described above is not hard to follow, but that description alone is not enough. Next I will introduce some background knowledge, which will make the rest of the article easier to understand.

2. Background introduction

As we all know, the CPU exchanges data with memory over a bus. In the multi-core era, multiple cores share the same bus to communicate with memory and other hardware, as shown in the following figure:

[Figure: simplified computer structure diagram (image unavailable)]

Image source: Computer Systems: A Programmer's Perspective

The figure above is a simplified computer structure diagram, but it is sufficient to illustrate the problem. In the figure, the CPU communicates with memory through the two buses marked with blue arrows. Consider this question: if multiple CPU cores operate on the same piece of memory at the same time, what kind of errors would result if nothing controlled them? Here is a brief explanation. Assume Core 1 writes 64 bits of data to memory over a 32-bit-wide bus, so Core 1 needs two writes to complete the operation. Suppose that after Core 1 has written the first 32 bits, Core 2 reads 64 bits from the same memory location. Because Core 1 has not yet written all 64 bits, Core 2 reads half of the new value combined with half of the old one, so the data it reads is corrupted.
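
The scenario above can be replayed in a small single-threaded simulation. This is only a sketch: the class `TornReadDemo`, its fields, and the two sample values are hypothetical, and two `int` fields stand in for one 64-bit memory location behind a 32-bit bus. No real hardware behavior is exercised here.

```java
// Simulation only: two 32-bit words model one 64-bit location behind a 32-bit bus.
class TornReadDemo {
    static int lowWord;   // bits 0..31 of the simulated location
    static int highWord;  // bits 32..63 of the simulated location

    // Stores a full 64-bit value (used only to set up the initial state).
    static void store(long v) {
        lowWord = (int) v;
        highWord = (int) (v >>> 32);
    }

    // Reads both halves and combines them into one 64-bit value.
    static long read64() {
        return ((long) highWord << 32) | (lowWord & 0xFFFFFFFFL);
    }

    // Replays: Core 1 writes the low half, Core 2 reads, Core 1 writes the high half.
    static long tornRead(long oldValue, long newValue) {
        store(oldValue);
        lowWord = (int) newValue;            // Core 1: first 32-bit transfer
        long seenByCore2 = read64();         // Core 2: reads all 64 bits in between
        highWord = (int) (newValue >>> 32);  // Core 1: second 32-bit transfer
        return seenByCore2;
    }

    public static void main(String[] args) {
        long oldValue = 0x00000000FFFFFFFFL;
        long newValue = 0x0000000100000000L;
        long seen = tornRead(oldValue, newValue);
        System.out.printf("old=%016x new=%016x seen=%016x%n", oldValue, newValue, seen);
    }
}
```

With these sample values, the simulated Core 2 sees 0, which is neither the old value nor the new one.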

But there is no need to worry about this. According to the Intel Developer's Manual, beginning with the Pentium processor, Intel processors guarantee that reading or writing a quadword aligned on a 64-bit boundary is atomic.

From the description above, we can conclude that Intel processors guarantee that an instruction which accesses aligned memory only once executes atomically. But what about an instruction that accesses memory twice? The answer is that there is no guarantee. Take the increment instruction inc dword ptr [...], which is equivalent to DEST = DEST + 1. The instruction consists of three operations, read -> modify -> write, and involves two memory accesses. Consider the case where the value 1 is stored at a given location in memory, and two CPU cores execute this instruction at the same time. The two cores interleave as follows:

  1. Core 1 reads the value 1 from the specified location in memory and loads it into a register

  2. Core 2 reads value 1 from the specified location in memory and loads it into a register

  3. Core 1 increases the register value by 1

  4. Core 2 increases the register value by 1

  5. Core 1 writes the modified values back to memory

  6. Core 2 writes the modified values back to memory
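
The six steps above, for the inc instruction (DEST = DEST + 1), can be replayed deterministically in a single thread. This is a minimal sketch: the class `LostUpdateDemo` and the variables `memory`, `reg1`, and `reg2` are hypothetical stand-ins for the memory location and the two cores' registers.

```java
// Deterministic replay of the interleaving; no real threads are involved.
class LostUpdateDemo {
    static int memory;  // the shared memory location

    static int replay() {
        memory = 1;          // initial value stored in memory
        int reg1 = memory;   // step 1: Core 1 loads the value into a register
        int reg2 = memory;   // step 2: Core 2 loads the same value
        reg1 = reg1 + 1;     // step 3: Core 1 increments its register
        reg2 = reg2 + 1;     // step 4: Core 2 increments its register
        memory = reg1;       // step 5: Core 1 writes back
        memory = reg2;       // step 6: Core 2 writes back, overwriting Core 1's update
        return memory;
    }

    public static void main(String[] args) {
        System.out.println("final value = " + replay());  // prints "final value = 2"
    }
}
```

Both simulated cores start from the same value, so one of the two increments is lost.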

After these six steps, the final value in memory is 2, whereas we expected 3. That is the problem. To solve it, we must prevent two or more cores from operating on the same memory region at the same time. How? This brings in the lock prefix, the protagonist of this article. For a detailed description of this prefix, refer to the Intel Developer Manual, Volume 2: Instruction Set Reference, Chapter 3: Instruction Set Reference, A-L. I quote part of it here:

LOCK—Assert LOCK# Signal Prefix

Causes the processor's LOCK# signal to be asserted during execution of the accompanying instruction (turns the instruction into an atomic instruction). In a multiprocessor environment, the LOCK# signal ensures that the processor has exclusive use of any shared memory while the signal is asserted.

The key point is the last sentence: in a multiprocessor environment, the LOCK# signal ensures that the processor has exclusive use of shared memory while the signal is asserted. The lock prefix can be added before the following instructions (only in forms whose destination operand is a memory operand):

ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG, CMPXCHG8B, CMPXCHG16B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG.

You can make an inc instruction atomic by adding the lock prefix in front of it. When multiple cores execute the same inc instruction at the same time, the executions are serialized, which avoids the problem described above. That raises another question: how does the lock prefix guarantee that a core has exclusive access to a memory region? The answer is as follows:

On Intel processors, there are two ways to ensure that one core has exclusive access to a region of memory. The first is to lock the bus so that the core has exclusive use of it, but this is too costly: while the bus is locked, other cores can no longer access memory and may stall for a short time. The second is to lock the cache: if the data at the memory location is already cached in the processor's cache, the LOCK# signal issued by the processor does not lock the bus but instead locks the memory region corresponding to the cache line. Other processors cannot operate on this memory region while it is locked. Locking the cache costs significantly less than locking the bus. For a more detailed description of bus locks and cache locks, refer to the Intel Developer Manual, Volume 3 (Software Developer's Manual), Chapter 8: Multiple-Processor Management.
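
From the Java side, the difference between a plain read-modify-write and an atomic one can be observed with a small experiment. The sketch below is illustrative only: `AtomicIncrementDemo` and its `run` helper are hypothetical names. On x86, HotSpot typically implements `AtomicInteger.incrementAndGet` with a lock-prefixed instruction, but that mapping is a JVM implementation detail which this code does not verify.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicIncrementDemo {
    static int plainCounter;                                        // incremented without atomicity
    static final AtomicInteger atomicCounter = new AtomicInteger(); // incremented atomically

    // Runs `threads` workers, each incrementing both counters `perThread` times.
    // Returns { plainCounter, atomicCounter } after all workers have finished.
    static int[] run(int threads, int perThread) {
        plainCounter = 0;
        atomicCounter.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    plainCounter++;                   // read -> modify -> write, may lose updates
                    atomicCounter.incrementAndGet();  // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();  // wait for every worker so the final values are visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return new int[] { plainCounter, atomicCounter.get() };
    }

    public static void main(String[] args) {
        int[] result = run(4, 100_000);
        System.out.println("plain  = " + result[0] + " (may be less than 400000)");
        System.out.println("atomic = " + result[1]);
    }
}
```

The atomic counter always ends at exactly threads × perThread. The plain counter may end lower, and by how much varies from run to run.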

3. Source Code Analysis

With that background in mind, we can now read the source code for CAS. This section analyzes the compareAndSet method in AtomicInteger, an atomic class in the java.util.concurrent.atomic package, as follows:

```java
public class AtomicInteger extends Number implements java.io.Serializable {

    // setup to use Unsafe.compareAndSwapInt for updates
    private static final Unsafe unsafe = Unsafe.getUnsafe();
    private static final long valueOffset;

    static {
        try {
            // Compute the offset of the field `value` within an AtomicInteger object
            valueOffset = unsafe.objectFieldOffset
                (AtomicInteger.class.getDeclaredField("value"));
        } catch (Exception ex) { throw new Error(ex); }
    }

    private volatile int value;

    public final boolean compareAndSet(int expect, int update) {
        /*
         * compareAndSet is just a thin wrapper; the main logic is
         * encapsulated in Unsafe's compareAndSwapInt method.
         */
        return unsafe.compareAndSwapInt(this, valueOffset, expect, update);
    }

    // ......
}

public final class Unsafe {
    // compareAndSwapInt is a native method; keep reading below
    public final native boolean compareAndSwapInt(Object o, long offset, int expected, int x);
    // ......
}
```
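
For context, here is how compareAndSet is typically used by callers: read the current value, compute a new one, and retry if another thread changed the value in between. This is a minimal sketch built on the public `AtomicInteger` API; the class `CasLoopDemo` and the helper `doubleValue` are hypothetical and not part of the JDK source shown above.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasLoopDemo {
    // Hypothetical helper: atomically doubles the value using a CAS retry loop.
    static int doubleValue(AtomicInteger ai) {
        while (true) {
            int current = ai.get();   // read the current value
            int next = current * 2;   // compute the new value from it
            // Succeeds only if the value is still `current`; otherwise retry.
            if (ai.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger value = new AtomicInteger(5);
        System.out.println(value.compareAndSet(5, 6));  // expected value matches: true
        System.out.println(value.compareAndSet(5, 7));  // value is now 6, not 5: false
        System.out.println(doubleValue(value));         // 12
    }
}
```

A failed compareAndSet leaves the value untouched, which is why the loop can simply re-read and try again.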

```cpp
// unsafe.cpp
/*
 * This doesn't look like a function, but don't worry, that's not the point.
 * UNSAFE_ENTRY and UNSAFE_END are macros that are replaced with real code
 * during preprocessing. The jboolean, jlong, and jint types below are
 * type aliases (typedefs):
 *
 * jni.h
 *     typedef unsigned char   jboolean;
 *     typedef unsigned short  jchar;
 *     typedef short           jshort;
 *     typedef float           jfloat;
 *     typedef double          jdouble;
 *
 * jni_md.h
 *     typedef int jint;
 *     #ifdef _LP64 // 64-bit
 *     typedef long jlong;
 *     #else
 *     typedef long long jlong;
 *     #endif
 *     typedef signed char jbyte;
 */
UNSAFE_ENTRY(jboolean, Unsafe_CompareAndSwapInt(JNIEnv *env, jobject unsafe, jobject obj, jlong offset, jint e, jint x))
  UnsafeWrapper("Unsafe_CompareAndSwapInt");
  oop p = JNIHandles::resolve(obj);
  // Compute the address of value from the offset. The offset here is the valueOffset in AtomicInteger
  jint* addr = (jint *) index_oop_from_field_offset_long(p, offset);
  // Call the function cmpxchg in Atomic, which is declared in atomic.hpp
  return (jint)(Atomic::cmpxchg(x, addr, e)) == e;
UNSAFE_END

// atomic.cpp
unsigned Atomic::cmpxchg(unsigned int exchange_value,
                         volatile unsigned int* dest, unsigned int compare_value) {
  assert(sizeof(unsigned int) == sizeof(jint), "more work to do");
  /*
   * Call the platform-specific overload. Which platform's overload is
   * compiled in is decided during preprocessing, based on the target
   * operating system. The relevant conditional-include logic is:
   *
   * atomic.inline.hpp:
   *     #include "runtime/atomic.hpp"
   *
   *     // Linux
   *     #ifdef TARGET_OS_ARCH_linux_x86
   *     # include "atomic_linux_x86.inline.hpp"
   *     #endif
   *
   *     // Omit partial code
   *
   *     // Windows
   *     #ifdef TARGET_OS_ARCH_windows_x86
   *     # include "atomic_windows_x86.inline.hpp"
   *     #endif
   *
   *     // BSD
   *     #ifdef TARGET_OS_ARCH_bsd_x86
   *     # include "atomic_bsd_x86.inline.hpp"
   *     #endif
   *
   * Next, analyze the cmpxchg implementation in atomic_windows_x86.inline.hpp.
   */
  return (unsigned int)Atomic::cmpxchg((jint)exchange_value, (volatile jint*)dest,
                                       (jint)compare_value);
}
```
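
Note how Unsafe_CompareAndSwapInt derives its boolean result: Atomic::cmpxchg returns the value it found at the address, and the swap succeeded exactly when that value equals the expected value e. The same convention can be sketched in Java with `AtomicInteger.compareAndExchange`, assuming Java 9 or later. The class `WitnessValueDemo` and the helper `casViaWitness` are hypothetical names used only for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

class WitnessValueDemo {
    // Mirrors `cmpxchg(x, addr, e) == e`: success means the value found equaled e.
    static boolean casViaWitness(AtomicInteger dest, int e, int x) {
        int witness = dest.compareAndExchange(e, x);  // returns the value found in dest
        return witness == e;
    }

    public static void main(String[] args) {
        AtomicInteger value = new AtomicInteger(10);
        System.out.println(casViaWitness(value, 10, 11));  // found 10, swapped: true
        System.out.println(casViaWitness(value, 10, 12));  // found 11, no swap: false
        System.out.println(value.get());                   // 11
    }
}
```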





**Summary of interview data**

![Jumped from a small company to Ant Group at level P7, just because I worked through the interview questions seven times](https://img-blog.csdnimg.cn/img_convert/1c39c35125624e522e26a6c6c65814da.png)

![Jumped from a small company to Ant Group at level P7, just because I worked through the interview questions seven times](https://img-blog.csdnimg.cn/img_convert/0c374614d957b6855b65eff536cb01f4.png)

My friend worked through these interview questions more than seven times before he joined Alibaba. There is too much interview material to show all of it here, so I have selected only a part of it for your reference.

**CodeChina open source project: [Top-tier companies] Java interview question analysis + core summary study notes + latest explanation videos**

An interview is not an exam; it is about showing the interviewer what you can do. So you should actually learn the techniques covered in these interview materials, otherwise you will be caught out as soon as a question is changed slightly.


**Finally, I wish you all the best offers!**

Topics: Java Algorithm Back-end Programmer stm32