1. Preface
Unsafe is a class in the sun.misc package. It mainly provides methods for performing low-level, unsafe operations, such as accessing system memory directly and managing memory manually. These methods play an important role in improving Java's runtime efficiency and in strengthening the language's ability to operate on low-level resources. However, Unsafe lets Java manipulate memory much as C does with pointers, which brings the risk of pointer-related bugs into the program. Excessive or incorrect use of Unsafe increases the probability of program errors and makes Java, a safe language, no longer "safe". Unsafe must therefore be used with caution.
Note: this article introduces the public API of sun.misc.Unsafe and its related application scenarios.
2. Basic introduction
As the Unsafe source code below shows, the Unsafe class is implemented as a singleton. It provides a static method, getUnsafe, to obtain the Unsafe instance. The call is legal if and only if the class calling getUnsafe was loaded by the bootstrap class loader; otherwise a SecurityException is thrown.
public final class Unsafe {
    // Singleton object
    private static final Unsafe theUnsafe;

    private Unsafe() {
    }

    @CallerSensitive
    public static Unsafe getUnsafe() {
        Class var0 = Reflection.getCallerClass();
        // Legal only when the caller was loaded by the bootstrap class loader
        if (!VM.isSystemDomainLoader(var0.getClassLoader())) {
            throw new SecurityException("Unsafe");
        } else {
            return theUnsafe;
        }
    }
}
So how do we obtain an instance if we want to use this class? There are two feasible approaches.
First, work within the restriction imposed by getUnsafe: use the java command-line option -Xbootclasspath/a to append the jar path of class A, which calls the Unsafe methods, to the default bootstrap class path. Class A is then loaded by the bootstrap class loader and can safely obtain the Unsafe instance through Unsafe.getUnsafe.
java -Xbootclasspath/a:${path}   // where path is the path of the jar containing the class that calls the Unsafe methods
Second, obtain the singleton object theUnsafe through reflection.
private static Unsafe reflectGetUnsafe() {
    try {
        Field field = Unsafe.class.getDeclaredField("theUnsafe");
        field.setAccessible(true);
        return (Unsafe) field.get(null);
    } catch (Exception e) {
        // log is assumed to be a logger field declared by the enclosing class
        log.error(e.getMessage(), e);
        return null;
    }
}
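The reflective approach can be tried out in isolation. The following is a minimal, self-contained sketch; the class name UnsafeAccess and the method name get are hypothetical, and the sketch throws an exception instead of logging so that it needs no logger. It assumes a JDK on which sun.misc.Unsafe is still reachable through reflection.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical helper class, used here only for illustration.
final class UnsafeAccess {
    private UnsafeAccess() {
    }

    /** Reads the private static singleton field "theUnsafe" through reflection. */
    static Unsafe get() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    public static void main(String[] args) {
        Unsafe unsafe = get();
        // addressSize() returns 4 or 8 depending on the platform
        System.out.println("pointer size: " + unsafe.addressSize());
    }
}
```

Because theUnsafe is a singleton, repeated calls to get() return the same instance, so callers would normally cache it in a static final field.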
3. Function introduction
As shown in the figure above, the APIs provided by Unsafe can be roughly divided into memory operations, CAS, class-related operations, object operations, thread scheduling, system information retrieval, memory barriers, array operations, and so on. The relevant methods and their application scenarios are described in detail below.
4. Memory operation
This part mainly covers methods for allocating, copying, and releasing off-heap memory and for reading and writing values at a given address.
// Allocates memory, equivalent to the malloc function in C
public native long allocateMemory(long bytes);
// Resizes previously allocated memory
public native long reallocateMemory(long address, long bytes);
// Frees memory
public native void freeMemory(long address);
// Sets every byte in a given memory block to a value
public native void setMemory(Object o, long offset, long bytes, byte value);
// Copies memory
public native void copyMemory(Object srcBase, long srcOffset, Object destBase, long destOffset, long bytes);
// Gets the value at a given address, ignoring the access restrictions of modifiers. Similar operations: getInt, getDouble, getLong, getChar, etc.
public native Object getObject(Object o, long offset);
// Sets the value at a given address, ignoring the access restrictions of modifiers. Similar operations: putInt, putDouble, putLong, putChar, etc.
public native void putObject(Object o, long offset, Object x);
// Gets the byte at the given address (the result is defined if and only if the address was allocated by allocateMemory)
public native byte getByte(long address);
// Sets the byte at the given address (the result is defined if and only if the address was allocated by allocateMemory)
public native void putByte(long address, byte x);
The objects we create in Java normally live in heap memory. Heap memory is the part of the Java process's memory that the JVM controls; it follows the JVM's memory management rules, and the JVM manages it uniformly through garbage collection. Off-heap memory, in contrast, lies in memory regions outside the JVM's control. Operating on off-heap memory in Java relies on the native methods that Unsafe provides for this purpose.
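To make the allocate, write, read, and free cycle concrete, here is a minimal sketch. The class OffHeapBytes and its method roundTrip are hypothetical names, and the Unsafe instance is obtained through reflection as described earlier. Note that newer JDK releases deprecate these memory-access methods, so this is an illustration of the API rather than a recommended pattern.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical demo class, used here only for illustration.
final class OffHeapBytes {
    private static Unsafe unsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    /** Writes the bytes 0..length-1 off-heap, reads them back, and returns their sum. */
    static int roundTrip(int length) {
        Unsafe unsafe = unsafe();
        long address = unsafe.allocateMemory(length);  // like malloc: contents are undefined
        try {
            unsafe.setMemory(address, length, (byte) 0); // zero the block first
            for (int i = 0; i < length; i++) {
                unsafe.putByte(address + i, (byte) i);
            }
            int sum = 0;
            for (int i = 0; i < length; i++) {
                sum += unsafe.getByte(address + i);
            }
            return sum;
        } finally {
            unsafe.freeMemory(address);                 // never reclaimed by GC, so free explicitly
        }
    }
}
```

The try/finally block matters here: unlike a heap object, the allocated block is leaked if freeMemory is never called.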
Reasons for using off-heap memory
Shorter garbage collection pauses. Because off-heap memory is managed directly by the operating system rather than by the JVM, using it lets us keep the on-heap memory small, which reduces the impact of GC pauses on the application.
Better I/O performance. I/O communication usually involves copying data from on-heap memory to off-heap memory. Temporary data that is short-lived and frequently copied between the two is therefore best stored off-heap.
Typical application
DirectByteBuffer is an important class through which Java uses off-heap memory. It is typically used as a buffer pool in communication and is widely used in NIO frameworks such as Netty and MINA. DirectByteBuffer's logic for creating, using, and destroying off-heap memory is implemented with the off-heap memory API provided by Unsafe.
The following figure shows the DirectByteBuffer constructor. When a DirectByteBuffer is created, unsafe.allocateMemory allocates the memory and unsafe.setMemory initializes it. A Cleaner object is then constructed to track garbage collection of the DirectByteBuffer object, so that when the DirectByteBuffer is garbage collected, the off-heap memory it allocated is released as well.
So how does building the garbage collection tracking object Cleaner free the off-heap memory?
Cleaner extends PhantomReference, one of Java's four reference types. (The referent of a phantom reference cannot be retrieved through it, and an object that is reachable only through phantom references can be reclaimed in any GC cycle.) PhantomReference is generally used together with a ReferenceQueue to implement notification and resource cleanup when the referent is garbage collected. As shown in the following figure, when an object referenced by a Cleaner is about to be reclaimed, the garbage collector puts the reference onto the pending list in the Reference class, where it waits for the ReferenceHandler to process it. ReferenceHandler is a daemon thread with the highest priority. It continuously processes the references in the pending list and calls the Cleaner's clean method to perform the cleanup.
Therefore, once a DirectByteBuffer is referenced only by its Cleaner (that is, a phantom reference), it can be reclaimed in any GC cycle. When the DirectByteBuffer instance is reclaimed, the ReferenceHandler thread calls the Cleaner's clean method, which releases the off-heap memory through the Deallocator passed in when the Cleaner was created.
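The same "owner object plus deallocator" pattern can be sketched with the public java.lang.ref.Cleaner API (Java 9 and later). This is a swapped-in stand-in, not the internal Cleaner that DirectByteBuffer uses; the class TrackedBuffer and its nested Deallocator are hypothetical, and a boolean flag stands in for the call to freeMemory.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical resource owner, used here only for illustration.
final class TrackedBuffer implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    /** Cleanup action. It must not reference the TrackedBuffer, or the buffer could never become unreachable. */
    private static final class Deallocator implements Runnable {
        private final AtomicBoolean released;

        Deallocator(AtomicBoolean released) {
            this.released = released;
        }

        @Override
        public void run() {
            released.set(true); // stand-in for releasing the off-heap memory
        }
    }

    private final AtomicBoolean released = new AtomicBoolean(false);
    private final Cleaner.Cleanable cleanable;

    TrackedBuffer() {
        // The action runs when this object becomes phantom reachable, or on an explicit clean()
        this.cleanable = CLEANER.register(this, new Deallocator(released));
    }

    boolean isReleased() {
        return released.get();
    }

    @Override
    public void close() {
        cleanable.clean(); // runs the action at most once
    }
}
```

Calling close() releases the resource deterministically; if the owner is simply dropped, the cleaner's thread runs the same action some time after the garbage collector finds the object unreachable.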
5. CAS related
As the source code below shows, this part covers the methods for CAS operations.
/**
 * CAS
 * @param o        the object containing the field to modify
 * @param offset   the offset of the field within the object
 * @param expected the expected value
 * @param update   the new value
 * @return true | false
 */
public final native boolean compareAndSwapObject(Object o, long offset, Object expected, Object update);
public final native boolean compareAndSwapInt(Object o, long offset, int expected, int update);
public final native boolean compareAndSwapLong(Object o, long offset, long expected, long update);
What is CAS? It stands for compare-and-swap, a technique commonly used to implement concurrent algorithms. A CAS operation takes three operands: the memory location, the expected original value, and the new value. When CAS executes, the value at the memory location is compared with the expected original value. If they match, the processor atomically updates the location to the new value; otherwise the processor does nothing. CAS is an atomic CPU instruction (cmpxchg on x86), so it cannot leave the data in an inconsistent state. The CAS methods provided by Unsafe (such as compareAndSwapXXX) are implemented underneath by the CPU instruction cmpxchg.
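The three-operand semantics can be observed through the public AtomicInteger.compareAndSet method, which exposes the same compare-then-update behavior. The sketch below uses a hypothetical class CasDemo; it shows one CAS that succeeds because the expected value matches and one that fails because the expected value is stale.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, used here only for illustration.
final class CasDemo {
    /** Returns { first CAS succeeded, second CAS succeeded, final value }. */
    static Object[] run() {
        AtomicInteger cell = new AtomicInteger(5);

        // Location holds 5 and we expect 5: the update to 6 is applied
        boolean first = cell.compareAndSet(5, 6);

        // Location now holds 6 but we still expect 5: nothing is changed
        boolean second = cell.compareAndSet(5, 7);

        return new Object[] { first, second, cell.get() };
    }
}
```

A failed CAS is not an error; it only tells the caller that another update happened first, which is why CAS is usually wrapped in a retry loop.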
Typical application
CAS is widely used in the java.util.concurrent.atomic classes, Java AQS, ConcurrentHashMap, and other implementations. As shown in the figure below, in the implementation of AtomicInteger the static field valueOffset is the memory offset of the field value. When the AtomicInteger class is initialized, valueOffset is obtained in a static initializer block through Unsafe's objectFieldOffset method. The thread-safe methods of AtomicInteger use valueOffset to locate the memory address of value within the AtomicInteger object, and can therefore update the value field atomically with CAS.
The following figure shows the memory layout of an AtomicInteger object before and after an increment. The base address of the object is baseAddress = "0x110000", and the address of value, obtained as baseAddress + valueOffset, is valueAddress = "0x11000c". The atomic update is then performed with CAS; on success it returns, and otherwise it retries until the update succeeds.
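The offset lookup and retry loop described above can be sketched as a small counter. This is an illustration in the spirit of AtomicInteger, not its actual source: the class CasCounter and the helper countWithThreads are hypothetical, and the Unsafe instance is obtained through reflection.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical counter, used here only for illustration.
final class CasCounter {
    private static final Unsafe UNSAFE = loadUnsafe();
    // Offset of the "value" field relative to the start of a CasCounter object
    private static final long VALUE_OFFSET = valueOffset();

    private volatile int value;

    private static Unsafe loadUnsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    private static long valueOffset() {
        try {
            return UNSAFE.objectFieldOffset(CasCounter.class.getDeclaredField("value"));
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads the current value, then retries the CAS until no other thread interfered. */
    int incrementAndGet() {
        int current;
        do {
            current = UNSAFE.getIntVolatile(this, VALUE_OFFSET);
        } while (!UNSAFE.compareAndSwapInt(this, VALUE_OFFSET, current, current + 1));
        return current + 1;
    }

    int get() {
        return value;
    }

    /** Increments one shared counter from several threads and returns the final value. */
    static int countWithThreads(int threads, int incrementsPerThread) {
        CasCounter counter = new CasCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.incrementAndGet();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }
}
```

With a plain value++ the threads could lose updates; with the CAS loop every increment is applied exactly once, so the final count equals threads times incrementsPerThread.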
6. Thread scheduling
This part covers methods for suspending and resuming threads and for locking.
// Unblocks a thread
public native void unpark(Object thread);
// Blocks the current thread
public native void park(boolean isAbsolute, long time);
// Acquires the object lock (reentrant)
@Deprecated
public native void monitorEnter(Object o);
// Releases the object lock
@Deprecated
public native void monitorExit(Object o);
// Attempts to acquire the object lock
@Deprecated
public native boolean tryMonitorEnter(Object o);
In the source code above, the park and unpark methods suspend and resume threads. A thread is suspended by calling park; after the call it stays blocked until it is unparked, times out, or is interrupted. unpark ends the suspension of the given thread and lets it run normally again.
Typical application
AbstractQueuedSynchronizer, the core class of Java's lock and synchronizer framework, blocks and wakes threads by calling LockSupport.park() and LockSupport.unpark(). LockSupport's park and unpark methods are in turn implemented by calling Unsafe's park and unpark.
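The blocking and wake-up handshake can be sketched with the public LockSupport wrappers. The class ParkDemo and its method parkAndWake are hypothetical. Because park may return spuriously, the worker re-checks a flag in a loop, which is the usage pattern the LockSupport documentation recommends.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

// Hypothetical demo class, used here only for illustration.
final class ParkDemo {
    /** Returns true if the worker thread resumed after being unparked. */
    static boolean parkAndWake() {
        AtomicBoolean go = new AtomicBoolean(false);
        AtomicBoolean woke = new AtomicBoolean(false);

        Thread worker = new Thread(() -> {
            // park() can return without a matching unpark, so always re-check the condition
            while (!go.get()) {
                LockSupport.park();
            }
            woke.set(true);
        });
        worker.start();

        go.set(true);               // publish the condition first
        LockSupport.unpark(worker); // then hand the worker its permit

        try {
            worker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return woke.get();
    }
}
```

The order of the two calls in the main thread matters: setting the flag before unpark guarantees that the worker either sees the flag without parking or is woken after parking. An unpark issued before park is not lost, because it leaves a permit that makes the next park return immediately.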
7. Class related
This part provides methods for working with a Class and its static fields, including locating static fields in memory, defining classes, defining anonymous classes, and checking and ensuring initialization.
// Gets the memory offset of the given static field; the offset is unique and fixed for the given field
public native long staticFieldOffset(Field f);
// Gets the base object used to access the given static field
public native Object staticFieldBase(Field f);
// Determines whether a class needs to be initialized. Usually used when obtaining the static fields of a class (if a class is not initialized, its static fields are not initialized either). Returns false if and only if calling ensureClassInitialized would have no effect.
public native boolean shouldBeInitialized(Class<?> c);
// Ensures that the given class has been initialized. Usually used when obtaining the static fields of a class (if a class is not initialized, its static fields are not initialized either).
public native void ensureClassInitialized(Class<?> c);
// Defines a class. This method skips all JVM security checks. By default, the ClassLoader and ProtectionDomain instances come from the caller
public native Class<?> defineClass(String name, byte[] b, int off, int len, ClassLoader loader, ProtectionDomain protectionDomain);
// Defines an anonymous class
public native Class<?> defineAnonymousClass(Class<?> hostClass, byte[] data, Object[] cpPatches);
Typical application
Starting with Java 8, the JDK uses invokedynamic and VM anonymous classes to implement lambda expressions at the Java language level.
invokedynamic: invokedynamic is a virtual machine instruction introduced in Java 7 to support dynamic languages on the JVM. It resolves the method referenced by the call site specifier dynamically at run time and then executes that method. The dispatch logic of the invokedynamic instruction is determined by a bootstrap method set by the user.
VM anonymous class: this can be regarded as a template mechanism. When a program dynamically generates many classes that share the same structure but differ in their constants, it can first create a template class containing constant placeholders and then, when defining a concrete class with Unsafe.defineAnonymousClass, fill in the placeholders to produce a concrete anonymous class. The generated anonymous class is not explicitly attached to any ClassLoader. As soon as the class has no instances and there is no strong reference to its Class object, it can be reclaimed by GC. Compared with anonymous inner classes at the Java language level, a VM anonymous class therefore does not need to be loaded through a ClassLoader and is easier to reclaim.
In the implementation of lambda expressions, the invokedynamic instruction calls the bootstrap method to produce a call site. During this process, bytecode is generated dynamically with ASM, Unsafe's defineAnonymousClass method defines an anonymous class that implements the corresponding functional interface, the anonymous class is instantiated, and a call site bound to the method handle of the functional method in this anonymous class is returned. The logic defined by the lambda expression can then be invoked through this call site. The Test class shown in the following figure serves as an example.
The decompiled class file of the Test class is shown in Figure 1 below (the parts irrelevant to this discussion are removed). We can see the instructions of the main method, the bootstrap methods called by the invokedynamic instruction, and the static method lambda$main$0 (which implements the string-printing logic of the lambda expression). While the bootstrap method executes, it uses Unsafe.defineAnonymousClass to generate the anonymous class shown in Figure 2 below, which implements the Consumer interface. Its accept method implements the logic defined in the lambda expression by calling the static method lambda$main$0 of the Test class. Executing the statement consumer.accept("lambda") then actually calls the accept method of the anonymous class shown in Figure 2.
8. Object operation
This part covers operations on object member fields and an unconventional way of instantiating objects.
// Returns the offset of a member field relative to the memory address of its object
public native long objectFieldOffset(Field f);
// Gets the value at the given offset of the given object. Similar operations: getInt, getDouble, getLong, getChar, etc.
public native Object getObject(Object o, long offset);
// Sets the value at the given offset of the given object. Similar operations: putInt, putDouble, putLong, putChar, etc.
public native void putObject(Object o, long offset, Object x);
// Gets the reference stored at the given offset of the object, with volatile load semantics
public native Object getObjectVolatile(Object o, long offset);
// Stores the reference at the given offset of the object, with volatile store semantics
public native void putObjectVolatile(Object o, long offset, Object x);
// Ordered, lazy version of putObjectVolatile; it does not guarantee that the change is immediately visible to other threads. Useful only if the field is declared volatile
public native void putOrderedObject(Object o, long offset, Object x);
// Creates an object while bypassing constructors and initialization code
public native Object allocateInstance(Class<?> cls) throws InstantiationException;
Typical application
Conventional instantiation: the objects we normally create are all, in essence, created through the new mechanism. One feature of new is that when a class provides only constructors with parameters and does not explicitly declare a no-argument constructor, an object can be constructed only through a constructor with parameters, and the corresponding arguments must be passed to complete instantiation.
Unconventional instantiation: Unsafe provides the allocateInstance method, which creates an instance from the Class object alone, without invoking any constructor, initialization code, or JVM security checks. It also bypasses modifier checks: even if the constructor is private, the class can be instantiated through this method simply by supplying its Class object. Because of this, allocateInstance is used in java.lang.invoke, in Objenesis (which provides a way to create objects that bypasses class constructors), and in Gson (during deserialization).
As shown in the following figure, when Gson deserializes a class that has a default constructor, it creates the instance by calling that constructor through reflection. Otherwise it constructs the instance through UnsafeAllocator, which instantiates the object by calling Unsafe's allocateInstance, so that deserialization is not affected when the target class has no default constructor.
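The difference between the two instantiation paths can be sketched as follows. The class AllocateInstanceDemo and its nested Account class are hypothetical; Account has no no-argument constructor and a field initializer, so the effect of skipping the constructor is visible in the field values.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical demo class, used here only for illustration.
final class AllocateInstanceDemo {
    /** A class whose only constructor requires an argument. */
    static final class Account {
        int balance = 100;   // field initializers are compiled into the constructor
        final String owner;

        Account(String owner) {
            this.owner = owner;
        }
    }

    private static Unsafe unsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    /** Conventional path: the constructor and field initializers run. */
    static Account viaConstructor() {
        return new Account("alice");
    }

    /** Unconventional path: memory is allocated, but no constructor code runs. */
    static Account viaUnsafe() {
        try {
            return (Account) unsafe().allocateInstance(Account.class);
        } catch (InstantiationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

An object created by viaUnsafe has every field at its default value (0 and null here), so any invariant the constructor would normally establish does not hold. A deserializer that uses this path must populate the fields itself afterwards.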
9. Array related
This part introduces arrayBaseOffset and arrayIndexScale, the two methods for array operations. Used together, they locate each element of an array in memory.
// Returns the offset of the first element in the array
public native int arrayBaseOffset(Class<?> arrayClass);
// Returns the size occupied by one element of the array
public native int arrayIndexScale(Class<?> arrayClass);
Typical application
These two array methods are typically used in AtomicIntegerArray in the java.util.concurrent.atomic package, which provides atomic operations on each element of an int array. As shown in the AtomicIntegerArray source code in the figure below, Unsafe's arrayBaseOffset and arrayIndexScale return the offset base of the first array element and the size factor scale of a single element. Subsequent atomic operations rely on these two values to locate elements in the array. The getAndAdd method shown in Figure 2 below obtains the offset of an array element through the checkedByteOffset method and then performs the atomic operation with CAS.
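The address arithmetic is simply base + index * scale. The sketch below applies it to read an element of an int array through Unsafe; the class ArrayOffsetDemo and the method readElement are hypothetical, and the explicit bounds check is added here because Unsafe itself performs none.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical demo class, used here only for illustration.
final class ArrayOffsetDemo {
    private static Unsafe unsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    /** Reads array[index] by computing the element's offset within the array object. */
    static int readElement(int[] array, int index) {
        if (index < 0 || index >= array.length) {
            // Unsafe does no bounds checking; an unchecked index could read arbitrary memory
            throw new IndexOutOfBoundsException("index " + index);
        }
        Unsafe unsafe = unsafe();
        long base = unsafe.arrayBaseOffset(int[].class);   // offset of element 0
        long scale = unsafe.arrayIndexScale(int[].class);  // bytes per element
        long offset = base + (long) index * scale;
        return unsafe.getInt(array, offset);
    }
}
```

For example, readElement(new int[] {10, 20, 30}, 2) reads the same value as array[2]. An atomic array class uses the same offset as the target of a CAS instead of a plain read.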
10. Memory barrier
Memory barrier methods were introduced in Java 8. A memory barrier (also called a memory fence or barrier instruction) is a type of synchronization barrier instruction. It is a synchronization point in the memory accesses performed by the CPU or compiler: all reads and writes before this point must complete before any operation after it may begin, which prevents code reordering.
// Memory barrier that prevents reordering of load operations. Loads before the barrier cannot be reordered after it, and loads after the barrier cannot be reordered before it
public native void loadFence();
// Memory barrier that prevents reordering of store operations. Stores before the barrier cannot be reordered after it, and stores after the barrier cannot be reordered before it
public native void storeFence();
// Memory barrier that prevents reordering of both load and store operations
public native void fullFence();
Typical application
Java 8 introduced a new lock mechanism, StampedLock, which can be seen as an improved read-write lock. StampedLock provides an optimistic read lock that behaves much like a lock-free operation: it does not block writer threads from acquiring the write lock at all, which relieves writer "starvation" in read-heavy, write-light workloads. Because the optimistic read lock does not block writers from acquiring the write lock, a thread may see inconsistent data when it loads shared variables from main memory into its working memory. When using StampedLock's optimistic read lock, it is therefore necessary to follow the pattern in the use case below to ensure data consistency.
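A minimal sketch of this pattern, modeled on the Point example in the StampedLock documentation, looks as follows. The step markers ①, ②, and ③ in the comments are added here for orientation and are an assumption about how the original figure numbered its steps.

```java
import java.util.concurrent.locks.StampedLock;

// Coordinate point protected by a StampedLock (sketch after the StampedLock documentation).
final class Point {
    private double x;
    private double y;
    private final StampedLock lock = new StampedLock();

    /** Moves the point under the exclusive write lock. */
    void move(double deltaX, double deltaY) {
        long stamp = lock.writeLock();
        try {
            x += deltaX;
            y += deltaY;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    /** Computes the distance to the origin, reading optimistically first. */
    double distanceFromOrigin() {
        long stamp = lock.tryOptimisticRead();   // ① obtain an optimistic read stamp
        double currentX = x;                     // ② load the coordinates
        double currentY = y;
        if (!lock.validate(stamp)) {             // ③ check whether a write happened meanwhile
            stamp = lock.readLock();             // fall back to a pessimistic read lock
            try {
                currentX = x;
                currentY = y;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return Math.sqrt(currentX * currentX + currentY * currentY);
    }
}
```

The values loaded in step ② are used only after validate confirms them; if validation fails they are discarded and reloaded under the read lock.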
The use case above is a coordinate Point class with a method move that moves the point and a method distanceFromOrigin that computes the distance from the point to the origin. distanceFromOrigin first obtains an optimistic read stamp through tryOptimisticRead, then loads the coordinates (x, y) from main memory, and then checks the lock state through StampedLock's validate method to determine whether another thread modified the values in main memory through move while (x, y) was being loaded into the thread's working memory. If validate returns true, (x, y) was not modified and can be used in the subsequent calculation. Otherwise the method acquires a pessimistic read lock, loads the latest (x, y) from main memory again, and then computes the distance. Validating the lock state is essential: it determines whether the lock state has changed, and hence whether the values previously copied to the thread's working memory are inconsistent with main memory.
The following figure shows the source code of StampedLock.validate, which checks the lock state by applying bit operations to the stamp and comparing it with the relevant constants. Before this check, a load barrier is inserted through Unsafe's loadFence method so that step ② in the use case above and the lock state check in StampedLock.validate are not reordered, which would make the check inaccurate.
11. System related
This part contains two methods for obtaining system information.
// Returns the size of a native pointer: 4 (32-bit system) or 8 (64-bit system)
public native int addressSize();
// Returns the size of a memory page, which is a power of 2
public native int pageSize();
Typical application
The code fragment in the figure below is a static method in the utility class java.nio.Bits that computes the number of memory pages needed for a requested allocation. It relies on Unsafe's pageSize method to obtain the system memory page size for the subsequent calculation.
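The page calculation is a ceiling division of the requested size by the page size. The sketch below illustrates it; the class PageMath and its methods are hypothetical and are not the source of java.nio.Bits.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical demo class, used here only for illustration.
final class PageMath {
    private static Unsafe unsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    /** Page size reported by the system, for example 4096 on many platforms. */
    static int systemPageSize() {
        return unsafe().pageSize();
    }

    /** Number of pages needed to hold size bytes: size divided by pageSize, rounded up. */
    static int pageCount(long size, int pageSize) {
        return (int) ((size + (long) pageSize - 1L) / pageSize);
    }

    /** Same calculation using the system page size. */
    static int pageCount(long size) {
        return pageCount(size, systemPageSize());
    }
}
```

Adding pageSize - 1 before dividing makes a partially used last page count as a full page, so one byte needs one page and pageSize + 1 bytes need two.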
12. Conclusion
This article has introduced the usage and application scenarios of sun.misc.Unsafe in Java. Unsafe provides many convenient and interesting API methods. Even so, because it contains a large number of methods that manipulate memory directly, improper use can cause serious, hard-to-control failures in a program. It should therefore be used with caution.