[Super hardcore] Reading the JVM source code: how the Java main() method is interpreted and executed by the virtual machine

Posted by amargharat on Thu, 11 Nov 2021 11:17:34 +0100

This article is compiled and published by jiumo (Ma Zhi), chief lecturer of HeapDump performance community

Chapter 1 - About the HotSpot Java virtual machine: a simple starting point

Starting with the Java runtime, this article covers some basics: how does the Java virtual machine call the main() method of our main class? Some methods in Java classes are called from C/C++ functions of the HotSpot virtual machine, which is itself written in C/C++. Because Java methods and C/C++ functions use different calling conventions, one cannot call the other directly; the JavaCalls::call() function is needed as an intermediary, as shown in the following figure. (Throughout, I say "function" for code written in C/C++ and "method" for code written in Java.)

Some of the Java methods that are invoked from C/C++ functions are:

(1) the main() method of the Java main class;

(2) when the Java main class is loaded, JavaCalls::call() is called to execute the checkAndLoadMain() method;

(3) during class initialization, the Java class initialization method is called through JavaCalls::call(); see also the JavaCalls::call_default_constructor() function, which contains the logic for calling the constructor;

(4) leaving aside the execution of the main method (in fact, running the main method first starts a JavaMain thread, and the routine is the same), let's look only at how a JavaThread starts. Starting a Java thread is ultimately done through the native method java.lang.Thread#start0(), which goes through the interpreter's native_entry and ends up in the JVM_StartThread() function. The function static void thread_entry(JavaThread* thread, TRAPS) calls JavaCalls::call_virtual(). A JavaThread therefore eventually calls the run() method in bytecode through JavaCalls::call_virtual();

(5) SystemDictionary::load_instance_class() is a function that reflects parent delegation. If the class loader object is not null, it calls the class loader's loadClass() method (through call_virtual()) to load the class.

Of course, there are other such methods, which are not listed here. Java methods are called through functions such as JavaCalls::call() and JavaCalls::call_helper(). These functions are defined in the JavaCalls class, whose definition is as follows:

Source code location: openjdk/hotspot/src/share/vm/runtime/javaCalls.hpp

class JavaCalls: AllStatic {
  static void call_helper(JavaValue* result, methodHandle* method, JavaCallArguments* args, TRAPS);
 public:

  static void call_default_constructor(JavaThread* thread, methodHandle method, Handle receiver, TRAPS);

  // Use the following functions to call special Java methods, such as the class initialization method <clinit>
  // receiver is the receiver of the method. For example, in the call A.main(), A is the receiver of the method
  static void call_special(JavaValue* result, KlassHandle klass, Symbol* name,Symbol* signature, JavaCallArguments* args, TRAPS);
  static void call_special(JavaValue* result, Handle receiver, KlassHandle klass,Symbol* name, Symbol* signature, TRAPS); 
  static void call_special(JavaValue* result, Handle receiver, KlassHandle klass,Symbol* name, Symbol* signature, Handle arg1, TRAPS);
  static void call_special(JavaValue* result, Handle receiver, KlassHandle klass,Symbol* name, Symbol* signature, Handle arg1, Handle arg2, TRAPS);

  // Use the following function to call some methods of dynamic dispatch
  static void call_virtual(JavaValue* result, KlassHandle spec_klass, Symbol* name,Symbol* signature, JavaCallArguments* args, TRAPS);
  static void call_virtual(JavaValue* result, Handle receiver, KlassHandle spec_klass,Symbol* name, Symbol* signature, TRAPS); 
  static void call_virtual(JavaValue* result, Handle receiver, KlassHandle spec_klass,Symbol* name, Symbol* signature, Handle arg1, TRAPS);
  static void call_virtual(JavaValue* result, Handle receiver, KlassHandle spec_klass,Symbol* name, Symbol* signature, Handle arg1, Handle arg2, TRAPS);

  // Call a Java static method using the following function
  static void call_static(JavaValue* result, KlassHandle klass,Symbol* name, Symbol* signature, JavaCallArguments* args, TRAPS);
  static void call_static(JavaValue* result, KlassHandle klass,Symbol* name, Symbol* signature, TRAPS);
  static void call_static(JavaValue* result, KlassHandle klass,Symbol* name, Symbol* signature, Handle arg1, TRAPS);
  static void call_static(JavaValue* result, KlassHandle klass,Symbol* name, Symbol* signature, Handle arg1, Handle arg2, TRAPS);

  // For lower level interfaces, some of the above functions may eventually call the following functions
  static void call(JavaValue* result, methodHandle method, JavaCallArguments* args, TRAPS);
};

The functions above are self-explanatory; their names tell us what they do. JavaCalls::call() is the lower-level general interface. The Java virtual machine specification defines five method invocation bytecode instructions: invokestatic, invokedynamic, invokeinterface, invokespecial and invokevirtual. Functions such as call_static() and call_virtual() call the call() function internally. This section does not go into the implementation of each function; the next article covers that in detail.
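The layering described here, with convenience functions that pack arguments and then funnel into one general call(), can be sketched with a toy model. Everything below (ToyArgs, ToyMethod, ToyValue and the toy_* functions) is a hypothetical stand-in for illustration, not HotSpot code; in particular the toy skips method resolution and exception handling entirely.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for JavaCallArguments, methodHandle and JavaValue.
struct ToyArgs {
    std::vector<intptr_t> words;              // argument words in call order
    void push(intptr_t w) { words.push_back(w); }
};
typedef intptr_t (*ToyMethod)(const ToyArgs&); // an already "resolved" target
struct ToyValue { intptr_t value = 0; };

// Lowest level: the only place that actually invokes the target.
inline void toy_call_helper(ToyValue* result, ToyMethod m, const ToyArgs& args) {
    assert(m != nullptr);
    result->value = m(args);
}

// General interface: every convenience function ends up here.
inline void toy_call(ToyValue* result, ToyMethod m, const ToyArgs& args) {
    toy_call_helper(result, m, args);
}

// Convenience overloads: pack 0, 1 or 2 arguments, then delegate.
inline void toy_call_static(ToyValue* result, ToyMethod m) {
    ToyArgs args;
    toy_call(result, m, args);
}
inline void toy_call_static(ToyValue* result, ToyMethod m, intptr_t arg1) {
    ToyArgs args;
    args.push(arg1);
    toy_call(result, m, args);
}
inline void toy_call_static(ToyValue* result, ToyMethod m,
                            intptr_t arg1, intptr_t arg2) {
    ToyArgs args;
    args.push(arg1);
    args.push(arg2);
    toy_call(result, m, args);
}

// Example target: sums its argument words.
inline intptr_t toy_sum(const ToyArgs& a) {
    intptr_t s = 0;
    for (intptr_t w : a.words) s += w;
    return s;
}
```

Calling toy_call_static(&r, toy_sum, 3, 4) packs both values into a ToyArgs and reaches toy_sum only through toy_call() and toy_call_helper(), which is the shape the JavaCalls overloads share.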

We pick an important one, the main() method, to look at the concrete call logic. The following is largely taken from R's write-up, with some modifications of my own:

Suppose the Java main class is named JavaMainClass. To distinguish the C/C++ main() in the java launcher from the main() of the Java program, the latter is written as JavaMainClass.main() below.
Start with the C/C++ main() function:

The thread that starts the HotSpot virtual machine and calls main() executes the following main logic:

main()
-> //... do some parameter checks
-> //... start a new thread as the main thread and have it begin executing at the JavaMain() function; the current thread then waits for that main thread to finish

In the above thread, another thread will be started to execute JavaMain() function, as follows:

JavaMain()
-> //... find the specified JVM
-> //... load and initialize the JVM
-> //... load JavaMainClass according to the class name specified by main class
-> //... find a method named "main" in the JavaMainClass class, with the signature "([Ljava/lang/String;)V", and the modifier is a public static method
-> (*env)->CallStaticVoidMethod(env, mainClass, mainID, mainArgs); // Call JavaMainClass.main() method through JNI

The steps above are still under the control of the java launcher. Once control is transferred to the JavaMainClass.main() method, the java launcher is no longer involved. After JavaMainClass.main() returns, the java launcher takes over again to clean up and shut down the JVM.

Let's look at the main functions and the main logic executed when the main() method of the Java main class is called:

// HotSpot VM's implementation of JNI's CallStaticVoidMethod. Note the arguments to be passed to the Java method:
// they arrive as C variadic arguments, and this function wraps them in a JNI_ArgumentPusherVaArg object
-> jni_CallStaticVoidMethod()

     // Here, the parameters to be passed to Java are further converted into JavaCallArguments objects and passed down    
     -> jni_invoke_static()

        // Start of the real low-level implementation. This function is only a thin wrapper: it wraps JavaCalls::call_helper()
        // with os::os_exception_wrapper() to set up exception handling at the C++ level of the HotSpot VM
        -> JavaCalls::call()   

           -> JavaCalls::call_helper()
              -> //... check whether the target method is empty. If yes, return directly
              -> //... check whether the target method "must be compiled before the first execution". If so, call the JIT compiler to compile the target method
              -> //... get the interpreted-mode entry of the target method, from_interpreted_entry, referred to below as entry_point
              -> //... ensure that the Java stack overflow checking mechanism starts correctly
              -> //... create a JavaCallWrapper to manage the allocation and release of JNIHandleBlock,
                 // And saving and restoring Java frame pointer/stack pointer before and after calling Java methods

              //... StubRoutines::call_stub() returns a function pointer to call stub,
              // Then call this call stub and pass in the previously obtained entry_point and parameters to be passed to Java methods
              -> StubRoutines::call_stub()(...) 
                 // The call stub is generated during VM initialization. The corresponding code is in
                 // the StubGenerator::generate_call_stub() function
                 -> //... adjust the relevant registers to the state required by the interpreter
                 -> //... unpack the arguments for the Java method from the JavaCallArguments object into
                    // the positions required by the interpreted-mode calling convention
                 -> //... jump to the entry_point passed in earlier, that is, the from_interpreted_entry of the target method

                    -> //... in -Xcomp mode this actually jumps into the i2c adapter stub, which moves the arguments passed under
                       // the interpreted-mode calling convention to the locations required by the compiled-mode calling convention
                           -> //... jump to the JIT compiled code of the target method, that is, jump to the position pointed by the VEP of nmethod
                                -> //... officially start executing the JIT compiled code of the target method < - here is the "real entrance to the main() method"

The last three steps above belong to compiled execution. We will start with interpreted execution instead, so the virtual machine must be run with the -Xint option. With this option, the main() method of the Java main class is interpreted.

In the process of calling the main() method of the Java main class, we see that the virtual machine indirectly calls the main() method through the JavaCalls::call() function. In the next article, we will study the specific call logic.

Chapter 2 - How the Java virtual machine calls the main() method of the Java main class

Chapter 1 briefly introduced functions such as call_static() and call_virtual(), all of which call the JavaCalls::call() function. Now let's look at how the main() method of the Java main class is called. The call stack is as follows:

JavaCalls::call_helper() at javaCalls.cpp    
os::os_exception_wrapper() at os_linux.cpp    
JavaCalls::call() at javaCalls.cpp
jni_invoke_static() at jni.cpp    
jni_CallStaticVoidMethod() at jni.cpp    
JavaMain() at java.c
start_thread() at pthread_create.c
clone() at clone.S

This is the call stack on Linux; the main() method is executed through the JavaCalls::call_helper() function at the top. The stack starts at clone(), which creates a separate stack for each thread (on Linux, a Java thread corresponds to a lightweight process), as shown in the following figure.

On Linux the stack grows toward lower addresses, so the unused stack space lies below the used stack space. Each small blue box in the figure represents the stack frame of one function or method, and the stack is made up of these frames one after another. Native method frames, Java interpreted frames and Java compiled frames are all allocated in the yellow area, so they live on the host (native) stack. These different kinds of frames sit right next to each other, so there is no fragmentation, and this layout makes stack traversal very convenient. The call stack shown above was obtained by traversing the stack frames, and that traversal is exactly stack unwinding. Later operations such as exception handling, running jstack to print thread stacks, and GC root scanning all unwind the stack, so stack unwinding will have to be covered later.

Let's continue to look at JavaCalls::call_helper() function. There is a very important call in this function, as follows:

// do call
{
    JavaCallWrapper link(method, receiver, result, CHECK);
    {
      HandleMark hm(thread);  // HandleMark used by HandleMarkCleaner
      StubRoutines::call_stub()(
         (address)&link,
         result_val_address,              
         result_type,
         method(),
         entry_point,
         args->parameters(),
         args->size_of_parameters(),
         CHECK
      );

      result = link.result();  
      // Preserve oop return value across possible gc points
      if (oop_result_flag) {
        thread->set_vm_result((oop) result->get_jobject());
      }
    }
}

StubRoutines::call_stub() returns a function pointer, and the function it points to is then called through that pointer. Calling through a function pointer works the same way as calling a function by name. What matters here is that, from the caller's point of view, the target is still treated as a C/C++ function, so the call must follow the C/C++ calling convention. That convention specifies how arguments are passed to the callee and where the callee's return value is stored.

Let's briefly review the C/C++ calling convention on Linux x86-64. Under this convention, the first six integer or pointer arguments are passed in the following registers (HotSpot's alias is given in parentheses):

1st parameter: rdi (c_rarg0)
2nd parameter: rsi (c_rarg1)
3rd parameter: rdx (c_rarg2)
4th parameter: rcx (c_rarg3)
5th parameter: r8 (c_rarg4)
6th parameter: r9 (c_rarg5)

When a function takes six or fewer arguments, they are passed in the registers above; HotSpot refers to these registers through the more readable aliases c_rarg*. If there are more than six arguments, the extra ones are passed on the call stack.

How many arguments did we pass in the call through the function pointer? Eight, so the last two must be passed on the caller's stack. These two arguments are args->size_of_parameters() and CHECK (a macro that expands to pass the current thread object).
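As a small worked example of this rule, the sketch below lists where the caller places each of n integer or pointer arguments under the Linux x86-64 convention. The function name classify_int_args is made up for this illustration, and it deliberately ignores floating-point and aggregate arguments, which follow different rules.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Integer/pointer argument registers in order; HotSpot's aliases are
// c_rarg0 .. c_rarg5.
static const char* const kArgRegs[6] = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};

// For each of n integer/pointer arguments, report where the caller puts it.
inline std::vector<std::string> classify_int_args(int n) {
    assert(n >= 0);
    std::vector<std::string> where;
    for (int i = 0; i < n; ++i) {
        if (i < 6) {
            where.push_back(kArgRegs[i]);   // first six go in registers
        } else {
            where.push_back("stack");       // the rest go on the caller's stack
        }
    }
    return where;
}
```

For the eight arguments of the call stub, classify_int_args(8) yields the six registers followed by two "stack" entries, matching the parameter size and the thread being passed on the stack.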

Therefore, when calling the function pointed to by the function pointer, our call stack changes to the following state:

On the right are the contents of the call_helper() stack frame: the thread and the parameter size have been pushed onto the call stack. When the target function is actually called, a new stack frame is created, and the return address and the caller's frame base (the saved %rbp) are pushed after the parameter size. We will cover this in detail in the next chapter. First, let's go through the implementation of the JavaCalls::call_helper() function in three parts.

1. Check whether the target method "must be compiled before the first execution". If so, call the JIT compiler to compile the target method;

The code implementation is as follows:

void JavaCalls::call_helper(
 JavaValue* result, 
 methodHandle* m, 
 JavaCallArguments* args, 
 TRAPS
) {
  methodHandle method = *m;
  JavaThread* thread = (JavaThread*)THREAD;
  ...

  assert(!thread->is_Compiler_thread(), "cannot compile from the compiler");
  if (CompilationPolicy::must_be_compiled(method)) {
    CompileBroker::compile_method(method, InvocationEntryBci,
                                  CompilationPolicy::policy()->initial_compile_level(),
                                  methodHandle(), 0, "must_be_compiled", CHECK);
  }
  ...
}

For the main() method, if the -Xint option is set, execution is interpreted, so the compile_method() call above is not reached. Later, when we study compiled execution, we can force compilation and then examine that path.
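The effect of the execution mode on this check can be sketched as follows. This is a deliberately simplified model: ToyModeFlags and toy_must_be_compiled are hypothetical names, and the rule used here (compile up front only when the interpreter is unavailable and the method is compilable) is an assumption for illustration, not the full logic of CompilationPolicy::must_be_compiled().

```cpp
#include <cassert>

// Hypothetical, simplified view of the execution-mode flags.
struct ToyModeFlags {
    bool use_interpreter;  // false only when everything must be compiled
    bool use_compiler;     // false under -Xint
};

inline ToyModeFlags toy_xint()   { ToyModeFlags f = {true,  false}; return f; }
inline ToyModeFlags toy_xmixed() { ToyModeFlags f = {true,  true};  return f; }
inline ToyModeFlags toy_xcomp()  { ToyModeFlags f = {false, true};  return f; }

// ASSUMPTION: a method "must be compiled before first execution" only when
// the interpreter cannot be used and the method can be compiled at all.
inline bool toy_must_be_compiled(const ToyModeFlags& f, bool can_be_compiled) {
    return can_be_compiled && !f.use_interpreter;
}
```

Under this model toy_must_be_compiled() is false for -Xint and for the default mixed mode, and true only for -Xcomp, which is why the compile_method() branch is skipped when main() is interpreted.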

2. Get the interpreted-mode entry of the target method, from_interpreted_entry, which becomes the value of entry_point. The code at entry_point prepares the stack frame for the Java method call and points the bytecode pointer at the memory address of the method's first bytecode. entry_point can be seen as a wrapper around the method; different kinds of methods have different entry_points.

Next, look at the code implementation of the call_helper() function, as follows:

address entry_point = method->from_interpreted_entry();

Calling the method's from_interpreted_entry() function returns the value of the _from_interpreted_entry field of the Method instance. Where is this value set? We will cover that in detail later.
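The idea that each kind of method has its own entry can be pictured as a table of entry routines indexed by method kind. The sketch below is only an illustration: the three kinds, the ToyEntry type and the entry_* functions are invented here, and real entries are addresses of generated machine code rather than C++ functions.

```cpp
#include <cassert>

// Hypothetical method kinds; a real interpreter distinguishes more.
enum ToyMethodKind { kNormal, kSynchronized, kNative, kKindCount };

// Stand-in for the address of an entry routine.
typedef const char* (*ToyEntry)();

inline const char* entry_normal()       { return "build frame, run bytecodes"; }
inline const char* entry_synchronized() { return "lock, build frame, run bytecodes"; }
inline const char* entry_native()       { return "build frame, call native code"; }

// Look up the entry routine for a method kind, as a method would when its
// from_interpreted_entry value is filled in.
inline ToyEntry entry_for(ToyMethodKind kind) {
    static const ToyEntry table[kKindCount] = {
        entry_normal, entry_synchronized, entry_native
    };
    assert(kind >= 0 && kind < kKindCount);
    return table[kind];
}
```

Two methods of the same kind share one entry routine, while a native method and an ordinary bytecode method get different ones.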

3. Call the call_stub() function, which takes 8 arguments. The code was shown earlier and is not repeated here. The arguments are as follows:

(1) link: a variable of type JavaCallWrapper. It is very important for stack unwinding and will be described in detail later;

(2) result_val_address: the address where the function's return value is stored;

(3) result_type: the type of the function's return value;

(4) method(): the method about to be executed. Through this argument, all metadata of the Java method can be obtained, including, most importantly, its bytecode, so that the method can be interpreted according to the bytecode;

(5) entry_point: every time HotSpot calls a Java method, it must go through the CallStub function pointer. The value of this function pointer is taken from _call_stub_entry, which HotSpot uses to point to the address of the function to call. Before the Java method itself runs, control must pass through entry_point; HotSpot actually uses entry_point to reach the first bytecode instruction of the Java method from the method() object, so it is the call entry of the whole method;

(6) args->parameters(): the arguments for the Java method;

(7) args->size_of_parameters(): the size, in words, of the memory occupied by the arguments;

(8) CHECK: the current thread object.

The most important of these is entry_point, which is also the subject of the next chapter.

Chapter 3 - Creating the new CallStub stack frame

In the previous chapter, Chapter 2, we saw that the call_helper() function calls a function through a function pointer, as follows:

StubRoutines::call_stub()(
         (address)&link,
         result_val_address,              
         result_type,
         method(),
         entry_point,
         args->parameters(),
         args->size_of_parameters(),
         CHECK
);

Here StubRoutines::call_stub() returns a function pointer. Finding the implementation of the function it points to is the focus of this chapter. The call_stub() function is implemented as follows:

Source code location: openjdk/hotspot/src/share/vm/runtime/stubRoutines.hpp

static CallStub  call_stub() { 
    return CAST_TO_FN_PTR(CallStub, _call_stub_entry); 
}

call_stub() returns a function pointer to code that depends on the operating system and the CPU architecture. The reason is simple: native code has to know the CPU architecture to decide which registers to use, and the operating system to decide the ABI.

Here CAST_TO_FN_PTR is a macro, defined as follows:

Source code location: openjdk/hotspot/src/share/vm/utilities/globalDefinitions.hpp

#define CAST_TO_FN_PTR(func_type, value) ((func_type)(castable_address(value)))

After macro expansion, the call_stub() function takes the following form:

static CallStub call_stub(){
    return (CallStub)( castable_address(_call_stub_entry) );
}

CallStub is defined as follows:

Source code location: openjdk/hotspot/src/share/vm/runtime/stubRoutines.hpp

typedef void (*CallStub)(
    // Connector
    address   link,    
    // Function return value address
    intptr_t* result, 
    // Function return type
    BasicType result_type, 
    // Java method objects represented inside the JVM
    Method*   method, 
    // Entry of the routine the JVM uses to call the Java method. Each routine
    // inside the JVM is a piece of machine code generated in advance during JVM startup.
    // To call a Java method, control must pass through this routine:
    // its machine code has to run before jumping to the machine code
    // that corresponds to the Java method's bytecodes
    address   entry_point, 
    intptr_t* parameters,
    int       size_of_parameters,
    TRAPS
);

A function pointer type is defined above; the function it points to declares 8 formal parameters.

The castable_address() function called in call_stub() is defined in the globalDefinitions.hpp file as follows:

inline address_word  castable_address(address x)  { 
    return address_word(x) ; 
}

address_word is a typedef. It is defined in the globalDefinitions.hpp file as follows:

typedef   uintptr_t     address_word;

Here uintptr_t is also a typedef. For Linux, the definition in the globalDefinitions_gcc.hpp file is as follows:

typedef  unsigned int  uintptr_t;

The call_stub() function is therefore effectively equivalent to the following:

static CallStub call_stub(){
    return (CallStub)( unsigned int(_call_stub_entry) );
}

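_call_stub_entry is cast to an unsigned integer type and then to the CallStub type. (On a 64-bit platform the integer type has to be wide enough to hold a pointer, so uintptr_t is effectively a 64-bit unsigned integer there rather than a 32-bit unsigned int.) CallStub is a function pointer type, so _call_stub_entry must hold the address of executable code, not an ordinary unsigned integer.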
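The round trip from a raw address through a pointer-sized integer to a callable function pointer can be tried out in isolation. In this sketch toy_address, ToyStub, toy_stub_entry and the toy_* functions are hypothetical names, and an ordinary C++ function stands in for the generated machine code; the casts rely on function pointers and data pointers having the same size, which holds on the mainstream platforms HotSpot targets.

```cpp
#include <cassert>
#include <cstdint>

typedef unsigned char* toy_address;   // plays the role of HotSpot's "address"
typedef int (*ToyStub)(int, int);     // hypothetical stub signature

static_assert(sizeof(uintptr_t) >= sizeof(void*),
              "uintptr_t must be able to hold a pointer");

// Stand-in for generated machine code: an ordinary function.
static int toy_add(int a, int b) { return a + b; }

// Plays the role of _call_stub_entry: just a raw address, initially null.
static toy_address toy_stub_entry = nullptr;

// "Generation" step: record where the code lives as a plain address.
static void toy_init() {
    toy_stub_entry =
        reinterpret_cast<toy_address>(reinterpret_cast<uintptr_t>(&toy_add));
}

// Same shape as call_stub(): address -> integer -> function pointer.
static ToyStub toy_call_stub() {
    assert(toy_stub_entry != nullptr);
    return reinterpret_cast<ToyStub>(
        reinterpret_cast<uintptr_t>(toy_stub_entry));
}
```

After toy_init(), the expression toy_call_stub()(2, 3) has the same double-parenthesis shape as StubRoutines::call_stub()(...): the first pair obtains the pointer and the second pair makes the call.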

The _call_stub_entry used in the call_stub() function is defined as follows:

address StubRoutines::_call_stub_entry = NULL;

_call_stub_entry is initialized in the generate_initial() function in the openjdk/hotspot/src/cpu/x86/vm/stubGenerator_x86_64.cpp file. The call chain is as follows:

StubGenerator::generate_initial()   stubGenerator_x86_64.cpp    
StubGenerator::StubGenerator()      stubGenerator_x86_64.cpp
StubGenerator_generate()            stubGenerator_x86_64.cpp    
StubRoutines::initialize1()         stubRoutines.cpp    
stubRoutines_init1()                stubRoutines.cpp    
init_globals()                      init.cpp
Threads::create_vm()                thread.cpp
JNI_CreateJavaVM()                  jni.cpp
InitializeJVM()                     java.c
JavaMain()                          java.c

The StubGenerator class is defined in the stubGenerator_x86_64.cpp file in the openjdk/hotspot/src/cpu/x86/vm directory. The generate_initial() method in this file initializes the _call_stub_entry variable as follows:

StubRoutines::_call_stub_entry = generate_call_stub(StubRoutines::_call_stub_return_address);

Now we have finally found the implementation of the function that the function pointer points to: it is produced by calling the generate_call_stub() function.

However, on inspection, the function pointer does not point to a C++ function but to a fragment of machine instructions. You can think of it as the instructions a C++ compiler would have produced for a C++ function. The generate_call_stub() function contains the following statements:

__ enter();
__ subptr(rsp, -rsp_after_call_off * wordSize);

These two statements emit machine instructions directly. To inspect them, we use the HSDB tool to disassemble them into more readable assembly instructions, as follows:

push   %rbp         
mov    %rsp,%rbp
sub    $0x60,%rsp

These three assembly instructions are the typical sequence for setting up a new stack frame. Earlier we showed the stack state before the call through the function pointer, as follows:

After these three instructions have run, the stack looks like this:

Note where the old %rbp and old %rsp point before the new stack frame (the CallStub() frame) is set up, and where the new %rbp and new %rsp point afterwards. Also note that the saved rbp slot holds the old %rbp. This is very important for stack unwinding, because following it upward eventually reaches every stack frame.
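Why the saved frame pointer matters for walking the stack can be shown with a small model. ToyFrame and walk_frames are hypothetical: each frame simply records its caller's frame, which is what the saved rbp slot does in a real frame, and the walk follows that link until it runs out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical frame model: the slot at the frame pointer holds the
// caller's frame pointer ("saved rbp"); the outermost frame holds null.
struct ToyFrame {
    const ToyFrame* saved_fp;
    const char*     name;
};

// Walk from the innermost frame outwards by following saved_fp.
inline std::vector<std::string> walk_frames(const ToyFrame* fp) {
    std::vector<std::string> names;
    for (; fp != nullptr; fp = fp->saved_fp) {
        assert(fp->name != nullptr);
        names.push_back(fp->name);
    }
    return names;
}
```

Starting from the CallStub frame, the walk visits call_helper and the frames before it in the same innermost-first order as the call stack listed in Chapter 2.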

Let's continue with the generate_call_stub() function, which is implemented as follows:

address generate_call_stub(address& return_address) {
    ...
    address start = __ pc();
 

    const Address rsp_after_call(rbp, rsp_after_call_off * wordSize);
 
    const Address call_wrapper  (rbp, call_wrapper_off   * wordSize);
    const Address result        (rbp, result_off         * wordSize);
    const Address result_type   (rbp, result_type_off    * wordSize);
    const Address method        (rbp, method_off         * wordSize);
    const Address entry_point   (rbp, entry_point_off    * wordSize);
    const Address parameters    (rbp, parameters_off     * wordSize);
    const Address parameter_size(rbp, parameter_size_off * wordSize);
 
    const Address thread        (rbp, thread_off         * wordSize);
 
    const Address r15_save(rbp, r15_off * wordSize);
    const Address r14_save(rbp, r14_off * wordSize);
    const Address r13_save(rbp, r13_off * wordSize);
    const Address r12_save(rbp, r12_off * wordSize);
    const Address rbx_save(rbp, rbx_off * wordSize);
 
    // Open up new stack frames
    __ enter();
    __ subptr(rsp, -rsp_after_call_off * wordSize);
 
    // save register parameters
    __ movptr(parameters,   c_rarg5); // parameters
    __ movptr(entry_point,  c_rarg4); // entry_point
 
 
    __ movptr(method,       c_rarg3); // method
    __ movl(result_type,  c_rarg2);   // result type
    __ movptr(result,       c_rarg1); // result
    __ movptr(call_wrapper, c_rarg0); // call wrapper
 
    // save regs belonging to calling function
    __ movptr(rbx_save, rbx);
    __ movptr(r12_save, r12);
    __ movptr(r13_save, r13);
    __ movptr(r14_save, r14);
    __ movptr(r15_save, r15);
 
    const Address mxcsr_save(rbp, mxcsr_off * wordSize);
    {
      Label skip_ldmx;
      __ stmxcsr(mxcsr_save);
      __ movl(rax, mxcsr_save);
      __ andl(rax, MXCSR_MASK);    // Only check control and mask bits
      ExternalAddress mxcsr_std(StubRoutines::addr_mxcsr_std());
      __ cmp32(rax, mxcsr_std);
      __ jcc(Assembler::equal, skip_ldmx);
      __ ldmxcsr(mxcsr_std);
      __ bind(skip_ldmx);
    }

    // ... omitting the next operation
}

We have already covered setting up the new stack frame. Next, the six arguments that call_helper() passed in registers are stored into the CallStub() stack frame. Besides these arguments, the values of some other registers must also be stored, because the function's next step is to prepare arguments for the Java method and call it. We do not know whether the Java method will clobber those registers, so they are saved here and restored after the call completes.

The generated assembly code is as follows:

mov      %r9,-0x8(%rbp)
mov      %r8,-0x10(%rbp)
mov      %rcx,-0x18(%rbp)
mov      %edx,-0x20(%rbp)
mov      %rsi,-0x28(%rbp)
mov      %rdi,-0x30(%rbp)
mov      %rbx,-0x38(%rbp)
mov      %r12,-0x40(%rbp)
mov      %r13,-0x48(%rbp)
mov      %r14,-0x50(%rbp)
mov      %r15,-0x58(%rbp)
// stmxcsr saves the value of the MXCSR register to -0x60(%rbp)
stmxcsr  -0x60(%rbp)  
mov      -0x60(%rbp),%eax
and      $0xffc0,%eax // MXCSR_MASK = 0xFFC0
// cmp subtracts the first operand from the second and sets the flags in eflags according to the result.
// It behaves like the sub instruction but leaves the operands unchanged
cmp      0x1762cb5f(%rip),%eax  # 0x00007fdf5c62d2c4 
// Jump to the target address when ZF=1
je       0x00007fdf45000772 
// Load the 32-bit memory operand into the MXCSR register
ldmxcsr  0x1762cb52(%rip)      # 0x00007fdf5c62d2c4  
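The MXCSR check in this listing boils down to a mask-and-compare. The sketch below reproduces that logic in plain C++; the mask 0xFFC0 comes from the listing, while the "standard" value 0x1F80 is an assumption made here (it is the hardware default for MXCSR), since the listing only shows the address the value is loaded from.

```cpp
#include <cassert>
#include <cstdint>

// From the listing: ignore bits 0-5 (the sticky exception status flags) and
// compare only the control and mask bits.
const uint32_t kMxcsrMask = 0xFFC0;
// ASSUMPTION: the standard value is the hardware default, 0x1F80
// (all exceptions masked, round to nearest).
const uint32_t kMxcsrStd = 0x1F80;

// True when the control bits differ from the standard value, i.e. when the
// stub would execute ldmxcsr to reload the standard setting.
inline bool mxcsr_needs_reload(uint32_t mxcsr) {
    return (mxcsr & kMxcsrMask) != kMxcsrStd;
}
```

A value such as 0x1FA1, which differs from the default only in status flag bits, does not trigger a reload, whereas 0x7F80, with a different rounding mode, does.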

After these values have been stored, the stack looks as shown in the following figure.
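The displacements in the listing follow from slot offsets counted in words and multiplied by the word size. The enum below is a reconstruction for illustration: the negative offsets are inferred from the disassembly above (for example -0x58 / 8 = -11 for r15), and the two positive ones assume the usual frame layout in which the return address sits one word above the saved rbp and the stack-passed arguments follow it.

```cpp
#include <cassert>

const int kWordSize = 8;  // bytes per stack slot on x86-64

// Slot offsets in words relative to rbp, INFERRED from the listing.
enum ToyCallStubLayout {
    rsp_after_call_off = -12,
    mxcsr_off          = rsp_after_call_off,
    r15_off            = -11,
    r14_off            = -10,
    r13_off            = -9,
    r12_off            = -8,
    rbx_off            = -7,
    call_wrapper_off   = -6,
    result_off         = -5,
    result_type_off    = -4,
    method_off         = -3,
    entry_point_off    = -2,
    parameters_off     = -1,
    rbp_off            = 0,   // saved rbp
    retaddr_off        = 1,   // return address pushed by the call
    parameter_size_off = 2,   // 7th argument, passed on the stack
    thread_off         = 3    // 8th argument, passed on the stack
};

// Byte displacement used in an rbp-relative operand such as -0x58(%rbp).
constexpr int byte_disp(int word_off) { return word_off * kWordSize; }

// The frame size reserved by "sub $0x60,%rsp".
static_assert(-byte_disp(rsp_after_call_off) == 0x60, "frame is 12 words");
```

With these values, byte_disp(parameters_off) is -0x8 and byte_disp(call_wrapper_off) is -0x30, matching the first and sixth mov instructions of the listing.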

The next chapter continues with the rest of the generate_call_stub() function.

Chapter 4 - The JVM finally starts calling the main() method of the Java main class

In the previous chapter, Chapter 3, we covered the part of generate_call_stub() that stores the arguments into the CallStub stack frame. The state at this point is shown in the following figure.

Continuing with generate_call_stub(), the next step is to load the thread register. The code is as follows:

__ movptr(r15_thread, thread);
__ reinit_heapbase();

The generated assembly code is as follows:

mov    0x18(%rbp),%r15  
mov    0x1764212b(%rip),%r12   # 0x00007fdf5c6428a8

Comparing with the stack frame above, the slot at 0x18(%rbp) holds thread, and this value is loaded into the %r15 register.

If the Java method takes parameters, they must be passed. The code is as follows:

Label parameters_done;
// Copy parameter_size into c_rarg3, that is, into the rcx register
__ movl(c_rarg3, parameter_size);
// Test whether c_rarg3 is zero: testl ANDs its two operands, sets only the flags and discards the result
__ testl(c_rarg3, c_rarg3);
// If it is zero (no parameters), jump to parameters_done
__ jcc(Assembler::zero, parameters_done);

// Reaching this point means parameter_size is not 0, so parameters must be
// provided to the Java method being called
Label loop;
// Copy the value stored at parameters, that is, the pointer to the parameter words, into c_rarg2
__ movptr(c_rarg2, parameters);       
// Copy c_rarg3 into c_rarg1, that is, copy the number of parameter words into c_rarg1
__ movl(c_rarg1, c_rarg3);            
__ BIND(loop);
// Load the word at the address held in c_rarg2 into rax
__ movptr(rax, Address(c_rarg2, 0));
// Advance the pointer in c_rarg2 by one word (8 bytes) so that it points to the next parameter
__ addptr(c_rarg2, wordSize);       
// Decrement the counter in c_rarg1
__ decrementl(c_rarg1);            
// Push the parameter for the method call
__ push(rax);                       
// If the remaining count is not 0, jump back to loop
__ jcc(Assembler::notZero, loop);

Here is a loop for passing parameters, which is equivalent to the following code:

while(%esi){
   rax = *arg
   push_arg(rax)
   arg++;   // ptr++
   %esi--;  // counter--
}
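
The same loop can be sketched in plain C++. This is only an illustration under stated assumptions: the Stack alias and the push_java_arguments() name are hypothetical, and a std::vector stands in for the machine stack, with push_back() playing the role of the push instruction. The zero check is done once up front and the body then runs as a do-while, which is how the generated code behaves.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of the stack: push_back() stands in for the x86 push
// instruction, so back() is the current stack top.
using Stack = std::vector<std::intptr_t>;

// Mirrors the loop emitted by generate_call_stub(): the zero check happens
// once up front, then the body repeats until the counter reaches 0.
void push_java_arguments(Stack& stack, const std::intptr_t* parameters,
                         int parameter_size) {
  assert(parameter_size >= 0);
  if (parameter_size == 0) return;        // testl + jcc(zero, parameters_done)
  const std::intptr_t* arg = parameters;  // c_rarg2
  int count = parameter_size;             // c_rarg1
  do {
    std::intptr_t value = *arg;           // movptr(rax, Address(c_rarg2, 0))
    ++arg;                                // addptr(c_rarg2, wordSize)
    --count;                              // decrementl(c_rarg1)
    stack.push_back(value);               // push(rax)
  } while (count != 0);                   // jcc(notZero, loop)
}
```

The first parameter is pushed first, so it ends up at the highest address and the last parameter ends up at the stack top.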

The generated assembly code is as follows:

// Load parameter size from the stack into %ecx
mov    0x10(%rbp),%ecx
// AND operation; the result is 0 only when the value in %ecx is 0
test   %ecx,%ecx
// No parameters need to be passed, so jump directly to parameters_done
je     0x00007fdf4500079a
// Falling through to here means parameter size is not 0 and parameters need to be passed
mov    -0x8(%rbp),%rdx
mov    %ecx,%esi
// -- loop --
mov    (%rdx),%rax
add    $0x8,%rdx
dec    %esi
push   %rax
// Jump back to loop
jne    0x00007fdf4500078e

Because a Java method is about to be called, its actual arguments are pushed onto the stack: parameter size words, read starting from the address in parameters. The stack after the arguments are pushed is shown in the figure below.

Once the arguments for the Java method are ready, the Java method is called. Here we need to highlight the calling convention used when Java methods are interpreted. Unlike the C/C++ calling convention on 64-bit x86, parameters are passed on the stack instead of in registers. More precisely, parameters are passed through the local variable table, so argument word 1 ... argument word n in the CallStub() stack frame in the figure above are actually part of the local variable table of the Java method being called.

Next, let's look at the code calling Java methods, as follows:

// Call the Java method
// -- parameters_done --

__ BIND(parameters_done);
// Load the Method* stored in the method slot into rbx
__ movptr(rbx, method);
// Load the interpreter entry address into the c_rarg1 register
__ movptr(c_rarg1, entry_point);
// Copy the value of the rsp register into the r13 register
__ mov(r13, rsp);

// Call the interpreter entry to execute the Java method;
// the call target is c_rarg1, which holds the interpreter entry address
__ call(c_rarg1);

The generated assembly code is as follows:

// Load Method* into %rbx
mov     -0x18(%rbp),%rbx
// Load entry_point into %rsi
mov     -0x10(%rbp),%rsi
// Save the caller's stack top pointer in %r13
mov     %rsp,%r13
// Call the Java method
callq   *%rsi

Note that the callq instruction pushes the address of the instruction that follows it onto the stack and then jumps to the address given by its operand, that is, the address held in %rsi. The address of the next instruction is pushed so that the callee can return by jumping to the address on the stack.

The callq instruction calls entry_point. entry_point will be described in detail later.

Chapter 5 - popping the stack frame and processing the return result after calling a Java method

The previous chapter, Chapter 4 - the JVM finally starts calling the main() method of the Java main class, showed how entry_point is called through callq, but we have not finished reading the implementation of generate_call_stub(). After the Java method returns, generate_call_stub() processes the return value. It also unwinds the stack, that is, restores the stack to the state it had before the Java method was called. What was that state? It was described in Chapter 2 - JVM virtual machine calling the main() method of Java main class in this way, and is shown in the following figure.

The next part of generate_call_stub() is implemented as follows:

// How the method call result is saved depends on the result type; anything that is not T_OBJECT, T_LONG, T_FLOAT or T_DOUBLE is handled as T_INT
// Load the value stored at result into c_rarg0, that is, the rdi register. Note that result is the address where the return value is to be stored
__ movptr(c_rarg0, result);     

Label is_long, is_float, is_double, exit;

// Load the value stored at result_type into c_rarg1, that is, the result type of the method call is held in the esi register
__ movl(c_rarg1, result_type);  

// Jump to different processing branches according to different result types
__ cmpl(c_rarg1, T_OBJECT);
__ jcc(Assembler::equal, is_long);
__ cmpl(c_rarg1, T_LONG);
__ jcc(Assembler::equal, is_long);
__ cmpl(c_rarg1, T_FLOAT);
__ jcc(Assembler::equal, is_float);
__ cmpl(c_rarg1, T_DOUBLE);
__ jcc(Assembler::equal, is_double);

// If execution reaches here, the result is handled as T_INT:
// write the value in rax to the memory pointed to by the address in c_rarg0.
// According to the calling convention, a return value of type int
// is stored in eax
__ movl(Address(c_rarg0, 0), rax); 

__ BIND(exit);


// Load the effective address of rsp_after_call into rsp, that is, move rsp toward higher addresses.
// The arguments argument 1, ..., argument n pushed for the method call
// are thereby popped from the stack, so the following statement unwinds them
__ lea(rsp, rsp_after_call);  // The lea instruction loads an address into a register

Here we focus on result and result_type. result is passed in when call_helper() is called, and it tells call_helper() where the return value is stored after the Java method has been called. Because result is of type JavaValue and its return type has been set before the call, the result_type variable above only needs to read the result type from the JavaValue. For example, when the main() method of the Java main class is called, the jni_CallStaticVoidMethod() and jni_invoke_static() functions set the return type to T_VOID, that is, the void returned by main().

The generated assembly code is as follows:

// Load result, saved at -0x28(%rbp) on the stack, into %rdi
mov    -0x28(%rbp),%rdi
// Load the result type, saved at -0x20(%rbp) on the stack, into %esi
mov    -0x20(%rbp),%esi
cmp    $0xc,%esi         // Is it T_OBJECT type
je     0x00007fdf450007f6
cmp    $0xb,%esi         // Is it T_LONG type
je     0x00007fdf450007f6
cmp    $0x6,%esi         // Is it T_FLOAT type
je     0x00007fdf450007fb
cmp    $0x7,%esi         // Is it T_DOUBLE type
je     0x00007fdf45000801

// For T_INT, write the returned value in %eax to the memory pointed to by %rdi, that is, the address loaded from -0x28(%rbp)
mov    %eax,(%rdi)       

// -- exit --

// Load the effective address of rsp_after_call into rsp
lea    -0x60(%rbp),%rsp  

To let you see clearly, I'll post the stack frame status before calling the Java method, as follows:

As can be seen from the figure, the location addressed by -0x60(%rbp) lies just above the actual arguments argument word 1 ... argument word n that were pushed for the Java method call, so after the lea they are no longer part of the stack. rbp and rsp now point to the positions shown in the figure.

Next, restore the previously saved callee-saved registers, which is also part of the calling convention, as follows:

__ movptr(r15, r15_save);
__ movptr(r14, r14_save);
__ movptr(r13, r13_save);
__ movptr(r12, r12_save);
__ movptr(rbx, rbx_save);

__ ldmxcsr(mxcsr_save); 

The generated assembly code is as follows:

mov      -0x58(%rbp),%r15
mov      -0x50(%rbp),%r14
mov      -0x48(%rbp),%r13
mov      -0x40(%rbp),%r12
mov      -0x38(%rbp),%rbx
ldmxcsr  -0x60(%rbp)

After the actual arguments pushed for the Java method call have been popped and the callee-saved registers have been restored, the stack unwinding continues. The implementation is as follows:

// restore rsp
__ addptr(rsp, -rsp_after_call_off * wordSize);

// return
__ pop(rbp);
__ ret(0);

The generated assembly code is as follows:

// Add 0x60 to %rsp, that is, unwind the stack;
// this pops the callee-saved registers and the six parameters pushed earlier
add    $0x60,%rsp
pop    %rbp
// Return from the stub. The q suffix indicates a 64-bit operand, which means
// the return address stored on the stack is 64 bits wide
retq

Recall that in Chapter 3 - creating a new stack frame for CallStub, the new stack frame was created with the following assembly:

push   %rbp         
mov    %rsp,%rbp 
sub    $0x60,%rsp

To exit this stack frame, add $0x60 to %rsp and restore %rbp. Then jump to the instruction at the return address and continue execution.
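
To check that the epilogue really undoes the prologue, the two sequences can be replayed on a toy machine. This is a minimal sketch, not HotSpot code: the ToyCpu type and both function names are hypothetical, only rsp, rbp and stack memory are modelled, and the final retq is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical toy machine: just the two registers and the stack memory
// needed to follow the prologue and epilogue.
struct ToyCpu {
  std::uint64_t rsp;
  std::uint64_t rbp;
  std::map<std::uint64_t, std::uint64_t> memory;  // address -> 8-byte word

  void push(std::uint64_t value) { rsp -= 8; memory[rsp] = value; }
  std::uint64_t pop() { std::uint64_t v = memory.at(rsp); rsp += 8; return v; }
};

// push %rbp; mov %rsp,%rbp; sub $0x60,%rsp
void enter_call_stub_frame(ToyCpu& cpu) {
  cpu.push(cpu.rbp);
  cpu.rbp = cpu.rsp;
  cpu.rsp -= 0x60;
}

// lea -0x60(%rbp),%rsp; add $0x60,%rsp; pop %rbp
void leave_call_stub_frame(ToyCpu& cpu) {
  cpu.rsp = cpu.rbp - 0x60;  // drops any Java arguments pushed below the frame
  cpu.rsp += 0x60;           // rsp now points at the saved rbp
  cpu.rbp = cpu.pop();
  assert(cpu.rsp % 8 == 0);
}
```

Whatever was pushed below the frame in between (the Java arguments, for example), rsp and rbp come back to the values they had before enter_call_stub_frame() ran.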

For your convenience, here is the earlier figure again. It shows the stack before unwinding:

After unwinding, the stack is as shown in the figure below.

The parameter size and thread slots are released by the JavaCalls::call_helper() function, which is part of the C/C++ calling convention. So apart from these two parameters, the stack is now back to the state shown in the first figure given in this article.

You should be familiar with these figures; we built them up step by step while creating the stack frame, and the frame is torn down in the reverse order.

As mentioned earlier, when a Java method returns an int (char, boolean, short and similar types are uniformly converted to int), the Java method calling convention stores the returned int value in %rax. If an object is returned, the address of the object is stored in %rax. How are the address and the int value told apart? By the return type. What if a value that is neither int nor an object is returned? Let's continue with the implementation of generate_call_stub():

// handle return types different from T_INT
__ BIND(is_long);
__ movq(Address(c_rarg0, 0), rax);
__ jmp(exit);

__ BIND(is_float);
__ movflt(Address(c_rarg0, 0), xmm0);
__ jmp(exit);

__ BIND(is_double);
__ movdbl(Address(c_rarg0, 0), xmm0);
__ jmp(exit); 

The corresponding assembly code is as follows:

// -- is_long --
mov    %rax,(%rdi)
jmp    0x00007fdf450007d4

// -- is_float --
vmovss %xmm0,(%rdi)
jmp    0x00007fdf450007d4

// -- is_double --
vmovsd %xmm0,(%rdi)
jmp    0x00007fdf450007d4

A long return value is also stored in %rax. Java's long type is 64 bits and the code we are analyzing is the 64-bit x86 implementation, so the 64-bit %rax register can hold it. A float or double return value is stored in %xmm0.
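
The result handling as a whole can be sketched in C++ as a switch on the result type. This is an illustration under stated assumptions: ReturnRegisters and store_result() are hypothetical names, the enum values for T_FLOAT, T_DOUBLE, T_LONG and T_OBJECT are the constants compared in the assembly above (0x6, 0x7, 0xb, 0xc), and T_INT = 10 is assumed only so that the example has a name for the default case.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Values taken from the cmp instructions above; T_INT = 10 is an assumption
// of this sketch and is only used by callers of store_result().
enum BasicTypeSketch { T_FLOAT = 6, T_DOUBLE = 7, T_INT = 10, T_LONG = 11, T_OBJECT = 12 };

// Hypothetical stand-ins for the registers that carry the return value.
struct ReturnRegisters {
  std::uint64_t rax;   // int, long and object results
  double xmm0_double;  // double results
  float xmm0_float;    // float results
};

// Writes the return value to the memory that result points to, choosing the
// width and the source register by result type, as the stub's exit code does.
void store_result(void* result, int result_type, const ReturnRegisters& regs) {
  assert(result != nullptr);
  switch (result_type) {
    case T_OBJECT:
    case T_LONG:    // is_long: all 64 bits of rax
      std::memcpy(result, &regs.rax, sizeof(regs.rax));
      break;
    case T_FLOAT:   // is_float: 32 bits from xmm0
      std::memcpy(result, &regs.xmm0_float, sizeof(float));
      break;
    case T_DOUBLE:  // is_double: 64 bits from xmm0
      std::memcpy(result, &regs.xmm0_double, sizeof(double));
      break;
    default: {      // everything else is treated as T_INT: low 32 bits of rax
      std::uint32_t eax = static_cast<std::uint32_t>(regs.rax);
      std::memcpy(result, &eax, sizeof(eax));
      break;
    }
  }
}
```

Note that T_OBJECT shares the is_long branch, because an object reference returned in %rax is a full 64-bit value here.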

Taken together with the previous chapters, this chapter should give you the calling conventions of C/C++ and of Java methods under interpreted execution, including how parameters are passed and how return values are received. If anything is unclear, rereading the chapters should help.

Chapter 6 - creation of new stack frame of Java method

Chapter 2 - JVM virtual machine calling the main() method of Java main class in this way mentioned the following code when introducing the implementation of JavaCalls::call_helper():

address entry_point = method->from_interpreted_entry();

This value is passed as an argument to the "function" that the StubRoutines::call_stub() function pointer points to. Chapter 4 - the JVM finally starts calling the main() method of the Java main class then showed entry_point being called through the callq instruction. So what exactly is entry_point? This chapter covers it in detail.

First, from_interpreted_entry() is implemented as follows:

Source code location: openjdk/hotspot/src/share/vm/oops/method.hpp

volatile address from_interpreted_entry() const{ 
      return (address)OrderAccess::load_ptr_acquire(&_from_interpreted_entry); 
}

_from_interpreted_entry is just a field defined in the Method class, and the function above returns its value directly. When is this field assigned? It is set when the method is linked (methods are linked during the linking phase of the class life cycle). The following function is called when a method is linked:

void Method::link_method(methodHandle h_method, TRAPS) {
  // ...
  address entry = Interpreter::entry_for_method(h_method);
  // Sets both _i2i_entry and _from_interpreted_entry
  set_interpreter_entry(entry);
  // ...
}

First, Interpreter::entry_for_method() obtains the method entry according to the specific method kind. Then set_interpreter_entry() saves the value in the corresponding fields. The implementation of set_interpreter_entry() is very simple:

void set_interpreter_entry(address entry) { 
    _i2i_entry = entry;  
    _from_interpreted_entry = entry; 
}

As you can see, the entry value is assigned to the _from_interpreted_entry field.

The entry_for_method() function is implemented as follows:

static address entry_for_method(methodHandle m)  { 
     return entry_for_kind(method_kind(m)); 
}

First, method_kind() obtains the kind of the method, and then entry_for_kind() obtains the entry_point for that kind. entry_for_kind() is implemented as follows:

static address entry_for_kind(MethodKind k){ 
      return _entry_table[k]; 
}

This simply returns the entry_point address for the given method kind from the _entry_table array.

This involves the Java method kind MethodKind. Execution enters the Java world through entry_point and runs the logic of the Java method, so entry_point must create a new stack frame for the Java method. Stack frames differ between kinds of methods, such as ordinary Java methods, synchronized Java methods and Java methods with the native keyword. Therefore all methods are classified, and each kind gets its own entry_point. What kinds are there? Let's look at the constants defined in the MethodKind enumeration:

enum MethodKind {
    zerolocals,  // ordinary method
    zerolocals_synchronized,  // ordinary synchronized method
    native,  // native method
    native_synchronized,  // native synchronized method
    ...
}

Of course there are other kinds, but the main ones are the four defined above.

So that the entry_point for a Java method can be found quickly, this mapping is saved in _entry_table, and entry_for_kind() can then look up the entry_point for a method directly. There are two ways of assigning values to the elements of the array. One is the following function:

void AbstractInterpreter::set_entry_for_kind(AbstractInterpreter::MethodKind kind, address entry) {
  _entry_table[kind] = entry;
}

So when is set_entry_for_kind() called? The answer is in the TemplateInterpreterGenerator::generate_all() function. generate_all() calls generate_method_entry() to generate an entry_point for each kind of Java method, and each entry_point generated is saved in _entry_table under its method kind.
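
The table is just an array indexed by method kind that is filled once at startup and read on every lookup. A minimal sketch of that idea follows. The enum is a shortened, hypothetical copy of MethodKind, and fake_generate_method_entry() and fill_entry_table() are stand-ins invented for this example that hand out addresses in a dummy buffer instead of emitting machine code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, shortened version of MethodKind; the real enumeration has
// more members than the four listed in the text.
enum MethodKind {
  zerolocals,
  zerolocals_synchronized,
  native,
  native_synchronized,
  number_of_method_entries
};

using address = const unsigned char*;

// One entry point per method kind, indexed by the kind itself.
address entry_table[number_of_method_entries] = {};

void set_entry_for_kind(MethodKind kind, address entry) {
  assert(kind < number_of_method_entries);
  entry_table[kind] = entry;
}

address entry_for_kind(MethodKind kind) {
  assert(kind < number_of_method_entries);
  return entry_table[kind];
}

// Stand-in for generate_method_entry(): returns the start of a dummy code
// buffer for the given kind instead of generating machine code.
address fake_generate_method_entry(MethodKind kind) {
  static const unsigned char fake_code[number_of_method_entries][4] = {};
  return fake_code[kind];
}

// Stand-in for the part of generate_all() that fills the table at startup.
void fill_entry_table() {
  for (int k = 0; k < number_of_method_entries; ++k) {
    MethodKind kind = static_cast<MethodKind>(k);
    set_entry_for_kind(kind, fake_generate_method_entry(kind));
  }
}
```

After fill_entry_table() has run, entry_for_kind(zerolocals) is a constant-time array read, which is the point of keeping the table.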

The following describes the implementation of generate_all() in detail. It is called when HotSpot starts, to generate the entry_point for each kind of Java method. The call stack is as follows:

TemplateInterpreterGenerator::generate_all()  templateInterpreter.cpp
InterpreterGenerator::InterpreterGenerator()  templateInterpreter_x86_64.cpp
TemplateInterpreter::initialize()    templateInterpreter.cpp
interpreter_init()                   interpreter.cpp
init_globals()                       init.cpp
Threads::create_vm()                 thread.cpp
JNI_CreateJavaVM()                   jni.cpp
InitializeJVM()                      java.c
JavaMain()                           java.c
start_thread()                       pthread_create.c

generate_all() generates a series of common code entries used while HotSpot runs, as well as the InterpreterCodelet for every bytecode. Some very important entries will be described in detail later. Here we only look at how the entry for ordinary Java methods without the native keyword is generated. generate_all() contains the following:

#define method_entry(kind)                                                                    \
{                                                                                             \
    CodeletMark cm(_masm, "method entry point (kind = " #kind ")");                           \
    Interpreter::_entry_table[Interpreter::kind] = generate_method_entry(Interpreter::kind);  \
}  

method_entry(zerolocals)

method_entry is a macro. After expansion, the method_entry(zerolocals) statement above becomes:

Interpreter::_entry_table[Interpreter::zerolocals] = generate_method_entry(Interpreter::zerolocals);

The _entry_table variable is defined in the AbstractInterpreter class as follows:

static address  _entry_table[number_of_method_entries];  

number_of_method_entries is the total number of method kinds, so the entry for a method can be looked up using the method kind as the array index. generate_method_entry() generates the entry for each kind of method. It is implemented as follows:

address AbstractInterpreterGenerator::generate_method_entry(AbstractInterpreter::MethodKind kind) {
  bool                   synchronized = false;
  address                entry_point = NULL;
  InterpreterGenerator*  ig_this = (InterpreterGenerator*)this;

  // Generate different entries according to the method type kind
  switch (kind) { 
  // An ordinary method
  case Interpreter::zerolocals :
      break;
  // An ordinary synchronized method
  case Interpreter::zerolocals_synchronized: 
      synchronized = true;
      break;
  // ...
  }

  if (entry_point) {
     return entry_point;
  }

  return ig_this->generate_normal_entry(synchronized);
}

zerolocals denotes an ordinary Java method call, including the main() method of a Java program. For zerolocals, ig_this->generate_normal_entry() is called to generate the entry. generate_normal_entry() generates the code that builds the stack for the method being executed. This stack consists of three parts: the local variable table (which stores the incoming parameters and the local variables of the called method), the Java method stack frame data, and the operand stack. The entry_point routine (in fact a piece of machine code, called a stub) creates these three parts to support the execution of the Java method.

Let's return to the point made at the beginning: the entry_point routine is called through the callq instruction. The stack frame state at that moment was described in Chapter 4 - the JVM finally starts calling the main() method of the Java main class. For your convenience, it is given again here:

Note that executing callq stores the return address at the top of the stack, which is why the figure above includes the return address item.

When CallStub() uses callq to call the entry_point generated by generate_normal_entry(), several registers hold important values:

rbx -> Method*
r13 -> sender sp
rsi -> entry point

Next we analyze the implementation of generate_normal_entry(), which is the most important part of calling a Java method. Its key logic is as follows:

address InterpreterGenerator::generate_normal_entry(bool synchronized) {
  // ...
  // Code entry address of the entry_point routine
  address entry_point = __ pc();
 
  // rbx currently holds the Method*. Locate the ConstMethod* through the Method*
  const Address constMethod(rbx, Method::const_offset());
  // Locate the AccessFlags through the Method*
  const Address access_flags(rbx, Method::access_flags_offset());
  // Locate the parameter size through the ConstMethod*
  const Address size_of_parameters(rdx,ConstMethod::size_of_parameters_offset());
  // Locate the size of the local variable table through the ConstMethod*
  const Address size_of_locals(rdx, ConstMethod::size_of_locals_offset());
 
  // The Address objects above only describe how to locate the method metadata;
  // nothing is computed here. The assembly generated below performs the loads.
  // Load the ConstMethod* into rdx
  __ movptr(rdx, constMethod);
  // Load the parameter size into rcx
  __ load_unsigned_short(rcx, size_of_parameters);
  __ load_unsigned_short(rcx, size_of_parameters);

  // rbx: save base address; rcx: save loop variables; rdx: save target address; rax: save the return address (used below)
  // The values in each register at this time are as follows:
  //   rbx: Method*
  //   rcx: size of parameters
  //   r13: sender_sp (could differ from sp+wordSize 
  //        if we were called via c2i) is the top stack address of the caller
  // Load the size of the local variable table into rdx
  __ load_unsigned_short(rdx, size_of_locals);
  // The local variable table holds both the incoming parameters and the local variables of the called method,
  // so rdx minus rcx is the number of slots needed for the method's own local variables
  __ subl(rdx, rcx); 
 
  // ...
 
  // The top of the stack currently holds the return address into CallStub. If it were not popped into rax,
  // it would sit in the middle of the local variable table and make the table
  // non-contiguous, so local variables could not all be addressed the same way.
  // The return address is therefore kept in rax for now
  __ pop(rax);
 
  // Compute the address of the first parameter: current stack top address + parameter size * 8 - one word.
  // Because the stack top points at the last parameter and the stack grows toward lower addresses,
  // adding n-1 words to it gives the address of the first parameter
  __ lea(r14, Address(rsp, rcx, Address::times_8, -wordSize));
 
  // Set the method's local variables to 0, that is, initialize them so that stale values have no effect
  // rdx: the number of local variable slots of the called method
  {
    Label exit, loop;
    __ testl(rdx, rdx);
    // If rdx <= 0, do nothing
    __ jcc(Assembler::lessEqual, exit); 
    __ bind(loop);
    // Initializing local variables
    __ push((int) NULL_WORD); 
    __ decrementl(rdx); 
    __ jcc(Assembler::greater, loop);
    __ bind(exit);
  }
 
  // Generate fixed frame
  generate_fixed_frame(false);
 
  // ... the statistics and stack overflow checks are omitted; they will be described in detail later

  // For a synchronized method, lock_method() must also be executed,
  // which affects the stack frame layout
  if (synchronized) {
    // Allocate monitor and lock method
    lock_method();
  } 

  // Jump to the first bytecode instruction of the target Java method and execute its machine code
   __ dispatch_next(vtos);
 
  // ... omit the statistics related logic, which will be described in detail later
 
  return entry_point;
}

The function looks long, but its logic is fairly simple: it creates the local variable table according to the method being called, and then calls two very important functions, generate_fixed_frame() and dispatch_next(), which we describe in detail later.

Just before generate_fixed_frame() is called, the stack is in the state shown in the following figure.

Compared with the previous figure, there are additional slots local variable 1 ... local variable n. Together with argument word 1 ... argument word n they make up the local variable table of the called Java method, shown as the purple part of the figure. The slots local variable 1 ... local variable n belong to the stack frame of the called Java method, while argument word 1 ... argument word n belong to the CallStub() stack frame. The two parts together form one local variable table; this is known as stack frame overlap.

In addition, %r14 points to the first parameter in the local variable table, the return address into the CallStub() function is held in %rax, and Method* is still in %rbx. These register values are used by generate_fixed_frame(), which is why they are emphasized here.
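
The address arithmetic behind the local variable table can be written out as a small C++ sketch. It is an illustration only: LocalsLayout, layout_locals() and local_slot_address() are hypothetical names, and the input rsp_after_pop is assumed to be the stack top right after pop(rax) has removed the return address.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kWordSize = 8;

// Hypothetical result type, for illustration only.
struct LocalsLayout {
  std::uint64_t locals_base;  // what ends up in r14: address of local slot 0
  std::uint64_t rsp_after;    // stack top after the extra locals are pushed
  int extra_locals;           // number of zero words pushed (rdx after subl)
};

// rsp_after_pop is the stack top after pop(rax) removed the return address,
// so it points at the last argument word that CallStub pushed.
LocalsLayout layout_locals(std::uint64_t rsp_after_pop,
                           int size_of_parameters, int size_of_locals) {
  assert(size_of_parameters >= 0);
  assert(size_of_locals >= size_of_parameters);
  LocalsLayout out{};
  // lea(r14, Address(rsp, rcx, Address::times_8, -wordSize))
  out.locals_base = rsp_after_pop + size_of_parameters * kWordSize - kWordSize;
  // subl(rdx, rcx)
  out.extra_locals = size_of_locals - size_of_parameters;
  // the push loop moves rsp down by one word per extra local
  out.rsp_after = rsp_after_pop - out.extra_locals * kWordSize;
  return out;
}

// Slot i of the local variable table sits i words below slot 0.
std::uint64_t local_slot_address(const LocalsLayout& layout, int index) {
  return layout.locals_base - index * kWordSize;
}
```

For a method with 2 parameter words and 5 local slots, slots 0 and 1 are the argument words pushed by CallStub and slots 2 to 4 are the newly pushed zero words, with slot 4 at the new stack top. The table stays contiguous only because the return address was popped first.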

Chapter 7 - creating stack frames for Java methods

The creation of the local variable table was covered in Chapter 6 - creation of new stack frame of Java method. The state of the stack after it is created is shown in the figure below.

The status of each register is shown below.

// The return address is held in the %rax register
rax: return address     
// Pointer to the Java method to execute
rbx: Method*          
// Local variable table pointer  
r14: pointer to locals 
// The top of the caller's stack
r13: sender sp 

Note the return address held in rax. The entry_point generated by generate_normal_entry() is called by the __ call(c_rarg1) statement in generate_call_stub(), so when entry_point finishes executing, control returns to generate_call_stub() and continues with the code after the __ call(c_rarg1) statement, that is, the code covered in Chapter 5, which pops the stack frame and processes the return result after the Java method call.

generate_fixed_frame() is implemented as follows:

Source code location: openjdk/hotspot/src/cpu/x86/vm/templateInterpreter_x86_64.cpp

void TemplateInterpreterGenerator::generate_fixed_frame(bool native_call) {
  // Save the return address next to the local variable area
  __ push(rax);     
  // Create stack frames for Java methods       
  __ enter();      
  // Save the caller's stack top address
  __ push(r13);
  // Temporarily set the last_sp slot to NULL_WORD
  __ push((int)NULL_WORD); 
  // Get ConstMethod * and save it to r13
  __ movptr(r13, Address(rbx, Method::const_offset()));     
  // Save the address of Java method bytecode to r13
  __ lea(r13, Address(r13, ConstMethod::codes_offset()));    
  // Save Method * to stack
  __ push(rbx);             
 
  // The default value of the ProfileInterpreter property is true,
  // Indicates that statistics of relevant information is required for the interpretation and execution methods
  if (ProfileInterpreter) {
    Label method_data_continue;
    // The MethodData structure is built on ProfileData and
    // records data about the method while it runs.
    // MethodData is divided into three parts:
    // statistics related to operations such as the function type,
    // statistics related to parameter types,
    // and an extra extension area that holds
    // information about deoptimization.
    // Load the _method_data field of Method into rdx
    __ movptr(rdx, Address(rbx,
           in_bytes(Method::method_data_offset())));
    __ testptr(rdx, rdx);
    __ jcc(Assembler::zero, method_data_continue);
    // Reaching here means _method_data has been initialized;
    // compute the address of the _data field of MethodData and keep it in rdx
    __ addptr(rdx, in_bytes(MethodData::data_offset()));
    __ bind(method_data_continue);
    __ push(rdx);      
  } else {
    __ push(0);
  }
  
  // Get ConstMethod * and store it in rdx
  __ movptr(rdx, Address(rbx, 
        Method::const_offset()));          
  // Get ConstantPool * and store it in rdx
  __ movptr(rdx, Address(rdx, 
         ConstMethod::constants_offset())); 
  // Get ConstantPoolCache* and store it in rdx
  __ movptr(rdx, Address(rdx, 
         ConstantPool::cache_offset_in_bytes())); 
  // Save ConstantPoolCache * to stack
  __ push(rdx); 
  // Save the address of the first parameter to the stack
  __ push(r14); 
 
  if (native_call) {
   // A native method has no bytecode, so there is no
   // bytecode address to save when calling it
    __ push(0); 
  } else {
   // Save the Java method bytecode address to the stack,
   // Note that the value of the r13 register has been changed above
    __ push(r13);
  }
  
  // Reserve a slot in advance, which is of great use later
  __ push(0); 
  // Save the stack bottom address to this slot
  __ movptr(Address(rsp, 0), rsp); 
}

For ordinary Java methods, the generated assembly code is as follows:

push   %rax
push   %rbp
mov    %rsp,%rbp
push   %r13
pushq  $0x0
mov    0x10(%rbx),%r13
lea    0x30(%r13),%r13 // The lea instruction gets the memory address itself
push   %rbx
mov    0x18(%rbx),%rdx
test   %rdx,%rdx
je     0x00007fffed01b27d
add    $0x90,%rdx
push   %rdx
mov    0x10(%rbx),%rdx
mov    0x8(%rdx),%rdx
mov    0x18(%rdx),%rdx
push   %rdx
push   %r14
push   %r13
pushq  $0x0
mov    %rsp,(%rsp)

The assembly is relatively simple, so I won't go through it here. The stack frame state after the above assembly is executed is shown in the following figure.
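
The resulting layout can also be derived mechanically by replaying the pushes. The sketch below does that on a toy stack. It is not HotSpot code: the FrameSlot constant names, FixedFrameInputs and build_fixed_frame() are hypothetical, and the offsets are derived only from the push order in generate_fixed_frame() shown above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Offsets in words relative to rbp, derived only from the push order in
// generate_fixed_frame(); the constant names are hypothetical.
enum FrameSlot : int {
  kReturnAddress = 1,
  kSavedRbp = 0,
  kSenderSp = -1,
  kLastSp = -2,
  kMethod = -3,
  kMethodData = -4,
  kConstantPoolCache = -5,
  kLocals = -6,
  kBytecodePointer = -7,
  kExpressionStackBottom = -8
};

struct FixedFrameInputs {
  std::uint64_t return_address;  // rax
  std::uint64_t caller_rbp;      // rbp on entry
  std::uint64_t sender_sp;       // r13 on entry
  std::uint64_t method;          // rbx
  std::uint64_t method_data;     // rdx after the ProfileInterpreter block
  std::uint64_t cp_cache;        // ConstantPoolCache*
  std::uint64_t locals;          // r14
  std::uint64_t bcp;             // r13 after the lea
};

struct FixedFrame {
  std::uint64_t rbp;
  std::uint64_t rsp;
  std::map<std::uint64_t, std::uint64_t> memory;  // address -> 8-byte word

  std::uint64_t slot(FrameSlot s) const {
    std::int64_t byte_offset = static_cast<std::int64_t>(s) * 8;
    return memory.at(rbp + static_cast<std::uint64_t>(byte_offset));
  }
};

// Replays the pushes of generate_fixed_frame() for a non-native method.
FixedFrame build_fixed_frame(std::uint64_t rsp, const FixedFrameInputs& in) {
  FixedFrame f{0, rsp, {}};
  auto push = [&f](std::uint64_t v) { f.rsp -= 8; f.memory[f.rsp] = v; };
  push(in.return_address);
  push(in.caller_rbp);      // enter(): push rbp ...
  f.rbp = f.rsp;            // ... then mov rsp into rbp
  push(in.sender_sp);
  push(0);                  // last_sp placeholder
  push(in.method);
  push(in.method_data);
  push(in.cp_cache);
  push(in.locals);
  push(in.bcp);
  push(0);                  // reserved slot
  f.memory[f.rsp] = f.rsp;  // movptr(Address(rsp, 0), rsp)
  assert(f.rbp - f.rsp == 8 * 8);
  return f;
}
```

Every slot of the fixed frame is then reachable as a constant offset from rbp, for example f.slot(kMethod) for the Method* or f.slot(kBytecodePointer) for the bytecode address.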

After the code generated by generate_fixed_frame() has run, some registers hold the following values:

rbx: Method*
ecx: invocation counter
r13: bcp(byte code pointer)
rdx: ConstantPool* Address of constant pool
r14: Address of the first parameter in the local variable table

After generate_fixed_frame() finishes, control returns to InterpreterGenerator::generate_normal_entry(). If machine code is being generated for a synchronized method, lock_method() must also be called; it changes the state of the current stack frame by adding the information required for synchronization. This will be described in detail later when the lock implementation is introduced.

InterpreterGenerator::generate_normal_entry() finally returns the entry address of the generated machine code, which is then stored in the _entry_table array, so that the method entry can be looked up using the method kind as the array index.

Chapter 8 - the dispatch_next() function dispatches bytecode

In generate_normal_entry(), generate_fixed_frame() is called to generate the stack frame for executing the Java method, and then dispatch_next() is called to execute the Java method's bytecode. Before generate_normal_entry() calls dispatch_next(), some registers hold the following values:

rbx: Method*
ecx: invocation counter
r13: bcp(byte code pointer)
rdx: ConstantPool* Address of constant pool
r14: Address of the first parameter in the local variable table

The dispatch_next() function is implemented as follows:

// When generate_fixed_frame() has just built the stack frame for a Java method call,
// i.e. on the first dispatch, r13 points to the first address of the bytecode,
// that is, the first bytecode instruction. At this time, the step parameter is 0
void InterpreterMacroAssembler::dispatch_next(TosState state, int step) {
 
  load_unsigned_byte(rbx, Address(r13, step)); 
 
  // The load above reads the byte located step bytes past the current bytecode position.
  // That byte is the opcode (range 0 ~ 202) and is stored in rbx.
  // The value of step is determined by the length of the bytecode instruction and its operands.
  // Advance r13 by step so that it points to the bytecode being dispatched
  increment(r13, step);
 
  // dispatch_table(state) returns all bytecode entry points for the current top-of-stack state
  dispatch_base(state, Interpreter::dispatch_table(state)); 
}

r13 points to the first address of the bytecode. On the first call the value of the parameter step is 0, so the load_unsigned_byte() function reads one byte from the memory pointed to by r13, and that byte is the opcode of the bytecode instruction. r13 is then increased by step. On later calls, step is the length of the instruction that has just been executed, so the byte that is read is the opcode of the next bytecode instruction.

The dispatch_table() function that is called here is implemented as follows:
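The fetch-then-advance behaviour can be tried out with a small sketch. BytecodeCursor and fetch_opcodes() are hypothetical names invented for this example. The field bcp plays the role of r13, and the step values are supplied by the caller, whereas in HotSpot they come from the generated code of each bytecode.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for the interpreter state: bcp plays the role of r13.
struct BytecodeCursor {
  const uint8_t* bcp;

  // Mirrors dispatch_next(): read the opcode at bcp + step first,
  // then advance bcp by step so it points at that opcode.
  uint8_t next_opcode(int step) {
    uint8_t opcode = bcp[step];
    bcp += step;
    return opcode;
  }
};

// Walks a bytecode array with a caller-supplied list of steps and
// returns the opcodes that were fetched, in order.
std::vector<uint8_t> fetch_opcodes(const std::vector<uint8_t>& code,
                                   const std::vector<int>& steps) {
  BytecodeCursor cursor{code.data()};
  std::vector<uint8_t> opcodes;
  for (int step : steps) {
    opcodes.push_back(cursor.next_opcode(step));
  }
  return opcodes;
}
```

For the byte sequence 0x10 0x0a 0x3c 0xb1 (bipush 10, istore_1, return), the steps 0, 2, 1 fetch 0x10, 0x3c and 0xb1: the first step is 0, bipush occupies two bytes, and istore_1 occupies one.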

static address*   dispatch_table(TosState state)  {
   return _active_table.table_for(state); 
}

It obtains the entry addresses for the corresponding top-of-stack cache state from _active_table. The _active_table variable is declared in the TemplateInterpreter class, as follows:

static DispatchTable  _active_table;  

The DispatchTable class and functions such as table_for() are defined as follows:

DispatchTable  TemplateInterpreter::_active_table;

class DispatchTable VALUE_OBJ_CLASS_SPEC {
 public:
  enum { 
    length = 1 << BitsPerByte 
  }; // The value of BitsPerByte is 8
 
 private:
  // number_of_states=9,length=256
  // _table is the bytecode dispatch table
  address  _table[number_of_states][length];   
 
 public:
  // ...
  address*   table_for(TosState state){ 
    return _table[state]; 
  }

  address*   table_for(){ 
    return table_for((TosState)0); 
  }
  // ...
}; 

address is an alias of the type u_char*. _table is a two-dimensional array whose dimensions are the top-of-stack state (9 kinds in total) and the bytecode (256 at most). It stores the entry point of each bytecode for each top-of-stack state. Since the top-of-stack cache has not been introduced yet, this is not easy to understand. The top-of-stack cache and bytecode dispatch will be introduced in detail later, and after that this part of the logic will be easier to follow.

The dispatch_base() function called in the InterpreterMacroAssembler::dispatch_next() function is implemented as follows:
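To make the two dimensions concrete, here is a minimal sketch of such a table. SketchDispatchTable, set() and entry() are hypothetical names for this example, and the numeric order of the TosState values is an assumption; only the counts (9 states, 256 opcodes) are taken from the code above.

```cpp
#include <cassert>

using address = unsigned char*;

// Order of the states is illustrative only (assumption).
enum TosState { btos, ctos, stos, atos, itos, ltos, ftos, dtos, vtos, number_of_states };

class SketchDispatchTable {
 public:
  enum { length = 1 << 8 };  // 256 possible opcode values

  // Returns the 256-entry row that belongs to one top-of-stack state.
  address* table_for(TosState state) { return _table[state]; }

  // Entry point for a (state, opcode) pair.
  address entry(TosState state, int opcode) {
    assert(0 <= opcode && opcode < length);
    return table_for(state)[opcode];
  }

  // Records the entry point for a (state, opcode) pair.
  void set(TosState state, int opcode, address entry_point) {
    assert(0 <= opcode && opcode < length);
    _table[state][opcode] = entry_point;
  }

 private:
  address _table[number_of_states][length] = {};
};
```

The point of table_for() is that it narrows the two-dimensional table to a single row, so the dispatch code only needs the opcode as an index afterwards.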

void InterpreterMacroAssembler::dispatch_base(
  TosState  state, // Indicates the top of stack cache state
  address*  table,
  bool verifyoop
) {
  // ...
  // Load the address of the dispatch table for the current top-of-stack state into rscratch1
  lea(rscratch1, ExternalAddress((address)table));
  // Jump to the entry corresponding to the bytecode and execute the machine code instruction
  // address = rscratch1 + rbx * 8
  jmp(Address(rscratch1, rbx, Address::times_8));
} 

For example, if a one-byte instruction is fetched (iconst_0, aload_0 and so on are all one-byte instructions), then the assembly code generated by the InterpreterMacroAssembler::dispatch_next() function is as follows:

// The generate_fixed_frame() function has already
// stored bcp in %r13.
// Load the opcode of the bytecode into %ebx
movzbl 0x0(%r13),%ebx  
 
// The address $0x7ffff73ba4a0 points to the one-dimensional array
// for the corresponding top-of-stack state, with a length of 256
movabs $0x7ffff73ba4a0,%r10

// The constant is now stored in %r10. The expression
// %r10+%rbx*8 yields the address of the slot that holds the entry address,
// *(%r10+%rbx*8) reads the entry address from that slot,
// and the jump then transfers control to the entry address
jmpq *(%r10,%rbx,8)

%r10 refers to the one-dimensional array for the corresponding top-of-stack cache state. It has a length of 256 and is indexed by opcode, and each element stores the entry address for that opcode, as shown in the following figure.

The following function shows how the entry address is set for every top-of-stack state of a bytecode.
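The address calculation performed by the indirect jump can be written out in plain C++. load_entry() is a hypothetical helper for this sketch, and it assumes a 64-bit platform where each table slot is 8 bytes wide; it only reads the slot and does not jump anywhere.

```cpp
#include <cassert>
#include <cstdint>

using address = unsigned char*;

// Reads the slot that jmpq *(%r10,%rbx,8) would read:
// base plays the role of %r10 and opcode the role of %rbx.
inline address load_entry(const address* base, uint8_t opcode) {
  static_assert(sizeof(address) == 8, "sketch assumes 8-byte table slots");
  // Byte-wise arithmetic: slot address = base + opcode * 8
  const unsigned char* raw = reinterpret_cast<const unsigned char*>(base);
  const address* slot = reinterpret_cast<const address*>(raw + opcode * 8u);
  // Dereference the slot to obtain the entry address
  return *slot;
}
```

Because every slot is 8 bytes wide, load_entry(row, opcode) returns the same value as the ordinary array access row[opcode]. The scale factor 8 in the addressing mode is simply the element size.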

void DispatchTable::set_entry(int i, EntryPoint& entry) {
  assert(0 <= i && i < length, "index out of bounds");
  assert(number_of_states == 9, "check the code below");
  _table[btos][i] = entry.entry(btos);
  _table[ctos][i] = entry.entry(ctos);
  _table[stos][i] = entry.entry(stos);
  _table[atos][i] = entry.entry(atos);
  _table[itos][i] = entry.entry(itos);
  _table[ltos][i] = entry.entry(ltos);
  _table[ftos][i] = entry.entry(ftos);
  _table[dtos][i] = entry.entry(dtos);
  _table[vtos][i] = entry.entry(vtos);
}

The parameter i is the opcode. For each bytecode and its corresponding opcode, refer to https://docs.oracle.com/javas....

So _table is as shown in the following figure.

The first dimension of _table is the top-of-stack cache state and the second dimension is the opcode. With these two indices you can locate a piece of machine code, which is the instruction fragment that executes the bytecode for the current top-of-stack cache state.

Calling the dispatch_next() function to execute the bytecode of a Java method really means finding the entry address of the corresponding machine instruction fragment according to the bytecode and jumping to it. This machine code is generated according to the semantics of the corresponding bytecode, which will be described in detail later.
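The idea of "find the entry by opcode, then execute it" can be imitated with a toy interpreter. Everything in this sketch (Machine, Handler, the op_ functions and interpret()) is made up for illustration. It uses ordinary function pointers and a loop instead of generated machine code and jumps, it ignores the top-of-stack cache state, and it supports only four opcodes, whose values are the ones from the JVM specification.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Machine {
  const uint8_t* bcp;      // bytecode pointer, the role r13 plays
  std::vector<int> stack;  // expression stack
  bool running;
};

using Handler = void (*)(Machine&);

// Each handler performs its bytecode's semantics and advances bcp itself.
void op_iconst_1(Machine& m) { m.stack.push_back(1); m.bcp += 1; }
void op_iconst_2(Machine& m) { m.stack.push_back(2); m.bcp += 1; }
void op_iadd(Machine& m) {
  int b = m.stack.back(); m.stack.pop_back();
  int a = m.stack.back(); m.stack.pop_back();
  m.stack.push_back(a + b);
  m.bcp += 1;
}
void op_ireturn(Machine& m) { m.running = false; }

// Runs the bytecode and returns the value left on top of the stack.
int interpret(const std::vector<uint8_t>& code) {
  Handler table[256] = {};  // one row of a dispatch table, indexed by opcode
  table[0x04] = op_iconst_1;
  table[0x05] = op_iconst_2;
  table[0x60] = op_iadd;
  table[0xac] = op_ireturn;

  Machine m{code.data(), {}, true};
  while (m.running) {
    Handler entry = table[*m.bcp];  // look up the entry by opcode
    assert(entry != nullptr && "opcode not supported by this sketch");
    entry(m);                       // "jump" to the entry
  }
  return m.stack.back();
}
```

Interpreting the bytes 0x04 0x05 0x60 0xac (iconst_1, iconst_2, iadd, ireturn) leaves 3 on the stack.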

The official account [in-depth analysis of Java virtual machine HotSpot] has published 60+ articles analyzing the VM source code. You are welcome to follow it. If you have any questions, add the WeChat account mazhimazh to be invited to the virtual machine discussion group.
