The earlier article on control transfer instructions covered only the main logic of interpreting and executing those bytecodes and left out the profiling (statistics) logic. For control transfer instructions, the TemplateTable::branch(bool is_jsr, bool is_wide) function is usually called to generate the assembly code. That assembly includes the profiling logic, which this article describes in detail.
Most control transfer instructions call TemplateTable::branch() to generate the profiling code, as shown in the following table.
Opcode | Mnemonic | Description | Function that generates the profiling code
0x99 | ifeq | Jump if the int value on top of the stack is equal to 0 | TemplateTable::branch(false,false)
0x9a | ifne | Jump if the int value on top of the stack is not equal to 0 | TemplateTable::branch(false,false)
0x9b | iflt | Jump if the int value on top of the stack is less than 0 | TemplateTable::branch(false,false)
0x9c | ifge | Jump if the int value on top of the stack is greater than or equal to 0 | TemplateTable::branch(false,false)
0x9d | ifgt | Jump if the int value on top of the stack is greater than 0 | TemplateTable::branch(false,false)
0x9e | ifle | Jump if the int value on top of the stack is less than or equal to 0 | TemplateTable::branch(false,false)
0x9f | if_icmpeq | Compare the two int values on top of the stack (value1, then value2 on top); jump if value1 == value2 | TemplateTable::branch(false,false)
0xa0 | if_icmpne | Compare the two int values on top of the stack; jump if value1 != value2 | TemplateTable::branch(false,false)
0xa1 | if_icmplt | Compare the two int values on top of the stack; jump if value1 < value2 | TemplateTable::branch(false,false)
0xa2 | if_icmpge | Compare the two int values on top of the stack; jump if value1 >= value2 | TemplateTable::branch(false,false)
0xa3 | if_icmpgt | Compare the two int values on top of the stack; jump if value1 > value2 | TemplateTable::branch(false,false)
0xa4 | if_icmple | Compare the two int values on top of the stack; jump if value1 <= value2 | TemplateTable::branch(false,false)
0xa5 | if_acmpeq | Compare the two reference values on top of the stack; jump if they are equal | TemplateTable::branch(false,false)
0xa6 | if_acmpne | Compare the two reference values on top of the stack; jump if they are not equal | TemplateTable::branch(false,false)
0xa7 | goto | Unconditional jump | TemplateTable::branch(false,false)
0xa8 | jsr | Jump to the location given by the 16-bit offset and push the address of the instruction following jsr onto the stack | TemplateTable::branch(true,false)
0xa9 | ret | Return to the address stored in the local variable at the given index (generally used together with jsr or jsr_w) | InterpreterMacroAssembler::profile_ret()
0xaa | tableswitch | switch jump for contiguous case values (variable-length instruction) | InterpreterMacroAssembler::profile_switch_case(), InterpreterMacroAssembler::profile_switch_default()
0xab | lookupswitch | switch jump for non-contiguous case values (variable-length instruction). It is rewritten to _fast_linearswitch or _fast_binaryswitch, which are instructions used only inside the virtual machine, so the behavior of this bytecode follows the logic of those two instructions | InterpreterMacroAssembler::profile_switch_case(), InterpreterMacroAssembler::profile_switch_default()
0xc8 | goto_w | Unconditional jump (wide, 32-bit offset) | TemplateTable::branch(false,true)
0xc9 | jsr_w | Jump to the location given by the 32-bit offset and push the address of the instruction following jsr_w onto the stack | TemplateTable::branch(true,true)
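The only thing that varies between the TemplateTable::branch() rows of the table is the pair of boolean arguments. As a rough illustration, the mapping can be written as a small lookup. This is a sketch covering only the opcodes listed in the table; the names BranchArgs and branch_args_for are hypothetical and do not exist in HotSpot.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical holder for the two arguments of TemplateTable::branch().
struct BranchArgs {
    bool is_jsr;
    bool is_wide;
};

// Returns true and fills `out` when the opcode is one of the table rows that
// use TemplateTable::branch(); returns false for every other opcode.
inline bool branch_args_for(uint8_t opcode, BranchArgs* out) {
    assert(out != nullptr);
    if (opcode >= 0x99 && opcode <= 0xa7) {   // ifeq .. goto
        *out = {false, false};
        return true;
    }
    switch (opcode) {
        case 0xa8: *out = {true,  false}; return true;   // jsr
        case 0xc8: *out = {false, true};  return true;   // goto_w
        case 0xc9: *out = {true,  true};  return true;   // jsr_w
        default:   return false;  // ret, tableswitch, lookupswitch, everything else
    }
}
```

For example, branch_args_for(0xa8, &args) yields is_jsr = true and is_wide = false, matching the jsr row.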
Let's take the goto instruction as an example. The goto bytecode was introduced in the earlier article on control transfer bytecode instructions. Its generation function is TemplateTable::_goto(), and the generated assembly code is as follows:
// Copy the Method* saved in the current stack frame into %rcx
0x00007fffe101dd10: mov -0x18(%rbp),%rcx
// If the ProfileInterpreter option is enabled, collect profiling data for the branch
// Load the MDP (Method Data Pointer) into %rax
0x00007fffe101dd14: mov -0x20(%rbp),%rax
// If Method::_method_data is NULL, jump to ---- profile_continue ----
0x00007fffe101dd18: test %rax,%rax
0x00007fffe101dd1b: je 0x00007fffe101dd39
// Reaching this point means Method::_method_data is not NULL
// Load the value at offset JumpData::taken_off_set from the method data into %rbx
0x00007fffe101dd21: mov 0x8(%rax),%rbx
// Add DataLayout::counter_increment, whose value is 1
0x00007fffe101dd25: add $0x1,%rbx
// sbb is subtract-with-borrow: if the add overflowed, the carry is subtracted again, so the counter saturates instead of wrapping
0x00007fffe101dd29: sbb $0x0,%rbx
// Store the result back at offset JumpData::taken_off_set
0x00007fffe101dd2d: mov %rbx,0x8(%rax)
// The method data pointer needs to be updated to reflect the new target.
// %rax holds the method data pointer
// Add the value at offset JumpData::displacement_off_set to it
0x00007fffe101dd31: add 0x10(%rax),%rax
// Store the updated %rax at interpreter_frame_mdx_offset in the stack frame
0x00007fffe101dd35: mov %rax,-0x20(%rbp)
Method::_method_data is of type MethodData*. The _data attribute of the MethodData class holds detailed profiling information about a Java method. For example, the bytecode of a Java method may contain several back edges, and the runtime information about those back edges is stored in the memory area that _data points to. The JumpData::taken_off_set offset used above addresses the slot in that area which stores how many times the jump was taken. Note also that the slot at interpreter_frame_mdx_offset in the stack frame stores the method data pointer. All of this is easier to follow once the layout and contents of MethodData::_data have been covered; the next chapter introduces them in detail.
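To make the counter update concrete, here is a minimal C++ sketch of what the add/sbb pair and the pointer update in the listing compute. The JumpRecord struct and both function names are hypothetical; only the offsets 0x8 and 0x10 and the increment of 1 come from the listing, and the real layout is defined by MethodData in HotSpot.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one JumpData record; field names are illustrative.
struct JumpRecord {
    uint64_t header;        // offset 0x00: not modeled here
    uint64_t taken;         // offset 0x08: number of times the jump was taken
    int64_t  displacement;  // offset 0x10: byte distance to the target's record
};

// Mirrors `add $0x1,%rbx; sbb $0x0,%rbx`: when the add wraps around, the
// carry is subtracted again, so the count sticks at its maximum value.
inline uint64_t saturating_increment(uint64_t count) {
    uint64_t sum = count + 1;
    uint64_t carry = (sum < count) ? 1 : 0;  // models the carry flag of the add
    return sum - carry;
}

// Bumps the taken count and returns the updated method data pointer (mdp).
inline const uint8_t* profile_taken_branch(JumpRecord* rec) {
    assert(rec != nullptr);
    rec->taken = saturating_increment(rec->taken);
    return reinterpret_cast<const uint8_t*>(rec) + rec->displacement;
}
```

The returned pointer corresponds to the value that the listing writes back to the interpreter_frame_mdx_offset slot.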
Next, the following assembly instructions are generated:
// **** profile_continue ****
// Load the 2 bytes that start 1 byte past the current bytecode position into %edx
0x00007fffe101dd39: movswl 0x1(%r13),%edx
// Reverse the byte order of the value in %edx
0x00007fffe101dd3e: bswap %edx
// Arithmetic-shift %edx right by 16 bits. These two steps compute the branch offset
0x00007fffe101dd40: sar $0x10,%edx
// Sign-extend %edx from 4 bytes to 8 bytes in %rdx
0x00007fffe101dd43: movslq %edx,%rdx
// Add the offset in %rdx to the current bytecode address to get the jump target address
0x00007fffe101dd46: add %rdx,%r13
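The movswl/bswap/sar/movslq sequence above amounts to reading a signed, big-endian 16-bit operand that follows the opcode. A portable C++ sketch of the same computation is shown below; the function names are hypothetical, and the sketch handles only the 16-bit (non-wide) form.

```cpp
#include <cassert>
#include <cstdint>

// Reads the signed 16-bit big-endian branch offset stored after the opcode
// at bcp[0], i.e. in bcp[1] (high byte) and bcp[2] (low byte).
inline int32_t read_branch_offset16(const uint8_t* bcp) {
    assert(bcp != nullptr);
    int32_t value = (static_cast<int32_t>(bcp[1]) << 8) | bcp[2];
    if (value >= 0x8000) {
        value -= 0x10000;  // sign-extend from 16 bits
    }
    return value;
}

// Equivalent of `add %rdx,%r13`: the target is relative to the branch opcode.
inline const uint8_t* branch_target(const uint8_t* bcp) {
    return bcp + read_branch_offset16(bcp);
}
```

A negative result means the target lies before the branch instruction, which is exactly the back-edge case that the following code counts.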
If UseLoopCounters is true, the following assembly is also generated. When it executes, the registers hold the following values:
increment backedge counter for backward branches
rax: MDO
ebx: MDO bumped taken-count
rcx: method
rdx: target offset
r13: target bcp
r14: locals pointer
The generated assembly is as follows:
// Test the sign of %edx. A non-negative offset is a forward jump; a negative offset is a backward jump.
// For a forward jump, go to ---- dispatch ----, so that only back edges are counted
0x00007fffe101dd49: test %edx,%edx
0x00007fffe101dd4b: jns 0x00007fffe101de30
// Reaching this point means there is a back edge to count
// Check whether Method::_method_counters is NULL. If it is not NULL, jump to ---- has_counters ----
0x00007fffe101dd51: mov 0x20(%rcx),%rax
0x00007fffe101dd55: test %rax,%rax
0x00007fffe101dd58: jne 0x00007fffe101ddf4
// If it is NULL, call InterpreterRuntime::build_method_counters() to create a new MethodCounters
0x00007fffe101dd5e: push %rdx
0x00007fffe101dd5f: push %rcx
0x00007fffe101dd60: callq 0x00007fffe101dd6a
0x00007fffe101dd65: jmpq 0x00007fffe101dde8
0x00007fffe101dd6a: mov %rcx,%rsi
0x00007fffe101dd6d: lea 0x8(%rsp),%rax
0x00007fffe101dd72: mov %r13,-0x38(%rbp)
0x00007fffe101dd76: mov %r15,%rdi
0x00007fffe101dd79: mov %rbp,0x200(%r15)
0x00007fffe101dd80: mov %rax,0x1f0(%r15)
0x00007fffe101dd87: test $0xf,%esp
0x00007fffe101dd8d: je 0x00007fffe101dda5
0x00007fffe101dd93: sub $0x8,%rsp
0x00007fffe101dd97: callq 0x00007ffff66b581c
0x00007fffe101dd9c: add $0x8,%rsp
0x00007fffe101dda0: jmpq 0x00007fffe101ddaa
0x00007fffe101dda5: callq 0x00007ffff66b581c
0x00007fffe101ddaa: movabs $0x0,%r10
0x00007fffe101ddb4: mov %r10,0x1f0(%r15)
0x00007fffe101ddbb: movabs $0x0,%r10
0x00007fffe101ddc5: mov %r10,0x200(%r15)
0x00007fffe101ddcc: cmpq $0x0,0x8(%r15)
0x00007fffe101ddd4: je 0x00007fffe101dddf
0x00007fffe101ddda: jmpq 0x00007fffe1000420
0x00007fffe101dddf: mov -0x38(%rbp),%r13
0x00007fffe101dde3: mov -0x30(%rbp),%r14
0x00007fffe101dde7: retq
0x00007fffe101dde8: pop %rcx
0x00007fffe101dde9: pop %rdx
// Load the newly created MethodCounters into %rax
0x00007fffe101ddea: mov 0x20(%rcx),%rax
// If the creation failed, jump to ---- dispatch ----
0x00007fffe101ddee: je 0x00007fffe101de30
The following assembly is generated only when the -XX:+TieredCompilation option is enabled, that is, only when tiered compilation is turned on:
// **** has_counters ****
// The following assembly is generated only when ProfileInterpreter profiling is enabled
// Load Method::_method_data into %rbx and check whether it is NULL. If it is NULL, jump to ---- no_mdo ----
0x00007fffe101ddf4: mov 0x18(%rcx),%rbx
0x00007fffe101ddf8: test %rbx,%rbx
0x00007fffe101ddfb: je 0x00007fffe101de17
// Method::_method_data is not NULL, so increment the MethodData::_backedge_counter count.
// If it exceeds the threshold, jump to ---- backedge_counter_overflow ----
0x00007fffe101ddfd: mov 0x70(%rbx),%eax
0x00007fffe101de00: add $0x8,%eax
0x00007fffe101de03: mov %eax,0x70(%rbx)
0x00007fffe101de06: and $0x1ff8,%eax
0x00007fffe101de0c: je 0x00007fffe101df22
// When the threshold is not exceeded, jump to ---- dispatch ----
0x00007fffe101de12: jmpq 0x00007fffe101de30
// **** no_mdo ****
// Increment the MethodCounters::_backedge_counter count.
// If it exceeds the threshold, jump to ---- backedge_counter_overflow ----
0x00007fffe101de17: mov 0x20(%rcx),%rcx
0x00007fffe101de1b: mov 0xc(%rcx),%eax
0x00007fffe101de1e: add $0x8,%eax
0x00007fffe101de21: mov %eax,0xc(%rcx)
0x00007fffe101de24: and $0x1ff8,%eax
0x00007fffe101de2a: je 0x00007fffe101df22
// **** dispatch ****
// %r13 now holds the jump target address. Load the first bytecode at that address into %ebx,
// then jump to the code that executes that bytecode instruction
0x00007fffe101de30: movzbl 0x0(%r13),%ebx
0x00007fffe101de35: movabs $0x7ffff73b9e40,%r10
0x00007fffe101de3f: jmpq *(%r10,%rbx,8)
// **** profile_method ****
// Because this listing covers tiered compilation, the assembly under profile_method is not executed,
// that is, profile_method() is not called to create a MethodData instance and assign it to Method::_method_data
// Call InterpreterRuntime::profile_method() through call_VM()
0x00007fffe101de43: callq 0x00007fffe101de4d
0x00007fffe101de48: jmpq 0x00007fffe101dec8
0x00007fffe101de4d: lea 0x8(%rsp),%rax
0x00007fffe101de52: mov %r13,-0x38(%rbp)
0x00007fffe101de56: mov %r15,%rdi
0x00007fffe101de59: mov %rbp,0x200(%r15)
0x00007fffe101de60: mov %rax,0x1f0(%r15)
0x00007fffe101de67: test $0xf,%esp
0x00007fffe101de6d: je 0x00007fffe101de85
0x00007fffe101de73: sub $0x8,%rsp
0x00007fffe101de77: callq 0x00007ffff66b4d84
0x00007fffe101de7c: add $0x8,%rsp
0x00007fffe101de80: jmpq 0x00007fffe101de8a
0x00007fffe101de85: callq 0x00007ffff66b4d84
0x00007fffe101de8a: movabs $0x0,%r10
0x00007fffe101de94: mov %r10,0x1f0(%r15)
0x00007fffe101de9b: movabs $0x0,%r10
0x00007fffe101dea5: mov %r10,0x200(%r15)
0x00007fffe101deac: cmpq $0x0,0x8(%r15)
0x00007fffe101deb4: je 0x00007fffe101debf
0x00007fffe101deba: jmpq 0x00007fffe1000420
0x00007fffe101debf: mov -0x38(%rbp),%r13
0x00007fffe101dec3: mov -0x30(%rbp),%r14
0x00007fffe101dec7: retq
// End of the call_VM() call
// restore target bytecode
0x00007fffe101dec8: movzbl 0x0(%r13),%ebx
// Assembly generated by the set_method_data_pointer_for_bcp() function
0x00007fffe101decd: push %rax
0x00007fffe101dece: push %rbx
// Load Method::_method_data into %rax
0x00007fffe101decf: mov -0x18(%rbp),%rbx
0x00007fffe101ded3: mov 0x18(%rbx),%rax
// If Method::_method_data is NULL, jump to ---- set_mdp ----
0x00007fffe101ded7: test %rax,%rax
0x00007fffe101deda: je 0x00007fffe101df17
// Assembly generated by call_VM_leaf() to call InterpreterRuntime::bcp_to_di()
0x00007fffe101dee0: mov %r13,%rsi
0x00007fffe101dee3: mov %rbx,%rdi
0x00007fffe101dee6: test $0xf,%esp
0x00007fffe101deec: je 0x00007fffe101df04
0x00007fffe101def2: sub $0x8,%rsp
0x00007fffe101def6: callq 0x00007ffff66b4bb4
0x00007fffe101defb: add $0x8,%rsp
0x00007fffe101deff: jmpq 0x00007fffe101df09
0x00007fffe101df04: callq 0x00007ffff66b4bb4
// rax: mdi
// mdo is guaranteed to be non-zero here, we checked for it before the call.
// Load Method::_method_data into %rbx
0x00007fffe101df09: mov 0x18(%rbx),%rbx
// Add the offset of MethodData::_data
0x00007fffe101df0d: add $0x90,%rbx
0x00007fffe101df14: add %rbx,%rax
// **** set_mdp ****
// Store the mdp at interpreter_frame_mdx_offset
0x00007fffe101df17: mov %rax,-0x20(%rbp)
0x00007fffe101df1b: pop %rbx
0x00007fffe101df1c: pop %rax
// End of the set_method_data_pointer_for_bcp() call
// Jump to ---- dispatch ----
0x00007fffe101df1d: jmpq 0x00007fffe101de30
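The back-edge counting in the has_counters and no_mdo paths above follows one pattern: skip forward branches, add 8 to a 32-bit counter, and treat a zero result under the mask 0x1ff8 as overflow. The C++ sketch below models only that pattern. The constants are copied from the listing; the names Next and count_backedge are hypothetical, and the sketch does not model where the counter lives (MethodData or MethodCounters).

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kCountIncrement = 0x8;     // from `add $0x8,%eax`
constexpr uint32_t kOverflowMask   = 0x1ff8;  // from `and $0x1ff8,%eax`

enum class Next { Dispatch, BackedgeCounterOverflow };

// offset:  the signed branch offset (the value in %edx)
// counter: the 32-bit back-edge counter being updated
inline Next count_backedge(int32_t offset, uint32_t* counter) {
    assert(counter != nullptr);
    if (offset >= 0) {
        return Next::Dispatch;  // `jns`: forward branches are not counted
    }
    *counter += kCountIncrement;
    // The masked bits become zero once every 1024 increments with these constants.
    return ((*counter & kOverflowMask) == 0) ? Next::BackedgeCounterOverflow
                                             : Next::Dispatch;
}
```

With these two constants, starting from a counter of 0, the overflow path is taken on the 1024th counted back edge, since 1024 * 8 = 0x2000 has no bits inside the mask.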
The InterpreterRuntime::profile_method() function that is called is implemented as follows:
IRT_ENTRY(void, InterpreterRuntime::profile_method(JavaThread* thread))
  // ..
  frame fr = thread->last_frame();
  methodHandle method(thread, fr.interpreter_frame_method());
  Method::build_interpreter_method_data(method, THREAD);
IRT_END

// If Method::_method_data is NULL, create a new MethodData instance and assign it
void Method::build_interpreter_method_data(methodHandle method, TRAPS) {
  // ...
  MutexLocker ml(MethodData_lock, THREAD);
  if (method->method_data() == NULL) {
    ClassLoaderData* loader_data = method->method_holder()->class_loader_data();
    MethodData* method_data = MethodData::allocate(loader_data, method, CHECK);
    method->set_method_data(method_data);
  }
}
If Method::_method_data is NULL, a MethodData instance is created and assigned to it. Therefore, once InterpreterRuntime::profile_method() has been called, the value of Method::_method_data is no longer NULL.
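The pattern in Method::build_interpreter_method_data() is "take a lock, then create the object only if it is still missing". A standalone C++ sketch of that pattern is given below. ProfileData and MethodSketch are hypothetical stand-ins for MethodData and Method, and std::mutex plays the role of MethodData_lock; none of this is HotSpot code.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for MethodData.
struct ProfileData {
    int taken = 0;
};

// Hypothetical stand-in for Method.
class MethodSketch {
public:
    ProfileData* method_data() const { return data_.get(); }

    // Creates the profile data at most once; repeated calls are no-ops,
    // mirroring the NULL check performed while holding the lock.
    void build_profile() {
        std::lock_guard<std::mutex> guard(lock_);
        if (data_ == nullptr) {
            data_ = std::make_unique<ProfileData>();
        }
        assert(data_ != nullptr);
    }

private:
    std::mutex lock_;
    std::unique_ptr<ProfileData> data_;
};
```

Calling build_profile() twice leaves the first ProfileData instance in place, so pointers handed out earlier stay valid.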
Then look at the following assembly code:
// The following assembly is generated only when UseOnStackReplacement is enabled
// Execution jumps here when the threshold is exceeded
// **** backedge_counter_overflow ****
// Negate the value in %rdx
0x00007fffe101df22: neg %rdx
// Add %r13 to %rdx. %r13 is the target bcp and %rdx held the offset, so these two steps recover the bcp of the branch instruction
0x00007fffe101df25: add %r13,%rdx
// When the back-edge count reaches the threshold,
// call InterpreterRuntime::frequency_counter_overflow() through call_VM()
0x00007fffe101df28: callq 0x00007fffe101df32
0x00007fffe101df2d: jmpq 0x00007fffe101dfb0
0x00007fffe101df32: mov %rdx,%rsi
0x00007fffe101df35: lea 0x8(%rsp),%rax
0x00007fffe101df3a: mov %r13,-0x38(%rbp)
0x00007fffe101df3e: mov %r15,%rdi
0x00007fffe101df41: mov %rbp,0x200(%r15)
0x00007fffe101df48: mov %rax,0x1f0(%r15)
0x00007fffe101df4f: test $0xf,%esp
0x00007fffe101df55: je 0x00007fffe101df6d
0x00007fffe101df5b: sub $0x8,%rsp
0x00007fffe101df5f: callq 0x00007ffff66b45c8
0x00007fffe101df64: add $0x8,%rsp
0x00007fffe101df68: jmpq 0x00007fffe101df72
0x00007fffe101df6d: callq 0x00007ffff66b45c8
0x00007fffe101df72: movabs $0x0,%r10
0x00007fffe101df7c: mov %r10,0x1f0(%r15)
0x00007fffe101df83: movabs $0x0,%r10
0x00007fffe101df8d: mov %r10,0x200(%r15)
0x00007fffe101df94: cmpq $0x0,0x8(%r15)
0x00007fffe101df9c: je 0x00007fffe101dfa7
0x00007fffe101dfa2: jmpq 0x00007fffe1000420
0x00007fffe101dfa7: mov -0x38(%rbp),%r13
0x00007fffe101dfab: mov -0x30(%rbp),%r14
0x00007fffe101dfaf: retq
// End of the call_VM() call
// Restore the bytecode to be executed
0x00007fffe101dfb0: movzbl 0x0(%r13),%ebx
// rax: osr nmethod (osr ok) or NULL (osr not possible)
// ebx: target bytecode
// rdx: scratch
// r14: locals pointer
// r13: bcp
// %rax holds the compilation result. If it is NULL, there is no suitable compiled code; otherwise on-stack replacement must be performed
// Check whether the result returned by frequency_counter_overflow() is NULL.
// If it is NULL, jump to ---- dispatch ----, that is, continue interpreting bytecode
0x00007fffe101dfb5: test %rax,%rax
0x00007fffe101dfb8: je 0x00007fffe101de30
// If it is not NULL, the method has been compiled. Load the value at the offset of nmethod::_entry_bci into %ecx
0x00007fffe101dfbe: mov 0x48(%rax),%ecx
// If %ecx equals InvalidOSREntryBci, jump to ---- dispatch ----
0x00007fffe101dfc1: cmp $0xfffffffe,%ecx
0x00007fffe101dfc4: je 0x00007fffe101de30
// Start on-stack replacement
// The compilation result is already in %rax, so executing it performs the on-stack replacement, but
// first the interpreter frame must be converted to a compiled frame, because the two calling conventions are completely different
// Save the value of %rax in %r13 temporarily, because the OSR_migration_begin() call below may
// clobber the value in %rax
0x00007fffe101dfca: mov %rax,%r13
// Call SharedRuntime::OSR_migration_begin() through call_VM()
// OSR_migration_begin() migrates the variables and monitors held in the stack frame
0x00007fffe101dfcd: callq 0x00007fffe101dfd7
0x00007fffe101dfd2: jmpq 0x00007fffe101e052
0x00007fffe101dfd7: lea 0x8(%rsp),%rax
0x00007fffe101dfdc: mov %r13,-0x38(%rbp)
0x00007fffe101dfe0: mov %r15,%rdi
0x00007fffe101dfe3: mov %rbp,0x200(%r15)
0x00007fffe101dfea: mov %rax,0x1f0(%r15)
0x00007fffe101dff1: test $0xf,%esp
0x00007fffe101dff7: je 0x00007fffe101e00f
0x00007fffe101dffd: sub $0x8,%rsp
0x00007fffe101e001: callq 0x00007ffff6a18a6a
0x00007fffe101e006: add $0x8,%rsp
0x00007fffe101e00a: jmpq 0x00007fffe101e014
0x00007fffe101e00f: callq 0x00007ffff6a18a6a
0x00007fffe101e014: movabs $0x0,%r10
0x00007fffe101e01e: mov %r10,0x1f0(%r15)
0x00007fffe101e025: movabs $0x0,%r10
0x00007fffe101e02f: mov %r10,0x200(%r15)
0x00007fffe101e036: cmpq $0x0,0x8(%r15)
0x00007fffe101e03e: je 0x00007fffe101e049
0x00007fffe101e044: jmpq 0x00007fffe1000420
0x00007fffe101e049: mov -0x38(%rbp),%r13
0x00007fffe101e04d: mov -0x30(%rbp),%r14
0x00007fffe101e051: retq
// %rax now holds the OSR buffer, which is passed as the first argument to the OSR-compiled code
// Copy the value from %rax into %rsi (j_rarg0)
0x00007fffe101e052: mov %rax,%rsi
// Load the value at interpreter_frame_sender_sp_offset
0x00007fffe101e055: mov -0x8(%rbp),%rdx
// leaveq is equivalent to mov %rbp,%rsp followed by pop %rbp
0x00007fffe101e059: leaveq
// Pop the return address into %rcx
0x00007fffe101e05a: pop %rcx
0x00007fffe101e05b: mov %rdx,%rsp
// -StackAlignmentInBytes is 0xfffffffffffffff0, which aligns the stack to 16 bytes
0x00007fffe101e05e: and $0xfffffffffffffff0,%rsp
// Push the return address back onto the stack
0x00007fffe101e062: push %rcx
// Jump to nmethod::_osr_entry_point and start executing
0x00007fffe101e063: jmpq *0x88(%r13)
When a hot code block is compiled and the call to InterpreterRuntime::frequency_counter_overflow() returns a suitable compilation result, on-stack replacement must be performed. Once the replacement is complete, execution switches directly from interpreted code to compiled code. How hot code is compiled and how SharedRuntime::OSR_migration_begin() performs stack frame migration and related operations will be described in detail later, not here.
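The decision taken after the counter overflows, together with the stack alignment step, can be summarized in a few lines of C++. This is only a sketch of the control flow visible in the listing: CompiledCode, Action and both function names are hypothetical, and the constants -2 and 16 are read off the cmp $0xfffffffe and and $0xfffffffffffffff0 instructions.

```cpp
#include <cassert>
#include <cstdint>

constexpr int32_t  kInvalidOsrEntryBci = -2;  // 0xfffffffe as a signed 32-bit value
constexpr uint64_t kStackAlignment     = 16;  // mask 0xfffffffffffffff0

// Hypothetical stand-in for the nmethod returned by the overflow handler.
struct CompiledCode {
    int32_t entry_bci;
};

enum class Action { KeepInterpreting, OnStackReplace };

// Mirrors the two checks that lead back to ---- dispatch ----.
inline Action after_counter_overflow(const CompiledCode* osr) {
    if (osr == nullptr) {
        return Action::KeepInterpreting;  // no suitable compiled code
    }
    if (osr->entry_bci == kInvalidOsrEntryBci) {
        return Action::KeepInterpreting;  // compiled code is not a usable OSR entry
    }
    return Action::OnStackReplace;
}

// Mirrors `and $0xfffffffffffffff0,%rsp`: round the stack pointer down.
inline uint64_t align_stack_down(uint64_t sp) {
    return sp & ~(kStackAlignment - 1);
}
```

Only the OnStackReplace outcome continues to the frame migration and the final jump into the compiled code; the other two outcomes resume interpretation at the branch target.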
The whole process is shown in the figure below.
The call to InterpreterRuntime::frequency_counter_overflow() compiles the code block, and the call to SharedRuntime::OSR_migration_begin() prepares the on-stack replacement; both will be described in detail later.