Title 49 - calling conventions

Posted by dotbob on Sat, 01 Jan 2022 02:19:11 +0100

JVM bytecode uses some technical terms. A call site is a location in a method (the caller) where another method (the callee) is called. For a non-static method call, the method is always resolved against an object. This object is called the receiver, and its runtime type is called the receiver type.

The calling convention defines the application binary interface (ABI) that governs how control flow is transferred into and out of a function (or method): how parameters and return values are passed, and how the stack is set up and restored. In some cases, stack frame information must also be maintained to support debugging, exception handling and garbage collection.

Once the calling convention for a language on a platform is defined, every compiler that generates code for that language on that platform should follow it. Code written in different languages can interoperate if it follows the same calling convention. For example, when the interpreted main() method described earlier calls a native function, the interpreter pushes the arguments from left to right, whereas the native function expects them pushed from right to left. Special conversion routines therefore have to be generated so that the two sides can interact.

HotSpot VM has roughly three kinds of stack frames: frames for C/C++ functions, Java interpreted frames and Java compiled frames. Correspondingly, there are calls to C/C++ functions, to interpreted Java methods and to compiled Java methods, and each kind of call follows its own calling convention. The sections below describe each convention in detail.

1. Calling convention for Java interpreted execution

Java interpreted execution has been covered in detail earlier. Here we summarize the conventions related to calls.

During interpreted execution, certain values must be placed in specific registers in advance:

value               x86_64
interp. method      RBX
interp. arg ptr     RSP+8
interp. saved sp    RSI

Some registers are generally reserved to store some specific values, as follows:

value         x86_64
JavaThread    R15
HeapBase      R12

To facilitate the use of R12 and R15 registers in the program, HotSpot VM defines the corresponding variables as follows:

REGISTER_DECLARATION(Register, r12_heapbase, r12); // callee-saved
REGISTER_DECLARATION(Register, r15_thread, r15); // callee-saved

The macro is defined in the register.hpp file, as follows:

#define REGISTER_DECLARATION(type, name, value) \
extern const type name;                         \
enum { name##_##type##EnumValue = value##_##type##EnumValue }

After macro expansion:

extern const Register  r12_heapbase;
enum { r12_heapbase_RegisterEnumValue = r12_RegisterEnumValue };
extern const Register  r15_thread;
enum { r15_thread_RegisterEnumValue = r15_RegisterEnumValue };

Then the corresponding value will be assigned to the corresponding variable, as follows:

const Register  r12_heapbase = ((Register)r12_heapbase_RegisterEnumValue);
const Register  r15_thread   = ((Register)r15_thread_RegisterEnumValue);  
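The declaration and definition steps above can be tried outside HotSpot with a small stand-alone sketch. It is only an illustration: the Register struct, the REGISTER_DEFINITION helper as written here and the check_aliases function are simplified stand-ins, not the real HotSpot types, and the numeric encodings are assumed for the example.

```cpp
#include <cassert>

// Toy stand-in for HotSpot's Register type (assumption: a plain struct that
// only carries the register encoding).
struct Register {
  int encoding;
};

// Encodings assumed for the two physical registers used in this sketch.
enum { r12_RegisterEnumValue = 12, r15_RegisterEnumValue = 15 };

// Same shape as the macro quoted above: declare the constant and create an
// enumerator that aliases the encoding of an existing register.
#define REGISTER_DECLARATION(type, name, value) \
  extern const type name;                       \
  enum { name##_##type##EnumValue = value##_##type##EnumValue }

// Simplified definition step (assumption: brace initialisation is used here
// instead of the cast shown in the article).
#define REGISTER_DEFINITION(type, name) \
  const type name = { name##_##type##EnumValue }

REGISTER_DECLARATION(Register, r12_heapbase, r12);
REGISTER_DECLARATION(Register, r15_thread, r15);

REGISTER_DEFINITION(Register, r12_heapbase);
REGISTER_DEFINITION(Register, r15_thread);

// The alias enumerators are compile-time constants.
static_assert(r12_heapbase_RegisterEnumValue == r12_RegisterEnumValue, "alias");
static_assert(r15_thread_RegisterEnumValue == r15_RegisterEnumValue, "alias");

// Run-time check that each named variable carries the aliased encoding.
void check_aliases() {
  assert(r12_heapbase.encoding == 12);
  assert(r15_thread.encoding == 15);
}
```

The point of the pattern is that a descriptive name such as r15_thread costs nothing at run time: it is just another constant with the same encoding as r15.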

When compressed pointers are enabled, R12 generally holds the base address of the heap.

value              x86_64
interp. java sp    RSP
interp. fp         RBP

RSP points to the top of the current stack frame and RBP to its bottom. Both registers therefore have a fixed use and do not take part in other computation or parameter passing.

In interpreted execution, the Java virtual machine uses the local variable table to pass parameters during a method call. When a method is called, its parameters are stored in consecutive local variable slots starting from slot 0. For an instance method, slot 0 must hold a reference to the object on which the method is invoked, that is, the object denoted by the this keyword. Note that a parameter of type long or double occupies two slots. On x86-64, a slot is 8 bytes and can already hold a long or double value, but a second slot is still reserved and set to NULL.
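The slot numbering rule can be sketched in a few lines. This is a hypothetical helper, not HotSpot code: local_slots, parameter_slot_count and check_local_slots are names invented for the example, and parameter types are written with JVM descriptor characters (I for int, J for long, F for float, D for double, L for a reference).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the local variable slot assigned to each parameter.
std::vector<int> local_slots(bool is_static, const std::string& params) {
  std::vector<int> slots;
  int next = is_static ? 0 : 1;  // slot 0 holds `this` for instance methods
  for (char c : params) {
    slots.push_back(next);
    next += (c == 'J' || c == 'D') ? 2 : 1;  // long and double take two slots
  }
  return slots;
}

// Total number of slots occupied by the parameters, including `this`.
int parameter_slot_count(bool is_static, const std::string& params) {
  int count = is_static ? 0 : 1;
  for (char c : params) {
    count += (c == 'J' || c == 'D') ? 2 : 1;
  }
  return count;
}

// Example: an instance method taking (int, long, int).
void check_local_slots() {
  const std::vector<int> slots = local_slots(false, "IJI");
  assert(slots.size() == 3);
  assert(slots[0] == 1);  // int follows `this`
  assert(slots[1] == 2);  // long occupies slots 2 and 3
  assert(slots[2] == 4);  // next int starts after the long
  assert(parameter_slot_count(false, "IJI") == 5);
}
```

For a static method the same parameters start at slot 0 because there is no receiver to store.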

The result of interpreted execution is stored at the top of the stack, and the arguments pushed by the caller are cleaned up by the caller, because the callee does not necessarily know the location and number of the parameters.

2. Calling convention for Java compiled execution

Java compiled execution has not been covered yet, but its calling convention is introduced here first.

During compiled execution, certain values must be placed in specific registers in advance:

value              x86_64
java int args      j_rarg 0..5
java long args     same as ints
java float args    j_farg 0..7
inline cache       RAX

Six registers, j_rarg0 through j_rarg5, pass non-floating-point parameters such as int and boolean. Parameters of type long are passed through the j_rarg registers as well, because on x86-64 a register is 8 bytes wide and can hold a long value. Floating-point parameters are passed through the eight registers j_farg0 through j_farg7.

j_rarg and j_farg related registers are defined as follows:

REGISTER_DECLARATION(Register, j_rarg0, c_rarg1);
REGISTER_DECLARATION(Register, j_rarg1, c_rarg2);
REGISTER_DECLARATION(Register, j_rarg2, c_rarg3);
REGISTER_DECLARATION(Register, j_rarg3, c_rarg4);
REGISTER_DECLARATION(Register, j_rarg4, c_rarg5);
REGISTER_DECLARATION(Register, j_rarg5, c_rarg0);
 
REGISTER_DECLARATION(XMMRegister, j_farg0, xmm0);
REGISTER_DECLARATION(XMMRegister, j_farg1, xmm1);
REGISTER_DECLARATION(XMMRegister, j_farg2, xmm2);
REGISTER_DECLARATION(XMMRegister, j_farg3, xmm3);
REGISTER_DECLARATION(XMMRegister, j_farg4, xmm4);
REGISTER_DECLARATION(XMMRegister, j_farg5, xmm5);
REGISTER_DECLARATION(XMMRegister, j_farg6, xmm6);
REGISTER_DECLARATION(XMMRegister, j_farg7, xmm7);

After macro expansion:

extern const Register  j_rarg0;
enum { j_rarg0_RegisterEnumValue = c_rarg1_RegisterEnumValue };
extern const Register  j_rarg1;
enum { j_rarg1_RegisterEnumValue = c_rarg2_RegisterEnumValue };
extern const Register  j_rarg2;
enum { j_rarg2_RegisterEnumValue = c_rarg3_RegisterEnumValue };
extern const Register  j_rarg3;
enum { j_rarg3_RegisterEnumValue = c_rarg4_RegisterEnumValue };
extern const Register  j_rarg4;
enum { j_rarg4_RegisterEnumValue = c_rarg5_RegisterEnumValue };
extern const Register  j_rarg5;
enum { j_rarg5_RegisterEnumValue = c_rarg0_RegisterEnumValue };

extern const XMMRegister  j_farg0;
enum { j_farg0_XMMRegisterEnumValue = xmm0_XMMRegisterEnumValue };
extern const XMMRegister  j_farg1;
enum { j_farg1_XMMRegisterEnumValue = xmm1_XMMRegisterEnumValue };
extern const XMMRegister  j_farg2;
enum { j_farg2_XMMRegisterEnumValue = xmm2_XMMRegisterEnumValue };
extern const XMMRegister  j_farg3;
enum { j_farg3_XMMRegisterEnumValue = xmm3_XMMRegisterEnumValue };
extern const XMMRegister  j_farg4;
enum { j_farg4_XMMRegisterEnumValue = xmm4_XMMRegisterEnumValue };
extern const XMMRegister  j_farg5;
enum { j_farg5_XMMRegisterEnumValue = xmm5_XMMRegisterEnumValue };
extern const XMMRegister  j_farg6;
enum { j_farg6_XMMRegisterEnumValue = xmm6_XMMRegisterEnumValue };
extern const XMMRegister  j_farg7;
enum { j_farg7_XMMRegisterEnumValue = xmm7_XMMRegisterEnumValue };

Assign the corresponding value to the corresponding variable as follows:

const Register     j_rarg0 = ((Register)j_rarg0_RegisterEnumValue);
const Register     j_rarg1 = ((Register)j_rarg1_RegisterEnumValue);
const Register     j_rarg2 = ((Register)j_rarg2_RegisterEnumValue);
const Register     j_rarg3 = ((Register)j_rarg3_RegisterEnumValue);
const Register     j_rarg4 = ((Register)j_rarg4_RegisterEnumValue);
const Register     j_rarg5 = ((Register)j_rarg5_RegisterEnumValue);

const XMMRegister  j_farg0 = ((XMMRegister)j_farg0_XMMRegisterEnumValue);
const XMMRegister  j_farg1 = ((XMMRegister)j_farg1_XMMRegisterEnumValue);
const XMMRegister  j_farg2 = ((XMMRegister)j_farg2_XMMRegisterEnumValue);
const XMMRegister  j_farg3 = ((XMMRegister)j_farg3_XMMRegisterEnumValue);
const XMMRegister  j_farg4 = ((XMMRegister)j_farg4_XMMRegisterEnumValue);
const XMMRegister  j_farg5 = ((XMMRegister)j_farg5_XMMRegisterEnumValue);
const XMMRegister  j_farg6 = ((XMMRegister)j_farg6_XMMRegisterEnumValue);
const XMMRegister  j_farg7 = ((XMMRegister)j_farg7_XMMRegisterEnumValue);

The prefix j in j_rarg and j_farg indicates registers used during Java compiled execution: j_rarg0 passes the first non-floating-point parameter, j_farg0 passes the first floating-point parameter, and so on. The registers corresponding to j_rarg0 through j_rarg5 are as follows:

reg. arg         int#0     int#1     int#2     int#3     int#4     int#5     float regs
name             j_rarg0   j_rarg1   j_rarg2   j_rarg3   j_rarg4   j_rarg5   j_farg0..j_farg7
Linux/Solaris    RSI       RDX       RCX       R8        R9        RDI       XMM0..XMM7

j_farg0 through j_farg7 correspond to the eight registers xmm0 through xmm7.
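As a rough illustration of the table, the following sketch assigns each parameter of a compiled Java call to a register in the order listed above. The function java_arg_locations and its checker are hypothetical, not HotSpot functions; parameter types are given as JVM descriptor characters, the receiver of an instance method must be listed explicitly as a leading L, and anything that does not fit in a register is simply labelled "stack" without modelling the stack layout.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assigns each parameter to a register following the j_rarg / j_farg order.
std::vector<std::string> java_arg_locations(const std::string& params) {
  // j_rarg0..j_rarg5 on Linux/Solaris, as listed in the table above.
  static const char* const kIntRegs[] = {"RSI", "RDX", "RCX", "R8", "R9", "RDI"};
  // j_farg0..j_farg7.
  static const char* const kFpRegs[] = {"XMM0", "XMM1", "XMM2", "XMM3",
                                        "XMM4", "XMM5", "XMM6", "XMM7"};
  std::vector<std::string> locations;
  int next_int = 0;
  int next_fp = 0;
  for (char c : params) {
    const bool is_fp = (c == 'F' || c == 'D');
    if (is_fp && next_fp < 8) {
      locations.push_back(kFpRegs[next_fp++]);
    } else if (!is_fp && next_int < 6) {
      // int, long and references all take one integer register.
      locations.push_back(kIntRegs[next_int++]);
    } else {
      locations.push_back("stack");
    }
  }
  return locations;
}

// Example: receiver, int, long, float, double.
void check_java_arg_locations() {
  const std::vector<std::string> loc = java_arg_locations("LIJFD");
  assert(loc.size() == 5);
  assert(loc[0] == "RSI");
  assert(loc[1] == "RDX");
  assert(loc[2] == "RCX");
  assert(loc[3] == "XMM0");
  assert(loc[4] == "XMM1");
}
```

Integer and floating-point parameters are counted separately, so a float between two ints does not consume an integer register.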

3. Calling convention for native methods

This section describes the calling convention of the native function that implements a native method. The CPU architecture, word size and operating system together determine how calls are made. Only the convention for 64-bit Linux on x86 is covered here.

When native functions are executed, the following values and locations are fixed by convention:

value              x86_64
sp alignment       16 bytes
long result        RAX
int result         RAX
native sp          RSP
return pc          RSP(0)
stack arg #i       RSP(8+i*8)
float result       XMM0
reg. int args      (see below)
reg. long args     same as ints
reg. float args    (see below)

On x86-64, the first six non-floating-point arguments and the first eight floating-point arguments are passed in registers. The remaining arguments are passed on the stack and pushed from right to left, so the last argument is pushed first and sits deepest in the stack.

The conventions of registers used to pass parameters are as follows:

reg. arg         int#0     int#1     int#2     int#3     int#4     int#5     float regs
name             c_rarg0   c_rarg1   c_rarg2   c_rarg3   c_rarg4   c_rarg5   c_farg0..c_farg7
Linux/Solaris    RDI       RSI       RDX       RCX       R8        R9        XMM0..XMM7
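Combining the two tables for native calls, a small sketch can work out where each argument of a native function is found on entry. It is an illustration under simplifying assumptions: native_arg_locations and its checker are hypothetical names, parameter types are written as descriptor characters (F and D for floating point, anything else for integer-like values), and every stack argument is assumed to take one 8-byte slot at RSP(8+i*8).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the location of each argument as seen by the callee on entry.
std::vector<std::string> native_arg_locations(const std::string& params) {
  // c_rarg0..c_rarg5 on Linux/Solaris.
  static const char* const kIntRegs[] = {"RDI", "RSI", "RDX", "RCX", "R8", "R9"};
  // c_farg0..c_farg7.
  static const char* const kFpRegs[] = {"XMM0", "XMM1", "XMM2", "XMM3",
                                        "XMM4", "XMM5", "XMM6", "XMM7"};
  std::vector<std::string> locations;
  int next_int = 0;
  int next_fp = 0;
  int next_stack = 0;  // index i of the next stack argument
  for (char c : params) {
    const bool is_fp = (c == 'F' || c == 'D');
    if (is_fp && next_fp < 8) {
      locations.push_back(kFpRegs[next_fp++]);
    } else if (!is_fp && next_int < 6) {
      locations.push_back(kIntRegs[next_int++]);
    } else {
      // RSP(0) holds the return pc, so stack argument #i sits at RSP(8+i*8).
      locations.push_back("RSP(" + std::to_string(8 + 8 * next_stack) + ")");
      ++next_stack;
    }
  }
  return locations;
}

// Example: eight integer arguments, the last two spill to the stack.
void check_native_arg_locations() {
  const std::vector<std::string> loc = native_arg_locations("IIIIIIII");
  assert(loc.size() == 8);
  assert(loc[0] == "RDI");
  assert(loc[5] == "R9");
  assert(loc[6] == "RSP(8)");
  assert(loc[7] == "RSP(16)");
}
```

Because arguments are pushed from right to left, the first argument that does not fit in a register ends up closest to the return address.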

For convenience, the following variables are defined so that the corresponding registers can be referred to by name in the program.

REGISTER_DECLARATION(Register, c_rarg0, rdi);
REGISTER_DECLARATION(Register, c_rarg1, rsi);
REGISTER_DECLARATION(Register, c_rarg2, rdx);
REGISTER_DECLARATION(Register, c_rarg3, rcx);
REGISTER_DECLARATION(Register, c_rarg4, r8);
REGISTER_DECLARATION(Register, c_rarg5, r9);
 
REGISTER_DECLARATION(XMMRegister, c_farg0, xmm0);
REGISTER_DECLARATION(XMMRegister, c_farg1, xmm1);
REGISTER_DECLARATION(XMMRegister, c_farg2, xmm2);
REGISTER_DECLARATION(XMMRegister, c_farg3, xmm3);
REGISTER_DECLARATION(XMMRegister, c_farg4, xmm4);
REGISTER_DECLARATION(XMMRegister, c_farg5, xmm5);
REGISTER_DECLARATION(XMMRegister, c_farg6, xmm6);
REGISTER_DECLARATION(XMMRegister, c_farg7, xmm7);

The results after macro expansion are as follows:

extern const Register  c_rarg0;
enum { c_rarg0_RegisterEnumValue = rdi_RegisterEnumValue };
extern const Register  c_rarg1;
enum { c_rarg1_RegisterEnumValue = rsi_RegisterEnumValue };
extern const Register  c_rarg2;
enum { c_rarg2_RegisterEnumValue = rdx_RegisterEnumValue };
extern const Register  c_rarg3;
enum { c_rarg3_RegisterEnumValue = rcx_RegisterEnumValue };
extern const Register  c_rarg4;
enum { c_rarg4_RegisterEnumValue = r8_RegisterEnumValue };
extern const Register  c_rarg5;
enum { c_rarg5_RegisterEnumValue = r9_RegisterEnumValue };

extern const XMMRegister  c_farg0;
enum { c_farg0_XMMRegisterEnumValue = xmm0_XMMRegisterEnumValue };
extern const XMMRegister  c_farg1;
enum { c_farg1_XMMRegisterEnumValue = xmm1_XMMRegisterEnumValue };
extern const XMMRegister  c_farg2;
enum { c_farg2_XMMRegisterEnumValue = xmm2_XMMRegisterEnumValue };
extern const XMMRegister  c_farg3;
enum { c_farg3_XMMRegisterEnumValue = xmm3_XMMRegisterEnumValue };
extern const XMMRegister  c_farg4;
enum { c_farg4_XMMRegisterEnumValue = xmm4_XMMRegisterEnumValue };
extern const XMMRegister  c_farg5;
enum { c_farg5_XMMRegisterEnumValue = xmm5_XMMRegisterEnumValue };
extern const XMMRegister  c_farg6;
enum { c_farg6_XMMRegisterEnumValue = xmm6_XMMRegisterEnumValue };
extern const XMMRegister  c_farg7;
enum { c_farg7_XMMRegisterEnumValue = xmm7_XMMRegisterEnumValue };

Assign the corresponding value to the corresponding variable as follows:

const Register     c_rarg0 = ((Register)c_rarg0_RegisterEnumValue);
const Register     c_rarg1 = ((Register)c_rarg1_RegisterEnumValue);
const Register     c_rarg2 = ((Register)c_rarg2_RegisterEnumValue);
const Register     c_rarg3 = ((Register)c_rarg3_RegisterEnumValue);
const Register     c_rarg4 = ((Register)c_rarg4_RegisterEnumValue);
const Register     c_rarg5 = ((Register)c_rarg5_RegisterEnumValue);

const XMMRegister  c_farg0 = ((XMMRegister)c_farg0_XMMRegisterEnumValue);
const XMMRegister  c_farg1 = ((XMMRegister)c_farg1_XMMRegisterEnumValue);
const XMMRegister  c_farg2 = ((XMMRegister)c_farg2_XMMRegisterEnumValue);
const XMMRegister  c_farg3 = ((XMMRegister)c_farg3_XMMRegisterEnumValue);
const XMMRegister  c_farg4 = ((XMMRegister)c_farg4_XMMRegisterEnumValue);
const XMMRegister  c_farg5 = ((XMMRegister)c_farg5_XMMRegisterEnumValue);
const XMMRegister  c_farg6 = ((XMMRegister)c_farg6_XMMRegisterEnumValue);
const XMMRegister  c_farg7 = ((XMMRegister)c_farg7_XMMRegisterEnumValue);

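Both compiled Java methods and native functions pass non-floating-point parameters through the same six registers. Putting the two sets of definitions above side by side gives the following mapping:

register         RDI       RSI       RDX       RCX       R8        R9
C/C++ name       c_rarg0   c_rarg1   c_rarg2   c_rarg3   c_rarg4   c_rarg5
Java name        j_rarg5   j_rarg0   j_rarg1   j_rarg2   j_rarg3   j_rarg4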

In other words, compiled Java code passes parameters through the same six registers that C/C++ functions use, but assigns them in a different order. For example, the first Java parameter uses %rsi, while the first C/C++ parameter uses %rdi. There is a reason for this. When compiled Java code calls a native instance method, the native function that implements it takes one extra leading parameter, JNIEnv*, so it is enough to place that value in %rdi. Only a Java parameter that happens to be in %rdi has to be saved to the stack; the other registers do not need to be moved.

For a native static method, the registers do have to be moved, because the corresponding native function takes two extra parameters, JNIEnv* and jclass.
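The benefit of the rotated register order can be checked with a short sketch that counts how many integer parameters change location when compiled Java code calls a native function. This is a simplified model, not the wrapper HotSpot actually generates: stays_in_place, moves_needed and the checker are hypothetical names, only the register an argument lives in is tracked (any conversion of the value itself is ignored), and every argument that is on the stack on either side is counted as a move.

```cpp
#include <cassert>
#include <string>

// Integer argument registers in C order (c_rarg0..c_rarg5) and in
// Java order (j_rarg0..j_rarg5), taken from the definitions above.
static const char* const kCArgRegs[6] = {"RDI", "RSI", "RDX", "RCX", "R8", "R9"};
static const char* const kJavaArgRegs[6] = {"RSI", "RDX", "RCX", "R8", "R9", "RDI"};

// True when Java integer argument i is already in the register that the
// native function expects for its argument i + extra.
bool stays_in_place(int i, int extra) {
  const int c_index = i + extra;
  if (i >= 6 || c_index >= 6) {
    return false;  // at least one side is on the stack
  }
  return std::string(kJavaArgRegs[i]) == kCArgRegs[c_index];
}

// Counts the Java integer arguments that have to be moved. `extra` is the
// number of leading parameters added on the native side: 1 for JNIEnv*
// (instance method), 2 for JNIEnv* and jclass (static method).
int moves_needed(int java_int_args, int extra) {
  int moves = 0;
  for (int i = 0; i < java_int_args; ++i) {
    if (!stays_in_place(i, extra)) {
      ++moves;
    }
  }
  return moves;
}

void check_moves_needed() {
  // Instance method: receiver plus four more integer arguments.
  assert(moves_needed(5, 1) == 0);
  // A sixth integer argument lives in RDI and must give way to JNIEnv*.
  assert(moves_needed(6, 1) == 1);
  // Static method with three integer arguments: every one shifts.
  assert(moves_needed(3, 2) == 3);
}
```

With one extra leading parameter the Java order lines up exactly with the C order shifted by one, which is why the instance-method case needs no register moves in this model.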
