Detailed explanation of the C and Go function call process

Posted by anf.etienne on Sat, 09 Oct 2021 08:03:08 +0200

The previous article ended at the call instruction in main that calls the sum function.

At this point the CPU jumps to sum and starts executing the following instructions:

0x0000000000400526 <+0>:  push   %rbp
0x0000000000400527 <+1>:  mov    %rsp,%rbp
0x000000000040052a <+4>:  mov    %edi,-0x14(%rbp)
0x000000000040052d <+7>:  mov    %esi,-0x18(%rbp)
0x0000000000400530 <+10>: mov    -0x14(%rbp),%edx
0x0000000000400533 <+13>: mov    -0x18(%rbp),%eax
0x0000000000400536 <+16>: add    %edx,%eax
0x0000000000400538 <+18>: mov    %eax,-0x4(%rbp)
0x000000000040053b <+21>: mov    -0x4(%rbp),%eax
0x000000000040053e <+24>: pop    %rbp
0x000000000040053f <+25>: retq

The first two instructions of sum are the same as those of main.

0x0000000000400526 <+0>:  push   %rbp        # sum function prologue: save the caller's rbp
0x0000000000400527 <+1>:  mov    %rsp,%rbp   # sum function prologue: point rbp at the start of sum's own stack frame

Both functions save the caller's rbp and then point rbp at the start of the current function's stack frame. Here, sum saves main's rbp value (0x7fffffe510) and sets rbp to the start of its own stack frame (0x7fffffe4e0).

Unlike the prologue of main, however, the prologue of sum does not adjust rsp to reserve stack space for sum's local and temporary variables.

Does this mean that sum does not use the stack to store local variables?

The analysis below shows that sum's local variable s does live on the stack; it is simply used without being reserved first. As mentioned earlier, application code does not need to allocate stack memory itself: the operating system has already set it up, so it can be used directly.

main has to lower rsp to reserve stack space for its local and temporary variables because main itself calls sum with the call instruction. call automatically subtracts 8 from rsp and then stores the return address at the stack location rsp points to. If main did not adjust rsp first, call would write the return address over main's local or temporary variables. sum, by contrast, executes no instruction after its prologue that implicitly stores data through rsp, so it has no need to adjust rsp. (On Linux x86-64 this is backed by the System V AMD64 ABI, which sets aside the 128 bytes below rsp, the so-called red zone, for exactly this kind of use by functions that call nothing else.)

Look at the four instructions that follow.

0x000000000040052a <+4>:  mov    %edi,-0x14(%rbp)   # store the first parameter a in a temporary variable
0x000000000040052d <+7>:  mov    %esi,-0x18(%rbp)   # store the second parameter b in a temporary variable
0x0000000000400530 <+10>: mov    -0x14(%rbp),%edx   # read the first parameter from the temporary variable into edx
0x0000000000400533 <+13>: mov    -0x18(%rbp),%eax   # read the second parameter from the temporary variable into eax

These instructions save the two parameters passed from main to sum at fixed offsets from rbp in the current stack frame, and then immediately load them back into registers. This looks redundant. The reason is that no optimization level was given to gcc at compile time, and gcc performs no optimization by default, so the generated code is verbose.

Let's look at the next three instructions.

0x0000000000400536 <+16>: add    %edx,%eax          # compute a + b and keep the result in eax
0x0000000000400538 <+18>: mov    %eax,-0x4(%rbp)    # assign the result of the addition to the variable s
0x000000000040053b <+21>: mov    -0x4(%rbp),%eax    # read the value of s into eax

The first instruction performs the addition and leaves the result in eax. The second stores the value of eax in the memory that holds the local variable s. The third reads the value of s back into eax. From this we can see that the compiler placed the local variable s at address rbp - 0x4.

At this point sum has done its main work. Let's take a look at the state of the stack and registers:

Note that both parameters and the return value of sum are of type int and occupy only 4 bytes each, whereas every stack cell in the figure is 8 bytes wide and aligned on an 8-byte boundary, which is why the figure looks the way it does.

Next, the pop %rbp instruction is executed. It performs the following two operations:

  1. Load the value in the stack cell that rsp currently points to into rbp. This restores rbp to the value it had before the first instruction of sum was executed, that is, it points to the start of main's stack frame again.

  2. Add 8 to rsp, so that rsp points to the stack cell holding 0x40055e. That value was pushed by call when main called sum; it is the address of the instruction immediately after the call.

The status diagram is as follows:

Next, the retq instruction is executed. It loads 0x40055e from the stack cell rsp points to into rip and, at the same time, adds 8 to rsp. rip now holds the address of the instruction in main that follows the call to sum, so execution returns to main and continues there.

At this time, the value in eax is 3, that is, the value returned after the execution of sum. Take a look at the following status chart:

Execution continues with the following instruction:

mov   %eax,-0x4(%rbp)  # Assign the return value of sum function to variable n

The instruction above stores the value in eax (3) at address rbp - 0x4, which is where main's local variable n lives; in other words, it assigns the return value of sum to n. The state diagram now looks like this:

Further instructions are as follows:

0x0000000000400561 <+33>: mov    -0x4(%rbp),%eax
0x0000000000400564 <+36>: mov    %eax,%esi
0x0000000000400566 <+38>: mov    $0x400604,%edi
0x000000000040056b <+43>: mov    $0x0,%eax
0x0000000000400570 <+48>: callq  0x400400 <printf@plt>
0x0000000000400575 <+53>: mov    $0x0,%eax

These instructions first set up the arguments for printf and then call it. The process is similar to the call to sum, so we skip ahead to the second-to-last instruction of main, leaveq. At this point the stack and registers look like this:

The mov $0x0,%eax instruction just before leaveq places main's return value, 0, in eax, so that once main returns, its caller can read the return value from eax.

Executing leaveq is equivalent to executing the following two instructions:

mov %rbp, %rsp
pop %rbp

leaveq first copies rbp into rsp, so that rsp points to the stack cell rbp refers to, and then pops that cell into rbp. This restores rsp and rbp to the values they had on entry to main, as follows:

Only the retq instruction is now left in main. As in the analysis of sum, it returns control to the function that called main, which then continues executing.

So far, the function calling process of C has been introduced. Next, let's talk about the Go function calling process.


Topics: C Go