Introduction to assembly language
Introduction to assembly language 1: Environment setup
Execution:
sudo apt-get update
Execution:
sudo apt-get install nasm
Check:
New file: vi t.c
int main() { return 0; }
New file: first.asm
global main

main:
    mov eax, 0
    ret
Assemble and link to generate the executable first
32-bit system:
$ nasm -f elf first.asm -o first.o
$ gcc -m32 first.o -o first
64-bit system:
$ nasm -f elf64 first.asm -o first.o
$ gcc -m64 first.o -o first
result:
Run:
$ ./first ; echo $?
result:
Introduction to assembly language 2: try out the environment first
New file: vi test.asm
global main

main:
    mov eax, 1
    mov ebx, 2
    add eax, ebx
    ret
Compile and run:
dontla@dontla-virtual-machine:~/desktop/test$ nasm -f elf64 test.asm -o test.o
dontla@dontla-virtual-machine:~/desktop/test$ gcc -m64 test.o -o test
dontla@dontla-virtual-machine:~/desktop/test$ ./test ; echo $?
3
dontla@dontla-virtual-machine:~/desktop/test$
Exercise 1:
global main

main:
    mov eax, 1
    add eax, 2
    add eax, 3
    add eax, 4
    add eax, 5
    ret
result:
dontla@dontla-virtual-machine:~/desktop/test$ nasm -f elf64 test.asm -o test.o
dontla@dontla-virtual-machine:~/desktop/test$ gcc -m64 test.o -o test
dontla@dontla-virtual-machine:~/desktop/test$ ./test ; echo $?
15
Exercise 2:
global main

main:
    mov eax, 1
    mov ebx, 2
    mov ecx, 3
    mov edx, 4
    add eax, ebx
    add eax, ecx
    add eax, edx
    ret
result:
dontla@dontla-virtual-machine:~/desktop/test$ nasm -f elf64 test.asm -o test.o
dontla@dontla-virtual-machine:~/desktop/test$ gcc -m64 test.o -o test
dontla@dontla-virtual-machine:~/desktop/test$ ./test ; echo $?
10
Introduction to simple instructions
mov
mov is the data transfer instruction. It copies a value into a register, as follows:
    mov eax, 1      ; set eax to 1 (eax = 1)
    mov ebx, 2      ; set ebx to 2 (ebx = 2)
    mov ecx, eax    ; copy the value of eax into ecx (ecx = eax)
add
Addition instruction:
    add eax, 2      ; eax = eax + 2
    add ebx, eax    ; ebx = ebx + eax
ret
Return instruction. It is similar to return in C and is used to return from a function call (described in detail later).
sub
Subtraction instruction (works like the addition instruction):
    sub eax, 1      ; eax = eax - 1
    sub eax, ecx    ; eax = eax - ecx
More registers
In addition to eax, ebx, ecx and edx used above, there are some more registers:
    esi
    edi
    ebp
eax, ebx, ecx and edx are general-purpose registers that can hold arbitrary data and take part in most operations. The remaining three show up more often in memory-access scenarios, but for now you can pick any of them and use it.
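To make the register arithmetic above concrete, here is a minimal C sketch that treats each register as a plain 32-bit variable and replays Exercise 2. The struct Regs and the helpers reg_mov, reg_add and reg_sub are hypothetical names invented for this illustration; they are not part of NASM or of any library.

```c
#include <stdint.h>

/* Hypothetical model: each 32-bit register is just a uint32_t field. */
typedef struct {
    uint32_t eax, ebx, ecx, edx;
} Regs;

/* mov dst, src : copy src into dst */
void reg_mov(uint32_t *dst, uint32_t src) { *dst = src; }

/* add dst, src : dst = dst + src (wraps around at 32 bits, like the CPU) */
void reg_add(uint32_t *dst, uint32_t src) { *dst += src; }

/* sub dst, src : dst = dst - src */
void reg_sub(uint32_t *dst, uint32_t src) { *dst -= src; }

/* Replays Exercise 2 and returns what is left in eax at "ret". */
uint32_t exercise_2(void) {
    Regs r = {0, 0, 0, 0};
    reg_mov(&r.eax, 1);
    reg_mov(&r.ebx, 2);
    reg_mov(&r.ecx, 3);
    reg_mov(&r.edx, 4);
    reg_add(&r.eax, r.ebx);
    reg_add(&r.eax, r.ecx);
    reg_add(&r.eax, r.edx);
    return r.eax;
}
```

Calling exercise_2() gives 10, the same value the assembly version leaves in eax.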
Introduction to assembly language 3: it's time to go to memory
Relationship between cpu, register and memory
Pointer and memory
Register and memory
Memory is outside the CPU and the registers are inside it. There are only a limited number of registers, because adding more makes the CPU more expensive.
Assign the value of a register to memory
    mov [0x0699], eax
    mov [0x0998], ebx
    mov [0x1299], ecx
    mov [0x1499], edx
    mov [0x1999], esi
Assign the value of memory to a register
    mov eax, [0x0699]
    mov ebx, [0x0998]
    mov ecx, [0x1299]
    mov edx, [0x1499]
    mov esi, [0x1999]
Hands on programming
global main

main:
    mov ebx, 1
    mov ecx, 2
    add ebx, ecx
    mov [sui_bian_xie], ebx
    mov eax, [sui_bian_xie]
    ret

section .data
sui_bian_xie dd 0    ; dd reserves 4 bytes, matching the 32-bit mov (dw would reserve only 2)
result (note the -no-pie flag on the gcc line):
dontla@dontla-virtual-machine:~/desktop/test$ nasm -f elf64 test.asm -o test.o
dontla@dontla-virtual-machine:~/desktop/test$ gcc -m64 test.o -o test -no-pie
dontla@dontla-virtual-machine:~/desktop/test$ ./test ; echo $?
3
dontla@dontla-virtual-machine:~/desktop/test$ ls -lh
total 24K
-rwxrwxr-x 1 dontla dontla 16K 6 March 22:32 test
-rw-rw-r-- 1 dontla dontla 151 6 March 22:08 test.asm
-rw-rw-r-- 1 dontla dontla 848 6 March 22:32 test.o
dontla@dontla-virtual-machine:~/desktop/test$
Without -no-pie, the link step fails with this error:
relocation R_X86_64_32S against `.data' can not be used when making a PIE object; recompile with -fP
Explanation:
    mov ebx, 1                  ; set ebx to 1
    mov ecx, 2                  ; set ecx to 2
    add ebx, ecx                ; ebx = ebx + ecx
    mov [sui_bian_xie], ebx     ; save the value of ebx to memory
    mov eax, [sui_bian_xie]     ; read the saved value back into eax
    ret                         ; return; the program's return value is the value in eax
Note: the value in eax when main returns becomes the exit status of the whole program (the shell's $? shows its low 8 bits). This is a convention of the environment we are using, and we follow it.
These two lines of code:
section .data
sui_bian_xie dd 0
The first line says that what follows goes into the data section of the executable, and the corresponding memory is allocated when the program starts. (When there are not enough registers, we set aside memory as temporary storage.)
The second line describes the actual data: it reserves 4 bytes and fills them with 0. dd (define doubleword) means four bytes. In NASM, db defines a byte (1 byte), dw a word (2 bytes) and dd a doubleword (4 bytes). Because the code stores the 32-bit register ebx, the variable needs dd; dw would reserve only 2 bytes. The name sui_bian_xie in front (pinyin for "write whatever you like") is just a label that you choose so you can tell things apart when writing code. The assembler and linker turn sui_bian_xie into a concrete address. We do not need to care what that address is; every occurrence of sui_bian_xie refers to the same location.
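The size of the reserved space matters because a mov from a 32-bit register always writes 4 bytes. NASM's dw reserves 2 bytes and dd reserves 4. The following C sketch models memory as a byte array and shows what two neighbouring variables look like when they are spaced 2 bytes apart versus 4 bytes apart. The helper names store32, load32 and first_after_two_stores are made up for this illustration, and the expected values assume a little-endian machine such as x86.

```c
#include <stdint.h>
#include <string.h>

/* Write 4 bytes at byte offset `off`, like `mov [addr], eax`. */
void store32(uint8_t *mem, size_t off, uint32_t value) {
    memcpy(mem + off, &value, 4);
}

/* Read 4 bytes from byte offset `off`, like `mov eax, [addr]`. */
uint32_t load32(const uint8_t *mem, size_t off) {
    uint32_t v;
    memcpy(&v, mem + off, 4);
    return v;
}

/*
 * Two variables laid out `slot` bytes apart:
 *   slot = 2 models two `dw` definitions, slot = 4 models two `dd` definitions.
 * Stores 2 into the first and 3 into the second, then reads the first back.
 */
uint32_t first_after_two_stores(size_t slot) {
    uint8_t mem[16] = {0};
    store32(mem, 0, 2);      /* first variable  = 2 */
    store32(mem, slot, 3);   /* second variable = 3 */
    return load32(mem, 0);   /* what a 32-bit load of the first variable sees */
}
```

With a 4-byte slot the first variable reads back as 2. With a 2-byte slot the second store overlaps the upper half of the first variable, so the same load returns 0x00030002 on a little-endian machine.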
Crazy code writing
global main

main:
    mov ebx, [number_1]
    mov ecx, [number_2]
    add ebx, ecx
    mov [result], ebx
    mov eax, [result]
    ret

section .data
number_1 dd 10    ; dd = 4 bytes, matching the 32-bit loads and stores
number_2 dd 20
result   dd 0
result:
dontla@dontla-virtual-machine:~/desktop/test$ nasm -f elf64 test.asm -o test.o
dontla@dontla-virtual-machine:~/desktop/test$ gcc -m64 test.o -o test -no-pie
dontla@dontla-virtual-machine:~/desktop/test$ ./test ; echo $?
30
dontla@dontla-virtual-machine:~/desktop/test$
Disassembly
First check whether the disassembly tool (gdb) is installed:
dontla@dontla-virtual-machine:~/desktop/test$ which gdb
/usr/bin/gdb
dontla@dontla-virtual-machine:~/desktop/test$
If it is not installed:
$ sudo apt-get install gdb -y
New file: test.asm
global main

main:
    mov eax, 1
    mov ebx, 2
    add eax, ebx
    ret
Compile:
dontla@dontla-virtual-machine:~/desktop/test$ nasm -f elf64 test.asm -o test.o
dontla@dontla-virtual-machine:~/desktop/test$ gcc -m64 test.o -o test
Run:
dontla@dontla-virtual-machine:~/desktop/test$ ./test ; echo $?
3
Disassemble the test file with gdb (turn the machine code back into assembly language):
dontla@dontla-virtual-machine:~/desktop/test$ gdb ./test
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./test...
(No debugging symbols found in ./test)
(gdb)
Switch the disassembly output to Intel syntax:
(gdb) set disassembly-flavor intel
(gdb)
Continue:
(gdb) disas main
Dump of assembler code for function main:
   0x0000000000001130 <+0>:  mov eax,0x1
   0x0000000000001135 <+5>:  mov ebx,0x2
   0x000000000000113a <+10>: add eax,ebx
   0x000000000000113c <+12>: ret
   0x000000000000113d <+13>: nop DWORD PTR [rax]
End of assembler dump.
(gdb)
Dynamic debugging
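Set a breakpoint:
(gdb) break *0x0000000000001135
Breakpoint 1 at 0x1135
Run; an error is reported:
(gdb) run
Starting program: /home/dontla/desktop/test/test
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x1135

(gdb)
The address 0x1135 is the offset shown before the program was loaded. The executable is position-independent (PIE), so it is relocated when it starts, and main+5 ends up at 0x0000555555555135, as the later output shows. One way around this is to start the program once, run disas main again to get the runtime addresses, and then set the breakpoint on the runtime address and run again. The following output was taken with the program stopped at main+5.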
To view the value of the eax register:
(gdb) info registers eax
eax            0x1                 1
To view the value of the ebx register:
(gdb) info registers ebx
ebx            0x55555140          1431654720
Execute the next instruction and look at the value of ebx again:
(gdb) stepi
0x000055555555513a in main ()
(gdb) info registers ebx
ebx            0x2                 2
Continue single-stepping and check the value. Enter disas to see which instruction the program has reached:
(gdb) stepi
0x000055555555513c in main ()
(gdb) info registers eax
eax            0x3                 3
(gdb) disas
Dump of assembler code for function main:
   0x0000555555555130 <+0>:  mov $0x1,%eax
   0x0000555555555135 <+5>:  mov $0x2,%ebx
   0x000055555555513a <+10>: add %ebx,%eax
=> 0x000055555555513c <+12>: retq
   0x000055555555513d <+13>: nopl (%rax)
End of assembler dump.
(gdb)
To let the program run straight to the end, enter continue:
(gdb) continue
Continuing.
[Inferior 1 (process 32774) exited with code 03]
(gdb)
Introduction to assembly language 4: get through C and assembly language
Episode: the relationship between C language and assembly language
New program: test.c
int x, y, z;

int main()
{
    x = 2;
    y = 3;
    z = x + y;
    return z;
}
Compile, run and check the output:
dontla@dontla-virtual-machine:~/desktop/test1$ gcc test.c -o test
dontla@dontla-virtual-machine:~/desktop/test1$ ./test ; echo $?
5
dontla@dontla-virtual-machine:~/desktop/test1$
The assembly code equivalent to the C code above is as follows:
global main

main:
    mov eax, 2
    mov [x], eax
    mov eax, 3
    mov [y], eax
    mov eax, [x]
    mov ebx, [y]
    add eax, ebx
    mov [z], eax
    mov eax, [z]
    ret

section .data
x dd 0    ; dd = 4 bytes, matching the 32-bit stores
y dd 0
z dd 0
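Why store a value, load it back, and store it again? That is too much trouble!
Why not just do this?
global main

main:
    mov [x], 2
    mov [y], 3
    add [x], [y]
    mov [z], [x]
    ret

section .data
x dw 0
y dw 0
z dw 0
The assembler reports errors right away:
dontla@dontla-virtual-machine:~/desktop/test1$ nasm -f elf64 test.asm -o test.o
test.asm:4: error: operation size not specified
test.asm:5: error: operation size not specified
test.asm:6: error: invalid combination of opcode and operands
test.asm:7: error: invalid combination of opcode and operands
So memory cannot be operated on quite this directly. An x86 instruction can have at most one memory operand, so add [x], [y] and mov [z], [x] are rejected ("invalid combination of opcode and operands") and one side has to go through a register. Storing a constant straight to memory is allowed, but from mov [x], 2 alone the assembler cannot tell how many bytes to write, hence "operation size not specified"; writing mov dword [x], 2 states the size.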
Changed version:
global main

main:
    mov eax, 2
    mov [x], eax
    mov eax, [x]
    mov ebx, 3
    mov [y], ebx
    mov ebx, [y]
    add eax, ebx
    mov [z], eax
    ret

section .data
x dd 0    ; dd = 4 bytes, matching the 32-bit stores
y dd 0
z dd 0
Result:
dontla@dontla-virtual-machine:~/desktop/test1$ nasm -f elf64 test.asm -o test.o
dontla@dontla-virtual-machine:~/desktop/test1$ gcc -m64 test.o -o test -no-pie
dontla@dontla-virtual-machine:~/desktop/test1$ ./test ; echo $?
5
Note: the return value is whatever is in eax when ret executes, not the result of the last instruction before it.
Note: in vi, the key sequence to select everything and delete it is Esc, then ggVG, then d.
Uncover the true face of a C program (take the C program and the hand-written assembly program, compile each into an executable, and disassemble both with gdb)
Prepare two source files:
test1.c
int x, y, z;

int main()
{
    x = 2;
    y = 3;
    z = x + y;
    return z;
}
test2.asm
global main

main:
    mov eax, 2
    mov [x], eax
    mov eax, 3
    mov [y], eax
    mov eax, [x]
    mov ebx, [y]
    add eax, ebx
    mov [z], eax
    mov eax, [z]
    ret

section .data
; dw reserves only 2 bytes per variable; dd (4 bytes) is the size that matches
; the 32-bit stores. dw is kept here because the disassembly that follows was
; produced from this exact source.
x dw 0
y dw 0
z dw 0
Compile and generate the executables test1 and test2 respectively:
dontla@dontla-virtual-machine:~/desktop/test$ gcc -m64 test1.c -o test1
dontla@dontla-virtual-machine:~/desktop/test$ nasm -f elf64 test2.asm -o test2.o
dontla@dontla-virtual-machine:~/desktop/test$ gcc -m64 -fno-lto test2.o -o test2 -no-pie
dontla@dontla-virtual-machine:~/desktop/test$ ls
test1  test1.c  test2  test2.asm  test2.o
dontla@dontla-virtual-machine:~/desktop/test$
1. View the disassembly of test1 with gdb:
dontla@dontla-virtual-machine:~/desktop/test$ gdb ./test1
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
...
Reading symbols from ./test1...
(No debugging symbols found in ./test1)
(gdb) set disassembly-flavor intel
(gdb) disas main
Dump of assembler code for function main:
   0x0000000000001129 <+0>:  endbr64
   0x000000000000112d <+4>:  push rbp
   0x000000000000112e <+5>:  mov rbp,rsp
   0x0000000000001131 <+8>:  mov DWORD PTR [rip+0x2edd],0x2 # 0x4018 <x>
   0x000000000000113b <+18>: mov DWORD PTR [rip+0x2ed7],0x3 # 0x401c <y>
   0x0000000000001145 <+28>: mov edx,DWORD PTR [rip+0x2ecd] # 0x4018 <x>
   0x000000000000114b <+34>: mov eax,DWORD PTR [rip+0x2ecb] # 0x401c <y>
   0x0000000000001151 <+40>: add eax,edx
   0x0000000000001153 <+42>: mov DWORD PTR [rip+0x2ebb],eax # 0x4014 <z>
   0x0000000000001159 <+48>: mov eax,DWORD PTR [rip+0x2eb5] # 0x4014 <z>
   0x000000000000115f <+54>: pop rbp
   0x0000000000001160 <+55>: ret
End of assembler dump.
(gdb)
2. View the disassembly of test2 with gdb:
dontla@dontla-virtual-machine:~/desktop/test$ gdb ./test2
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
...
Reading symbols from ./test2...
(No debugging symbols found in ./test2)
(gdb) set disassembly-flavor intel
(gdb) disas main
Dump of assembler code for function main:
   0x0000000000401110 <+0>:  mov eax,0x2
   0x0000000000401115 <+5>:  mov DWORD PTR ds:0x404028,eax
   0x000000000040111c <+12>: mov eax,0x3
   0x0000000000401121 <+17>: mov DWORD PTR ds:0x40402a,eax
   0x0000000000401128 <+24>: mov eax,DWORD PTR ds:0x404028
   0x000000000040112f <+31>: mov ebx,DWORD PTR ds:0x40402a
   0x0000000000401136 <+38>: add eax,ebx
   0x0000000000401138 <+40>: mov DWORD PTR ds:0x40402c,eax
   0x000000000040113f <+47>: mov eax,DWORD PTR ds:0x40402c
   0x0000000000401146 <+54>: ret
   0x0000000000401147 <+55>: nop WORD PTR [rax+rax*1+0x0]
End of assembler dump.
(gdb)
You can compare the two:
test1 disassembly
   0x0000000000001129 <+0>:  endbr64
   0x000000000000112d <+4>:  push rbp
   0x000000000000112e <+5>:  mov rbp,rsp
   0x0000000000001131 <+8>:  mov DWORD PTR [rip+0x2edd],0x2 # 0x4018 <x>
   0x000000000000113b <+18>: mov DWORD PTR [rip+0x2ed7],0x3 # 0x401c <y>
   0x0000000000001145 <+28>: mov edx,DWORD PTR [rip+0x2ecd] # 0x4018 <x>
   0x000000000000114b <+34>: mov eax,DWORD PTR [rip+0x2ecb] # 0x401c <y>
   0x0000000000001151 <+40>: add eax,edx
   0x0000000000001153 <+42>: mov DWORD PTR [rip+0x2ebb],eax # 0x4014 <z>
   0x0000000000001159 <+48>: mov eax,DWORD PTR [rip+0x2eb5] # 0x4014 <z>
   0x000000000000115f <+54>: pop rbp
   0x0000000000001160 <+55>: ret
test2 disassembly
   0x0000000000401110 <+0>:  mov eax,0x2
   0x0000000000401115 <+5>:  mov DWORD PTR ds:0x404028,eax
   0x000000000040111c <+12>: mov eax,0x3
   0x0000000000401121 <+17>: mov DWORD PTR ds:0x40402a,eax
   0x0000000000401128 <+24>: mov eax,DWORD PTR ds:0x404028
   0x000000000040112f <+31>: mov ebx,DWORD PTR ds:0x40402a
   0x0000000000401136 <+38>: add eax,ebx
   0x0000000000401138 <+40>: mov DWORD PTR ds:0x40402c,eax
   0x000000000040113f <+47>: mov eax,DWORD PTR ds:0x40402c
   0x0000000000401146 <+54>: ret
   0x0000000000401147 <+55>: nop WORD PTR [rax+rax*1+0x0]
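Why does the disassembly I generated differ so much from the original author's? Most likely because of the build environment. This is a 64-bit build with a recent gcc, which emits endbr64 and a push rbp / mov rbp,rsp prologue and, because it produces a position-independent executable by default, addresses x, y and z relative to rip. test2 was linked with -no-pie, so it uses absolute addresses. Also note that in test2 the variables sit only 2 bytes apart (0x404028, 0x40402a, 0x40402c) because dw reserves 2 bytes each, so every 4-byte store spills into the next variable and the later mov eax, [x] reads back part of y as well. Declaring the variables with dd avoids the overlap.
The author said that the assembly of test2 can also be simplified as follows:
global main

main:
    mov dword [x], 0x2
    mov dword [y], 0x3
    mov eax, [x]
    mov ebx, [y]
    add eax, ebx
    mov [z], eax
    mov eax, [z]
    ret

section .data
x dd 0    ; dd = 4 bytes, matching the dword stores
y dd 0
z dd 0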
Introduction to assembly language 5: process control (I)
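The CPU has a register that records where the program is currently executing.
x86: the eip register (rip on x86-64)
eip cannot be written with an ordinary instruction such as mov; it advances as instructions execute, and jump, call and ret instructions set it to a new value.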
Jump instruction jmp
global main

main:
    mov eax, 1
    mov ebx, 2
    jmp gun_kai
    add eax, ebx
gun_kai:
    ret
Equivalent C program:
int main()
{
    int a = 1;
    int b = 2;
    goto gun_kai;
    a = a + b;
gun_kai:
    return a;
}
In fact, a goto statement in C becomes a jmp instruction after compilation. It jumps directly to another place, either forward or backward. The jump target is the label after jmp; the assembler turns the label into an address, so the jump really goes to an address. Inside the CPU, jmp works by modifying eip so that it suddenly holds another value, and the CPU then continues executing code from there.
What if looks like in assembly
int main()
{
    int a = 50;
    if (a > 10) {
        a = a - 10;
    }
    return a;
}
Equivalent assembly code:
global main

main:
    mov eax, 50
    cmp eax, 10               ; compare eax with 10
    jle xiaoyu_dengyu_shi     ; jump if eax is less than or equal to 10
    sub eax, 10
xiaoyu_dengyu_shi:
    ret
Notes:
cmp: comparison instruction, used specifically to compare two numbers
jle: conditional jump instruction; it jumps when the preceding comparison result is "less than or equal to", otherwise it does not jump
Other instructions:
    je      equal
    jnb     not below (unsigned)
    jnbe    not below or equal (unsigned)
    jne     not equal
    jg      greater (signed)
    jge     greater or equal (signed)
    jl      less (signed)
    jle     less or equal (signed)
    jng     not greater (signed)
    jnge    not greater or equal (signed)
    jnl     not less (signed)
    jnle    not less or equal (signed)
    jns     sign flag clear (result not negative)
    jnz     not zero
    js      sign flag set (result negative)
    jz      zero
First, every jump instruction starts with the letter j.
The key is the letters after j.
For example, j followed by ne gives the jne jump instruction. n and e stand for not and equal, that is, "not equal": the jump is taken when the result of the comparison instruction is "not equal".
    a: above
    e: equal
    b: below
    n: not
    g: greater
    l: less
    s: sign
    z: zero
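The letter legend distinguishes above/below from greater/less: the a and b forms treat the compared values as unsigned, while the g and l forms treat them as signed. A small C sketch of that difference follows. The function names jg_taken and ja_taken are invented here to mirror the jg and ja instructions; they only model whether the jump would be taken after cmp a, b.

```c
#include <stdint.h>

/* Would `jg` (greater, signed) jump after `cmp a, b`? */
int jg_taken(uint32_t a, uint32_t b) {
    /* Reinterpret the same 32 bits as signed values.
       (Conversion of values above INT32_MAX is implementation-defined in C;
        gcc and clang wrap to two's complement, which matches the CPU.) */
    return (int32_t)a > (int32_t)b;
}

/* Would `ja` (above, unsigned) jump after `cmp a, b`? */
int ja_taken(uint32_t a, uint32_t b) {
    return a > b;
}
```

For the bit pattern 0xFFFFFFFF compared with 1, ja_taken reports a jump (4294967295 is above 1) while jg_taken does not (as a signed number the same bits mean -1).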
View the disassembly of else and else if
New program: test.c
int main()
{
    register int grade = 80;
    register int level;
    if (grade >= 85) {
        level = 1;
    } else if (grade >= 70) {
        level = 2;
    } else if (grade >= 60) {
        level = 3;
    } else {
        level = 4;
    }
    return level;
}
(The register keyword asks the compiler to keep the variable in a register, which makes the disassembly easier to analyze. Readers can remove the register keyword and compare the disassembly as needed.)
Run the program first to check the return value:
dontla@dontla-virtual-machine:~/desktop/test$ gcc -m64 test.c -o test
dontla@dontla-virtual-machine:~/desktop/test$ ./test ; echo $?
2
View the assembly code with gdb:
dontla@dontla-virtual-machine:~/desktop/test$ gdb ./test
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
...
Reading symbols from ./test...
(No debugging symbols found in ./test)
(gdb) set disassembly-flavor intel
(gdb) disas main
Dump of assembler code for function main:
   0x0000000000001129 <+0>:  endbr64
   0x000000000000112d <+4>:  push rbp
   0x000000000000112e <+5>:  mov rbp,rsp
   0x0000000000001131 <+8>:  push rbx
   0x0000000000001132 <+9>:  mov ebx,0x50
   0x0000000000001137 <+14>: cmp ebx,0x54
   0x000000000000113a <+17>: jle 0x1143 <main+26>
   0x000000000000113c <+19>: mov ebx,0x1
   0x0000000000001141 <+24>: jmp 0x1160 <main+55>
   0x0000000000001143 <+26>: cmp ebx,0x45
   0x0000000000001146 <+29>: jle 0x114f <main+38>
   0x0000000000001148 <+31>: mov ebx,0x2
   0x000000000000114d <+36>: jmp 0x1160 <main+55>
   0x000000000000114f <+38>: cmp ebx,0x3b
   0x0000000000001152 <+41>: jle 0x115b <main+50>
   0x0000000000001154 <+43>: mov ebx,0x3
   0x0000000000001159 <+48>: jmp 0x1160 <main+55>
   0x000000000000115b <+50>: mov ebx,0x4
   0x0000000000001160 <+55>: mov eax,ebx
   0x0000000000001162 <+57>: pop rbx
   0x0000000000001163 <+58>: pop rbp
   0x0000000000001164 <+59>: ret
End of assembler dump.
(gdb)
I can't make sense of it. It makes me dizzy, so I'm not going to stare at it any longer!
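One way to read the disassembly above is to write it back as C with goto, one line per cmp/jle pair. The sketch below does that by hand. The function name level_from_grade and the label names are invented for illustration; the constants 84, 69 and 59 are the decimal values of 0x54, 0x45 and 0x3b in the listing. Note how the compiler turned each "grade >= N" test into "jump away if grade <= N - 1".

```c
/* Hand-written C mirror of the disassembled if / else if / else chain.
   In the real listing, ebx holds grade first and is then reused for level. */
int level_from_grade(int grade) {
    int level;

    if (grade <= 84) goto check_70;    /* cmp ebx,0x54 ; jle main+26 */
    level = 1;                         /* mov ebx,0x1 */
    goto done;                         /* jmp main+55 */
check_70:
    if (grade <= 69) goto check_60;    /* cmp ebx,0x45 ; jle main+38 */
    level = 2;                         /* mov ebx,0x2 */
    goto done;                         /* jmp main+55 */
check_60:
    if (grade <= 59) goto otherwise;   /* cmp ebx,0x3b ; jle main+50 */
    level = 3;                         /* mov ebx,0x3 */
    goto done;                         /* jmp main+55 */
otherwise:
    level = 4;                         /* mov ebx,0x4 */
done:
    return level;                      /* mov eax,ebx ; ret */
}
```

For grade 80 this takes the first jle, falls through the second comparison, and returns 2, matching the exit status shown earlier.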
Status register eflags
The job of the flags register is to remember certain special CPU states, such as whether the result of the previous operation was positive or negative, whether a carry occurred during the calculation, and whether the result was zero. A subsequent conditional jump instruction decides whether to jump according to the state held in the eflags register.
The cmp instruction actually subtracts its two operands (and discards the difference); the states produced by that subtraction end up in the eflags register.
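As a rough illustration of "cmp is a subtraction that only sets flags", the following C sketch computes four of the flags for a 32-bit cmp a, b and then evaluates the jle condition from them. The struct Flags and the functions cmp32 and jle_taken are hypothetical names for this sketch. The flag definitions (ZF, SF, CF, OF) and the rule that jle jumps when ZF is set or SF differs from OF follow the x86 instruction set.

```c
#include <stdint.h>

/* The four flags that matter for the jumps discussed here. */
typedef struct {
    int zf;  /* zero flag: result was zero */
    int sf;  /* sign flag: top bit of the result */
    int cf;  /* carry flag: unsigned borrow occurred */
    int of;  /* overflow flag: signed overflow occurred */
} Flags;

/* Model of `cmp a, b`: compute a - b, keep only the flags. */
Flags cmp32(uint32_t a, uint32_t b) {
    uint32_t r = a - b;
    Flags f;
    f.zf = (r == 0);
    f.sf = (int)((r >> 31) & 1u);
    f.cf = (a < b);
    /* Signed overflow on subtraction: operands differ in sign and the
       result's sign differs from the first operand's sign. */
    f.of = (int)((((a ^ b) & (a ^ r)) >> 31) & 1u);
    return f;
}

/* `jle` jumps when ZF = 1 or SF != OF. */
int jle_taken(Flags f) {
    return f.zf || (f.sf != f.of);
}
```

For the earlier example, cmp32(50, 10) gives flags for which jle_taken is 0, so the sub eax, 10 instruction runs.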
Introduction to assembly language 6: process control (2)
Disassembling a loop
This is a loop in C:
int main()
{
    int sum = 0;
    int i = 1;
    while (i <= 10) {
        sum = sum + i;
        i = i + 1;
    }
    return sum;
}
How would you implement this with goto instead of a loop?
int sum = 0;
int i = 1;
_start:
if (i <= 10) {
    sum = sum + i;
    i = i + 1;
    goto _start;
}
Of course, the "knock-off" C code that mirrors the disassembly actually looks like this:
int sum = 0;
int i = 1;
_start:
if (i > 10) {
    goto _end_of_block;
}
sum = sum + i;
i = i + 1;
goto _start;
_end_of_block:
. . .
Write a loop in assembly
Translate the code above directly into assembly, as follows:
global main

main:
    mov eax, 0
    mov ebx, 1
_start:
    cmp ebx, 10
    jg _end_of_block
    add eax, ebx
    add ebx, 1
    jmp _start
_end_of_block:
    ret
It is nothing more than replacing each statement with the corresponding assembly instructions.
What about other kinds of loops?
They all follow the same routine. There is no real difference.
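To illustrate that other loop forms follow the same routine, here is a small C sketch with the same sum written once as a for loop and once in the compare, conditional jump, body, jump back shape used by the assembly. The function names sum_for and sum_goto and the parameter n are made up for this example.

```c
/* The sum 1 + 2 + ... + n as an ordinary for loop. */
int sum_for(int n) {
    int sum = 0;
    for (int i = 1; i <= n; i++) {
        sum += i;
    }
    return sum;
}

/* The same loop in the shape the assembly uses. */
int sum_goto(int n) {
    int sum = 0;                    /* mov eax, 0 */
    int i = 1;                      /* mov ebx, 1 */
loop_start:
    if (i > n) goto loop_end;       /* cmp ebx, n ; jg _end_of_block */
    sum += i;                       /* add eax, ebx */
    i += 1;                         /* add ebx, 1 */
    goto loop_start;                /* jmp _start */
loop_end:
    return sum;                     /* ret */
}
```

Both functions return 55 for n = 10, which is also what the assembly loop leaves in eax.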
Introduction to assembly language 7: function call (1)
Example of calling functions in assembly language:
global main

eax_plus_1s:
    add eax, 1
    ret

ebx_plus_1s:
    add ebx, 1
    ret

main:
    mov eax, 0
    mov ebx, 0
    call eax_plus_1s
    call eax_plus_1s
    call ebx_plus_1s
    add eax, ebx
    ret
In fact, when a call instruction is executed, the CPU does one extra thing before jumping: it saves eip (the address of the instruction after the call) and then jumps to the target. When it reaches a ret instruction, it restores the eip saved by the most recent call. Since eip determines where the CPU executes code, restoring it sends the program back to where it left off.
A program inevitably makes many calls. Where are all these eip values saved?
There is a place called the "stack", a memory area set up by the operating system before the program starts. The return address of every function call is stored on the stack.
What is stack?
. . .
In a real CPU, the top of the stack is tracked by a register called esp (stack pointer).
Every time a call instruction is executed, eip is pushed onto the stack, which is roughly equivalent to executing these instructions:
    sub esp, 4
    mov dword [esp], eip    ; pseudo-code: eip cannot really be used as a mov operand
Translated into C (treating esp as a pointer of type void *):
esp = (void*)( ((unsigned int)esp) - 4 );        // move the stack pointer 4 bytes toward lower addresses
*( (unsigned int*) esp ) = (unsigned int) eip;   // store the return address in the memory the stack pointer points to
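The two steps above, together with the matching ret, can be sketched as a tiny simulation in C. This is only a model under simplifying assumptions: the struct Cpu, the 64-byte stack array and the functions cpu_init, do_call and do_ret are all invented for illustration, and addresses are plain numbers rather than real pointers.

```c
#include <stdint.h>

/* A toy CPU with a 64-byte stack made of 4-byte slots. */
typedef struct {
    uint32_t mem[16];  /* stack memory */
    uint32_t esp;      /* stack pointer, as a byte offset into mem */
    uint32_t eip;      /* instruction pointer */
} Cpu;

/* The stack starts empty, with esp just past the end of the stack memory. */
void cpu_init(Cpu *c) {
    c->esp = (uint32_t)sizeof c->mem;
    c->eip = 0;
}

/* call target: push the return address, then jump. */
void do_call(Cpu *c, uint32_t target, uint32_t return_addr) {
    c->esp -= 4;                       /* sub esp, 4 */
    c->mem[c->esp / 4] = return_addr;  /* mov [esp], return address */
    c->eip = target;                   /* jump to the function */
}

/* ret: pop the saved address back into eip. */
void do_ret(Cpu *c) {
    c->eip = c->mem[c->esp / 4];       /* read the saved return address */
    c->esp += 4;                       /* release the stack slot */
}
```

After do_call the stack pointer is 4 lower and eip holds the target; after do_ret both are back to where a real call/ret pair would leave them.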
Hands on test
New file: test.asm
global main

eax_plus_1s:
    add eax, 1
    ret

main:
    mov eax, 0
    call eax_plus_1s
    ret
Build the executable test first:
dontla@dontla-virtual-machine:~/desktop/test$ nasm -f elf64 test.asm -o test.o
dontla@dontla-virtual-machine:~/desktop/test$ gcc -m64 test.o -o test
dontla@dontla-virtual-machine:~/desktop/test$ ls
test  test.asm  test.o
dontla@dontla-virtual-machine:~/desktop/test$
Disassemble with gdb and set a breakpoint at main+5 (run the program once before disas so that the addresses shown are the runtime addresses):
dontla@dontla-virtual-machine:~/desktop/test$ gdb ./test
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
...
Reading symbols from ./test...
(No debugging symbols found in ./test)
(gdb) r
Starting program: /home/dontla/desktop/test/test
[Inferior 1 (process 38995) exited with code 01]
(gdb) disas main
Dump of assembler code for function main:
   0x0000555555555134 <+0>:  mov $0x0,%eax
   0x0000555555555139 <+5>:  callq 0x555555555130 <eax_plus_1s>
   0x000055555555513e <+10>: retq
   0x000055555555513f <+11>: nop
End of assembler dump.
(gdb) b *0x0000555555555139
Breakpoint 1 at 0x555555555139
(gdb) r
Starting program: /home/dontla/desktop/test/test

Breakpoint 1, 0x0000555555555139 in main ()
(gdb)
Then use disas main to view the disassembly:
(gdb) disas main
Dump of assembler code for function main:
   0x0000555555555134 <+0>:  mov $0x0,%eax
=> 0x0000555555555139 <+5>:  callq 0x555555555130 <eax_plus_1s>
   0x000055555555513e <+10>: retq
   0x000055555555513f <+11>: nop
End of assembler dump.
(gdb)
To view the value of the eip register:
(gdb) info register eip
Invalid register `eip'
An error is reported saying that register eip is invalid. This is a 64-bit program, where the instruction pointer is called rip.
Check the value of the instruction pointer rip (a register has no memory address of its own; only its value is meaningful):
(gdb) info register rip
rip            0x555555555139      0x555555555139 <main+5>
View the value of the stack pointer rsp (rsp holds the address of the top of the stack):
(gdb) info registers rsp
rsp            0x7fffffffe408      0x7fffffffe408
View the value at the top of the stack, that is, the memory that rsp points to:
(gdb) p/x *(unsigned int*)$rsp
$1 = 0xf7dea0b3
Next, use stepi to execute one instruction. The program enters the function; use disas to see which instruction inside the function is about to run:
(gdb) stepi
0x0000555555555130 in eax_plus_1s ()
(gdb) disas
Dump of assembler code for function eax_plus_1s:
=> 0x0000555555555130 <+0>: add $0x1,%eax
   0x0000555555555133 <+3>: retq
End of assembler dump.
Now look at the value of rsp. It is 8 less than before (0x7fffffffe408 to 0x7fffffffe400), because on x86-64 the return address pushed by call is 8 bytes, not 4:
(gdb) info register rsp
rsp            0x7fffffffe400      0x7fffffffe400
Then check what rsp points to at the top of the stack:
(gdb) p/x *(unsigned int*)$rsp
$2 = 0x5555513e
What is this $2 = 0x5555513e? Don't worry. Let's look at the disassembly of main again:
Dump of assembler code for function main:
   0x0000555555555134 <+0>:  mov $0x0,%eax
   0x0000555555555139 <+5>:  callq 0x555555555130 <eax_plus_1s>
   0x000055555555513e <+10>: retq
   0x000055555555513f <+11>: nop
End of assembler dump.
Eh, why is it different from the original author's? Why do I have so many 5s?
The original tutorial appears to have used a 32-bit program (hence eip, esp and 4-byte stack slots). Here the program is a 64-bit position-independent executable, which gdb loads at base address 0x555555554000, so every code address starts with 0x5555... Also, the cast to (unsigned int*) reads only 4 of the 8 bytes on the stack, so 0x5555513e is the low half of 0x000055555555513e; p/x *(unsigned long*)$rsp would show the whole value.
Still, what is stored at the top of the stack can be determined (take care to distinguish the value of the stack pointer itself from the value it points to): it is the return address, main+10, the instruction that runs after the function returns.
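The truncation seen with p/x *(unsigned int*)$rsp can be reproduced in plain C. The sketch below stores an 8-byte return address in a 64-bit slot and then reads only its first 4 bytes, the way the gdb expression does. The function name low32_of_slot is invented for this example, and the result assumes a little-endian machine such as x86-64, where the low-order bytes come first in memory.

```c
#include <stdint.h>
#include <string.h>

/* Read only the first 4 bytes of an 8-byte stack slot,
   like `*(unsigned int*)$rsp` does in gdb. */
uint32_t low32_of_slot(uint64_t slot) {
    uint32_t v;
    memcpy(&v, &slot, 4);   /* first 4 bytes in memory = low half on little-endian */
    return v;
}
```

Passing the full return address 0x000055555555513e yields 0x5555513e on a little-endian machine, which is exactly the value gdb printed.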
Introduction to assembly language 8: function call (2)
How parameters and return values are passed during a function call: in assembly language, the parameters and return value of a function call can be passed through registers.
It's not that simple
. . .
Beware of scope
. . .
Registers in the CPU are globally visible, so using a register is really like using a global variable.
What exactly is needed
To support recursion, the state of a function must be locally visible and accessible only within the current level of the call. In recursion a function calls itself level after level, and the state of each level must stay local so that the levels cannot affect each other.
In C, the local variables of a function are exactly this local state while the function executes. In assembly, registers are globally visible and cannot serve as local variables on their own.
stack
. . .
Borrowing the idea of how call saves the return address: if each level of a function saves the registers it cares about on the stack before calling the next level, and restores them from the stack when the lower-level function returns, then the lower-level function cannot destroy the state of the upper-level function.
. . .
Push and pop
. . .
    push eax    ; save the value of eax onto the stack
    pop ebx     ; take the value at the top of the stack and store it in ebx
. . .
Write a function that does not affect global state
. . .
Recursion again
With that, the problem of keeping local state inside a function is solved. The routine is for a function to save the old value of a register before using it and to restore it once it is done. After the function has finished, all the registers are clean, as if the function had never touched them.
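The save-and-restore routine can be sketched in C with registers modelled as global variables (which matches the point that registers are globally visible) and an explicit array as the stack. Everything here is a simplified model with invented names: reg_eax, reg_ebx, sim_stack, sim_push, sim_pop and add_ten_to_eax are hypothetical, and the toy stack grows upward for brevity, unlike the real one.

```c
#include <stdint.h>

/* Registers are global state, visible to every function. */
uint32_t reg_eax = 0;
uint32_t reg_ebx = 0;

/* A toy stack (no overflow checks, fine for a sketch). */
uint32_t sim_stack[64];
int sim_top = 0;

void sim_push(uint32_t value) { sim_stack[sim_top++] = value; }  /* push */
uint32_t sim_pop(void)        { return sim_stack[--sim_top]; }   /* pop  */

/* Uses ebx as scratch space, but leaves it unchanged for the caller. */
void add_ten_to_eax(void) {
    sim_push(reg_ebx);     /* push ebx : save the caller's value   */
    reg_ebx = 10;          /* mov ebx, 10                          */
    reg_eax += reg_ebx;    /* add eax, ebx                         */
    reg_ebx = sim_pop();   /* pop ebx  : restore the caller's value */
}
```

If the caller sets reg_eax to 1 and reg_ebx to 7 before the call, reg_eax is 11 afterwards and reg_ebx is still 7. Pushing 1 and then 2 and popping twice returns 2 first and then 1, which is why the order of push and pop has to mirror each other.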
Functions in C language
. . .
Conclusion: I still don't fully understand why push and pop are used???
Take a look at this:
[push and pop, bilibili] https://b23.tv/ZBdrp3
push x saves the value of x in a new slot at the top of the stack, and pop takes the value at the top of the stack back out. The machine does not keep track of which register a stored value belongs to, so pushes and pops must be matched in reverse order (the last value pushed is the first one popped); don't mix them up.
Introduction to assembly language 9: summary and follow-up (gossip)
What do you learn from assembly
. . .