1. Preliminary knowledge - program memory allocation
The memory used by a compiled C/C++ program is divided into the following parts:
- Stack: allocated and released automatically by the compiler. It holds function parameters, local variables, and so on, and works much like the stack data structure.
- Heap: normally allocated and released by the programmer. If the programmer never releases it, the OS may reclaim it when the program ends. Note that it is unrelated to the heap data structure; allocation works more like a linked list.
- Global (static) area: global variables and static variables are stored together. Initialized ones sit in one area, and uninitialized ones sit in an adjacent area. The system releases them when the program ends.
- Text constant area: string constants are placed here. The system releases it when the program ends.
- Program code area: holds the binary code of the function bodies.
Program example
This example was written by a senior developer and is very detailed.
    // main.cpp
    #include <stdlib.h>
    #include <string.h>

    int a = 0;                     // global initialized area
    char *p1;                      // global uninitialized area

    int main()
    {
        int b;                     // stack
        char s[] = "abc";          // stack
        char *p2;                  // stack
        const char *p3 = "123456"; // "123456\0" is in the constant area; p3 is on the stack
        static int c = 0;          // global (static) initialized area
        p1 = (char *)malloc(10);   // the 10-byte and 20-byte blocks are on the heap
        p2 = (char *)malloc(20);
        strcpy(p1, "123456");      // "123456\0" is in the constant area; the compiler may
                                   // merge it with the "123456" that p3 points to
        free(p2);
        free(p1);
        return 0;
    }
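One visible consequence of this layout is lifetime: a static variable keeps its value between calls, while a stack variable is created anew each time. The sketch below illustrates that; `count_calls` is a hypothetical name, not part of the original example.

```c
#include <assert.h>

/* Hypothetical helper: 'calls' lives in the static area, so it keeps its
   value between calls; 'local' lives on the stack and is re-created each time. */
int count_calls(void)
{
    static int calls = 0; /* static area, initialized once before first use */
    int local = 0;        /* stack, fresh on every call */

    calls++;
    local++;
    assert(local == 1);   /* the local never remembers earlier calls */
    return calls;
}
```

Calling `count_calls()` three times in a row yields 1, 2, and 3, because only the static counter survives the return of the function.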
2. Theoretical knowledge of heap and stack
2.1 Allocation method
Stack:
Allocated automatically by the system. For example, when a function declares a local variable int b; the system automatically makes room for b on the stack.
Heap:
The programmer must request the memory explicitly and specify its size. In C, the malloc function is used:
p1 = (char *)malloc(10);
In C++, the new operator is used:
p2 = new char[10];
Note, however, that the pointer variables p1 and p2 themselves are on the stack when they are declared as locals; only the blocks they point to are on the heap.
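As a minimal sketch of a heap request in C, the function below copies a string into a block obtained from malloc. The name `copy_to_heap` is an assumption made for illustration; the pattern (request, check, use, let the caller free) is the part that matters.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of 'text', or NULL on failure.
   The caller owns the block and must release it with free(). */
char *copy_to_heap(const char *text)
{
    assert(text != NULL);

    size_t size = strlen(text) + 1;     /* +1 for the terminating '\0' */
    char *block = (char *)malloc(size); /* the block is on the heap */
    if (block == NULL) {
        return NULL;                    /* the request can fail */
    }
    memcpy(block, text, size);
    return block;                       /* 'block' itself was a stack variable */
}
```

A caller would write `char *p = copy_to_heap("123456");`, use `p`, and finish with `free(p);`. Nothing releases the block automatically when the function returns.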
2.2 System response to a request
Stack: as long as the remaining stack space is larger than the requested space, the system provides the memory; otherwise an exception is raised to report a stack overflow.
Heap: the system keeps a linked list that records the addresses of free memory blocks. When it receives a request from the program, it walks the list to find the first node whose space is at least as large as the request, removes that node from the free list, and hands its space to the program. On most systems the size of the allocation is also recorded at the start of the block, so that the delete statement (or free) can release the right amount of memory later. Because the node that was found is not necessarily the exact size requested, the system automatically puts the surplus back on the free list.
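The first-fit search described above can be sketched as a toy allocator over a fixed array. This is only an illustration under simplifying assumptions (a single static arena, no merging of adjacent free blocks, hypothetical names `toy_alloc` and `toy_free`); real heap managers are considerably more elaborate.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Block {
    size_t size;          /* payload bytes that follow this header */
    struct Block *next;   /* next free node; only meaningful while free */
} Block;

#define ARENA_BLOCKS 64
static Block arena[ARENA_BLOCKS];   /* backing storage, aligned for Block */
static Block *free_list = NULL;
static int ready = 0;

void *toy_alloc(size_t size)
{
    if (!ready) {                    /* start with one big free node */
        arena[0].size = sizeof arena - sizeof(Block);
        arena[0].next = NULL;
        free_list = &arena[0];
        ready = 1;
    }
    if (size == 0 || size > sizeof arena) {
        return NULL;
    }
    /* Round up so that every header stays aligned. */
    size = (size + sizeof(Block) - 1) / sizeof(Block) * sizeof(Block);

    Block **link = &free_list;       /* the pointer that leads to node b */
    for (Block *b = free_list; b != NULL; link = &b->next, b = b->next) {
        if (b->size < size) {
            continue;                /* too small, keep walking */
        }
        if (b->size >= size + 2 * sizeof(Block)) {
            /* Split: the surplus goes back on the list as a new node. */
            Block *rest = (Block *)((unsigned char *)(b + 1) + size);
            rest->size = b->size - size - sizeof(Block);
            rest->next = b->next;
            *link = rest;
            b->size = size;
        } else {
            *link = b->next;         /* hand out the whole node */
        }
        return b + 1;                /* payload starts right after the header */
    }
    return NULL;                     /* no node was large enough */
}

void toy_free(void *ptr)
{
    if (ptr == NULL) {
        return;
    }
    Block *b = (Block *)ptr - 1;     /* the header in front records the size */
    assert(b >= arena && b < arena + ARENA_BLOCKS);
    b->next = free_list;             /* no merging of neighbours in this sketch */
    free_list = b;
}
```

The header in front of each payload plays the role of the "size recorded at the start of the block": `toy_free` receives only a pointer and steps back one header to find out how much memory it is returning.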
2.3 Size limits
Stack: under Windows, the stack grows toward lower addresses and occupies one contiguous region of memory. In other words, the address of the top of the stack and its maximum capacity are fixed in advance by the system. Under Windows the stack size is a constant fixed when the program is built (1 MB by default with the Microsoft linker). If a request exceeds the remaining stack space, an overflow is reported. The space available from the stack is therefore small.
Heap: the heap grows toward higher addresses and is a non-contiguous region of memory. This is because the system tracks free memory with a linked list, which is naturally non-contiguous, and the list is traversed from low addresses to high addresses. The size of the heap is limited by the virtual memory available on the system. The space obtained from the heap is therefore more flexible and much larger.
2.4 Allocation efficiency
Stack: allocated automatically by the system, so it is fast, but the programmer has no control over it.
Heap: this is the memory allocated with new. It is generally slower and prone to fragmentation, but it is the most convenient to use.
In addition, under Windows you can allocate memory with VirtualAlloc. This memory is neither on the heap nor on the stack: the call reserves a block of memory directly in the process address space. It is the least convenient option, but it is fast and flexible.
2.5 What the heap and stack store
Stack: when a function is called, its parameters are pushed first; most C compilers push them from right to left. Next comes the address of the instruction that follows the call in the caller (the statement after the function call), and then the local variables of the function. Note that static variables are not placed on the stack.
When the call ends, the local variables are popped first, then the return address is popped so that the program continues at the instruction after the call, and the parameters are removed as well (by the callee or the caller, depending on the calling convention).
Heap: a small header at the start of a heap block usually stores its size. The rest of the block's contents are arranged by the programmer.
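The stack half of this section can be modeled with a small array that grows toward lower indices. The sketch below is a simulation only; `simulate_call`, `stack_depth`, and the slot layout are assumptions made for illustration and say nothing about what a particular compiler emits.

```c
#include <assert.h>

#define STACK_WORDS 16

/* Hypothetical model: 'sp' indexes the top of a stack that grows toward
   lower indices, the way the x86 stack grows toward lower addresses. */
static int stack_mem[STACK_WORDS];
static int sp = STACK_WORDS;

static void push(int value) { assert(sp > 0); stack_mem[--sp] = value; }
static int pop(void) { assert(sp < STACK_WORDS); return stack_mem[sp++]; }

/* Number of slots currently in use. */
int stack_depth(void) { return STACK_WORDS - sp; }

/* Simulates calling f(p1, p2, p3) and returns the "return address" that is
   popped at the end, which must be the one saved by the call. */
int simulate_call(int p1, int p2, int p3, int return_address)
{
    push(p3);               /* parameters go on right to left */
    push(p2);
    push(p1);
    push(return_address);   /* the call saves where to resume */
    push(0);                /* one local variable of the callee */

    assert(stack_mem[sp + 2] == p1); /* p1 sits next to the return address */

    (void)pop();            /* the local leaves first */
    int resume_at = pop();  /* then the return address */
    (void)pop();            /* finally the three parameters */
    (void)pop();
    (void)pop();
    return resume_at;
}
```

After `simulate_call(1, 2, 3, 0x401080)` returns, `stack_depth()` is back to 0, which mirrors the requirement that a call leaves the stack exactly as it found it.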
2.6 Access efficiency
char s1[] = "aaaaaaaaaaaaaaa";
char *s2 = "bbbbbbbbbbbbbbbbb";
The characters of "aaaaaaaaaaaaaaa" are copied into the array s1 at run time, while "bbbbbbbbbbbbbbbbb" is fixed at compile time. In later accesses, however, the array on the stack is faster than a string reached through a pointer (for example, one on the heap).
For example:
    int main(void)
    {
        char a = 1;
        char c[] = "1234567890";       // array on the stack
        const char *p = "1234567890";  // pointer to a string constant
        a = c[1];
        a = p[1];
        return 0;
    }
The corresponding assembly code is as follows:
    10: a = c[1];   // C statement
    00401067 8A 4D F1   mov cl,byte ptr [ebp-0Fh]
    0040106A 88 4D FC   mov byte ptr [ebp-4],cl
    11: a = p[1];   // C statement
    0040106D 8B 55 EC   mov edx,dword ptr [ebp-14h]
    00401070 8A 42 01   mov al,byte ptr [edx+1]
    00401073 88 45 FC   mov byte ptr [ebp-4],al
The first statement reads the element of the array straight into the register cl. The second must first load the pointer value into edx and only then read the character through edx, which costs one extra memory access and is therefore slower.
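The difference between the two declarations also shows up in sizeof: the array is an object that contains the characters, while the pointer is a separate object that merely refers to them. A small sketch, with the hypothetical helpers `array_bytes` and `pointer_bytes`:

```c
#include <assert.h>
#include <stddef.h>

/* Returns the size of the array object: ten characters plus the '\0'. */
size_t array_bytes(void)
{
    char c[] = "1234567890"; /* the characters are copied into the array */
    c[0] = 'X';              /* legal: the array is our own writable copy */
    assert(c[0] == 'X');
    return sizeof c;
}

/* Returns the size of the pointer object, not of the text it points to. */
size_t pointer_bytes(void)
{
    const char *p = "1234567890"; /* the literal lives in constant storage */
    assert(p[1] == '2');          /* reading goes through the pointer */
    return sizeof p;
}
```

`array_bytes()` returns 11 on any platform, whereas `pointer_bytes()` returns the size of a pointer (4 on the 32-bit build discussed here). Reading `p[1]` needs the pointer value first, which is the extra load visible in the assembly listing.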
2.7 Summary
The difference between the heap and the stack can be seen in the following analogy:
Using the stack is like eating in a restaurant. We just order (request), pay, and eat (use). When we are full we leave, without worrying about the preparation, such as washing and chopping the vegetables, or the cleanup, such as washing the dishes and pots. The advantage is speed, but there is little freedom.
Using the heap is like cooking your favorite dishes yourself. It is more trouble, but the result suits your own taste and you have a great deal of freedom.
3. Memory structure of a Windows process
If you do not yet know what a stack is, please read the basics at the end of the article before going on.
Anyone who has done some programming knows that high-level languages access data in memory through variable names. So how are these variables stored in memory, and how does the program use them? This is discussed in depth below. Unless stated otherwise, the C code below is compiled with VC as a release build.
First, let's see how C variables are laid out in memory. C has global variables, local variables, static variables, and register variables, and each kind is allocated differently. Let's start with the following code:
    #include <stdio.h>

    int g1=0, g2=0, g3=0;

    int main()
    {
        static int s1=0, s2=0, s3=0;
        int v1=0, v2=0, v3=0;
        // Print the memory address of each variable.
        // "%08x" assumes 32-bit pointers, as in the VC build used here.
        printf("0x%08x\n",&v1);   // addresses of the local variables
        printf("0x%08x\n",&v2);
        printf("0x%08x\n\n",&v3);
        printf("0x%08x\n",&g1);   // addresses of the global variables
        printf("0x%08x\n",&g2);
        printf("0x%08x\n\n",&g3);
        printf("0x%08x\n",&s1);   // addresses of the static variables
        printf("0x%08x\n",&s2);
        printf("0x%08x\n\n",&s3);
        return 0;
    }
The compiled execution result is:
    0x0012ff78
    0x0012ff7c
    0x0012ff80

    0x004068d0
    0x004068d4
    0x004068d8

    0x004068dc
    0x004068e0
    0x004068e4
The output shows the memory address of each variable: v1, v2, and v3 are local variables, g1, g2, and g3 are global variables, and s1, s2, and s3 are static variables. Within each group the variables are laid out contiguously, but the addresses of the local variables are very far from those of the global variables, while the global and static variables follow one another directly. This is because local variables and global/static variables are allocated in different kinds of memory areas.
The memory space of a process can be logically divided into three parts: the code area, the static data area, and the dynamic data area. The stack and the heap are two different dynamic data areas: the stack is a linear structure and the heap is a linked structure. Each thread of a process has its own private stack, so even though all threads run the same code, their local variables do not interfere with one another. A stack can be described by its base address and its top address. Global and static variables are allocated in the static data area, and local variables are allocated in the dynamic data area, that is, on the stack. The program accesses local variables through the stack's base address plus an offset.
    ├───────────────────┤ low memory
    │      ......       │
    ├───────────────────┤
    │ Dynamic data area │
    ├───────────────────┤
    │      ......       │
    ├───────────────────┤
    │     Code area     │
    ├───────────────────┤
    │ Static data area  │
    ├───────────────────┤
    │      ......       │
    ├───────────────────┤ high memory
The stack is a first-in, last-out data structure. The top address of the stack is always less than or equal to its base address. Looking at how a function call works gives a deeper understanding of the role the stack plays in a program. Different languages use different calling conventions, which cover the order in which parameters are pushed and who balances the stack. The Windows API convention differs from the ANSI C one: in the former the called function adjusts the stack, and in the latter the caller does. The two are distinguished by the keywords "__stdcall" and "__cdecl". Look at the following code first:
    #include <stdio.h>

    void __stdcall func(int param1,int param2,int param3)
    {
        int var1=param1;
        int var2=param2;
        int var3=param3;
        // Print the memory address of each variable
        // ("%08x" assumes 32-bit pointers).
        printf("0x%08x\n",&param1);
        printf("0x%08x\n",&param2);
        printf("0x%08x\n\n",&param3);
        printf("0x%08x\n",&var1);
        printf("0x%08x\n",&var2);
        printf("0x%08x\n\n",&var3);
        return;
    }

    int main()
    {
        func(1,2,3);
        return 0;
    }
The compiled execution result is:
    0x0012ff78
    0x0012ff7c
    0x0012ff80

    0x0012ff68
    0x0012ff6c
    0x0012ff70
    ├──────────────┤ <-- top of stack while the function executes (ESP), low memory
    │    ......    │
    ├──────────────┤
    │    var 1     │
    ├──────────────┤
    │    var 2     │
    ├──────────────┤
    │    var 3     │
    ├──────────────┤
    │     RET      │
    ├──────────────┤ <-- top of stack (ESP) after a "__cdecl" function returns
    │ parameter 1  │
    ├──────────────┤
    │ parameter 2  │
    ├──────────────┤
    │ parameter 3  │
    ├──────────────┤ <-- top of stack (ESP) after a "__stdcall" function returns
    │    ......    │
    ├──────────────┤ <-- bottom of stack (base address, EBP), high memory
The above figure is what the stack looks like during a function call.
First, the three parameters are pushed onto the stack from right to left: "param3", then "param2", and finally "param1". Then the return address of the function (RET) is pushed.
Next, execution jumps to the address of the function. (A remark here: articles on buffer overflows under UNIX say that after RET is pushed, the current EBP is pushed as well and ESP is then copied into EBP. An article on function calls under Windows claims the same step exists there, but in my own debugging I did not find it. The output also shows this: there is only a 4-byte gap, the slot for RET, between var3 and param1.)
Third, a number is subtracted from the stack top (ESP) to allocate memory for the local variables. In the example above, 12 bytes are subtracted (ESP=ESP-3*4, since each int variable occupies 4 bytes), and the memory of the local variables is then initialized. Because a "__stdcall" function adjusts the stack itself, it must restore the stack before returning: it first reclaims the memory of the local variables (ESP=ESP+3*4), then pops the return address into the EIP register, reclaims the memory of the parameters pushed earlier (ESP=ESP+3*4), and execution continues in the caller.
See the following assembly code:
    ;--------------Assembly code of the func function-------------------
    :00401000 83EC0C      sub esp, 0000000C            ;allocate memory for the local variables
    :00401003 8B442410    mov eax, dword ptr [esp+10]
    :00401007 8B4C2414    mov ecx, dword ptr [esp+14]
    :0040100B 8B542418    mov edx, dword ptr [esp+18]
    :0040100F 89442400    mov dword ptr [esp], eax
    :00401013 8D442410    lea eax, dword ptr [esp+10]
    :00401017 894C2404    mov dword ptr [esp+04], ecx
    ........................(several instructions omitted)
    :00401075 83C43C      add esp, 0000003C            ;restore the stack, reclaiming the local variables
    :00401078 C3          ret 000C                     ;return and reclaim the memory of the parameters
                                                       ;with "__cdecl" this would be a plain "ret",
                                                       ;and the caller would restore the stack
    ;-------------------End of function-------------------------
    ;--------------Code in the main program that calls func--------------
    :00401080 6A03        push 00000003                ;push parameter param3
    :00401082 6A02        push 00000002                ;push parameter param2
    :00401084 6A01        push 00000001                ;push parameter param1
    :00401086 E875FFFFFF  call 00401000                ;call func
                                                       ;with "__cdecl" the stack would be restored here
                                                       ;by "add esp, 0000000C"
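The stack-pointer arithmetic of the two conventions can be replayed with plain integers. The following is a sketch of the bookkeeping only, assuming 4-byte stack slots and no saved EBP, as in the listing; `esp_after_ret` and `caller_cleanup_bytes` are hypothetical names.

```c
#include <assert.h>

#define WORD 4u  /* size of one 32-bit stack slot */

typedef enum { CALL_CDECL, CALL_STDCALL } Convention;

/* Hypothetical model of ESP around a call with 'nparams' int parameters and
   'nlocals' int locals. Returns ESP right after the callee's ret instruction,
   before any cleanup done by the caller. */
unsigned esp_after_ret(unsigned esp, unsigned nparams, unsigned nlocals,
                       Convention conv)
{
    assert(esp >= (nparams + nlocals + 1u) * WORD);

    esp -= nparams * WORD;     /* caller pushes the parameters */
    esp -= WORD;               /* call pushes the return address */
    esp -= nlocals * WORD;     /* callee: sub esp, nlocals*4 */

    esp += nlocals * WORD;     /* callee: add esp, nlocals*4 */
    esp += WORD;               /* ret pops the return address */
    if (conv == CALL_STDCALL) {
        esp += nparams * WORD; /* "ret N" also discards the parameters */
    }
    return esp;
}

/* Bytes the caller still has to remove itself ("add esp, N" after the call). */
unsigned caller_cleanup_bytes(unsigned nparams, Convention conv)
{
    return conv == CALL_CDECL ? nparams * WORD : 0u;
}
```

With the addresses printed earlier, ESP is 0x0012ff84 before the first push (param3 lands at 0x0012ff80). For "__stdcall" the function hands back ESP = 0x0012ff84; for "__cdecl" it hands back 0x0012ff78 and the caller adds the remaining 12 bytes.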
Astute readers have probably worked out the principle of a buffer overflow by now. Let's look at the following code:
    #include <stdio.h>
    #include <string.h>

    void __stdcall func()
    {
        char lpBuff[8]="\0";
        strcat(lpBuff,"AAAAAAAAAAA");   // deliberately writes past the end of lpBuff
        return;
    }

    int main()
    {
        func();
        return 0;
    }
What happens when you compile and run it? The familiar error appears: the instruction at "0x00414141" referenced memory at "0x00000000", and the memory could not be "read". "41" is the hexadecimal ASCII code of "A", so strcat is obviously the culprit. "lpBuff" is only 8 bytes long including the terminating \0, so strcat can write at most 7 "A" characters, but the program actually writes 11 "A" characters plus one \0. Looking at the stack figure above, the 4 extra bytes land exactly on the memory of RET, so the function returns to a wrong address and executes the wrong instructions. If the string is carefully constructed in three parts, first meaningless filler that causes the overflow, then a value that overwrites RET, then a piece of shellcode, the shellcode runs when the function returns, as long as the RET value points to its first instruction. However, different software versions and runtime environments can change where the shellcode ends up in memory, so it is hard to construct an exact RET value. Usually a large number of NOP instructions are placed between RET and the shellcode, which makes the exploit work in more environments.
    ├──────────────┤ <-- low memory
    │    ......    │
    ├──────────────┤ <-- start of the data filled in by the exploit
    │              │
    │    buffer    │ <-- filled with meaningless data
    │              │
    ├──────────────┤
    │     RET      │ <-- points at the shellcode or into the NOP range
    ├──────────────┤
    │     NOP      │
    │    ......    │ <-- NOP padding, the range RET may point into
    │     NOP      │
    ├──────────────┤
    │              │
    │  shellcode   │
    │              │
    ├──────────────┤ <-- end of the data filled in by the exploit
    │    ......    │
    ├──────────────┤ <-- high memory
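The overflow comes from strcat not knowing how large the destination is. One common defence is to pass the buffer size along and copy no more than fits. The sketch below shows that idea; `bounded_append` is a hypothetical helper written for this article, not a standard library function.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: appends 'src' to the string in 'dst' without writing
   past dst[dst_size - 1]. Returns 1 if everything fit, 0 if the text was
   truncated or dst was not a terminated string to begin with. */
int bounded_append(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    /* Look for the terminator inside the buffer only. */
    const char *end = memchr(dst, '\0', dst_size);
    if (end == NULL) {
        return 0;
    }
    size_t used = (size_t)(end - dst);
    size_t room = dst_size - used - 1;   /* keep one byte for '\0' */
    size_t want = strlen(src);
    size_t n = want < room ? want : room;

    memcpy(dst + used, src, n);
    dst[used + n] = '\0';
    return n == want;
}
```

With `char lpBuff[8] = "";`, the call `bounded_append(lpBuff, sizeof lpBuff, "AAAAAAAAAAA")` stores seven "A" characters plus the terminator and returns 0 to signal truncation, so RET is never touched.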
Under Windows, dynamic data can be stored not only on the stack but also on the heap. Anyone who knows C++ knows that the new keyword allocates memory dynamically. Look at the following C++ code:
    #include <stdio.h>

    void func()
    {
        char *buffer=new char[128];
        char bufflocal[128];
        static char buffstatic[128];
        // "%08x" assumes 32-bit pointers, as in the VC build used here.
        printf("0x%08x\n",buffer);      // address of the block on the heap
        printf("0x%08x\n",bufflocal);   // address of the local variable
        printf("0x%08x\n",buffstatic);  // address of the static variable
        delete[] buffer;
    }

    int main()
    {
        func();
        return 0;
    }
The program execution result is:
    0x004107d0
    0x0012ff04
    0x004068c0
As you can see, memory allocated with the new keyword is neither on the stack nor in the static data area. The VC compiler implements the dynamic allocation behind new through the Windows "heap". Before discussing the heap, let's look at several API functions related to it:
- HeapAlloc: requests memory from a heap
- HeapCreate: creates a new heap object
- HeapDestroy: destroys a heap object
- HeapFree: frees requested memory
- HeapWalk: enumerates all memory blocks of a heap object
- GetProcessHeap: gets the default heap object of the process
- GetProcessHeaps: gets all heap objects of the process
- LocalAlloc
- GlobalAlloc
When a process is initialized, the system automatically creates a default heap for it, with a default size of 1 MB. Heap objects are managed by the system and exist in memory as a linked structure. The following code requests memory from the heap dynamically:
    HANDLE hHeap=GetProcessHeap();
    char *buff=HeapAlloc(hHeap,0,8);
Here hHeap is the handle of the heap object, and buff points to the requested memory. What exactly is hHeap? Does its value mean anything? Take a look at the following code:
    #pragma comment(linker,"/entry:main") // define the entry point of the program
    #include <windows.h>
    _CRTIMP int (__cdecl *printf)(const char *, ...); // declare a pointer to the C runtime function printf
    /*---------------------------------------------------------------------------
    A quick review of earlier material:
    (*Note) printf is a function of the C standard library, and VC's standard
    library is implemented by the MSVCRT.DLL module.
    The declaration shows that printf takes a variable number of parameters. The
    function cannot know in advance how many parameters the caller pushed; it can
    only find out by parsing the format string in the first parameter. Because
    the number of parameters is dynamic, the caller must balance the stack, so
    the __cdecl convention is used here. By the way, Windows API functions are
    almost all __stdcall. One exception is wsprintf, which uses __cdecl for the
    same reason as printf: its number of parameters is variable.
    ---------------------------------------------------------------------------*/
    void main()
    {
        HANDLE hHeap=GetProcessHeap();
        char *buff=HeapAlloc(hHeap,0,0x10);
        char *buff2=HeapAlloc(hHeap,0,0x10);
        HMODULE hMsvcrt=LoadLibrary("msvcrt.dll");
        printf=(void *)GetProcAddress(hMsvcrt,"printf");
        printf("0x%08x\n",hHeap);
        printf("0x%08x\n",buff);
        printf("0x%08x\n\n",buff2);
    }
The execution results are:
    0x00130000
    0x00133100
    0x00133118
Why is the value of hHeap so close to the value of buff? The hHeap handle is in fact the address of the heap header. The user-mode area of a process contains a structure called the PEB (Process Environment Block), which stores important information about the process. The ProcessHeap field at offset 0x18 from the start of the PEB holds the address of the default heap of the process, and offset 0x90 holds a pointer to the list of addresses of all heaps of the process. Many Windows APIs use the default heap to store dynamic data. For example, all ANSI functions in Windows 2000 allocate memory from the default heap to convert ANSI strings to Unicode strings. Access to a heap is serialized: only one thread can access the data in a heap at a time. When several threads need access at the same moment they have to wait in line, which lowers the efficiency of the program.
Finally, let's talk about data alignment in memory. Data alignment means that the memory address of a piece of data must be an integer multiple of its length: the start address of a DWORD is divisible by 4, and the start address of a WORD is divisible by 2. An x86 CPU can access aligned data directly. When it accesses unaligned data it performs a series of internal adjustments. These are transparent to the program but slow it down, so compilers try to keep data aligned. Let's compare the results of the same code compiled with three different compilers: VC, Dev-C++, and lcc:
    #include <stdio.h>

    int main()
    {
        int a;
        char b;
        int c;
        // "%08x" assumes 32-bit pointers.
        printf("0x%08x\n",&a);
        printf("0x%08x\n",&b);
        printf("0x%08x\n",&c);
        return 0;
    }
This is the result when compiled with VC:
    0x0012ff7c
    0x0012ff7b
    0x0012ff80
Order of the variables in memory: b (1 byte) - a (4 bytes) - c (4 bytes).
This is the result when compiled with Dev-C++:
    0x0022ff7c
    0x0022ff7b
    0x0022ff74
Order of the variables in memory: c (4 bytes) - 3-byte gap - b (1 byte) - a (4 bytes).
This is the result when compiled with lcc:
    0x0012ff6c
    0x0012ff6b
    0x0012ff64
Order of the variables in memory: same as above.
All three compilers achieve data alignment, but the latter two are evidently not as "smart" as VC: they effectively spend 4 bytes on a single char, which wastes memory.
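Alignment can also be checked in code. The sketch below uses a struct instead of local variables, because the C standard fixes the order of struct members, whereas the placement of locals is up to the compiler (which is why the three results above differ). `struct Sample`, `is_aligned`, and `padding_after_b` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same order as the locals in the example: int, char, int. */
struct Sample {
    int a;
    char b;
    int c;
};

/* Returns 1 if 'address' is a multiple of 'alignment'. */
int is_aligned(const void *address, size_t alignment)
{
    assert(alignment != 0);
    return (uintptr_t)address % alignment == 0;
}

/* Number of padding bytes the compiler inserted between b and c. */
size_t padding_after_b(void)
{
    return offsetof(struct Sample, c)
         - (offsetof(struct Sample, b) + sizeof(char));
}
```

On a typical platform with 4-byte int, `padding_after_b()` is 3: the same 3-byte gap that Dev-C++ and lcc left next to the char, inserted so that c starts on a multiple of 4.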
Basics:
A stack is a simple data structure: a linear list in which insertion and deletion are allowed at one end only. The end where insertion and deletion happen is called the top of the stack, and the other end is called the bottom. Inserting into and deleting from the stack are called pushing and popping. The CPU has a set of instructions for accessing the memory stack of a process: the POP instruction pops, and the PUSH instruction pushes. The ESP register of the CPU holds the stack-top pointer of the current thread, and the EBP register holds its stack-bottom pointer. The EIP register holds the memory address of the next instruction. When the CPU finishes the current instruction, it reads the address of the next one from EIP and continues from there.
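As a data structure, the stack described above fits in a few lines of C. This is a minimal array-based sketch with a fixed capacity; `IntStack` and its functions are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CAPACITY 8

/* Hypothetical fixed-size stack; 'top' counts the stored items. */
typedef struct {
    int items[CAPACITY];
    int top;
} IntStack;

void stack_init(IntStack *s)
{
    assert(s != NULL);
    s->top = 0;
}

/* Push: returns false instead of overflowing when the stack is full. */
bool stack_push(IntStack *s, int value)
{
    assert(s != NULL);
    if (s->top == CAPACITY) {
        return false;
    }
    s->items[s->top++] = value;
    return true;
}

/* Pop: stores the most recently pushed item in *out. */
bool stack_pop(IntStack *s, int *out)
{
    assert(s != NULL && out != NULL);
    if (s->top == 0) {
        return false;
    }
    *out = s->items[--s->top];
    return true;
}
```

Pushing 1, 2, 3 and then popping three times yields 3, 2, 1: both operations work on the same end, which is what makes the structure first in, last out.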
Original link: https://blog.csdn.net/yingms/article/details/53188974