Programmer self-cultivation reading notes - memory

Posted by didgydont on Sat, 05 Feb 2022 19:32:44 +0100

1. Program memory layout

  The virtual address space of a process generally contains several parts:

  • Kernel space: reserved for the kernel and inaccessible to the process; its size differs between systems;
  • Stack: maintains the temporary variables and function-call context of the program; allocation and release are handled by the system;
  • Heap: memory that the program allocates dynamically through specific calls; allocation and release are handled by the user;
  • Executable image: the memory mapping of the executable file;
  • Reserved area: parts of the address space that must not be accessed;
  • Dynamic library mapping area: used to load dynamic libraries.

2. Call stack and calling conventions

2.1 stack

   In the virtual address space of a process, the stack is a memory area with first-in, last-out (FILO) behavior, and the allocation and release of this area are handled by the system. The stack grows downward: pushing data decreases the address of the stack top, and popping data increases it. On x86 the ESP register holds the address of the stack top, so a push decrements ESP and a pop increments ESP.
   The stack not only stores temporary variables but is also an essential tool for implementing function calls. When a function is called, the stack stores the information required for that call, which is called a stack frame or activation record. Stack frames generally include:

  • The return address and the parameters of the function;
  • Temporary variables: the non-static local variables of the function and other temporaries generated automatically by the compiler;
  • Saved context: the registers that must keep the same value before and after the function call.

   Two registers are used while a function executes. One is esp, which always points to the top of the stack. The other is ebp, also known as the frame pointer, which stays at a fixed position within the current stack frame for the duration of the function. That position holds the value ebp had before the call, and saving the old ebp there is what allows the caller's state to be restored on return.

   The basic steps of a function call are as follows:

  • Push all or part of the parameters onto the stack. Parameters that are not pushed are passed in specific registers;
  • Push the address of the instruction following the call (the return address) onto the stack;
  • Jump to the function body and execute it.

   When the function returns, the opposite happens: the return address is popped and execution jumps to it. The process is illustrated below through the disassembly of a function call. The C code is:

int add(int a){
    return a + 2;
}

int main(){
    int a = 2;
    a = add(3);
    return 0;
}

   Use gcc -S main.c to compile the code to assembly. The abridged output is as follows (rbp and rsp on 64-bit systems are simply the 64-bit extensions of ebp and esp on 32-bit systems):

add:
	pushq	%rbp            # Save the caller's rbp on the stack
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp      # rbp now points to the current stack top (rsp)
	.cfi_def_cfa_register 6
	movl	%edi, -4(%rbp)  # Store the argument a (passed in edi) at rbp - 4
	movl	-4(%rbp), %eax  # eax = the value stored at rbp - 4, i.e. a
	addl	$2, %eax        # eax = eax + 2, the return value
	popq	%rbp            # Restore the caller's rbp
	.cfi_def_cfa 7, 8
	ret                     # Pop the return address and jump back to the caller

main:
	pushq	%rbp                # Save the caller's rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp          # Set up the frame pointer for main
	.cfi_def_cfa_register 6
	subq	$16, %rsp           # Move rsp down to reserve 16 bytes for locals
	movl	$2, -4(%rbp)        # a = 2; a is stored at rbp - 4
	movl	$3, %edi            # First argument: edi = 3
	call	add                 # Push the return address and jump to add
	movl	%eax, -4(%rbp)      # Write the return value in eax to rbp - 4, i.e. a
	movl	$0, %eax            # Return value of main: 0
	leave                       # Restore rsp and rbp
	ret

   The assembly of most function calls follows the basic sequence above, but not all of it. For optimization, the compiler may use a different entry sequence for a function that meets the following conditions:

  • The function is declared static, so it cannot be accessed outside its compilation unit;
  • The function is only called directly within that unit and its address is never taken, explicitly or implicitly (that is, no function pointer points to it).

2.2 calling convention

   A calling convention specifies, for a function call, whether the caller or the callee maintains the stack and how parameters and return values are handled. Conventions differ in their rules, which generally cover:

  • The order and method of passing function parameters;
  • How the stack is maintained, that is, who removes the parameters from the stack;
  • The decoration of symbol names.

  For details, please refer to Calling convention.

2.3 function return value transfer

   Small return values are returned in eax alone. A value larger than one eax register (5 to 8 bytes on 32-bit x86) is returned in eax and edx together. For a larger return value, space for it is reserved on the caller's stack and the starting address of that space is passed to the callee, which writes the value there and returns the same address in eax. This description may not be completely accurate, since the details depend on the compiler and the ABI.

3. Heap and memory management

3.1 heap

   Data on the stack is managed by the system, which is safe and efficient, but the stack cannot accommodate large allocations or flexible lifetimes. The heap is much larger than the stack, and C programs use malloc and related APIs to request and manage heap memory themselves.
   malloc does not issue a system call for every request. Instead, the runtime library requests a large block of memory from the system in advance and then hands out pieces of it to the application. The memory manager that the application deals with is therefore the runtime library.

3.2 Linux process heap management

  Linux provides two system calls for allocating memory: brk and mmap.
  brk sets the end address of the process data segment, so it can grow or shrink the data segment.

       int brk(void *addr);
       void *sbrk(intptr_t increment);
       brk() and sbrk() change the location of the program break, which defines the end of the process's data segment (i.e., the program break is the first location after the end of the uninitialized data segment).  Increasing the program break has  the  effect  of  allocating  memory  to  the  process; decreasing the break deallocates memory.
       brk()  sets the end of the data segment to the value specified by addr, when that value is reasonable, the system has enough memory, and the process does not exceed its maximum data size (see setrlimit(2)).
       sbrk() increments the program's data space by increment bytes.  Calling sbrk() with an increment of 0 can be used to find the  current  location  of the program break.

  mmap requests a region of virtual address space from the operating system. The region can be backed by a file, so mmap can also be used for file mapping; a region with no backing file is called an anonymous mapping.

void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
       mmap()  creates a new mapping in the virtual address space of the calling process.  The starting address for the new mapping is specified in addr.  The length argument specifies the length of the mapping (which must be greater than 0).

   The basic logic that glibc's malloc uses to handle a request is:

  • When the requested size is less than 128 KB, a block is allocated from the existing heap space according to the heap allocation algorithm and returned;
  • When the requested size is greater than 128 KB, an anonymous region is created with mmap and the memory is handed to the user from that region.

  The amount of memory the heap can actually provide depends on how much of the process address space is left over by the other parts, so it is generally less than the theoretical maximum and varies with the situation.

3.3 windows process heap management

  The virtual address space of a Windows process is similar to that on Linux and includes the DLL and EXE images, heaps and stacks. A process can contain multiple stacks because of multithreading: each thread has its own independent stack. Windows provides the VirtualAlloc function to request a region of virtual memory from the system. With this API the requested size must be an integral multiple of the page size.

   In addition, the heap manager under Windows provides a set of heap-related APIs that can be used to create heaps and to allocate, release and destroy heap space:

  • HeapCreate: create a heap;
  • HeapAlloc: allocate memory in a heap;
  • HeapFree: release the allocated memory;
  • HeapDestroy: destroy a heap.

   Every process has a default heap. It is created when the process starts and exists until the process ends. The default heap size is 1 MB, and it can also be specified through build parameters.

3.4 heap allocation algorithm

   A heap allocation algorithm manages a large block of memory obtained from the system and allocates it to the application. The basic idea is similar to memory management in the operating system. Common methods are:

  • Free list: all free blocks are linked together in a doubly linked list. When the user requests K bytes, K+4 bytes are actually allocated, and the extra 4 bytes at the start of the block record the size K;
  • Bitmap: the heap is divided into fixed-size blocks of equal size. A request is always served with a whole number of blocks. The first block is called the head of the allocated area and the remaining blocks are called its body;
  • Object pool: the idea is simple. If every allocation has the same size (say K bytes), the whole space can be divided into a large number of K-byte blocks, and each request only needs to find one free block. An object pool can be managed with either a free list or a bitmap.

  Practical allocators combine these algorithms. For example, glibc:

  • For requests smaller than 64 bytes, uses an approach similar to an object pool;
  • For requests larger than 512 bytes, uses the best-fit algorithm;
  • For requests larger than 64 bytes and smaller than 512 bytes, uses a compromise between the two strategies above;
  • For requests larger than 128 KB, uses the mmap mechanism.

Topics: Linux Windows