for loop analysis

Posted by kavitam on Mon, 07 Feb 2022 08:53:53 +0100

In a programming language, the loop is one of the three basic control structures (sequence, branch, and loop). Of the three, the loop is the most fascinating: it combines the strengths of human thinking with those of machine computation, shows off a programmer's skill and ingenuity, and brings out the simplicity, elegance, and beauty of code. The most commonly used loop is probably the for loop; the others, such as while and do-while, can basically be rewritten as for loops, and a for loop can in turn be rewritten as a recursive function. This article first uses VC to create a project with a simple function containing a for loop, and then analyzes its assembly code with IDA.
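As a rough illustration of the claim that these forms are interchangeable, the sketch below sums the integers 0 to n-1 three ways: with a for loop, with a while loop, and with recursion. The function names (sum_for, sum_while, sum_rec, sum_rec_from) are made up for this example and are not part of the project analyzed later.

```c
/* Three equivalent ways to compute 0 + 1 + ... + (n-1).
   All function names are hypothetical, chosen for this illustration. */

/* Plain for loop. */
int sum_for(int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += i;
    return total;
}

/* The same loop as a while: the initialization moves in front of the
   loop and the increment moves to the end of the body. */
int sum_while(int n)
{
    int total = 0;
    int i = 0;
    while (i < n) {
        total += i;
        i++;
    }
    return total;
}

/* The same loop as recursion: each call handles one iteration. */
static int sum_rec_from(int i, int n)
{
    if (!(i < n))                 /* loop condition fails: stop */
        return 0;
    return i + sum_rec_from(i + 1, n);
}

int sum_rec(int n)
{
    return sum_rec_from(0, n);
}
```

Calling sum_for(5), sum_while(5), and sum_rec(5) should each yield 10.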

A for loop typically has the following form, for example:

    for( i = 0; i < imax; i++) { ... }

It can be regarded as consisting of four basic parts: the initialization statement (i = 0), the conditional statement (i < imax), the increment statement that runs after the body (i++), and the loop body that is executed repeatedly ({...}). First, create a Win32 Console program with VC and enter a simple function containing a for loop. The code is as follows:

#include "stdafx.h"
#include <stdio.h>
#include <string.h>

int getstring(char* s)
{
    int length = strlen(s);
    int i;
    for(i=0; i<length; i++)
        // % has higher precedence than + and -, hence the parentheses!
        s[i] = (s[i] - 'a' + 1)%26 + 'a';
    return length;
}

int main(int argc, char* argv[])
{
    char s[] = "abcdefg\0";
    int result = getstring(s);
    printf("%s\n", s);
    return 0;
}

Here is a brief explanation of the code above. The function getstring replaces each character of the input string with the next letter of the alphabet (a becomes b, b becomes c, ..., z becomes a), assuming the input consists only of the lowercase letters a-z. In the program above, main initializes a string to "abcdefg", so the output is "bcdefgh".

Let's use IDA to look at the disassembly (Win32 Debug build). In the assembly code, the getstring function and the main function appear in the same order as in our C++ code. You can see:

The getstring function spans 00401020h to 00401095h (code size: 118 bytes; it actually occupies 160 bytes after alignment to 16 bytes);

The main function spans 004010C0h to 00401124h (code size: 101 bytes; it actually occupies 128 bytes after alignment);

The gaps between functions are filled with 0xCC so that each function starts at an address that is a multiple of 16 bytes.

Let's take a brief look at the main function first. Clearly, the string s lives in the stack space of main. The assembly code is as follows:

.text:004010C0 main            proc near               ; CODE XREF: j_mainj
.text:004010C0 var_50          = dword ptr -50h
.text:004010C0 var_10          = dword ptr -10h
.text:004010C0 var_C           = dword ptr -0Ch
.text:004010C0 var_8           = dword ptr -8
.text:004010C0 var_4           = byte ptr -4
.text:004010C0                 push    ebp
.text:004010C1                 mov     ebp, esp

                               ; main reserves 80 bytes of temporary space on the stack
.text:004010C3                 sub     esp, 50h
.text:004010C6                 push    ebx      ; save the registers (esp keeps decreasing; they are stored just below the 80 bytes, at adjacent lower addresses)
.text:004010C7                 push    esi
.text:004010C8                 push    edi

                               ; fill the 80 bytes of temporary space with 0xCC
.text:004010C9                 lea     edi, [ebp+var_50]
.text:004010CC                 mov     ecx, 14h
.text:004010D1                 mov     eax, 0CCCCCCCCh
.text:004010D6                 rep stosd               ; stosd: store eax at [edi]; rep repeats it ecx times

                               ; copy the data in .rdata to char s[] (s is at ebp + var_C)
                               ; eax <- "abcd"
.text:004010D8                 mov     eax, ds:??_C@_08PIFP@abcdefg?$AA?$AA@
.text:004010DD                 mov     [ebp+var_C], eax

                               ; ecx <- "efg\0"
.text:004010E0                 mov     ecx, ds:dword_422024
.text:004010E6                 mov     [ebp+var_8], ecx
.text:004010E9                 mov     dl, ds:byte_422028 ; dl <- the ninth byte, the terminating '\0' appended by the compiler
.text:004010EF                 mov     [ebp+var_4], dl

                               ;call getstring(char* s);
.text:004010F2                 lea     eax, [ebp+var_C]
.text:004010F5                 push    eax
.text:004010F6                 call    j_getstring
.text:004010FB                 add     esp, 4  ; the caller restores the stack
.text:004010FE                 mov     [ebp+var_10], eax ;result = getstring(s);

                               ; call printf("%s\n", s);
.text:00401101                 lea     ecx, [ebp+var_C]
.text:00401104                 push    ecx
.text:00401105                 push    offset ??_C@_03HHKO@?$CFs?6?$AA@
.text:0040110A                 call    printf
.text:0040110F                 add     esp, 8 ; restore the stack
.text:00401112                 xor     eax, eax ; return 0;
.text:00401114                 pop     edi
.text:00401115                 pop     esi
.text:00401116                 pop     ebx  ; restore the saved registers (esp increases)

                               ; release the 80 bytes of temporary space reserved on the stack
.text:00401117                 add     esp, 50h 

                               ; check whether the stack has been corrupted
.text:0040111A                 cmp     ebp, esp
.text:0040111C                 call    __chkesp
.text:00401121                 mov     esp, ebp
.text:00401123                 pop     ebp
.text:00401124                 retn
.text:00401124 main            endp

The main function is not the focus here, but it still shows some basics, such as how a function uses the stack. As the code above shows, right after saving the old ebp a function copies esp into ebp, which amounts to taking a "snapshot" of the stack top on entry (esp changes constantly, so the stack-top address at function entry has to be cached). From then on, parameters are accessed through this "snapshot", that is, at (ebp + parameter offset). The caller is responsible for restoring the stack (restoring the value of esp, which decreased when the parameters were pushed; the return address pushed by call is popped by retn). This is part of the calling convention.

Next, the function reserves, in one step, enough stack space for its temporary variables (80 bytes in main) and fills all of it with 0xCC. All temporary variables declared later in the function live in this space. Before the function returns, it releases the space. ([Remark] "Reserving" and "releasing" here only mean subtracting from and adding to esp, the stack-top address: subtraction reserves, addition releases. This is not the same concept as memory management on the heap.) For example, the char s[] declared in main lies within these 80 bytes. Note how the s array is initialized in the assembly code. Since I wrote char s[] = "abcdefg\0", the content of the string "abcdefg\0" is stored in the .rdata section. Its 8 explicit characters are equivalent to two DWORDs, so the assembly code copies them with two DWORD mov pairs, followed by a one-byte mov for the terminating '\0' that the compiler appends.
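The two steps just described (filling the frame and copying the literal) can be imitated in C. This is only a sketch of what the instructions do, not compiler output; the names fill_frame, init_s, and FRAME_DWORDS are invented for the illustration.

```c
#include <stdint.h>
#include <string.h>

#define FRAME_DWORDS 0x14   /* 14h dwords = 80 bytes, as in main */

/* Imitates "rep stosd" with eax = 0CCCCCCCCh and ecx = 14h. */
void fill_frame(uint32_t frame[FRAME_DWORDS])
{
    for (int i = 0; i < FRAME_DWORDS; i++)
        frame[i] = 0xCCCCCCCCu;
}

/* Imitates the three mov pairs that copy the literal into s:
   dword + dword + byte = 9 bytes in total. */
void init_s(unsigned char s[9], const unsigned char literal[9])
{
    memcpy(s,     literal,     4);   /* "abcd"    -> [ebp+var_C] */
    memcpy(s + 4, literal + 4, 4);   /* "efg\0"   -> [ebp+var_8] */
    s[8] = literal[8];               /* last '\0' -> [ebp+var_4] */
}
```

After fill_frame every dword of the array reads 0xCCCCCCCC, and after init_s the nine bytes of s match the literal "abcdefg\0" including its implicit terminator.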

Just before the function returns, the current stack state (esp) is compared with the "snapshot" (ebp) taken on entry to see whether they match. If they do not, the stack may have been corrupted during the function's execution, or the caller and the function do not follow the same calling convention. If we use IDA to modify esp or ebp during debugging so that they are unequal, then after entering the __chkesp function a dialog box like the one in the figure below pops up. Its message says, in effect, that "the value of ESP was not properly saved across a function call, which is usually the result of calling a function through a function pointer declared with a calling convention different from that of the function". As far as I recall, __chkesp was introduced in some version of Visual Studio and became a default option, which makes applications safer. There appears to be a compiler switch in the IDE that turns off the stack-checking code generated by the compiler. Turning it off makes the generated application leaner and faster, but it undoubtedly reduces its safety. We will not analyze this any further.



If you use a hex editor to overwrite every instruction that releases the temporary variable space (add esp, **h) with NOP instructions (0x90), the dialog box above pops up at run time. If we let the program continue, another dialog box familiar from XP appears: the exe has encountered a problem and needs to close. Clicking to view the technical information of the error report it wants to send brings up the following dialog box. In it I checked the Address, which is the assembly code that triggers the interrupt (int 3), then the CPUID (obtained through the CPUID instruction), followed by some module, stack, and memory information. Since we deliberately broke the esp check at run time, sending this data to Microsoft is obviously unlikely to produce any result.


Next, let's look at the assembly code of getstring, which contains a basic for loop:

.text:00401020 getstring       proc near               ; CODE XREF: j_getstringj
.text:00401020 var_48          = dword ptr -48h
.text:00401020 var_8           = dword ptr -8
.text:00401020 var_4           = dword ptr -4
.text:00401020 arg_0           = dword ptr  8
.text:00401020                 push    ebp
.text:00401021                 mov     ebp, esp  ;ebp = esp;
.text:00401023                 sub     esp, 48h  ; reserve 72 bytes on the stack (for temporary variables)
.text:00401026                 push    ebx
.text:00401027                 push    esi
.text:00401028                 push    edi ; save the registers
.text:00401029                 lea     edi, [ebp+var_48]
.text:0040102C                 mov     ecx, 12h
.text:00401031                 mov     eax, 0CCCCCCCCh
.text:00401036                 rep stosd

                               ;call strlen(s);
.text:00401038                 mov     eax, [ebp+arg_0] ; fetch parameter s (located above ebp, at ebp+8)
.text:0040103B                 push    eax    ; push the argument
.text:0040103C                 call    strlen
.text:00401041                 add     esp, 4 ; the caller restores the stack pointer
.text:00401044                 mov     [ebp+var_4], eax ; length = strlen(s);
                               ; the for loop starts here; first comes the initialization part (i = 0)
.text:00401047                 mov     [ebp+var_8], 0   ; i = 0;
.text:0040104E                 jmp     short loc_401059 ; jump to the condition check

                                   ; the increment statement (i++) follows
.text:00401050 loc_401050:                             ; CODE XREF: getstring+60j
.text:00401050                 mov     ecx, [ebp+var_8]
.text:00401053                 add     ecx, 1
.text:00401056                 mov     [ebp+var_8], ecx ; i = i + 1;

                               ; the conditional statement (i < length?) follows
                               ; jge: jump to the given address when the comparison is >=
.text:00401059 loc_401059:                             ; CODE XREF: getstring+2Ej
.text:00401059                 mov     edx, [ebp+var_8]
.text:0040105C                 cmp     edx, [ebp+var_4]
.text:0040105F                 jge     short loc_401082 ; leave the loop when i >= length

                               ; the body of the loop {...} follows
.text:00401061                 mov     eax, [ebp+arg_0] ; eax = s;
.text:00401064                 add     eax, [ebp+var_8] ; eax = s + i;
.text:00401067                 movsx   eax, byte ptr [eax] ; eax = s[i];
.text:0040106A                 sub     eax, 60h        
                               ; eax = s[i] - 'a' + 1; ('a' = 0x61)

.text:0040106D                 cdq 
                               ; CDQ: Convert Doubleword to Quadword (386+)
                               ; sign-extends eax into edx:eax, i.e. the dividend becomes 64 bits

.text:0040106E                 mov     ecx, 1Ah         ; ecx = 26;
.text:00401073                 idiv    ecx             
                               ; idiv: signed division; the quotient goes to eax, the remainder to edx

.text:00401075                 add     edx, 61h         ; edx += 'a';
.text:00401078                 mov     eax, [ebp+arg_0]
.text:0040107B                 add     eax, [ebp+var_8]
.text:0040107E                 mov     [eax], dl        ; s[i] = low byte of edx
.text:00401080                 jmp     short loc_401050 ; jump to the increment (i++)

                               ; the code after the loop starts here
.text:00401082 loc_401082:                              ; CODE XREF: getstring+3Fj
.text:00401082                 mov     eax, [ebp+var_4] ; return length;
.text:00401085                 pop     edi
.text:00401086                 pop     esi
.text:00401087                 pop     ebx      ; restore the registers
.text:00401088                 add     esp, 48h ; release the temporary space reserved on the stack
.text:0040108B                 cmp     ebp, esp ; check the stack
.text:0040108D                 call    __chkesp
.text:00401092                 mov     esp, ebp
.text:00401094                 pop     ebp
.text:00401095                 retn
.text:00401095 getstring       endp

Code getstring
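To make the arithmetic in the loop body of the listing easier to follow, here is a hedged C sketch that mirrors the instruction sequence for a single character, one statement per instruction. The function name shift_char_like_asm and the variable names are invented; the variables simply echo the registers in the listing.

```c
#include <stdint.h>

/* Mirrors the loop-body instructions for one character.
   The function name is hypothetical. */
char shift_char_like_asm(char c)
{
    int32_t eax = (signed char)c;         /* movsx eax, byte ptr [eax]      */
    eax -= 0x60;                          /* sub eax, 60h : c - 'a' + 1     */
    int64_t edx_eax = (int64_t)eax;       /* cdq : sign-extend to 64 bits   */
    int32_t ecx = 0x1A;                   /* mov ecx, 1Ah : 26              */
    int32_t edx = (int32_t)(edx_eax % ecx); /* idiv ecx : remainder in edx  */
    edx += 0x61;                          /* add edx, 61h : + 'a'           */
    return (char)edx;                     /* mov [eax], dl : keep low byte  */
}
```

For lowercase input this agrees with the C expression in getstring: 'a' maps to 'b', and 'z' wraps around to 'a' because (0x7A - 0x60) % 26 is 0.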

In the code above, parameter s is located at [ebp+8]. Why? Look at how the function is called:

    .text:004010F2                 lea     eax, [ebp+var_C]
    .text:004010F5                 push    eax          ; after the push, [esp] holds the address of s (s itself is on the stack)
    .text:004010F6                 call    j_getstring  ; note: call pushes the address of the next instruction (004010FB), so esp decreases by another 4
    .text:004010FB                 add     esp, 4       ; the caller restores the stack

The call instruction then jumps to the getstring function:

    .text:00401020 getstring       proc near

    .text:00401020 arg_0           = dword ptr  8
    .text:00401020                 push    ebp          ; save the value of ebp; esp decreases by 4 again
    .text:00401021                 mov     ebp, esp     ; therefore ebp (= esp) is 8 bytes away from the parameter

After ebp has been pushed, the stack contents are as follows:

    ESP --> | original value of ebp (4 bytes)
            | return address (address of the instruction after call j_getstring) (4 bytes)
            | parameter (the first / leftmost parameter), i.e. the address of char s[] (which is on the stack)

Therefore, the parameters of the function start 8 bytes above ebp. Because parameters are pushed from right to left, the further left a parameter is, the closer it is to the top of the stack (the shallower its depth).

[Figure: IDA screenshot of the analysis above during a debugging session]
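The reasoning about [ebp+8] can be replayed on a toy model of the stack. This is a sketch under simplifying assumptions: the stack is a small array of dwords, offsets stand in for addresses, and every name here (ToyStack, toy_push, toy_read_ebp, toy_enter_getstring) is hypothetical.

```c
#include <stdint.h>

/* A toy model of a 32-bit x86 stack; all names are hypothetical. */
typedef struct {
    uint32_t mem[16];   /* stack memory, one slot per dword */
    uint32_t esp;       /* byte offset of the top of the stack */
    uint32_t ebp;       /* byte offset captured by "mov ebp, esp" */
} ToyStack;

void toy_init(ToyStack *st)
{
    st->esp = sizeof st->mem;   /* empty stack: esp at the high end */
    st->ebp = 0;
}

/* push: esp decreases by 4, then the value is stored at [esp]. */
void toy_push(ToyStack *st, uint32_t value)
{
    st->esp -= 4;
    st->mem[st->esp / 4] = value;
}

/* Reads the dword at [ebp + offset]. */
uint32_t toy_read_ebp(const ToyStack *st, uint32_t offset)
{
    return st->mem[(st->ebp + offset) / 4];
}

/* Replays: push eax / call j_getstring / push ebp / mov ebp, esp. */
void toy_enter_getstring(ToyStack *st, uint32_t s_addr,
                         uint32_t ret_addr, uint32_t old_ebp)
{
    toy_push(st, s_addr);     /* caller: push the argument        */
    toy_push(st, ret_addr);   /* call: push the return address    */
    toy_push(st, old_ebp);    /* callee: push ebp                 */
    st->ebp = st->esp;        /* callee: mov ebp, esp             */
}
```

If toy_enter_getstring is called with the return address 0x004010FB from the listing and two made-up values for the address of s and the old ebp, then toy_read_ebp(&st, 0) gives the old ebp, toy_read_ebp(&st, 4) gives the return address, and toy_read_ebp(&st, 8) gives the argument.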



Here we summarize the typical assembly code that the VC compiler generates for a function. (Note: under the calling convention used here, parameters are pushed from right to left, and the caller is responsible for removing them from the stack.)

(1) push ebp; Push EBP onto the stack to save its original value.

(2) mov ebp, esp; EBP now points to the current top of the stack. (From here on, EBP is the base for accessing the parameters and the function's temporary variables.)

(3) sub esp, **h; Reserve stack space for the function's temporary variables. (At least as much as the temporary variables actually need, aligned to 32 bits.)

(4) Prolog generated by the compiler. (Note: with the MS naked keyword you can write this part of the assembly code yourself.)

Including: saving the registers that must be preserved, ESI, EDI, EBX, and EBP (if they are used in the function); "initializing" the stack space reserved for temporary variables. (Note: filling it with 0xCC.)

(5) Function body. (The return value is placed in eax.)

(6) Epilog generated by the compiler. (Note: as above, with the naked keyword you can write this part of the assembly code yourself.)

Including: restoring the saved registers.

(7) add esp, **h; Release the stack space reserved for temporary variables.

(8) cmp ebp, esp
    call __chkesp
    mov esp, ebp; Check the stack pointer.

(9) pop ebp; Restore the value of ebp.

(10) retn; Return.

[Note] There are two levels of initialization here. One is at the compiler level (filling the temporary variable space on the stack with 0xCC), which is transparent to programmers. The other is the initialization of temporary variables that programmers write themselves, at the level of the high-level language.

Every temporary variable that the programmer has not initialized holds 0xCC bytes, which makes it easy to recognize a variable that was never initialized in the source code. For example, an uninitialized string consists entirely of this special value, so it is easy to spot when a programmer has carelessly forgotten to initialize it. If the compiler did not do this, the values of temporary variables would be random data, possibly traces left by earlier function calls, and it would be hard to tell whether a variable had been initialized deliberately or forgotten (never given an initial value). So although this compiler-level initialization is not needed for the program to run, it is very useful for debugging.
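As a small illustration of how recognizable the fill pattern is, the sketch below simulates the debug fill with memset and shows what a 32-bit int made of 0xCC bytes looks like. The helper names debug_filled_int and looks_uninitialized are invented, and the check is only a heuristic, since a program could legitimately store 0xCC bytes.

```c
#include <stdint.h>
#include <string.h>

/* Returns what an uninitialized 32-bit int local would read as after
   the debug-build fill (simulated here with memset). */
int32_t debug_filled_int(void)
{
    int32_t value;
    memset(&value, 0xCC, sizeof value);   /* stand-in for rep stosd */
    return value;
}

/* Heuristic: does every byte of the buffer still hold 0xCC? */
int looks_uninitialized(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    for (size_t i = 0; i < len; i++)
        if (p[i] != 0xCC)
            return 0;
    return len > 0;
}
```

An int filled this way reads as 0xCCCCCCCC, which is -858993460 when interpreted as a signed 32-bit value, a number that stands out immediately in a debugger's watch window.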


[Supplementary note] Added by hoodlum1980 on August 27, 2011


The "initialization" I described here (filling the stack space reserved by the function with 0xCC) applies to the code that VC generates for the Debug build. In the Debug build, therefore, uninitialized stack data and buffers show up as these values. One explanation is that 0xCC is the int 3 instruction, so if the program counter ever strays onto the stack, a breakpoint interrupt is triggered during debugging. This explanation is plausible, but it is almost impossible for the program counter to jump from the code segment into the stack space of the process, and each section has defined characteristics (equivalent to permissions and attributes). So I have never seen this explanation confirmed in practice. Of course, accidents can never be ruled out entirely. As for why VC chose 0xCC to fill the stack space, that is probably a question for insiders who develop the compiler.


The data distribution in the stack is as follows:


    ESP (stack top) --> saved register values (pushed after the temporary space is reserved)
                        temporary variable space (memory-aligned)
    EBP --------------> old value of ebp
                        return address
                        parameters (from top to bottom: left parameter, right parameter)


Here we can see that the four basic parts of the for loop are arranged in the assembly code in the following order:

    1. Initialization statement (i = 0); then jump to 3;

    2. Increment statement (i++);

    3. Conditional statement (i < imax?); if the condition is not met, jump to 5;

    4. The loop body ({...}); then jump to 2;

    5. Continue with the code that follows the loop.

The starting addresses of 2, 3, and 5 are labeled in the assembly code as jump targets. In other words, the assembly code of a for loop consists of four main parts, and three address labels are generated for the jumps.
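The five-part layout can be written back in C with goto so that the control flow follows the assembly one to one. This is a sketch for illustration only: the function name getstring_goto is made up, and the label names simply reuse the IDA labels from the listing.

```c
#include <string.h>

/* getstring rewritten with goto to mirror the layout of the assembly.
   The function name is hypothetical; labels echo the IDA labels. */
int getstring_goto(char *s)
{
    int length = (int)strlen(s);
    int i;

    i = 0;                        /* 1. initialization            */
    goto loc_401059;              /*    jump to the condition     */

loc_401050:                       /* 2. increment                 */
    i = i + 1;

loc_401059:                       /* 3. condition                 */
    if (i >= length)              /*    jge: leave when i >= length */
        goto loc_401082;

    /* 4. loop body */
    s[i] = (s[i] - 'a' + 1) % 26 + 'a';
    goto loc_401050;              /*    then jump to the increment */

loc_401082:                       /* 5. code after the loop       */
    return length;
}
```

Applied to a buffer holding "abcdefg", the function returns 7 and leaves "bcdefgh" in the buffer, the same result as the original for loop.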

Within the loop body, two high-level statements control the flow: continue and break. Many people understand clearly how these two affect the loop, but I suspect not everyone can tell how they affect the increment statement (i++). For example, try to work out the output of the following two exercises:

//Exercise A:
int i, result;
for( i = 0, result = 0; i<10; i++, result++)
  if( i == 5 ) continue;
printf("%d", result);

//Exercise B:
int i, result;
for( i = 0, result = 0; i<10; i++, result++)
  if( i == 5 ) break;
printf("%d", result);

In fact, continue jumps to 2 and break jumps to 5. Note the difference between the two. continue skips only the rest of the loop body after it (only the body is cut short; the other three parts execute in full), so the increment statement (i++) still runs. break leaves the loop directly (the increment of the current iteration is not executed). Consequently, the first exercise prints 10 and the second prints 5. Understanding this distinction is important.
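The two exercises can be checked by wrapping each loop in a function that returns result instead of printing it. The function names count_with_continue and count_with_break are invented for this sketch.

```c
/* First exercise: continue still runs the increment part
   (i++, result++) on every iteration. */
int count_with_continue(void)
{
    int i, result;
    for (i = 0, result = 0; i < 10; i++, result++) {
        if (i == 5)
            continue;
    }
    return result;
}

/* Second exercise: break skips the increment part of the
   iteration in which it fires. */
int count_with_break(void)
{
    int i, result;
    for (i = 0, result = 0; i < 10; i++, result++) {
        if (i == 5)
            break;
    }
    return result;
}
```

count_with_continue() returns 10 because the increment runs in all ten iterations, while count_with_break() returns 5 because the loop is left at i == 5 before the sixth increment.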

Finally, we close with a schematic diagram of a for loop. As the figure below shows, 2, 3, and 4 form a cycle. The only entrance to and exit from this cycle is at 3 (the conditional statement). If the condition is always TRUE, execution goes around the cycle endlessly and never leaves it, which is the so-called infinite loop.



[Aside] In VC, strlen is implemented in assembly language (its assembly source can be found in the VC installation). It is quite interesting: its efficiency comes from examining the string in groups of four chars (one DWORD) rather than char by char. The assembly code of strlen is probably hard to follow on first reading. You can refer to the following article:

    strlen source code analysis, ant
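To give a feel for the four-chars-at-a-time idea, here is a hedged C sketch built on a well-known bit trick for detecting a zero byte inside a 32-bit word. It is not the VC implementation, whose exact instruction sequence and constants may differ. The function names are invented, and the sketch assumes the buffer holding the string is readable up to a multiple of 4 bytes from its start, so that loading a whole DWORD never reads outside it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns nonzero if any of the four bytes in v is zero. */
static int has_zero_byte(uint32_t v)
{
    return ((v - 0x01010101u) & ~v & 0x80808080u) != 0;
}

/* Word-at-a-time strlen sketch (hypothetical name).
   Assumption: the buffer holding s is readable in whole groups of
   4 bytes starting at s, up to and including the terminator's group. */
size_t strlen_by_dword(const char *s)
{
    const char *p = s;
    for (;;) {
        uint32_t v;
        memcpy(&v, p, sizeof v);   /* load four chars at once */
        if (has_zero_byte(v))
            break;                 /* the terminator is in this group */
        p += 4;
    }
    while (*p != '\0')             /* locate it byte by byte */
        p++;
    return (size_t)(p - s);
}
```

With a padded buffer such as char buf[16] = "abcdefg", strlen_by_dword(buf) gives 7: the first group "abcd" contains no zero byte, and the second group "efg\0" does.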
