2021-2022-1 20212820 Linux kernel principle and analysis week 8

Posted by monkeyx on Wed, 10 Nov 2021 17:46:47 +0100

How does the Linux kernel load and start an executable program

Understanding compilation, linking, and the ELF executable file format:

How an executable file is created:

C source code (.c) is preprocessed and compiled into assembly code (.s), the assembler turns that into object code (.o), and the linker combines the object code into an executable file (for example a.out). The OS then loads the executable file into memory to run it.

1. Preprocess: gcc -E -o hello.cpp hello.c -m32. The output is a text file. Preprocessing pulls in the #include'd files and expands macros.

2. Compile: gcc -x cpp-output -S -o hello.s hello.cpp -m32. This compiles the preprocessed source into assembly code (a text file).

3. Assemble: gcc -x assembler -c hello.s -o hello.o -m32. This assembles the code into an object file (ELF format, binary; it contains machine instructions but cannot run on its own).

4. Link: gcc -o hello hello.o -m32. This links the object file into an executable (ELF format, binary). The hello executable uses shared libraries: printf and the other libc functions it calls are resolved from the shared libc at run time. By contrast, gcc -o hello.static hello.o -m32 -static links statically, which places everything the program depends on inside the executable itself.

Example:

vi hello.c
gcc -E -o hello.cpp hello.c -m32                 # preprocess: pull in the included files and expand macros

vi hello.cpp
gcc -x cpp-output -S -o hello.s hello.cpp -m32   # compile to assembly

vi hello.s
gcc -x assembler -c hello.s -o hello.o -m32      # assemble to object code

vi hello.o
gcc -o hello hello.o -m32                        # link (dynamically)

vi hello
gcc -o hello.static hello.o -m32 -static         # link statically

ELF file format is shown in the following figure:

ELF defines three main types of object file:

  1. Relocatable file: holds code and data suitable for linking with other object files to create an executable or a shared object file. These are mainly .o files.
  2. Executable file: holds a program suitable for execution. It specifies how exec creates the program's process image, how the program is loaded, and where execution starts.
  3. Shared object file: holds code and data suitable for linking by either of the following two linkers:
     (1) The link editor, which processes it together with other relocatable and shared object files to create another object file (link-time linking).
     (2) The dynamic linker, which combines it with an executable file and other shared object files to create a process image (run-time linking).
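
The three file types above are recorded in the e_type field of the ELF header. The following is a minimal sketch, assuming a Linux system that provides <elf.h> and a file whose byte order matches the host; the function names elf_type_name and elf_file_type are made up for this illustration.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Map the e_type field of an ELF header to a readable name. */
const char *elf_type_name(uint16_t e_type)
{
    switch (e_type) {
    case ET_REL:  return "relocatable";
    case ET_EXEC: return "executable";
    case ET_DYN:  return "shared object";
    default:      return "other";
    }
}

/* Read e_type from the file at path; returns NULL if it is not a readable ELF file. */
const char *elf_file_type(const char *path)
{
    unsigned char hdr[EI_NIDENT + 2]; /* e_ident (16 bytes) followed by e_type (2 bytes) */
    uint16_t e_type;
    FILE *f;
    size_t n;

    assert(path != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return NULL;
    n = fread(hdr, 1, sizeof hdr, f);
    fclose(f);
    if (n != sizeof hdr || memcmp(hdr, ELFMAG, SELFMAG) != 0)
        return NULL; /* too short, or the magic number is not "\177ELF" */
    /* Assumption: the file uses the host's byte order. */
    memcpy(&e_type, hdr + EI_NIDENT, sizeof e_type);
    return elf_type_name(e_type);
}
```

Calling elf_file_type("hello.o") on the object file built earlier should report a relocatable file. Note that position-independent executables also carry the ET_DYN type, so a modern toolchain may report an executable as a shared object.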

Use the exec* library functions to load an executable file
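
As a rough illustration of the exec* family, the sketch below forks a child and replaces it with another program through execlp, then waits for the result. The helper name run_program and the exit code 127 for a failed exec are choices made for this example, not part of the course material.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child, replace its program image with `file` (looked up in PATH),
 * and return the child's exit status, or -1 if something went wrong.
 */
int run_program(const char *file)
{
    int status;
    pid_t pid;

    assert(file != NULL);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: on success execlp never returns, the new program takes over. */
        execlp(file, file, (char *)NULL);
        _exit(127); /* reached only if the exec failed */
    }
    /* Parent: wait for the child and decode its exit status. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_program("ls") runs ls in the child while the parent keeps its own program. The fork is needed because exec replaces the calling process's program instead of creating a new process.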

Dynamic linking comes in two forms: dynamic linking when the executable is loaded, and dynamic linking at run time.

Load-time dynamic linking (build the shared library):

gcc -shared shlibexample.c -o libshlibexample.so -m32

Run-time dynamic linking (build the dynamically loaded library):

gcc -shared dllibexample.c -o libdllibexample.so -m32

Build the main program:

gcc main.c -o main -L/path/to/your/dir -lshlibexample -ldl -m32

Note:

When compiling main, only the -L option (the directory in which to search for the library) and the -l option (the library name, that is libshlibexample.so without the lib prefix and the .so suffix) are given for shlibexample. Nothing about dllibexample is passed to the compiler; only -ldl is specified, because that library is opened at run time with dlopen.

Preparing the .so files. The shared library (shlibexample.h, followed by shlibexample.c):

#ifndef _SH_LIB_EXAMPLE_H_
#define _SH_LIB_EXAMPLE_H_
 
#define SUCCESS 0
#define FAILURE (-1)
 
#ifdef __cplusplus
extern "C" {
#endif
/*
* Shared Lib API Example
* input : none
* output    : none
* return    : SUCCESS(0)/FAILURE(-1)
*
*/
int SharedLibApi(); /* the header contains only this one function declaration */
 
#ifdef __cplusplus
}
#endif
#endif /* _SH_LIB_EXAMPLE_H_ */
/*------------------------------------------------------*/
 
#include <stdio.h>
#include "shlibexample.h"
 
int SharedLibApi()
{
    printf("This is a shared library!\n");
    return SUCCESS;
}/* _SH_LIB_EXAMPLE_C_ */

The dynamically loaded library (dllibexample.h, followed by dllibexample.c):

#ifndef _DL_LIB_EXAMPLE_H_
#define _DL_LIB_EXAMPLE_H_
#ifdef __cplusplus
extern "C" {
#endif
/*
* Dynamical Loading Lib API Example
* input    : none
* output   : none
* return   : SUCCESS(0)/FAILURE(-1)
*
*/
int DynamicalLoadingLibApi();
#ifdef __cplusplus
}
#endif
#endif /* _DL_LIB_EXAMPLE_H_ */
/*------------------------------------------------------*/
#include <stdio.h>
#include "dllibexample.h"
#define SUCCESS 0
#define FAILURE (-1)
/*
 * Dynamical Loading Lib API Example
 * input    : none
 * output   : none
 * return   : SUCCESS(0)/FAILURE(-1)
 *
 */
int DynamicalLoadingLibApi()
{
    printf("This is a Dynamical Loading library!\n");
    return SUCCESS;
}

main.c

#include <stdio.h>
#include <dlfcn.h>          /* dlopen, dlsym, dlerror, dlclose */
#include "shlibexample.h"   /* SharedLibApi, SUCCESS, FAILURE */

int main()
{
    printf("This is a Main program!\n");
    /* Use Shared Lib */
    printf("Calling SharedLibApi() function of libshlibexample.so!\n");
    SharedLibApi(); // can be called directly: resolved when the program is loaded
    /* Use Dynamical Loading Lib */
    void *handle = dlopen("libdllibexample.so", RTLD_NOW); // open the dynamically loaded library first
    if (handle == NULL)
    {
        printf("Open Lib libdllibexample.so Error:%s\n", dlerror());
        return FAILURE;
    }
    int (*func)(void);
    char *error;
    func = dlsym(handle, "DynamicalLoadingLibApi"); // look up the function by name
    if ((error = dlerror()) != NULL)
    {
        printf("DynamicalLoadingLibApi not found:%s\n", error);
        return FAILURE;
    }
    printf("Calling DynamicalLoadingLibApi() function of libdllibexample.so!\n");
    func();
    dlclose(handle); // pairs with dlopen: unloads the library
    return SUCCESS;
}

Use gdb to trace sys_execve, the kernel handler for the execve system call, to verify your understanding of what the Linux system has to do to load an executable program

Experimental process:

1. Delete the original menu and clone the new menu

 

2. Check the test.c file: a new exec system call has been added (the red area is the modified part)

3. Modify the Makefile

4. After building rootfs, start the new kernel and test the exec function

5. Trace and debug with gdb

First stop at sys_execve, then set the other breakpoints. Press c to continue until the sys_execve breakpoint is hit.

Here you can see that the entry address is the same.

struct pt_regs *regs refers to the saved registers at the bottom of the kernel stack. When the system call traps into the kernel, both esp and eip are pushed onto that stack. The new program gets its starting point by changing the EIP value saved on the kernel stack, that is, by replacing the pushed value with new_ip.

Summary:

  • Understanding of "Linux kernel loading and starting an executable program"

The new executable program gets its starting point through a change to the EIP saved on the kernel stack: start_thread sets it to new_ip, so the return to user mode no longer resumes at the instruction after int 0x80 but at the entry point of the newly loaded executable file. When the execve system call runs, the process enters kernel mode and the executable file loaded by execve() overwrites the program of the current process. When execve returns, it returns to the execution starting point of the new program (the entry point, from which main is eventually reached), so the new program runs normally after the call returns. If the program is statically linked, elf_entry is the entry address specified in the executable's ELF header (an address of the form 0x8048***). If it depends on dynamically linked libraries, elf_entry is the entry point of the dynamic linker. Dynamic linking is done mainly by the dynamic linker ld.

  • Where does the new executable program start executing?

When the execve() system call finishes and the process resumes in user mode, its execution context has changed substantially: the new program has been mapped into the process address space, and execution starts from the program entry point given in the ELF header. If the new program is statically linked, it can run on its own, and the entry address in the ELF header is the entry address of the program itself. If the new program is dynamically linked, the shared libraries still have to be loaded, so execution starts at the entry address of the dynamic linker ld instead.
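
Whether a program asks for a dynamic linker is visible in its program headers: a dynamically linked executable carries a PT_INTERP entry, a statically linked one does not. The following is a hedged sketch for checking this from user space, assuming Linux <elf.h> and a file in the host's byte order; the function name elf_requests_interpreter is invented for this example.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if the ELF file at `path` has a PT_INTERP program header
 * (it asks for a dynamic linker), 0 if it has none, -1 on any error.
 * Assumption: the file's byte order matches the host's.
 */
int elf_requests_interpreter(const char *path)
{
    unsigned char ident[EI_NIDENT];
    int result = -1;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    if (fread(ident, 1, sizeof ident, f) != sizeof ident ||
        memcmp(ident, ELFMAG, SELFMAG) != 0)
        goto out; /* not an ELF file */
    rewind(f);

    if (ident[EI_CLASS] == ELFCLASS32) {
        Elf32_Ehdr eh;
        Elf32_Phdr ph;

        if (fread(&eh, sizeof eh, 1, f) != 1)
            goto out;
        result = 0;
        /* Walk the program header table looking for PT_INTERP. */
        for (unsigned i = 0; i < eh.e_phnum; i++) {
            long off = (long)eh.e_phoff + (long)i * eh.e_phentsize;
            if (fseek(f, off, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1) {
                result = -1;
                break;
            }
            if (ph.p_type == PT_INTERP) {
                result = 1;
                break;
            }
        }
    } else if (ident[EI_CLASS] == ELFCLASS64) {
        Elf64_Ehdr eh;
        Elf64_Phdr ph;

        if (fread(&eh, sizeof eh, 1, f) != 1)
            goto out;
        result = 0;
        for (unsigned i = 0; i < eh.e_phnum; i++) {
            long off = (long)eh.e_phoff + (long)i * eh.e_phentsize;
            if (fseek(f, off, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1) {
                result = -1;
                break;
            }
            if (ph.p_type == PT_INTERP) {
                result = 1;
                break;
            }
        }
    }
out:
    fclose(f);
    return result;
}
```

Applied to the two binaries built earlier, the dynamically linked hello should report an interpreter and hello.static should not.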

  • Why can the new executable program execute smoothly after the execve system call returns?

The new executable program needs the following:
1. The library functions it uses.
2. Its process space: code segment, data segment, kernel stack, user stack, and so on.
3. The arguments it runs with.
4. The system resources it needs.
If these four conditions are met, the new program is runnable and executes normally as soon as it is scheduled. Let us check the conditions one by one.
Condition 1: if the new program is statically linked, the library functions are already inside the executable file, so the condition is met. If it is dynamically linked, execution starts at the entry address of the dynamic linker ld, which loads the required library functions, so the condition is met as well.
Condition 2: the execve system call substantially changes the execution context. It clears the user-mode stack and replaces the old program's address space with that of the new program. The process itself, including its kernel stack, is kept, so the new program has the process space it needs, and the condition is met.
Condition 3: the arguments for an executable are usually typed into the shell. The shell passes them as function arguments to the execve library call, which passes them as system call arguments to sys_execve. Finally, when sys_execve initializes the user-mode stack of the new program, it places the arguments where the main function expects to find them. The condition is met.
Condition 4: if the system currently lacks a resource the program needs, the process is suspended until the resource becomes available, then woken up and made runnable. The condition can be met.
To sum up, the new executable program can run normally.

  • What is the difference between statically linked executables and dynamically linked executables when the execve system call returns?

The execve system call enters sys_execve, which calls do_execve, which calls do_execve_common, which calls exec_binprm. In exec_binprm:

ret = search_binary_handler(bprm); // find the handler module (such as ELF) that matches the file format
...
// inside a loop over the registered binary formats:
    retval = fmt->load_binary(bprm);
...

For the ELF file format, the load_binary function pointer of fmt actually points to load_elf_binary. load_elf_binary calls start_thread, and start_thread sets the EIP saved on the kernel stack to elf_entry, so execution resumes at elf_entry on the return to user mode.
For statically linked executables, elf_entry is where the new program itself starts executing. For dynamically linked executables, the dynamic linker ld has to be loaded first:
elf_entry = load_elf_interp(...)
Control of the CPU passes first to ld, which loads the dependent libraries. When that work is done, ld hands control to the new program's own entry point.
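
The choice of elf_entry and the effect of start_thread can be modelled in a few lines of user-space C. This is only a toy model under stated assumptions: struct saved_regs, choose_elf_entry and start_thread_model are hypothetical names, and the struct is a two-field stand-in for the kernel's struct pt_regs, not the real definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the registers saved on the kernel stack. */
struct saved_regs {
    unsigned long ip; /* saved user-mode instruction pointer (EIP) */
    unsigned long sp; /* saved user-mode stack pointer (ESP) */
};

/*
 * Pick the address where user mode resumes: the dynamic linker's entry
 * point if an interpreter was loaded, otherwise the program's own e_entry.
 */
unsigned long choose_elf_entry(int has_interp,
                               unsigned long interp_entry,
                               unsigned long e_entry)
{
    return has_interp ? interp_entry : e_entry;
}

/*
 * Model of the step described above: overwrite the saved ip and sp so that
 * the return to user mode lands in the new program instead of after int 0x80.
 */
void start_thread_model(struct saved_regs *regs,
                        unsigned long new_ip,
                        unsigned long new_sp)
{
    assert(regs != NULL);
    regs->ip = new_ip;
    regs->sp = new_sp;
}
```

With made-up addresses, start_thread_model(&regs, choose_elf_entry(0, 0x4000, 0x8000), 0x9000) leaves regs.ip at 0x8000 (the static case), while passing 1 as the first argument leaves it at 0x4000 (the dynamic linker's entry).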
