Android NDK -- a summary of fork vs vfork in Linux process creation

Posted by lbaxterl on Sun, 13 Feb 2022 17:24:50 +0100

Introduction

On Unix-like systems, every process is a child obtained by copying an existing process, ultimately the init process or a kernel process. The details differ between implementations. Linux provides three system calls for creating processes: fork, vfork, and clone.

1, Unix process overview

In Unix, the kernel allocates and schedules resources with the process as the unit of allocation. Each process has a process identifier (PID), which is a non-negative integer. A PID is unique among the processes that exist at any given moment, but PIDs are reused: once a process terminates, its PID can later be assigned to another process. An ordinary process has exactly one parent process. The system also has a few special processes:

Process 0 (process ID 0) is the scheduler process, also known as the swapper. It is part of the kernel and does not execute any program from disk, so it is called a system process.
Process 1 (process ID 1) is the init process. The kernel starts it at the end of the boot sequence, and it brings up the system by running the relevant initialization scripts (*.rc or init.d). The init process eventually becomes the parent of every orphaned process.
Process 2 (process ID 2) is the page daemon on some Unix implementations. It supports paging in the virtual memory system.

Besides the process ID, a process has several other identifiers. The following table lists some of the calls related to user process control that return them (the list is not exhaustive):

System call	Description
pid_t getpid(void)	Process ID of the calling process
pid_t getppid(void)	Parent process ID of the calling process
uid_t getuid(void)	Real user ID of the calling process
uid_t geteuid(void)	Effective user ID of the calling process
gid_t getgid(void)	Real group ID of the calling process
gid_t getegid(void)	Effective group ID of the calling process

Usually, the child process inherits many properties of the parent process, including (but not limited to):

Real user ID, real group ID, effective user ID, effective group ID, supplementary group IDs
Process group ID, session ID, controlling terminal
Set-user-ID flag and set-group-ID flag
Current working directory, root directory
File mode creation mask (umask)
Close-on-exec flag of every open file descriptor
Environment and attached shared memory segments
Memory mappings

The main differences between the parent and the child are:

The return value of fork
The process IDs are different, and so are the parent process IDs (PPID)
The child's tms_utime, tms_stime, tms_cutime, and tms_cstime values are set to 0
File locks set by the parent process are not inherited by the child process

SIGCHLD: when a process terminates or stops, the SIGCHLD signal is sent to its parent process. By default, this signal is ignored. If the parent process wants to be informed when a child terminates or stops, it can install a handler to catch the signal.

2, fork, vfork, clone

The main applications of fork are:

A parent process wants to duplicate itself so that the parent and the child can execute different code paths at the same time. For example, in a network server the parent waits for client requests. When a request arrives, it calls fork and lets the child handle the request, while the parent goes back to waiting for the next one.
A process needs to execute a different program, as a shell does when it runs a command. In this case the child calls exec immediately after returning from fork.

#include <unistd.h>

pid_t fork(void);

The new process created by fork is called the child process. fork is called once but returns twice:

In the child process the return value is 0, which is how the code can tell whether it is running in the child or in the parent. A process has exactly one parent and can always get its ID with getppid, and 0 is safe as a marker because process ID 0 always belongs to the kernel swapper process.
In the parent process the return value is the PID of the new child. A process can have many children, so if fork did not return the PID, the parent would have no way to learn it.

After fork returns, both the parent and the child continue with the instruction that follows the fork call. The child gets copies of the parent's data space, heap, and stack. The two processes do not share these regions; they share only the text segment. On Linux, a child process can be created with any of the following three system calls.

Note: after fork, the order in which the parent and the child run is not defined. It depends on the kernel's scheduling algorithm.

System call	Description
fork	The child is a complete copy of the parent: it receives copies of the parent's data space, heap, and stack. The parent and the child do not share these regions, but they do share the text segment.
vfork	The child shares the parent's data segments, and the calling process is blocked after vfork() returns in the child. The parent does not continue until the child calls exec or _exit.
clone	The caller decides which parts of the parent's memory space and other resources the child shares and which it gets as copies. It is the general form of fork: it lets the caller control exactly what the parent and the child share.

The implementation of fork differs slightly between thread libraries. vfork and clone can be regarded as extended versions of fork. Inside the kernel, the vfork and fork system calls differ only in the clone_flags they pass.

A traditional full copy consumes a lot of resources, so Linux uses a copy-on-write strategy. The core idea is that the parent and the child share page frames instead of copying them, and a page frame stays shared only as long as nobody modifies it. When either process tries to write to a shared page frame, an exception occurs and the kernel copies the page into a new page frame that it marks as writable. The original page frame remains write-protected: when the other process later tries to write to it, the kernel checks whether the writing process is the only owner of the page frame, and if so it simply marks the page frame as writable for that process.

When process A calls fork to create child process B, B is a copy of A and therefore starts out with the same physical pages as A. To save memory and speed up creation, fork() lets child B share the physical pages of parent A read-only, and it also sets A's own access rights to those pages to read-only. When either A or B then writes to one of these shared pages, a page fault exception (vector 14 on x86) is raised. The CPU runs the kernel's exception handler, which calls do_wp_page() to resolve the fault. do_wp_page() ends the sharing of the page that caused the fault and copies it to a new physical page for the writing process, so that A and B each have their own physical page with the same content. When the exception handler returns, the CPU re-executes the write instruction that caused the fault, and the process continues.

3, vfork simple test

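vfork creates a child process that temporarily shares the parent's address space instead of copying it. The child is not a thread, but until it exits or calls exec it uses the parent's memory directly.

After vfork is called, the parent process is suspended until the child process terminates (_exit) or calls execve(2). Until then, the parent and the child share the same memory pages.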

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

/**
 * forkdemo.c
 * On success, the PID of the child process is returned in the parent, and 0 is
 * returned in the child.
 * On failure, -1 is returned in the parent, no child process is created, and
 * errno is set appropriately.
 */
int main(int argc, char* argv[])
{
    pid_t ret;
    int count = 0;
    // Define the variable count in the address space of the parent process
    printf("[parent]assign shared var on &count=%p in pid=%d\n", (void *)&count, (int)getpid());
    printf("[parent]fork in pid=%d\n", (int)getpid());
    // With vfork, parent and child share count: it has the same virtual address
    // in both, the parent is suspended after the call, and the child's change
    // to count is visible in the parent.
    // Note: POSIX leaves writes and most function calls in a vfork child
    // undefined; this demo relies on the behavior observed on Linux.
    ret = vfork();
    if (ret < 0)
    {
        perror("vfork");
        return 1;
    }
    if (ret == 0)
    {
        printf("[child]start in pid=%d\n", (int)getpid());
        count = 100;
        printf("[child]assign on &count=%p  with count=%d\n", (void *)&count, count);
        sleep(2);
        // The child must leave through _exit() or exec, because the parent stays
        // blocked until then. Returning from main or calling exit() here can fail with
        // vfork: cxa_atexit.c:100: __new_exitfn: Assertion `l != ((void *)0)' failed
        _exit(0);
        //execl("./vfork2", "vfork2", (char *)NULL);
    }
    else
    {
        printf("[parent]continue in parent pid=%d\n", (int)getpid());
        printf("[parent]ret=%d, &count=%p , count=%d\n", (int)ret, (void *)&count, count);
        printf("[parent]the pid=%d\n", (int)getpid());
    }
    return 0;
}

Operation results

unbuntu14:~/crazymo$ gcc forkdemo.c -o vfork
unbuntu14:~/crazymo$ ./vfork
[parent]assign shared var on &count=0x7ffe774fe418 in pid=7957
[parent]fork in pid=7957
[child]start in pid=7958
[child]assign on &count=0x7ffe774fe418  with count=100
 // the child sleeps here for 2 seconds (sleep(2)); only then does the parent continue
[parent]continue in parent pid=7957
[parent]ret=7958, &count=0x7ffe774fe418 , count=100
[parent]the pid=7957

From these results we can draw a simple conclusion: the count variable has the same virtual memory address in the parent and in the child, and because vfork makes them share the address space, it is the same variable. After calling vfork the parent is suspended until the child exits, and the child's modification of count is visible in the parent.

4, fork simple test

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

/**
 * forkdemo.c
 * On success, the PID of the child process is returned in the parent, and 0 is
 * returned in the child.
 * On failure, -1 is returned in the parent, no child process is created, and
 * errno is set appropriately.
 */
int main(int argc, char* argv[])
{
    pid_t ret;
    int count = 0;
    // Define the variable count in the address space of the parent process
    printf("[parent]assign shared var on &count=%p in pid=%d\n", (void *)&count, (int)getpid());
    printf("[parent]fork in pid=%d\n", (int)getpid());
    fflush(stdout); // flush before fork so the child does not inherit buffered output
    //ret = vfork(); // the vfork variant: parent and child share count
    // With fork, count has the same virtual address in parent and child, but each
    // process has its own copy of the value: count stays 0 in the parent and
    // becomes 100 in the child. The parent is not suspended after fork, and the
    // child's change to count is not visible in the parent.
    ret = fork();
    if (ret < 0)
    {
        perror("fork");
        return 1;
    }
    if (ret == 0)
    {
        printf("[child]start in pid=%d\n", (int)getpid());
        count = 100;
        printf("[child]assign on &count=%p  with count=%d\n", (void *)&count, count);
        sleep(2);
        fflush(stdout); // _exit() does not flush stdio buffers
        // _exit() ends the child without running the atexit handlers that were
        // registered in the parent.
        _exit(0);
        //execl("./vfork2", "vfork2", (char *)NULL);
    }
    else
    {
        printf("[parent]continue in parent pid=%d\n", (int)getpid());
        printf("[parent]ret=%d, &count=%p , count=%d\n", (int)ret, (void *)&count, count);
        printf("[parent]the pid=%d\n", (int)getpid());
    }
    return 0;
}

Operation results

unbuntu14:~/crazymo$ gcc forkdemo.c -o fork
unbuntu14:~/crazymo$ ./fork
[parent]assign shared var on &count=0x7fffc709ab18 in pid=7950
[parent]fork in pid=7950
[parent]continue in parent pid=7950
[parent]ret=7951, &count=0x7fffc709ab18 , count=0
[parent]the pid=7950
[child]start in pid=7951
[child]assign on &count=0x7fffc709ab18  with count=100

From these results we can draw a simple conclusion: the count variable has the same virtual memory address in the parent and in the child, but after fork each process has its own copy of it. The parent is not suspended after calling fork, and the child's modification of count is not visible in the parent.

Topics: Linux Unix Android
