Python learning notes - a detailed explanation of Python multiprocessing

Posted by Sphen001 on Sun, 02 Jan 2022 12:53:08 +0100

1, Introduction to multitasking

1. General

Multitasking lets two or more functions or methods make progress during the same period of time. Its advantage is that it makes fuller use of CPU resources and improves the execution efficiency of the program.

The concept of multitasking: multitasking means carrying out multiple tasks over the same period. For example, modern computers run multitasking operating systems, which can run several programs at once.

2. Multitask execution modes

There are two multitask execution modes: concurrent and parallel.

  • Concurrency: tasks are executed alternately over a period of time.
    For example, when a single-core CPU handles several tasks, the operating system lets each program run in turn. Because the CPU switches between them so quickly, the programs appear to be running at the same time.

  • Parallelism: when a multi-core CPU handles several tasks, the operating system can assign a different program to each core, so the cores really do execute programs at the same moment. Note that true simultaneous execution is limited by the number of cores; when there are more tasks than cores, the tasks sharing a core still take turns.
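Whether tasks can actually run in parallel depends on how many cores the machine has. As a minimal sketch, the standard library's os.cpu_count() reports the number of logical CPUs; the helper name available_cores is made up for this illustration.

```python
import os


def available_cores():
    """Return the number of logical CPUs, falling back to 1 if it is unknown.

    available_cores is a hypothetical helper name used only in this sketch.
    """
    # os.cpu_count() may return None when the count cannot be determined
    return os.cpu_count() or 1


if __name__ == '__main__':
    cores = available_cores()
    print("Logical CPUs:", cores)
    if cores == 1:
        print("Tasks can only run concurrently (by taking turns)")
    else:
        print("Up to", cores, "tasks can run in parallel")
```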

3. Summary

  • Using multitasking can make full use of CPU resources, improve the execution efficiency of the program, and let your program handle multiple tasks.
  • There are two ways to execute multiple tasks: concurrency and parallelism. Only parallelism truly executes multiple tasks at the same instant.

For a detailed explanation, refer to a textbook on operating systems.

2, Introduction to process

1. Introduction to the process

In a Python program, processes are one way to achieve multitasking.

2. Concept of process

A running program is a process. The process is the basic unit by which the operating system allocates resources: every time a process is started, the operating system allocates the resources (such as memory) that the process needs in order to run.

Note: a running program has at least one process, and a process has one thread by default. More threads can be created inside a process. Threads belong to a process; without a process there can be no threads.

Multiple processes can complete multiple tasks. Each process is like an independent company: each company operates on its own, and each process executes its own tasks.
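The points above can be observed from inside a running program. The following is a minimal sketch using only the standard library; describe_current_process is a hypothetical helper name, not something from the original notes.

```python
import os
import threading


def describe_current_process():
    """Collect basic facts about the process running this code.

    describe_current_process is a hypothetical helper used only in this sketch.
    """
    return {
        "pid": os.getpid(),                           # process number assigned by the OS
        "thread_count": threading.active_count(),     # threads currently alive in this process
        "main_thread": threading.main_thread().name,  # the thread every process starts with
    }


if __name__ == '__main__':
    print(describe_current_process())
```

In a plain script that has not created any threads of its own, the thread count is normally 1, which matches the statement that a process has one thread by default.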

3. Summary

  • A process is the basic unit of resource allocation in the operating system
  • Processes are one way to implement multitasking in Python programs

3, Use of multiple processes

1. Import the multiprocessing package

import multiprocessing

2. Parameters of the Process class

  • group: specifies the process group. Currently, only None can be used
  • target: the function (callable) to be executed as the task
  • name: the process name
  • args: positional arguments passed to the task as a tuple
  • kwargs: keyword arguments passed to the task as a dictionary
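As a rough illustration of these parameters, the sketch below builds a Process with every one of them set. The task greet and the helper build_process are made-up names for this example.

```python
import multiprocessing


def greet(name, greeting="Hello"):
    # Illustrative task (hypothetical, not from the original notes)
    print(greeting, name)


def build_process():
    """Create (but do not start) a Process using every constructor parameter."""
    return multiprocessing.Process(
        group=None,                 # must be None
        target=greet,               # the function the child process will run
        name="greeter",             # an explicit process name
        args=("Reese",),            # positional arguments; note the trailing comma
        kwargs={"greeting": "Hi"},  # keyword arguments
    )


if __name__ == '__main__':
    process = build_process()
    print(process.name)        # the name given above
    print(process.is_alive())  # False, because start() has not been called yet
```

Creating the object does not create a child process; that only happens when start() is called.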

3. Common methods of Process instances:

  • start(): start the child process instance (create the child process)
  • join(): wait for the child process to finish
  • terminate(): terminate the child process immediately, whether or not its task has finished

4. Common attributes of Process instances:

  • name: the name of the process. The default is Process-N, where N is an integer that increases from 1
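The methods and the name attribute listed above can be tried out with a small sketch. The task and helper names (short_task, long_task, run_and_wait, start_and_terminate) are hypothetical; exitcode and is_alive() are standard Process members used here only to observe the result.

```python
import multiprocessing
import time


def short_task():
    # Illustrative task that finishes quickly
    time.sleep(0.1)


def long_task():
    # Illustrative task that would run for a long time if not stopped
    time.sleep(60)


def run_and_wait():
    """Start a child, wait for it with join(), and return its exit code."""
    process = multiprocessing.Process(target=short_task)
    process.start()
    process.join()           # blocks until the child has finished
    return process.exitcode  # 0 means the task returned normally


def start_and_terminate():
    """Start a long-running child and stop it early with terminate()."""
    process = multiprocessing.Process(target=long_task)
    process.start()
    process.terminate()      # stop the child without waiting for its task
    process.join()           # wait until the child has actually stopped
    return process.is_alive(), process.exitcode


if __name__ == '__main__':
    # The default name has the form Process-N
    print(multiprocessing.Process(target=short_task).name)
    print(run_and_wait())
    print(start_and_terminate())
```

A terminated child reports a non-zero exit code, which is one way to tell it apart from a child that finished its task normally.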

4, Multi process multitasking code

import multiprocessing
import time

# Dance task
def dance():
    for i in range(3):
        print('Dancing')
        time.sleep(0.2)

# Singing task
def sing():
    for i in range(3):
        print('Singing')
        time.sleep(0.2)


# Create a child process (a process you create yourself is called a child process;
# the Process class is imported in the package's __init__.py file)

# 1. group: process group. At present only None can be used, so it usually does not need to be set
# 2. target: the target task executed by the process
# 3. name: process name. If not set, the default is Process-N (Process-1 for the first child)

# The main-module check is needed on platforms that start a child by importing this module again (for example Windows)
if __name__ == '__main__':
    dance_process = multiprocessing.Process(target=dance)

    # Start the child process to execute the corresponding task
    dance_process.start()

    # The main process performs the singing task
    sing()

# The order of execution between processes is not fixed. Which process runs first is decided by the operating system scheduler,
# so the output may differ from run to run

5, Get process number

1. Purpose of obtaining process number

The purpose of obtaining the process number is to verify the relationship between the main process and its child processes: from it you can tell which main process created a given child process.

There are two operations for getting process numbers:

  • Get the current process number
  • Get the parent process number of the current process

2. Get the current process number

os.getpid() returns the current process number, and os.getppid() returns the process number of its parent.

Example code:

import multiprocessing
import time
import os

# Dance task
def dance():

    # Get the current process number
    dance_processid = os.getpid()
    print(dance_processid, multiprocessing.current_process())

    # Get the parent process number
    dance_processid_parent = os.getppid()
    print("Parent process id:", dance_processid_parent)

    for i in range(3):
        print('Dancing')
        time.sleep(0.2)

        # Kill the process by its process number (signal 9).
        # Because this call is inside the loop, the child stops after the first iteration
        os.kill(dance_processid, 9)

# Singing task
def sing():
    sing_processid = os.getpid()
    print(sing_processid, multiprocessing.current_process())

    for i in range(3):
        print('Singing')
        time.sleep(0.2)


# Create the child processes (a process you create yourself is called a child process)

# 1. group: process group. At present only None can be used, so it usually does not need to be set
# 2. target: the target task executed by the process
# 3. name: process name. If not set, the default is Process-N

if __name__ == '__main__':
    # Get the process number of the current (main) process
    main_processid = os.getpid()

    # Get the current process object to see which process is executing this code
    print(main_processid, multiprocessing.current_process())

    dance_process = multiprocessing.Process(target=dance)
    print(dance_process)
    sing_process = multiprocessing.Process(target=sing)
    print(sing_process)

    # Start the child processes to execute their tasks
    dance_process.start()
    sing_process.start()

6, The process executes a task with parameters

1. Introduction to the process executing tasks with parameters

Above, the processes executed tasks that take no parameters. Suppose we now want a process to execute a task that takes parameters. How do we pass the parameters to the function?

The Process class offers two ways to pass parameters to the task it executes:

  • args: pass parameters to the task as a tuple (positional arguments)
  • kwargs: pass parameters to the task as a dictionary (keyword arguments)

import multiprocessing

# Task that displays information
def show_info(name, age):
    print(name, age)


if __name__ == '__main__':
    # Create the child processes
    # With a tuple, the order of the elements must match the order of the function's parameters
    sub_process = multiprocessing.Process(target=show_info, args=("Reese", 21))

    # With a dictionary, the order does not matter, but the keys must match the parameter names
    sub_process1 = multiprocessing.Process(target=show_info, kwargs={"age": 20, "name": 'xxf'})

    # Start the processes
    sub_process.start()
    sub_process1.start()
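The two ways of passing parameters can also be combined. The sketch below is only an illustration: run_with is a hypothetical helper, and it reads the child's exitcode to show whether the call inside the child succeeded.

```python
import multiprocessing


def show_info(name, age):
    print(name, age)


def run_with(args=(), kwargs=None):
    """Run show_info in a child process and return the child's exit code.

    run_with is a hypothetical helper used only in this sketch.
    """
    process = multiprocessing.Process(target=show_info, args=args, kwargs=kwargs or {})
    process.start()
    process.join()
    return process.exitcode


if __name__ == '__main__':
    # Positional and keyword arguments can be combined
    print(run_with(args=("Reese",), kwargs={"age": 21}))
    # A missing argument raises TypeError inside the child,
    # which then exits with a non-zero code
    print(run_with(args=("Reese",)))
```

Note the trailing comma in ("Reese",): without it the parentheses do not create a tuple.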

7, Process considerations

1. Overview of process considerations

  • Global variables are not shared between processes
  • The main process waits for all of its child processes to finish before it exits

2. Global variables are not shared between processes

Creating a child process effectively gives it its own copy of the main process's resources, so a child process works on a copy of the main process's data. Therefore, changing a global variable in one process does not affect the same variable in any other process.

import multiprocessing
import time

# Define a list of global variables
g_list = list()

# Task of adding data
def add_data():
    for i in range(3):
        # Because a list is a mutable type, it can be modified in place, and its memory address stays the same after modification,
        # so the global keyword is not needed here.
        # global is only required when the function assigns a new object to the global name

        g_list.append(i)
        print("add:",i)
        time.sleep(0.2)

# Task of reading data
def read_data():
    print("read:",g_list)


# With the "fork" start method (traditionally the default on Linux), the child is a copy of the running main process
# and does not execute this module's top-level code again.
# With the "spawn" start method (the only one on Windows, and the default on macOS since Python 3.8), the child imports
# this module again, so its top-level code is re-executed. If the code that creates and starts child processes were at
# the top level, every child would try to create children of its own, and Python reports an error.

# Solution: check whether this is the main module, so that the code below is not executed when a child process imports the module
if __name__ == '__main__':
    # Child process for adding data
    add_process = multiprocessing.Process(target=add_data)

    # Child process for reading data
    read_process = multiprocessing.Process(target=read_data)

    # Start the process
    add_process.start()

    # The main process waits here until the data-adding process has finished, then continues
    add_process.join()

    read_process.start()
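The example above reads the list from a second child process. The same behaviour can be checked from the main process's side, as in this minimal sketch (the helper name parent_list_after_child is made up here):

```python
import multiprocessing

g_list = []


def add_data():
    # This modifies the child's own copy of g_list
    for i in range(3):
        g_list.append(i)
    print("child sees:", g_list)


def parent_list_after_child():
    """Run add_data in a child process, then return the parent's g_list.

    parent_list_after_child is a hypothetical helper used only in this sketch.
    """
    process = multiprocessing.Process(target=add_data)
    process.start()
    process.join()
    return g_list


if __name__ == '__main__':
    print("parent sees:", parent_list_after_child())
```

The child prints a list containing 0, 1 and 2, while the list in the main process is still empty after the child has finished.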

Child processes are started differently on different platforms. With the fork start method (traditionally the default on Linux), the child is a copy of the running main process and does not execute the module's top-level code again. With the spawn start method (the only one available on Windows, and the default on macOS since Python 3.8), the child imports the main module again, so its top-level code is re-executed. If that top-level code creates and starts a child process, every child would try to create another child, and Python reports an error.

Solution: check whether the module is the main module, and only create child processes inside that check.

About the main module: the module that is executed directly is the main module (the program's entry module), so the main-module check belongs in the module you run directly. It serves two purposes:

  • It prevents the code under the check from running when someone else imports the file
  • It prevents child processes from being created recursively on Windows

Add the check if __name__ == '__main__': to the module you want to run
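To see which start method applies on a given machine, the standard multiprocessing.get_start_method() function can be used. The following is a small sketch under that assumption; task and current_start_method are illustrative names.

```python
import multiprocessing


def task():
    # Illustrative task that reports which process is running it
    print("running in", multiprocessing.current_process().name)


def current_start_method():
    """Return the start method multiprocessing uses on this platform."""
    return multiprocessing.get_start_method()


if __name__ == '__main__':
    # Only the directly executed module reaches this block;
    # a child that imports the module again skips it
    print("start method:", current_start_method())
    process = multiprocessing.Process(target=task)
    process.start()
    process.join()
```

Because the code that starts the child sits under the main-module check, the script behaves the same whichever start method is in use.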

3. The main process will wait until the execution of all sub processes is completed

import multiprocessing
import time

def task():
    for i in range(10):
        print("Task execution...")
        time.sleep(0.2)

# Check whether this is the directly executed module (the program's entry module).
# Standard Python practice is to put this check in the module that is run directly

if __name__ == '__main__':
    # Create the child process
    sub_process = multiprocessing.Process(target=task)

    # Option 1: make the child a daemon of the main process, so that when the main process exits the child is destroyed
    sub_process.daemon = True

    sub_process.start()

    # The main process waits 0.5s
    time.sleep(0.5)

    # Option 2: terminate the child process explicitly before the main process exits
    sub_process.terminate()
    print("over")



# Conclusion: by default, the main process waits for its child processes to finish before it exits

How to make the main process exit without waiting:

1. Set the child process as a daemon of the main process (daemon = True). When the main process exits, the child process is destroyed, so the child depends on the main process
2. Terminate the child process (terminate()) before the main process exits
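The example above applies both options to the same child. As a rough sketch, the two can also be shown separately; make_daemon_child and terminate_before_exit are hypothetical helper names.

```python
import multiprocessing
import time


def task():
    for _ in range(10):
        print("Task execution...")
        time.sleep(0.2)


def make_daemon_child():
    """Create a child that is destroyed automatically when the main process exits."""
    process = multiprocessing.Process(target=task)
    process.daemon = True  # must be set before start()
    return process


def terminate_before_exit(process, delay=0.5):
    """Let the child run for `delay` seconds, then stop it explicitly."""
    process.start()
    time.sleep(delay)
    process.terminate()
    process.join()         # wait until the child has really stopped
    return process.is_alive()


if __name__ == '__main__':
    # Option 1: a daemon child does not keep the main process waiting
    daemon_child = make_daemon_child()
    daemon_child.start()

    # Option 2: terminate a normal child before the main process exits
    normal_child = multiprocessing.Process(target=task)
    print(terminate_before_exit(normal_child))
    print("over")
```

Either option on its own is enough to stop the main process from waiting for the full task.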
