Concurrent programming
Two methods of subprocess recycling
join() lets the main process wait for a child process to finish and then reclaim that child's resources. Once all children are reaped, the main process ends and its own resources are reclaimed.
```python
from multiprocessing import Process
import time

def task(name):
    print(f'Child process {name}: starting......')
    time.sleep(1)
    print(f'Child process {name}: ending......')

if __name__ == '__main__':
    print('Enter the main process')
    pro_list = []
    for i in range(3):
        pro_obj = Process(target=task, args=(i,))
        pro_list.append(pro_obj)
        pro_obj.start()
    for pro in pro_list:
        # The main process waits here until each child finishes,
        # then reclaims that child's resources
        pro.join()
    print('End main process')
```
The main process then ends normally, and its resources are reclaimed along with those of the already-finished child processes.
Background: zombie and orphan processes
Zombie process: the child process has ended, but the main process has not yet reaped it (for example, has not called join()), so the child's PID is not recycled.
Drawback: PIDs are a limited resource in the operating system; as long as zombies occupy PID numbers, creating new processes may fail.
Orphan process: the child process is still running when the main process ends abnormally; the child's PID is not reclaimed by its parent, but the child is adopted and eventually reclaimed by the operating system's cleanup mechanism.
Operating-system cleanup mechanism: when the main process terminates unexpectedly, the operating system checks whether it left any running child processes; if so, they are handed to this mechanism (on Linux, re-parented to the init process) to be reclaimed.
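The effect of reaping can be seen in a minimal sketch (assuming a Unix-like system; `exitcode` is the child's recorded exit status, which stays accessible until the parent reaps the child):

```python
from multiprocessing import Process
import time

def task():
    pass  # the child exits almost immediately

if __name__ == '__main__':
    p = Process(target=task)
    p.start()
    time.sleep(0.5)    # the child has already exited, but has not been reaped
    p.join()           # reap the child: its PID/exit status are reclaimed
    print(p.exitcode)  # 0 means the child exited normally
```

If the main process never calls join() (and never exits), the finished child lingers in the zombie state described above.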
Daemon process
A daemon process is killed as soon as the main process ends. The daemon attribute must be set before the child process is started.
```python
from multiprocessing import Process
import time

# Process task
def task():
    print('starting......')
    time.sleep(2)
    print('ending......')

if __name__ == '__main__':
    print('Enter the main process')
    obj_list = []
    for i in range(2):
        # Create process
        pro_obj = Process(target=task)
        obj_list.append(pro_obj)
        # Mark it as a daemon: must be set before start()
        pro_obj.daemon = True
        pro_obj.start()
    # join() waits for the daemons here; remove these joins to see the
    # daemons killed as soon as the main process exits
    for obj in obj_list:
        obj.join()
    print('End of main process')
```
Data is isolated between processes; a code demonstration:
```python
from multiprocessing import Process

count = 0

def func1():
    global count
    count += 10  # modifies the copy of count in subprocess 1's own memory
    print(f'func1:{count}')

def func2(count):
    count += 100  # modifies a local copy passed to subprocess 2
    print(f'func2:{count}')

if __name__ == '__main__':
    # Create subprocess 1
    pro_obj1 = Process(target=func1)
    # Create subprocess 2
    pro_obj2 = Process(target=func2, args=(count,))
    # Start subprocess 1
    pro_obj1.start()
    pro_obj1.join()
    # Start subprocess 2
    pro_obj2.start()
    pro_obj2.join()
    # The main process's count is unchanged: each process has its own memory
    print(f'Main process:{count}')
```
Output result
```
func1:10
func2:100
Main process:0
```
Thread
Reference: https://blog.csdn.net/daaikuaichuan/article/details/82951084
Definitions of process and thread
Process: the operating system allocates system resources (CPU time slices, memory, etc.) on a per-process basis; the process is the smallest unit of resource allocation.
Thread: sometimes called a lightweight process; the smallest unit of execution that the operating system schedules (CPU scheduling).
Differences between processes and threads
- Scheduling: the thread is the basic unit of scheduling and dispatch, while the process is the basic unit of resource ownership
- Concurrency: processes can execute concurrently, and so can multiple threads within the same process
- Resource ownership: a process is an independent unit that owns resources; a thread owns no system resources of its own, but can access the resources of its process
- Overhead: the system allocates and reclaims resources when a process is created or destroyed; threads are just different execution paths within one process. If a process crashes, all of its threads die with it. Multi-process programs are therefore more robust than multi-threaded ones, but they consume more resources, and switching between processes is more expensive
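The overhead point can be illustrated with a rough timing sketch (the helper `timed` and the iteration count are illustrative; absolute numbers vary widely by machine, so treat this as a demonstration rather than a benchmark):

```python
from multiprocessing import Process
from threading import Thread
import time

def task():
    pass  # a worker that does nothing, so we time only creation/teardown

def timed(factory, n=20):
    """Start and join n workers built by factory, returning elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n):
        w = factory(target=task)
        w.start()
        w.join()
    return time.perf_counter() - start

if __name__ == '__main__':
    proc_time = timed(Process)
    thread_time = timed(Thread)
    # Creating and destroying a process is far more expensive than a thread
    print(f'processes: {proc_time:.4f}s  threads: {thread_time:.4f}s')
```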
How processes and threads relate
- A thread belongs to exactly one process; a process can have multiple threads, and always has at least one
- Resources are allocated to the process; all threads of the same process share all of its resources
- What actually runs on the processor is the thread
- Threads in different processes synchronize by means of message passing
Creating threads
```python
# How to create a sub-thread
# 1. Import the Thread class from the threading module
from threading import Thread
import time

number = 1000

def task():
    global number
    number = 200  # threads share the module-level variable
    print('Sub-thread start')
    time.sleep(1)
    print('Sub-thread end')

if __name__ == '__main__':
    # 2. Create a sub-thread object
    thread_obj = Thread(target=task)
    # 3. Start the sub-thread
    thread_obj.start()
    # 4. Wait for the sub-thread to end before the main thread continues
    thread_obj.join()
    print('Main process (main thread)')
    print(number)  # Output: 200
```
```python
# Creating a sub-thread by subclassing
# 1. Import the Thread class from the threading module
from threading import Thread
import time

number = 1000

# 2. Inherit from Thread and override run()
class MyThread(Thread):
    def run(self):
        global number
        number = 200
        print('Sub-thread start')
        time.sleep(1)
        print('Sub-thread end')

if __name__ == '__main__':
    # Create a child thread object
    t = MyThread()
    # Start the child thread
    t.start()
    t.join()
    print('Main process (main thread)')
    print(number)  # Output: 200
```
Daemon thread: set the `daemon` attribute of the sub-thread object to True before starting it, that is:
```python
from threading import Thread
import time

number = 1000

def task():
    global number
    number = 200
    print('Sub-thread start')
    time.sleep(1)
    print('Sub-thread end')

if __name__ == '__main__':
    # 2. Create a sub-thread object
    thread_obj = Thread(target=task)
    # 3. Mark it as a daemon thread (must be set before start())
    thread_obj.daemon = True
    # 4. Start the daemon thread
    thread_obj.start()
    # 5. join() makes the main thread wait; without it, the daemon thread
    # would be killed as soon as the main thread exits
    thread_obj.join()
    print('Main process (main thread)')
    print(number)
```
Queue
A queue is a go-between channel: it stores data and passes it between processes (i.e., inter-process data exchange). First in, first out (FIFO).
A queue object can be created in three ways:
- from multiprocessing import Queue
- from multiprocessing import JoinableQueue
- import queue (Python's built-in queue module, for threads within one process)
Putting data into a queue
- put(obj): store data. If the number of stored items exceeds the queue's capacity, the process blocks
- put_nowait(obj): store data. If the number of stored items exceeds the queue's capacity, an exception is raised
Getting data from a queue
- get(): fetch data. Once the queue is empty, a further fetch makes the process block
- get_nowait(): fetch data. Once the queue is empty, a further fetch raises an exception
Usage
```python
from multiprocessing import JoinableQueue
# from multiprocessing import Queue
# import queue
from multiprocessing import Process

# Store data in the queue
def task_put(queue):
    number_list = [10, 20, 30, 40]
    for i in number_list:
        # put() blocks once the queue is full
        queue.put(i)
        print(f'Stored record: {i}')
        # put_nowait() raises an exception once the queue is full
        # queue.put_nowait(i)
        # print(f'Stored record: {i}')
    queue.put(1000)
    print(f'Stored record: {1000}')
    # queue.put_nowait(1000)
    # print(f'Stored record: {1000}')

# Fetch data from the queue
def task_get(queue):
    for i in range(5):
        # get() blocks once the queue is empty
        print(f'Fetched record {i + 1}: {queue.get()}')
        # get_nowait() raises an exception once the queue is empty
        # print(f'Fetched record {i + 1}: {queue.get_nowait()}')

if __name__ == '__main__':
    # Create a queue object via from multiprocessing import JoinableQueue
    queue_obj = JoinableQueue(3)  # int argument: maximum number of items held
    # Via from multiprocessing import Queue
    # queue_obj = Queue(4)
    # Via import queue (threads only, not processes)
    # queue_obj = queue.Queue(4)

    # Process 1 stores data; process 2 fetches data.
    # Both must be started before joining: the queue holds only 3 items,
    # so the producer's put() would block forever without a running consumer
    pro_obj1 = Process(target=task_put, args=(queue_obj,))
    pro_obj1.start()
    pro_obj2 = Process(target=task_get, args=(queue_obj,))
    pro_obj2.start()
    pro_obj1.join()
    pro_obj2.join()
```
Review:
Implementing a FIFO queue with a list and with an ordered dictionary
```python
# Queue implemented with a list
# Define an empty list as the queue
queue = []
# Insert elements at the head of the list
queue.insert(0, 1)
queue.insert(0, 2)
queue.insert(0, "hello")
print(queue)
# pop() removes from the tail, so elements come out in insertion order (FIFO)
for index in range(len(queue)):
    print(f"Element {index + 1}:", queue.pop())
```
```python
# Queue via an ordered dictionary, mode 1
from collections import OrderedDict

# Insert elements into an ordered dictionary
ordered_dict = OrderedDict()
ordered_dict[1] = 1
ordered_dict[2] = 2
ordered_dict[3] = 'hello'
# move_to_end() reverses the stored order to 3, 2, 1 ...
ordered_dict.move_to_end(2)
ordered_dict.move_to_end(1)
print(ordered_dict)
# ... but popping by key still yields the elements in insertion order
for index in range(3):
    print(ordered_dict.pop(index + 1))

# Mode 2: queue via an ordered dictionary
from collections import OrderedDict

ordered_dict = OrderedDict()
ordered_dict['1'] = 1
ordered_dict['2'] = 2
ordered_dict['3'] = 'hello'
ordered_dict.move_to_end('2')
ordered_dict.move_to_end('1')
print(ordered_dict)
# Restore insertion order before iterating
ordered_dict.move_to_end('1')
ordered_dict.move_to_end('2')
ordered_dict.move_to_end('3')
index = 1
for key in ordered_dict:
    print(f'Element {index}: {key}')
    index += 1
```
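For comparison, the standard library's `collections.deque` is the idiomatic in-memory FIFO; it avoids both the O(n) cost of `list.insert(0, ...)` and the bookkeeping of the ordered-dictionary versions:

```python
from collections import deque

queue = deque()
# Enqueue at the right end
queue.append(1)
queue.append(2)
queue.append('hello')
# Dequeue from the left end: first in, first out
while queue:
    print(queue.popleft())  # prints 1, then 2, then hello
```

Note that deque (like list and OrderedDict) is only a single-process data structure; it does not provide the cross-process transfer that multiprocessing's queues do.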
IPC mechanism
IPC (inter-process communication)
Communication between processes can be achieved through queues; for details, see the queue examples above.
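A minimal sketch of IPC through a queue (the function name `producer` and the message text are illustrative):

```python
from multiprocessing import Process, Queue

def producer(q):
    # The child process puts a message into the shared queue
    q.put('hello from the child process')

if __name__ == '__main__':
    q = Queue()
    p = Process(target=producer, args=(q,))
    p.start()
    # The main process receives the message: data crossed process boundaries
    print(q.get())
    p.join()
```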
Mutex
Mutual exclusion: several code sections scattered across different tasks; while one task is running one of these sections, no other task may run any of them until the first task has left its section. The most basic scenario: a shared resource can be used by only one process or thread at a time; multiple processes or threads must not use it simultaneously.
Mutex: a simple locking mechanism that controls access to a shared resource. A mutex has only two states: locked (`lock.acquire()`) and unlocked (`lock.release()`).
Effect: it turns concurrent execution into serial execution, sacrificing efficiency to guarantee data safety.
Features: atomicity, uniqueness, non busy waiting
- Atomicity: if a process / thread locks a mutex, no other process / thread can successfully lock the mutex at the same time
- Uniqueness: if a process / thread locks a mutex, no other process / thread can lock the mutex before it is unlocked
- Non-busy wait: if one process/thread has locked a mutex and a second tries to lock it, the second is suspended (consuming no CPU resources) until the first unlocks; the second is then woken up, resumes execution, and locks the mutex
Mutex operation flow:
- Create a mutex object from the `Lock` class in the `multiprocessing` module (or `threading` for threads)
- Protect the critical section that accesses the shared resource
- Lock before access with `lock.acquire()` and unlock after access with `lock.release()`
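The flow above can be condensed into a skeleton; a `Lock` also works as a context manager, which guarantees the release even if the critical section raises (the `pass` placeholders stand in for real shared-resource access):

```python
from multiprocessing import Lock

lock = Lock()

# Explicit form: lock before the critical section, always unlock after it
lock.acquire()
try:
    pass  # ... access the shared resource ...
finally:
    lock.release()

# Equivalent context-manager form: release happens automatically
with lock:
    pass  # ... access the shared resource ...
```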
Process mutex: a small ticket-purchase example
```python
# Content of the data.json file: {"number": 1}
# Ticket-purchase example
from multiprocessing import Process  # process
from multiprocessing import Lock     # process mutex
import datetime
import json
import random
import time

# Check tickets
def check_ticket(name):
    with open('data.json', 'r', encoding='utf-8') as f:
        ticket_dic = json.load(f)
    print(f'[{datetime.datetime.now()}] user {name} checks tickets, '
          f'tickets remaining: {ticket_dic.get("number")}')

# Buy a ticket
def buy_ticket(name):
    # Get the current number of tickets
    with open('data.json', 'r', encoding='utf-8') as f:
        ticket_dic = json.load(f)
    number = ticket_dic.get('number')
    if number:
        number -= 1
        # Simulate the network delay of buying a ticket
        time.sleep(random.random())
        ticket_dic['number'] = number
        # Purchase succeeded
        with open('data.json', 'w', encoding='utf-8') as f:
            json.dump(ticket_dic, f)
        print(f'[{datetime.datetime.now()}] {name} grabbed a ticket!')
    else:
        # Purchase failed
        print(f'[{datetime.datetime.now()}] {name} failed to grab a ticket!')

def main(name, lock):
    # Checking tickets needs no lock: it only reads
    check_ticket(name)
    # Buy a ticket under the mutex
    lock.acquire()
    buy_ticket(name)
    lock.release()

if __name__ == '__main__':
    pro_list = []
    # Create the mutex object
    lock = Lock()
    # Create 10 processes
    for i in range(10):
        pro_obj = Process(target=main, args=(f'pro_obj{i + 1}', lock))
        pro_list.append(pro_obj)  # remember each process so it can be joined
        pro_obj.start()
    for pro in pro_list:
        pro.join()
```
Thread mutex example
""" //Open 10 threads to modify one data """ from threading import Lock from threading import Thread import time # Create thread mutex object lock = Lock() # Records to modify number = 100 # Thread task def task(): global number # Lock up # lock.acquire() # Modified value number2 = number time.sleep(1) number = number2 - 1 # Unlock # lock.release() if __name__ == '__main__': # Create 10 threads list1 = [] for line in range(10): t = Thread(target=task) t.start() list1.append(t) # The main thread can only end after the limit of child thread ends for t in list1: t.join() print(number) # With mutex, output: 90; without mutex, output: 99