18 Threads
18.1 Processes and Threads
Process: when you open a program, at least one process is created. A process is the basic unit of resource allocation by the operating system.
Thread: a thread is the basic unit of CPU scheduling, and each process has at least one thread.
Single thread: only one thread
```python
def funa():
    print(123)

def funb():
    print(456)

funa()  # funa executes first
funb()  # then funb executes
```
Multithreading
Thread module: threading
import threading
Parameters of the Thread class:
- target: the task (a callable) to execute
- args: the task's arguments, passed as a tuple
```python
import threading
import time

def funa():
    print(123)
    time.sleep(2)
    print("It's over")

def funb():
    print(456)
    time.sleep(3)
    print("It's over")

if __name__ == "__main__":
    # 1. Create the child threads; funa and funb are function objects
    t1 = threading.Thread(target=funa)
    t2 = threading.Thread(target=funb)
    # 2. Start the child threads
    t1.start()
    t2.start()
```
Passing parameters to the task
```python
from threading import Thread
import time

def funa(a):
    print("How are you,", a)
    time.sleep(2)
    print("I'm fine")

def funb(b):
    print(b)
    time.sleep(2)
    print("A little sweet")

if __name__ == "__main__":
    # The first way to pass parameters: args as a tuple
    f1 = Thread(target=funa, args=("Incoming parameter",))
    f2 = Thread(target=funb, args=("Incoming parameter 2",))
    f1.start()
    f2.start()

    # The second way to pass parameters: kwargs as a dictionary
    f3 = Thread(target=funa, kwargs={"a": "Parameter"})
    f4 = Thread(target=funb, kwargs={"b": "Parameter 2"})
    f3.start()
    f4.start()
```
18.2 Threads
Steps:
- Create a child thread with Thread()
- Start the child thread with start()
18.2.1 Daemon Threads and Blocking with join()
Daemon thread: when the main thread finishes executing, its daemon child threads are terminated immediately.
Blocking with join(): the main thread waits for the joined child thread to finish before it continues executing.
def funa(): print("start a") time.sleep(2) print("End a") def funb(): print("start b") time.sleep(2) print("End b") if __name__ == "__main__": t1 = threading.Thread(target=funa) t2 = threading.Thread(target=funb) #Open the daemon thread, the main thread finishes executing, and the sub-thread finishes t1.setDaemon(True) t2.setDaemon(True) t1.start() t2.start() # Blocks the main thread, suspends the action, and executes the main thread only when the join has finished executing t1.join() t2.join() t1.setName("Thread 1") t2.setName("Thread 2") # Get Thread Name t1.getName() t2.getName() print("This is the main thread, the last line of the program")
18.2.2 Unordered Execution Order of Threads
When two tasks are executed together, the execution order between the threads is not guaranteed.
```python
import threading
import time

def test():
    time.sleep(1)
    print("The current thread is", threading.current_thread())

if __name__ == "__main__":
    for i in range(5):
        # Create and start a child thread; the order in which they print is not guaranteed
        s1 = threading.Thread(target=test)
        s1.start()
```
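For contrast, a minimal sketch (illustrative only): calling join() right after each start() forces the threads to run one after another, so the output becomes ordered, but the concurrency is lost.

```python
import threading
import time

def test():
    time.sleep(0.1)
    print("The current thread is", threading.current_thread().name)

if __name__ == "__main__":
    for i in range(5):
        s1 = threading.Thread(target=test)
        s1.start()
        s1.join()  # Wait for this thread before starting the next one: ordered, but sequential
```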
18.2.3 Creating a Thread Subclass
Encapsulate the thread's code in a class:
- Inherit from the Thread class
- Override the run method
```python
from threading import Thread
import time

# Define a thread class by inheriting from Thread
class MyThread(Thread):
    # Override the run method; the name must be run, and it defines the thread's activity
    def run(self):
        print("Object-oriented")
        time.sleep(3)
        print("thread")

if __name__ == "__main__":
    my = MyThread()
    my.start()
```
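If the subclass needs parameters, a common pattern is to accept them in __init__, call super().__init__(), and store them on self so that run() can read them. A minimal sketch (the attribute name msg is only illustrative):

```python
from threading import Thread
import time

class MyThread(Thread):
    def __init__(self, msg):
        super().__init__()  # Initialize the base Thread class first
        self.msg = msg      # Store the parameter for use in run()

    def run(self):
        time.sleep(1)
        print("Received:", self.msg)

if __name__ == "__main__":
    t = MyThread("hello from a custom thread")
    t.start()
    t.join()
```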
18.2.4 Resource Sharing
```python
# Resource sharing: threads in the same process share global variables
from threading import Thread
import time

li = []

# Write data
def wdata():
    for i in range(5):
        li.append(i)
        time.sleep(0.2)
    print("Written data is:", li)

# Read data
def rdata():
    print("The data read is:", li)

if __name__ == "__main__":
    wd = Thread(target=wdata)
    rd = Thread(target=rdata)
    wd.start()
    wd.join()  # Wait for the writing to finish before executing the later code
    rd.start()
    print("This is the last line")
```
18.2.5 Resource Sharing Causes Resource Competition
The global variable a is a shared resource; the add and add2 threads compete for it, so the final results differ from run to run (and from the expected value).
```python
from threading import Thread

a = 0
n = 1000000

# Loop n times, adding 1 to the global variable a each time
def add():
    global a  # Declare that the global variable a is used
    for i in range(n):
        a += 1
    print("The first time:", a)

def add2():
    global a
    for i in range(n):
        a += 1
    print("The second time:", a)

if __name__ == "__main__":
    # Create two child threads
    first = Thread(target=add)
    second = Thread(target=add2)
    # Start the threads
    first.start()
    second.start()

# Sample results (they vary between runs):
# The first time: 1008170
# The second time: 1509617
```
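The totals come out wrong because a += 1 is not a single atomic step: the interpreter loads the current value of a, adds 1, and stores the result back, and a thread switch can happen between those steps. A small sketch using the standard dis module makes the separate bytecode steps visible (the exact opcode names depend on the Python version):

```python
import dis

a = 0

def add_one():
    global a
    a += 1

# The disassembly shows separate load / add / store instructions for "a += 1",
# which is why two threads can interleave and lose updates
dis.dis(add_one)
```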
18.2.6 How threads synchronize
- Thread waiting (join)
- Mutex (mutual exclusion lock)
The concept of synchronization:
Suppose there are two threads: thread A writes a value and thread B reads the value that A wrote. A must write before B reads, so there is a synchronization relationship between A and B.
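A minimal sketch of that write-before-read relationship using only join() (the function and variable names are illustrative):

```python
from threading import Thread

shared = []

def writer():
    shared.append("value written by thread A")

def reader():
    print("Thread B reads:", shared)

if __name__ == "__main__":
    a = Thread(target=writer)
    b = Thread(target=reader)
    a.start()
    a.join()   # Synchronization: B only starts after A has finished writing
    b.start()
    b.join()
```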
18.2.7 Mutex
A mutex ensures that multiple threads can access shared data without data errors by guaranteeing that only one thread operates on the data at a time.
The threading module provides Lock(); calling it returns a mutex object.
The role of mutexes
- Ensures that only one thread works on the shared data at a time, avoiding data errors.
- Using mutexes reduces code execution efficiency.
Deadlock can occur if mutexes are not used properly (see the sketch below).
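A minimal sketch of how a deadlock can arise (illustrative only; acquire timeouts are added so the demo does not hang forever): each thread takes one lock and then waits for the lock held by the other, producing a circular wait.

```python
from threading import Thread, Lock
import time

lock_a = Lock()
lock_b = Lock()

def task1():
    with lock_a:
        time.sleep(0.1)
        # task1 holds lock_a and now wants lock_b ...
        if not lock_b.acquire(timeout=1):
            print("task1 gave up: a real deadlock would hang here forever")
        else:
            lock_b.release()

def task2():
    with lock_b:
        time.sleep(0.1)
        # ... while task2 holds lock_b and wants lock_a: a circular wait
        if not lock_a.acquire(timeout=1):
            print("task2 gave up: a real deadlock would hang here forever")
        else:
            lock_a.release()

if __name__ == "__main__":
    t1 = Thread(target=task1)
    t2 = Thread(target=task2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
```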
- acquire(): lock
- release(): unlock
Locking and unlocking must always occur in pairs.
```python
from threading import Thread, Lock

a = 0
n = 1000000

# Create the mutex
lock = Lock()

# Loop n times, adding 1 to the global variable a each time
def add():
    global a
    lock.acquire()    # Lock
    for i in range(n):
        a += 1
    print("The first time:", a)
    lock.release()    # Unlock

def add2():
    global a
    lock.acquire()    # Lock
    for i in range(n):
        a += 1
    print("The second time:", a)
    lock.release()    # Unlock

if __name__ == "__main__":
    # Create two child threads
    first = Thread(target=add)
    second = Thread(target=add2)
    # Start the threads
    first.start()
    second.start()
```
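As a side note, a Lock also works as a context manager: with lock: acquires on entry and releases on exit, which keeps the pair balanced even if an exception is raised. A minimal sketch:

```python
from threading import Thread, Lock

counter = 0
lock = Lock()

def add(n):
    global counter
    with lock:                 # acquire() on entry, release() on exit, even on exceptions
        for _ in range(n):
            counter += 1

if __name__ == "__main__":
    threads = [Thread(target=add, args=(1000000,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("Final value:", counter)  # 2000000, because the increments are protected by the lock
```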
18.3 Processes
18.3.1 Process Introduction
A running program has at least one process, and a process has one thread by default.
Process: when a program runs, its code plus the resources it uses are collectively called a process; the process is the basic unit of resource allocation by the operating system.
Process states:
- Ready: everything is prepared; the process is only waiting for the CPU
- Running: the CPU is executing the process
- Waiting (blocked): the process is waiting for certain conditions to be met
```python
import time

print("We're learning")                    # Running
name = input("Please enter your name: ")   # Blocked, waiting for user input
print(name)                                # Running
time.sleep(2)                              # Blocked, sleeping for 2 seconds
print("Song to wine, geometry of life")    # Running
```
18.3.2 Process Creation
The multiprocessing module is a cross-platform multi-process module. It provides a Process class to represent a process object, which can be understood as a separate process that carries out a separate task.
```python
from multiprocessing import Process

# Process class parameters:
#   target: the callable object, i.e. the task to be executed by the child process
#   args:   positional arguments, passed as a tuple
#   kwargs: keyword arguments, passed as a dictionary
# Common methods:
#   start()     starts the child process
#   is_alive()  returns True if the child process is still running
#   join()      the main process waits for the child process to finish executing
# Common attributes:
#   name  the alias of the current process
#   pid   the process number of the current process
```
```python
import os
from multiprocessing import Process

def one():
    print("This is subprocess one")
    print(f"Child process id: {os.getpid()}, parent process id: {os.getppid()}")

def two():
    print("This is subprocess two")
    print(f"Child process id: {os.getpid()}, parent process id: {os.getppid()}")

if __name__ == "__main__":
    # Create the child processes
    p1 = Process(target=one, name="Process name 1")
    p2 = Process(target=two)
    # Start them
    p1.start()
    p2.start()
    print("The name of child process p1 is:", p1.name)
    print("The name of child process p2 is:", p2.name)
    # View the process numbers of the child processes
    print(p1.pid)
    print(p2.pid)
    print(f"Main process: {os.getpid()}, its parent process: {os.getppid()}")
    # In cmd, enter tasklist and find pycharm.exe to see its process number
```
is_alive() and join()
```python
from multiprocessing import Process

def speak(name):
    print(f"Now {name} is talking")

def listen(name2):
    print(f"{name2} is in class")

if __name__ == "__main__":
    p1 = Process(target=speak, args=('Nine Odes',))
    p2 = Process(target=listen, args=('Li Si',))
    p1.start()
    p1.join()   # Wait until p1 finishes executing before performing the following actions
    p2.start()
    print("The status of p1 is:", p1.is_alive())
    print("The status of p2 is:", p2.is_alive())
```
18.3.3 Process Communication
Global variables are not shared between processes.
```python
from multiprocessing import Process
import time

li = []

# Write data
def wdata():
    for i in range(5):
        li.append(i)
        time.sleep(0.2)
    print("Written data is:", li)

# Read data
def rdata():
    print("The data read is:", li)   # Prints [] because each process has its own copy of li

if __name__ == "__main__":
    p1 = Process(target=wdata)
    p2 = Process(target=rdata)
    p1.start()
    p1.join()
    p2.start()
```
Inter-process communication is used to transfer data between processes.
You can use the Queue class of the multiprocessing module to transfer data between processes; it is essentially a message queue.
q.put() puts data into the queue
q.get() takes data out of the queue
```python
from queue import Queue

# Initialize a queue object; 3 means it accepts at most three messages
q = Queue(3)

# put() puts data into the queue
q.put("I went for an infusion today; whatever the infusion, I thought of your night")
q.put("You don't know what's hurting")
q.put("Being touched by someone can be upsetting, but it can also be sweet")

# get() takes data out of the queue
print(q.get())
print(q.get())
print(q.get())

# q.empty()  returns True if the queue is empty, otherwise False
# q.qsize()  the number of messages currently in the queue
# q.full()   returns True if the queue is full
print("Current number of messages:", q.qsize())
```
```python
# Passing messages between processes through a queue
from multiprocessing import Process, Queue
import time

li = ["Steamed lamb", "Steamed bear paw", "Steamed flower duck"]

# Write data; q is the queue object
def wdata(q):
    for i in range(3):
        print(f"Putting breakfast {i} in")
        q.put(li[i])
        time.sleep(0.2)

# Read data; q is the queue object
def rdata(q):
    # Keep taking items out as long as there are messages
    while True:
        if q.empty():   # Stop once the queue is empty
            break
        else:
            print("The customer gets from the queue:", q.get())

if __name__ == "__main__":
    # Create the queue object; omitting the size argument means no size limit
    q = Queue()
    p1 = Process(target=wdata, args=(q,))
    p2 = Process(target=rdata, args=(q,))
    p1.start()
    p1.join()   # Wait for the writer to finish so the reader does not see an empty queue too early
    p2.start()
```
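Checking q.empty() only works reliably here because the reader starts after the writer has finished. A common alternative (a sketch, using None as an arbitrary sentinel value) lets both processes run at the same time: the writer puts a sentinel that tells the reader when to stop.

```python
from multiprocessing import Process, Queue
import time

def wdata(q):
    for item in ["Steamed lamb", "Steamed bear paw", "Steamed flower duck"]:
        q.put(item)
        time.sleep(0.2)
    q.put(None)   # Sentinel: tells the reader there is nothing more to come

def rdata(q):
    while True:
        item = q.get()      # Blocks until an item is available
        if item is None:    # Sentinel received: stop reading
            break
        print("The customer gets from the queue:", item)

if __name__ == "__main__":
    q = Queue()
    p1 = Process(target=wdata, args=(q,))
    p2 = Process(target=rdata, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
```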
18.3.4 Process Pool
Child processes are placed in a process pool.
The concept of a process pool:
Define a pool containing a fixed number of processes; when tasks arrive, they are handled by the processes in the pool.
When a task finishes, the process is not closed; it is returned to the pool and waits for the next task.
Methods:
- p.apply_async(): asynchronous and non-blocking; the system can switch between processes at any time without waiting for the current process to finish
- p.close(): closes the process pool so that no more tasks can be submitted
- p.join(): blocks the main process until all worker processes have exited; it can only be called after close()
```python
from multiprocessing import Pool
import time

def work(a):
    print("We are in class")
    time.sleep(2)
    return a * 3

if __name__ == "__main__":
    # Define a process pool with a maximum of 3 processes
    p = Pool(3)
    li = []
    for i in range(6):
        # p.apply_async(callable, args) submits the task asynchronously:
        # the main process does not wait for it, it just moves on to the next iteration
        res = p.apply_async(work, args=(i,))
        li.append(res)   # Save the result objects
    # Close the process pool: no more tasks can be submitted
    p.close()
    # Wait for all child processes in the pool to finish; must be called after close()
    p.join()
    # Use get() to retrieve the results of apply_async
    for r in li:
        print(r.get())
```
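When every task runs the same function over a sequence of inputs, Pool.map is a shorter equivalent: it blocks until all results are ready and returns them in order, and the pool can be used as a context manager so it is cleaned up automatically. A minimal sketch:

```python
from multiprocessing import Pool
import time

def work(a):
    time.sleep(1)
    return a * 3

if __name__ == "__main__":
    with Pool(3) as p:                    # The pool is cleaned up automatically on exit
        results = p.map(work, range(6))   # Blocks until every task has finished
    print(results)                        # [0, 3, 6, 9, 12, 15]
```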
18.4 Coroutines
18.4.1 Introduction
Coroutines are also known as micro-threads or fibers (English name: Coroutine).
Coroutines are another way to achieve multitasking in Python; they occupy smaller execution units (i.e. fewer resources) than threads. A coroutine counts as an execution unit because it keeps its own CPU context, which lets us switch from one coroutine to another at exactly the right moment.
As long as the CPU context is saved and restored during the switch, the program continues to run correctly.
Coroutines provide concurrency within a single thread, which is why they are also called micro-threads.
With coroutines the programmer is in full control: execution switches exactly where you tell it to.
Use cases:
- If a thread performs many I/O operations, coroutines work better
- Suitable for handling high concurrency
A simple coroutine implementation using generators
```python
def work1():
    while True:
        yield 'Brother in distress'

def work2():
    while True:
        yield 'The Internet is too deep for you to hold on to'

if __name__ == "__main__":
    w1 = work1()
    w2 = work2()
    # The programmer controls the execution order of w1 and w2 directly through code
    while True:
        print(next(w1))
        print(next(w2))
```
18.4.2 greenlet
greenlet is a module implemented in C that switches between functions by calling switch().
The switching is manual: when an I/O operation is encountered, the program blocks instead of switching automatically.
Install: pip install greenlet
Uninstall: pip uninstall greenlet
List installed modules: pip list
```python
from greenlet import greenlet

def eat():
    print("Start eating supper")
    g2.switch()        # Switch to g2; the line below never runs because g2 does not switch back
    print("I'm full")

def study():
    print("Start learning")
    print("Finish learning")

# Instantiate coroutine objects: greenlet(task)
g1 = greenlet(eat)
g2 = greenlet(study)
g1.switch()            # Switch to g1

# Output:
# Start eating supper
# Start learning
# Finish learning
```
18.4.3 gevent
greenlet switching is manual and cumbersome.
gevent switches automatically.
When gevent encounters an I/O operation, it actively switches to another coroutine.
gevent is built on top of greenlet.
pip install gevent
```python
import gevent

# Create a coroutine object: gevent.spawn(function_name, *args)
# join()     blocks until a single coroutine has finished executing
# joinall()  takes a list of coroutine objects and waits until all of them have finished
```
Execute tasks A and B. When A encounters a time-consuming operation, gevent lets B start executing instead of waiting for A.
While A waits on its time-consuming operation, B's time-consuming operation runs during the same period, so the waits overlap.
```python
# Note: do not give your .py file the same name as a third-party or built-in module
import gevent

def write():
    print("We're writing code")
    gevent.sleep(1)   # While waiting, the other coroutines (g1, g2) run concurrently
    print("Finished at last")

def listen():
    print("Now listen well")
    gevent.sleep(1)
    print("Break")

g1 = gevent.spawn(write)
g2 = gevent.spawn(listen)
g1.join()   # Wait for g1 to finish executing
g2.join()   # Wait for g2 to finish executing

# Output:
# We're writing code
# Now listen well
# Finished at last
# Break
```
joinall() waits for all the coroutine objects to finish executing before returning.
```python
import gevent

def work(name):
    for i in range(3):
        gevent.sleep(1)   # While waiting, the other coroutines run concurrently
        print(f"The function name is: {name}, the value of i is: {i}")

gevent.joinall([
    gevent.spawn(work, 'Small White'),
    gevent.spawn(work, 'Goose'),
])
```
18.4.4 Monkey Patching
A monkey patch modifies the program at runtime.
Monkey patch features:
- It can replace objects (such as functions) inside other modules at runtime
```python
from gevent import monkey
import gevent
import time

# patch_all() replaces blocking calls such as time.sleep() with gevent's own
# cooperative versions (e.g. gevent.sleep()); it must be executed at the top of the program
monkey.patch_all()

def work(name):
    for i in range(3):
        # The blocking call below is transparently replaced by gevent's implementation
        time.sleep(1)
        print(f"The function name is: {name}, the value of i is: {i}")

gevent.joinall([
    gevent.spawn(work, 'Small White'),
    gevent.spawn(work, 'Goose'),
])
```
18.4.5 Comprehensive example
```python
import gevent

def funa():
    print("wsc: Something came up today, I'm calling grandchild yn")   # 1
    gevent.sleep(2)
    print("wsc: Why did you suddenly hang up? Calling back...")        # 5

def funb():
    print("Sun yn: wsc is calling me now.")                            # 2
    gevent.sleep(3)
    print("Sun yn: He called again")                                   # 6

def func():
    print("What are you doing, dear?")                                 # 3
    gevent.sleep(1)
    print("You're here")                                               # 4

gevent.joinall([
    gevent.spawn(funa),
    gevent.spawn(funb),
    gevent.spawn(func)
])

# Output:
# wsc: Something came up today, I'm calling grandchild yn
# Sun yn: wsc is calling me now.
# What are you doing, dear?
# You're here
# wsc: Why did you suddenly hang up? Calling back...
# Sun yn: He called again
```
Summary
- A process is the basic unit of resource allocation; a thread is the basic unit of CPU scheduling
- Comparison:
  - Process: switching requires the most resources and is the least efficient
  - Thread: switching requires moderate resources and has moderate efficiency
  - Coroutine: switching requires the fewest resources and is the most efficient
- Multithreading: suitable for I/O-intensive operations (lots of reading and writing of data, e.g. crawlers)
- Multiprocessing: suitable for CPU-intensive operations (scientific computing, calculating pi, HD video decoding)
- A running program has at least one process and a process has at least one thread.