Python Foundation 10 - Threads, Processes, Coroutines

Posted by hbalagh on Mon, 24 Jan 2022 07:19:02 +0100

18 Threads, Processes and Coroutines

18.1 Processes and Threads

Process: opening a program starts at least one process. A process is the basic unit of resource allocation by the operating system.

Thread: a thread is the basic unit of CPU scheduling; every process contains at least one thread.

Single-threaded: only one thread runs the code

def funa():
    print(123)

def funb():
    print(456)
funa()
funb()

# funa executes first
# then funb executes

Multithreading

Thread module: threading

import threading

Parameters of the Thread class:

target: the function (callable) for the thread to execute

args: the arguments for the task, passed as a tuple

import threading
import time

def funa():
    print(123)
    time.sleep(2)
    print("It's over")

def funb():
    print(456)
    time.sleep(3)
    print("It's over")

if __name__ == "__main__":
    # 1. Create child threads
    t1 = threading.Thread(target=funa)  # funa is a function name
    t2 = threading.Thread(target=funb)

    # 2. Start the child threads
    t1.start()
    t2.start()

Execution with parameters

from threading import Thread
import time

def funa(a):
    print("How are you", a)
    time.sleep(2)
    print("I'm fine")

def funb(b):
    print(b)
    time.sleep(2)
    print("A little sweet")


if __name__ == "__main__":
    # The first way to pass in parameters: args as a tuple
    f1 = Thread(target=funa, args=("Incoming parameter",))
    f2 = Thread(target=funb, args=("Incoming parameter 2",))

    f1.start()
    f2.start()


    # The second way to pass in parameters: kwargs as a dictionary
    f1 = Thread(target=funa, kwargs={"a": "Parameter"})
    f2 = Thread(target=funb, kwargs={"b": "Parameter 2"})

    f1.start()
    f2.start()



18.2 Threads

Steps

  1. Create a child thread: Thread()
  2. Start the child thread: start()

18.2.1 Daemon threads and blocking (join)

Daemon thread: when the main thread finishes executing, daemon child threads are terminated immediately

Blocking (join): the main thread waits for the joined child thread to finish executing before it continues

import threading
import time

def funa():
    print("Start a")
    time.sleep(2)
    print("End a")

def funb():
    print("Start b")
    time.sleep(2)
    print("End b")

if __name__ == "__main__":
    t1 = threading.Thread(target=funa)
    t2 = threading.Thread(target=funb)

    # Mark as daemon threads: when the main thread finishes, the child threads are killed
    # (must be set before start())
    t1.setDaemon(True)
    t2.setDaemon(True)

    t1.start()
    t2.start()

    # join() blocks the main thread; it continues only after the joined thread has finished
    t1.join()
    t2.join()

    t1.setName("Thread 1")
    t2.setName("Thread 2")

    # Get the thread names
    print(t1.getName())
    print(t2.getName())

    print("This is the main thread, the last line of the program")

18.2.2 Threads execute in no fixed order

Multiple tasks run together, and the order of execution between threads is not deterministic

import threading
import time

def test():
    time.sleep(1)
    print("The current thread is", threading.current_thread())


if __name__ == "__main__":
    for i in range(5):
        # Create a child thread
        s1 = threading.Thread(target=test)
        s1.start()
    

18.2.3 Creating a Thread subclass

Encapsulate the thread's execution code in a class:

  1. Inherit from the Thread class
  2. Override the run() method

from threading import Thread
import time

# Define a thread class
class MyThread(Thread):

    # Override the run method; it must be named run and contains the thread's activity
    def run(self):
        print("Object-oriented")
        time.sleep(3)
        print("thread")

if __name__ == "__main__":
    my = MyThread()
    my.start()
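
If the task needs parameters, a common pattern (not shown in the original; the attribute name msg is just for illustration) is to accept them in __init__, call the parent constructor, and use them in run(). A minimal sketch:

from threading import Thread
import time

class MyThread(Thread):

    def __init__(self, msg):
        super().__init__()   # always call the parent constructor first
        self.msg = msg       # store the parameter for use in run()

    def run(self):
        print("Thread received:", self.msg)
        time.sleep(1)
        print("Done")

if __name__ == "__main__":
    my = MyThread("Object-oriented")
    my.start()
    my.join()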

18.2.4 Resource Sharing

# Resource sharing: threads in the same process share global variables
from threading import Thread  # Import the Thread class
import time

li = []

# Write data
def wdata():
    for i in range(5):
        li.append(i)
        time.sleep(0.2)
    print("Written data is:", li)

# Read data
def rdata():
    print("The data read is:", li)

if __name__ == "__main__":
    wd = Thread(target=wdata)
    rd = Thread(target=rdata)
    wd.start()

    wd.join()  # Wait for the writing to complete before running the later code
    rd.start()


    print("This is the last line")

18.2.5 Resource Sharing Causes Resource Competition

The global variable a is a shared resource; the add and add2 threads compete for it, so the results differ from run to run.

from threading import Thread

a = 0
n = 1000000

# Each function loops n times and adds 1 to the global variable a
def add():
    global a  # declare a as a global variable
    for i in range(n):
        a += 1
    print("The first time", a)

def add2():
    global a  # declare a as a global variable
    for i in range(n):
        a += 1
    print("The second time", a)


if __name__ == "__main__":
    # Create two child threads
    first = Thread(target=add)
    second = Thread(target=add2)

    # Start the threads
    first.start()
    second.start()

# Example run results (less than 2000000, and different every run):
# The first time 1008170
# The second time 1509617
    

18.2.6 How threads synchronize

  1. Thread waiting (join)

  2. Mutex (mutual exclusion lock)

The concept of synchronization:

There are two threads: thread A writes a value and thread B reads the value that A wrote. A must finish writing before B reads, so there is a synchronization relationship between A and B.
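
A minimal sketch of this write-before-read relationship using join(); the function and variable names are illustrative:

import threading

data = []

def thread_a():              # thread A writes
    data.append("value written by A")

def thread_b():              # thread B reads what A wrote
    print("B reads:", data)

if __name__ == "__main__":
    a = threading.Thread(target=thread_a)
    b = threading.Thread(target=thread_b)
    a.start()
    a.join()                 # wait for A to finish writing
    b.start()                # B only starts reading after the write has happened
    b.join()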

18.2.7 Mutex

A mutex ensures that multiple threads can access shared data without errors: only one thread is allowed to operate on the data at a time.

The threading module provides the Lock class; calling Lock() creates a mutex.

The role of mutexes:

  1. Guarantees that only one thread operates on the shared data at a time, avoiding data errors.
  2. Using mutexes reduces code execution efficiency.

Deadlock can occur if a mutex is not used properly.

acquire() locks

release() unlocks

acquire() and release() must always appear in pairs

from threading import Thread, Lock

a = 0
n = 1000000
# Each function loops n times and adds 1 to the global variable a

# Create a mutex
lock = Lock()

def add():
    lock.acquire()  # lock
    global a  # declare a as a global variable
    for i in range(n):
        a += 1
    print("The first time", a)
    lock.release()  # unlock

def add2():
    lock.acquire()  # lock
    global a  # declare a as a global variable
    for i in range(n):
        a += 1
    print("The second time", a)
    lock.release()  # unlock


if __name__ == "__main__":
    # Create two child threads
    first = Thread(target=add)
    second = Thread(target=add2)

    # Start the threads
    first.start()
    second.start()
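
Because acquire() and release() must always appear in pairs, the lock can also be used as a context manager: with lock: acquires on entry and releases automatically, even if an exception occurs. A minimal sketch of the same counter example written this way:

from threading import Thread, Lock

a = 0
n = 1000000
lock = Lock()

def add():
    global a
    for i in range(n):
        with lock:           # acquires the lock and releases it automatically
            a += 1

if __name__ == "__main__":
    threads = [Thread(target=add) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("Final value:", a)  # 2000000, because only one thread increments at a time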


18.3 Process

18.3.1 Process Introduction

Running a program creates a process, and a process has one thread by default.

Process: when a program is running, the code together with the resources it uses is called a process; it is the basic unit of resource allocation by the operating system.

States of a process:

  1. Ready: everything is prepared, the process is only waiting for the CPU
  2. Running: the CPU is executing the process
  3. Waiting (blocked): the process is waiting for some condition to be met

import time
print("We're learning")
name = input("Please enter your name")  # waiting for user input: blocked
print(name)                             # running
time.sleep(2)                           # sleeping for 2 seconds: blocked
print("Song to wine, geometry of life") # running

18.3.2 Process Creation

The multiprocessing module is Python's cross-platform multi-process module. It provides a Process class to represent a process object; each Process object can be understood as a separate process that performs its own task.

from multiprocessing import Process

# Process class parameters:
#   target: the callable object, i.e. the task for the child process to execute
#   args:   positional arguments, passed as a tuple
#   kwargs: keyword arguments, passed as a dictionary


# Common methods:
#   start()    starts the child process
#   is_alive() returns True if the child process is still running
#   join()     the main process waits for the child process to finish executing

# Common attributes:
#   name  the alias of the current process
#   pid   the process ID of the current process


import os
from multiprocessing import Process

def one():
    print("This is child process one")
    print(f"Child process id {os.getpid()}, parent process id {os.getppid()}")

def two():
    print("This is child process two")
    print(f"Child process id {os.getpid()}, parent process id {os.getppid()}")

if __name__ == "__main__":
    # Create child processes
    p1 = Process(target=one, name="Process Name 1")
    p2 = Process(target=two)

    # Start them
    p1.start()
    p2.start()

    print("The name of child process p1 is:", p1.name)
    print("The name of child process p2 is:", p2.name)

    # View the process numbers of the child processes
    print(p1.pid)
    print(p2.pid)

    print(f"Main process {os.getpid()}, parent process: {os.getppid()}")
    # In cmd, enter tasklist and find pycharm.ext to see the process number

is_alive() and join()

from multiprocessing import Process

def speak(name):
    print(f"{name} is talking now")

def listen(name2):
    print(f"{name2} is in class")

if __name__ == "__main__":

    p1 = Process(target=speak, args=('Nine Odes',))
    p2 = Process(target=listen, args=('Li Si',))

    p1.start()
    p1.join()  # wait until p1 finishes executing before performing the following actions
    p2.start()


    print("The status of p1 is:", p1.is_alive())
    print("The status of p2 is:", p2.is_alive())

18.3.3 Process Communication

Global variables are not shared between processes

from multiprocessing import Process
import time

li = []

# Write data
def wdata():
    for i in range(5):
        li.append(i)
        time.sleep(0.2)
    print("Written data is:", li)

# Read data
def rdata():
    print("The data read is:", li)  # prints an empty list: the child processes do not share li

if __name__ == "__main__":
    p1 = Process(target=wdata)
    p2 = Process(target=rdata)

    p1.start()
    p1.join()
    p2.start()

Inter-process communication is used to transfer data between processes.

You can use the Queue class of the multiprocessing module to transfer data between multiple processes; it is essentially a message queue.

q.put() puts data into the queue

q.get() takes data out of the queue

# Import the module (queue.Queue here just demonstrates the API;
# for communication between processes use multiprocessing.Queue, as below)
from queue import Queue

# Initialize a queue object
q = Queue(3)  # 3 means the queue can hold at most three messages
q.put('I went to infusion today, what kind of infusion, think of your night')
q.put("You don't know what's hurting")
q.put("It can be upsetting to be touched by someone, but it can also be sweet")

# get() takes messages out
print(q.get())
print(q.get())
print(q.get())


# q.empty() returns True if the queue is empty, False otherwise
# q.qsize() returns the number of messages in the queue
# q.full()  returns True if the queue is full

print("Current number of messages:", q.qsize())

# Passing messages between processes through a queue
from multiprocessing import Process, Queue
import time

li = ["Steamed lamb", "Steamed bear paw", "Steamed flower duck"]

# Write data
def wdata(q):  # q is the queue object
    for i in range(3):
        print(f"Breakfast {i} is put in")
        q.put(i)
        time.sleep(0.2)

# Read data
def rdata(q):  # q is the queue object
    # Keep taking items out as long as there are any
    while True:
        if q.empty():  # check whether the queue is empty
            break
        else:
            print("Customer gets from the queue:", q.get())



if __name__ == "__main__":
    # Create the queue object
    q = Queue()  # no argument means no size limit
    p1 = Process(target=wdata, args=(q,))
    p2 = Process(target=rdata, args=(q,))

    p1.start()
    p2.start()
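
One caveat with the example above: if the reader happens to start before the writer has put anything in, q.empty() is True and the reader exits immediately. A common alternative (not from the original text) is to have the writer put a sentinel value such as None to signal that it is finished:

from multiprocessing import Process, Queue

def wdata(q):
    for i in range(3):
        q.put(i)
    q.put(None)              # sentinel: tells the reader there is nothing more to come

def rdata(q):
    while True:
        item = q.get()       # blocks until an item is available
        if item is None:     # sentinel received, stop reading
            break
        print("Customer gets from the queue:", item)

if __name__ == "__main__":
    q = Queue()
    p1 = Process(target=wdata, args=(q,))
    p2 = Process(target=rdata, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()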


18.3.4 Process Pool

Child processes are placed in a process pool.

The concept of a process pool:

Define a pool containing a fixed number of processes; when tasks arrive, they are handled by processes from the pool.

When a task finishes, the process is not closed; it is returned to the pool to wait for the next task.

Methods:

p.apply_async() submits a task asynchronously (non-blocking): it does not wait for the current task to finish, and the system schedules the pool's processes as it sees fit.

p.close() closes the process pool so that no more tasks can be submitted

p.join() blocks the main process until all worker processes have exited; it can only be called after close()

from multiprocessing import Pool
import time

def work(a):
    print("We are in class")
    time.sleep(2)
    return a * 3

if __name__ == "__main__":
    # Define a process pool with at most 3 processes
    p = Pool(3)

    li = []
    for i in range(6):
        # p.apply_async(callable, args=arguments)
        res = p.apply_async(work, args=(i,))  # asynchronous run
        # Asynchronous: the main process does not wait here; it carries on
        # regardless of the state of the other processes

        li.append(res)  # save the result object (calling res.get() here would block)

    # Close the process pool: no more tasks can be submitted
    p.close()
    # Wait for all child processes in the pool to finish; must be called after close()
    p.join()

    # Use get() to retrieve the results of apply_async
    for i in li:
        print(i.get())
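
When every task calls the same function, Pool.map() is a shorter way to write this: it submits all the items, blocks until every result is ready, and returns the results in order. A minimal sketch:

from multiprocessing import Pool
import time

def work(a):
    time.sleep(1)
    return a * 3

if __name__ == "__main__":
    with Pool(3) as p:                   # the with-block cleans up the pool automatically
        results = p.map(work, range(6))  # blocks until all six tasks are finished
    print(results)                       # [0, 3, 6, 9, 12, 15]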

      

18.4 Coroutines

18.4.1 Introduction

Coroutines are also known as micro-threads or fibers; the English name is Coroutine.

Coroutines are another way to achieve multitasking in Python, and each one needs a much smaller execution unit (i.e. fewer resources) than a thread. A coroutine counts as an execution unit because it keeps its own CPU context, which lets us switch from one coroutine to another at the chosen moment.

As long as the CPU context is saved and restored during the switch, the program continues to run correctly.

Coroutines provide concurrency within a single thread, which is why they are also called micro-threads.

With coroutines, the programmer is in full control: execution switches exactly where you tell it to.

Use cases:

  1. If a thread performs many IO operations, coroutines work better
  2. Suitable for handling high concurrency

A simple coroutine implemented with a generator

def work1():
    while True:
        yield 'Brother in distress'

def work2():
    while True:
        yield 'The Internet is too deep for you to hold it'

if __name__ == "__main__":

    w1 = work1()
    w2 = work2()

    while True:  # runs forever; each next() resumes the corresponding generator
        print(next(w1))
        print(next(w2))
# The programmer controls the execution order of w1 and w2 directly in code

18.4.2 greenlet

greenlet is a coroutine module implemented in C; you switch between functions manually by calling switch().

The switching is manual: when an IO operation is encountered, the program blocks instead of switching automatically.

Install:   pip install greenlet

Uninstall: pip uninstall greenlet

List installed modules: pip list

from greenlet import greenlet

def eat():
    print("Start eating supper")
    g2.switch()  # switch to g2; the code after this line runs only if control switches back
    print("I'm full")

def study():
    print("Start Learning")
    print("Finish Learning")


# Instantiate coroutine objects
# greenlet(task)
g1 = greenlet(eat)
g2 = greenlet(study)

g1.switch()  # switch to g1


# Output:
# Start eating supper
# Start Learning
# Finish Learning
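
"I'm full" never prints because nothing switches back to g1. If study() handed control back with g1.switch(), eat() would resume right after its own switch() call; a sketch of that round trip:

from greenlet import greenlet

def eat():
    print("Start eating supper")
    g2.switch()                # hand control to g2
    print("I'm full")          # resumes here when g2 switches back

def study():
    print("Start Learning")
    g1.switch()                # hand control back to g1
    print("Finish Learning")   # never reached: nothing switches back to g2 again

g1 = greenlet(eat)
g2 = greenlet(study)
g1.switch()

# Output:
# Start eating supper
# Start Learning
# I'm full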

18.4.3 gevent

greenlet requires manual switching, which is cumbersome.

gevent switches automatically.

When gevent encounters an IO operation, it switches to another coroutine automatically.

gevent is built on top of greenlet.

pip install gevent

import gevent

# Create a coroutine object:
#   gevent.spawn(function_name)
# join()    blocks, waiting for one coroutine to finish executing
# joinall() takes a list of coroutine objects and waits until all of them have finished


Run tasks A and B. When A hits a time-consuming operation, gevent switches away from A and starts running B at the same time.

A and B each finish their time-consuming operations within the corresponding time, so the two tasks run concurrently.

# Keep in mind: never give your own .py files the same name as a third-party or built-in module

import gevent

def write():
    print("We're writing code")
    gevent.sleep(1)  # while waiting, control switches to the other coroutines, so g1 and g2 run concurrently
    print("Finished at last")

def listen():
    print("Now listen well")
    gevent.sleep(1)
    print("Break")

g1 = gevent.spawn(write)
g2 = gevent.spawn(listen)


g1.join()  # wait for g1 to finish executing
g2.join()  # wait for g2 to finish executing


# Run results
# We're writing code
# Now listen well
# Finished at last
# Break

joinall() waits for all the coroutine objects in the list to finish before returning

import gevent

def work(name):
    for i in range(3):
        gevent.sleep(1)  # while waiting, the other coroutines run, so they execute concurrently
        print(f'The function name is: {name}, the value of i is: {i}')

gevent.joinall([
    gevent.spawn(work, 'Small White'),
    gevent.spawn(work, 'Goose'),
])
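
gevent.spawn() returns a Greenlet object; after joinall() finishes, the return value of the task function can be read from the greenlet's value attribute. A short sketch (the work function here is illustrative):

import gevent

def work(name):
    gevent.sleep(1)
    return f"{name} finished"

tasks = [gevent.spawn(work, 'Small White'), gevent.spawn(work, 'Goose')]
gevent.joinall(tasks)       # wait for both coroutines to finish

for t in tasks:
    print(t.value)          # value holds the return value of the finished coroutine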


18.4.4 Monkey patching

Patching the program at runtime: the monkey patch.

Monkey patch feature:

  1. It can replace objects inside a module; here, blocking calls such as time.sleep() are replaced with gevent-compatible versions

from gevent import monkey
import gevent
import time


monkey.patch_all()  # replaces time.sleep() (and other blocking calls) with gevent equivalents; must be called at the top of the file

def work(name):
    for i in range(3):
        # thanks to monkey.patch_all(), time.sleep() behaves like gevent.sleep() and yields to other coroutines
        time.sleep(1)
        print(f'The function name is: {name}, the value of i is: {i}')

gevent.joinall([
    gevent.spawn(work, 'Small White'),
    gevent.spawn(work, 'Goose'),
])


18.4.5 Comprehensive example

import gevent

def funa():
    print("wsc: Something's happening today, I'll give grandson yn a phone call")  # 1
    gevent.sleep(2)
    print("wsc: Why did you hang up suddenly? Calling back...")  # 5

def funb():
    print("Sun yn: wsc is calling me now.")  # 2
    gevent.sleep(3)
    print("Sun yn: He called again")  # 6

def func():
    print("What are you doing, dear?")  # 3
    gevent.sleep(1)
    print("You're here")  # 4

gevent.joinall([
    gevent.spawn(funa),
    gevent.spawn(funb),
    gevent.spawn(func)
])

# Output results
# wsc: Something's happening today, I'll give grandson yn a phone call
# Sun yn: wsc is calling me now.
# What are you doing, dear?
# You're here
# wsc: Why did you hang up suddenly? Calling back...
# Sun yn: He called again

Summary

  1. A process is the basic unit of resource allocation; a thread is the basic unit of CPU scheduling
  2. Comparison:
    1. Process: switching costs the most resources and is the least efficient
    2. Thread: switching costs moderate resources and is moderately efficient
    3. Coroutine: switching costs the least resources and is the most efficient
  3. Multithreading: suitable for IO-intensive tasks (frequent reading and writing of data, e.g. crawlers)
  4. Multiprocessing: suitable for CPU-intensive tasks (scientific computing, calculating pi, HD video decoding)
  5. A running program has at least one process, and a process has at least one thread.

Topics: Python Multithreading