1. Process
1. Two ways of process creation
-
The first way to start the process is:
```python
from multiprocessing import Process
import random
import time


def task(name):
    print(f'{name} is running')
    time.sleep(random.randint(1, 3))
    print(f'{name} is gone')


if __name__ == '__main__':
    # On Windows, process creation must happen under `if __name__ == '__main__':`.
    p = Process(target=task, args=('Chang Xin',))  # create a process object; args must be a tuple (note the trailing comma)
    p.start()
    '''
    start() only asks the operating system to spawn a child process, then execution
    moves on to the next line. When the OS receives the signal, it allocates a
    process space in memory, copies the main process's data into the child, and
    schedules the CPU to run the new child process.
    '''
    print('start')
    time.sleep(2)  # so the main process's code always runs first
```
-
The second way to start the process is:
```python
from multiprocessing import Process
import random
import time


class MyProcess(Process):
    def __init__(self, name):
        super().__init__()
        self.name = name

    def run(self):  # must be named run(); start() invokes it
        print(f'{self.name} is running')
        time.sleep(random.randint(1, 3))
        print(f'{self.name} is gone')


if __name__ == '__main__':
    p = MyProcess('Chang Xin')
    p.start()
    print('==main')
```
-
Simple application
```python
# Simple application: serial vs concurrent execution of three tasks
from multiprocessing import Process
import time


def task(name):
    print(f'{name} is running')
    time.sleep(1)
    print(f'{name} is gone')


def task1(name):
    print(f'{name} is running')
    time.sleep(2)
    print(f'{name} is gone')


def task2(name):
    print(f'{name} is running')
    time.sleep(3)
    print(f'{name} is gone')


if __name__ == '__main__':
    # Serial: roughly 1 + 2 + 3 = 6 seconds
    start_time = time.time()
    task(1)
    task1(2)
    task2(3)
    print(f'Serial elapsed: {time.time() - start_time}')

    # Three processes perform the three tasks concurrently (or in parallel):
    # creating processes is parallel, not serial, so about 3 seconds in total
    start_time = time.time()
    p1 = Process(target=task, args=('Changxin No.1 Hero',))
    p2 = Process(target=task1, args=('Changxin No.2 Hero',))
    p3 = Process(target=task2, args=('Changxin No.3 Hero',))
    p1.start()
    p2.start()
    p3.start()
    p1.join()
    p2.join()
    p3.join()
    print(f'Concurrent elapsed: {time.time() - start_time}')
```
2. Getting process pid
```python
import os
print(f'Current process: {os.getpid()}')
print(f'Parent process: {os.getppid()}')
```

cmd commands for viewing pids:

```
tasklist                    # view all processes and their pids
tasklist | findstr pycharm  # find pycharm's pid
```
```python
from multiprocessing import Process
import os


def task(name):
    print(f'Child process: {os.getpid()}')
    print(f'Parent of the child (the main process): {os.getppid()}')


if __name__ == '__main__':
    p = Process(target=task, args=('Chang Xin',))
    p.start()
    print('==Main starts')
    print(f'==Main process: {os.getpid()}')
    print(f'==Parent of the main process (e.g. pycharm): {os.getppid()}')
```
3. Verifying memory isolation between processes
The child process starts with a copy of the main process's data; after that, the two processes are independent and share nothing.
```python
from multiprocessing import Process
import time

name = 'Chang Xin'


def task():
    global name
    name = 'Guo Ji'
    print(f'The name in the child process is: {name}')


if __name__ == '__main__':
    p = Process(target=task)
    p.start()
    time.sleep(1)
    print(f'The name in the main process is: {name}')  # still 'Chang Xin'
```

---------------------------- Partition line ----------------------------

```python
from multiprocessing import Process
import time

lst = ['Guo Suhui']


def task1():
    lst.append('Guo Ji')
    print(f'The list in the child process is: {lst}')


if __name__ == '__main__':
    p = Process(target=task1)
    p.start()
    time.sleep(2)
    print(f'The list in the main process is: {lst}')  # still ['Guo Suhui']
```
4. join
join makes the main process wait until the child process has finished before running the code after the join. join blocks only the main process; the children keep running. If several children are started first and then joined one after another, the joins overlap, so the total wait is roughly the slowest child, not the sum of all of them.
```python
# Correct key point: start all the children first, then join them all
from multiprocessing import Process
import time


def task(name):
    print(f'{name} is running')
    time.sleep(2)
    print(f'{name} is gone')


if __name__ == '__main__':
    start_time = time.time()
    l1 = []
    for i in range(1, 4):
        p = Process(target=task, args=(i,))
        l1.append(p)
        p.start()
    for p in l1:
        p.join()
    print(f'==main {time.time() - start_time}')  # about 2 seconds

    # Wrong demonstration: joining inside the loop serializes the children
    # for i in range(1, 4):
    #     p = Process(target=task, args=(i,))
    #     p.start()
    #     p.join()
    # which is equivalent to:
    '''
    p1 = Process(target=task, args=(1,))
    p1.start()
    p1.join()
    p2 = Process(target=task, args=(2,))
    p2.start()
    p2.join()
    p3 = Process(target=task, args=(3,))
    p3.start()
    p3.join()
    '''
```
```python
# join lets the main process wait for the child process to finish
from multiprocessing import Process
import time


def task(name):
    print(f'{name} is running')
    time.sleep(2)
    print(f'{name} is gone')


if __name__ == '__main__':
    p = Process(target=task, args=('Chang Xin',))
    p.start()
    p.join()
    print('==The main process continues')

# ----------------------------------------------------------
# Multiple processes with join


def task(name, sec):
    print(f'{name} is running')
    time.sleep(sec)
    print(f'{name} is gone')


if __name__ == '__main__':
    start_time = time.time()
    p1 = Process(target=task, args=('Chang Xin', 1))
    p2 = Process(target=task, args=('Li Ye', 2))
    p3 = Process(target=task, args=('Sea Dogs', 3))
    p1.start()
    p2.start()
    p3.start()
    # join blocks only the main process; the three children already run
    # concurrently, so the total wait is about 3 seconds, not 1 + 2 + 3.
    p1.join()
    p2.join()
    p3.join()
    print(f'==main {time.time() - start_time}')

# ----------------------------------------------------------


def task(name, sec):
    print(f'{name} is running')
    time.sleep(sec)
    print(f'{name} is gone')


if __name__ == '__main__':
    start_time = time.time()
    p1 = Process(target=task, args=('Chang Xin', 3))
    p2 = Process(target=task, args=('Li Ye', 2))
    p3 = Process(target=task, args=('Sea Dogs', 1))
    p1.start()
    p2.start()
    p3.start()
    # join is blocking
    p1.join()  # waits about 3 seconds: p1 is the slowest
    print(f'==Main 1: {time.time() - start_time}')
    p2.join()  # p2 already finished; returns immediately
    print(f'==Main 2: {time.time() - start_time}')
    p3.join()  # p3 already finished; returns immediately
    print(f'==Main 3: {time.time() - start_time}')
```
5. Other parameters of the process
```python
p.terminate()        # kill the child process ***
print(p.is_alive())  # *** True/False: is the child still alive?
p.join()             # ***
```
```python
from multiprocessing import Process
import time


def task(name):
    print(f'{name} is running')
    time.sleep(2)
    print(f'{name} is gone')


if __name__ == '__main__':
    p = Process(target=task, args=('Chang Xin',), name='Alex')
    p.start()
    time.sleep(1)
    p.terminate()        # kill the child process ***
    p.join()             # ***
    time.sleep(1)
    print(p.is_alive())  # *** False: the child has been killed
    print(p.name)        # 'Alex'
    p.name = 'sb'        # the name attribute can be reassigned
    print(p.name)
    print('The main process continues')
```
6. Daemon process
p.daemon = True sets the child p as a daemon process: it is killed as soon as the main process ends. The daemon flag must be set before the child is started.
```python
from multiprocessing import Process
import time


def task(name):
    print(f'{name} is running')
    time.sleep(2)
    print(f'{name} is gone')


if __name__ == '__main__':
    p = Process(target=task, args=('Chang Xin',))  # create a process object
    p.daemon = True  # make p a daemon: it dies as soon as the main process finishes
    p.start()
    # p.daemon = True  # setting it here, after start(), would be too late
    time.sleep(1)
    print('===main')
```
7. Zombie processes and orphan processes
unix-based environment (linux, macOS)
-
The main process must wait for its child processes to finish before it can finish itself.
==The main process monitors the running status of its child processes at all times, and reclaims each child within a short period after it finishes.==
-
Why doesn't the main process reclaim the child process immediately after it's finished?
- The main process and its children run asynchronously; the main process cannot capture the exact moment a child ends.
- If a child's resources were released from memory the instant it ended, the main process would have no way to monitor the child's status afterward.
-
Unix provides a mechanism to solve the problems above:
After a child process finishes, its open file handles and most of its memory are released immediately, but some information is retained: ==the process number, end time, and running status==, waiting for the main process to monitor and reclaim it.
Zombie process: ==every child process that has finished but has not yet been reclaimed by its parent is in the zombie state.==
-
Are zombie processes harmful???
If the parent process never calls wait/waitpid on its finished children, a large number of zombie processes accumulate, occupying memory and using up pid numbers.
-
Orphan process:
The parent process ends for some reason while its child processes are still running; those children become orphan processes. Orphans are adopted by the init process: init becomes their parent and reclaims them when they finish.
-
How to deal with zombie processes???
If a parent process has produced a large number of children but never reclaims them, a large number of zombie processes build up. The direct solution is to ==kill the parent process==: all its zombies become orphans and are reclaimed by init.
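To see the reclaiming in action, here is a minimal Unix-only sketch (not from the original notes; os.fork does not exist on Windows). A finished child stays a zombie until the parent waits on it; multiprocessing's p.join() performs the same reclamation for you.

```python
import os
import time

pid = os.fork()  # returns 0 in the child, the child's pid in the parent
if pid == 0:
    print(f'child {os.getpid()} done')  # the child finishes immediately...
    os._exit(0)                         # ...and becomes a zombie until reclaimed
else:
    time.sleep(1)  # during this second the finished child sits in the zombie state
    reclaimed, status = os.waitpid(pid, 0)  # reclaim it: the zombie disappears
    print(f'parent reclaimed {reclaimed}, exit status {status}')
```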
8. Mutex Lock
Mutex: a "lock"-like mechanism spread across program fragments in different processes. While one process is running one of the fragments, no other process may run any of them until it finishes.
- Version 1: all processes grab the printer concurrently. Concurrency puts efficiency first, but here the requirement is order first: when multiple processes contend for one resource, order (serial, one at a time) must be guaranteed.
- Version 2: join solves the serialization and guarantees order, but the order is fixed in advance, which is unreasonable; contention for a shared resource should be first come, first served.
- lock vs join. Common point: both turn concurrency into serial execution, which guarantees order. Difference: join fixes the order artificially; a lock lets the processes contend for the order, which guarantees fairness.
```python
# Version 3 (for-loop over the tasks)
from multiprocessing import Process
from multiprocessing import Lock
import time
import random
import sys


def task1(p, lock):
    lock.acquire()
    print(f'{p} starts printing')
    time.sleep(random.randint(1, 3))
    print(f'{p} is done printing')
    lock.release()


def task2(p, lock):
    lock.acquire()
    print(f'{p} starts printing')
    time.sleep(random.randint(1, 3))
    print(f'{p} is done printing')
    lock.release()


def task3(p, lock):
    lock.acquire()
    print(f'{p} starts printing')
    time.sleep(random.randint(1, 3))
    print(f'{p} is done printing')
    lock.release()


if __name__ == '__main__':
    mutex = Lock()
    for i in range(1, 4):
        # look up task1 / task2 / task3 by name in this module
        p = Process(target=getattr(sys.modules[__name__], f'task{i}'),
                    args=(i, mutex))
        p.start()
```
9. Communication between processes
Processes are isolated at the memory level, but files live on disk, so a file can serve as a shared medium between processes.
1. File-based Communication
Ticket-snatching system:
1. Checking tickets: query the number of remaining tickets. [Concurrent]
2. Purchasing: send the request to the server; the server receives it, decrements the ticket count, and returns the result to the front end. [Serial]
When many processes contend for a single resource (data), you must guarantee order, and therefore data safety: access must be serial. A mutex guarantees a fair order and the safety of the data.
Drawbacks of file-based inter-process communication: 1. low efficiency; 2. you must manage locks yourself, which is troublesome and prone to deadlock.
```python
from multiprocessing import Process
from multiprocessing import Lock
import random
import time
import json
import os

# db.json contains, e.g.: {"count": 3}


def search():
    time.sleep(random.randint(1, 3))  # simulate network delay (query step)
    with open('db.json', encoding='utf-8') as f1:
        dic = json.load(f1)
    print(f'{os.getpid()} checked the remaining tickets: {dic["count"]} left')


def paid():
    with open('db.json', encoding='utf-8') as f1:
        dic = json.load(f1)
    if dic['count'] > 0:
        dic['count'] -= 1
        time.sleep(random.randint(1, 3))  # simulate network delay (write step)
        with open('db.json', encoding='utf-8', mode='w') as f1:
            json.dump(dic, f1)
        print(f'{os.getpid()} bought a ticket, {dic["count"]} left')
    else:
        time.sleep(1)
        print(f'{os.getpid()} failed to buy a ticket')


def task(lock):
    search()        # querying runs concurrently
    lock.acquire()  # purchasing runs serially
    paid()
    lock.release()


if __name__ == '__main__':
    mutex = Lock()
    for i in range(5):
        p = Process(target=task, args=(mutex,))
        p.start()
```
2. Queue-based communication
Queue: think of a queue as a container that holds data. Its characteristic: first in, first out (FIFO), like a tube of badminton shuttles.
q.put(item): when the queue is full, put blocks the calling process.
q.get(): when the queue is empty, get blocks until some process puts data in.
q.get(timeout=3): blocks for at most 3 seconds, then raises an error.
q.get(block=False): raises an error immediately instead of blocking.
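A quick sketch of these behaviors (not from the original notes; the commented-out lines are the ones that would block or raise):

```python
from multiprocessing import Queue

q = Queue(3)     # a queue that holds at most 3 items
q.put(1)
q.put(2)
q.put(3)
# q.put(4)                   # would block: the queue is full
print(q.get())   # 1  (first in, first out)
print(q.get())   # 2
print(q.get())   # 3
# print(q.get())             # would block: the queue is empty
# print(q.get(timeout=3))    # blocks up to 3 s, then raises queue.Empty
# print(q.get(block=False))  # raises queue.Empty immediately
```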
Using a Queue to improve the ticket system:
- The ticket count is stored in a queue.
- Multiple processes are opened to buy tickets: checking tickets is concurrent, buying tickets is effectively serial (processes take turns holding the record).
- Both successful and failed purchases need a prompt.

```python
from multiprocessing import Process
from multiprocessing import Queue
import random
import time
import os


def search(q):
    dic = q.get()  # take the ticket record out
    print(f'{os.getpid()} checked the remaining tickets: {dic["count"]} left')
    q.put(dic)     # put it back


def paid(q):
    time.sleep(random.randint(1, 3))  # simulate network delay
    dic = q.get()  # take the record out; other processes block on get meanwhile
    if dic['count'] > 0:
        dic['count'] -= 1
        print(f'{os.getpid()} bought a ticket! {dic["count"]} left')
    else:
        print(f'{os.getpid()} failed to buy a ticket')
    q.put(dic)     # put it back so the other processes can proceed


def task(q):
    search(q)
    paid(q)


if __name__ == '__main__':
    q = Queue(1)
    q.put({'count': 3})
    for i in range(5):
        p = Process(target=task, args=(q,))
        p.start()
```
```python
# Simulate a Double Eleven flash sale of Xiaomi phones:
# - many users rush to buy concurrently
# - only the first 10 users can buy
# - finally, display the ranking of the 10 winning users
import os
from multiprocessing import Queue
from multiprocessing import Process


def task(q):
    try:
        q.put(f'{os.getpid()}', block=False)  # try to grab one of the 10 slots
    except Exception:
        return  # the slots are already gone


if __name__ == '__main__':
    q = Queue(10)  # only 10 slots
    children = []
    for i in range(100):
        p = Process(target=task, args=(q,))
        p.start()
        children.append(p)
    for p in children:
        p.join()  # make sure the rush is over before ranking
    for i in range(1, 11):  # display all 10 winners
        print(f'User No. {i} is {q.get()}')
```
10. Producer-consumer model
Programming ideas, models, design patterns, and theories all hand you a programming approach; when you meet a similar situation later, you can apply it.
Three elements of producer-consumer model:
Producer: Generating data
Consumers: Receiving data for further processing
Container: basin (queue)
So what role does the queue (the container) play? It acts as a buffer, balancing production capacity against consumption capacity and decoupling producers from consumers.
```python
from multiprocessing import Process
from multiprocessing import Queue
import random
import time


def producer(q, name):
    for i in range(1, 6):
        time.sleep(random.randint(1, 2))
        res = f'steamed bun {i}'
        q.put(res)
        print(f'Producer {name} produced {res}')


def consumer(q, name):
    while 1:
        try:
            food = q.get(timeout=3)  # if nothing arrives within 3 s, stop eating
            time.sleep(random.randint(1, 3))
            print(f'Consumer {name} ate {food}')
        except Exception:
            break


if __name__ == '__main__':
    q = Queue()
    p1 = Process(target=producer, args=(q, 'Chang Xin'))
    p2 = Process(target=consumer, args=(q, 'Chang Xinxin'))
    p1.start()
    p2.start()
```
2. Thread
1. Thread theory
-
== What is Thread==
A thread is like an assembly line (pipeline) in a workshop: the workshop (the process) provides the space and resources, and the assembly line (the thread) does the actual work.
Process: Open a process space in memory, copy all the resource data of the main process, and then call the cpu to execute the code.
The previous description is not specific enough:
Start a process:
Open a process space in memory, copy all the resource data of the main process, and then call the thread to execute the code.
Processes are resource units and threads are execution units.
From now on, describe starting a process like this:
Opening a process: the process opens up a process space in memory, copies all of the main process's data into it, and a thread executes the code inside.
-
== Thread vs process==
- Opening a process costs a lot more than opening a thread.
- Opening threads is very fast. It's tens to hundreds of times faster.
- Threads within a process share data; processes must communicate through mechanisms such as queues.
-
== Application of Threads==
-
Concurrency: A cpu looks like it performs multiple tasks at the same time.
A single process can open three threads and execute three tasks concurrently. Take a text editor:
- Enter text.
- Display on screen.
- Save it on disk.
Multithreading is a great fit here:
Data sharing, low cost and fast speed.
The main thread and sub-threads have no master/slave status, but whose lifetime defines the process? The main thread's: when its code is done, it still waits for the other (non-daemon) threads to finish before the process can end.
-
2. Two ways to open threads
The first way:

```python
from threading import Thread
import time


def task(name):
    print(f'{name} is running')
    time.sleep(1)
    print(f'{name} is gone')


if __name__ == '__main__':
    p1 = Thread(target=task, args=('Chang Xin',))
    p1.start()
    print('===Main thread')
```
The second way:

```python
from threading import Thread
import time


class MyThread(Thread):
    def __init__(self, name, l1, s1):
        super().__init__()
        self.name = name  # overrides the default thread name
        self.l1 = l1
        self.s1 = s1

    def run(self):
        print(f'{self.name} is running')
        print(f'{self.l1} is running')
        print(f'{self.s1} is running')
        time.sleep(1)
        print(f'{self.name} is gone')
        print(f'{self.l1} is gone')
        print(f'{self.s1} is gone')


if __name__ == '__main__':
    p1 = MyThread('Chang Xin', [1, 2, 3], '100')
    p1.start()
    print('===Main thread')
```
3. Code comparison of thread vs process
-
Startup speed comparison: thread vs process
```python
from multiprocessing import Process


def work():
    print('hello')


if __name__ == '__main__':
    # start a child process under the main process
    t = Process(target=work)
    t.start()
    print('Main thread/Main process')  # usually prints first: a process is slow to start
```
```python
from threading import Thread
import time


def task(name):
    print(f'{name} is running')
    time.sleep(1)
    print(f'{name} is gone')


if __name__ == '__main__':
    t1 = Thread(target=task, args=('Sea Dogs',))
    t1.start()
    print('===Main thread')  # 'is running' usually prints first: a thread starts fast
    # Threads have no primary/secondary distinction.
```
-
==Comparing pids: all threads of one process share the same pid==
```python
from threading import Thread
import os


def task():
    print(os.getpid())


if __name__ == '__main__':
    t1 = Thread(target=task)
    t2 = Thread(target=task)
    t1.start()
    t2.start()
    print(f'===Main thread {os.getpid()}')  # all three print the same pid
```
-
Sharing internal data with threads in the same process
Resource data within a process is shared by all of that process's threads.

```python
from threading import Thread

x = 3


def task():
    global x
    x = 100


if __name__ == '__main__':
    t1 = Thread(target=task)
    t1.start()
    t1.join()
    print(f'===Main thread {x}')  # 100: the child thread modified the shared x
```
4. Other Thread-related methods (for understanding)
```python
# Methods on a Thread instance object:
p1.setName('Subthread 1')  # set the thread name (legacy; prefer p1.name = ...)
p1.getName()               # return the thread name (legacy; prefer p1.name)
print(p1.name)             # get the thread name ***
print(p1.is_alive())       # whether the thread is alive (isAlive() was removed in newer Pythons)

# Some functions provided by the threading module:
print(current_thread())    # the current thread object
print(currentThread())     # legacy alias of current_thread()
print(enumerate())         # a list of all alive Thread objects
print(activeCount())       # *** the number of alive threads, same as len(enumerate())
```
```python
from threading import Thread
from threading import currentThread
from threading import enumerate
from threading import activeCount
import os
import time

x = 9


def task():
    print(currentThread())
    time.sleep(1)
    print('666')


if __name__ == '__main__':
    p1 = Thread(target=task, name='p1')  # name= sets the thread name
    p2 = Thread(target=task, name='p2')
    p1.start()
    p2.start()

    # Methods on the Thread instance objects
    p1.setName('Subthread 1')  # set the thread name
    p2.setName('Subthread 2')
    p1.getName()               # return the thread name
    p2.getName()
    print(p1.name)             # get the thread name ***
    print(p2.name)
    print(p1.is_alive())       # whether the thread is alive
    print(p2.is_alive())

    # Some functions provided by the threading module:
    print(currentThread())     # the current thread object
    print(enumerate())         # a list of all alive Thread objects
    print(activeCount())       # the number of alive threads, same as len(enumerate())
    print(f'Main thread {os.getpid()}')
```
5. Daemon threads (test points)
join: blocks the main thread until the joined sub-thread has finished executing. So when does a daemon thread end? A daemon thread is killed when the main thread ends, and the main thread (i.e. the process) only ends after every non-daemon sub-thread has finished; so a daemon thread effectively lives until the main thread and all non-daemon sub-threads are done.
```python
from threading import Thread
import time


def foo():
    print(123)       # 1
    time.sleep(1)
    print("end123")  # 4


def bar():
    print(456)       # 2
    time.sleep(2)
    print("end456")  # 5


t1 = Thread(target=foo)
t2 = Thread(target=bar)
t1.daemon = True  # t1 is a daemon, but the non-daemon t2 keeps the process alive long enough
t1.start()
t2.start()
print("main-------")  # 3

# Result:
# 123
# 456
# main-------
# end123
# end456
```
6. Mutex Lock (Test Point)
With the lock in place, the program runs serially through the locked section. The delay (sleep) after acquiring the lock is not strictly necessary; it just makes the thread switch visible. Without the lock, threads can jump the queue between the read and the write, and the result is wrong.
```python
from threading import Thread
from threading import Lock
import time
import random

x = 10


def task(lock):
    lock.acquire()
    time.sleep(random.randint(1, 3))  # choke point: makes the switch visible
    global x
    temp = x
    time.sleep(0.1)
    temp = temp - 1
    x = temp
    lock.release()


if __name__ == '__main__':
    mutex = Lock()
    l1 = []
    for i in range(10):
        t = Thread(target=task, args=(mutex,))
        l1.append(t)
        t.start()
    for t in l1:
        t.join()  # wait for all 10 threads; with the lock, x ends at 0
    print(f'Main thread {x}')
```
7. Deadlock Phenomenon and Recursive Lock
- Deadlock phenomenon: thread A holds lock A and waits for lock B, while thread B holds lock B and waits for lock A.
- Recursive lock (RLock): solves deadlock. When the business logic needs multiple locks, prefer a recursive lock.
- Write the locks as ==lock_A = lock_B = RLock()== (two names for one recursive lock). The principle: the RLock keeps a counter; every acquire adds one, every release subtracts one, and as long as the counter is not zero, no other thread can acquire the lock. Import it with ==from threading import RLock== (see the sketch after this list).
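To make the counter concrete, here is a minimal sketch (not from the original notes) of a single thread re-acquiring one RLock:

```python
from threading import RLock

lock = RLock()
lock.acquire()  # counter: 0 -> 1 (the owning thread may re-acquire freely)
lock.acquire()  # counter: 1 -> 2
lock.release()  # counter: 2 -> 1
lock.release()  # counter: 1 -> 0; only now can another thread acquire it
```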
Deadlock phenomenon:

```python
from threading import Thread
from threading import Lock
import time

lock_A = Lock()
lock_B = Lock()


class MyThread(Thread):
    def run(self):
        self.f1()
        self.f2()

    def f1(self):
        lock_A.acquire()
        print(f'{self.name} got A')
        lock_B.acquire()
        print(f'{self.name} got B')
        lock_B.release()
        lock_A.release()

    def f2(self):
        lock_B.acquire()
        print(f'{self.name} got B')
        time.sleep(0.1)  # thread 1 holds B here while thread 2 grabs A: deadlock
        lock_A.acquire()
        print(f'{self.name} got A')
        lock_A.release()
        lock_B.release()


if __name__ == '__main__':
    for i in range(3):
        t = MyThread()
        t.start()
```
Recursive lock:

```python
from threading import Thread
from threading import RLock
import time

lock_A = lock_B = RLock()  # two names for one recursive lock


class MyThread(Thread):
    def run(self):
        self.f1()
        self.f2()

    def f1(self):
        lock_A.acquire()
        print(f'{self.name} got A')
        lock_B.acquire()
        print(f'{self.name} got B')
        lock_B.release()
        lock_A.release()

    def f2(self):
        lock_B.acquire()
        print(f'{self.name} got B')
        lock_A.acquire()
        print(f'{self.name} got A')
        time.sleep(1)
        lock_A.release()
        lock_B.release()


if __name__ == '__main__':
    for i in range(3):
        t = MyThread()
        t.start()
```
8. Semaphores
A semaphore is also a lock, one that controls the number of concurrent holders.
- ==from threading import Semaphore== imports the semaphore class
- ==sem = Semaphore(5)== allows at most 5 threads to hold it at once (the default value is 1)
- ==sem.acquire()== / ==sem.release()== acquire and release the semaphore inside the task
- ==from threading import current_thread== gets the current thread object
```python
from threading import Thread
from threading import Semaphore
from threading import current_thread
import random
import time

sem = Semaphore(5)  # at most 5 threads inside at once


def task():
    sem.acquire()
    print(f'{current_thread().name} got a room')
    time.sleep(random.randint(1, 3))
    sem.release()


if __name__ == '__main__':
    for i in range(30):
        t = Thread(target=task)
        t.start()
```
9. ==GIL==Global Interpreter Lock
Many self-proclaimed experts say the GIL lock is Python's fatal flaw: that Python cannot use multiple cores, cannot be concurrent, and so on.
-
==In theory, the multiple threads of a single process could use multiple cores.==
However, the developers of the CPython interpreter put a lock on threads entering the interpreter.
-
Why lock?
- At that time it was the single-core era, and cpus were very expensive.
- Without a global interpreter lock, the CPython developers would have had to lock and unlock manually throughout the interpreter's source code, which is troublesome and prone to deadlock. To save effort, they simply put one big lock on threads entering the interpreter.
- == Advantages: It guarantees the security of data resources of Cpython interpreter.==
- == Disadvantage: Multi-threading of a single process can't take advantage of multi-core.==
Jython and IronPython have no GIL lock (PyPy, despite a common claim, does have one).
-
Now in the multi-core era, can I remove Cpython's GIL lock?
Because all of the CPython interpreter's internal logic is built around single-threaded execution, removing the GIL lock is next to impossible.
== Multi-threading of a single process can be concurrent, but it can't take advantage of multi-core and can't be parallel. Multiple processes can be concurrent and parallel.==
== io-intensive: multi-threaded and concurrent execution of a single process is appropriate==
== Computing intensive: multi-process parallelism==
10. ==GIL== lock vs custom ==Lock==
- Similarity: both are mutex locks.
- Differences:
  - The GIL is a global interpreter lock: it protects the resource data inside the interpreter, and it is acquired and released automatically, with no manual operation.
  - A mutex defined in your own code protects the resource data in your process, and it must be acquired and released manually.
11. Verifying efficiency: compute-intensive vs IO-intensive
==IO-intensive: the multithreading of a single process is efficient and appropriate (concurrent execution).==
==Compute-intensive: multi-process concurrency/parallelism is efficient (parallel execution).==
-
Code validation:
```python
# Compute-intensive: multithreading of a single process vs multi-process parallelism
from multiprocessing import Process
from threading import Thread
import time


def task():
    count = 0
    for i in range(30000000):  # 30 million iterations
        count += 1


if __name__ == '__main__':
    # Multi-process concurrency/parallelism: about 2.37 seconds
    start_time = time.time()
    l1 = []
    for i in range(4):
        p = Process(target=task)
        l1.append(p)
        p.start()
    for i in l1:
        i.join()
    print(f'execution time: {time.time() - start_time}')

    # Multithread concurrency: about 6.29 seconds
    start_time = time.time()
    l1 = []
    for i in range(4):
        p = Thread(target=task)
        l1.append(p)
        p.start()
    for i in l1:
        i.join()
    print(f'execution time: {time.time() - start_time}')

# Compute-intensive: multi-process concurrency/parallelism is far more efficient.
```
```python
# IO-intensive: multithreading of a single process vs multi-process parallelism
from multiprocessing import Process
from threading import Thread
import time


def task():
    count = 0
    time.sleep(1)  # simulate IO
    count += 1


if __name__ == '__main__':
    # Multi-process concurrency/parallelism: about 3.01 seconds
    start_time = time.time()
    l1 = []
    for i in range(50):
        p = Process(target=task)
        l1.append(p)
        p.start()
    for p in l1:
        p.join()
    print(f'execution time: {time.time() - start_time}')

    # Multithread concurrency: about 1.01 seconds
    start_time = time.time()
    l1 = []
    for i in range(50):
        p = Thread(target=task)
        l1.append(p)
        p.start()
    for p in l1:
        p.join()
    print(f'execution time: {time.time() - start_time}')

# IO-intensive: multithreading of a single process is more efficient.
```
12. Multithreaded ==socket== communication
Whether with threads or processes: written the way we have so far, every client request gets its own new thread. Taken to its limit, that means opening as many threads and processes as your computer can bear, one per request, with no upper bound; the pools in the next section fix this.
Server:

```python
from threading import Thread
import socket


def communicate(conn, addr):
    while 1:
        try:
            from_client_data = conn.recv(1024)
            print(f'Message from {addr[1]}: {from_client_data.decode("utf-8")}')
            to_client_data = input('>>>').strip()
            conn.send(to_client_data.encode('utf-8'))
        except Exception:
            break
    conn.close()


def _accept():
    server = socket.socket()
    server.bind(('127.0.0.1', 8080))
    server.listen(5)
    while 1:
        conn, addr = server.accept()
        # one new thread per connection
        t = Thread(target=communicate, args=(conn, addr))
        t.start()


if __name__ == '__main__':
    _accept()
```
Client:

```python
import socket

client = socket.socket()
client.connect(('127.0.0.1', 8080))

while 1:
    try:
        to_server_data = input('>>>').strip()
        client.send(to_server_data.encode('utf-8'))
        from_server_data = client.recv(1024)
        print(f'Message from the server: {from_server_data.decode("utf-8")}')
    except Exception:
        break
client.close()
```
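As a bridge to the next section, here is a hedged sketch (not in the original notes) of the same server with a bounded worker count; it assumes the communicate() handler defined in the server above:

```python
from concurrent.futures import ThreadPoolExecutor
import socket


def _accept_pooled():
    server = socket.socket()
    server.bind(('127.0.0.1', 8080))
    server.listen(5)
    pool = ThreadPoolExecutor(10)  # at most 10 connections handled at once
    while 1:
        conn, addr = server.accept()
        # assumes the communicate() handler defined above; excess connections
        # wait in the pool's queue instead of spawning unbounded threads
        pool.submit(communicate, conn, addr)


if __name__ == '__main__':
    _accept_pooled()
```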
13. Process pool thread pool
```python
from concurrent.futures import ProcessPoolExecutor  # process pool
from concurrent.futures import ThreadPoolExecutor   # thread pool

p = ProcessPoolExecutor()  # defaults to one process per cpu core (parallel + concurrent)
t = ThreadPoolExecutor()   # defaults to cpu core count * 5 threads (concurrent)
print(os.cpu_count())      # number of cpu cores
```
```python
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures import ThreadPoolExecutor
import random
import time
import os

print(os.cpu_count())  # number of cpu cores


def task():
    print(f'pid {os.getpid()} is here')
    time.sleep(random.randint(1, 3))


if __name__ == '__main__':
    # Process pool (parallel + concurrent)
    p = ProcessPoolExecutor()  # defaults to one process per cpu core
    for i in range(20):
        p.submit(task)

    # Thread pool (concurrent)
    t = ThreadPoolExecutor()  # defaults to cpu core count * 5 threads
    for i in range(40):
        t.submit(task)
```
14. Blocking, non-blocking, synchronous, asynchronous
What is the problem with the naive version?
1. The analysis of the results is serial and therefore inefficient.
2. Only after every page has been crawled successfully do you put the results in a list and analyze them.

Solving problem 1 by opening new processes from inside the pool just re-opens processes and wastes resources.

Timing estimate (crawling one page takes about 2 s, analyzing it about 1 s):
- This version of the flow: asynchronously submit 10 crawl tasks; the four pool processes crawl four pages concurrently (in parallel), and whichever finishes first takes the next page, until all 10 succeed (a bit over 2 s). The 10 results then sit in a list and are analyzed serially (10 s): about 12 s in total.
- The next version of the flow: asynchronously submit 10 crawl + analyze tasks; the four processes run four such tasks concurrently (in parallel), and whichever finishes first takes the next crawl + analyze task, until all 10 succeed. Each task costs about 2 s + 1 s = 3 s (plus process-opening overhead), and the tasks overlap.

The callback function is executed by the main process; the callback does the analysis for you. The pool process's job stays clear: crawl the page, nothing else. The analysis runs in the callback, which decouples the two functions.

Extreme case: if the callback itself is an IO task, then, because the callback is run by the main process, it may hurt efficiency. Callbacks are not omnipotent: if the callback's work is IO, the async + callback mechanism is a poor fit, and to keep efficiency you must pay the overhead of opening another thread/process pool.

"Asynchronous IS callback" — this is wrong!! Asynchronous and callback are two separate concepts.

For multiple IO tasks running in a process/thread pool:
1. If the follow-up work is non-IO-blocking: use async + callback.
2. If the follow-up work's IO is much smaller than the tasks' IO: async + callback still works.
3. If the follow-up work's IO is >= the tasks' IO: hand it elsewhere, e.g. a second process/thread pool.

A sketch of the async + callback mechanism follows.
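A minimal sketch (not from the original notes) of the async + callback mechanism described above. The URLs, the 2 s / 1 s timings, and the crawl/parse names are illustrative assumptions; the crawl is simulated with time.sleep.

```python
from concurrent.futures import ProcessPoolExecutor
import time


def crawl(url):
    time.sleep(2)              # simulate: fetching a page takes ~2 s (IO)
    return f'<html of {url}>'  # simulated page content


def parse(future):
    # the callback: runs in the main process once a crawl task finishes;
    # suitable because this analysis step is cheap and not IO-bound
    content = future.result()  # the crawl's return value
    print(f'analyzed {len(content)} characters')


if __name__ == '__main__':
    pool = ProcessPoolExecutor(4)  # 4 crawler processes
    for i in range(10):
        future = pool.submit(crawl, f'http://example.com/page{i}')  # async submit
        future.add_done_callback(parse)                             # + callback
    pool.shutdown(wait=True)  # wait until all 10 crawl + analyze rounds are done
```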