Concurrent programming, summary

Posted by The Jackel on Sat, 24 Aug 2019 14:48:44 +0200

1. Processes

1. Two ways of process creation

  1. The first way to start the process is:

    from multiprocessing import Process
    import random
    import time
    
    
    def task(name):
        print(f'{name} is running')
        time.sleep(random.randint(1, 3))
        print(f'{name} is gone')
    
    
    if __name__ == '__main__':  # On Windows, process creation must happen under __name__ == '__main__'.
        p = Process(target=task, args=('Chang Xin',))  # Create a process object; args must be a tuple
        p.start()  
        '''
        start() only asks the operating system to spawn a child process; execution then moves
        straight on to the next line. When the OS receives the signal, it allocates a memory
        space for the child, copies the main process's data into it, and schedules the child on the cpu.
        '''
        print('start')
        time.sleep(2)
    # So the main process's code always runs first.
  2. The second way to start the process is:

    from multiprocessing import Process
    import random
    import time
    
    
    class MyProcess(Process):
        def __init__(self, name):
            super().__init__()
            self.name = name
    
        def run(self):  # start() invokes run(), so the method must be named run
            print(f'{self.name} is running')
            time.sleep(random.randint(1, 3))
            print(f'{self.name} is gone')
    
    
    if __name__ == '__main__':
        p = MyProcess('Chang Xin')
        p.start()
        print('==main')
  3. Simple application

    # Simple application
    from multiprocessing import Process
    import time
    
    
    def task(name):
        print(f'{name} is running')
        time.sleep(1)
        print(f'{name} is gone')
    
    
    def task1(name):
        print(f'{name} is running')
        time.sleep(2)
        print(f'{name} is gone')
    
    
    def task2(name):
        print(f'{name} is running')
        time.sleep(3)
        print(f'{name} is gone')
    
    
    if __name__ == '__main__':
        start_time = time.time()
        # Serial version for comparison: task(1); task1(2); task2(3) takes ~6 s
        p1 = Process(target=task, args=('Changxin No.1 Hero',))
        p2 = Process(target=task1, args=('Changxin No.2 Hero',))
        p3 = Process(target=task2, args=('Changxin No.3 Hero',))
        p1.start()
        p2.start()
        p3.start()
        p1.join(); p2.join(); p3.join()  # wait for all three (join is covered below)
        print(f'Total time: {time.time() - start_time}')  # ~3 s, bounded by the slowest task
    # Three processes perform the three tasks concurrently (or in parallel)
    # Creating processes buys parallelism instead of serial execution

2. Getting process pid

import os

print(f'Current process: {os.getpid()}')
print(f'Parent process: {os.getppid()}')

cmd commands for viewing pids:
tasklist                     # view the pids of all processes
tasklist | findstr pycharm   # view pycharm's pid
from multiprocessing import Process
import os

print(f'Current process: {os.getpid()}')  # module-level: runs in the main process (and again in the child on Windows spawn)
print(f'Parent process: {os.getppid()}')


def task(name):
    print(f'Child process: {os.getpid()}')
    print(f'Its parent (the main process): {os.getppid()}')


if __name__ == '__main__':
    p = Process(target=task, args=('Chang Xin',))
    p.start()
    print('==main starts')
    print(f'==main pid: {os.getpid()}')
    print(f'==main ppid: {os.getppid()}')

3. Spatial isolation between validation processes

At creation the child process receives a copy of the main process's data; after that the two are completely independent and share nothing.
from multiprocessing import Process
import time

name = 'Chang Xin'


def task():
    global name
    name = 'Guo Ji'
    print(f'The name of the child process is: {name}')


if __name__ == '__main__':
    p = Process(target=task)
    p.start()
    time.sleep(1)
    print(f'The name of the main process is: {name}')
----------------------------Partition line-----------------------------
lst = ['Guo Suhui', ]


def task1():
    lst.append('Guo Ji')
    print(f'The name of the child process is: {lst}')


if __name__ == '__main__':
    p = Process(target=task1)
    p.start()
    time.sleep(2)
    print(f'The name of the main process is: {lst}')

4. join

join makes the main process wait (block) until the child process has finished before running the code after the join.
join blocks only the main process; several joins on already-started children do not add up -- the waits overlap, so the total is roughly the slowest child.
# Correct usage (key point): start them all first, then join them all
from multiprocessing import Process
import time


def task(name):
    print(f'{name} is running')
    time.sleep(2)
    print(f'{name} is gone')


if __name__ == '__main__':
    start_time = time.time()
    l1 = []
    for i in range(1, 4):
        p = Process(target=task, args=(i,))
        l1.append(p)
        p.start()

    for i in l1:
        i.join()

    print(f'==main{time.time() - start_time}')

# Error demonstration: start() immediately followed by join() inside the loop
# serializes the children (~6 s) -- equivalent to the commented block below:
for i in range(1, 4):
    p = Process(target=task, args=(i,))
    p.start()
    p.join()
'''
p1 = Process(target=task,args=(1,))
p1.start()
p1.join()
p2 = Process(target=task,args=(2,))
p2.start()
p2.join()
p3 = Process(target=task,args=(3,))
p3.start()
p3.join()

'''
# join lets the main process wait for the child process to finish before executing the main process
from multiprocessing import Process
import time


def task(name):
    print(f'{name} is running')
    time.sleep(2)
    print(f'{name} is gone')


if __name__ == '__main__':
    p = Process(target=task, args=('Chang Xin',))
    p.start()
    p.join()
    print('==the main process continues')


# Multiple processes use join

def task(name, sec):
    print(f'{name} is running')
    time.sleep(sec)
    print(f'{name} is gone')


if __name__ == '__main__':
    start_time = time.time()
    p1 = Process(target=task, args=('Chang Xin', 1))
    p2 = Process(target=task, args=('Li Ye', 2))
    p3 = Process(target=task, args=('Sea Dogs', 3))
    p1.start()
    p2.start()
    p3.start()
    # join blocks only the main process; three joins in a row overlap,
    # so the total wait is ~3 s (the slowest child), not 1 + 2 + 3 s.
    p1.join()
    p2.join()
    p3.join()
    print(f'==main {time.time() - start_time}')
# ----------------------------------------------------------
def task(name, sec):
    print(f'{name} is running')
    time.sleep(sec)
    print(f'{name} is gone')


if __name__ == '__main__':
    start_time = time.time()
    p1 = Process(target=task, args=('Chang Xin', 3))
    p2 = Process(target=task, args=('Li Ye', 2))
    p3 = Process(target=task, args=('Sea Dogs', 1))

    p1.start()
    p2.start()
    p3.start()
    # join is blocking

    p1.join()  # waits ~3 s: p1 is the slowest
    print(f'==Main 1: {time.time() - start_time}')
    p2.join()  # p2 already finished while we waited for p1: no extra wait
    print(f'===Main 2: {time.time() - start_time}')
    p3.join()  # likewise
    print(f'==Main 3: {time.time() - start_time}')

5. Other parameters of the process

 p.terminate()  # kill the child process ***
 print(p.is_alive())  # *** is the child still alive? True / False
 p.join()  # ***
from multiprocessing import Process
import time


def task(name):
    print(f'{name} is running')
    time.sleep(2)
    print(f'{name} is gone')


if __name__ == '__main__':
    p = Process(target=task, args=('Chang Xin', ), name='Alex')
    p.start()
    time.sleep(1)
    p.terminate()  # ask the OS to kill the child process ***
    p.join()  # *** wait until the kill has actually taken effect
    time.sleep(1)
    print(p.is_alive())  # *** False: the child has been terminated
    print(p.name)
    p.name = 'sb'  # the name attribute can be reassigned
    print(p.name)
    print('main process continues')

6. Daemon process

p.daemon = True
Sets child p as a daemon: the daemon is killed as soon as the main process's code finishes. The flag must be set before the child is started.
from multiprocessing import Process
import time


def task(name):
    print(f'{name} is running')
    time.sleep(2)
    print(f'{name} is gone')


if __name__ == '__main__':
    p = Process(target=task, args=('Chang Xin',))  # Create a process object
    p.daemon = True  # set child p as a daemon: it dies as soon as the main process's code finishes
    p.start()
    # p.daemon = True  # would raise an error here: the flag must be set before start()
    time.sleep(1)
    print('===main')

7. Zombie processes and orphan processes

unix-like environments (linux, macOS)

  • The main process must wait for its child processes to finish before it can end.

    ==The main process monitors the running state of its children at all times; some time after a child finishes, it is reclaimed (reaped).==

  • Why doesn't the main process reclaim a child the instant it finishes?

    1. The main process and the child run asynchronously; the main process cannot catch the exact moment the child ends.
    2. If all of the child's resources were released from memory immediately, the main process could no longer query the child's status.

  • unix provides a mechanism to solve the problem above:

    When a child finishes, its open file handles and most of its memory are released immediately, but some information is kept -- ==process id, end time, exit status== -- waiting for the main process to monitor and reclaim it.

  • Zombie process: ==after a child finishes, it stays in the zombie state until it is reclaimed by its parent.==

  • Are zombie processes harmful???

    If the parent never calls wait/waitpid on them, zombie processes accumulate, occupying memory and pid numbers.

  • Orphan process:

    The parent ends for some reason while its children are still running; those children become orphan processes. The init process adopts every orphan, becomes its new parent, and reclaims it when it finishes.

  • How do you deal with zombie processes???

    A parent that spawns many children without reaping them produces many zombies. The direct solution is to ==kill the parent==: all its zombies become orphans and are reclaimed by init.
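
The mechanics are easy to observe on a unix system. Below is a minimal sketch (an addition to these notes, unix-only) that deliberately leaves a zombie for a few seconds; while the parent sleeps, run ps aux | grep defunct in another terminal to see the child listed as <defunct>:

import os
import time

pid = os.fork()           # unix-only: duplicate the current process
if pid == 0:              # child branch
    os._exit(0)           # the child exits immediately
else:                     # parent branch: deliberately delay reaping
    time.sleep(10)        # during these 10 s the child is a zombie (<defunct>)
    os.waitpid(pid, 0)    # reap the child; the zombie disappears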

8. Mutex Lock

Mutex (mutual exclusion lock):
    A lock-like mechanism protecting a critical section shared by several processes. While one process is executing the critical section, no other process may run it until the first one releases the lock.

Version 1:
    All processes grab the printer concurrently.
    Concurrency puts efficiency first, but here the requirement is order first.
    When multiple processes contend for one resource, order (serial access) must be guaranteed: one at a time.
Version 2:
    join solves the serialization problem and guarantees order, but the order is fixed in advance.
    That is unfair: when contending for the same resource it should be first come, first served.

The difference between lock and join:
Common point: both turn concurrency into serial execution, which guarantees order.
Difference: join fixes the order by hand; a lock lets the processes contend for the order, which guarantees fairness.
Version 3 (lock, in a for loop):
from multiprocessing import Process
from multiprocessing import Lock
import time
import random
import sys


def task1(p, lock):
    lock.acquire()
    print(f'{p} starts printing')
    time.sleep(random.randint(1, 3))
    print(f'{p} finishes printing')
    lock.release()


def task2(p, lock):
    lock.acquire()
    print(f'{p} starts printing')
    time.sleep(random.randint(1, 3))
    print(f'{p} finishes printing')
    lock.release()


def task3(p, lock):
    lock.acquire()
    print(f'{p} starts printing')
    time.sleep(random.randint(1, 3))
    print(f'{p} finishes printing')
    lock.release()


if __name__ == '__main__':
    mutex = Lock()
    for i in range(1, 4):
        p = Process(target=getattr(sys.modules[__name__], f'task{i}'), args=(i, mutex))  # look up task1/task2/task3 by name
        p.start()

9. Communication between processes

Processes are isolated at the memory level, but files are on disk

1. File-based Communication

Ticket-grabbing system:
1. Querying: check the number of remaining tickets. [Concurrent]
2. Purchasing: the client sends a request, the server receives it, decrements the ticket count and returns it to the front end. [Serial]

When many processes share a single resource (piece of data) and order (data safety) must be guaranteed, access has to be serial.
Mutex lock: guarantees fair ordering and data safety.
File-based inter-process communication:
    1. Low efficiency (disk IO).
    2. Locking by hand is troublesome and deadlocks easily.
from multiprocessing import Process
from multiprocessing import Lock
import random
import time
import json
import os


def search():
    time.sleep(random.randint(1, 3))  # simulate network delay for the query step
    with open('db.json', encoding='utf-8') as f1:  # db.json holds e.g. {"count": 3}
        dic = json.load(f1)
        print(f'{os.getpid()} checked the tickets, {dic["count"]} remaining')


def paid():
    with open('db.json', encoding='utf-8') as f1:
        dic = json.load(f1)
    if dic['count'] > 0:
        dic['count'] -= 1
        time.sleep(random.randint(1, 3))  # simulate network delay for the purchase step
        with open('db.json', encoding='utf-8', mode='w') as f1:
            json.dump(dic, f1)
        print(f'{os.getpid()} purchased successfully, {dic["count"]} ticket(s) remaining')
    else:
        time.sleep(1)
        print(f'{os.getpid()} purchase failed')


def task(lock):
    search()
    lock.acquire()
    paid()
    lock.release()


if __name__ == '__main__':
    mutex = Lock()
    for i in range(5):
        p = Process(target=task, args=(mutex,))
        p.start()

2. Queue-based communication

Queue: think of a queue as a container that holds data.
Characteristics: first in, first out (FIFO), and the data is kept until taken out.

q.put(5555)               # when the queue is full, put blocks the process until space frees up
print(q.get())            # when the queue is empty, get blocks until some process puts data

print(q.get(timeout=3))   # blocks for up to 3 seconds, then raises queue.Empty
print(q.get(block=False)) # raises immediately whenever it would otherwise block
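
A tiny runnable demo of those blocking behaviours (an addition to these notes; the values are arbitrary):

from multiprocessing import Queue

q = Queue(2)              # capacity 2
q.put('a')
q.put('b')
# q.put('c')              # would block: the queue is full
print(q.get())            # 'a' -- first in, first out
print(q.get(timeout=3))   # 'b' is already there, so this returns at once;
                          # on an empty queue it would wait 3 s, then raise queue.Empty
# q.get(block=False)      # the queue is empty now: would raise queue.Empty immediately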
Using a Queue to improve the ticket system:
    - the ticket count is stored in a queue
    - multiple buyer processes are opened; queries run concurrently, purchases serially
    - both success and failure of a purchase must be reported
from multiprocessing import Process
from multiprocessing import Queue
import random
import time
import os


def search(q):
    get = q.get()  # take the record out
    print(f'{os.getpid()} checked the tickets, {get["count"]} remaining')
    q.put(get)  # put it back so other processes can query


def paid(q):
    time.sleep(random.randint(1, 3))
    q_dic = q.get()  # take the record out; other buyers block on get, so purchase is serial
    if q_dic["count"] > 0:
        q_dic["count"] -= 1
        print(f"{os.getpid()} purchased successfully, {q_dic['count']} remaining")
    else:
        print(f"{os.getpid()} purchase failed")
    q.put(q_dic)  # put the (possibly updated) record back


def task(q):
    search(q)
    paid(q)


if __name__ == '__main__':
    q = Queue(1)
    q.put({"count": 3})
    for i in range(5):
        p = Process(target=task, args=(q,))
        p.start()
Simulate a Double Eleven rush for Xiaomi phones: many users compete, but only the first 10 may buy:
    - open many user processes that all try to buy
    - only 10 purchases are allowed
    - finally display the ranking of the 10 successful users

import os
from multiprocessing import Queue
from multiprocessing import Process


def task(q):
    try:
        q.put(f'{os.getpid()}', block=False)
    except Exception:
        return


if __name__ == '__main__':
    q = Queue(10)  # only the first 10 put() calls can succeed
    children = []
    for i in range(100):
        p = Process(target=task, args=(q,))
        children.append(p)
        p.start()
    for p in children:
        p.join()  # let every user try before reading the ranking
    for i in range(1, 11):
        print(f'No.{i} user: {q.get()}')

10. Producer-consumer model

Programming ideas, models, design patterns, theories and so on all hand you a ready-made approach: when you meet a similar situation, you apply it.

Three elements of producer-consumer model:

Producer: Generating data

Consumers: Receiving data for further processing

Container: basin (queue)

So what role does the queue container play? It acts as a buffer, balancing production and consumption rates and decoupling the producer from the consumer.

from multiprocessing import Process
from multiprocessing import Queue
import random
import time


def producer(q, name):
    for i in range(1, 6):
        time.sleep(random.randint(1, 2))
        res = f'steamed bun {i}'
        q.put(res)
        print(f'producer {name} produced {res}')


def consumer(q, name):
    while 1:
        try:
            food = q.get(timeout=3)  # if no food arrives for 3 s, assume the producer is done
            time.sleep(random.randint(1, 3))
            print(f'consumer {name} ate {food}')
        except Exception:
            break  # queue.Empty: stop consuming instead of looping forever


if __name__ == '__main__':
    q = Queue()

    p1 = Process(target=producer, args=(q, 'Chang Xin'))
    p2 = Process(target=consumer, args=(q, 'Chang Xinxin'))

    p1.start()
    p2.start()

2. Threads

1. Thread theory

  1. ==What is a thread==

    A process is like a factory; a thread is one assembly line inside it.

    Earlier description of starting a process: open a process space in memory, copy the main process's resource data, then have the cpu execute the code.

    That description was not specific enough. More precisely:

    Starting a process: open a process space in memory, copy the main process's resource data, then a thread executes the code.

    Processes are resource units; threads are execution units.

    So from now on, starting a process means: the process opens up a space in memory and duplicates the main process's data, and a thread executes the code inside.

  2. ==Thread vs process==

    1. Opening a process costs far more than opening a thread.
    2. Opening a thread is very fast -- tens to hundreds of times faster.
    3. Threads within one process can share data; processes must communicate through queues.

  3. ==Applications of threads==

    1. Concurrency: one cpu appears to perform multiple tasks at the same time.

      A single process opens three threads and executes tasks concurrently.

      Text editor example, three concurrent tasks:

      1. Take keyboard input.
      2. Display the text on screen.
      3. Save it to disk.

      Opening multiple threads is great here: shared data, low overhead, high speed.

    Threads have no main/sub status, but who does the work in a process? The main thread does; when it finishes, it still has to wait for the other (non-daemon) threads to finish before the process can end.

2. Two ways to open threads

The first way:
    
from threading import Thread
import time


def task(name):
    print(f'{name} is running')
    time.sleep(1)
    print(f'{name} is gone')


if __name__ == '__main__':
    p1 = Thread(target=task, args=('Chang Xin',))
    p1.start()
    print('===Main thread')
The second way:
    
from threading import Thread
import time


class MyThread(Thread):
    def __init__(self, name, l1, s1):
        super().__init__()
        self.name = name
        self.l1 = l1
        self.s1 = s1

    def run(self):
        print(f'{self.name} is running')
        print(f'{self.l1} is running')
        print(f'{self.s1} is running')
        time.sleep(1)
        print(f'{self.name} is gone')
        print(f'{self.l1} is gone')
        print(f'{self.s1} is gone')


if __name__ == '__main__':
    p1 = MyThread('Chang Xin', [1, 2, 3], '100')
    p1.start()
    print('===Main thread')

3. Code comparison of thread vs process

  1. Startup speed: thread vs process

    from multiprocessing import Process
    
    
    def work():
        print('hello')
    
    
    if __name__ == '__main__':  # start a child process under the main process
        t = Process(target=work)
        t.start()
        print('Main thread/Main process')
    from threading import Thread
    import time
    
    
    def task(name):
        print(f'{name} is running')
        time.sleep(1)
        print(f'{name} is gone')
    
    
    if __name__ == '__main__':
        t1 = Thread(target=task, args=('Sea Dogs',))
        t1.start()
        print('===Main thread')  # threads have no primary/secondary distinction
  2. ==Comparing pids: threads of one process share the same pid==

    from threading import Thread
    import os
    
    
    def task():
        print(os.getpid())
    
    
    if __name__ == '__main__':
        t1 = Thread(target=task)
        t2 = Thread(target=task)
        t1.start()
        t2.start()
        print(f'===Main thread{os.getpid()}')
  3. Threads in the same process share its data

    Resource data of a process is shared among all of that process's threads.
    
    from threading import Thread
    
    x = 3
    
    
    def task():
        global x
        x = 100
    
    
    if __name__ == '__main__':
        t1 = Thread(target=task)
        t1.start()
        t1.join()
        print(f'===Main thread{x}')

4. Other thread-related methods (for reference)

    # Methods on a Thread instance
    p1.setName('Subthread 1')  # set the thread name (older alias; prefer the name attribute)
    p1.getName()  # return the thread name (older alias)
    print(p1.name)  # get the thread name ***
    print(p1.is_alive())  # return whether the thread is alive

    # Some functions provided by the threading module:
    print(current_thread())  # get the current thread object
    print(currentThread())  # same, via the older camelCase alias
    print(enumerate())  # return a list of all live thread objects
    print(activeCount())  # *** return the number of live threads; same result as len(threading.enumerate())
from threading import Thread
from threading import currentThread
from threading import enumerate
from threading import activeCount
import os
import time

x = 9


def task():
    print(currentThread())
    time.sleep(1)
    print('666')


if __name__ == '__main__':
    p1 = Thread(target=task, name='p1')  # name Sets Thread name
    p2 = Thread(target=task, name='p2')  # name Sets Thread name
    p1.start()
    p2.start()

    # Methods on a Thread instance
    p1.setName('Subthread 1')  # set the thread name
    p2.setName('Subthread 2')
    p1.getName()  # return the thread name
    p2.getName()
    print(p1.name)  # get the thread name ***
    print(p2.name)
    print(p1.is_alive())  # return whether the thread is alive
    print(p2.is_alive())

    # Some functions provided by the threading module:
    print(currentThread())  # get the current thread object
    print(enumerate())  # return a list of all live thread objects
    print(activeCount())  # *** number of live threads; same as len(threading.enumerate())
    print(f'Main thread pid: {os.getpid()}')

5. Daemon threads (test points)

join: blocks the main thread until the sub-thread has finished executing.
When does the main thread end??? Only after every non-daemon sub-thread has finished.
A daemon thread is killed once the main thread and all non-daemon sub-threads are done.
from threading import Thread
import time


def foo():
    print(123)  # 1
    time.sleep(1)
    print("end123")  # 4


def bar():
    print(456)  # 2
    time.sleep(2)
    print("end456")  # 5


t1 = Thread(target=foo)
t2 = Thread(target=bar)

t1.daemon = True
t1.start()
t2.start()
print("main-------")  # 3
# Result:
# 123
# 456
# main-------
# end123
# end456

6. Mutex Lock (Test Point)

With the lock, the program runs the critical section serially.
Without the lock, the delay acts as a forced context switch: threads jump the queue, every thread reads the same stale value of x, and updates are lost.
from threading import Thread
from threading import Lock
import time
import random

x = 10


def task(lock):
    lock.acquire()
    time.sleep(random.randint(1, 3))  # simulated pause (a context-switch point)
    global x
    temp = x
    time.sleep(0.1)
    temp = temp - 1
    x = temp
    lock.release()


if __name__ == '__main__':
    mutex = Lock()
    l1 = []
    for i in range(10):
        t = Thread(target=task, args=(mutex,))
        l1.append(t)
        t.start()

    for t in l1:
        t.join()
    print(f'Main thread x={x}')  # 0: with the lock, all ten decrements take effect

7. Deadlock Phenomenon and Recursive Lock

  1. Deadlock: thread A holds lock A while waiting for lock B, and thread B holds lock B while waiting for lock A.
  2. Recursive lock (RLock): solves the deadlock above. When one piece of business logic needs several locks, prefer a recursive lock.
  3. Both names must point at the same lock, written as ==lock_A = lock_B = RLock()==. Principle: the RLock keeps a counter -- every acquire() adds one and every release() subtracts one -- and while the counter is nonzero no other thread may grab the lock. Import it with ==from threading import RLock==.
Deadlock phenomenon:
    
from threading import Thread
from threading import Lock
import time

lock_A = Lock()
lock_B = Lock()


class MyThread(Thread):
    def run(self):
        self.f1()
        self.f2()

    def f1(self):
        lock_A.acquire()
        print(f'{self.name} grabs A')
        lock_B.acquire()
        print(f'{self.name} grabs B')

        lock_B.release()
        lock_A.release()

    def f2(self):
        lock_B.acquire()
        print(f'{self.name} grabs B')
        time.sleep(0.1)  # while this thread holds B, another thread grabs A in f1 -> deadlock
        lock_A.acquire()
        print(f'{self.name} grabs A')

        lock_A.release()
        lock_B.release()


if __name__ == '__main__':
    for i in range(3):
        t = MyThread()
        t.start()
Recursive lock:
    
from threading import Thread
from threading import RLock
import time

lock_A = lock_B = RLock()


class MyThread(Thread):
    def run(self):
        self.f1()
        self.f2()

    def f1(self):
        lock_A.acquire()
        print(f'{self.name} grabs A')
        lock_B.acquire()
        print(f'{self.name} grabs B')

        lock_B.release()
        lock_A.release()

    def f2(self):
        lock_B.acquire()
        print(f'{self.name} grabs B')
        lock_A.acquire()
        print(f'{self.name} grabs A')

        time.sleep(1)
        lock_A.release()
        lock_B.release()


if __name__ == '__main__':
    for i in range(3):
        t = MyThread()
        t.start()

8. Semaphores

A semaphore is also a lock, one that limits how many threads may run concurrently.

==from threading import current_thread== gets the current thread object

==from threading import Semaphore== imports the semaphore class

==sem = Semaphore(5)== at most 5 threads may hold the semaphore at once (with no argument the default is 1)

==sem.acquire()== / ==sem.release()== bracket the limited section inside the task

from threading import Thread
from threading import Semaphore
from threading import current_thread
import random
import time

sem = Semaphore(5)


def task():
    sem.acquire()
    print(f'{current_thread().name} gets a room')
    time.sleep(random.randint(1, 3))
    sem.release()


if __name__ == '__main__':
    for i in range(30):
        t = Thread(target=task, )
        t.start()

9. ==GIL== Global Interpreter Lock

  1. Many self-proclaimed experts say the GIL is python's fatal flaw: python cannot use multiple cores, cannot be concurrent, and so on.

  2. == In theory, multithreading of a single process can take advantage of multicore.==

    However, the developers of the Cpython interpreter put a lock on threads entering the interpreter.

  3. Why lock?

    1. At the time it was the single-core era, and cpus were very expensive.
    2. Without a global interpreter lock, the Cpython developers would have had to lock and unlock all over the source code -- troublesome and deadlock-prone. To save effort they simply made every thread take one big lock on entering the interpreter.
    3. ==Advantage: it guarantees the safety of the Cpython interpreter's data resources.==
    4. ==Disadvantage: multithreading within a single process cannot use multiple cores.==
  4. Jython has no GIL lock, and neither does IronPython. (PyPy, by contrast, does keep a GIL.)

  5. Now in the multi-core era, can I remove Cpython's GIL lock?

    Because all the business logic of the Cpython interpreter is implemented around a single thread, removing the GIL lock is practically impossible.

  6. == Multi-threading of a single process can be concurrent, but it can't take advantage of multi-core and can't be parallel. Multiple processes can be concurrent and parallel.==

==IO-intensive: single-process multithreading with concurrent execution is appropriate==

==Compute-intensive: multiprocess parallelism is appropriate==

10. ==GIL== lock vs your own ==Lock==

  1. Similarity: both are mutex locks.
  2. Differences:
    1. The GIL is a global interpreter lock; it protects the interpreter's own resource data.
    2. The GIL is acquired and released automatically, with no manual work.
    3. A mutex defined in your code protects the resource data of your own process.
    4. A self-defined mutex must be acquired and released by hand.

11. Verifying efficiency: compute-intensive vs IO-intensive

  1. ==IO-intensive: single-process multithreading (concurrent execution) is efficient and appropriate.==

  2. ==Compute-intensive: multiprocess parallelism is efficient and appropriate.==

  3. Code validation:

    Compute-intensive: single-process multithreading vs multiprocess parallelism
    
    from multiprocessing import Process
    from threading import Thread
    import time
    
    
    def task():
        count = 0
        for i in range(30000000):  # (30 million)
            count += 1
    
    
    if __name__ == '__main__':
    
        # Multi-process concurrency, parallel 2.3737263679504395 seconds
        start_time = time.time()
        l1 = []
        for i in range(4):
            p = Process(target=task,)
            l1.append(p)
            p.start()
        for i in l1:
            i.join()
        print(f'execution time:{time.time()-start_time}')
    
        # Multithread concurrency 6.290118932723999 seconds
        start_time = time.time()
        l1 = []
        for i in range(4):
            p = Thread(target=task,)
            l1.append(p)
            p.start()
        for i in l1:
            i.join()
        print(f'execution time:{time.time()-start_time}')
    
    # Compute-intensive: multiprocess concurrency/parallelism is more efficient.
    # IO-intensive: single-process multithreading vs multiprocess parallelism
    from multiprocessing import Process
    from threading import Thread
    import time
    
    
    def task():
        count = 0
        time.sleep(1)
        count += 1
    
    
    if __name__ == '__main__':
    
        # Multiprocess concurrency, parallel 3.01239587646484 seconds
        start_time = time.time()
        l1 = []
        for i in range(50):
            p = Process(target=task, )
            l1.append(p)
            p.start()
    
        for p in l1:
            p.join()
    
        print(f'Execution time: {time.time() - start_time}')
    
        # Multithread concurrency 1.0087950229644775 seconds
        start_time = time.time()
        l1 = []
        for i in range(50):
            p = Thread(target=task,)
            l1.append(p)
            p.start()
    
        for p in l1:
            p.join()
    
        print(f'Execution time: {time.time() - start_time}')
    
    # IO-intensive: single-process multithreading is more efficient.

12. Multithreaded ==socket== communication

With the earlier style, whether multi-threaded or multi-process, every client request gets its own thread: one request, one thread. Within what your machine can bear, the more threads (or processes) you open, the more clients you can serve -- but the number must be capped somewhere, which is what pools (next section) are for.

Server:
    
from threading import Thread
import socket


def communicate(conn, addr):
    while 1:
        try:
            from_client_data = conn.recv(1024)
            print(f'message from port {addr[1]}: {from_client_data.decode("utf-8")}')
            to_client_data = input('>>>').strip()
            conn.send(to_client_data.encode('utf-8'))
        except Exception:
            break
    conn.close()


def _accept():
    server = socket.socket()
    server.bind(('127.0.0.1', 8080))
    server.listen(5)
    while 1:
        conn, addr = server.accept()
        t = Thread(target=communicate, args=(conn, addr))
        t.start()


if __name__ == '__main__':
    _accept()
Client:
    
import socket
client = socket.socket()
client.connect(('127.0.0.1', 8080))
while 1:
    try:
        to_server_data = input('>>>').strip()
        client.send(to_server_data.encode('utf-8'))
        from_server_data = client.recv(1024)
        print(f'Messages from the server: {from_server_data.decode("utf-8")}')
    except Exception:
        break
client.close()

13. Process pool thread pool

from concurrent.futures import ProcessPoolExecutor  # process pool
from concurrent.futures import ThreadPoolExecutor  # thread pool
p = ProcessPoolExecutor()  # by default the pool holds as many processes as the cpu has cores (parallel + concurrent)
t = ThreadPoolExecutor()  # by default (cpu core count) * 5 threads (concurrent)
print(os.cpu_count())  # check the machine's core count
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures import ThreadPoolExecutor
import random
import time
import os


print(os.cpu_count())  # Look at the computer cores


def task():
    print(f'pid Number: {os.getpid()} Coming')

    time.sleep(random.randint(1, 3))


if __name__ == '__main__':
    # Open the process pool (parallel + concurrent)
    p = ProcessPoolExecutor()  # by default as many processes as cpu cores
    for i in range(20):
        p.submit(task, )

    # Open the thread pool (concurrent)
    t = ThreadPoolExecutor()  # by default cpu core count * 5 threads
    for i in range(40):
        t.submit(task, )
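
A small usage sketch (an addition to these notes): pools are usually wrapped in a with block, which waits for all submitted tasks on exit, and map() returns results in submission order.

from concurrent.futures import ThreadPoolExecutor


def square(n):
    return n * n


if __name__ == '__main__':
    with ThreadPoolExecutor(max_workers=4) as t:  # shutdown(wait=True) happens on exit
        print(list(t.map(square, range(10))))  # [0, 1, 4, ..., 81], in order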

14. Blocking, non-blocking, synchronous, asynchronous

What's the problem with the plain pool version?
1. Analysing the results is serial and inefficient.
2. All the crawl results are collected into a list first, and only then analysed.
Solution idea for problem 1: reopening processes inside the pool's tasks would consume extra resources.

'''
Crawling one web page takes 2 s; crawling 10 pages concurrently takes a bit over 2 s.
Analysis tasks: 1 s each, 10 s in total. Overall roughly 12 s.

The current version's flow:
    Asynchronously submit 10 crawl tasks; four processes run four of them concurrently (in parallel),
    and whichever finishes first takes the next task, until all 10 crawls have succeeded.
    The 10 crawl results are put into a list and analysed serially.

The next version's flow:
    Asynchronously submit 10 crawl+analyse tasks; four processes complete four of them concurrently
    (in parallel), and whichever finishes first takes on the next, until all 10 crawl+analyse
    tasks have succeeded.
    Crawl 2 s + analyse 1 s, overlapped: roughly 3 s in total (plus process startup cost).
'''
The callback function is executed by the main process for you, and it is the callback that does the analysis. The pool's processes then have one clear task: crawling the network.
Analysis happens in the callback, which decouples the two functions.

Extreme case: if the callback itself is an IO task, efficiency may suffer, because every callback runs in the main process.

Callbacks are not omnipotent. If the callback's work is IO-bound,
the async + callback mechanism is a poor fit; to keep efficiency you can only sacrifice overhead and open another thread/process pool.


"Asynchronous means callbacks!" -- wrong!! Asynchrony and callbacks are two separate concepts.
'''
If there are many tasks, run their IO part in multiple processes/threads; then:
# 1. the remaining work has no blocking IO: async + callback fits.
# 2. the remaining work has a little IO, far less than the tasks' own IO: async + callback still fits.
# 3. the remaining IO is >= the tasks' IO: hand it to a second process/thread pool instead.
'''
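
A minimal sketch of the async + callback pattern described above (an addition to these notes; crawl and parse are made-up stand-ins, with sleep simulating network IO). submit() hands the crawl to the pool, and add_done_callback() makes the main process run parse on each finished Future:

from concurrent.futures import ProcessPoolExecutor
import time


def crawl(url):  # runs in a pool process: its only job is "crawling"
    time.sleep(2)  # simulate 2 s of network IO
    return f'<html of {url}>'


def parse(future):  # the callback: runs in the main process
    page = future.result()  # return value of the finished crawl
    print(f'parsed {len(page)} characters')


if __name__ == '__main__':
    with ProcessPoolExecutor(4) as p:
        for i in range(10):
            p.submit(crawl, f'http://example.com/{i}').add_done_callback(parse)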

15. Asynchronous invocation mechanism
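
The original notes leave this section empty; as a minimal sketch of the asynchronous-call mechanism referred to above: submit() returns a Future immediately (the asynchronous call), and .result() blocks until the value is ready.

from concurrent.futures import ProcessPoolExecutor
import time


def task(n):
    time.sleep(1)
    return n * 2


if __name__ == '__main__':
    with ProcessPoolExecutor(4) as p:
        futures = [p.submit(task, i) for i in range(8)]  # returns at once: asynchronous
        print([f.result() for f in futures])  # gathering results blocks as needed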


16. ==Event==
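
This section is also empty in the original notes; a minimal threading.Event sketch of the idea: one thread blocks on a flag until another thread sets it.

from threading import Thread, Event
import time

event = Event()  # an internal flag, initially False


def waiter():
    print('waiting for the event...')
    event.wait()  # block until the flag becomes True
    print('event is set, continuing')


def setter():
    time.sleep(1)
    event.set()  # set the flag: wakes every waiter


if __name__ == '__main__':
    Thread(target=waiter).start()
    Thread(target=setter).start()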

17. First look at coroutines


18. Coroutines
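
The coroutine sections are empty in the original notes; a minimal asyncio sketch of the idea: a single thread switches between tasks at each await point, so two 1-second "IO" tasks finish in about 1 second total.

import asyncio


async def task(name, sec):
    print(f'{name} is running')
    await asyncio.sleep(sec)  # yield control to the event loop while "blocked"
    print(f'{name} is gone')


async def main():
    await asyncio.gather(task('a', 1), task('b', 1))  # ~1 s total, not 2 s


asyncio.run(main())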

Topics: Python JSON socket encoding Programming