Python foundation: concurrent programming 02

Posted by tekrscom on Wed, 11 Dec 2019 02:57:55 +0100

Concurrent programming

Two ways of reclaiming subprocess resources

  • join() makes the main process wait for a subprocess to finish and reclaims the subprocess's resources; the main process then ends and its own resources are reclaimed

    from multiprocessing import Process
    import time
    
    
    def task(name):
        print(f'Child process {name}: starting......')
        time.sleep(1)
        print(f'Child process {name}: end......')
    
    
    if __name__ == '__main__':
        print('Enter the main process')
        pro_list = []
        for i in range(3):
            pro_obj = Process(target=task, args=(i,))
            pro_list.append(pro_obj)
            pro_obj.start()
    
        for pro in pro_list:
            # join() makes the main process wait here until the subprocess finishes, so that its resources can be reclaimed
            pro.join()
    
        print('End main process')
    
  • When the main process ends normally, the subprocess resources are reclaimed together with the main process's

Background concepts

Zombie process: the subprocess has ended, but the main process has not yet ended or reclaimed it, so the subprocess's PID is not released.

Disadvantage: the number of PIDs in the operating system is limited. As long as a PID, i.e. the resource, stays occupied, creating new processes may fail.

Orphan process: the subprocess has not ended when the main process ends abnormally, so the main process never reclaims the subprocess's PID; instead the subprocess is reclaimed by the operating system's optimization mechanism.

Operating system optimization mechanism: when the main process terminates unexpectedly, the operating system checks whether it left any running subprocesses; if so, it takes them over and reclaims them when they finish.
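Reaping can be observed directly. A minimal sketch, assuming Unix fork semantics (where a finished but un-joined child briefly lingers as a zombie); it relies on the documented side effect of multiprocessing.active_children(), which joins any children that have already finished:

```python
from multiprocessing import Process, active_children
import time


def quick_task():
    pass  # the child exits almost immediately


def reap_demo():
    p = Process(target=quick_task)
    p.start()
    time.sleep(0.5)       # the child has exited; until joined it lingers as a zombie (on Unix)
    active_children()     # side effect: joins every finished child, reaping the zombie
    return p.exitcode     # 0 once the child has been reaped


if __name__ == '__main__':
    print(reap_demo())
```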

Daemon

A daemon process ends as soon as the main process ends. The daemon attribute must be set before the child process is started

from multiprocessing import Process
import time


# Process task
def task():
    print('starting......')
    time.sleep(2)
    print('ending......')


if __name__ == '__main__':
    print('Enter the main process')
    obj_list = []
    for i in range(2):
        # Create process
        pro_obj = Process(target=task)
        obj_list.append(pro_obj)
        # Mark as a daemon process; must be set before start()
        pro_obj.daemon = True
        # Start the child process
        pro_obj.start()

    for obj in obj_list:
        obj.join()

    print('End of main process')

Data is isolated between processes, code demonstration

from multiprocessing import Process

count = 0


def func1():
    global count
    count += 10
    print(f'func1:{count}')


def func2(count):
    count += 100
    print(f'func2:{count}')


if __name__ == '__main__':
    # Create subprocess 1
    pro_obj1 = Process(target=func1)
    # Create subprocess 2
    pro_obj2 = Process(target=func2, args=(count,))
    # Subprocess 1 on
    pro_obj1.start()
    pro_obj1.join()
    # Subprocess 2 on
    pro_obj2.start()
    pro_obj2.join()

    print(f'Main process: {count}')

Output result

func1:10
func2:100
Main process: 0

Thread

Reference resources: https://blog.csdn.net/daaikuaichuan/article/details/82951084

In general terms, process and thread are distinguished as follows

Process: the operating system allocates system resources (CPU time slices, memory, etc.) on a per-process basis; the process is the smallest unit of resource allocation

Threads: sometimes referred to as lightweight processes, are the smallest unit of execution for operating system scheduling (CPU scheduling)

Difference between process and thread

  • Scheduling: the thread is the basic unit of scheduling and dispatch, while the process is the basic unit of resource ownership
  • Concurrency: concurrent execution can be performed between processes, or between multiple threads of the same process
  • Own resources: a process is an independent unit that owns resources. A thread does not own system resources, but can access resources belonging to a process.
  • Overhead: the system allocates and reclaims resources when processes are created or destroyed. Threads are just different execution paths within a process: if a process crashes, all of its threads die with it. Multiprocess programs are therefore more robust than multithreaded ones, but they consume more resources, and switching between processes is less efficient

Process and thread connections

  • A thread belongs to exactly one process; a process can have multiple threads, and has at least one
  • Resources are allocated to the process, and all threads of the same process share all of its resources
  • What actually runs on the processor is the thread
  • Threads belonging to different processes synchronize by means of message passing

Creating threads

# How to create a sub thread
# 1. Import the Thread class in the threading module
from threading import Thread
import time

number = 1000


def task():
    global number
    number = 200
    print('Sub thread start')
    time.sleep(1)
    print('Sub thread end')


if __name__ == '__main__':
    # 2. Create a sub thread object
    thread_obj = Thread(target=task)
    # 3. Start sub thread
    thread_obj.start()
    # 4. Wait for the sub thread to finish; only then does the main thread continue
    thread_obj.join()
    print('Main process (main thread)')
    print(number)           # Output: 200

# Method 2: create a sub thread by subclassing Thread
# 1. Import the Thread class in the threading module
from threading import Thread
import time

# 2. Inherit Thread class
class MyThread(Thread):
    def run(self):
        global number
        number = 200
        print('Sub thread start')
        time.sleep(1)
        print('Sub thread end')


if __name__ == '__main__':
    # Create a child thread object
    t = MyThread()
    # Open child thread
    t.start()
    t.join()
    print('Main process (main thread)')
    print(number)           # Output: 200

Daemon thread: set the daemon attribute of the sub thread object to True, that is

from threading import Thread
import time


def task():
    global number
    number = 200
    print('Sub thread start')
    time.sleep(1)
    print('Sub thread end')


if __name__ == '__main__':
    # 2. Create a sub thread object
    thread_obj = Thread(target=task)
    # 3. Mark as a daemon thread; must be set before start()
    thread_obj.daemon = True
    # 4. Start the sub thread
    thread_obj.start()
    # 5. Wait for the sub thread to finish before the main thread ends
    thread_obj.join()
    print('Main process (main thread)')
    print(number)

Queue

A queue acts as a third-party channel: it can store data and pass it between processes (i.e. data interaction). First in, first out

It can be realized in three ways

  • from multiprocessing import Queue
  • from multiprocessing import JoinableQueue
  • import queue: Python's built-in queue module (note: it only works between threads of one process, not between processes)

Storing data in a queue

  • put(obj): store data; once the number of stored items reaches the queue's maximum length, the call blocks
  • put_nowait(obj): store data; if the queue is already full, raise queue.Full instead of blocking

Fetching data from a queue

  • get(): fetch data; once all records in the queue have been fetched, further calls block
  • get_nowait(): fetch data; once all records have been fetched, further calls raise queue.Empty instead of blocking
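The four methods can be seen in isolation with the built-in thread queue (a minimal sketch using queue.Queue; the multiprocessing queues raise the same queue.Full / queue.Empty exceptions):

```python
import queue

q = queue.Queue(2)          # a queue that holds at most 2 items
q.put(1)
q.put(2)                    # the queue is now full

try:
    q.put_nowait(3)         # full, so this raises immediately instead of blocking
except queue.Full:
    print('queue.Full raised')

print(q.get())              # -> 1 (first in, first out)
print(q.get())              # -> 2; the queue is now empty

try:
    q.get_nowait()          # empty, so this raises immediately instead of blocking
except queue.Empty:
    print('queue.Empty raised')
```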

Use

from multiprocessing import JoinableQueue
# from multiprocessing import Queue
# import queue
from multiprocessing import Process


# Storing data in a queue
def task_put(queue):
    number_list = [10, 20, 30, 40]
    for i in number_list:
        # put() stores data; once the queue is full, the call blocks
        queue.put(i)
        print(f'Stored record: {i}')
        # put_nowait() stores data; raises queue.Full once the queue is full
        # queue.put_nowait(i)
        # print(f'Stored record: {i}')

    queue.put(1000)
    print(f'Stored record: {1000}')
    # put_nowait() stores data; raises queue.Full once the queue is full
    # queue.put_nowait(1000)
    # print(f'Stored record: {1000}')


# Fetch data from the queue
def task_get(queue):
    for i in range(5):
        # get() fetches data; once the queue is empty, the call blocks
        print(f'Fetched record {i + 1}: {queue.get()}')
        # get_nowait() fetches data; raises queue.Empty once the queue is empty
        # print(f'Fetched record {i + 1}: {queue.get_nowait()}')


if __name__ == '__main__':
    # Create a queue object via from multiprocessing import JoinableQueue
    queue_obj = JoinableQueue(3)  # The int argument is the maximum number of items the queue can hold
    # Create a queue object via from multiprocessing import Queue
    # queue_obj = Queue(4)
    # Create a queue object via import queue (threads within one process only)
    # queue_obj = queue.Queue(4)

    # Process 1 stores data
    pro_obj1 = Process(target=task_put, args=(queue_obj,))
    # Process 2 fetches data
    pro_obj2 = Process(target=task_get, args=(queue_obj,))
    # Both processes must run concurrently: task_put stores 5 records but the queue
    # holds only 3, so put() would block forever without a running consumer
    pro_obj1.start()
    pro_obj2.start()
    pro_obj1.join()
    pro_obj2.join()

Review:

Implementing a FIFO queue with a list and with an ordered dictionary

# Queue implemented with a list
# Define an empty list as a queue
queue = []
# Insert elements at the head of the list
queue.insert(0, 1)
queue.insert(0, 2)
queue.insert(0, "hello")
print(queue)
for index in range(len(queue)):
    print(f"Element {index + 1}:", queue.pop())
# Mode 1: queue via an ordered dictionary
from collections import OrderedDict

# Insert elements into an ordered dictionary
ordered_dict = OrderedDict()
ordered_dict[1] = 1
ordered_dict[2] = 2
ordered_dict[3] = 'hello'
# Move the earlier-inserted keys to the end, reversing the order
ordered_dict.move_to_end(2)
ordered_dict.move_to_end(1)
print(ordered_dict)
for index in range(3):
    print(ordered_dict.pop(index + 1))
    
# Mode two
# Queue implementation through ordered dictionary
from collections import OrderedDict

ordered_dict = OrderedDict()
ordered_dict['1'] = 1
ordered_dict['2'] = 2
ordered_dict['3'] = 'hello'
ordered_dict.move_to_end('2')
ordered_dict.move_to_end('1')
print(ordered_dict)

ordered_dict.move_to_end('1')
ordered_dict.move_to_end('2')
ordered_dict.move_to_end('3')
index = 1
for key in ordered_dict:
    print(f'Element {index}: {key}')
    index += 1
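For comparison, the standard library's collections.deque is the idiomatic FIFO container, and it avoids the O(n) cost of list.insert(0, ...). A minimal sketch:

```python
from collections import deque

# append() pushes to the tail, popleft() pops from the head: first in, first out
fifo = deque()
fifo.append(1)
fifo.append(2)
fifo.append('hello')

while fifo:
    print(fifo.popleft())   # -> 1, then 2, then hello
```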

IPC mechanism

IPC (inter-process communication)

Communication between processes can be implemented through queues; for details, see the queue examples above
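Besides queues, the multiprocessing module also provides Pipe for IPC. A minimal sketch (assuming a Unix-like start method): the child sends a message through one end of the pipe and the parent receives it from the other.

```python
from multiprocessing import Process, Pipe


def child_task(conn):
    conn.send('hello from child')   # send a picklable object through the pipe
    conn.close()


def pipe_demo():
    parent_conn, child_conn = Pipe()
    p = Process(target=child_task, args=(child_conn,))
    p.start()
    message = parent_conn.recv()    # blocks until the child has sent something
    p.join()
    return message


if __name__ == '__main__':
    print(pipe_demo())
```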

Mutex

Mutual exclusion: program fragments scattered across different tasks, arranged so that while one task is executing one of the fragments, no other task may execute any of them; the others can only proceed after that task leaves its fragment. The most basic scenario: a shared resource can be used by only one process or thread at a time; multiple processes or threads cannot use it simultaneously

Mutex: a simple locking mechanism to control access to shared resources. A mutex has only two states: locked [lock.acquire()] and unlocked [lock.release()]

Function: turns concurrent execution into serial execution, sacrificing efficiency to guarantee data safety

Features: atomicity, uniqueness, non busy waiting

  • Atomicity: if a process/thread locks a mutex, no other process/thread can successfully lock it at the same moment
  • Uniqueness: if a process/thread locks a mutex, no other process/thread can lock it until it has been unlocked
  • Non-busy waiting: if a process/thread holds a mutex and a second one tries to lock it, the second is suspended (consuming no CPU) until the first unlocks; the second is then woken up, resumes execution, and acquires the mutex

Mutex workflow:

  1. Create a mutex object from the Lock class in the multiprocessing module
  2. Call lock.acquire() before entering the critical section that accesses the shared resource
  3. Call lock.release() after leaving the critical section
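The acquire/release pair in steps 2-3 is usually written with a with statement, which releases the lock even if the critical section raises. A minimal sketch using threads and threading.Lock (multiprocessing.Lock supports the same protocol):

```python
from threading import Lock, Thread

lock = Lock()
counter = 0


def increment():
    global counter
    with lock:              # equivalent to lock.acquire() ... lock.release()
        value = counter     # critical section: the read-modify-write is now atomic
        counter = value + 1


threads = [Thread(target=increment) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)              # -> 100
```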

Process mutex: a small example of ticket purchase

# Content in data.json file: {"number": 1}
# Ticket buying example
from multiprocessing import Process  # process
from multiprocessing import Lock  # Process mutex
import datetime
import json
import random
import time


# Check tickets
def check_ticket(name):
    with open('data.json', 'r', encoding='utf-8') as f:
        ticket_dic = json.load(f)
    print(f'[{datetime.datetime.now()}]user{name}Check tickets,'
          f'Current balance:{ticket_dic.get("number")}')


# Ticket purchase
def buy_ticket(name):
    # Get the number of current tickets
    with open('data.json', 'r', encoding='utf-8') as f:
        ticket_dic = json.load(f)
    number = ticket_dic.get('number')
    if number:
        number -= 1
        # Network delay of simulated ticket purchase
        time.sleep(random.random())
        ticket_dic['number'] = number
        # Ticket purchase succeeded
        with open('data.json', 'w', encoding='utf-8') as f:
            json.dump(ticket_dic, f)
        print(f'[{datetime.datetime.now()}] {name} grabbed a ticket successfully!')
    else:
        # Ticket purchase failed
        print(f'[{datetime.datetime.now()}] {name} failed to grab a ticket!')


def main(name, lock):
    # Check tickets
    check_ticket(name)
    # Use mutex for ticket purchase
    # Lock up
    lock.acquire()
    buy_ticket(name)
    # Unlock
    lock.release()


if __name__ == '__main__':
    pro_list = []
    # Create mutex object
    lock = Lock()
    # Create 10 processes
    for i in range(10):
        pro_obj = Process(target=main, args=(f'pro_obj{i + 1}', lock))
        pro_list.append(pro_obj)
        pro_obj.start()

    for pro in pro_list:
        pro.join()

Thread mutex example

"""
//Open 10 threads to modify one data
"""
from threading import Lock
from threading import Thread
import time

# Create thread mutex object
lock = Lock()
# Records to modify
number = 100


# Thread task
def task():
    global number
    # Lock up
    # lock.acquire()
    # Modified value
    number2 = number
    time.sleep(1)
    number = number2 - 1
    # Unlock
    # lock.release()


if __name__ == '__main__':
    # Create 10 threads
    list1 = []
    for line in range(10):
        t = Thread(target=task)
        t.start()
        list1.append(t)

    # The main thread ends only after all child threads have finished
    for t in list1:
        t.join()

    print(number)  # With mutex, output: 90; without mutex, output: 99

Topics: Python, JSON, Fragment, encoding, Programming