Transmission model
Hierarchical model
Description of each layer | OSI seven-layer network model | TCP/IP four-layer network model | Corresponding protocols |
---|---|---|---|
Provides the standard network communication interface for the operating system or application | Application layer | Application layer | Telnet, FTP, SMTP, DNS, HTTP |
Converts different encodings into the standard forms used in network communication (e.g. UTF-8) | Presentation layer | Application layer | Telnet, FTP, SMTP, DNS, HTTP |
Establishes or tears down connections between processes on different PCs and inserts synchronization points | Session layer | Application layer | Telnet, FTP, SMTP, DNS, HTTP |
End-to-end data connection and transmission between two hosts | Transport layer | Transport layer | TCP, UDP |
Selects routes and locates the target host | Network layer | Network layer | IP, ARP, RARP, ICMP |
Accurate data transmission between two adjacent nodes | Data link layer | Network interface layer | Network communication hardware and interfaces |
Transmission of raw bits on the physical medium | Physical layer | Network interface layer | Network communication hardware and interfaces |
Application layer
Name | Explanation |
---|---|
HTTP | Hypertext Transfer Protocol; the protocol browsers use to request pages from web servers |
FTP | File Transfer Protocol; a protocol for exchanging files between hosts over the network |
SMTP | Simple Mail Transfer Protocol; defines how mail is transmitted between mail servers |
POP | Post Office Protocol; defines how users download mail from the mail server to the local machine |
Telnet | Remote login protocol; used to log in to other computers on the Internet remotely and access the programs or data stored on them |
DNS | Domain Name System; translates domain names into IP addresses |
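For example, the DNS translation in the last row can be observed directly from Python's standard library (a minimal sketch; the hostname is only an example):

```python
import socket

# Resolve a domain name to an IPv4 address via DNS
ip = socket.gethostbyname("example.com")  # example hostname
print(ip)  # e.g. "93.184.216.34"
```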
Transport layer
A network protocol is the set of conventions and communication rules that both parties must follow in order to communicate. Two endpoints on a network can exchange information correctly only if they follow the same protocol.
TCP
- Establishes a reliable connection through the three-way handshake
- Splits application data into segments of the most appropriate size for transmission
- The connection is reliable; out-of-order delivery and data loss are unlikely
- Connection setup and verification take time, so efficiency is reduced
UDP
- Sends data directly, without establishing a connection
- The server can send to multiple clients at the same time
- No verification, so data is easily lost
- Low resource consumption and fast processing
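As a sketch of the contrast with TCP, a complete UDP exchange needs no connection at all (the addresses and the message here are illustrative):

```python
import socket

# Receiver: bind a UDP socket and wait for one datagram
recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 9999))

# Sender: no connect, no handshake; just send the datagram
send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_sock.sendto(b"hello", ("127.0.0.1", 9999))

data, addr = recv_sock.recvfrom(1024)  # Receive the datagram and the sender's address
print(data, addr)
```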
Network layer
Name | Explanation |
---|---|
IP | Internet Protocol; every computer and device on the Internet has an address called an "IP address" |
ARP | Address Resolution Protocol; maps an IP address to a physical (MAC) address |
RARP | Reverse Address Resolution Protocol; maps a physical address to an IP address |
ICMP | Internet Control Message Protocol; sends control messages and reports packet delivery errors |
IPv4 and IPv6
- IPv4 was the first widely deployed version of IP. IP is the network-layer protocol of the TCP/IP protocol family and its core protocol.
  An IPv4 address is 32 bits long, divided into four 8-bit segments separated by dots and written in decimal, so each segment ranges from 0 to 255: the minimum address is 0.0.0.0 and the maximum is 255.255.255.255.
  The address 127.0.0.1 is reserved as the loopback address, and 0.0.0.0 stands for any address.
- Classes for general computer networks:
  - Class A, first octet 1–127, net.host.host.host ==> subnet mask 255.0.0.0
  - Class B, first octet 128–191, net.net.host.host ==> subnet mask 255.255.0.0
  - Class C, first octet 192–223, net.net.net.host ==> subnet mask 255.255.255.0
- Classes for multicast and research:
  - Class D, first octet 224–239, multicast
  - Class E, first octet 240–254, reserved for research
- An IPv6 address is four times the length of an IPv4 address (128 bits), written in hexadecimal and divided into eight segments separated by colons.
  IPv6 was created because IPv4's address space had been fully allocated.
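The two address formats can be illustrated with Python's standard ipaddress module (a minimal sketch; the addresses are examples):

```python
import ipaddress

v4 = ipaddress.ip_address("192.168.1.1")               # 32-bit dotted-decimal IPv4
v6 = ipaddress.ip_address("2001:db8::8a2e:370:7334")   # 128-bit colon-hex IPv6
print(v4.version, int(v4))      # 4, plus the underlying 32-bit integer
print(v6.version, v6.exploded)  # 6, plus all eight 16-bit groups written out
```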
Port
When multiple programs use the network at the same time, ports ensure that information reaches the right place. Different programs use different ports, so even though they share the same network, the information is still delivered to the correct recipient.
Port numbers range from 0 to 65535 (2^16 − 1); for example, port 80 is used for web browsing and port 21 for FTP.
An IP address distinguishes different hosts, while a port number distinguishes different network services on one host.
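The well-known port assignments mentioned above can be queried directly in Python (a minimal sketch):

```python
import socket

print(socket.getservbyname("http"))  # 80: web browsing service
print(socket.getservbyname("ftp"))   # 21: FTP service
print(socket.getservbyport(80))      # "http": the reverse lookup
```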
TCP protocol
Establishing a connection
- Step 1: the client sends a SYN segment to the server and enters the SYN_SENT state.
- Step 2: the server receives the SYN, responds with a SYN+ACK segment, and enters the SYN_RECV state.
- Step 3: the client receives the server's SYN+ACK, responds with an ACK segment, and enters the ESTABLISHED state.
After these three steps, the TCP client and server have successfully established a connection and can begin transmitting data.
Long connection and short connection
- Short connection
  - Establish connection - transfer data - close connection ... establish connection - transfer data - close connection
- Long connection
  - Establish connection - transfer data ... (keep the connection) ... transfer data - close connection
Disconnect
- Step 1: the client sends a FIN to the server to request closing the data transfer.
- Step 2: on receiving the client's FIN, the server sends an ACK whose acknowledgment number is the received sequence number plus 1.
- Step 3: the server then sends a FIN to tell the client that its side is closing.
- Step 4: on receiving the server's FIN, the client replies with an ACK, again with acknowledgment number equal to the received sequence number plus 1.
Create socket
Server
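The server code itself is not shown in the original; below is a minimal sketch consistent with this section (using 127.0.0.1:8989 as in the later examples):

```python
import socket

server = socket.socket()          # TCP socket; defaults to AF_INET, SOCK_STREAM
server.bind(("127.0.0.1", 8989))  # Bind the listening address and port
server.listen(10)                 # Listen, with a backlog of 10
client, addr = server.accept()    # Block until a client connects; returns the peer connection socket
```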
Client
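Likewise, a minimal client sketch (assuming the server above):

```python
import socket

client = socket.socket()             # TCP socket
client.connect(("127.0.0.1", 8989))  # Connect to the server; this performs the three-way handshake
```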
Peer connection socket
Data interaction
```python
client.send(b"hello")     # The data must be of bytes type
data = client.recv(1024)  # Receive data; specify the maximum number of bytes to read
client.close()            # Close the connection
```
Blocking
While no client has connected or sent data, the server blocks in accept and recv.
Simplex: like broadcasting; one side only sends and the other only receives.
Half duplex: only one side can talk at a time.
Full duplex: both sides can send and receive at the same time.
Basic IO model
Data stream concept
- A data stream is an ordered sequence of bytes with a start point and an end point; it is usually read only once (or a limited number of times).
- Streams are divided into input streams and output streams. An input stream can only be read, not written; an output stream can only be written, not read. A program uses an input stream to read data and an output stream to write data, as if the data flowed into and out of the program. Streams make the program's input and output independent of the underlying devices.
- An input stream may take its data from the keyboard or a file; an output stream may send its data to the display, a printer, or a file.
IO explained and IO interaction
- IO means input and output. In the Unix world everything is a file, and a file is simply a stream of bytes: sockets, FIFOs, pipes, and terminals are all files, and all of them are streams. Exchanging information means sending data to and receiving data from these streams, which is what we call I/O operations (input and output).
- To read data from a stream, the program makes the read system call; to write data, it makes the write system call. But a computer has many streams, so how does the program say which one to operate on? That is the job of the file descriptor (fd). An fd is an integer, and operating on that integer means operating on the corresponding file (stream). When a socket is created, a system call returns a file descriptor, and every further operation on the socket becomes an operation on that descriptor.
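This can be seen directly in Python (a minimal sketch): a freshly created socket exposes its descriptor via fileno().

```python
import socket

sock = socket.socket()
print(sock.fileno())  # The integer file descriptor the OS uses for this socket (e.g. 3)
```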
Blocking IO
In practice, the data has often not yet arrived when the read is issued, so the kernel must wait until enough data is available.
On the user-process side, the whole process blocks. Once the kernel has the data ready, it copies the data from kernel space to user memory and returns the result; only then does the user process unblock and continue running.
Non-blocking IO model and non-blocking sockets
Non-blocking IO model
- From the user process's point of view, a read operation returns a result immediately instead of waiting. If the result is an error, the process knows the data is not ready yet.
- The process can then issue the read again. Once the kernel has the data ready and receives another system call from the user process, it immediately copies the data into user memory and returns. The key difference from the blocking interface is that the non-blocking interface returns immediately after being called.
Non-blocking IO
Use exception handling to deal with the exceptions that non-blocking IO raises.
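A minimal sketch of that pattern (BlockingIOError is the exception a non-blocking socket raises when nothing is ready yet):

```python
import socket

server = socket.socket()
server.bind(("127.0.0.1", 8989))
server.listen(10)
server.setblocking(False)  # Put the socket in non-blocking mode

while True:
    try:
        conn, addr = server.accept()  # Returns immediately; raises if no client is waiting
        conn.setblocking(False)
        print("connected:", addr)
    except BlockingIOError:
        pass  # No connection ready yet; the loop can do other work and ask again
```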
Concurrency and parallelism
Concurrency
Concurrency means that, over a period of time, several programs are all somewhere between start and completion and run on the same processor, but at any single point in time only one of them is running on the processor.
The code is simple to write, but the process must be understood.
The main idea is to use a loop to solve the problem.
Parallelism
Parallelism means that, over a period of time, several programs are all somewhere between start and completion and run on different processors, with multiple programs running on different processors at any single point in time.
IO multiplexing
In the non-blocking IO model above, you check for data by polling continuously, which wastes resources.
With a multiplexer, checking changes from active polling to being notified, which saves system resources and performs better.
epoll
- Non-blocking sockets versus multiplexing
  - A non-blocking socket must traverse every peer connection socket by itself, and each check is an IO operation
  - A multiplexer does not need all those IO operations: it tells you which peer connection sockets have data, and you process only those
- epoll uses lazy event callbacks: the callback is invoked by the user, and the operating system only plays the role of notifier
- epoll is the best IO multiplexer on Linux, but it is available only on Linux and nowhere else
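In Python this mechanism is exposed through the standard selectors module, whose DefaultSelector uses epoll on Linux. A minimal echo-server sketch:

```python
import socket, selectors

sel = selectors.DefaultSelector()  # epoll on Linux, the best available mechanism elsewhere

server = socket.socket()
server.bind(("127.0.0.1", 8989))
server.listen(10)
server.setblocking(False)
sel.register(server, selectors.EVENT_READ)  # Watch the listening socket for new connections

while True:
    # Blocks until some registered socket is ready; only ready sockets are returned,
    # so there is no need to traverse every connection ourselves
    for key, events in sel.select():
        sock = key.fileobj
        if sock is server:
            conn, addr = server.accept()
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ)  # Watch the peer connection socket too
        else:
            data = sock.recv(1024)
            if data:
                sock.send(data)  # Echo the data back
            else:
                sel.unregister(sock)
                sock.close()
```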
Multitasking
CPU time slice
Scheduling uses a priority-based time-slice switching algorithm.
On a single core, the basis of "parallelism" is concurrency; only multiple cores can achieve true parallelism.
Process
A process is a running program
- Each process has its own independent address space, memory, data stack, and other auxiliary data used to track its execution
- Processes are independent of each other and do not affect one another
Multiprocessing
Use additional processes to take over time-consuming tasks: run the time-consuming work in another process so that the main process is not affected.
When the child process finishes, it returns its results.
```python
import multiprocessing, time

def func():
    print("Internal start:", time.time())
    time.sleep(5)  # Simulate a time-consuming operation
    print("Internal end:", time.time())

def func1(a):
    print("Internal start:", time.time())
    time.sleep(5)  # Simulate a time-consuming operation
    print(a)
    print("Internal end:", time.time())

# Simulate work done outside the child processes
print("External start", time.time())
p1 = multiprocessing.Process(target=func)  # Instantiate a new process object (child process)
p2 = multiprocessing.Process(target=func1, args=("nihao",))  # Instantiate a child process, passing arguments via args
p1.start()  # Start the process so it shares the work
p2.start()  # Start the process
time.sleep(5)  # Simulate a time-consuming operation
print("External end", time.time())
# Output: total time is about 5 seconds
```
First instantiate multiprocessing.Process, specifying the target function and its argument list.
After instantiation, call start(); the tasks then run in separate processes, which saves running time.
The parallelism has limits: once the total number of processes exceeds the number of CPU cores, adding more processes brings no further speedup.
Processes are scheduled by the operating system.
Multithreading
If a process is compared to a factory, threads are the workers in the factory: one process can contain multiple threads.
The workshop space is shared by the workers, just as every room is accessible to every worker. Likewise, a process's memory space is shared, and every thread can use that shared memory.
A thread can be preempted (interrupted) or temporarily suspended (put to sleep).
Threads are scheduled by the Python interpreter, while processes are scheduled by the operating system.
Code implementation
```python
import threading, time

def func():
    print("Internal start:", time.time())
    time.sleep(5)  # Simulate a time-consuming operation
    print("Internal end:", time.time())

def func1(a):
    print("Internal start:", time.time())
    time.sleep(5)  # Simulate a time-consuming operation
    print(a)
    print("Internal end:", time.time())

print("External start", time.time())
t1 = threading.Thread(target=func)  # Instantiate a thread object to share the work
t2 = threading.Thread(target=func1, args=("nihao",))  # Instantiate a thread, passing arguments via args
t1.start()
t2.start()  # Start the threads
time.sleep(5)  # Simulate a time-consuming operation
print("External end:", time.time())
# Total time is still about 5 seconds
```
GIL: the global interpreter lock
When Python was invented there was no concept of multi-core CPUs. To take advantage of multiple cores, Python began to support multithreading, and the simplest way to keep multithreaded data consistent and state synchronized is locking. Thus the GIL, one big lock, came into being.
The GIL requires that within a process only one thread may execute at any moment, so multiple CPU cores cannot be given to multiple threads. Threads in Python therefore achieve only concurrency, never true parallelism.
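A quick way to see the effect (a minimal sketch; timings vary by machine): a CPU-bound task run in two threads takes about as long as running it twice in sequence, because the GIL lets only one thread execute bytecode at a time.

```python
import threading, time

def count(n):
    while n > 0:
        n -= 1

N = 10_000_000

start = time.time()
count(N); count(N)  # Sequential: two runs back to back
print("sequential:", time.time() - start)

start = time.time()
t1 = threading.Thread(target=count, args=(N,))
t2 = threading.Thread(target=count, args=(N,))
t1.start(); t2.start()
t1.join(); t2.join()  # Threaded: roughly the same total time under the GIL
print("threaded:  ", time.time() - start)
```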
Multitask execution control
Waiting for a subtask to end
The join method
After join is called on a process or thread, the caller waits for the subtask to end; if it has not ended, the caller blocks until it does.
For this reason, join is usually placed at the end of the program.
```python
import threading, time

def func():
    print("Internal start:", time.time())
    time.sleep(5)  # Simulate a time-consuming operation
    print("Internal end:", time.time())

def func1(a):
    print("Internal start:", time.time())
    time.sleep(5)  # Simulate a time-consuming operation
    print(a)
    print("Internal end:", time.time())

print("External start", time.time())
t1 = threading.Thread(target=func)  # Instantiate a thread object to share the work
t2 = threading.Thread(target=func1, args=("nihao",))  # Instantiate a thread, passing arguments via args
t1.start()
t2.start()  # Start the threads
time.sleep(5)  # Simulate a time-consuming operation
t1.join()
t2.join()  # Make the main thread wait for the child threads to end before it ends
print("External end:", time.time())
# Total time is still about 5 seconds
```
Terminating a task
Normally the end of the main process does not affect its child processes, but child processes can also be forcibly terminated when the main process ends.
Note that threads cannot be terminated; you can only wait for them to finish.
```python
import multiprocessing, time

def func():
    print("Internal start:", time.time())
    time.sleep(5)  # Simulate a time-consuming operation
    print("Internal end:", time.time())

def func1(a):
    print("Internal start:", time.time())
    time.sleep(5)  # Simulate a time-consuming operation
    print(a)
    print("Internal end:", time.time())

# Simulate work done outside the child processes
print("External start", time.time())
p1 = multiprocessing.Process(target=func)  # Instantiate a new process object (child process)
p2 = multiprocessing.Process(target=func1, args=("nihao",))  # Instantiate a child process, passing arguments via args
p1.start()  # Start the process so it shares the work
p2.start()  # Start the process
time.sleep(5)  # Simulate a time-consuming operation
# p1.join()
# p2.join()  # Would make the main process wait for the children to end
p1.terminate()
p2.terminate()  # Forcibly terminate the child processes; this does not work for threads
print("External end", time.time())
# Output: total time is about 5 seconds
```
Task name
Setting and changing the name
```python
import multiprocessing

# Set a name at instantiation time
p1 = multiprocessing.Process(name="processName")
print(p1, p1.name)
# The name can also be changed directly afterwards
p1.name = "NewName"
print(p1.name)
```
Getting the current process
- Getting the current process inside the process's own code makes it easier to track down problems
Use multiprocessing.current_process()
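A minimal sketch (assuming Linux, as elsewhere in this section):

```python
import multiprocessing

def func():
    # Inside the child, current_process() returns the child's own Process object
    print("In child:", multiprocessing.current_process())

print("In main:", multiprocessing.current_process())  # <_MainProcess name='MainProcess' ...>
p1 = multiprocessing.Process(target=func)
p1.start()
p1.join()
```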
Multitask identification
The pid of a process
On Linux, the system assigns a pid as soon as a process is created, and the pid does not change while the program runs.
The pid can be used to inspect the process's resource usage or to control the process.
```python
import multiprocessing, time

def func():
    print("Internal start:", time.time())
    time.sleep(5)  # Simulate a time-consuming operation
    print("Internal end:", time.time())

# Simulate work done outside the child process
print("External start", time.time())
p1 = multiprocessing.Process(target=func)  # Instantiate a new process object (child process)
print("Before startup:", p1.pid)  # None: no pid has been assigned yet
p1.start()  # Start the process so it shares the work
print("After startup:", p1.pid)
time.sleep(5)  # Simulate a time-consuming operation
# p1.join()
# p1.terminate()
print("External end", time.time())
# Output: total time is about 5 seconds
```
The ident of a thread
A thread lives inside a process, so it has no pid of its own.
Threads are scheduled by the Python interpreter; for ease of scheduling each thread has an ident, which plays a role similar to the operating system's pid.
```python
import threading, time

def func():
    print("Internal start:", time.time())
    time.sleep(5)  # Simulate a time-consuming operation
    print("Internal end:", time.time())

print("External start", time.time())
t1 = threading.Thread(target=func)  # Instantiate a thread object to share the work
print("Before startup:", t1.ident)  # None: no ident has been assigned yet
t1.start()
print("After startup:", t1.ident)
time.sleep(5)  # Simulate a time-consuming operation
# t1.join()
print("External end:", time.time())
# Total time is still about 5 seconds
```
Life cycle
A process's life cycle begins at start(): after instantiation the process has not yet started, and its life cycle begins only once start() is called.
```python
import multiprocessing, time

def func():
    print("Internal start:", time.time())
    time.sleep(5)  # Simulate a time-consuming operation
    print("Internal end:", time.time())

# Simulate work done outside the child process
print("External start", time.time())
p1 = multiprocessing.Process(target=func)  # Instantiate a new process object (child process)
print("Before startup:", p1.is_alive())
p1.start()  # Start the process so it shares the work
print("After startup:", p1.is_alive())  # is_alive() returns a Boolean: whether the process has started its life cycle
time.sleep(5)  # Simulate a time-consuming operation
# p1.join()
# p1.terminate()
print("External end", time.time())
# Output: total time is about 5 seconds
```
Daemon mode
When daemon mode is enabled, child processes end automatically as soon as the main process ends.
For the main process, "finished running" means its own code has finished.
The main process is considered finished once its code finishes (daemon children are reclaimed at that point); it then waits for its non-daemon children to finish so that their resources can be reclaimed (otherwise zombie processes would be produced).
For the main thread, "finished running" means that all non-daemon threads in its process have finished; only then is the main thread considered done.
The main thread ends only after all other non-daemon threads have ended (daemon threads are reclaimed at that point), because the end of the main thread means the end of the process, whose resources are reclaimed as a whole; the process must therefore make sure all non-daemon threads finish before it can end.
```python
import multiprocessing, time

def func():
    print("Internal start:", time.time())
    time.sleep(5)  # Simulate a time-consuming operation
    print("Internal end:", time.time())

# Simulate work done outside the child process
print("External start", time.time())
p1 = multiprocessing.Process(target=func, daemon=True)  # Instantiate a child process with daemon mode enabled
p1.start()  # Start the process so it shares the work
time.sleep(5)  # Simulate a time-consuming operation
# p1.join()
# p1.terminate()
print("External end", time.time())
# Output: total time is about 5 seconds
```
Object-oriented programming
When using multiple processes or threads, the corresponding module's class can be used directly or customized through inheritance.
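A minimal sketch of the inheritance style (the class name MyTask is illustrative; run() is what the thread executes after start()):

```python
import threading, time

class MyTask(threading.Thread):
    def __init__(self, name):
        super().__init__()
        self.task_name = name

    def run(self):
        # run() is invoked automatically once start() is called
        print("Internal start:", self.task_name, time.time())
        time.sleep(2)  # Simulate a time-consuming operation
        print("Internal end:", self.task_name, time.time())

t = MyTask("demo")
t.start()  # Do not call run() directly; start() creates the new thread
t.join()
```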
Inter-process communication
Communication isolation
Isolation between processes: even when a variable is declared global, modifying it in one process has no effect in another.
So how can this be solved?
A Manager is a common solution for inter-process communication: it communicates through a shared public space.
```python
import multiprocessing  # Process module

def func(dict_var):
    # The child process modifies the shared dictionary through the proxy
    dict_var.update({"a": 3, "b": 4})

mg = multiprocessing.Manager()  # Create a public space and get a manager, the communication medium; it can hold dictionaries or lists
dict_var = mg.dict()  # Allocate a dictionary in the shared space; a proxy is returned, and the dictionary is operated on through it
print(dict_var)  # Print the shared dictionary: empty at first

# The main process modifies the shared dictionary
dict_var.update({"a": 1, "b": 2})

p1 = multiprocessing.Process(target=func, args=(dict_var,))
p1.start()
p1.join()
print(dict_var)  # Shows the child process's modification
```
Thread communication
Thread sharing
Threads always live in the same process, so they share the same memory space and can access the data in the main thread.
Thread resource contention
Resource errors occur because one CPU-level operation is really a combination of several instructions; if another thread's instructions are interleaved in the middle of the operation, the result is unpredictable.
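A minimal sketch of the contention (run it a few times; without a lock the final value is usually not 0):

```python
import threading

var = 0

def add():
    global var
    for _ in range(1000000):
        var += 1  # A read-modify-write: not a single atomic instruction

def sub():
    global var
    for _ in range(1000000):
        var -= 1

t1 = threading.Thread(target=add)
t2 = threading.Thread(target=sub)
t1.start(); t2.start()
t1.join(); t2.join()
print(var)  # Expected 0, but often some other value because the updates interleave
```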
Mutex
A shared resource can be locked to protect it and guarantee that each operation on it completes as a whole.
```python
import threading

var = 1
# Create a mutex
lock = threading.Lock()

# Define the tasks the two threads will run
def func1():
    global var  # Declare the global variable
    for i in range(1000000):
        lock.acquire()  # Acquire the lock before operating
        var += i
        lock.release()  # Release the lock after operating

def func2():
    global var  # Declare the global variable
    for i in range(1000000):
        lock.acquire()  # Acquire the lock before operating
        var -= i
        lock.release()  # Release the lock after operating

# Create 2 threads
t1 = threading.Thread(target=func1)
t2 = threading.Thread(target=func2)
t1.start()
t2.start()
t1.join()
t2.join()
print(var)
```
Queue
Concept
FIFO: first in, first out
Implementation
- Enqueue: put(item)
- Dequeue: get()
- Test whether empty: empty() (approximate)
- Test whether full: full() (approximate)
- Approximate queue length: qsize()
- Mark a task as done: task_done() (used with threads)
- Wait for all tasks to complete: join() (used with threads)
```python
import queue

q = queue.Queue()  # A number in the parentheses limits the number of elements
q.put(1)
q.put(12)
q.get()
q.get()
```
Queue counter
- The queue has an internal counter. When the count is 0, join() does not block; otherwise it blocks.
```python
import queue

q = queue.Queue(3)
q.join()       # Does not block at this point: the counter is 0
q.put("a")     # Each put increments the internal counter by 1, so join would now block
q.get()        # get does not decrement the counter
q.task_done()  # Calling task_done() decrements the counter by 1
```
Producer-consumer model
Main thread: acts like the producer, generating tasks
Worker thread: acts like the consumer, processing tasks
```python
import time, queue, threading

class MyThread(threading.Thread):
    def __init__(self):
        super().__init__()
        self.daemon = True           # Enable daemon mode
        self.queue = queue.Queue(3)  # A queue object that holds at most three tasks
        self.start()                 # Start the thread on instantiation; no manual start needed

    # Consumer: run() is built in and called automatically when the thread runs
    def run(self) -> None:
        while True:  # Process tasks continuously
            func, args, kwargs = self.queue.get()
            func(*args, **kwargs)   # Execute the task; remember to unpack the variable-length arguments
            self.queue.task_done()  # Mark one task done, decrementing the counter to avoid blocking

    # Producer: submit tasks
    def submit_tasks(self, func, args=(), kwargs={}):  # func is the task to run; variable-length parameters have defaults
        self.queue.put((func, args, kwargs))  # Submit the task

    # Override the join method
    def join(self) -> None:
        self.queue.join()  # Wait until the queue's counter is 0, i.e. all tasks are done

def f1():
    time.sleep(2)  # The task takes two seconds
    print("Mission accomplished")

def f2(*args, **kwargs):
    time.sleep(2)
    print("Task 2 complete", args, kwargs)

print(time.ctime())  # Print the start time

# Instantiate the thread object
mt = MyThread()
# Submit tasks
mt.submit_tasks(f1)
mt.submit_tasks(f2, args=("aa", "aasd"), kwargs={"a": 2, "s": 3})
# Make the main thread wait until the tasks are done
mt.join()
print(time.ctime())  # End time
```
Thread reuse
Thread reuse is achieved with the producer-consumer model: the main thread acts as the producer, generating tasks, while the worker thread acts as the consumer, processing them.
Single-thread reuse
Thread pool
How to open multiple threads
```python
import time, queue, threading

class MyPool():
    def __init__(self, n):
        self.queue = queue.Queue(3)  # A queue object that holds at most three tasks
        for i in range(n):           # n is the number of threads to start
            # Each worker thread runs the consumer loop; started automatically
            threading.Thread(target=self.run, daemon=True).start()

    # Consumer
    def run(self) -> None:
        while True:  # Process tasks continuously
            func, args, kwargs = self.queue.get()
            func(*args, **kwargs)   # Execute the task; remember to unpack the variable-length arguments
            self.queue.task_done()  # Mark one task done, decrementing the counter to avoid blocking

    # Producer: submit tasks
    def submit_tasks(self, func, args=(), kwargs={}):
        self.queue.put((func, args, kwargs))  # Submit the task

    # Override the join method
    def join(self) -> None:
        self.queue.join()  # Wait until the queue's counter is 0, i.e. all tasks are done

def f1():
    time.sleep(2)  # The task takes two seconds
    print("Mission accomplished")

def f2(*args, **kwargs):
    time.sleep(2)
    print("Task 2 complete", args, kwargs)

print(time.ctime())  # Print the start time

mt = MyPool(2)
mt.submit_tasks(f1)
mt.submit_tasks(f2, args=("aa", "aasd"), kwargs={"a": 2, "s": 3})
mt.join()
print(time.ctime())  # End time
```
Built-in thread pool
```python
import time
from multiprocessing.pool import ThreadPool  # Thread pool module

def f1():
    time.sleep(2)  # The task takes two seconds
    print("Mission accomplished")

def f2(*args, **kwargs):
    time.sleep(2)
    print("Task 2 complete", args, kwargs)

print(time.ctime())  # Print the start time

# Instantiate the thread pool
pool = ThreadPool(2)
# Submit tasks
pool.apply_async(f1)
pool.apply_async(f2, args=("aa", "aasd"), kwds={"a": 2, "s": 3})  # Note: keyword arguments are passed via kwds here
# The pool must be closed before join so that it accepts no more tasks
pool.close()
# Make the main thread wait until the worker threads finish
pool.join()
print(time.ctime())  # End time
```
Process pool
Built-in process pool
```python
import time
from multiprocessing.pool import Pool  # Process pool module

def f1():
    time.sleep(2)  # The task takes two seconds
    print("Mission accomplished")

def f2(*args, **kwargs):
    time.sleep(2)
    print("Task 2 complete", args, kwargs)

print(time.ctime())  # Print the start time

# Instantiate the process pool
pool = Pool(2)
# Submit tasks
pool.apply_async(f1)
pool.apply_async(f2, args=("aa", "aasd"), kwds={"a": 2, "s": 3})  # Note: keyword arguments are passed via kwds here
# The pool must be closed before join so that it accepts no more tasks
pool.close()
# Make the main process wait until the child processes finish
pool.join()
print(time.ctime())  # End time
```
Pool concurrent server
```python
import socket  # Socket module
from multiprocessing.pool import ThreadPool  # Thread pool module
from multiprocessing import cpu_count  # Module for getting the number of CPU cores

# Create the server
server = socket.socket()  # Server socket object
server.bind(("127.0.0.1", 8989))
server.listen(10)

def socket_recv(conn):  # Peer connection socket
    # Loop so data can be processed repeatedly
    while True:
        recv_data = conn.recv(1024)  # Receive data
        if recv_data:
            # If there is data, print it
            print(recv_data)
        else:
            # If there is no data, close the conn socket and end the loop
            conn.close()
            break

n = cpu_count()  # Get the number of CPU cores
pool = ThreadPool(n)  # Open as many threads as there are CPU cores
while True:
    conn, addr = server.accept()  # Create a peer connection socket
    pool.apply_async(socket_recv, args=(conn,))  # The thread pool's task is handling the data
```
```python
import socket  # Socket module
from multiprocessing.pool import ThreadPool  # Thread pool module
from multiprocessing import cpu_count, Pool  # CPU core count and the process pool module

# Create the server
server = socket.socket()  # Server socket object
server.bind(("127.0.0.1", 8989))
server.listen(10)

# Server-side data handling
def socket_recv(conn):  # Peer connection socket
    # Loop so data can be processed repeatedly
    while True:
        recv_data = conn.recv(1024)  # Receive data
        if recv_data:
            # If there is data, print it and send it back to the client
            print(recv_data)
            conn.send(recv_data)
        else:
            # If there is no data, close the conn socket and end the loop
            conn.close()
            break

# The task each process runs: accept connections and hand them to a thread pool
def accept_process(server):
    tp = ThreadPool(cpu_count() * 2)
    while True:
        conn, addr = server.accept()  # Create a peer connection socket
        tp.apply_async(socket_recv, args=(conn,))

n = cpu_count()  # Get the number of CPU cores
pool = Pool(n)  # Open a process pool with n processes
for i in range(n):
    pool.apply_async(accept_process, args=(server,))
# The pool must be closed before join so that it accepts no more tasks
pool.close()
pool.join()
```
Coroutines
Concept
A coroutine provides concurrency within a single thread; it is also called a micro-thread.
A coroutine is a lightweight user-mode thread: it is controlled and scheduled by the user program itself.
Advantages
- Coroutine switching is cheaper: it is program-level switching, completely invisible to the operating system, and therefore more lightweight
- Concurrency can be achieved within a single thread, making maximal use of the CPU
Disadvantages
- A coroutine is essentially single-threaded and cannot use multiple cores; to do that, a program can start multiple processes, each process multiple threads, and each thread multiple coroutines
- Coroutines run inside a single thread, so once one coroutine blocks, the whole thread blocks
Characteristics
- Concurrency is implemented inside one single thread
- No locks are needed to modify shared data
- The user program keeps the context stacks of the multiple control flows
- Additionally, a coroutine automatically switches to another coroutine when it hits an IO operation (plain yield and greenlet cannot detect IO, which is why the gevent module, based on the select mechanism, is used)
Generators
- With yield, a function can suspend and resume repeatedly, and these suspension points can also be used to pass data in and out
```python
def func():
    print(1)
    yield 2
    print(3)
    yield 4
    print(5)

a = func()   # Generator object
for i in a:  # Values can be taken with a for loop
    print(i)

b = func()      # A fresh generator (the first one is now exhausted)
print(next(b))  # Values can also be taken with next()

"""Passing data in"""
def func():
    while True:
        y = yield  # Receive values sent into the generator
        print(y)

a = func()   # Generator object
next(a)      # Advance to the first yield; this also activates the generator
a.send(111)  # Send data in
```
Producer-consumer model
```python
import time

# Consumer
def consumer():
    while True:
        y = yield
        time.sleep(1)  # Process the data in one second
        print("Processed data", y)

# Producer
def producer():
    con = consumer()  # Get the consumer object and activate it
    next(con)
    for i in range(10):
        time.sleep(1)
        print("send data", i)
        con.send(i)

producer()  # Call the producer directly
```
The producer and consumer here are implemented with neither processes nor threads; the two generators simply cooperate with each other. That is a coroutine.
greenlet
A third-party module; it must be installed first (pip install greenlet).
Implementing coroutines with bare generators is inconvenient, especially when there are many coroutines; greenlet is much more convenient.
The module derives from Stackless Python, a Python variant with native coroutines (standard Python is usually CPython). Its coroutine support was packaged into a standalone module, and it performs much better than generators.
```python
import time
from greenlet import greenlet  # Import the greenlet module

# Consumer
def consumer():
    while True:
        var = pro.switch()  # Switch to the producer and wait here to receive data
        print("consume:", var)

# Producer
def producer():
    for i in range(10):
        time.sleep(1)
        print("produce:", i)
        con.switch(i)  # Pass the value to the consumer
        print(f"{i} produced and consumed")

# Wrap our tasks in greenlets
con = greenlet(consumer)
pro = greenlet(producer)
# greenlet switches tasks via switch(): called with no argument it transfers control and
# waits to receive data; called with an argument, that argument is delivered as the
# return value of the other side's switch()
con.switch()
This does not solve the IO blocking problem, but the blocked time is used to do other work. In practice, the best concurrency is usually achieved with processes + threads + coroutines combined.
gevent
gevent concurrent server
```python
# Third-party module; install with: pip install gevent
# Monkey patch: required when using the gevent module
from gevent import monkey; monkey.patch_all()
import gevent, socket  # Coroutine module, socket module

# Create the server
server = socket.socket()  # Server socket
server.bind(("127.0.0.1", 8989))
server.listen(10)

# Server-side data handling
def socket_recv(conn):  # Peer connection socket
    # Loop so data can be processed repeatedly
    while True:
        recv_data = conn.recv(1024)  # Receive data
        if recv_data:
            # If there is data, print it and send it back to the client
            print(recv_data)
            conn.send(recv_data)
        else:
            # If there is no data, close the conn socket and end the loop
            conn.close()
            break

while True:
    conn, addr = server.accept()
    # Spawn a coroutine for each connection
    gevent.spawn(socket_recv, conn)
```
gevent wraps epoll and greenlet, making them more convenient to use.
At the same time, it switches coroutines automatically when IO blocks.
Producer-consumer model
```python
# Third-party module; install with: pip install gevent
# Monkey patch: required when using the gevent module
from gevent import monkey; monkey.patch_all()
import gevent  # Coroutine module
from gevent.queue import Queue  # gevent's own queue module

que = Queue(3)

# Producer
def producer(que):
    for i in range(20):
        print("produce:", i)
        que.put(i)

# Consumer
def consumer(que):
    for i in range(20):
        var = que.get()
        print("consume:", var)

# Let gevent run the coroutine tasks
pro = gevent.spawn(producer, que)
con = gevent.spawn(consumer, que)
# Wait until all coroutines are finished
gevent.joinall([pro, con])
```
Implementing asynchrony
Synchronous: all methods execute in sequence; the total time spent is the sum of the methods' times.
Asynchronous: the opposite of synchronous; the CPU suspends the response to the current request, handles the next one, and resumes the first after being notified through a callback or polling. Multitasking handles the subtasks asynchronously.
```python
# gevent: synchronous versus asynchronous
# Monkey patch: required when using the gevent module (it also patches time.sleep)
from gevent import monkey; monkey.patch_all()
import gevent, time  # Coroutine module

# One task, used for both the synchronous and the asynchronous comparison
def task(i):
    time.sleep(1)  # Simulated blocking
    print(f"{i} task over")

# Synchronous: submit the tasks one after another
def syn():
    for i in range(10):
        task(i)

# Asynchronous: submit the tasks as coroutines
def asy():
    gl = [gevent.spawn(task, i) for i in range(10)]  # A list comprehension: less time, more efficient
    gevent.joinall(gl)

print("Start running")
start = time.time()  # Code start
# syn()
asy()
end = time.time()  # Code end
print(f"Running time {end - start}")
```