Python basics (30): sticky packets and other socket methods

Posted by justice1 on Sat, 16 Nov 2019 10:44:02 +0100

1. Sticky packets

1.1 The sticky packet phenomenon

Let's start by making a program that executes commands remotely based on TCP (commands: ls -l; lllll; pwd)

When several commands are executed one after another, it is likely that only part of a command's result is received, and the rest of that result arrives when the next command is executed. This is the sticky packet phenomenon.

1.1.1 Sticky packets based on the TCP protocol

server:

#_*_coding:utf-8_*_
from socket import *
import subprocess

ip_port=('127.0.0.1',8888)
BUFSIZE=1024

tcp_socket_server=socket(AF_INET,SOCK_STREAM)
tcp_socket_server.setsockopt(SOL_SOCKET,SO_REUSEADDR,1)
tcp_socket_server.bind(ip_port)
tcp_socket_server.listen(5)

while True:
    conn,addr=tcp_socket_server.accept()
    print('Client',addr)

    while True:
        cmd=conn.recv(BUFSIZE)
        if len(cmd) == 0:break

        res=subprocess.Popen(cmd.decode('utf-8'),shell=True,
                         stdout=subprocess.PIPE,
                         stdin=subprocess.PIPE,
                         stderr=subprocess.PIPE)

        stderr=res.stderr.read()
        stdout=res.stdout.read()
        conn.send(stderr)
        conn.send(stdout)

client:

#_*_coding:utf-8_*_
import socket
BUFSIZE=1024
ip_port=('127.0.0.1',8888)

s=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
res=s.connect_ex(ip_port)

while True:
    msg=input('>>: ').strip()
    if len(msg) == 0:continue
    if msg == 'quit':break

    s.send(msg.encode('utf-8'))
    act_res=s.recv(BUFSIZE)

    print(act_res.decode('utf-8'),end='')

1.1.2 Sticky packets based on the UDP protocol

server:

#_*_coding:utf-8_*_
from socket import *
import subprocess

ip_port=('127.0.0.1',9000)
bufsize=1024

udp_server=socket(AF_INET,SOCK_DGRAM)
udp_server.setsockopt(SOL_SOCKET,SO_REUSEADDR,1)
udp_server.bind(ip_port)

while True:
    #Receive messages
    cmd,addr=udp_server.recvfrom(bufsize)
    print('User Commands----->',cmd)

    #Logical Processing
    res=subprocess.Popen(cmd.decode('utf-8'),shell=True,stderr=subprocess.PIPE,stdin=subprocess.PIPE,stdout=subprocess.PIPE)
    stderr=res.stderr.read()
    stdout=res.stdout.read()

    #Send a message
    udp_server.sendto(stderr,addr)
    udp_server.sendto(stdout,addr)
udp_server.close()

client:

from socket import *
ip_port=('127.0.0.1',9000)
bufsize=1024

udp_client=socket(AF_INET,SOCK_DGRAM)


while True:
    msg=input('>>: ').strip()
    udp_client.sendto(msg.encode('utf-8'),ip_port)
    err,addr=udp_client.recvfrom(bufsize)
    out,addr=udp_client.recvfrom(bufsize)
    if err:
        print('error : %s'%err.decode('utf-8'),end='')
    if out:
        print(out.decode('utf-8'), end='')

Note: only TCP has sticky packets; UDP never does.

1.2 Causes of sticky packets

1.2.1 Data transfer in the TCP protocol

(1) Packet splitting in the TCP protocol

When the data in the sender's buffer is larger than the MTU of the network card, TCP splits it into several packets and sends them separately. MTU stands for Maximum Transmission Unit: the largest packet that can be transmitted over the network, measured in bytes. Most network devices use an MTU of 1500. If the local machine's MTU is larger than the gateway's, large packets are fragmented in transit, which produces many fragments, raises the packet loss rate, and slows the network down.
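As a rough illustration of the splitting described above, the sketch below estimates how many TCP segments a payload needs. It assumes a 1500-byte MTU and 20-byte IPv4 and TCP headers without options, which leaves 1460 bytes of payload per segment; `segments_needed` is a hypothetical helper written for this example, not part of any library.

```python
import math

IP_HEADER = 20   # assumption: IPv4 header without options
TCP_HEADER = 20  # assumption: TCP header without options

def segments_needed(payload_len, mtu=1500):
    """Estimate how many TCP segments are needed for payload_len bytes."""
    mss = mtu - IP_HEADER - TCP_HEADER  # largest payload that fits in one segment
    return math.ceil(payload_len / mss)

print(segments_needed(1024))   # small enough for a single segment
print(segments_needed(10000))  # has to be split across several segments
```

Real stacks negotiate the segment size per connection, so the numbers here are only an estimate.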

(2) Stream-oriented communication and the Nagle algorithm

TCP (Transmission Control Protocol) is a connection-oriented, stream-oriented, highly reliable service.
The sender and the receiver (client and server) each hold one socket of a pair. To send multiple packets to the receiver more efficiently, the sender uses an optimization (the Nagle algorithm) that merges several small pieces of data sent at short intervals into one larger block before packaging it.
The receiver then cannot tell the pieces apart, so a sound unpacking mechanism must be provided. In other words, stream-oriented communication has no message boundaries.
Empty messages: TCP is stream-based, so an empty message puts nothing on the wire and the peer's recv keeps waiting. Both the client and the server therefore need a mechanism for handling empty input to keep the program from hanging. UDP is datagram-based: even if you enter nothing (just press Enter), the message can still be sent, because UDP wraps it in a header and sends it as a datagram.
Reliable but sticky: TCP does not lose data. If a message is not fully received in one call, the next receive continues where the last one stopped, and the sender clears data from its buffer only after receiving the ACK. The data is reliable, but packets can stick together.
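For reference, the Nagle algorithm can be switched off per socket with the standard `TCP_NODELAY` option. The sketch below only shows how the option is set; `make_nodelay_socket` is a hypothetical helper name. Note that disabling Nagle does not remove sticky packets, because the receiver can still read several messages in a single recv.

```python
import socket

def make_nodelay_socket():
    """Create a TCP socket with the Nagle algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 asks the kernel to send small writes without waiting to merge them
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock

sock = make_nodelay_socket()
# A non-zero value means the option is enabled
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
sock.close()
```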

(3) Causes of sticky packets based on TCP's characteristics

 

 

User space and kernel space in socket data transmission:

The sender may send data 1 KB at a time, while the receiving application may fetch 2 KB at a time, 3 KB or 6 KB at a time, or only a few bytes at a time.
That is, the application sees the data as one whole, a stream; how many bytes make up one message is invisible to it. This is why TCP is called a stream-oriented protocol, and it is also why sticky packets occur.
UDP is a message-oriented protocol. Every UDP datagram is one message, and the application must extract data one message at a time; it cannot extract an arbitrary number of bytes at a time. This is very different from TCP.
How is a message defined? You can treat the data the peer writes in one write/send call as one message. Keep in mind that however the lower layers fragment it, TCP puts the segments back in order before presenting the data in the kernel buffer.

For example, when a TCP socket client uploads a file to the server, the file content is sent as a byte stream. Looking at that stream, the receiver has no idea where the file begins or ends.

Sticky packets caused by the sender come from TCP itself. To improve transmission efficiency, the sender often collects enough data before sending a TCP segment. If send is called several times in a row with very little data each time, TCP usually merges that data into one segment according to its optimization algorithm and sends it in one go, so the receiver gets the data stuck together.
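The stream behaviour can be observed inside a single process. The sketch below is an illustration that uses `socket.socketpair()` instead of a real client and server; `recv_total` is a hypothetical helper. Two separate sends arrive as one byte stream, and only the concatenation is guaranteed, not how it is split into chunks.

```python
import socket

def recv_total(sock, total, bufsize=1024):
    """Collect the chunks returned by recv until `total` bytes have arrived."""
    chunks = []
    received = 0
    while received < total:
        chunk = sock.recv(bufsize)
        if not chunk:          # peer closed the connection
            break
        chunks.append(chunk)
        received += len(chunk)
    return chunks

# socketpair() returns two already-connected stream sockets
sender, receiver = socket.socketpair()
sender.sendall(b'hello')
sender.sendall(b'egg')

chunks = recv_total(receiver, len(b'hello') + len(b'egg'))
print(chunks)            # how the bytes are grouped is up to the kernel
print(b''.join(chunks))  # only the concatenated stream is guaranteed

sender.close()
receiver.close()
```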

1.2.2 UDP does not stick

UDP (User Datagram Protocol) is connectionless and message-oriented, and provides an efficient service.
It does not use the block-merging optimization. Because UDP supports one-to-many communication, the receiver's skbuff (socket buffer) uses a chained structure to record each arriving UDP packet, and each packet carries a header (source address, port, and so on), which makes the packets easy for the receiver to tell apart. In other words, message-oriented communication preserves message boundaries.
Empty messages: because UDP is datagram-based, even empty input (just pressing Enter) can be sent; UDP wraps it in a header and sends it as a datagram.
Unreliable but non-sticky: UDP's recvfrom blocks, and one recvfrom(x) corresponds to exactly one sendto(y); it completes once it has read x bytes of that datagram. If y > x, the excess data is lost. So UDP never produces sticky packets, but it can lose data and is unreliable.
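A minimal sketch of the boundary-preserving behaviour, assuming two UDP sockets talking over loopback in one process (`udp_roundtrip` is a hypothetical helper). Each sendto produces one datagram, and each recvfrom returns exactly one of them, including an empty one.

```python
import socket

def udp_roundtrip(messages, bufsize=1024):
    """Send each message as its own datagram over loopback and read them back."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(('127.0.0.1', 0))   # port 0: let the OS pick a free port
    receiver.settimeout(2)            # do not block forever if a datagram is lost
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for msg in messages:
            sender.sendto(msg, receiver.getsockname())
        # one recvfrom call returns exactly one datagram
        return [receiver.recvfrom(bufsize)[0] for _ in messages]
    finally:
        sender.close()
        receiver.close()

print(udp_roundtrip([b'hello', b'egg']))
```

Unlike the TCP case, the two messages are never merged, although in a real network either of them could be lost.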

When sending over UDP, the maximum length of data that sendto can send is 65535 - IP header (20) - UDP header (8) = 65507 bytes. If the data is longer than that, sendto fails with an error (the packet is discarded and not sent).
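The limit can be checked with a few lines. This is a sketch under the same assumptions as the text (IPv4, 20-byte IP header); `can_send` is a hypothetical helper, and the destination address is only a placeholder. Some platforms enforce a lower limit than 65507, so only the oversized case is shown.

```python
import socket

IP_HEADER = 20    # assumption: IPv4 header without options
UDP_HEADER = 8
MAX_UDP_PAYLOAD = 65535 - IP_HEADER - UDP_HEADER

def can_send(size, addr=('127.0.0.1', 9000)):
    """Return True if sendto accepts a datagram of `size` bytes."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(b'x' * size, addr)
        return True
    except OSError:       # in Python the failure surfaces as an exception
        return False
    finally:
        sock.close()

print(MAX_UDP_PAYLOAD)
print(can_send(MAX_UDP_PAYLOAD + 1))  # one byte too many is rejected
```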

When sending over TCP, there is no packet size limit (leaving buffer sizes aside), because TCP is a stream protocol: the length passed to send is not limited. In practice, the data is not necessarily sent all at once. If it is long it is sent in segments, and if it is short it may wait and go out together with the next data.

1.2.3 Two cases of sticky packets

(1) Scenario 1: the sender's buffering mechanism

The sender waits until its buffer fills before sending, which causes sticky packets (when data is sent at short intervals and in small amounts, the pieces are merged).

server:

#_*_coding:utf-8_*_
from socket import *
ip_port=('127.0.0.1',8080)

tcp_socket_server=socket(AF_INET,SOCK_STREAM)
tcp_socket_server.bind(ip_port)
tcp_socket_server.listen(5)

conn,addr=tcp_socket_server.accept()

data1=conn.recv(10)
data2=conn.recv(10)

print('----->',data1.decode('utf-8'))
print('----->',data2.decode('utf-8'))

conn.close()

client:

#_*_coding:utf-8_*_
import socket
BUFSIZE=1024
ip_port=('127.0.0.1',8080)

s=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
res=s.connect_ex(ip_port)

s.send('hello'.encode('utf-8'))
s.send('egg'.encode('utf-8'))

(2) Scenario 2: the receiver's buffering mechanism

The receiver does not take the data out of the buffer in time, so the data of one send is picked up across several receives (the client sends a piece of data, the server receives only a small part of it, and on the next receive the server first picks up what was left over in the buffer, producing a sticky packet).

server:

#_*_coding:utf-8_*_
from socket import *
ip_port=('127.0.0.1',8080)

tcp_socket_server=socket(AF_INET,SOCK_STREAM)
tcp_socket_server.bind(ip_port)
tcp_socket_server.listen(5)


conn,addr=tcp_socket_server.accept()


data1=conn.recv(2) #Not received completely in one call
data2=conn.recv(10) #The next recv returns the leftover old data first, then any new data

print('----->',data1.decode('utf-8'))
print('----->',data2.decode('utf-8'))

conn.close()

client:

#_*_coding:utf-8_*_
import socket
BUFSIZE=1024
ip_port=('127.0.0.1',8080)

s=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
res=s.connect_ex(ip_port)


s.send('hello egg'.encode('utf-8'))

1.2.4 Summary of the causes of sticky packets

Sticky packets occur only in the TCP protocol:

1. On the surface, sticky packets come from the buffering mechanisms of the sender and the receiver and from TCP's stream-oriented nature.

2. In essence, they occur because the receiver does not know where the boundaries between messages are, or how many bytes to fetch at a time.

1.3 Solutions for sticky packets

1.3.1 Solution 1

The root of the problem is that the receiver does not know how long the byte stream the sender is about to transmit will be. The fix is therefore to have the sender tell the receiver the total size of the byte stream before sending the data; the receiver then receives in a loop until it has all of it.

 

server:

#_*_coding:utf-8_*_
import socket,subprocess
ip_port=('127.0.0.1',8080)
s=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

s.bind(ip_port)
s.listen(5)

while True:
    conn,addr=s.accept()
    print('Client',addr)
    while True:
        msg=conn.recv(1024)
        if not msg:break
        res=subprocess.Popen(msg.decode('utf-8'),shell=True,\
                            stdin=subprocess.PIPE,\
                         stderr=subprocess.PIPE,\
                         stdout=subprocess.PIPE)
        err=res.stderr.read()
        if err:
            ret=err
        else:
            ret=res.stdout.read()
        data_length=len(ret)
        conn.send(str(data_length).encode('utf-8'))
        data=conn.recv(1024).decode('utf-8')
        if data == 'recv_ready':
            conn.sendall(ret)
    conn.close()

client:

#_*_coding:utf-8_*_
import socket,time
s=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
res=s.connect_ex(('127.0.0.1',8080))

while True:
    msg=input('>>: ').strip()
    if len(msg) == 0:continue
    if msg == 'quit':break

    s.send(msg.encode('utf-8'))
    length=int(s.recv(1024).decode('utf-8'))
    s.send('recv_ready'.encode('utf-8'))
    recv_size=0
    data=b''
    while recv_size < length:
        chunk=s.recv(1024)
        data+=chunk
        recv_size+=len(chunk) #Count only the newly received bytes

    print(data.decode('utf-8'))

Problem:

The program runs much faster than the network. Sending the length of the byte stream and waiting for a confirmation before sending the data itself adds an extra round trip, which amplifies the performance cost of network latency.

1.3.2 Solution 2

The problem with the previous approach lies in how the length is sent.

We can use a module that converts the length of the data to be sent into a fixed number of bytes. Before receiving each message, the client first reads this fixed-length prefix to learn the size of what follows; once the data it has received reaches that size, it has exactly one complete message.

(1) The struct module

The module can convert a type, such as a number, to bytes of a fixed length.

>>> struct.pack('i',1111111111111)

struct.error: 'i' format requires -2147483648 <= number <= 2147483647 #This is the range
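A short sketch of the fixed-length conversion. The `'!'` prefix (network byte order, standard sizes) is a choice made for this example so that both ends agree regardless of platform; the snippets in this article use the native `'i'` format.

```python
import struct

# '!' selects network (big-endian) byte order with standard sizes
length_prefix = struct.pack('!i', 1024)
print(len(length_prefix))                      # 'i' always packs to 4 bytes
print(struct.unpack('!i', length_prefix)[0])   # back to the original number

# 'q' is an 8-byte signed integer, large enough for the value that overflowed 'i'
big = struct.pack('!q', 1111111111111)
print(len(big), struct.unpack('!q', big)[0])
```

If lengths above 2147483647 are possible, a wider format such as `'q'` is one way to avoid the struct.error shown above.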

import json,struct
#Suppose the client uploads a file a.txt of about 1 TB (1073741824000 bytes)

#To avoid sticky packets, a custom header is required
header={'file_size':1073741824000,'file_name':'/a/b/c/d/e/a.txt','md5':'8f6fbf8347faa4924a76856701edb0f3'} #File size, file path and md5 value

#To transmit this header, serialize it and convert it to bytes
head_bytes=bytes(json.dumps(header),encoding='utf-8') #Serialize and convert to bytes for transmission

#To tell the receiver how long the header is, use struct to convert the header length to a fixed length: 4 bytes
head_len_bytes=struct.pack('i',len(head_bytes)) #These 4 bytes contain a single number: the length of the header

#The client starts sending (conn is the client's connected socket)
conn.send(head_len_bytes) #First send the header length, 4 bytes
conn.send(head_bytes) #Then send the header bytes
conn.sendall(file_content) #Finally send the real content as bytes (file_content is a placeholder)

#The server starts receiving (s is the server's connection socket)
head_len_bytes=s.recv(4) #Receive 4 bytes first to get the header length in byte form
x=struct.unpack('i',head_len_bytes)[0] #Extract the header length

head_bytes=s.recv(x) #Receive the header bytes according to the header length x
header=json.loads(head_bytes.decode('utf-8')) #Extract the header

#Finally, receive the real data according to the header content, for example
real_data_len=header['file_size']
s.recv(real_data_len)

Detailed usage of struct:

#_*_coding:utf-8_*_
#http://www.cnblogs.com/coser/archive/2011/12/17/2291160.html
__author__ = 'Linhaifeng'
import struct
import binascii
import ctypes

values1 = (1, 'abc'.encode('utf-8'), 2.7)
values2 = ('defg'.encode('utf-8'),101)
s1 = struct.Struct('I3sf')
s2 = struct.Struct('4sI')

print(s1.size,s2.size)
prebuffer=ctypes.create_string_buffer(s1.size+s2.size)
print('Before : ',binascii.hexlify(prebuffer))
# t=binascii.hexlify('asdfaf'.encode('utf-8'))
# print(t)

s1.pack_into(prebuffer,0,*values1)
s2.pack_into(prebuffer,s1.size,*values2)

print('After pack',binascii.hexlify(prebuffer))
print(s1.unpack_from(prebuffer,0))
print(s2.unpack_from(prebuffer,s1.size))

s3=struct.Struct('ii')
s3.pack_into(prebuffer,0,123,123)
print('After pack',binascii.hexlify(prebuffer))
print(s3.unpack_from(prebuffer,0))

(2) Solving sticky packets with struct

With the struct module, a length can be converted into a standard 4 bytes. We can use this to send the data length in advance.

 

server:

import socket,struct,json
import subprocess
phone=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
phone.setsockopt(socket.SOL_SOCKET,socket.SO_REUSEADDR,1) #Must be set before bind

phone.bind(('127.0.0.1',8080))

phone.listen(5)

while True:
    conn,addr=phone.accept()
    while True:
        cmd=conn.recv(1024)
        if not cmd:break
        print('cmd: %s' %cmd)

        res=subprocess.Popen(cmd.decode('utf-8'),
                             shell=True,
                             stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE)
        err=res.stderr.read()
        print(err)
        if err:
            back_msg=err
        else:
            back_msg=res.stdout.read()

        conn.send(struct.pack('i',len(back_msg))) #First send the length of back_msg
        conn.sendall(back_msg) #Then send the real content

    conn.close()

client:

#_*_coding:utf-8_*_
import socket,time,struct

s=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
res=s.connect_ex(('127.0.0.1',8080))

while True:
    msg=input('>>: ').strip()
    if len(msg) == 0:continue
    if msg == 'quit':break

    s.send(msg.encode('utf-8'))

    l=s.recv(4)
    x=struct.unpack('i',l)[0]
    print(type(x),x)
    # print(struct.unpack('I',l))
    r_s=0
    data=b''
    while r_s < x:
        r_d=s.recv(1024)
        data+=r_d
        r_s+=len(r_d)

    # print(data.decode('utf-8'))
    print(data.decode('gbk')) #Command output on Windows is GBK-encoded by default
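One detail the client above glosses over is that recv(4) may itself return fewer than 4 bytes. A possible hardening, sketched here with hypothetical helpers `recv_exact`, `send_msg` and `recv_msg`, is to loop until exactly the requested number of bytes has arrived. The `'!i'` format is an assumption of this sketch; the code above uses native `'i'`.

```python
import socket
import struct

def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes first."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError('connection closed before %d bytes arrived' % n)
        buf.extend(chunk)
    return bytes(buf)

def send_msg(sock, payload):
    """Send a 4-byte length prefix followed by the payload."""
    sock.sendall(struct.pack('!i', len(payload)) + payload)

def recv_msg(sock):
    """Read one length-prefixed message."""
    (length,) = struct.unpack('!i', recv_exact(sock, 4))
    return recv_exact(sock, length)

# Demonstration on a connected pair of sockets in one process
a, b = socket.socketpair()
send_msg(a, b'hello')
send_msg(a, b'egg')
print(recv_msg(b), recv_msg(b))  # the two messages come back separately
a.close()
b.close()
```

Because each message carries its own length, the receiver recovers the boundaries even when both messages sit in the buffer together.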

We can also make the header a dictionary that holds the details of the real data to be sent, serialize it with json, and then use struct to pack the length of the serialized header into four bytes (four bytes are enough for our purposes).
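The header layout just described can be sketched without any sockets. `pack_header` and `unpack_header` are hypothetical helper names, and the dictionary keys follow the `put` example used in this article.

```python
import json
import struct

def pack_header(header):
    """Return a 4-byte length prefix followed by the JSON-encoded header."""
    head_bytes = json.dumps(header).encode('utf-8')
    return struct.pack('i', len(head_bytes)) + head_bytes

def unpack_header(data):
    """Parse a buffer that starts with a packed header; return (header, rest)."""
    head_len = struct.unpack('i', data[:4])[0]
    header = json.loads(data[4:4 + head_len].decode('utf-8'))
    return header, data[4 + head_len:]

# A frame is: header length, header, then the real data
frame = pack_header({'cmd': 'put', 'filename': 'a.txt', 'filesize': 5}) + b'hello'
header, body = unpack_header(frame)
print(header, body)
```

Only the header length has to fit in four bytes; the file size travels inside the JSON, so it is not limited by the range of 'i'.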

 

server:

import socket
import struct
import json
import subprocess
import os

class MYTCPServer:
    address_family = socket.AF_INET

    socket_type = socket.SOCK_STREAM

    allow_reuse_address = False

    max_packet_size = 8192

    coding='utf-8'

    request_queue_size = 5

    server_dir='file_upload'

    def __init__(self, server_address, bind_and_activate=True):
        """Constructor.  May be extended, do not override."""
        self.server_address=server_address
        self.socket = socket.socket(self.address_family,
                                    self.socket_type)
        if bind_and_activate:
            try:
                self.server_bind()
                self.server_activate()
            except:
                self.server_close()
                raise

    def server_bind(self):
        """Called by constructor to bind the socket.
        """
        if self.allow_reuse_address:
            self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.socket.bind(self.server_address)
        self.server_address = self.socket.getsockname()

    def server_activate(self):
        """Called by constructor to activate the server.
        """
        self.socket.listen(self.request_queue_size)

    def server_close(self):
        """Called to clean-up the server.
        """
        self.socket.close()

    def get_request(self):
        """Get the request and client address from the socket.
        """
        return self.socket.accept()

    def close_request(self, request):
        """Called to clean up an individual request."""
        request.close()

    def run(self):
        while True:
            self.conn,self.client_addr=self.get_request()
            print('from client ',self.client_addr)
            while True:
                try:
                    head_struct = self.conn.recv(4)
                    if not head_struct:break

                    head_len = struct.unpack('i', head_struct)[0]
                    head_json = self.conn.recv(head_len).decode(self.coding)
                    head_dic = json.loads(head_json)

                    print(head_dic)
                    #head_dic={'cmd':'put','filename':'a.txt','filesize':123123}
                    cmd=head_dic['cmd']
                    if hasattr(self,cmd):
                        func=getattr(self,cmd)
                        func(head_dic)
                except Exception:
                    break

    def put(self,args):
        file_path=os.path.normpath(os.path.join(
            self.server_dir,
            args['filename']
        ))

        filesize=args['filesize']
        recv_size=0
        print('----->',file_path)
        with open(file_path,'wb') as f:
            while recv_size < filesize:
                recv_data=self.conn.recv(self.max_packet_size)
                f.write(recv_data)
                recv_size+=len(recv_data)
                print('recvsize:%s filesize:%s' %(recv_size,filesize))

tcpserver1=MYTCPServer(('127.0.0.1',8080))

tcpserver1.run()

#The following code is not relevant to this topic
class MYUDPServer:

    """UDP server class."""
    address_family = socket.AF_INET

    socket_type = socket.SOCK_DGRAM

    allow_reuse_address = False

    max_packet_size = 8192

    coding='utf-8'

    def get_request(self):
        data, client_addr = self.socket.recvfrom(self.max_packet_size)
        return (data, self.socket), client_addr

    def server_activate(self):
        # No need to call listen() for UDP.
        pass

    def shutdown_request(self, request):
        # No need to shutdown anything.
        self.close_request(request)

    def close_request(self, request):
        # No need to close anything.
        pass

client:

import socket
import struct
import json
import os

class MYTCPClient:
    address_family = socket.AF_INET

    socket_type = socket.SOCK_STREAM

    allow_reuse_address = False

    max_packet_size = 8192

    coding='utf-8'

    request_queue_size = 5

    def __init__(self, server_address, connect=True):
        self.server_address=server_address
        self.socket = socket.socket(self.address_family,
                                    self.socket_type)
        if connect:
            try:
                self.client_connect()
            except:
                self.client_close()
                raise

    def client_connect(self):
        self.socket.connect(self.server_address)

    def client_close(self):
        self.socket.close()

    def run(self):
        while True:
            inp=input(">>: ").strip()
            if not inp:continue
            l=inp.split()
            cmd=l[0]
            if hasattr(self,cmd):
                func=getattr(self,cmd)
                func(l)

    def put(self,args):
        cmd=args[0]
        filename=args[1]
        if not os.path.isfile(filename):
            print('file:%s does not exist' %filename)
            return
        else:
            filesize=os.path.getsize(filename)

        head_dic={'cmd':cmd,'filename':os.path.basename(filename),'filesize':filesize}
        print(head_dic)
        head_json=json.dumps(head_dic)
        head_json_bytes=bytes(head_json,encoding=self.coding)

        head_struct=struct.pack('i',len(head_json_bytes))
        self.socket.sendall(head_struct)
        self.socket.sendall(head_json_bytes)
        send_size=0
        with open(filename,'rb') as f:
            for line in f:
                self.socket.sendall(line) #sendall keeps sending until the whole line is out
                send_size+=len(line)
                print(send_size)
            else:
                print('upload successful')


client=MYTCPClient(('127.0.0.1',8080))

client.run()

2. Introduction to other socket methods

Server-side socket functions
 s.bind() binds an address (host, port) to the socket
 s.listen() starts TCP listening
 s.accept() passively accepts a connection from a TCP client, blocking until one arrives

Client socket functions
 s.connect() actively initiates a connection to a TCP server
 s.connect_ex() an extended version of connect() that returns an error code when an error occurs instead of raising an exception

General-purpose socket functions
 s.recv() receives TCP data
 s.send() sends TCP data
 s.sendall() sends all of the given TCP data
 s.recvfrom() receives UDP data
 s.sendto() sends UDP data
 s.getpeername() returns the address of the remote end connected to the current socket
 s.getsockname() returns the address of the current socket
 s.getsockopt() returns the value of the specified socket option
 s.setsockopt() sets the value of the specified socket option
 s.close() closes the socket

Blocking-oriented socket methods
 s.setblocking() sets the socket to blocking or non-blocking mode
 s.settimeout() sets the timeout for blocking socket operations
 s.gettimeout() returns the timeout for blocking socket operations

File-oriented socket functions
 s.fileno() returns the socket's file descriptor
 s.makefile() creates a file object associated with the socket
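A few of the methods listed above can be tried in a single process. This is a minimal sketch that assumes a listener on a loopback port chosen by the OS; it exercises connect_ex(), settimeout(), gettimeout(), getsockname() and getpeername().

```python
import socket

# A listening socket on a free loopback port (port 0 lets the OS choose)
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))
server.listen(5)
addr = server.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.settimeout(2)               # blocking calls give up after 2 seconds
err = client.connect_ex(addr)      # 0 on success, an error code otherwise
conn, peer = server.accept()

timeout = client.gettimeout()
client_sees_server = client.getpeername() == addr
server_sees_client = peer == client.getsockname()
print(err, timeout, client_sees_server, server_sees_client)

for s in (conn, client, server):
    s.close()
```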

send and sendall methods:

The official documentation explains socket.send() and socket.sendall() in the socket module as follows:

socket.send(string[, flags])
Send data to the socket. The socket must be connected to a remote socket. The optional flags argument has the same meaning as for recv() above. Returns the number of bytes sent. Applications are responsible for checking that all data has been sent; if only some of the data was transmitted, the application needs to attempt delivery of the remaining data.

The return value of send() is the number of bytes sent, which may be smaller than the number of bytes in string; that is, send() may not send all of the data in string. An exception is raised if an error occurs.

–

socket.sendall(string[, flags])
Send data to the socket. The socket must be connected to a remote socket. The optional flags argument has the same meaning as for recv() above. Unlike send(), this method continues to send data from string until either all data has been sent or an error occurs. None is returned on success. On error, an exception is raised, and there is no way to determine how much data, if any, was successfully sent.

sendall() tries to send all of the data in string; it returns None on success and raises an exception on failure.

Therefore, the following two pieces of code are equivalent:

sock.sendall(b'Hello world\n')

buffer = b'Hello world\n'
while buffer:
    sent = sock.send(buffer)
    buffer = buffer[sent:]
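The difference matters most for large payloads, where a single send() may accept only part of the data. The sketch below, with hypothetical helpers `send_all` and `drain`, re-implements the loop by hand and checks it on a socket pair, using a helper thread to read so the sender does not stall on a full buffer.

```python
import socket
import threading

def send_all(sock, data):
    """Keep calling send() until every byte has been handed to the kernel."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        sent = sock.send(view[total:])  # may accept fewer bytes than offered
        total += sent
    return total

def drain(sock, n, out):
    """Read n bytes from sock into the list `out` (runs in a helper thread)."""
    received = 0
    while received < n:
        chunk = sock.recv(65536)
        if not chunk:
            break
        out.append(chunk)
        received += len(chunk)

a, b = socket.socketpair()
payload = b'x' * 1000000               # large enough that one send() may be partial
chunks = []
reader = threading.Thread(target=drain, args=(b, len(payload), chunks))
reader.start()
sent_total = send_all(a, payload)
reader.join()
print(sent_total, b''.join(chunks) == payload)
a.close()
b.close()
```

In ordinary code sock.sendall() does the same job; the manual loop is shown only to make the return value of send() visible.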
