Introduction: TCP is a network communication protocol. For programmers, working with a protocol means writing programs that follow the agreed rules. In terms of coding, nothing here needs a new API compared with the previous socket communication demo. The program only adds an agreed header to each message it sends (the TCP header itself is filled in by the operating system).
Network programming: the four-layer TCP/IP model
First, a diagram that everyone has seen
OSI seven-layer protocol model (Open Systems Interconnection)
Application layer - provides network services to applications
Presentation layer - data format conversion and data encryption
Session layer - establishes, maintains, and manages sessions
Transport layer - establishes, maintains, and manages end-to-end connections and controls how data is transmitted
Network layer - IP addressing, routing, and path selection
Data link layer - divides data into frames, adds MAC addresses to them, and delivers them over the physical link
Physical layer - transmits the raw stream of 0 and 1 bits over the medium
When sending, data travels from the top layer to the bottom, and each lower layer provides services to the layer above it
TCP/IP four-layer protocol model
Application layer - handles the details of specific applications, such as FTP, HTTP, SMTP, and SSH
Transport layer - provides end-to-end communication for applications on two hosts, such as TCP and UDP
Network layer (Internet layer) - handles the movement of packets through the network, such as packet routing
Link layer (data link layer / network interface layer) - includes the device driver in the operating system, the corresponding network interface card in the computer, and the transmission of the 0/1 bit stream
Protocol encapsulation
A lower-layer protocol provides services to the upper-layer protocol through encapsulation. Before application data reaches the physical network, it is passed down the protocol stack from top to bottom. Each layer adds its own header (and sometimes a trailer) to the data it receives from the layer above in order to implement that layer's functions.
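The encapsulation described above can be sketched in C as repeatedly prepending a header to the data from the layer above. This is only an illustration: `wrap_layer` and `encapsulate` are hypothetical names, the header contents are zeroed placeholders, and only the header sizes (20-byte TCP and IP headers without options, 14-byte Ethernet header) are realistic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `hdr` followed by `payload` into `out`.
 * Returns the total length, or 0 if `out` is too small. */
size_t wrap_layer(const void *hdr, size_t hdr_len,
                  const void *payload, size_t payload_len,
                  unsigned char *out, size_t out_cap)
{
    assert(out != NULL);
    if (hdr_len + payload_len > out_cap)
        return 0;
    memcpy(out, hdr, hdr_len);
    memcpy(out + hdr_len, payload, payload_len);
    return hdr_len + payload_len;
}

/* Wrap application data in placeholder TCP, IP, and Ethernet headers,
 * in the same top-to-bottom order as the protocol stack.
 * Returns the frame length, or 0 if any buffer is too small. */
size_t encapsulate(const char *app_data, size_t app_len,
                   unsigned char *frame, size_t frame_cap)
{
    unsigned char tcp_hdr[20] = {0}, ip_hdr[20] = {0}, eth_hdr[14] = {0};
    unsigned char seg[2048], pkt[2048];

    /* Transport layer: TCP header + application data = segment */
    size_t seg_len = wrap_layer(tcp_hdr, sizeof tcp_hdr,
                                app_data, app_len, seg, sizeof seg);
    if (seg_len == 0)
        return 0;

    /* Network layer: IP header + segment = packet */
    size_t pkt_len = wrap_layer(ip_hdr, sizeof ip_hdr,
                                seg, seg_len, pkt, sizeof pkt);
    if (pkt_len == 0)
        return 0;

    /* Link layer: Ethernet header + packet = frame (trailer omitted) */
    return wrap_layer(eth_hdr, sizeof eth_hdr, pkt, pkt_len, frame, frame_cap);
}
```

Calling `encapsulate("hello", 5, frame, sizeof frame)` leaves the five application bytes at the end of the frame, behind the three headers. In a real system the kernel and the network card do this work, not the application.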
TCP protocol header
Source port number and destination port number: together with the source and destination IP addresses in the IP header, they uniquely identify a TCP connection
Sequence number: the sequence number of the first data byte in this segment
Acknowledgment number: valid only when the ACK flag is 1. It indicates the sequence number of the next byte the receiver expects (this will be analyzed in detail below)
Data offset: the length of the TCP header. It is 4 bits wide and, like the header length field of the IP header, counts in units of 4 bytes, so the maximum header length is 15 * 4 = 60 bytes
Reserved bits: 6 bits, must be 0
6 flag bits:
URG - the urgent pointer is valid
ACK - the acknowledgment number is valid
PSH - the receiver should deliver this segment to the application layer as soon as possible
RST - reset the connection
SYN - synchronize sequence numbers; used to initiate a connection
FIN - terminate a connection
Window field: 16 bits, giving the number of bytes the receiver is willing to accept, so the maximum standard TCP window is 2^16 - 1 = 65535 bytes
Checksum: the sender computes a value from the segment contents, and the receiver must compute the same value for the data to be considered intact. The checksum covers the whole TCP segment, header and data. It is a mandatory field: the sender must compute and store it, and the receiver must verify it.
Urgent pointer: a positive offset that is added to the value in the sequence number field to give the sequence number of the last byte of urgent data. It is valid only when the URG flag is set. TCP's urgent mode is a way for the sender to deliver urgent data to the other end
Options and padding (the total must be an integer multiple of 4 bytes, padded with 0 if it falls short):
The most common option is the maximum segment size, MSS. Each end of the connection usually announces it in the first segment it sends (the SYN segment). It states the largest segment the local end is able to receive.
If this option is not set, the MSS defaults to 536 bytes (20 + 20 + 536 = 576-byte IP datagram)
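To make the field layout concrete, here is a minimal C sketch that decodes the fixed 20-byte part of a TCP header from raw bytes. The names `tcp_fields`, `parse_tcp_header`, and the `FLAG_*` constants are hypothetical and chosen for this illustration. The sketch assumes the classic layout described above (4-bit data offset, 6 reserved bits, 6 flag bits) and does not decode options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct holding the decoded fixed part of a TCP header. */
struct tcp_fields {
    uint16_t src_port, dst_port;
    uint32_t seq, ack;
    unsigned header_len;              /* in bytes: data offset * 4 */
    unsigned flags;                   /* low 6 bits, see FLAG_* below */
    uint16_t window, checksum, urgent;
};

/* Bit positions of the six flags inside byte 13 of the header. */
enum { FLAG_FIN = 0x01, FLAG_SYN = 0x02, FLAG_RST = 0x04,
       FLAG_PSH = 0x08, FLAG_ACK = 0x10, FLAG_URG = 0x20 };

/* All multi-byte header fields are in network byte order (big-endian). */
static uint16_t be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is shorter than the header
 * or the data offset is outside the valid 20..60 byte range. */
int parse_tcp_header(const unsigned char *b, size_t len, struct tcp_fields *out)
{
    assert(b != NULL && out != NULL);
    if (len < 20)
        return -1;
    out->src_port   = be16(b);
    out->dst_port   = be16(b + 2);
    out->seq        = be32(b + 4);
    out->ack        = be32(b + 8);
    out->header_len = (unsigned)(b[12] >> 4) * 4;   /* upper 4 bits of byte 12 */
    out->flags      = b[13] & 0x3F;
    out->window     = be16(b + 14);
    out->checksum   = be16(b + 16);
    out->urgent     = be16(b + 18);
    if (out->header_len < 20 || out->header_len > len)
        return -1;
    return 0;
}
```

For a header without options the data offset is 5, so `header_len` comes out as 20. Any value above 20 means that `header_len - 20` bytes of options (such as MSS) follow the fixed part.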
Three-way handshake
(1) The boy likes the girl, so he writes her a letter: I like you, please go out with me! After sending the letter, the boy waits anxiously, because he does not know whether it will reach her.
(2) The girl is elated when she receives the boy's love letter. It turns out the feeling is mutual! So she writes back: I received your love letter and understand how you feel. In fact, I like you too, and I am willing to go out with you!
After sending the reply, the girl also waits anxiously, because she does not know whether it will reach the boy.
(3) The boy is very happy when he receives the reply: it tells him that the girl received his love letter, that she likes him, and that she is willing to go out with him. So he writes one more letter: I have received your reply and your feelings. Thank you, and I love you!
When the girl receives this letter, she is happy too, because she now knows her reply reached the boy. Both of them now know each other's feelings, and they go on to correspond happily.
The so-called three-way handshake is how a TCP connection is established. The connection must be opened actively by one side (the client) and passively by the other (the server).
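In protocol terms, the three letters are a SYN, a SYN+ACK, and an ACK. The C sketch below only models how the sequence and acknowledgment numbers relate across the three segments. It does not send anything on the network, and `struct seg` and the `make_*` functions are hypothetical names for this illustration. The initial sequence numbers are passed in as parameters, whereas a real TCP stack chooses them itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the fields that matter during the handshake. */
struct seg {
    int syn;        /* SYN flag */
    int ack_flag;   /* ACK flag */
    uint32_t seq;   /* sequence number */
    uint32_t ack;   /* acknowledgment number (meaningful only if ack_flag) */
};

/* Step 1: client -> server. SYN carrying the client's initial sequence number. */
struct seg make_syn(uint32_t client_isn)
{
    return (struct seg){1, 0, client_isn, 0};
}

/* Step 2: server -> client. SYN+ACK: the server announces its own initial
 * sequence number and acknowledges the client's SYN. A SYN consumes one
 * sequence number, so the acknowledgment is the client's seq + 1. */
struct seg make_syn_ack(struct seg syn, uint32_t server_isn)
{
    assert(syn.syn && !syn.ack_flag);
    return (struct seg){1, 1, server_isn, syn.seq + 1};
}

/* Step 3: client -> server. ACK of the server's SYN; the client's own
 * sequence number has advanced by one because of its SYN. */
struct seg make_ack(struct seg syn_ack)
{
    assert(syn_ack.syn && syn_ack.ack_flag);
    return (struct seg){0, 1, syn_ack.ack, syn_ack.seq + 1};
}
```

With a client initial sequence number of 100 and a server initial sequence number of 300, the three segments are SYN (seq=100), SYN+ACK (seq=300, ack=101), and ACK (seq=101, ack=301).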
Four-way wave (closing the connection)
1. The client sends a request to close the TCP connection. The segment carries the sequence number seq, which is the client's current sequence number, and has the FIN flag set to 1 to indicate that the client wants to close the connection. (FIN=1, seq=x, where x is the client's current sequence number)
2. The server replies to the client's close request with an acknowledgment. The acknowledgment number is the client's seq plus 1, so when the client receives it, it knows that its close request has arrived. This reply does not carry FIN. (ACK=1, ack=x+1, seq=y, where y is the server's current sequence number)
3. After acknowledging the client's close request, the server does not close the connection immediately. It first makes sure that all the data it still has to send to the client has been transmitted. Once that is done, it sends a segment with the FIN flag set to 1. (FIN=1, ACK=1, ack=x+1, seq=z, where z is the server's sequence number at that point)
4. After receiving the server's close request, the client replies with an acknowledgment whose acknowledgment number is the server's seq plus 1, which completes the exchange. (ACK=1, ack=z+1, seq=x+1)
At this point the four-way wave that closes the TCP connection is complete.
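The reason there are four steps instead of three is that each direction is closed separately: after the client's FIN, the server may still send data. The C sketch below shows this half-close behavior through the socket API using `shutdown(fd, SHUT_WR)`. As an assumption to keep it self-contained, it uses a local `socketpair` of stream sockets in place of a real TCP connection, and `half_close_demo` is a hypothetical name. On a TCP socket the same call is what causes a FIN to be sent.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Walks through a half-close. Intermediate steps are checked with assert.
 * Returns the number of bytes the client still received after it had
 * already closed its own sending direction. */
ssize_t half_close_demo(void)
{
    int sv[2];
    char buf[16];

    /* Assumption: a local stream socket pair stands in for a TCP connection. */
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);
    int client = sv[0], server = sv[1];

    /* Steps 1-2: the client has nothing more to send and closes its direction. */
    rc = shutdown(client, SHUT_WR);
    assert(rc == 0);

    /* The server sees end-of-file: read returns 0, the API-level sign
     * that the peer has finished sending. */
    ssize_t n = read(server, buf, sizeof buf);
    assert(n == 0);

    /* The server can still finish sending its remaining data. */
    n = write(server, "bye", 3);
    assert(n == 3);
    ssize_t got = read(client, buf, sizeof buf);

    /* Steps 3-4: the server closes its direction too; the client sees EOF. */
    rc = shutdown(server, SHUT_WR);
    assert(rc == 0);
    n = read(client, buf, sizeof buf);
    assert(n == 0);

    close(client);
    close(server);
    return got;
}
```

The call returns 3: the client received "bye" even though it had already closed its own sending side, which is exactly the window between steps 2 and 3.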
Packet splitting and packet sticking
TCP packet splitting
Scenario: the sender sends the string "hello world" in one call, but the receiver gets it as two pieces, "hello" and "world". When the sender sends a large amount of data, it arrives at the receiver in batches, so one send results in several reads.
TCP sends data in segments. Once a TCP connection is established, there is a maximum segment size (MSS). If an application-layer packet is larger than the MSS, it is split and sent in several segments.
The application layer at the receiving end must then splice the pieces back together before it can process the data correctly.
Related to this is the MTU (maximum transmission unit) of the link, which is generally 1500 bytes on Ethernet. After subtracting the 20-byte IP header and the 20-byte TCP header,
1500 - 20 - 20 = 1460 bytes are left for TCP data. The MSS of TCP is therefore usually 1460 bytes.
When the application-layer data exceeds 1460 bytes, TCP sends it in multiple segments.
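The arithmetic above can be written down in a few lines of C. The function names `mss_for_mtu` and `segments_needed` are hypothetical, and the sketch assumes IP and TCP headers without options (20 bytes each).

```c
#include <assert.h>
#include <stddef.h>

enum { IP_HEADER_LEN = 20, TCP_HEADER_LEN = 20 };   /* without options */

/* Largest TCP payload that fits into one IP datagram of `mtu` bytes. */
size_t mss_for_mtu(size_t mtu)
{
    assert(mtu > IP_HEADER_LEN + TCP_HEADER_LEN);
    return mtu - IP_HEADER_LEN - TCP_HEADER_LEN;
}

/* Minimum number of segments needed to carry `data_len` bytes
 * when each segment holds at most `mss` bytes (rounds up). */
size_t segments_needed(size_t data_len, size_t mss)
{
    assert(mss > 0);
    return (data_len + mss - 1) / mss;
}
```

For an MTU of 1500, `mss_for_mtu` gives 1460, so a 1460-byte message fits in one segment while a 1461-byte message needs two. How many segments TCP actually uses can be larger, since the stack is free to send smaller ones.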
TCP sticky packets
Scenario: the sender sends the strings "hello" and "world" in two separate calls, but the receiver reads them as a single string, "helloworld".
The sender sends data several times and the receiver reads all of it at once, so several sends result in one read. This is usually a network traffic optimization: several small pieces of data are gathered into a larger one to reduce the number of transmissions on the link.
Causes of TCP packet sticking:
To improve network utilization, TCP uses the Nagle algorithm. With this algorithm, the sender may hold back a small piece of data for a short time instead of sending it immediately. If the application layer hands data to TCP in quick succession, TCP "sticks" the two application-layer packets together and sends the receiver a single TCP segment.
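For reference, Nagle's algorithm can be switched off per socket with the standard `TCP_NODELAY` option. The wrapper names `set_nodelay` and `get_nodelay` in this C sketch are hypothetical; the `setsockopt` and `getsockopt` calls are the real socket API.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Turn Nagle's algorithm off (on = 1) or back on (on = 0) for a TCP socket.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int set_nodelay(int fd, int on)
{
    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof on);
}

/* Returns 1 if TCP_NODELAY is set, 0 if it is not, -1 on error. */
int get_nodelay(int fd)
{
    int val = 0;
    socklen_t len = sizeof val;
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

Setting this option only makes coalescing on the sending side less likely. TCP remains a byte stream, and the receiver can still get several messages in one read or one message across several reads, so the application still needs its own message boundaries.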
Solutions to packet splitting and sticking
Before sending data, prepend a fixed-size header that carries a package identifier and the data length:
Package identification | Data length         | Data content
4 bytes (FBEB)         | fixed size, value N | N bytes
- Package identification: a special marker in the package header, used to identify the beginning of a package
- Data length: the size of the data content, stored in a fixed-length field of 2, 4, or 8 bytes.
- Data content: the payload. Its length is the length given in the header.
The actual operation is as follows:
a) Sender: first send the package identification and the length, then send the data content.
b) Receiver: first parse the size N of the package, then read N bytes, which make up one complete piece of data content.
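A minimal C sketch of this framing scheme, working on in-memory buffers so that it is independent of any socket. The names `pack_frame` and `unpack_frame` are hypothetical. As assumptions of this sketch, the length field is 4 bytes and is written in big-endian order, and the identifier is the 4-character string FBEB from the layout above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_TAG     "FBEB"                 /* package identification */
#define FRAME_TAG_LEN 4
#define FRAME_HDR_LEN (FRAME_TAG_LEN + 4)    /* tag + 4-byte length (assumed) */

/* Sender side: write tag + big-endian length + data into `out`.
 * Returns the frame size, or 0 if `out` is too small. */
size_t pack_frame(const void *data, uint32_t n, unsigned char *out, size_t cap)
{
    assert(out != NULL);
    if (cap < FRAME_HDR_LEN || n > cap - FRAME_HDR_LEN)
        return 0;
    memcpy(out, FRAME_TAG, FRAME_TAG_LEN);
    out[4] = (unsigned char)(n >> 24);
    out[5] = (unsigned char)(n >> 16);
    out[6] = (unsigned char)(n >> 8);
    out[7] = (unsigned char)n;
    memcpy(out + FRAME_HDR_LEN, data, n);
    return FRAME_HDR_LEN + (size_t)n;
}

/* Receiver side: try to take one complete frame from the start of the
 * bytes received so far.
 * Returns > 0: bytes consumed, with *payload and *n describing the data;
 *           0: the frame is not complete yet, keep reading;
 *          -1: the bytes do not start with the expected tag. */
long unpack_frame(const unsigned char *buf, size_t len,
                  const unsigned char **payload, uint32_t *n)
{
    if (len < FRAME_HDR_LEN)
        return 0;                                   /* header split across reads */
    if (memcmp(buf, FRAME_TAG, FRAME_TAG_LEN) != 0)
        return -1;
    uint32_t body = ((uint32_t)buf[4] << 24) | ((uint32_t)buf[5] << 16) |
                    ((uint32_t)buf[6] << 8) | (uint32_t)buf[7];
    if (len - FRAME_HDR_LEN < body)
        return 0;                                   /* data split across reads */
    *payload = buf + FRAME_HDR_LEN;
    *n = body;
    return (long)(FRAME_HDR_LEN + body);
}
```

The receiver appends whatever each read returns to a buffer and calls `unpack_frame` in a loop. A return of 0 covers the splitting case (wait for more bytes), and a positive return smaller than the buffered length covers the sticking case (consume one frame, then parse the next one from the remaining bytes).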
The specific process is as follows:
TCP segmented sending test and its code
This program builds on the previous echo server: a header is added to each request, and the data is sent in segments.
Expected behavior
After the server starts, it waits for a client to connect
The client takes the data to send from the command line
The server recognizes a packet by the identifier in the data header. The identifier is defined as fbeb (spelled "febe" in the TRG constant of the code below), and the data length field is defined as 4 bytes
Message format: header identifier + length of the data to send + data
The client sends the data in two parts: the first five bytes, and then everything after the fifth byte
The test results are as follows:
Client code
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    #define SERVER_PORT 8888
    #define SERVER_IP "127.0.0.1"
    #define DATE_LEN_BYTES 4

    const char *TRG = "febe";

    int main(int argc, char *argv[]) {
        int sockfd;
        char *message;
        struct sockaddr_in servaddr;
        int n;
        char *buf = NULL;

        if (argc != 2) {
            fputs("Usage: ./echo_client message \n", stderr);
            exit(1);
        }
        message = argv[1];
        printf("message: %s\n", message);

        sockfd = socket(AF_INET, SOCK_STREAM, 0);
        if (sockfd < 0) {
            perror("socket");
            exit(1);
        }
        memset(&servaddr, '\0', sizeof(struct sockaddr_in));
        servaddr.sin_family = AF_INET;
        inet_pton(AF_INET, SERVER_IP, &servaddr.sin_addr);
        servaddr.sin_port = htons(SERVER_PORT);
        if (connect(sockfd, (struct sockaddr *)&servaddr, sizeof(servaddr)) < 0) {
            perror("connect");
            close(sockfd);
            exit(1);
        }

        // write(sockfd, message, strlen(message));
        // Build the packet: TRG, then a DATE_LEN_BYTES length field, then the message
        int ms_len = strlen(message);
        int tag_len = strlen(TRG);
        // +1 leaves room for the '\0' appended to the reply below
        buf = (char *)malloc(tag_len + DATE_LEN_BYTES + ms_len + 1);
        if (buf == NULL) {
            perror("malloc");
            close(sockfd);
            exit(1);
        }
        memcpy(buf, TRG, tag_len);
        // The length goes out in host byte order, which is what the server reads
        memcpy(buf + tag_len, &ms_len, DATE_LEN_BYTES);
        memcpy(buf + tag_len + DATE_LEN_BYTES, message, ms_len);

        // Send the header first, then the data in two parts with a pause in between
        int first = ms_len < 5 ? ms_len : 5;   // messages shorter than 5 bytes go out whole
        write(sockfd, buf, tag_len + DATE_LEN_BYTES);
        write(sockfd, buf + tag_len + DATE_LEN_BYTES, first);
        sleep(2);
        write(sockfd, buf + tag_len + DATE_LEN_BYTES + first, ms_len - first);

        n = read(sockfd, buf, tag_len + DATE_LEN_BYTES + ms_len);
        if (n > 0) {
            buf[n] = '\0';
            printf("receive: %s\n", buf);
        } else {
            perror("error!!!");
        }
        printf("finished.\n");
        free(buf);
        close(sockfd);
        return 0;
    }
Server side code
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <string.h>
    #include <ctype.h>
    #include <arpa/inet.h>
    #include <stdlib.h>

    #define SERVER_PORT 8888
    #define DATE_LEN_BYTES 4

    const char *TRG = "febe";

    // Read one packet into buf (capacity len).
    // Returns the data length, or -1 on a bad header, oversized data, error, or early EOF.
    int read_len(int client_sock, char *buf, unsigned int len) {
        int tag_len = strlen(TRG);
        int hdr_len = tag_len + DATE_LEN_BYTES;
        int count = 0;
        int readlen;

        // Read the packet header; it may arrive in pieces too
        while (count < hdr_len) {
            readlen = read(client_sock, buf + count, hdr_len - count);
            if (readlen <= 0) return -1;
            count += readlen;
        }
        if (strncmp(buf, TRG, tag_len) != 0) return -1;   // not a legitimate header

        int date_len;
        memcpy(&date_len, buf + tag_len, DATE_LEN_BYTES); // host byte order, as sent by the client
        if (date_len < 0 || (unsigned int)date_len >= len) return -1; // keep room for '\0'

        // Keep reading until the whole data content has arrived
        count = 0;
        while (count < date_len) {
            readlen = read(client_sock, buf + count, date_len - count);
            if (readlen <= 0) return -1;
            printf("readlen:%d\n", readlen);
            count += readlen;
        }
        return date_len;
    }

    int main(void) {
        int sock;   // Represents the mailbox
        struct sockaddr_in server_addr;

        // 1. Create the mailbox
        sock = socket(AF_INET, SOCK_STREAM, 0);
        if (sock < 0) {
            perror("socket");
            return 1;
        }

        // 2. Clear the label, then write the address and port number on it
        memset(&server_addr, 0, sizeof(server_addr));
        server_addr.sin_family = AF_INET;                 // Select protocol family IPv4
        server_addr.sin_addr.s_addr = htonl(INADDR_ANY);  // Listen on all local IP addresses
        server_addr.sin_port = htons(SERVER_PORT);        // Port number to bind

        // Stick the label on the receiving mailbox
        if (bind(sock, (struct sockaddr *)&server_addr, sizeof(server_addr)) < 0) {
            perror("bind");
            close(sock);
            return 1;
        }

        // Hang the mailbox in the reception room so that letters can be received
        listen(sock, 128);

        // Everything is ready, just waiting for a letter
        printf("Waiting for client connection\n");

        int done = 1;
        while (done) {
            struct sockaddr_in client;
            int client_sock, len, i;
            char client_ip[64];
            char buf[256];
            socklen_t client_addr_len;

            client_addr_len = sizeof(client);
            client_sock = accept(sock, (struct sockaddr *)&client, &client_addr_len);
            if (client_sock < 0) {
                perror("accept");
                continue;
            }

            // Print the client's IP address and port number
            printf("client ip: %s\t port : %d\n",
                   inet_ntop(AF_INET, &client.sin_addr.s_addr, client_ip, sizeof(client_ip)),
                   ntohs(client.sin_port));

            /* Read the data sent by the client */
            // len = read(client_sock, buf, sizeof(buf)-1);
            len = read_len(client_sock, buf, sizeof(buf));
            if (len < 0) {
                fprintf(stderr, "bad or incomplete packet\n");
                close(client_sock);
                continue;
            }
            buf[len] = '\0';
            printf("receive[%d]: %s\n", len, buf);

            // Convert to uppercase
            for (i = 0; i < len; i++) {
                /*if(buf[i]>='a' && buf[i]<='z'){ buf[i] = buf[i] - 32; }*/
                buf[i] = toupper((unsigned char)buf[i]);
            }

            len = write(client_sock, buf, len);
            printf("finished. len: %d\n", len);
            close(client_sock);
        }
        close(sock);
        return 0;
    }
In conclusion, communicating over TCP guarantees reliable delivery, but at the cost of timeliness: retransmission and in-order delivery can add delay, so smooth, steady delivery cannot be guaranteed. TCP suits scenarios that demand high data reliability but can tolerate less smooth delivery.