I/O multiplexing model [select] [poll] [epoll]

Posted by Mastermind on Tue, 21 Sep 2021 07:13:30 +0200

Contents

select:

Function parsing:

Parameter type analysis:

fd_set operation function:  

  Monitoring process:

Advantages and disadvantages:

poll:

Function parsing:  

  Event structure resolution:

  Advantages and disadvantages:

epoll:

Function parsing:

Principle:

epoll features:

LT and ET modes:

reference resources:

I/O multiplexing lets a single process monitor multiple file descriptors at once. When an event (readable, writable, or exceptional) occurs on a monitored file descriptor, the caller is notified.

This keeps the process from blocking on an operation against a file descriptor that is not ready.

select:

select is one implementation of multiplexing. It copies the sets of file descriptors the user cares about into the kernel for monitoring. When the kernel detects an event on some of them, select returns, and the user then operates on the file descriptors reported as ready.

Function parsing:

int select(int nfds, fd_set *readfds, fd_set *writefds,fd_set *exceptfds, struct timeval *timeout);

Parameter resolution:
nfds: The highest-numbered monitored file descriptor plus 1; the kernel only checks descriptors 0 through nfds-1, which limits the range it has to scan
readfds: Set of file descriptors monitored for read events
writefds: Set of file descriptors monitored for write events
exceptfds: Set of file descriptors monitored for exceptional conditions
timeout: A struct timeval that sets the timeout
         timeout = NULL                     Blocking monitoring (wait indefinitely)
         timeout with both fields set to 0  Non-blocking monitoring (return immediately)
         timeout greater than 0             Monitoring with a timeout
Return value:
     Greater than 0: number of ready file descriptors
     Equal to 0: the timeout expired
     Less than 0: an error occurred
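
The three timeout modes can be tried out with a small helper. This is only a sketch: wait_readable and select_timeout_demo are hypothetical names introduced for illustration, and the pipe merely stands in for any descriptor you might monitor.

```c
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Waits for fd to become readable.
 * timeout_ms < 0 blocks indefinitely (timeout == NULL),
 * timeout_ms == 0 returns immediately, timeout_ms > 0 waits that long.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    struct timeval *tvp = NULL;

    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;              /* fd cannot be stored in an fd_set */

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    if (timeout_ms >= 0) {
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        tvp = &tv;
    }

    /* nfds is the highest monitored descriptor plus one */
    int ready = select(fd + 1, &readfds, NULL, NULL, tvp);
    if (ready < 0)
        return -1;
    return (ready > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}

/* Exercises the three modes on a pipe; returns 0 if they behave as described. */
int select_timeout_demo(void)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;

    int empty_now = wait_readable(p[0], 0);    /* nothing written yet */
    int empty_wait = wait_readable(p[0], 50);  /* gives up after about 50 ms */

    int wrote = (write(p[1], "x", 1) == 1);
    int ready = wrote ? wait_readable(p[0], -1) : -1;  /* data pending, returns at once */

    close(p[0]);
    close(p[1]);

    printf("non-blocking: %d, timed: %d, blocking: %d\n", empty_now, empty_wait, ready);
    return (empty_now == 0 && empty_wait == 0 && ready == 1) ? 0 : -1;
}
```

Calling select_timeout_demo() from a main function runs all three cases in sequence.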

Parameter type analysis:

fd_set:

typedef struct
{
 #ifdef __USE_XOPEN
    __fd_mask fds_bits[__FD_SETSIZE / __NFDBITS];
 # define __FDS_BITS(set) ((set)->fds_bits)
 #else
    __fd_mask __fds_bits[__FD_SETSIZE / __NFDBITS];
 # define __FDS_BITS(set) ((set)->__fds_bits)
 #endif
} fd_set;

//The fd_set structure holds an array of type __fd_mask with __FD_SETSIZE / __NFDBITS elements;

typedef long int __fd_mask;
#define __NFDBITS       (8 * (int) sizeof (__fd_mask))
#define __FD_SETSIZE  1024

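From the source above, on a 64-bit system (where long is 8 bytes) fd_set is effectively the array long fds_bits[1024/64], that is, 16 longs holding 1024 bits;

This array is used as a bitmap with one bit per file descriptor. A single bit can only record whether a descriptor is in the set, not which event is wanted, which also explains why three separate event sets have to be passed in. It follows from the source that, on Linux, select can only monitor file descriptors below 1024, so nfds is at most 1024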
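
Because the bitmap has a fixed size, passing a descriptor of FD_SETSIZE or more to FD_SET is undefined behavior. A minimal sketch of a guard, plus a helper that prints the layout on the current platform; fd_fits_in_set and print_fd_set_layout are hypothetical names, not part of any library:

```c
#include <stdio.h>
#include <sys/select.h>

/* Returns 1 if fd can safely be used with FD_SET/FD_CLR/FD_ISSET, else 0. */
int fd_fits_in_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/* Prints the capacity and the size of fd_set on the current platform. */
void print_fd_set_layout(void)
{
    printf("FD_SETSIZE     = %d descriptors\n", FD_SETSIZE);
    printf("sizeof(fd_set) = %zu bytes (%zu bits)\n",
           sizeof(fd_set), sizeof(fd_set) * 8);
}
```

A program that may open more than FD_SETSIZE descriptors could check fd_fits_in_set(fd) before every FD_SET, or switch to poll or epoll.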

struct timeval:

struct timeval {
    long    tv_sec;         /* seconds */
    long    tv_usec;        /* microseconds */
};

  Note: when select returns, the event sets contain only the file descriptors on which events occurred. File descriptors without events are cleared from the sets

fd_set operation function:  

void FD_CLR(int fd, fd_set *set);      //Removes fd from the set

int  FD_ISSET(int fd, fd_set *set);
//Tests whether fd is in the set
//0: fd is not in the set. Non-zero: fd is in the set

void FD_SET(int fd, fd_set *set);      //Adds fd to the set
void FD_ZERO(fd_set *set);             //Clears the set (removes every fd)
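
As a small illustration of the four macros, the sketch below counts the members of a set by testing each bit in turn. count_members is a hypothetical helper written for this example.

```c
#include <sys/select.h>

/* Returns how many descriptors in the range [0, nfds) are members of *set. */
int count_members(fd_set *set, int nfds)
{
    int count = 0;

    for (int fd = 0; fd < nfds && fd < FD_SETSIZE; fd++) {
        if (FD_ISSET(fd, set))
            count++;
    }
    return count;
}
```

For example, after FD_ZERO(&s), FD_SET(3, &s) and FD_SET(7, &s), count_members(&s, 8) is 2, and it drops to 1 once FD_CLR(3, &s) has been called.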

  Monitoring process:

When select returns, the user has to traverse the event sets with FD_ISSET to find out which file descriptors are ready. Because select overwrites the sets it is given, the event sets must be set up again before the next call
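
The monitoring process described above can be sketched as a loop that keeps a master set and hands select a fresh copy on each iteration. This is an illustrative sketch, not a production event loop: drain_all is a hypothetical name, it only handles read events, and it does not retry on EINTR.

```c
#include <sys/select.h>
#include <unistd.h>

/*
 * Reads from every descriptor in fds[0..count) until each one reaches EOF.
 * Returns the total number of bytes read, or -1 on error.
 */
long drain_all(const int *fds, int count)
{
    fd_set master;      /* the descriptors we want monitored */
    int maxfd = -1;
    int open_fds = 0;
    long total = 0;

    FD_ZERO(&master);
    for (int i = 0; i < count; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;
        if (!FD_ISSET(fds[i], &master)) {
            FD_SET(fds[i], &master);
            open_fds++;
        }
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    while (open_fds > 0) {
        /* select overwrites its argument, so pass a copy of the master set */
        fd_set readfds = master;

        if (select(maxfd + 1, &readfds, NULL, NULL, NULL) < 0)
            return -1;

        /* traverse the returned set to find the ready descriptors */
        for (int fd = 0; fd <= maxfd; fd++) {
            char buf[256];
            ssize_t n;

            if (!FD_ISSET(fd, &readfds))
                continue;
            n = read(fd, buf, sizeof buf);
            if (n < 0)
                return -1;
            if (n == 0) {               /* EOF: stop monitoring this one */
                FD_CLR(fd, &master);
                open_fds--;
            } else {
                total += n;
            }
        }
    }
    return total;
}
```

Keeping the master set separate is a common way to satisfy the "reset before every call" requirement without rebuilding the set from scratch each time.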

Advantages and disadvantages:

1. Advantages:

          It follows the POSIX standard and is portable across platforms

          The timeout can be specified with microsecond precision

2. Disadvantages:

          select polls by traversal, so monitoring efficiency drops as the number of file descriptors grows

          The number of file descriptors select can monitor is limited (FD_SETSIZE, 1024 on Linux)

          Every call to select copies the event sets into the kernel, and the results are copied back on return, which hurts efficiency

          The event sets returned by select contain only the ready file descriptors, so they must be set up again before monitoring again

          The returned event sets must be traversed to find out which file descriptors are ready

poll:

  poll is another implementation of multiplexing. Like select, poll is specified by POSIX (POSIX.1-2001), so it is not limited to Linux, although select is somewhat more widely available on older or non-Unix platforms. poll also works by polling traversal, so it brings little improvement in efficiency;

However, it removes a major drawback of select: there is no fixed limit on the number of monitored file descriptors. Each file descriptor is described by its own event structure;

You only need to fill in the file descriptor to monitor and the events (readable, writable, or exceptional) to monitor it for

Function parsing:  

#include <poll.h>
int poll(struct pollfd *fds, nfds_t nfds, int timeout);
Parameter resolution:
fds: Array of event structures, one per file descriptor to monitor
nfds: Number of elements in fds
timeout: Timeout in milliseconds
     timeout<0   Blocking monitoring (wait indefinitely)
     timeout=0   Non-blocking monitoring (return immediately)
     timeout>0   Monitoring with a timeout
 Return value:
  Less than 0: an error occurred
  Equal to 0: the timeout expired
  Greater than 0: number of elements in fds whose revents is non-zero

  Event structure resolution:

struct pollfd {
    int   fd;         /* file descriptor */
    short events;     /* requested events */
    short revents;    /* returned events */
};


fd:      The file descriptor to monitor
events:  The events to monitor on this file descriptor
     e.g.:
          POLLIN:  Readable event
          POLLOUT: Writable event
     To monitor several events, combine them with the | operator

revents:
         When a monitored event occurs on the file descriptor, the kernel stores the events that occurred in revents and returns them to the caller;
         revents is overwritten on every call to poll, and is 0 if no event occurred
  Advantages and disadvantages:

1. Advantages:

        Describing each file descriptor with an event structure simplifies the code

        The number of monitored file descriptors is no longer limited

        Across repeated calls, the user does not need to set up the monitored events again, because the requested events and the returned events are stored in separate fields

2. Disadvantages:

        poll also monitors by polling traversal, so efficiency drops as the number of monitored file descriptors grows

        Every call to poll copies the event structures into the kernel, and the results are copied back on return, which hurts efficiency

        When poll returns, the user still has to traverse the array to find the ready events;

        poll is specified by POSIX, but it is slightly less widely available than select

epoll:

Function parsing:

1. Create the operation handle of epoll

#include <sys/epoll.h>

int epoll_create(int size);
Parameter: size has been ignored since Linux kernel 2.6.8; it only needs to be greater than 0
     //Originally, size was a hint for the number of event structures epoll would monitor. epoll now grows its
     //internal structures dynamically, so size has no meaning beyond having to be greater than 0
 Return value:
     -1: creating the epoll operation handle failed
     On success, returns the epoll operation handle (a file descriptor)

2. Add, modify, or delete an event structure in the kernel's red-black tree

int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
Parameters:
    epfd:  The epoll operation handle
    op:    One of the following macros:
               EPOLL_CTL_ADD: Add an event structure
               EPOLL_CTL_DEL: Delete an event structure
               EPOLL_CTL_MOD: Modify an event structure
    fd:    The file descriptor to monitor
    event: The event structure


typedef union epoll_data {
    void        *ptr;
    int          fd;
    uint32_t     u32;
    uint64_t     u64;
} epoll_data_t;

struct epoll_event {
    uint32_t     events;      /* Epoll events */
    epoll_data_t data;        /* User data variable */
};
//uint32_t events holds the events to monitor on the file descriptor
//                  EPOLLIN:  Readable event
//                  EPOLLOUT: Writable event
//epoll_data_t data is a union, so all of its members share the same memory
//In other words, only one of ptr and fd can be used at a time. When using ptr, it can point to a user-defined structure, and that structure should then hold fd itself, because data.fd is no longer available
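
The ptr variant can be sketched as follows. struct conn, watch_with_context and close_conn are hypothetical names invented for this example; the point is only that the structure carries fd because data.ptr and data.fd share storage.

```c
#include <stdlib.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-descriptor context; fd lives here because data.fd is unusable with ptr. */
struct conn {
    int fd;
    unsigned long bytes_seen;
};

/*
 * Registers fd for EPOLLIN and attaches a heap-allocated context via data.ptr.
 * Returns the context, or NULL on failure.
 */
struct conn *watch_with_context(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN };
    struct conn *c = calloc(1, sizeof *c);

    if (c == NULL)
        return NULL;
    c->fd = fd;
    ev.data.ptr = c;

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        free(c);
        return NULL;
    }
    return c;
}

/* Removes the descriptor from the epoll instance, closes it and frees the context. */
int close_conn(int epfd, struct conn *c)
{
    int rc = epoll_ctl(epfd, EPOLL_CTL_DEL, c->fd, NULL);

    close(c->fd);
    free(c);
    return rc;
}
```

When epoll_wait later reports an event for this descriptor, events[i].data.ptr is the same struct conn pointer, so the handler reaches both the descriptor and its state without a lookup.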

3. Copy the ready event structures from the doubly linked list into events

int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
Parameters:
   epfd: The epoll operation handle
   events: Output parameter that receives the ready event structures; pass in an array
   maxevents: Maximum number of ready event structures to return (must be greater than 0), normally the capacity of the events array, so that the array is never written out of bounds
   timeout: Timeout in milliseconds
    timeout<0  Blocking monitoring
    timeout=0  Non-blocking monitoring
    timeout>0  Monitoring with a timeout
 Return value:
    Greater than 0: number of ready file descriptors
    Equal to 0: the timeout expired
    Less than 0: an error occurred
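
Putting the three calls together, a minimal sketch of an epoll read loop might look like this. epoll_drain_all is a hypothetical name, MAX_EVENTS is an arbitrary batch size, and the loop handles only read events without retrying on EINTR.

```c
#include <sys/epoll.h>
#include <unistd.h>

#define MAX_EVENTS 16   /* arbitrary capacity of the events array */

/*
 * Reads from every descriptor in fds[0..count) until each one reaches EOF.
 * Returns the total number of bytes read, or -1 on error.
 */
long epoll_drain_all(const int *fds, int count)
{
    struct epoll_event events[MAX_EVENTS];
    long total = 0;
    int active = count;
    int epfd = epoll_create(1);     /* size is ignored but must be > 0 */

    if (epfd < 0)
        return -1;

    for (int i = 0; i < count; i++) {
        struct epoll_event ev = { .events = EPOLLIN };

        ev.data.fd = fds[i];
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev) < 0) {
            close(epfd);
            return -1;
        }
    }

    while (active > 0) {
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);

        if (n < 0) {
            close(epfd);
            return -1;
        }
        /* only ready descriptors are returned, so no scan over the full set */
        for (int i = 0; i < n; i++) {
            char buf[256];
            int fd = events[i].data.fd;
            ssize_t got = read(fd, buf, sizeof buf);

            if (got < 0) {
                close(epfd);
                return -1;
            }
            if (got == 0) {         /* EOF: stop monitoring this one */
                epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
                active--;
            } else {
                total += got;
            }
        }
    }
    close(epfd);
    return total;
}
```

Unlike the select version of such a loop, nothing has to be re-registered between calls to epoll_wait: the interest list stays in the kernel until it is changed with epoll_ctl.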

   

Principle:

When epoll_create creates an epoll handle, the kernel creates an eventpoll structure. This structure contains two important data structures: a red-black tree, which stores the event structures to be monitored, and a doubly linked list of ready events. As soon as a file descriptor in the red-black tree becomes ready, its event structure is added to the doubly linked list. The ready event structures returned by epoll_wait are therefore taken straight from the doubly linked list, with no need to traverse the red-black tree, which reflects the high performance of epoll;

In addition, looking up, inserting, or deleting a monitored file descriptor in the red-black tree takes O(log N) time, which is cheaper than O(N); this also reflects the high performance of epoll

epoll features:

1. There is no fixed limit on the number of monitored file descriptors (memory and system limits still apply, but they are far above 1024)

2. The kernel monitors efficiently: operations on the red-black tree take O(log N) time

3. Returning ready events is efficient. Ready events are taken from the doubly linked list in O(1) time each, and the user receives the ready event structures directly, without polling or traversal

4. epoll is available only on Linux

5. For monitoring with a timeout, the timeout unit of epoll is milliseconds

LT and ET modes:
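
epoll reports events in level-triggered mode (LT, the default) or edge-triggered mode (ET, requested with the EPOLLET flag). In LT mode epoll_wait keeps reporting a descriptor as long as unread data remains, whereas in ET mode it reports only when new data arrives. A minimal sketch of the difference on a pipe, where second_wait_count is a hypothetical helper written for this demonstration:

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Registers the read end of a fresh pipe with EPOLLIN plus extra_flags
 * (0 for LT, EPOLLET for ET), writes one byte, and calls epoll_wait twice
 * without ever reading the byte.
 * Returns the number of events reported by the second call, or -1 on error.
 */
int second_wait_count(uint32_t extra_flags)
{
    int p[2];
    int result = -1;
    int epfd;

    if (pipe(p) < 0)
        return -1;

    epfd = epoll_create(1);
    if (epfd >= 0) {
        struct epoll_event ev = { .events = EPOLLIN | extra_flags };
        struct epoll_event got;

        ev.data.fd = p[0];
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
            write(p[1], "x", 1) == 1 &&
            epoll_wait(epfd, &got, 1, 0) == 1) {
            /* the byte is still unread when epoll_wait is called again */
            result = epoll_wait(epfd, &got, 1, 0);
        }
        close(epfd);
    }
    close(p[0]);
    close(p[1]);
    return result;
}
```

second_wait_count(0) returns 1 because the unread byte keeps the descriptor ready in LT mode, while second_wait_count(EPOLLET) returns 0 because no new data arrived after the first report. This is why ET code usually uses non-blocking descriptors and reads until EAGAIN after each notification.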

reference resources:

Summary of the advantages and disadvantages of three kinds of I/O multiplexing (select poll epoll) & why epoll is so efficient (Spring breeze blog, CSDN blog)

The difference between select, poll and epoll (Sogou interview) (aspirant, blog Garden)

Topics: Linux Back-end