Several questions about the Node.js event loop, answered from the libuv source code

Posted by TEENFRONT on Fri, 05 Nov 2021 22:31:28 +0100

Preface

This article resolves a few points I found unclear while learning the Node.js event loop, by combining the official documentation with other material.

1. Are the callbacks in the poll queue executed when the poll phase does not block (timeout is 0) or blocks indefinitely (timeout is -1)?

1.1 What is I/O, and what is a file descriptor?

I/O is short for input and output. Linux treats everything as a file: a regular file, socket, FIFO, pipe, terminal and so on is a stream of bytes. Exchanging information means sending data to and receiving data from these streams, and those are I/O operations. When a process opens an existing file or creates a new one, the kernel returns a file descriptor to the process. A file descriptor is a non-negative integer that indexes the process's table of open files; each entry refers to a kernel record holding information such as the file's type, access mode and current position. Every system call that performs I/O takes a file descriptor.
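As a small illustration of file descriptors being plain integers handed to I/O system calls, here is a minimal sketch that pushes a message through a pipe and reads it back. The helper name fd_roundtrip is made up for this example; pipe, write, read and close are the standard POSIX calls.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: sends msg through a pipe and reads it back into out.
 * Returns the number of bytes read, or -1 on error. */
ssize_t fd_roundtrip(const char *msg, char *out, size_t out_size) {
  int fds[2]; /* fds[0] is the read end, fds[1] is the write end */
  ssize_t n;

  if (pipe(fds) == -1)
    return -1;

  /* Descriptors are just small non-negative integers. */
  assert(fds[0] >= 0 && fds[1] >= 0);

  /* Every I/O system call below is addressed by descriptor. */
  if (write(fds[1], msg, strlen(msg)) == -1) {
    close(fds[0]);
    close(fds[1]);
    return -1;
  }
  n = read(fds[0], out, out_size);

  close(fds[0]);
  close(fds[1]);
  return n;
}
```

Calling fd_roundtrip("hello", buf, sizeof(buf)) with a 16-byte buffer should return 5 and leave the same five bytes in buf.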

1.2 What is epoll?

epoll is a scalable I/O event notification mechanism provided by the Linux kernel. In a browser, when we want to listen for mouse events we call element.addEventListener('click', cbFn); that is the browser's event notification mechanism. libuv calls the epoll APIs to get I/O event notification in a similar way. Both follow the observer pattern: the Observer registers with the Subject, and when the Subject changes, the registered Observer is notified and its callback is executed.

1.3 epoll workflow

epoll has three steps:

epoll_create: creates an epoll instance (a file node in the kernel's epoll file system) and returns a file descriptor for it. The kernel allocates the instance's objects from epoll's own kernel cache and sets up a red-black tree for the monitored files and a linked list for ready events.
epoll_ctl: puts the file to be monitored on the red-black tree and registers a callback inside the kernel, so that when data arrives for this descriptor the kernel places it on the ready list.
epoll_wait: checks whether the ready list holds any events, copies them out to the caller and removes them from the ready list. If the list is empty, it waits until an event arrives or the timeout expires.
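The three steps can be tried directly. The following is a minimal Linux-only sketch, not libuv code: the function name epoll_demo and its out-parameters are invented for the example, and it uses epoll_create1 and epoll_wait where libuv uses its own wrappers and epoll_pwait.

```c
#include <assert.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical demo, Linux only. Returns 0 on success and stores the number
 * of ready descriptors seen before and after a byte is written to a pipe. */
int epoll_demo(int *before, int *after) {
  struct epoll_event ev;
  struct epoll_event ready[8];
  int fds[2];
  int epfd;
  int rc = -1;

  if (pipe(fds) == -1)
    return -1;

  /* Step 1: create the epoll instance; the result is itself a descriptor. */
  epfd = epoll_create1(0);
  if (epfd == -1)
    goto close_pipe;

  /* Step 2: ask to be notified when the read end becomes readable. */
  memset(&ev, 0, sizeof(ev));
  ev.events = EPOLLIN;
  ev.data.fd = fds[0];
  if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) == -1)
    goto close_all;

  /* Step 3a: nothing has been written, so a wait with timeout 0
   * returns immediately with no events. */
  *before = epoll_wait(epfd, ready, 8, 0);

  /* Step 3b: once data is pending, even a wait with timeout -1
   * (block indefinitely) returns at once. */
  if (write(fds[1], "x", 1) != 1)
    goto close_all;
  *after = epoll_wait(epfd, ready, 8, -1);
  if (*after == 1)
    assert(ready[0].data.fd == fds[0]);
  rc = 0;

close_all:
  close(epfd);
close_pipe:
  close(fds[0]);
  close(fds[1]);
  return rc;
}
```

The two waits correspond to the two timeout values discussed in this section: 0 never blocks, and -1 blocks only for as long as nothing is ready.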

1.4 libuv's implementation of the poll phase

void uv__io_poll(uv_loop_t* loop, int timeout) {
  // ...
  // If there are no observers, return directly
  if (loop->nfds == 0) {
    assert(QUEUE_EMPTY(&loop->watcher_queue));
    return;
  }
  memset(&e, 0, sizeof(e));
  // Register all I/O observers with the epoll system
  while (!QUEUE_EMPTY(&loop->watcher_queue)) {
    // Get the head of the queue and remove it from loop->watcher_queue
    q = QUEUE_HEAD(&loop->watcher_queue);
    QUEUE_REMOVE(q);
    QUEUE_INIT(q);
    // Get I/O observer structure
    w = QUEUE_DATA(q, uv__io_t, watcher_queue);
    assert(w->pevents != 0);
    assert(w->fd >= 0);
    assert(w->fd < (int) loop->nwatchers);
    e.events = w->pevents;
    e.data.fd = w->fd;
    if (w->events == 0)
      op = EPOLL_CTL_ADD;
    else
      op = EPOLL_CTL_MOD;
    // epoll_ctl operation, register file descriptors and I/O events to be monitored with epoll
    if (epoll_ctl(loop->backend_fd, op, w->fd, &e)) {
      if (errno != EEXIST)
        abort();
      assert(op == EPOLL_CTL_ADD);
      // loop->backend_fd was created by epoll_create
      if (epoll_ctl(loop->backend_fd, EPOLL_CTL_MOD, w->fd, &e))
        abort();
    }
    w->events = w->pevents;
  }
  // ...
  // Record the current time so the elapsed time can be measured and the loop below exited once the timeout has expired
  base = loop->time;
  // When count drops to 0, the loop below exits
  count = 48; /* Benchmarks suggest this gives the best throughput. */
  real_timeout = timeout;
  /*
    Enter the epoll_pwait loop that polls for I/O events.
    Whether the loop below exits is decided mainly by timeout and count, similar to how the event loop as a whole is controlled
  */
  for (;;) {
    // ...
    // nfds is the number of file descriptors with I/O events. 0 means no event occurred, either because the timeout expired or because timeout is 0
    // events holds the events returned by the kernel
    nfds = epoll_pwait(loop->backend_fd,
                       events,
                       ARRAY_SIZE(events),
                       timeout,
                       psigset);
    // ...
    // No I/O Events
    if (nfds == 0) {
      // ...
      // If timeout is -1, the loop continues
      if (timeout == -1)
        continue;
      // If timeout is 0, the function returns directly
      if (timeout == 0)
        return;
      // Update the timeout for the next epoll_pwait call
      goto update_timeout;
    }
    // epoll_pwait returned an error
    if (nfds == -1) {
      if (errno != EINTR)
        abort();
      // If timeout is -1, the loop continues
      if (timeout == -1)
        continue;
      // If timeout is 0, the function returns directly
      if (timeout == 0)
        return;
      // Update the timeout for the next epoll_pwait call
      goto update_timeout;
    }
    // ...
    // Get the I/O observer and call the associated callback function
    for (i = 0; i < nfds; i++) {
      pe = events + i;
      fd = pe->data.fd;
      // ...
      // If there are valid events
      if (pe->events != 0) {
        if (w == &loop->signal_io_watcher)
          have_signals = 1;
        else
          // Execute callback
          w->cb(loop, w, pe->events);
        nevents++;
      }
    }
    // ...
    if (nevents != 0) {
      // If the events array was filled completely (more events may be waiting) and count has not reached 0, poll once more without blocking
      if (nfds == ARRAY_SIZE(events) && --count != 0) {
        /* Poll for more events but don't block this time. */
        timeout = 0;
        continue;
      }
      return;
    }
    // If timeout is 0, the function returns directly
    if (timeout == 0)
      return;
    // If timeout is -1, the loop continues
    if (timeout == -1)
      continue;

// Recalculate timeout
update_timeout:
    assert(timeout > 0);

    real_timeout -= (loop->time - base);
    if (real_timeout <= 0)
      return;
    // timeout remaining
    timeout = real_timeout;
  }
}

1. Register the I/O observers with epoll
2. Call epoll_ctl to register the file descriptors and the I/O events to be monitored
3. Enter the loop and call epoll_pwait to poll for I/O events
3.1 If there is no I/O event: exit polling if timeout is 0, continue polling if timeout is -1, otherwise recalculate the remaining timeout
3.2 If epoll_pwait returns an error (only EINTR is tolerated): exit polling if timeout is 0, continue polling if timeout is -1, otherwise recalculate the remaining timeout
3.3 If there are I/O events, call the associated callbacks, then return (or poll once more without blocking if the events array was full)
3.4 If no callback was executed and timeout is 0, exit polling
3.5 If no callback was executed and timeout is -1, continue polling

Therefore, when the blocking time (timeout) is 0, the callbacks are executed if I/O events are ready and the loop then moves on to the next phase; if no I/O event is ready, it moves on to the next phase directly. When timeout is -1, the poll phase keeps waiting until an I/O event arrives and then executes the callbacks.
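The branches of uv__io_poll can be condensed into two small decision functions. This is a sketch of the control flow only, written for this article: the enum and function names are hypothetical and nothing here is libuv API.

```c
#include <assert.h>

/* Hypothetical names, not part of libuv. */
enum poll_next { POLL_RETURN, POLL_AGAIN, POLL_UPDATE_TIMEOUT };

/* Decision after an epoll_pwait call that ran no callback:
 * no event, an EINTR error, or only events that were filtered out. */
enum poll_next after_idle_wait(int timeout) {
  assert(timeout >= -1);
  if (timeout == 0)
    return POLL_RETURN;        /* non-blocking poll: leave the poll phase */
  if (timeout == -1)
    return POLL_AGAIN;         /* block indefinitely: wait again */
  return POLL_UPDATE_TIMEOUT;  /* positive timeout: recompute what is left */
}

/* Decision after at least one callback ran (nevents != 0).
 * capacity is the size of the events array. */
enum poll_next after_callbacks(int nfds, int capacity, int *count, int *timeout) {
  if (nfds == capacity && --*count != 0) {
    *timeout = 0;              /* poll for more events, but do not block */
    return POLL_AGAIN;
  }
  return POLL_RETURN;
}
```

With count starting at 48, a completely filled events array leads to another non-blocking poll at most 47 times; any partially filled array ends the poll phase right after its callbacks.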

2. When are callbacks added to the pending callbacks queue?

static int uv__run_pending(uv_loop_t* loop) {
  QUEUE* q;
  QUEUE pq;
  uv__io_t* w;

  if (QUEUE_EMPTY(&loop->pending_queue))
    return 0;

  QUEUE_MOVE(&loop->pending_queue, &pq);

  while (!QUEUE_EMPTY(&pq)) {
    q = QUEUE_HEAD(&pq);
    QUEUE_REMOVE(q);
    QUEUE_INIT(q);
    w = QUEUE_DATA(q, uv__io_t, pending_queue);
    w->cb(loop, w, POLLOUT);
  }

  return 1;
}

This function traverses the nodes of the loop->pending_queue queue, obtains the I/O observer for each node and calls its cb. A search shows that uv__io_feed is the only place that inserts nodes into loop->pending_queue. Its code is as follows:

void uv__io_feed(uv_loop_t* loop, uv__io_t* w) {
  if (QUEUE_EMPTY(&w->pending_queue))
    QUEUE_INSERT_TAIL(&loop->pending_queue, &w->pending_queue);
}
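To see how uv__io_feed and uv__run_pending cooperate, here is a simplified model that replaces libuv's intrusive QUEUE with a fixed array. All names (io_feed, run_pending, struct watcher, struct loop, MAX_PENDING, count_call) are invented for this sketch; only the behaviour is meant to match: a watcher is queued at most once, and the pending list is moved aside before any callback runs.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16 /* arbitrary capacity for this sketch */

struct watcher;
typedef void (*watcher_cb)(struct watcher *w);

struct watcher {
  watcher_cb cb;
  int queued; /* stands in for !QUEUE_EMPTY(&w->pending_queue) */
  int calls;  /* how many times cb has run */
};

struct loop {
  struct watcher *pending[MAX_PENDING];
  size_t npending;
};

/* Like uv__io_feed: queue the watcher unless it is already queued. */
void io_feed(struct loop *loop, struct watcher *w) {
  if (w->queued)
    return;
  assert(loop->npending < MAX_PENDING);
  w->queued = 1;
  loop->pending[loop->npending++] = w;
}

/* Like uv__run_pending: move the queue aside, then run every callback.
 * A watcher fed again from inside a callback runs in the next iteration. */
int run_pending(struct loop *loop) {
  struct watcher *batch[MAX_PENDING];
  size_t n = loop->npending;
  size_t i;

  if (n == 0)
    return 0;

  for (i = 0; i < n; i++)
    batch[i] = loop->pending[i];
  loop->npending = 0; /* plays the role of QUEUE_MOVE */

  for (i = 0; i < n; i++) {
    batch[i]->queued = 0; /* plays the role of QUEUE_REMOVE + QUEUE_INIT */
    batch[i]->cb(batch[i]);
  }
  return 1;
}

/* Example callback: just counts invocations. */
void count_call(struct watcher *w) {
  w->calls++;
}
```

Feeding the same watcher twice before run_pending leaves a single entry in the queue, so its callback runs once; a second run_pending call then finds the queue empty and returns 0.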

Searching further for the callers of uv__io_feed gives the following:

// src/unix/pipe.c
void uv_pipe_connect(uv_connect_t* req,
                    uv_pipe_t* handle,
                    const char* name,
                    uv_connect_cb cb) {
  // ...
  if (err)
    uv__io_feed(handle->loop, &handle->io_watcher);
}

// src/unix/stream.c
static void uv__write_req_finish(uv_write_t* req) {
  // ...
  uv__io_feed(stream->loop, &stream->io_watcher);
}

// src/unix/tcp.c
int uv__tcp_connect(uv_connect_t* req,
                    uv_tcp_t* handle,
                    const struct sockaddr* addr,
                    unsigned int addrlen,
                    uv_connect_cb cb) {
  // ...
  if (handle->delayed_error)
    uv__io_feed(handle->loop, &handle->io_watcher);
  // ...
}

uv__io_feed is called in three more places in src/unix/udp.c.

Therefore, callbacks are registered for the pending callbacks phase in the following scenarios:

When connecting a pipe fails
When a stream write request completes
When a TCP connection has a delayed error
Several UDP scenarios

3. Timer callbacks run as soon as possible after their threshold is reached, but operating system scheduling or other running callbacks may delay them. What does that mean?

An interpretation of the C++ source code can be found here: Portal
TL;DR: the process is as follows:

setTimeout/setInterval are implemented by the built-in class Timeout. The time threshold is an integer in the range 1 to 2^31-1 ms, so setTimeout(callback, 0) is converted to setTimeout(callback, 1)
On entering a tick, the loop obtains the start time of the tick. It does so by reading the system clock through the uv__hrtime function, and that call may be affected by other applications
libuv stores all timers in a binary min-heap keyed by execution time. In a binary min-heap a parent node is never larger than its children, so the root node is the smallest
Execution time of a timer callback = start time of the tick in which the callback was registered + timer threshold (timeout)
When the execution time of the timer at the root of the min-heap is <= the start time of the current tick, at least one timer has expired. The loop repeatedly takes the root of the min-heap and calls the callback of that timer.
When the execution time of the timer at the root of the min-heap is > the start time of the current tick, its time has not come yet. Because of the min-heap property, if the root has not expired, none of the other nodes has expired either. At this point the timers phase ends and the next phase is entered
Execute the callbacks of the pending callbacks, idle and prepare phases
Calculate how long poll may block in the current tick. If the pending callbacks, idle or close callbacks queue is not empty, it is 0, so that the next tick is entered as soon as possible to execute those callbacks. If a timer has already expired, it is 0, so that the next tick is entered as soon as possible to execute that timer's callback. If there are timers but none has expired, blocking time = execution time of the timer at the root of the min-heap - start time of the current tick. If there is no timer, it is -1, which means blocking indefinitely
Execute the callbacks of the check and close callbacks phases
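The arithmetic in this process can be written down as a few small functions. This is a sketch under stated assumptions rather than the Node.js or libuv code: struct loop_state and the function names are hypothetical, times are plain millisecond counters, and the min-heap is reduced to its root value.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define TIMEOUT_MAX 2147483647 /* 2^31 - 1 ms */

/* Simplified, hypothetical view of the loop state that matters here. */
struct loop_state {
  uint64_t time;         /* start time of the current tick, in ms */
  int has_pending;       /* pending callbacks queue is not empty */
  int has_idle;          /* idle queue is not empty */
  int has_closing;       /* close callbacks queue is not empty */
  int has_timers;        /* at least one timer is in the min-heap */
  uint64_t earliest_due; /* root of the min-heap, valid if has_timers */
};

/* Thresholds outside 1..2^31-1 fall back to 1 ms. */
int64_t clamp_threshold(int64_t after_ms) {
  if (after_ms < 1 || after_ms > TIMEOUT_MAX)
    return 1;
  return after_ms;
}

/* Execution time = tick start time at registration + threshold. */
uint64_t due_time(const struct loop_state *s, int64_t after_ms) {
  return s->time + (uint64_t) clamp_threshold(after_ms);
}

/* How long the poll phase may block in this tick, in ms.
 * 0 means do not block, -1 means block indefinitely. */
int poll_timeout(const struct loop_state *s) {
  uint64_t diff;

  assert(s != NULL);
  if (s->has_pending || s->has_idle || s->has_closing)
    return 0;
  if (!s->has_timers)
    return -1;
  if (s->earliest_due <= s->time)
    return 0; /* the root timer has already expired */
  diff = s->earliest_due - s->time;
  return diff > INT_MAX ? INT_MAX : (int) diff;
}
```

For a tick that starts at 1000 ms, a timer registered with threshold 0 is due at 1001, and a loop whose only timer is due at 1250 may block in poll for 250 ms.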

To sum up, "operating system scheduling or other running callbacks" refers to:

The system clock call may be affected by other applications
While poll blocks, the thread is suspended and the CPU is scheduled to do other work; when the thread gets CPU time again is outside its control
The execution time of the callbacks in each phase cannot be controlled

Therefore a timer callback is executed as soon as possible after its threshold has passed, rather than exactly at that point in time.
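A tiny simulation makes the delay concrete. It assumes, as described earlier, that a timer is only checked against the start time of a tick; the function name first_run_time and the tick values are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: given the due time of a timer and the start times of
 * successive ticks (in ascending order), returns the start time of the first
 * tick whose timers phase runs the callback, or -1 if no listed tick does. */
int64_t first_run_time(uint64_t due, const uint64_t *tick_starts, size_t n) {
  size_t i;

  for (i = 0; i < n; i++) {
    assert(i == 0 || tick_starts[i] >= tick_starts[i - 1]);
    /* The timer runs only once a tick starts at or after its due time. */
    if (due <= tick_starts[i])
      return (int64_t) tick_starts[i];
  }
  return -1;
}
```

If ticks start at 1000, 1004 and 1009 ms because callbacks in between took a few milliseconds, a timer due at 1001 ms first runs in the tick that starts at 1004 ms, 3 ms after its threshold.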

