Seastar source code reading - event loop

Posted by smoothrider on Tue, 25 Jan 2022 09:14:12 +0100

Seastar event loop

reactor::do_run

Each thread in a Seastar application calls reactor::do_run() to enter the event loop. do_run() mainly does the following work:

  • Register the various pollers, which are stored in the _pollers data structure (of type std::vector<pollfn*>).
  • Asynchronously wait (via a future) for the network stack to start. Once the network stacks of all threads have started, the reactor officially starts (_start_promise.set_value()).
  • Set up the idle-time detection timer load_timer. When the CPU is idle enough, sleep() is called to put the thread to sleep and avoid unnecessary polling.
  • Repeat the following in the while(true) loop:
    • First call run_some_tasks() to execute tasks from the work queue (it returns after at most a certain number of tasks, to prevent starvation).
    • If _stopped is set, first cancel load_timer, then execute all remaining work in the work queue (until the queue is empty, because a final response may still have to be sent to cpu0). If the current thread is the main thread, it also has to wait for all secondary threads to exit. Finally, clear the I/O queue and exit.
    • Call check_for_work(), which calls the poll() function of each poller registered in _pollers to handle all ready events. If at least one event was ready or the work queue is not empty, the loop continues with the next iteration; otherwise it attempts to sleep.
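
The steps in the list above can be sketched as a toy loop. This is a minimal illustration, not Seastar's code: mini_reactor, task_quota, and sleep_attempts are hypothetical names, the pollers are plain callables, and instead of really sleeping the sketch simply stops once it finds no work.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical miniature reactor; none of these names are Seastar's real types.
struct mini_reactor {
    std::deque<std::function<void()>> tasks;     // the work queue
    std::vector<std::function<bool()>> pollers;  // each returns true if it did work
    bool stopped = false;
    std::size_t task_quota = 2;  // assumed cap on tasks per run_some_tasks() call
    int sleep_attempts = 0;      // how often the loop found nothing to do

    // Run at most task_quota tasks so that the pollers are not starved.
    void run_some_tasks() {
        for (std::size_t n = 0; n < task_quota && !tasks.empty(); ++n) {
            auto t = std::move(tasks.front());
            tasks.pop_front();
            t();
        }
    }

    // Poll every poller (no short-circuit) and report whether any work exists.
    bool check_for_work() {
        bool work = false;
        for (auto& p : pollers) {
            work |= p();
        }
        return work || !tasks.empty();
    }

    void run() {
        assert(task_quota > 0);  // a zero quota would never drain the queue
        while (true) {
            run_some_tasks();
            if (stopped) {
                // Drain everything that is still queued before exiting.
                while (!tasks.empty()) {
                    run_some_tasks();
                }
                break;
            }
            if (!check_for_work()) {
                ++sleep_attempts;  // the real loop would consider sleeping here
                stopped = true;    // the sketch stops instead of blocking
            }
        }
    }
};
```

With five queued tasks and a quota of two, the loop needs three iterations to run them all, and only then does it reach the point where the real reactor would consider sleeping.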

Registering a poller is asynchronous: the registration is packaged as a task and added to the reactor's work queue. Futures may work the same way.

poller

Each poller object owns a pollfn object, which implements the actual poll() function.

The interface of pollfn class is as follows:

struct pollfn {
    virtual ~pollfn() {}
    // Returns true if work was done (false = idle)
    virtual bool poll() = 0;
    // Checks if work needs to be done, but without actually doing any
    // returns true if works needs to be done (false = idle)
    virtual bool pure_poll() = 0;
    // Tries to enter interrupt mode.
    //
    // If it returns true, then events from this poller will wake
    // a sleeping idle loop, and exit_interrupt_mode() must be called
    // to return to normal polling.
    //
    // If it returns false, the sleeping idle loop may not be entered.
    virtual bool try_enter_interrupt_mode() = 0;
    virtual void exit_interrupt_mode() = 0;
};

Here, poll() checks for and handles the relevant ready events, while pure_poll() only checks for ready events and does not handle them.
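
To make the interface concrete, here is a hypothetical implementation over an in-memory queue. queue_pollfn and its members are invented for illustration. The pollfn struct is repeated only so that the sketch compiles on its own, and the interrupt-mode policy (refuse to sleep while items are pending) is an assumption, not something taken from the Seastar source.

```cpp
#include <cassert>
#include <queue>

// Local copy of the interface quoted above, so the sketch is self-contained.
struct pollfn {
    virtual ~pollfn() {}
    virtual bool poll() = 0;
    virtual bool pure_poll() = 0;
    virtual bool try_enter_interrupt_mode() = 0;
    virtual void exit_interrupt_mode() = 0;
};

// Hypothetical poller over an in-memory queue (not a Seastar class).
struct queue_pollfn final : pollfn {
    std::queue<int> pending;      // events waiting to be handled
    int handled = 0;              // number of events handled so far
    bool interrupt_mode = false;

    // Handle every pending event; report whether anything was done.
    bool poll() override {
        bool did_work = !pending.empty();
        while (!pending.empty()) {
            pending.pop();
            ++handled;
        }
        return did_work;
    }

    // Only look at the queue; never consume from it.
    bool pure_poll() override {
        return !pending.empty();
    }

    // Assumed policy: refuse to sleep while there is unhandled work.
    bool try_enter_interrupt_mode() override {
        if (pure_poll()) {
            return false;
        }
        interrupt_mode = true;
        return true;
    }

    void exit_interrupt_mode() override {
        assert(interrupt_mode);  // only valid after a successful enter
        interrupt_mode = false;
    }
};
```

The difference between the two calls shows up in handled: pure_poll() leaves it unchanged, while poll() increases it by the number of events that were pending.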

In order to bypass the kernel, Seastar defines a series of pollers to handle various kinds of events (SMP, I/O, kernel events, etc.).

// The order in which we execute the pollers is very important for performance.
//
// This is because events that are generated in one poller may feed work into others. If
// they were reversed, we'd only be able to do that work in the next task quota.
//
// One example is the relationship between the smp poller and the I/O submission poller:
// If the smp poller runs first, requests from remote I/O queues can be dispatched right away
//
// We will run the pollers in the following order:
//
// 1. SMP: any remote event arrives before anything else
// 2. reap kernel events completion: storage related completions may free up space in the I/O
//                                   queue.
// 4. I/O queue: must be after reap, to free up events. If new slots are freed may submit I/O
// 5. kernel submission: for I/O, will submit what was generated from last step.
// 6. reap kernel events completion: some of the submissions from last step may return immediately.
//                                   For example if we are dealing with poll() on a fd that has events.
poller smp_poller(std::make_unique<smp_pollfn>(*this));

poller reap_kernel_completions_poller(std::make_unique<reap_kernel_completions_pollfn>(*this));
poller io_queue_submission_poller(std::make_unique<io_queue_submission_pollfn>(*this));
poller kernel_submit_work_poller(std::make_unique<kernel_submit_work_pollfn>(*this));
poller final_real_kernel_completions_poller(std::make_unique<reap_kernel_completions_pollfn>(*this));
// ...

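The comment says that ordering the pollers this way improves performance: work generated by one poller can be picked up by a later poller in the same pass instead of waiting for the next task quota. Beyond that, the detailed reasoning is not yet clear to me.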
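
The effect described in the comment can be tried out with a toy model. This is only a sketch under assumptions: the two stages are hypothetical stand-ins for the smp poller and the I/O submission poller, and a "pass" stands for one round of polling between task quotas.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Counts how many polling passes are needed before a request that arrives at
// the "smp" stage is dispatched by the "submit" stage. Both stages are
// hypothetical stand-ins, not Seastar's pollers.
int passes_until_dispatched(bool smp_first) {
    int remote_inbox = 1;  // one request waiting from another shard
    int io_queue = 0;      // requests ready for submission
    int dispatched = 0;    // requests handed to the "kernel"

    // Moves remote requests into the local I/O queue.
    std::function<bool()> smp = [&] {
        if (remote_inbox == 0) return false;
        io_queue += remote_inbox;
        remote_inbox = 0;
        return true;
    };
    // Dispatches whatever is in the local I/O queue.
    std::function<bool()> submit = [&] {
        if (io_queue == 0) return false;
        dispatched += io_queue;
        io_queue = 0;
        return true;
    };

    std::vector<std::function<bool()>> pollers;
    if (smp_first) {
        pollers = {smp, submit};
    } else {
        pollers = {submit, smp};
    }

    int passes = 0;
    while (dispatched == 0) {
        ++passes;
        assert(passes < 10);  // guard: the toy model must converge quickly
        for (auto& p : pollers) {
            p();
        }
    }
    return passes;
}
```

With the smp stage first, the request is dispatched in the same pass in which it arrives. With the order reversed, it has to wait for a second pass.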

go_to_sleep

In the event loop, when neither the pollers nor the task queue have any work, the CPU is first relaxed (on x86, GCC's built-in function __builtin_ia32_pause() is called, which is equivalent to the "pause" assembly instruction with a memory barrier). The loop then checks whether idle_end - idle_start is greater than _max_poll_time. If it is, the CPU is considered idle enough, and sleep() is called to put the current thread to sleep.
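
The idle check described above can be sketched as a small helper. idle_tracker and should_sleep() are hypothetical names, and the caller passes the current time explicitly so that the logic can be tested. Only the comparison idle_end - idle_start > max_poll_time comes from the text.

```cpp
#include <cassert>
#include <chrono>

using clock_type = std::chrono::steady_clock;

// Hypothetical helper mirroring the idle check described above.
struct idle_tracker {
    std::chrono::microseconds max_poll_time;  // how long to keep polling while idle
    bool idle = false;                        // true while no work has been found
    clock_type::time_point idle_start{};

    explicit idle_tracker(std::chrono::microseconds m) : max_poll_time(m) {}

    // Call once per loop iteration; returns true when the loop should try to sleep.
    bool should_sleep(bool found_work, clock_type::time_point now) {
        if (found_work) {
            idle = false;  // any work resets the idle period
            return false;
        }
        if (!idle) {
            idle = true;   // first idle iteration: remember when idling began
            idle_start = now;
        }
        assert(now >= idle_start);  // a steady clock never goes backwards
        auto idle_end = now;
        return idle_end - idle_start > max_poll_time;
    }
};
```

In a real loop the caller would pass clock_type::now(). The tracker keeps answering false until the idle period exceeds the threshold, and any iteration that finds work restarts the measurement.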

void reactor::sleep() {
    for (auto i = _pollers.begin(); i != _pollers.end(); ++i) {
        auto ok = (*i)->try_enter_interrupt_mode();
        if (!ok) {
            while (i != _pollers.begin()) {
                (*--i)->exit_interrupt_mode();
            }
            return;
        }
    }

    _backend->wait_and_process_events(&_active_sigmask);

    for (auto i = _pollers.rbegin(); i != _pollers.rend(); ++i) {
        (*i)->exit_interrupt_mode();
    }
}

As shown above, sleep() tries to put every poller into interrupt mode. If any poller cannot enter interrupt mode, sleep() rolls back (every poller that has already entered interrupt mode exits it) and returns immediately, which ensures that no event is missed while the thread sleeps. Otherwise, wait_and_process_events() is called to wait for an interrupt. Once the interrupt arrives, all pollers exit interrupt mode and control returns to the event loop.
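
The rollback can be observed with mock pollers that log every call. This is a sketch with invented names: mock_pollfn and sleep_like() are hypothetical, and the wait callable stands in for wait_and_process_events(). The control flow is copied from the sleep() listing above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical poller that records every interrupt-mode transition.
struct mock_pollfn {
    std::string name;
    bool can_sleep;                 // whether entering interrupt mode succeeds
    std::vector<std::string>* log;  // shared call log

    bool try_enter_interrupt_mode() {
        assert(log != nullptr);
        if (can_sleep) {
            log->push_back("enter " + name);
        }
        return can_sleep;
    }
    void exit_interrupt_mode() {
        log->push_back("exit " + name);
    }
};

// Same control flow as reactor::sleep() above; returns true if it waited.
template <typename Wait>
bool sleep_like(std::vector<mock_pollfn>& pollers, Wait wait) {
    for (auto i = pollers.begin(); i != pollers.end(); ++i) {
        if (!i->try_enter_interrupt_mode()) {
            // Roll back, in reverse order, every poller that already entered.
            while (i != pollers.begin()) {
                (--i)->exit_interrupt_mode();
            }
            return false;
        }
    }

    wait();  // stand-in for wait_and_process_events()

    for (auto i = pollers.rbegin(); i != pollers.rend(); ++i) {
        i->exit_interrupt_mode();
    }
    return true;
}
```

When the last of three pollers refuses, the log shows the first two entering and then exiting in reverse order, and wait is never called. When all three agree, the log shows three enters, the wait, and three exits in reverse order.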

When the epoll backend is used, wait_and_process_events() implements sleeping by blocking in ::epoll_pwait() until an event arrives.
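
Blocking in ::epoll_pwait() can be tried in isolation on Linux. The sketch below is not Seastar's backend: wait_for_event() is a hypothetical helper that watches a single file descriptor, creates a fresh epoll instance per call for simplicity, and uses an empty signal mask, whereas the reactor passes its own _active_sigmask.

```cpp
#include <cassert>
#include <csignal>
#include <cstdint>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

// Blocks in epoll_pwait() until `fd` becomes readable or timeout_ms elapses.
// Returns the number of ready events (0 on timeout, -1 on error).
int wait_for_event(int fd, int timeout_ms) {
    assert(timeout_ms >= -1);  // -1 means "block without a timeout"
    int epfd = ::epoll_create1(EPOLL_CLOEXEC);
    if (epfd < 0) {
        return -1;
    }
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    if (::epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        ::close(epfd);
        return -1;
    }
    // Signal mask that is in effect only while the call is blocked;
    // an empty set means no signal is blocked during the wait.
    sigset_t mask;
    sigemptyset(&mask);
    epoll_event out[1];
    int n = ::epoll_pwait(epfd, out, 1, timeout_ms, &mask);
    ::close(epfd);
    return n;
}
```

An eventfd makes a convenient wake-up source for experiments: while its counter is zero the call times out, and after a write of a nonzero 64-bit value the descriptor is readable and the call returns immediately.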

Topics: Cpp