A complete understanding of Linux interrupt processing

Posted by Andy-H on Fri, 19 Nov 2021 17:40:55 +0100

What is an interrupt

An interrupt is a mechanism by which an external device notifies the CPU that it has completed some work (for example, a hard disk signals the CPU through an interrupt once a read or write operation has finished). Early computers without an interrupt mechanism had to query the status of external devices by polling. Because polling is speculative (the device is not necessarily ready when it is queried), many of those queries are wasted, which makes polling very inefficient. With interrupts, the external device actively notifies the CPU, so the CPU does not need to poll, and efficiency improves greatly.

Physically, an interrupt is an electrical signal generated by a hardware device and sent directly to an input pin of an interrupt controller (such as the 8259A), which in turn sends a corresponding signal to the processor. Once the processor detects the signal, it suspends the work it is currently doing and handles the interrupt instead. The processor then notifies the OS that an interrupt has occurred, so that the OS can handle it properly. Different devices use different interrupts, and each interrupt is identified by a unique number. These numbers are usually called interrupt request (IRQ) lines.

Interrupt controller

An x86 CPU provides only two external pins for interrupts: NMI and INTR. NMI carries non-maskable interrupts, which are typically used for events such as power failure and memory parity errors. INTR carries maskable interrupts, which can be masked by setting the interrupt mask bit; it mainly receives interrupt signals from external hardware, which are delivered to the CPU by the interrupt controller.

There are two common interrupt controllers:

Programmable Interrupt Controller 8259A

The traditional PIC (Programmable Interrupt Controller) consists of two 8259A-style external chips connected in a "cascade". Each chip can handle up to 8 different IRQ input lines. Because the INT output line of the slave PIC is connected to the IRQ2 pin of the master PIC, the number of available IRQ lines is 15 rather than 16, as shown in the figure below.

[Figure: two cascaded 8259A chips]
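
The cascade arithmetic can be sketched in a few lines of C. This is only an illustration: the helper names are hypothetical, and the numbering (IRQ 0-7 on the master, IRQ 8-15 on the slave) is the conventional PC layout, which the text above does not spell out.

```c
#include <assert.h>

/* Which 8259A chip an IRQ line is wired to in the two-chip cascade. */
enum pic_chip { PIC_MASTER, PIC_SLAVE };

/* Master pin that carries the slave's INT output (IRQ2 in the text). */
#define PIC_CASCADE_IRQ 2

/* Assumed layout: IRQ 0-7 arrive on the master, IRQ 8-15 on the slave. */
enum pic_chip pic_chip_for_irq(unsigned int irq)
{
    assert(irq < 16);
    return irq < 8 ? PIC_MASTER : PIC_SLAVE;
}

/* Input pin (0-7) on the chip that owns the IRQ. */
unsigned int pic_pin_for_irq(unsigned int irq)
{
    assert(irq < 16);
    return irq & 7;
}

/* Lines left for devices: 2 chips x 8 pins, minus the cascade pin. */
unsigned int pic_usable_irq_lines(void)
{
    unsigned int irq, count = 0;

    for (irq = 0; irq < 16; irq++)
        if (irq != PIC_CASCADE_IRQ)
            count++;
    return count;
}
```

Under these assumptions, IRQ 14 maps to pin 6 of the slave chip, and pic_usable_irq_lines() gives the 15 lines mentioned above.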

Advanced Programmable Interrupt Controller (APIC)

The 8259A is only suitable for single-CPU systems. To take full advantage of the parallelism of an SMP architecture, it must be possible to deliver interrupts to every CPU in the system. For this reason, Intel introduced a new component called the I/O Advanced Programmable Interrupt Controller (I/O APIC) to replace the old 8259A Programmable Interrupt Controller. The design has two parts. The first is the "local APIC", which delivers interrupt signals to a specific processor; a machine with three processors, for example, must have three local APICs. The second is the I/O APIC, which collects interrupt signals from I/O devices and sends them to the local APICs when those devices need to interrupt. A system can have up to 8 I/O APICs.

Each local APIC has 32-bit registers, an internal clock, a local timer device, and two additional IRQ lines, LINT0 and LINT1, reserved for local interrupts. All local APICs are connected to the I/O APICs, forming a multi-APIC system, as shown in the figure below.

At present, most single-processor systems contain an I/O APIC chip, which can be configured in either of two ways:

In standard 8259A mode. The local APIC is disabled, the external I/O APIC is connected to the CPU, and the LINT0 and LINT1 lines are connected to the INTR and NMI pins respectively.
As a standard external I/O APIC. The local APIC is enabled and all external interrupts are received through the I/O APIC.

To check whether a system is using the I/O APIC, enter the following command:

# cat /proc/interrupts
           CPU0       
  0:      90504    IO-APIC-edge  timer
  1:        131    IO-APIC-edge  i8042
  8:          4    IO-APIC-edge  rtc
  9:          0    IO-APIC-level  acpi
 12:        111    IO-APIC-edge  i8042
 14:       1862    IO-APIC-edge  ide0
 15:         28    IO-APIC-edge  ide1
177:          9    IO-APIC-level  eth0
185:          0    IO-APIC-level  via82cxxx
...

If IO-APIC appears in the output, your system is using the APIC. If you see XT-PIC instead, your system is using the 8259A chip.
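
The same check can be done programmatically. The sketch below is a minimal illustration, not a kernel interface: the function name and enum are made up for this example, and it simply searches text that has already been read from /proc/interrupts for the two marker strings described above.

```c
#include <assert.h>
#include <string.h>

enum irq_controller { CTRL_UNKNOWN, CTRL_8259A, CTRL_IO_APIC };

/*
 * Classify the interrupt controller from the contents of /proc/interrupts.
 * `text` is the file content as a NUL-terminated string; reading the file
 * (for example with fopen() and fread()) is left to the caller.
 */
enum irq_controller detect_controller(const char *text)
{
    assert(text != NULL);

    if (strstr(text, "IO-APIC"))
        return CTRL_IO_APIC;   /* lines such as "IO-APIC-edge  timer" */
    if (strstr(text, "XT-PIC"))
        return CTRL_8259A;     /* legacy 8259A naming */
    return CTRL_UNKNOWN;
}
```

Taking the text as a parameter keeps the function easy to test with sample lines like the ones in the listing above.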

Interrupt classification

Interrupts can be divided into synchronous interrupts and asynchronous interrupts:

Synchronous interrupts are generated by the CPU control unit while instructions execute. They are called synchronous because the CPU issues them only after an instruction has finished executing, never in the middle of one. System calls are an example.
Asynchronous interrupts are generated by other hardware devices at arbitrary times with respect to the CPU clock signal, which means they can occur between any two instructions. Keyboard interrupts are an example.

In Intel's official documentation, synchronous interrupts are called exceptions, and asynchronous interrupts are called interrupts.

Interrupts can be divided into maskable interrupts and non-maskable interrupts. Exceptions can be divided into faults, traps and aborts.

Broadly speaking, interrupts fall into four categories: interrupts, traps, faults and aborts. The table below summarizes the similarities and differences between these categories.

Table: interrupt categories and their behaviors

Category	Cause	Asynchronous / synchronous	Return behavior
interrupt	Signal from an I/O device	asynchronous	Always returns to the next instruction
trap	Intentional exception	synchronous	Always returns to the next instruction
fault	Potentially recoverable error	synchronous	Returns to the current instruction
abort	Unrecoverable error	synchronous	Does not return
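
The "return behavior" column can be made concrete with a small sketch. It is only a model of the table: the names are hypothetical, and real hardware determines the saved return address itself rather than calling a function like this.

```c
#include <assert.h>
#include <stddef.h>

enum intr_category { CAT_INTERRUPT, CAT_TRAP, CAT_FAULT, CAT_ABORT };

/*
 * Model of the table: given the address and length of the instruction
 * associated with the event, compute where execution resumes.
 * Returns 1 and stores the address in *resume, or 0 if control never
 * returns (abort).
 */
int resume_address(enum intr_category cat, unsigned long insn_addr,
                   unsigned long insn_len, unsigned long *resume)
{
    assert(resume != NULL);

    switch (cat) {
    case CAT_INTERRUPT:
    case CAT_TRAP:
        *resume = insn_addr + insn_len;  /* continue with the next instruction */
        return 1;
    case CAT_FAULT:
        *resume = insn_addr;             /* retry the current instruction */
        return 1;
    case CAT_ABORT:
    default:
        return 0;                        /* unrecoverable: no return */
    }
}
```

A page fault is the usual example of the fault row: once the missing page has been brought in, the instruction that faulted is executed again.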

Each interrupt or exception in the x86 architecture is assigned a unique number, or vector (an 8-bit unsigned integer). The vectors of non-maskable interrupts and exceptions are fixed, while the vectors of maskable interrupts can be changed by programming the interrupt controller.

Interrupt handling - top half (hard interrupt)

Because the APIC interrupt controller is somewhat complicated, this article uses the 8259A interrupt controller to explain how Linux handles interrupts.

Interrupt processing related structure

As mentioned earlier, the 8259A interrupt controller consists of two cascaded 8259A-style external chips. Each chip can handle up to 8 different IRQs (interrupt requests), and because one pin is used for the cascade, the number of available IRQ lines is 15, as shown in the figure below:

In the kernel, each IRQ line is described by an irq_desc_t structure, which is defined as follows:

typedef struct {
    unsigned int status;        /* IRQ status */
    hw_irq_controller *handler;
    struct irqaction *action;   /* IRQ action list */
    unsigned int depth;         /* nested irq disables */
    spinlock_t lock;
} irq_desc_t;

The fields of the irq_desc_t structure are:

status: the status of the IRQ line.
handler: a pointer to an hw_interrupt_type structure (hw_irq_controller is a typedef for it) that holds the hardware-specific functions for the IRQ line. For example, after the 8259A interrupt controller delivers an interrupt signal, it must be sent an acknowledgement before it will deliver further interrupts; the function that sends the acknowledgement is the ack function in hw_interrupt_type.
action: a pointer to an irqaction structure, the entry point for handling the interrupt. Because one IRQ line can be shared by several devices, action is a linked list in which each node is the interrupt handler entry for one device.
depth: the nesting depth of disable operations on the IRQ line, which keeps repeated disable and enable calls balanced.
lock: a spinlock that prevents multiple CPUs from operating on the IRQ line at the same time.
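
To see why a depth counter is useful, here is a small user-space sketch of nested disable and enable calls. It assumes that each disable increments depth and each enable decrements it, which is how the "nested irq disables" comment in the structure reads; the sim_ names and the flag value are hypothetical, and the real kernel functions also take the lock and talk to the hardware.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_IRQ_DISABLED 0x2   /* stand-in flag, value chosen for this sketch */

struct sim_irq_desc {
    unsigned int status;
    unsigned int depth;        /* how many disables are outstanding */
};

/* Disable the line; only the first of several nested disables changes the state. */
void sim_disable_irq(struct sim_irq_desc *desc)
{
    assert(desc != NULL);
    if (desc->depth++ == 0)
        desc->status |= SIM_IRQ_DISABLED;
}

/*
 * Undo one disable. The line is re-enabled only when the last outstanding
 * disable is undone. Returns -1 for an unbalanced enable, 0 otherwise.
 */
int sim_enable_irq(struct sim_irq_desc *desc)
{
    assert(desc != NULL);
    if (desc->depth == 0)
        return -1;
    if (--desc->depth == 0)
        desc->status &= ~SIM_IRQ_DISABLED;
    return 0;
}
```

With this scheme, two callers can each disable and later enable the same line without the first enable accidentally switching the line back on for the other caller.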

hw_interrupt_type is hardware-specific, so we will not cover it here. Let's look at the irqaction structure instead:

struct irqaction {
    void (*handler)(int, void *, struct pt_regs *);
    unsigned long flags;
    unsigned long mask;
    const char *name;
    void *dev_id;
    struct irqaction *next;
};

The fields of the irqaction structure are:

handler: the interrupt handler function. Its first parameter is the interrupt number, its second is the device ID, and its third is the register values the kernel saved when the interrupt occurred.
flags: flag bits that control the behavior of the irqaction, such as whether the IRQ line can be shared with other devices.
name: the name of the interrupt handler.
dev_id: the device ID.
next: each device has its own irqaction structure as its interrupt handler entry. Because several devices can share one IRQ line, the next field links their entries together.

The relationships between these structures are shown in the figure below:

[Figure: irq_desc_t and its irqaction list]

Register interrupt processing entry

In the kernel, an interrupt handler entry is registered with the setup_irq() function, whose code is as follows:

int setup_irq(unsigned int irq, struct irqaction * new)
{
    int shared = 0;
    unsigned long flags;
    struct irqaction *old, **p;
    irq_desc_t *desc = irq_desc + irq;
    ...
    spin_lock_irqsave(&desc->lock,flags);
    p = &desc->action;
    if ((old = *p) != NULL) {
        if (!(old->flags & new->flags & SA_SHIRQ)) {
            spin_unlock_irqrestore(&desc->lock,flags);
            return -EBUSY;
        }

        do {
            p = &old->next;
            old = *p;
        } while (old);
        shared = 1;
    }

    *p = new;

    if (!shared) {
        desc->depth = 0;
        desc->status &= ~(IRQ_DISABLED | IRQ_AUTODETECT | IRQ_WAITING);
        desc->handler->startup(irq);
    }
    spin_unlock_irqrestore(&desc->lock,flags);

    register_irq_proc(irq); // register this IRQ with the proc file system
    return 0;
}

setup_irq() is relatively simple: it uses the irq number to find the corresponding irq_desc_t structure and appends the new irqaction to that structure's action list. Note that if the line is already in use and either the existing handler or the new one does not allow sharing (its flags field does not have SA_SHIRQ set), the function returns -EBUSY.
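
The sharing rule is easier to follow in isolation. The sketch below reproduces only the list handling of setup_irq() in user space; the sim_ names and the flag value are stand-ins, and locking, the status bits and the hardware startup call are deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SIM_SA_SHIRQ 0x1   /* stand-in for SA_SHIRQ, value chosen for this sketch */

struct sim_irqaction {
    unsigned long flags;
    const char *name;
    struct sim_irqaction *next;
};

/*
 * Append new_action to the chain at *head. If the line is already in use,
 * both the existing first entry and the new entry must allow sharing,
 * otherwise the registration is refused with -EBUSY.
 */
int sim_setup_irq(struct sim_irqaction **head, struct sim_irqaction *new_action)
{
    struct sim_irqaction **p;

    assert(head != NULL && new_action != NULL);

    p = head;
    if (*p != NULL) {
        if (!((*p)->flags & new_action->flags & SIM_SA_SHIRQ))
            return -EBUSY;
        while (*p != NULL)          /* walk to the end of the chain */
            p = &(*p)->next;
    }
    new_action->next = NULL;
    *p = new_action;
    return 0;
}

/* Number of handlers currently registered on a line. */
unsigned int sim_chain_length(const struct sim_irqaction *head)
{
    unsigned int n = 0;

    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

A line whose first handler was registered without the sharing flag therefore stays exclusive, while two handlers that both set the flag end up chained on the same line.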

Here is how the clock interrupt handler is registered:

static struct irqaction irq0  = { timer_interrupt, SA_INTERRUPT, 0, "timer", NULL, NULL};

void __init time_init(void)
{
    ...
    setup_irq(0, &irq0);
}

The clock interrupt handler uses IRQ number 0, its handler function is timer_interrupt(), and it does not support sharing the IRQ line (its flags field does not have SA_SHIRQ set).

Processing interrupt requests

When an interrupt occurs, the interrupt controller sends a signal to the CPU, which suspends what it is executing and runs the interrupt handling path instead. That path first saves the register values on the stack and then calls do_IRQ() for further processing. The code of do_IRQ() is as follows:

asmlinkage unsigned int do_IRQ(struct pt_regs regs)
{
    int irq = regs.orig_eax & 0xff; /* Get IRQ number   */
    int cpu = smp_processor_id();
    irq_desc_t *desc = irq_desc + irq;
    struct irqaction * action;
    unsigned int status;

    kstat.irqs[cpu][irq]++;
    spin_lock(&desc->lock);
    desc->handler->ack(irq);

    status = desc->status & ~(IRQ_REPLAY | IRQ_WAITING);
    status |= IRQ_PENDING; /* we _want_ to handle it */

    action = NULL;
    if (!(status & (IRQ_DISABLED | IRQ_INPROGRESS))) { // the IRQ is enabled and not already being handled
        action = desc->action;    // get the action list
        status &= ~IRQ_PENDING;   // clear IRQ_PENDING, which records that another interrupt arrived while the IRQ was being handled
        status |= IRQ_INPROGRESS; // set IRQ_INPROGRESS to show that the IRQ is being handled
    }
    desc->status = status;

    if (!action)  // nothing to run here (IRQ disabled, already being handled, or no handler): exit immediately
        goto out;

    for (;;) {
        spin_unlock(&desc->lock);
        handle_IRQ_event(irq, &regs, action); //  Processing IRQ requests
        spin_lock(&desc->lock);
        
        if (!(desc->status & IRQ_PENDING)) // stop unless the same interrupt arrived again while it was being handled
            break;
        desc->status &= ~IRQ_PENDING;
    }
    desc->status &= ~IRQ_INPROGRESS;
out:

    desc->handler->end(irq);
    spin_unlock(&desc->lock);

    if (softirq_active(cpu) & softirq_mask(cpu))
        do_softirq(); // run the bottom half of interrupt handling
    return 1;
}

do_IRQ() first uses the IRQ number to find the corresponding irq_desc_t structure. The same interrupt may arrive several times, so the function checks whether the IRQ is already being handled (whether the IRQ_INPROGRESS flag is set in the status field of irq_desc_t). If it is not, do_IRQ() takes the action list and calls handle_IRQ_event() to run the handler functions on that list.

If the same interrupt arrives again while it is being handled (the IRQ_PENDING flag is set in the status field of irq_desc_t), do_IRQ() handles it again before leaving the loop. Once interrupt handling is finished, it calls do_softirq() to run the bottom half of the interrupt (described below).
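
The interplay of the two flags is the subtle part of do_IRQ(), so here is a user-space sketch of just that state machine. The sim_ names and flag values are hypothetical; locking, acknowledging the controller and the handlers themselves are omitted, and the runs counter exists only to make the behavior observable.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_IRQ_INPROGRESS 0x1   /* stand-in flag values for this sketch */
#define SIM_IRQ_DISABLED   0x2
#define SIM_IRQ_PENDING    0x4

struct sim_irq_state {
    unsigned int status;
    unsigned int runs;           /* how many handler passes were made */
};

/*
 * Entry half of do_IRQ(): mark the interrupt pending, then claim it if
 * nobody else is handling it. Returns 1 when the caller must run the
 * handlers, 0 when it should just leave (the owner will see PENDING).
 */
int sim_irq_arrive(struct sim_irq_state *s)
{
    unsigned int status;

    assert(s != NULL);
    status = s->status | SIM_IRQ_PENDING;
    if (!(status & (SIM_IRQ_DISABLED | SIM_IRQ_INPROGRESS))) {
        status &= ~SIM_IRQ_PENDING;
        status |= SIM_IRQ_INPROGRESS;
        s->status = status;
        return 1;
    }
    s->status = status;
    return 0;
}

/*
 * Tail of the for (;;) loop: called after one pass over the handlers.
 * Returns 1 if the interrupt arrived again meanwhile (another pass is
 * needed), 0 when handling is complete.
 */
int sim_irq_pass_done(struct sim_irq_state *s)
{
    assert(s != NULL);
    s->runs++;
    if (s->status & SIM_IRQ_PENDING) {
        s->status &= ~SIM_IRQ_PENDING;
        return 1;
    }
    s->status &= ~SIM_IRQ_INPROGRESS;
    return 0;
}
```

If a second arrival is reported while the first is still in progress, the second caller returns immediately and the first caller makes one extra pass, so the interrupt is not lost and the handlers never run concurrently for the same line.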

Here is the implementation of handle_IRQ_event():

int handle_IRQ_event(unsigned int irq, struct pt_regs * regs, struct irqaction * action)
{
    int status;
    int cpu = smp_processor_id();

    irq_enter(cpu, irq);

    status = 1; /* Force the "do bottom halves" bit */

    if (!(action->flags & SA_INTERRUPT)) // the handler may run with interrupts enabled, so turn them back on
        __sti();

    do {
        status |= action->flags;
        action->handler(irq, action->dev_id, regs);
        action = action->next;
    } while (action);
    if (status & SA_SAMPLE_RANDOM)
        add_interrupt_randomness(irq);
    __cli();

    irq_exit(cpu, irq);

    return status;
}

handle_IRQ_event() is very simple: it walks the action list and calls each handler function; for the clock interrupt, that means calling timer_interrupt(). Note that if the handler is allowed to run with interrupts enabled (SA_INTERRUPT is not set), the function re-enables interrupts first, because the CPU disables them when it accepts the interrupt signal.
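
The traversal itself can be tried out in user space. This sketch keeps only the loop from handle_IRQ_event(): the sim_ names are hypothetical, the handler signature is simplified (no pt_regs), and enabling or disabling interrupts is not modeled.

```c
#include <assert.h>
#include <stddef.h>

struct sim_action {
    void (*handler)(int irq, void *dev_id);
    unsigned long flags;
    void *dev_id;
    struct sim_action *next;
};

/*
 * Call every handler chained on one IRQ line, in registration order.
 * Returns the OR of all flags seen, starting from 1 as the original does.
 */
unsigned long sim_handle_irq_event(int irq, struct sim_action *action)
{
    unsigned long status = 1;

    assert(action != NULL);
    do {
        status |= action->flags;
        action->handler(irq, action->dev_id);
        action = action->next;
    } while (action);
    return status;
}

/* Example handler: dev_id points at a per-device hit counter. */
void sim_count_handler(int irq, void *dev_id)
{
    (void)irq;
    ++*(unsigned int *)dev_id;
}
```

Every handler on a shared line is called for every interrupt; in this sketch dev_id is what gives each handler its own per-device state.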

Interrupt handling - bottom half (soft interrupt)

Because interrupt handlers generally run with interrupts disabled, they must not take too long; otherwise subsequent interrupts cannot be handled in time. For this reason, Linux splits interrupt handling into two parts, the top half and the bottom half (the upper and lower halves). The top half was covered above; this section covers the implementation of the bottom half.

In general, the top half does only basic work (such as copying data from the network card into a buffer), then marks the bottom half as needing to run, after which do_softirq() is called.

softirq mechanism

The bottom half is implemented by the softirq (soft interrupt) mechanism. The Linux kernel has an array named softirq_vec, declared as follows:

static struct softirq_action softirq_vec[32];

Its elements are of type struct softirq_action, defined as follows:

struct softirq_action
{
    void    (*action)(struct softirq_action *);
    void    *data;
};

The softirq_vec array is the core of the softirq mechanism; each of its elements represents one softirq. However, the kernel code shown here defines only four softirqs:

enum
{
    HI_SOFTIRQ=0,
    NET_TX_SOFTIRQ,
    NET_RX_SOFTIRQ,
    TASKLET_SOFTIRQ
};

HI_SOFTIRQ is the high-priority tasklet softirq and TASKLET_SOFTIRQ is the ordinary tasklet softirq; a tasklet is a task queue built on the softirq mechanism (described below). NET_TX_SOFTIRQ and NET_RX_SOFTIRQ are softirqs specific to the networking subsystem (not covered here).

Register softirq handler

A softirq handler is registered with the open_softirq() function, whose code is as follows:

void open_softirq(int nr, void (*action)(struct softirq_action*), void *data)
{
    unsigned long flags;
    int i;

    spin_lock_irqsave(&softirq_mask_lock, flags);
    softirq_vec[nr].data = data;
    softirq_vec[nr].action = action;

    for (i=0; i<NR_CPUS; i++)
        softirq_mask(i) |= (1<<nr);
    spin_unlock_irqrestore(&softirq_mask_lock, flags);
}

The main job of open_softirq() is to store the softirq handler in the softirq_vec array and to set the corresponding bit in every CPU's softirq mask.

During system initialization, Linux registers handlers for two softirqs, TASKLET_SOFTIRQ and HI_SOFTIRQ:

void __init softirq_init()
{
    ...
    open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL);
    open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL);
}

Processing softirq

Softirqs are processed by the do_softirq() function, whose code is as follows:

asmlinkage void do_softirq()
{
    int cpu = smp_processor_id();
    __u32 active, mask;

    if (in_interrupt())
        return;

    local_bh_disable();

    local_irq_disable();
    mask = softirq_mask(cpu);
    active = softirq_active(cpu) & mask;

    if (active) {
        struct softirq_action *h;

restart:
        softirq_active(cpu) &= ~active;

        local_irq_enable();

        h = softirq_vec;
        mask &= ~active;

        do {
            if (active & 1)
                h->action(h);
            h++;
            active >>= 1;
        } while (active);

        local_irq_disable();

        active = softirq_active(cpu);
        if ((active &= mask) != 0)
            goto retry;
    }

    local_bh_enable();

    return;

retry:
    goto restart;
}

As mentioned earlier, the softirq_vec array has 32 elements, each corresponding to one type of softirq. How does Linux know which softirqs need to run? Each CPU has a variable of type irq_cpustat_t, which is defined as follows:

typedef struct {
    unsigned int __softirq_active;
    unsigned int __softirq_mask;
    ...
} irq_cpustat_t;

The __softirq_active field records which softirqs have been raised (an unsigned int has 32 bits, one per softirq), and the __softirq_mask field records which softirqs are enabled (open_softirq() sets the corresponding bit). Linux learns which softirqs need to run from __softirq_active: raising a softirq simply sets its bit to 1.

So do_softirq() first reads softirq_mask(cpu) to get the softirq mask of the current CPU, computes softirq_active(cpu) & mask to get the softirqs that should run, and then tests each bit of that value in turn to decide whether to run the softirq of that type.
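
The bit manipulation is compact, so a user-space sketch may help. It models one CPU only and all names are hypothetical (the sim_ prefix marks them); interrupt masking, the in_interrupt() check and the handler argument are left out, and the two counters exist only so that the effect can be observed.

```c
#include <assert.h>

#define SIM_NR_SOFTIRQS 32

typedef void (*sim_softirq_fn)(void);

struct sim_softirq_cpu {
    unsigned int active;                  /* raised softirqs, one bit each */
    unsigned int mask;                    /* softirqs that have a handler  */
    sim_softirq_fn vec[SIM_NR_SOFTIRQS];  /* plays the role of softirq_vec */
};

/* Register a handler and enable its bit in the mask. */
void sim_open_softirq(struct sim_softirq_cpu *c, unsigned int nr, sim_softirq_fn fn)
{
    assert(c != 0 && nr < SIM_NR_SOFTIRQS && fn != 0);
    c->vec[nr] = fn;
    c->mask |= 1u << nr;
}

/* Mark softirq nr as needing to run. */
void sim_raise_softirq(struct sim_softirq_cpu *c, unsigned int nr)
{
    assert(c != 0 && nr < SIM_NR_SOFTIRQS);
    c->active |= 1u << nr;
}

/*
 * Run every raised and enabled softirq, lowest bit first. As in the
 * original, a softirq that has run is removed from the local mask, so it
 * runs at most once per call even if a handler raises it again.
 * Returns the number of handlers that ran.
 */
unsigned int sim_do_softirq(struct sim_softirq_cpu *c)
{
    unsigned int mask, active, nr, ran = 0;

    assert(c != 0);
    mask = c->mask;
    active = c->active & mask;
    while (active) {
        c->active &= ~active;
        mask &= ~active;
        for (nr = 0; active; active >>= 1, nr++) {
            if (active & 1) {
                c->vec[nr]();
                ran++;
            }
        }
        active = c->active & mask;        /* anything raised by the handlers? */
    }
    return ran;
}

/* Example handlers that only count how often they ran. */
unsigned int sim_hi_runs, sim_tasklet_runs;
void sim_hi_handler(void) { sim_hi_runs++; }
void sim_tasklet_handler(void) { sim_tasklet_runs++; }
```

Because the scan starts at bit 0, HI_SOFTIRQ (value 0 in the enum above) is always served before TASKLET_SOFTIRQ (value 3), which is where the priority of high-priority tasklets comes from.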

tasklet mechanism

As mentioned earlier, the tasklet mechanism is built on the softirq mechanism: it is essentially a task queue that is run from a softirq. The Linux kernel has two kinds of tasklets, high-priority and ordinary. Their implementations are almost identical; the only difference is execution order, since high-priority tasklets run before ordinary ones.

Tasklets are kept in a queue whose head is a tasklet_head structure, and each CPU has its own queue. The definitions are:

struct tasklet_head
{
    struct tasklet_struct *list;
};

struct tasklet_struct
{
    struct tasklet_struct *next;
    unsigned long state;
    atomic_t count;
    void (*func)(unsigned long);
    unsigned long data;
};

As the definitions show, tasklet_head is the head of a queue of tasklet_struct structures, and the func field of tasklet_struct points to the function the tasklet should run. Linux defines two tasklet queues, tasklet_vec and tasklet_hi_vec:

struct tasklet_head tasklet_vec[NR_CPUS];
struct tasklet_head tasklet_hi_vec[NR_CPUS];

Both tasklet_vec and tasklet_hi_vec are arrays with one element per CPU, so each CPU has its own high-priority tasklet queue and its own ordinary tasklet queue.

Scheduling tasklets

A high-priority tasklet is scheduled with tasklet_hi_schedule(), and an ordinary tasklet with tasklet_schedule(). The two functions are almost identical, so we analyze only one of them:

static inline void tasklet_hi_schedule(struct tasklet_struct *t)
{
    if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
        int cpu = smp_processor_id();
        unsigned long flags;

        local_irq_save(flags);
        t->next = tasklet_hi_vec[cpu].list;
        tasklet_hi_vec[cpu].list = t;
        __cpu_raise_softirq(cpu, HI_SOFTIRQ);
        local_irq_restore(flags);
    }
}

The function takes a pointer to the tasklet_struct to run. tasklet_hi_schedule() first checks whether the tasklet is already queued. If not, it adds the tasklet to the tasklet_hi_vec queue and calls __cpu_raise_softirq(cpu, HI_SOFTIRQ) to tell the softirq mechanism that a softirq of type HI_SOFTIRQ needs to run. Here is the implementation of __cpu_raise_softirq():

static inline void __cpu_raise_softirq(int cpu, int nr)
{
    softirq_active(cpu) |= (1<<nr);
}

As you can see, __cpu_raise_softirq() sets bit nr of the __softirq_active field of irq_cpustat_t to 1. For tasklet_hi_schedule(), that is the HI_SOFTIRQ bit (bit 0).

As mentioned earlier, Linux registers two softirqs during initialization, TASKLET_SOFTIRQ and HI_SOFTIRQ:

void __init softirq_init()
{
    ...
    open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL);
    open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL);
}

So when the HI_SOFTIRQ bit (bit 0) of the __softirq_active field of irq_cpustat_t is set to 1, the softirq mechanism runs tasklet_hi_action(). Here is its implementation:

static void tasklet_hi_action(struct softirq_action *a)
{
    int cpu = smp_processor_id();
    struct tasklet_struct *list;

    local_irq_disable();
    list = tasklet_hi_vec[cpu].list;
    tasklet_hi_vec[cpu].list = NULL;
    local_irq_enable();

    while (list != NULL) {
        struct tasklet_struct *t = list;

        list = list->next;

        if (tasklet_trylock(t)) {
            if (atomic_read(&t->count) == 0) {
                clear_bit(TASKLET_STATE_SCHED, &t->state);

                t->func(t->data);  //  Call tasklet handler
                tasklet_unlock(t);
                continue;
            }
            tasklet_unlock(t);
        }
        ...
    }
}

tasklet_hi_action() is very simple: it walks the tasklet_hi_vec queue of the current CPU and runs the handler function of each tasklet in it.
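
Scheduling and running can be combined into one small user-space sketch. It models a single queue on one CPU; the sim_ names are hypothetical, a plain int replaces the TASKLET_STATE_SCHED bit, and the locking, the count field and interrupt masking of the original are left out.

```c
#include <assert.h>
#include <stddef.h>

struct sim_tasklet {
    struct sim_tasklet *next;
    int scheduled;                 /* stand-in for the TASKLET_STATE_SCHED bit */
    void (*func)(unsigned long);
    unsigned long data;
};

struct sim_tasklet_head {
    struct sim_tasklet *list;
};

/*
 * Queue a tasklet at the head of the list unless it is already queued.
 * Returns 1 if it was queued, 0 if it was already pending.
 */
int sim_tasklet_schedule(struct sim_tasklet_head *head, struct sim_tasklet *t)
{
    assert(head != NULL && t != NULL);
    if (t->scheduled)
        return 0;
    t->scheduled = 1;
    t->next = head->list;
    head->list = t;
    return 1;
}

/*
 * Detach the whole list, then run each tasklet once. The scheduled mark
 * is cleared before the function runs, so a tasklet may queue itself
 * again from inside its own function. Returns the number of tasklets run.
 */
unsigned int sim_tasklet_action(struct sim_tasklet_head *head)
{
    struct sim_tasklet *list;
    unsigned int ran = 0;

    assert(head != NULL);
    list = head->list;
    head->list = NULL;

    while (list != NULL) {
        struct sim_tasklet *t = list;

        list = list->next;
        t->scheduled = 0;
        t->func(t->data);
        ran++;
    }
    return ran;
}

/* Example tasklet function: adds its data argument to a running total. */
unsigned long sim_total;
void sim_add(unsigned long data) { sim_total += data; }
```

Scheduling the same tasklet twice before it runs has no extra effect, which matches the test_and_set_bit() check at the top of tasklet_hi_schedule().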

Original link:

https://mp.weixin.qq.com/s?__biz=MzA3NzYzODg1OA==&mid=2648465955&idx=2&sn=5a3c7341f897683d602ba10356d71df3&chksm=87663f86b011b690ae16b06877fee46f8d05d7eed261f4d1205e4438c96309d95c71e2c41483#rd
