A little curiosity about Linux: the Linux startup process

Posted by teddirez on Tue, 04 Jan 2022 10:42:47 +0100

I have always been curious: how does an operating system actually work? We know how to write ordinary programs and make them run, but that is all high-level work. After a while it starts to feel like a castle in the air: someone has laid a great deal of groundwork for you, yet you know nothing about it.

 

1. Confusion about the operating system

Of course, it is not that I know nothing at all. There are many books that teach how an operating system works and the principles behind it. But I always have the feeling that something essential is still blocked, as if my Ren and Du meridians had never been opened. I seem to know a little, yet I cannot really say what each piece is, where it is used, or why.

I've read a series of articles about writing your own operating system, and it is great: https://wiki.0xffffff.org/  It walks through the whole journey of a learner, covering hardware loading, the software takeover, the operating system, memory, interrupts, drivers, threads and so on. It is a rare piece of writing for clearing up doubts, and answers to many puzzles about operating systems can be found there. Of course, you have to draw the conclusions yourself.

However, I still have the feeling that no matter how many principles I read, they remain hollow. The demo above touches on everything and seems to address every question, but it is only a demo after all. Is a real system really like this? At the very least, it cannot be that simple. This keeps nagging at me.

Later, I came across an article about epoll: https://bbs.gameres.com/thread_842984_1_1.html   . It explains the principles of the whole I/O path very thoroughly. I started from there and shared it with my team. I don't know how they felt, but to me it all became very clear. I/O is only one small point within the operating system. Personally, I think you may only need to understand the overall framework once, while the small points have to be pondered over and over. We should look for answers with specific questions in mind, and those questions are usually about small points.

 

2. Dare you bite the hard bone of the operating system?

To tell you the truth, I dare not. The first reason is that it is too complex and vast, which is a difficulty in itself. The second is that the languages alone may be a barrier: you need to be at least reasonably good at assembly and C, and I only know them superficially. As the saying goes, how dare you step into prosperity when you have been poor all your life, and how dare you keep a beauty waiting when you have nothing to your name.

So do I just muddle along? There are always questions in my mind that I don't know how to resolve. First, there are few people you can ask. Second, you don't know how to phrase the question. Third, would you even understand if someone told you? (Like being lectured.)

So, let's find out for ourselves. There are plenty of scattered answers on the Internet; they seem understandable, yet they never quite make things clear.

The best documentation is the official documentation. The best answers are in the code. So why not go and have a look?

 

3. Linux kernel source code addresses

We usually browse source code like this on GitHub. But from within China, GitHub's speed is really not flattering.

Gitee address: https://gitee.com/mirrors/linux

GitHub address: https://github.com/torvalds/linux

As for reading tools, Sublime Text is fine if you are just browsing casually. If you want something more capable, Eclipse also works, although you may have to sort out some environment setup.

 

4. Linux framework

For tips on reading the source, see this article: https://www.cnblogs.com/fanzhidongyzby/archive/2013/03/20/2970624.html

The top-level directories of the source tree are briefly described as follows:

arch -- architecture-specific code. Each supported architecture has its own directory, such as x86, arm and alpha. Each architecture subdirectory contains several main subdirectories, such as kernel, mm and lib.
Documentation -- documentation for the kernel.
drivers -- device driver code. Each type of device has its own subdirectory, such as char, block and net.
fs -- file system implementations. Each supported file system has its own subdirectory, such as ext2, ext4, fat and proc.
include -- kernel header files. Older kernels kept per-architecture headers here in subdirectories such as asm-x86, asm-arm and asm-alpha; newer kernels keep them under arch/*/include.
init -- kernel initialization code. Contains main.c, which defines the start_kernel function.
ipc -- inter-process communication code.
kernel -- core kernel management code.
lib -- architecture-independent kernel library code. Architecture-specific library code lives in the arch/*/lib directories.
mm -- memory management code.
net -- the kernel's networking code.
samples -- examples of using kernel interfaces, somewhat similar to unit tests.
scripts -- scripts used when configuring and building the kernel.
security -- security-related implementations.
tools -- assorted auxiliary tools.

Among them, the Documentation directory contains no source code, but it is very important for understanding the system (after all, most of us can only follow the plain meaning of written text).

 

5. Linux x86 startup process

Taking the x86 implementation as an example, the startup process is roughly as follows: it starts at header.S and ends at main.c (which is where we feel we can begin to understand things).

    /arch/x86/boot/header.S
        -> calll main    ->    /arch/x86/boot/main.c
        -> go_to_protected_mode()    ->    /arch/x86/boot/pmjump.S
        -> jmpl    *%eax    ->    /arch/x86/kernel/head_32.S
        -> .long i386_start_kernel    ->    /arch/x86/kernel/head32.c
        -> start_kernel()    ->    /init/main.c    (C Language entry)

Detailed code examples are as follows:

// /arch/x86/boot/header.S
#include <asm/segment.h>
#include <asm/boot.h>
#include <asm/page_types.h>
#include <asm/setup.h>
#include <asm/bootparam.h>
#include "boot.h"
#include "voffset.h"
#include "zoffset.h"
...
6:

# Check signature at end of setup
    cmpl    $0x5a5aaa55, setup_sig
    jne    setup_bad

# Zero the bss
    movw    $__bss_start, %di
    movw    $_end+3, %cx
    xorl    %eax, %eax
    subw    %di, %cx
    shrw    $2, %cx
    rep; stosl

# Jump to C code (should not return)
    calll    main
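
To convince myself of what the "Zero the bss" arithmetic above does, here is a minimal user-space C sketch. It is my own illustration, not kernel code: the names bss_dword_count and zero_bss are made up, and start/end merely stand in for __bss_start and _end. The point is that adding 3 before shifting right by 2 rounds the byte length up to a whole number of 4-byte words, which is how many words "rep; stosl" then writes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Number of 32-bit words written by "rep; stosl", mirroring:
 *   movw $_end+3, %cx ; subw %di, %cx ; shrw $2, %cx
 * start/end are hypothetical stand-ins for __bss_start/_end
 * (16-bit offsets, as in the real-mode setup code).
 */
static uint16_t bss_dword_count(uint16_t start, uint16_t end)
{
    assert(end >= start);
    /* +3 rounds the byte length up to the next multiple of 4. */
    return (uint16_t)((uint16_t)(end + 3 - start) >> 2);
}

/*
 * Zero the region the way the loop does: whole 4-byte words only.
 * The caller must leave room for up to 3 bytes of round-up past `end`.
 */
static void zero_bss(uint8_t *image, uint16_t start, uint16_t end)
{
    memset(image + start, 0, (size_t)bss_dword_count(start, end) * 4);
}
```

For example, a 5-byte region is cleared as two words (8 bytes), so the real linker script has to make sure nothing important sits in the padding after _end.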


// /arch/x86/boot/main.c
void main(void)
{
    /* First, copy the boot header into the "zeropage" */
    copy_boot_params();

    /* Initialize the early-boot console */
    console_init();
    if (cmdline_find_option_bool("debug"))
        puts("early console in setup code\n");

    /* End of heap check */
    init_heap();

    /* Make sure we have all the proper CPU support */
    if (validate_cpu()) {
        puts("Unable to boot - please use a kernel appropriate "
             "for your CPU.\n");
        die();
    }

    /* Tell the BIOS what CPU mode we intend to run in. */
    set_bios_mode();

    /* Detect memory layout */
    detect_memory();

    /* Set keyboard repeat rate (why?) and query the lock flags */
    keyboard_init();

    /* Query Intel SpeedStep (IST) information */
    query_ist();

    /* Query APM information */
#if defined(CONFIG_APM) || defined(CONFIG_APM_MODULE)
    query_apm_bios();
#endif

    /* Query EDD information */
#if defined(CONFIG_EDD) || defined(CONFIG_EDD_MODULE)
    query_edd();
#endif

    /* Set the video mode */
    set_video();

    /* Do the last things and invoke protected mode */
    go_to_protected_mode();
}
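
The check cmdline_find_option_bool("debug") in main() asks whether a flag is present on the kernel command line. The sketch below is a simplified user-space approximation of that idea, not the kernel's implementation: find_option_bool is a hypothetical name, and I assume an option counts only when it appears as a whole whitespace-delimited word.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if `option` appears in `cmdline`
 * as a complete word, so "debug" matches "quiet debug" but not "debugfs".
 */
static bool find_option_bool(const char *cmdline, const char *option)
{
    assert(cmdline != NULL && option != NULL);

    size_t len = strlen(option);
    const char *p = cmdline;

    if (len == 0)
        return false;

    while (*p) {
        /* Skip the whitespace between words. */
        while (*p && isspace((unsigned char)*p))
            p++;

        /* Measure the current word. */
        const char *word = p;
        while (*p && !isspace((unsigned char)*p))
            p++;

        /* Match only if the whole word equals the option. */
        if ((size_t)(p - word) == len && strncmp(word, option, len) == 0)
            return true;
    }
    return false;
}
```

Called as find_option_bool("root=/dev/sda1 debug quiet", "debug"), it reports the flag as present, which is the situation in which main() prints the "early console in setup code" message.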


// /arch/x86/boot/pmjump.S
/*
 * The actual transition into protected mode
 */

#include <asm/boot.h>
#include <asm/processor-flags.h>
#include <asm/segment.h>
#include <linux/linkage.h>

    .text
    .code16
...
2:    .long    in_pm32            # offset
    .word    __BOOT_CS        # segment
ENDPROC(protected_mode_jump)

    .code32
    .section ".text32","ax"
GLOBAL(in_pm32)
    # Set up data segments for flat 32-bit mode
    movl    %ecx, %ds
    movl    %ecx, %es
    movl    %ecx, %fs
    movl    %ecx, %gs
    movl    %ecx, %ss
    # The 32-bit code sets up its own stack, but this way we do have
    # a valid stack if some debugging hack wants to use it.
    addl    %ebx, %esp

    # Set up TR to make Intel VT happy
    ltr    %di

    # Clear registers to allow for future extensions to the
    # 32-bit boot protocol
    xorl    %ecx, %ecx
    xorl    %edx, %edx
    xorl    %ebx, %ebx
    xorl    %ebp, %ebp
    xorl    %edi, %edi

    # Set up LDTR to make Intel VT happy
    lldt    %cx

    jmpl    *%eax            # Jump to the 32-bit entrypoint
ENDPROC(in_pm32)
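
The far-jump operand in pmjump.S is an offset (in_pm32) followed by a 16-bit segment selector (__BOOT_CS). In protected mode a selector is not an address but an index into a descriptor table. The C sketch below shows the standard x86 selector layout; the helper names are hypothetical, and the boot GDT index of 2 is an assumption on my part about what segment.h defines, so check the header in your own tree.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: the boot code segment is GDT entry 2 (see asm/segment.h). */
#define ASSUMED_GDT_ENTRY_BOOT_CS 2u

/*
 * x86 segment selector layout:
 *   bits 15..3  descriptor index
 *   bit  2      table indicator (0 = GDT, 1 = LDT)
 *   bits 1..0   requested privilege level (RPL)
 */
static uint16_t selector_make(unsigned index, unsigned use_ldt, unsigned rpl)
{
    assert(index < 8192u && use_ldt <= 1u && rpl <= 3u);
    return (uint16_t)((index << 3) | (use_ldt << 2) | rpl);
}

/* Extract the descriptor index from a selector. */
static unsigned selector_index(uint16_t sel)
{
    return sel >> 3;
}

/* Extract the requested privilege level from a selector. */
static unsigned selector_rpl(uint16_t sel)
{
    return sel & 3u;
}
```

Under that assumption, selector_make(ASSUMED_GDT_ENTRY_BOOT_CS, 0, 0) gives 0x10: GDT entry 2 at ring 0, i.e. the index multiplied by 8.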


// /arch/x86/kernel/head_32.S
.text
#include <linux/threads.h>
#include <linux/init.h>
#include <linux/linkage.h>
#include <asm/segment.h>
#include <asm/page_types.h>
#include <asm/pgtable_types.h>
#include <asm/cache.h>
#include <asm/thread_info.h>
#include <asm/asm-offsets.h>
#include <asm/setup.h>
#include <asm/processor-flags.h>
#include <asm/msr-index.h>
#include <asm/cpufeatures.h>
#include <asm/percpu.h>
#include <asm/nops.h>
#include <asm/bootparam.h>
#include <asm/export.h>
#include <asm/pgtable_32.h>
...
/*
 * 32-bit kernel entrypoint; only used by the boot CPU.  On entry,
 * %esi points to the real-mode code as a 32-bit pointer.
 * CS and DS must be 4 GB flat segments, but we don't depend on
 * any particular GDT layout, because we load our own as soon as we
 * can.
 */
__HEAD
ENTRY(startup_32)
...
hlt_loop:
    hlt
    jmp hlt_loop
ENDPROC(early_ignore_irq)

__INITDATA
    .align 4
GLOBAL(early_recursion_flag)
    .long 0

__REFDATA
    .align 4
ENTRY(initial_code)
    .long i386_start_kernel
ENTRY(setup_once_ref)
    .long setup_once
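
The line ".long i386_start_kernel" under initial_code does not call anything by itself; it only stores the address of the function in a data word, and the startup code later transfers control through that word. In C terms this is a function pointer. A minimal sketch of the idea follows, with made-up names (fake_i386_start_kernel, jump_to_initial_code); how exactly head_32.S performs the indirect transfer is elided in the listing above, so I do not reproduce it here.

```c
#include <assert.h>
#include <stddef.h>

/* Set by the stand-in entry function so we can observe the call. */
static int kernel_started;

/* Hypothetical stand-in for i386_start_kernel(). */
static void fake_i386_start_kernel(void)
{
    kernel_started = 1;
}

/*
 * Counterpart of:
 *   ENTRY(initial_code)
 *       .long i386_start_kernel
 * A data word that holds the address of the code to run next.
 */
static void (*initial_code)(void) = fake_i386_start_kernel;

/* Counterpart of an indirect transfer through initial_code. */
static void jump_to_initial_code(void)
{
    assert(initial_code != NULL);
    initial_code();
}
```

One plausible benefit of the indirection is that the target can be changed by overwriting the word, without touching the assembly that performs the jump.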
    
// /arch/x86/kernel/head32.c
#include <linux/init.h>
#include <linux/start_kernel.h>
#include <linux/mm.h>
#include <linux/memblock.h>

#include <asm/desc.h>
#include <asm/setup.h>
#include <asm/sections.h>
#include <asm/e820/api.h>
#include <asm/page.h>
#include <asm/apic.h>
#include <asm/io_apic.h>
#include <asm/bios_ebda.h>
#include <asm/tlbflush.h>
#include <asm/bootparam_utils.h>
...
asmlinkage __visible void __init i386_start_kernel(void)
{
    /* Make sure IDT is set up before any exception happens */
    idt_setup_early_handler();

    cr4_init_shadow();

    sanitize_boot_params(&boot_params);

    x86_early_init_platform_quirks();

    /* Call the subarch specific early setup function */
    switch (boot_params.hdr.hardware_subarch) {
    case X86_SUBARCH_INTEL_MID:
        x86_intel_mid_early_setup();
        break;
    case X86_SUBARCH_CE4100:
        x86_ce4100_early_setup();
        break;
    default:
        i386_default_early_setup();
        break;
    }

    start_kernel();
}


// /init/main.c
#define DEBUG        /* Enable initcall_debug */

#include <linux/types.h>
#include <linux/extable.h>
#include <linux/module.h>
#include <linux/proc_fs.h>
#include <linux/binfmts.h>
#include <linux/kernel.h>
#include <linux/syscalls.h>
#include <linux/stackprotector.h>
#include <linux/string.h>
#include <linux/ctype.h>
#include <linux/delay.h>
#include <linux/ioport.h>
#include <linux/init.h>
#include <linux/initrd.h>
#include <linux/bootmem.h>
#include <linux/acpi.h>
#include <linux/console.h>
#include <linux/nmi.h>
#include <linux/percpu.h>
#include <linux/kmod.h>
#include <linux/vmalloc.h>
#include <linux/kernel_stat.h>
#include <linux/start_kernel.h>
#include <linux/security.h>
#include <linux/smp.h>
#include <linux/profile.h>
#include <linux/rcupdate.h>
#include <linux/moduleparam.h>
#include <linux/kallsyms.h>
#include <linux/writeback.h>
#include <linux/cpu.h>
#include <linux/cpuset.h>
#include <linux/cgroup.h>
#include <linux/efi.h>
#include <linux/tick.h>
#include <linux/sched/isolation.h>
#include <linux/interrupt.h>
#include <linux/taskstats_kern.h>
#include <linux/delayacct.h>
#include <linux/unistd.h>
#include <linux/utsname.h>
#include <linux/rmap.h>
#include <linux/mempolicy.h>
#include <linux/key.h>
#include <linux/buffer_head.h>
#include <linux/page_ext.h>
#include <linux/debug_locks.h>
#include <linux/debugobjects.h>
#include <linux/lockdep.h>
#include <linux/kmemleak.h>
#include <linux/pid_namespace.h>
#include <linux/device.h>
#include <linux/kthread.h>
#include <linux/sched.h>
#include <linux/sched/init.h>
#include <linux/signal.h>
#include <linux/idr.h>
#include <linux/kgdb.h>
#include <linux/ftrace.h>
#include <linux/async.h>
#include <linux/sfi.h>
#include <linux/shmem_fs.h>
#include <linux/slab.h>
#include <linux/perf_event.h>
#include <linux/ptrace.h>
#include <linux/pti.h>
#include <linux/blkdev.h>
#include <linux/elevator.h>
#include <linux/sched_clock.h>
#include <linux/sched/task.h>
#include <linux/sched/task_stack.h>
#include <linux/context_tracking.h>
#include <linux/random.h>
#include <linux/list.h>
#include <linux/integrity.h>
#include <linux/proc_ns.h>
#include <linux/io.h>
#include <linux/cache.h>
#include <linux/rodata_test.h>
#include <linux/jump_label.h>
#include <linux/mem_encrypt.h>

#include <asm/io.h>
#include <asm/bugs.h>
#include <asm/setup.h>
#include <asm/sections.h>
#include <asm/cacheflush.h>
// Platform independent startup code entry
asmlinkage __visible void __init start_kernel(void)
{
    char *command_line;
    char *after_dashes;

    set_task_stack_end_magic(&init_task);
    smp_setup_processor_id();
    debug_objects_early_init();

    cgroup_init_early();

    local_irq_disable();
    early_boot_irqs_disabled = true;

    /*
     * Interrupts are still disabled. Do necessary setups, then
     * enable them.
     */
    boot_cpu_init();
    page_address_init();
    pr_notice("%s", linux_banner);
    setup_arch(&command_line);
    /*
     * Set up the initial canary and entropy after arch
     * and after adding latent and command line entropy.
     */
    add_latent_entropy();
    add_device_randomness(command_line, strlen(command_line));
    boot_init_stack_canary();
    mm_init_cpumask(&init_mm);
    setup_command_line(command_line);
    setup_nr_cpu_ids();
    setup_per_cpu_areas();
    smp_prepare_boot_cpu();    /* arch-specific boot-cpu hooks */
    boot_cpu_hotplug_init();

    build_all_zonelists(NULL);
    page_alloc_init();

    pr_notice("Kernel command line: %s\n", boot_command_line);
    parse_early_param();
    after_dashes = parse_args("Booting kernel",
                  static_command_line, __start___param,
                  __stop___param - __start___param,
                  -1, -1, NULL, &unknown_bootoption);
    if (!IS_ERR_OR_NULL(after_dashes))
        parse_args("Setting init args", after_dashes, NULL, 0, -1, -1,
               NULL, set_init_arg);

    jump_label_init();

    /*
     * These use large bootmem allocations and must precede
     * kmem_cache_init()
     */
    setup_log_buf(0);
    vfs_caches_init_early();
    sort_main_extable();
    trap_init();
    mm_init();

    ftrace_init();

    /* trace_printk can be enabled here */
    early_trace_init();

    /*
     * Set up the scheduler prior starting any interrupts (such as the
     * timer interrupt). Full topology setup happens at smp_init()
     * time - but meanwhile we still have a functioning scheduler.
     */
    sched_init();
    /*
     * Disable preemption - early bootup scheduling is extremely
     * fragile until we cpu_idle() for the first time.
     */
    preempt_disable();
    if (WARN(!irqs_disabled(),
         "Interrupts were enabled *very* early, fixing it\n"))
        local_irq_disable();
    radix_tree_init();

    /*
     * Set up housekeeping before setting up workqueues to allow the unbound
     * workqueue to take non-housekeeping into account.
     */
    housekeeping_init();

    /*
     * Allow workqueue creation and work item queueing/cancelling
     * early.  Work item execution depends on kthreads and starts after
     * workqueue_init().
     */
    workqueue_init_early();

    rcu_init();

    /* Trace events are available after this */
    trace_init();

    if (initcall_debug)
        initcall_debug_enable();

    context_tracking_init();
    /* init some links before init_ISA_irqs() */
    early_irq_init();
    init_IRQ();
    tick_init();
    rcu_init_nohz();
    init_timers();
    hrtimers_init();
    softirq_init();
    timekeeping_init();
    time_init();
    sched_clock_postinit();
    printk_safe_init();
    perf_event_init();
    profile_init();
    call_function_init();
    WARN(!irqs_disabled(), "Interrupts were enabled early\n");
    early_boot_irqs_disabled = false;
    local_irq_enable();

    kmem_cache_init_late();

    /*
     * HACK ALERT! This is early. We're enabling the console before
     * we've done PCI setups etc, and console_init() must be aware of
     * this. But we do want output early, in case something goes wrong.
     */
    console_init();
    if (panic_later)
        panic("Too many boot %s vars at `%s'", panic_later,
              panic_param);

    lockdep_info();

    /*
     * Need to run this when irqs are enabled, because it wants
     * to self-test [hard/soft]-irqs on/off lock inversion bugs
     * too:
     */
    locking_selftest();

    /*
     * This needs to be called before any devices perform DMA
     * operations that might use the SWIOTLB bounce buffers. It will
     * mark the bounce buffers as decrypted so that their usage will
     * not cause "plain-text" data to be decrypted when accessed.
     */
    mem_encrypt_init();

#ifdef CONFIG_BLK_DEV_INITRD
    if (initrd_start && !initrd_below_start_ok &&
        page_to_pfn(virt_to_page((void *)initrd_start)) < min_low_pfn) {
        pr_crit("initrd overwritten (0x%08lx < 0x%08lx) - disabling it.\n",
            page_to_pfn(virt_to_page((void *)initrd_start)),
            min_low_pfn);
        initrd_start = 0;
    }
#endif
    page_ext_init();
    kmemleak_init();
    debug_objects_mem_init();
    setup_per_cpu_pageset();
    numa_policy_init();
    acpi_early_init();
    if (late_time_init)
        late_time_init();
    calibrate_delay();
    pid_idr_init();
    anon_vma_init();
#ifdef CONFIG_X86
    if (efi_enabled(EFI_RUNTIME_SERVICES))
        efi_enter_virtual_mode();
#endif
    thread_stack_cache_init();
    cred_init();
    fork_init();
    proc_caches_init();
    uts_ns_init();
    buffer_init();
    key_init();
    security_init();
    dbg_late_init();
    vfs_caches_init();
    pagecache_init();
    signals_init();
    seq_file_init();
    proc_root_init();
    nsfs_init();
    cpuset_init();
    cgroup_init();
    taskstats_init_early();
    delayacct_init();

    check_bugs();

    acpi_subsystem_init();
    arch_post_acpi_subsys_init();
    sfi_init_late();

    if (efi_enabled(EFI_RUNTIME_SERVICES)) {
        efi_free_boot_services();
    }

    /* Do the rest non-__init'ed, we're now alive */
    rest_init();
}
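
One detail in start_kernel() that puzzled me is after_dashes: parse_args() hands back whatever follows a standalone "--" on the command line, and those leftovers are then parsed as arguments for init. The user-space sketch below imitates only that split; split_after_dashes is a hypothetical helper, it treats plain spaces as the only separator, and it ignores quoting and the error-pointer cases that the real code guards against with IS_ERR_OR_NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split a writable command line at a standalone "--" word.
 * Returns a pointer to the text after it (arguments meant for init),
 * or NULL if there is no "--". The kernel part is terminated in place.
 */
static char *split_after_dashes(char *cmdline)
{
    assert(cmdline != NULL);

    char *p = cmdline;

    while (*p) {
        /* Skip the spaces between words. */
        while (*p == ' ')
            p++;

        /* Find the end of the current word. */
        char *word = p;
        while (*p && *p != ' ')
            p++;

        if (p - word == 2 && word[0] == '-' && word[1] == '-') {
            *word = '\0';   /* kernel parameters end here */
            while (*p == ' ')
                p++;
            return p;       /* init's arguments start here */
        }
    }
    return NULL;
}
```

With the input "quiet debug -- single 3", the kernel side keeps "quiet debug " and the returned pointer refers to "single 3"; a word such as "--foo" is not treated as the separator.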

This article does not go into depth; it only sorts out the overall flow. The deeper meaning is left for each reader to work out. In short: after power-on, the BIOS runs first and then hands over control by jumping to a specific address; the system then starts to boot, runs the assembly code for the platform, performs various hardware setup, and finally reaches the C entry point that we are familiar with. Along the way there are many operations on memory addresses and registers. It covers a very wide range of things, so we cannot demand too much of ourselves, and perhaps we do not need to.

Enjoy your trip through Linux, old friends!

Topics: C Linux Assembly Language Algorithm source code