Introduction to Operating Systems: Operating Systems: Three Easy Pieces
Homework exercises: https://pages.cs.wisc.edu/~remzi/OSTEP/Homework/homework.html
Below is a translation of the README for this chapter's homework (kept here for easy reference):
Welcome to this simulator. The idea is to get familiar with threads by observing how they interleave; the simulator, x86.py, will help you understand this.
The simulator mimics the execution of short assembly sequences by multiple threads. Note that the OS code that would run (for example, to perform a context switch) is not shown; therefore, all you see is the interleaving of user code.
The assembly code is based on x86, but it is somewhat simplified. In this instruction set, there are four general-purpose registers (%ax, %bx, %cx, %dx), a program counter (PC), and a small set of instructions, which is enough for our purposes.
Here is a short example code that can be run:
    .main
    mov 2000, %ax   # get the value at the address
    add $1, %ax     # increment it
    mov %ax, 2000   # store it back
    halt
This code is easy to understand: the first instruction, an x86 "mov", loads a value from address 2000 into the %ax register. In this subset of x86, addresses can take the following forms:
    2000        -> the number (2000) is the address
    (%cx)       -> the contents of the register (in parentheses) form the address
    1000(%dx)   -> the number + the contents of the register form the address
    10(%ax,%bx) -> the number + reg1 + reg2 form the address
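The four address forms above can be sketched as a small Python helper. This is only an illustration of the arithmetic, not code taken from x86.py; the function name `effective_address` and the register dictionary are hypothetical.

```python
import re

def effective_address(operand, registers):
    """Return the address named by a memory operand such as '10(%ax,%bx)'.

    `registers` maps register names without the '%' sign (e.g. 'ax') to values.
    Hypothetical helper for illustration; x86.py may parse operands differently.
    """
    cleaned = operand.replace(" ", "")
    # optional number, then optional "(reg)" or "(reg,reg)"
    match = re.fullmatch(r"(-?\d+)?(?:\((%\w+)(?:,(%\w+))?\))?", cleaned)
    if not cleaned or match is None:
        raise ValueError("not a memory operand: %r" % operand)
    number, reg1, reg2 = match.groups()
    address = int(number) if number else 0
    for reg in (reg1, reg2):
        if reg:
            address += registers[reg.lstrip("%")]
    return address

# Example register contents (made up for this sketch).
regs = {"ax": 1, "bx": 2, "cx": 3000, "dx": 5}
for form in ("2000", "(%cx)", "1000(%dx)", "10(%ax,%bx)"):
    print(form, "->", effective_address(form, regs))
```

With these made-up register values, each form resolves to the number plus the contents of whichever registers appear in the parentheses.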
To store a value, the "mov" instruction is also used, but this time with the operands reversed, for example:
mov %ax, 2000
The "add" instruction in the sequence above should be clear: it adds an immediate value (specified by $1) to the register given as the second operand (that is, %ax = %ax + 1).
Therefore, we can understand the above code sequence: it loads the value at address 2000, then adds 1 to it, and then stores it back at address 2000.
The pseudo-instruction "halt" just stops the thread.
Let's run the simulator and see how it works, assuming that the above code sequence is in the file "simple-race.s".
    HW-ThreadsIntro$ ./x86.py -p simple-race.s -t 1

           Thread 0
    1000 mov 2000(%bx), %ax
    1001 add $1, %ax
    1002 mov %ax, 2000(%bx)
    1003 halt
Here, -p specifies the program to run and -t 1 the number of threads. The interrupt interval, which is how often the scheduler wakes up and runs to switch to a different task, is set separately with -i. Because there is only one thread in this example, this interval does not matter.
The output is easy to read: the simulator prints the program counter (here from 1000 to 1003) and the instruction that gets executed. Note that we assume (unrealistically) that every instruction occupies just a single byte in memory; in real x86, instructions are variable-sized and may take up several bytes.
We can use more detailed tracing to better understand how the state of the machine changes during execution.
    HW-ThreadsIntro$ ./x86.py -p simple-race.s -t 1 -M 2000 -R ax,bx -c

     2000      ax    bx          Thread 0
        0       0     0
        0       0     0   1000 mov 2000(%bx), %ax
        0       1     0   1001 add $1, %ax
        1       1     0   1002 mov %ax, 2000(%bx)
        1       1     0   1003 halt
With the -M flag, you can trace memory locations (a comma-separated list lets you trace several, such as 2000,3000); with the -R flag, you can trace the values in specific registers.
The values on the left show the memory/register contents after the instruction on the right has executed. For example, after the "add" instruction, you can see that %ax has been incremented to the value 1; after the second "mov" instruction (at PC=1002), you can see that the memory contents at address 2000 have now also been incremented.
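To make the "state after each instruction" reading of the trace concrete, here is a minimal Python sketch that replays the four instructions by hand and records memory[2000], %ax, and %bx after each one. It is not how x86.py is implemented; the function name `trace_simple_race` and the zero initial state are assumptions for illustration.

```python
def trace_simple_race():
    """Replay simple-race.s by hand and record the state after each instruction."""
    memory = {2000: 0}              # assumed initial memory contents
    regs = {"ax": 0, "bx": 0}       # assumed initial register contents
    rows = []

    def record(pc, text):
        # Snapshot taken AFTER the instruction at `pc` has executed.
        rows.append((memory[2000], regs["ax"], regs["bx"], pc, text))

    regs["ax"] = memory[2000 + regs["bx"]]      # mov 2000(%bx), %ax
    record(1000, "mov 2000(%bx), %ax")
    regs["ax"] += 1                             # add $1, %ax
    record(1001, "add $1, %ax")
    memory[2000 + regs["bx"]] = regs["ax"]      # mov %ax, 2000(%bx)
    record(1002, "mov %ax, 2000(%bx)")
    record(1003, "halt")                        # halt changes no state
    return rows

for row in trace_simple_race():
    print("%5d %5d %5d   %d %s" % row)
```

Each printed row lists memory[2000], %ax, %bx, then the PC and instruction, in the same column order as the simulator trace.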
There are a few more instructions you need to understand. Here is a code fragment of a loop:
    .main
    .top
    sub  $1,%dx
    test $0,%dx
    jgte .top
    halt
A few new things are introduced here. The first is the "test" instruction. This instruction takes two operands and compares them; it then sets implicit "condition codes" (each a kind of 1-bit register) that subsequent instructions can act upon.
The other new instruction is the "jump" (in this example, "jgte", which stands for "jump if greater than or equal to"). This instruction jumps if the second value of the test is greater than or equal to the first.
One last point: for this code to really work, %dx needs to be initialized to 1 or greater.
So we run the program like this:
    HW-ThreadsIntro$ ./x86.py -p loop.s -t 1 -a dx=3 -R dx -C -c

       dx   >= >  <= <  != ==        Thread 0
        3   0  0  0  0  0  0
        2   0  0  0  0  0  0  1000 sub  $1,%dx
        2   1  1  0  0  1  0  1001 test $0,%dx
        2   1  1  0  0  1  0  1002 jgte .top
        1   1  1  0  0  1  0  1000 sub  $1,%dx
        1   1  1  0  0  1  0  1001 test $0,%dx
        1   1  1  0  0  1  0  1002 jgte .top
        0   1  1  0  0  1  0  1000 sub  $1,%dx
        0   1  0  1  0  0  1  1001 test $0,%dx
        0   1  0  1  0  0  1  1002 jgte .top
       -1   1  0  1  0  0  1  1000 sub  $1,%dx
       -1   0  0  1  1  1  0  1001 test $0,%dx
       -1   0  0  1  1  1  0  1002 jgte .top
       -1   0  0  1  1  1  0  1003 halt

The "-R dx" flag traces the value of %dx; the "-C" flag traces the values of the condition codes set by the test instruction. Finally, the "-a dx=3" flag sets the %dx register to a starting value of 3.
From the trace, you can see that the "sub" instruction gradually decreases the value of %dx. The loop ends once the test no longer sets the ">=" condition (in this trace, when %dx reaches -1), so the "jgte" falls through to "halt".
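The interplay of "sub", "test", and "jgte" in the loop above can be sketched in a few lines of Python. This is a hand-written model under the assumption that "test" compares the second operand against the first, as the README describes; the name `run_loop` is hypothetical.

```python
def run_loop(dx):
    """Model of: .top / sub $1,%dx / test $0,%dx / jgte .top / halt."""
    trace = []
    while True:
        dx -= 1                                   # sub $1,%dx
        # test $0,%dx: compare the second operand (%dx) with the first (0)
        codes = {">=": dx >= 0, ">": dx > 0, "<=": dx <= 0,
                 "<": dx < 0, "!=": dx != 0, "==": dx == 0}
        trace.append((dx, codes))
        if not codes[">="]:                       # jgte .top is not taken
            break                                 # fall through to halt
    return dx, trace

final_dx, trace = run_loop(3)
for dx, codes in trace:
    set_codes = [name for name, is_set in codes.items() if is_set]
    print("dx=%2d  condition codes set: %s" % (dx, " ".join(set_codes)))
```

Starting from dx=3, the model executes "sub" four times and stops with %dx at -1, which matches the number of loop passes in the trace.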
Now for a more interesting example: a race condition with multiple threads. Let's first look at the code:
    .main
    .top
    # critical section
    mov 2000, %ax   # get the value at the address
    add $1, %ax     # increment it
    mov %ax, 2000   # store it back

    # see if we're still looping
    sub  $1, %bx
    test $0, %bx
    jgt .top

    halt
This code has a critical section that loads the value of a variable (at address 2000), adds 1 to the value, and then stores it back.
The code after it just decrements a loop counter (in %bx) and tests whether it is greater than zero (the jump is "jgt"). If so, it jumps back to the critical section at the top again.
    HW-ThreadsIntro$ ./x86.py -p looping-race-nolock.s -t 2 -a bx=1 -M 2000 -c

     2000          Thread 0                Thread 1
        0
        0   1000 mov 2000, %ax
        0   1001 add $1, %ax
        1   1002 mov %ax, 2000
        1   1003 sub  $1, %bx
        1   1004 test $0, %bx
        1   1005 jgt .top
        1   1006 halt
        1   ----- Halt;Switch -----  ----- Halt;Switch -----
        1                            1000 mov 2000, %ax
        1                            1001 add $1, %ax
        2                            1002 mov %ax, 2000
        2                            1003 sub  $1, %bx
        2                            1004 test $0, %bx
        2                            1005 jgt .top
        2                            1006 halt
You can see that each thread runs once, and each updates the shared variable at address 2000 once, so the final result is 2.
A "Halt;Switch" line is inserted whenever one thread halts and another thread must run. One last example: run the same program as above, but with a smaller interrupt interval.
    HW-ThreadsIntro$ ./x86.py -p looping-race-nolock.s -t 2 -a bx=1 -M 2000 -i 2

     2000          Thread 0                Thread 1
        ?
        ?   1000 mov 2000, %ax
        ?   1001 add $1, %ax
        ?   ------ Interrupt ------  ------ Interrupt ------
        ?                            1000 mov 2000, %ax
        ?                            1001 add $1, %ax
        ?   ------ Interrupt ------  ------ Interrupt ------
        ?   1002 mov %ax, 2000
        ?   1003 sub  $1, %bx
        ?   ------ Interrupt ------  ------ Interrupt ------
        ?                            1002 mov %ax, 2000
        ?                            1003 sub  $1, %bx
        ?   ------ Interrupt ------  ------ Interrupt ------
        ?   1004 test $0, %bx
        ?   1005 jgt .top
        ?   ------ Interrupt ------  ------ Interrupt ------
        ?                            1004 test $0, %bx
        ?                            1005 jgt .top
        ?   ------ Interrupt ------  ------ Interrupt ------
        ?   1006 halt
        ?   ----- Halt;Switch -----  ----- Halt;Switch -----
        ?                            1006 halt
As you can see, each thread is interrupted every 2 instructions, as we specified with the "-i 2" flag. What is the value of memory[2000] throughout this run? What should it have been?
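One way to check your answer is a small Python model of the interleaving. The sketch below is not x86.py: it assumes a strict round-robin switch after every `interval` instructions (or at halt), per-thread private registers, and memory[2000] starting at 0. The names `step` and `run_race` are hypothetical.

```python
def step(thread, memory):
    """Execute one instruction of looping-race-nolock.s for one thread."""
    pc = thread["pc"]
    thread["pc"] = pc + 1
    if pc == 0:                                  # mov 2000, %ax
        thread["ax"] = memory[2000]
    elif pc == 1:                                # add $1, %ax
        thread["ax"] += 1
    elif pc == 2:                                # mov %ax, 2000
        memory[2000] = thread["ax"]
    elif pc == 3:                                # sub $1, %bx
        thread["bx"] -= 1
    elif pc == 4:                                # test $0, %bx
        thread["gt"] = thread["bx"] > 0
    elif pc == 5:                                # jgt .top
        if thread["gt"]:
            thread["pc"] = 0
    else:                                        # halt
        thread["halted"] = True

def run_race(interval, num_threads=2, loops=1):
    """Return the final memory[2000] for a given interrupt interval."""
    memory = {2000: 0}                           # shared by all threads
    threads = [{"ax": 0, "bx": loops, "pc": 0, "gt": False, "halted": False}
               for _ in range(num_threads)]      # registers are per-thread
    current = 0
    while not all(t["halted"] for t in threads):
        thread = threads[current]
        executed = 0
        while not thread["halted"] and executed < interval:
            step(thread, memory)
            executed += 1
        current = (current + 1) % num_threads    # interrupt or halt: switch
    return memory[2000]

print("interval 2:", run_race(2))
print("interval 50:", run_race(50))
```

Under these assumptions, `run_race(2)` returns 1 while `run_race(50)` returns 2: with the short interval both threads load the old value before either stores its increment, so one update is lost.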
Now let's give some more information about what this program can simulate. The full set of registers is %ax, %bx, %cx, %dx, and the PC. In this version, there is no support for a "stack", and there are no call and return instructions.
The full set of simulated instructions is as follows:
    mov immediate, register     # moves immediate value to register
    mov memory, register        # loads from memory into register
    mov register, register      # moves value from one register to other
    mov register, memory        # stores register contents in memory
    mov immediate, memory       # stores immediate value in memory

    add immediate, register     # register  = register  + immediate
    add register1, register2    # register2 = register2 + register1
    sub immediate, register     # register  = register  - immediate
    sub register1, register2    # register2 = register2 - register1

    test immediate, register    # compare immediate and register (set condition codes)
    test register, immediate    # same but register and immediate
    test register, register     # same but register and register

    jne                         # jump if test'd values are not equal
    je                          #                       ... equal
    jlt                         #     ... second is less than first
    jlte                        #               ... less than or equal
    jgt                         #            ... is greater than
    jgte                        #               ... greater than or equal

    xchg register, memory       # atomic exchange:
                                #   put value of register into memory
                                #   return old contents of memory into reg
                                # do both things atomically

    nop                         # no op

    Notes:
    - 'immediate' is something of the form $number
    - 'memory' is of the form 'number' or '(reg)' or 'number(reg)'
       or 'number(reg,reg)' (as described above)
    - 'register' is one of %ax, %bx, %cx, %dx
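Of the instructions listed, "xchg" is the least obvious, so here is a minimal Python sketch of its effect. It only models the result of the exchange; atomicity is represented by doing both halves inside one function call, which a simulator would treat as a single uninterruptible step. The function name `xchg` and the dictionaries are assumptions for illustration.

```python
def xchg(registers, reg, memory, address):
    """Swap registers[reg] with memory[address] as one indivisible step."""
    # The right-hand side is read in full before either target is written,
    # so the register receives the OLD memory value.
    registers[reg], memory[address] = memory[address], registers[reg]

memory = {2000: 0}
first = {"ax": 1}
second = {"ax": 1}

xchg(first, "ax", memory, 2000)    # first thread: gets old value 0, stores 1
xchg(second, "ax", memory, 2000)   # second thread: gets old value 1, stores 1
print(first["ax"], second["ax"], memory[2000])
```

Because the old memory value comes back in the register, a thread can tell whether it was the one that changed the location from 0 to 1, which is why an atomic exchange is a common building block for simple locks.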
Finally, here is the complete set of options for the simulator, which you can display with the -h flag:
    HW-ThreadsIntro$ ./x86.py -h
    Usage: x86.py [options]

    Options:
      -h, --help            show this help message and exit
      -s SEED, --seed=SEED  the random seed
      -t NUMTHREADS, --threads=NUMTHREADS
                            number of threads
      -p PROGFILE, --program=PROGFILE
                            source program (in .s)
      -i INTFREQ, --interrupt=INTFREQ
                            interrupt frequency
      -r, --randints        if interrupts are random
      -a ARGV, --argv=ARGV  comma-separated per-thread args (e.g., ax=1,ax=2 sets
                            thread 0 ax reg to 1 and thread 1 ax reg to 2);
                            specify multiple regs per thread via colon-separated
                            list (e.g., ax=1:bx=2,cx=3 sets thread 0 ax and bx and
                            just cx for thread 1)
      -L LOADADDR, --loadaddr=LOADADDR
                            address where to load code
      -m MEMSIZE, --memsize=MEMSIZE
                            size of address space (KB)
      -M MEMTRACE, --memtrace=MEMTRACE
                            comma-separated list of addrs to trace (e.g.,
                            20000,20001)
      -R REGTRACE, --regtrace=REGTRACE
                            comma-separated list of regs to trace (e.g.,
                            ax,bx,cx,dx)
      -C, --cctrace         should we trace condition codes
      -S, --printstats      print some extra stats
      -v, --verbose         print some extra info
      -c, --compute         compute answers for me
Most are obvious. Using -r turns on a random interrupter (with intervals from 1 to the intfreq specified by -i), which can make the homework problems more interesting.
-L specifies the location where the code is loaded in the address space.
-m specifies the size of the address space in KB.
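-S prints some additional statistics.
-c is not really used on its own (unlike in most simulators in the book); combine it with memory, register, or condition-code tracing, as in the examples above, to have the traced values filled in.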
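The -a format described in the help text (commas separate threads, colons separate registers within one thread) can be sketched as a small Python parser. This is an illustration of the format only, not the parsing code in x86.py; the name `parse_argv` is hypothetical, and because the help text does not say what happens when fewer specs than threads are given, the sketch simply leaves the remaining threads without initial values.

```python
def parse_argv(argv, num_threads):
    """Parse a string like 'ax=1:bx=2,cx=3' into one register dict per thread."""
    per_thread = [dict() for _ in range(num_threads)]
    if not argv:
        return per_thread
    specs = argv.split(",")                  # one spec per thread
    if len(specs) > num_threads:
        raise ValueError("more per-thread specs than threads")
    for thread_id, spec in enumerate(specs):
        for assignment in spec.split(":"):   # several registers for one thread
            name, value = assignment.split("=")
            per_thread[thread_id][name] = int(value)
    return per_thread

print(parse_argv("ax=1,ax=2", 2))
print(parse_argv("ax=1:bx=2,cx=3", 2))
```

The two calls mirror the examples in the help text: the first gives each thread its own %ax value, the second sets %ax and %bx for thread 0 and only %cx for thread 1.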