This article continues earlier articles in this series; it is recommended to read them first.

A program goes through three stages from source code to execution:

  • Compile: each .c file is compiled into a .o file, without regard to the links between .o files
  • Static link: all .o files are merged into one .so or .out file; this stage handles the layout of every .o file's sections in the target file
  • Dynamic link: the .so or a.out file is loaded into memory; this stage handles the layout of the loaded file in memory

What is relocation

Relocation is the process of transforming a program's logical address space into the actual physical address space in memory. It is the basis for multiprogramming, that is, for several programs running in memory at the same time. There are two kinds of relocation: static relocation and dynamic relocation.

  • 1. Static relocation: it is completed while the program is being loaded into memory. Before the program starts running, every address-related item in the program has already been relocated. The address transformation is done once at load time and never changes afterwards, which is why it is called static relocation. In other words, static relocation of addresses is already finished when the executable / shared object file is generated; it settles the internal issues of the executable / shared object file
  • 2. Dynamic relocation: it is not completed when the program is loaded into memory. Instead, each time the CPU accesses memory, a dynamic address translation mechanism (hardware) automatically converts the relative address into an absolute address. Dynamic relocation requires software and hardware to cooperate. In other words, the external issues of the executable / shared object file must be settled by the external environment. The file presents a "diplomatic note" in order to live in the global village; that note is covered in the last part of this article
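
The difference between the two kinds can be shown with a toy model. This is only a sketch, not how any real loader or address translation hardware works; the program layout, `LOAD_BASE`, and both helper functions are hypothetical names introduced for illustration.

```python
# Toy model: a "program" is a list of words, some of which hold logical addresses.
PROGRAM = [10, 3, 99, 1]   # words 1 and 3 hold logical addresses (3 and 1)
ADDRESS_SLOTS = [1, 3]     # indexes of the words that must be relocated
LOAD_BASE = 0x1000         # hypothetical physical load address


def static_relocate(program, slots, base):
    """Static relocation: patch every address word once, at load time."""
    image = list(program)
    for index in slots:
        image[index] += base
    return image


def dynamic_translate(logical_address, base_register):
    """Dynamic relocation: translate on every access, leaving the image unchanged."""
    return base_register + logical_address


loaded = static_relocate(PROGRAM, ADDRESS_SLOTS, LOAD_BASE)
print([hex(word) for word in loaded])
print(hex(dynamic_translate(PROGRAM[1], LOAD_BASE)))
```

In the static case the loaded image differs from the file; in the dynamic case the stored words stay relative and the base is added at access time.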

Ten types of relocation

  • There are 10 relocation types. In practice, match each case against the list below. All of these types are covered in this article:

Type / Formula / Specific description

Note: in the objdump -r output shown later in this article, A appears as the addend in the VALUE column (for example -0x4), while the bytes at the relocated position are still 00.

R_X86_64_32
    Formula: S+A
    S: the memory address of the symbol referred to by the VALUE member of the relocation entry
    A: the original value at the relocated position, giving the offset between the "memory address of the referenced symbol" and S
    Description: in a .o file compiled without -fPIC, each reference to a global variable corresponds to one R_X86_64_32 relocation entry; in a .so file generated without -fPIC, each reference to a non-static global variable corresponds to one R_X86_64_32 relocation entry

R_X86_64_PC32
    Formula: S+A-P
    S: the memory address of the symbol referred to by the VALUE member of the relocation entry
    A: the original value at the relocated position, giving the offset between the "relocated position" and the "next instruction"
    P: the memory address of the relocated position
    Description: in .o and .so files generated without -fPIC, each call to a non-static function corresponds to one R_X86_64_PC32 relocation entry

R_X86_64_PLT32
    Formula: L+A-P
    L: the memory address of <symbol@PLT> for the symbol referred to by the VALUE member of the relocation entry
    A: the original value at the relocated position, giving the offset of the "relocated position" relative to the "next instruction"
    P: the memory address of the relocated position
    Description: in a .o file compiled with -fPIC, each call to a non-static function corresponds to one R_X86_64_PLT32 relocation entry

R_X86_64_RELATIVE
    Formula: B+A
    B: the base address at which the .so file is loaded into memory
    A: the original value at the relocated position, giving the offset of the referenced symbol within the .so file
    Description: in a .so file generated without -fPIC, each reference to a static global variable corresponds to one R_X86_64_RELATIVE relocation entry

R_X86_64_GOT32
    Formula: G
    G: the offset, relative to the GOT, of the address pointer of the referenced symbol
    Description: in a .o file compiled with -fPIC, each reference to a non-static global variable corresponds to one R_X86_64_GOT32 relocation entry

R_X86_64_GOTOFF
    Formula: S-GOT
    S: the memory address of the symbol referred to by the VALUE member of the relocation entry
    GOT: the end address of the .got section at run time
    Description: in a .o file compiled with -fPIC, each reference to a static global variable corresponds to one R_X86_64_GOTOFF relocation entry

R_X86_64_GLOB_DAT
    Formula: S
    S: the memory address of the symbol referred to by the VALUE member of the relocation entry
    Description: in a .so file generated with -fPIC, each reference to a non-static global variable corresponds to one R_X86_64_GLOB_DAT relocation entry

R_X86_64_COPY
    Formula: none
    Description: in an .out file, each extern reference to a variable defined in a .so corresponds to one R_X86_64_COPY relocation entry

R_X86_64_JUMP_SLOT
    Formula: S (same as the formula of R_386_GLOB_DAT; but for the dynamic ld, the R_386_JMP_SLOT type is equivalent to R_386_RELATIVE)
    S: the memory address of the symbol referred to by the VALUE member of the relocation entry
    Description: in a .so file generated with -fPIC, each call to a non-static function corresponds to one R_X86_64_JUMP_SLOT relocation entry

R_X86_64_GOTPC
    Formula: GOT+A-P
    GOT: the end address of the .got section at run time
    A: the original value at the relocated position, giving the offset of the "relocated position" within the machine code
    P: the memory address of the relocated position
    Description: in a .o file compiled with -fPIC, each global variable produces an additional pair of R_X86_64_PC32 and R_X86_64_GOTPC relocation entries; each non-static function in a .o file compiled with -fPIC also produces an additional pair of R_X86_64_PC32 and R_X86_64_GOTPC relocation entries


  • PIC in -fPIC stands for Position Independent Code; the option tells the compiler to generate position-independent code.
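
To make the formulas concrete, here is a small sketch that writes several of them as plain functions. The dictionary `FORMULAS`, the helper `apply_formula`, and all the numbers are hypothetical, chosen only to show how S, A, P, L and B combine; they do not come from any real file.

```python
# S: symbol address, A: addend, P: address of the relocated field,
# L: address of the symbol's PLT entry, B: load base of the shared object.
FORMULAS = {
    "R_X86_64_32": lambda S, A, **_: S + A,
    "R_X86_64_PC32": lambda S, A, P, **_: S + A - P,
    "R_X86_64_PLT32": lambda L, A, P, **_: L + A - P,
    "R_X86_64_RELATIVE": lambda B, A, **_: B + A,
    "R_X86_64_GLOB_DAT": lambda S, **_: S,
    "R_X86_64_JUMP_SLOT": lambda S, **_: S,
}


def apply_formula(rel_type, **values):
    """Evaluate the formula for one relocation type with the given inputs."""
    return FORMULAS[rel_type](**values)


# Hypothetical inputs, for illustration only.
print(hex(apply_formula("R_X86_64_PC32", S=0x2000, A=-4, P=0x1000)))
print(hex(apply_formula("R_X86_64_RELATIVE", B=0x10000, A=0x2008)))
```

The PC-relative types subtract P because the stored value is a distance from the instruction stream, while the absolute types store an address directly.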

objdump command

objdump is a Linux command that disassembles object files and executable files. It presents the additional information a binary file may contain in a readable form. This article uses it to explain the implementation details of static relocation and how the preconditions for dynamic relocation are prepared. First, look at the objdump command as a whole:

root@5e3abe332c5a:/home/docker/test4harmony/54# objdump
Usage: objdump <option(s)> <file(s)>
 Display information from object <file(s)>.
 At least one of the following switches must be given:
  -a, --archive-headers    Display archive header information
  -f, --file-headers       Display the contents of the overall file header
  -p, --private-headers    Display object format specific file header contents
  -P, --private=OPT,OPT... Display object format specific contents
  -h, --[section-]headers  Display the contents of the section headers
  -x, --all-headers        Display the contents of all headers
  -d, --disassemble        Display assembler contents of executable sections
  -D, --disassemble-all    Display assembler contents of all sections
      --disassemble=<sym>  Display assembler contents from <sym>
  -S, --source             Intermix source code with disassembly
      --source-comment[=<txt>] Prefix lines of source code with <txt>
  -s, --full-contents      Display the full contents of all sections requested
  -g, --debugging          Display debug information in object file
  -e, --debugging-tags     Display debug information using ctags style
  -G, --stabs              Display (in raw form) any STABS info in the file
  -W[lLiaprmfFsoRtUuTgAckK] or
                           Display DWARF info in the file
  --ctf=SECTION            Display CTF info from SECTION
  -t, --syms               Display the contents of the symbol table(s)
  -T, --dynamic-syms       Display the contents of the dynamic symbol table
  -r, --reloc              Display the relocation entries in the file
  -R, --dynamic-reloc      Display the dynamic relocation entries in the file
  @<file>                  Read options from <file>
  -v, --version            Display this program's version number
  -i, --info               List object formats and architectures supported
  -H, --help               Display this information

objdump -S ./obj/main.o

root@5e3abe332c5a:/home/docker/test4harmony/54# objdump -S ./obj/main.o 
./obj/main.o:     file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <main>:     
#include <stdio.h>
#include "part.h"
extern int g_int;
extern char *g_str;

int main() {
   0:   f3 0f 1e fa             endbr64
   4:   55                      push   %rbp
   5:   48 89 e5                mov    %rsp,%rbp
   8:   48 83 ec 10             sub    $0x10,%rsp
        int loc_int = 53;
   c:   c7 45 f4 35 00 00 00    movl   $0x35,-0xc(%rbp)
        char *loc_str = "harmony os";
  13:   48 8d 05 00 00 00 00    lea    0x0(%rip),%rax        # 1a <main+0x1a>
  1a:   48 89 45 f8             mov    %rax,-0x8(%rbp)
        printf("main start - overall situation g_int = %d, overall situation g_str = %s.\n", g_int, g_str);
  1e:   48 8b 15 00 00 00 00    mov    0x0(%rip),%rdx        # 25 <main+0x25>
  25:   8b 05 00 00 00 00       mov    0x0(%rip),%eax        # 2b <main+0x2b>
  2b:   89 c6                   mov    %eax,%esi
  2d:   48 8d 3d 00 00 00 00    lea    0x0(%rip),%rdi        # 34 <main+0x34>
  34:   b8 00 00 00 00          mov    $0x0,%eax
  39:   e8 00 00 00 00          callq  3e <main+0x3e>
  3e:   8b 45 f4                mov    -0xc(%rbp),%eax
  41:   89 c7                   mov    %eax,%edi
  43:   e8 00 00 00 00          callq  48 <main+0x48>
  48:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  4c:   48 89 c7                mov    %rax,%rdi
  4f:   e8 00 00 00 00          callq  54 <main+0x54>
        printf("main end - overall situation g_int = %d, overall situation g_str = %s.\n", g_int, g_str);
  54:   48 8b 15 00 00 00 00    mov    0x0(%rip),%rdx        # 5b <main+0x5b>
  5b:   8b 05 00 00 00 00       mov    0x0(%rip),%eax        # 61 <main+0x61>
  61:   89 c6                   mov    %eax,%esi
  63:   48 8d 3d 00 00 00 00    lea    0x0(%rip),%rdi        # 6a <main+0x6a>
  6a:   b8 00 00 00 00          mov    $0x0,%eax
  6f:   e8 00 00 00 00          callq  74 <main+0x74>
        return 0;
  74:   b8 00 00 00 00          mov    $0x0,%eax
  79:   c9                      leaveq
  7a:   c3                      retq


  • Pay attention to the 00 bytes: these are addresses the compiler cannot determine yet. Counting by hand, the first two of these fields start at offsets 0x16 and 0x21 (instruction address plus the leading opcode bytes, 0x13 + 3 and 0x1e + 3), which are the OFFSET values in the table below

objdump -r ./obj/main.o

root@5e3abe332c5a:/home/docker/test4harmony/54# objdump -r ./obj/main.o
./obj/main.o:     file format elf64-x86-64
OFFSET           TYPE              VALUE
0000000000000016 R_X86_64_PC32     .rodata-0x0000000000000004
0000000000000021 R_X86_64_PC32     g_str-0x0000000000000004
0000000000000027 R_X86_64_PC32     g_int-0x0000000000000004
0000000000000030 R_X86_64_PC32     .rodata+0x000000000000000c
000000000000003a R_X86_64_PLT32    printf-0x0000000000000004
0000000000000044 R_X86_64_PLT32    func_int-0x0000000000000004
0000000000000050 R_X86_64_PLT32    func_str-0x0000000000000004
0000000000000057 R_X86_64_PC32     g_str-0x0000000000000004
000000000000005d R_X86_64_PC32     g_int-0x0000000000000004
0000000000000066 R_X86_64_PC32     .rodata+0x0000000000000044
0000000000000070 R_X86_64_PLT32    printf-0x0000000000000004


  • The fields at 0x16 and 0x21 are all 0: addresses the compiler cannot determine are set to zero (0x00000000). At the same time, the compiler generates one relocation record per field. Each record tells the linker to correct the address in that instruction at link time, which relocation type applies, and where to find the data to fill in
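
The hand calculation of the OFFSET column can be reproduced with a few lines. This is a sketch that assumes, as is the case for the four instructions copied here from the main.o listing, that the 4-byte field to be patched is the last 4 bytes of the instruction; `field_offset` is a hypothetical helper name.

```python
# (instruction address, instruction bytes) copied from objdump -S ./obj/main.o
INSTRUCTIONS = [
    (0x13, "48 8d 05 00 00 00 00"),  # lea  0x0(%rip),%rax
    (0x1e, "48 8b 15 00 00 00 00"),  # mov  0x0(%rip),%rdx
    (0x25, "8b 05 00 00 00 00"),     # mov  0x0(%rip),%eax
    (0x39, "e8 00 00 00 00"),        # callq
]


def field_offset(address, hex_bytes):
    """Offset of the trailing 4-byte field, measured from the section start."""
    length = len(bytes.fromhex(hex_bytes))
    return address + length - 4


offsets = [field_offset(addr, raw) for addr, raw in INSTRUCTIONS]
print([hex(o) for o in offsets])  # ['0x16', '0x21', '0x27', '0x3a']
```

These four values are the first, second, third and fifth OFFSET entries reported by objdump -r. Note that a zero field alone does not imply a relocation: `mov $0x0,%eax` also contains four 00 bytes but has no entry in the table.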

  • External global variable relocation: g_str, g_int

    0000000000000021 R_X86_64_PC32     g_str-0x0000000000000004
    0000000000000027 R_X86_64_PC32     g_int-0x0000000000000004
    1e:   48 8b 15 00 00 00 00    mov    0x0(%rip),%rdx        # 25 <main+0x25>
    25:   8b 05 00 00 00 00       mov    0x0(%rip),%eax        # 2b <main+0x2b>

    The compiler does not know which .o file g_str lives in, and of course it does not know the run-time address of g_str either. It therefore records a relocation in main.o, requiring a later stage to patch the value at offset 0x21 of the main.o image according to "S (memory address of g_str) + A (-0x04) - P (memory address of the relocated position)"

  • Function relocation, relocation type is R_X86_64_PLT32

    000000000000003a R_X86_64_PLT32    printf-0x0000000000000004
    0000000000000044 R_X86_64_PLT32    func_int-0x0000000000000004
    0000000000000050 R_X86_64_PLT32    func_str-0x0000000000000004
    0000000000000070 R_X86_64_PLT32    printf-0x0000000000000004
    39:   e8 00 00 00 00          callq  3e <main+0x3e>
    43:   e8 00 00 00 00          callq  48 <main+0x48>

    Similarly, the compiler does not know which .o files func_int and printf live in, and of course it does not know their run-time addresses. It therefore records a relocation in main.o, and a later stage patches the value at offset 0x3a of the main.o image

  • Another part of the data is provided by the .o file itself, as follows

objdump -sj .rodata ./obj/main.o

root@5e3abe332c5a:/home/docker/test4harmony/54# objdump -sj .rodata ./obj/main.o
./obj/main.o:     file format elf64-x86-64
Contents of section .rodata:
 0000 6861726d 6f6e7920 6f730000 00000000  harmony os......
 0010 6d61696e 20e5bc80 e5a78b20 2d20e585  main ...... - ..
 0020 a8e5b180 20675f69 6e74203d 2025642c  .... g_int = %d,
 0030 20e585a8 e5b18020 675f7374 72203d20   ...... g_str =
 0040 25732e0a 00000000 6d61696e 20e7bb93  %s......main ...
 0050 e69d9f20 2d20e585 a8e5b180 20675f69  ... - ...... g_i
 0060 6e74203d 2025642c 20e585a8 e5b18020  nt = %d, ......
 0070 675f7374 72203d20 25732e0a 00        g_str = %s...


  • Internal variable relocation
    13:   48 8d 05 00 00 00 00    lea    0x0(%rip),%rax        # 1a <main+0x1a>
    0000000000000016 R_X86_64_PC32     .rodata-0x0000000000000004
    Because it is a local variable, the compiler knows its data is placed in the .rodata section. A later stage must patch the value at offset 0x16 of the main.o image according to "S (memory address of .rodata from the main.o image) + A (-0x04) - P (memory address of the relocated position)"
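
The three .rodata relocations of main.o (addends -0x4, +0xc and +0x44) can be tied back to the hex dump above. Since a %rip-relative operand is measured from the next instruction, which is 4 bytes past the relocated field, the data being referenced sits at addend + 4 inside .rodata. The sketch below checks that; `cstring_at` is a hypothetical helper, and the bytes are copied from the objdump -sj .rodata output.

```python
# Bytes of main.o's .rodata section, copied row by row from the dump.
RODATA_HEX = """
6861726d 6f6e7920 6f730000 00000000
6d61696e 20e5bc80 e5a78b20 2d20e585
a8e5b180 20675f69 6e74203d 2025642c
20e585a8 e5b18020 675f7374 72203d20
25732e0a 00000000 6d61696e 20e7bb93
e69d9f20 2d20e585 a8e5b180 20675f69
6e74203d 2025642c 20e585a8 e5b18020
675f7374 72203d20 25732e0a 00
"""
RODATA = bytes.fromhex("".join(RODATA_HEX.split()))


def cstring_at(data, offset):
    """Return the NUL-terminated byte string starting at offset."""
    end = data.index(b"\x00", offset)
    return data[offset:end]


# Addends of the three .rodata entries reported by objdump -r.
for addend in (-0x4, 0xc, 0x44):
    start = addend + 4  # the operand is relative to the next instruction
    print(hex(start), cstring_at(RODATA, start))
```

The string at offset 0 is `harmony os`, and the strings at 0x10 and 0x48 are the two printf format strings. The `\xe5...` runs in them are multi-byte UTF-8 characters, which objdump displays as dots.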

Next, analyze the executable file produced by static linking

objdump -S ./bin/weharmony

root@5e3abe332c5a:/home/docker/test4harmony/54# objdump -S ./bin/weharmony 
Disassembly of section .text:
0000000000001188 <func_str>:
void func_str(char *str) {
    1188:       f3 0f 1e fa             endbr64
    118c:       55                      push   %rbp
    118d:       48 89 e5                mov    %rsp,%rbp
    1190:       48 83 ec 10             sub    $0x10,%rsp
    1194:       48 89 7d f8             mov    %rdi,-0x8(%rbp)
        g_str = str;
    1198:       48 8b 45 f8             mov    -0x8(%rbp),%rax
    119c:       48 89 05 75 2e 00 00    mov    %rax,0x2e75(%rip)        # 4018 <g_str>
        printf("func_str g_str = %s.\n", g_str);
    11a3:       48 8b 05 6e 2e 00 00    mov    0x2e6e(%rip),%rax        # 4018 <g_str>
    11aa:       48 89 c6                mov    %rax,%rsi
    11ad:       48 8d 3d 83 0e 00 00    lea    0xe83(%rip),%rdi        # 2037 <_IO_stdin_used+0x37>
    11b4:       b8 00 00 00 00          mov    $0x0,%eax
    11b9:       e8 92 fe ff ff          callq  1050 <printf@plt>
    11be:       90                      nop
    11bf:       c9                      leaveq
    11c0:       c3                      retq

00000000000011c1 <main>:
#include <stdio.h>
#include "part.h"
extern int g_int;
extern char *g_str;

int main() {
    11c1:       f3 0f 1e fa             endbr64
    11c5:       55                      push   %rbp
    11c6:       48 89 e5                mov    %rsp,%rbp
    11c9:       48 83 ec 10             sub    $0x10,%rsp
        int loc_int = 53;
    11cd:       c7 45 f4 35 00 00 00    movl   $0x35,-0xc(%rbp)
        char *loc_str = "harmony os";
    11d4:       48 8d 05 75 0e 00 00    lea    0xe75(%rip),%rax        # 2050 <_IO_stdin_used+0x50>
    11db:       48 89 45 f8             mov    %rax,-0x8(%rbp)
        printf("main start - overall situation g_int = %d, overall situation g_str = %s.\n", g_int, g_str);
    11df:       48 8b 15 32 2e 00 00    mov    0x2e32(%rip),%rdx        # 4018 <g_str>
    11e6:       8b 05 24 2e 00 00       mov    0x2e24(%rip),%eax        # 4010 <g_int>
    11ec:       89 c6                   mov    %eax,%esi
    11ee:       48 8d 3d 6b 0e 00 00    lea    0xe6b(%rip),%rdi        # 2060 <_IO_stdin_used+0x60>
    11f5:       b8 00 00 00 00          mov    $0x0,%eax
    11fa:       e8 51 fe ff ff          callq  1050 <printf@plt>
    11ff:       8b 45 f4                mov    -0xc(%rbp),%eax
    1202:       89 c7                   mov    %eax,%edi
    1204:       e8 40 ff ff ff          callq  1149 <func_int>
    1209:       48 8b 45 f8             mov    -0x8(%rbp),%rax
    120d:       48 89 c7                mov    %rax,%rdi
    1210:       e8 73 ff ff ff          callq  1188 <func_str>
        printf("main end - overall situation g_int = %d, overall situation g_str = %s.\n", g_int, g_str);
    1215:       48 8b 15 fc 2d 00 00    mov    0x2dfc(%rip),%rdx        # 4018 <g_str>
    121c:       8b 05 ee 2d 00 00       mov    0x2dee(%rip),%eax        # 4010 <g_int>
    1222:       89 c6                   mov    %eax,%esi
    1224:       48 8d 3d 6d 0e 00 00    lea    0xe6d(%rip),%rdi        # 2098 <_IO_stdin_used+0x98>
    122b:       b8 00 00 00 00          mov    $0x0,%eax
    1230:       e8 1b fe ff ff          callq  1050 <printf@plt>
        return 0;
    1235:       b8 00 00 00 00          mov    $0x0,%eax
    123a:       c9                      leaveq
    123b:       c3                      retq
    123c:       0f 1f 40 00             nopl   0x0(%rax)
root@5e3abe332c5a:/home/docker/test4harmony/54# objdump -s ./bin/weharmony
...Omitted part

Contents of section .plt.got:
 1040 f30f1efa f2ff25ad 2f00000f 1f440000  ......%./....D..
Contents of section .plt.sec:
 1050 f30f1efa f2ff2575 2f00000f 1f440000  ......%u/....D..

Contents of section .data:
 4000 00000000 00000000 08400000 00000000  .........@......
 4010 33000000 00000000 08200000 00000000  3........ ......

Contents of section .rodata:
 2000 01000200 00000000 68656c6c 6f20776f  ........hello wo
 2010 726c6400 00000000 66756e63 5f696e74  rld.....func_int
 2020 20675f69 6e74203d 2025642c 746d7020   g_int = %d,tmp
 2030 3d202564 2e0a0066 756e635f 73747220  = %d...func_str
 2040 675f7374 72203d20 25732e0a 00000000  g_str = %s......
 2050 6861726d 6f6e7920 6f730000 00000000  harmony os......
 2060 6d61696e 20e5bc80 e5a78b20 2d20e585  main ...... - ..
 2070 a8e5b180 20675f69 6e74203d 2025642c  .... g_int = %d,
 2080 20e585a8 e5b18020 675f7374 72203d20   ...... g_str =
 2090 25732e0a 00000000 6d61696e 20e7bb93  %s......main ...
 20a0 e69d9f20 2d20e585 a8e5b180 20675f69  ... - ...... g_i
 20b0 6e74203d 2025642c 20e585a8 e5b18020  nt = %d, ......
 20c0 675f7374 72203d20 25732e0a 00        g_str = %s...
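
The linked bytes can be checked against the relocation entries of main.o. The sketch below assumes that P is the address of main in weharmony (0x11c1) plus the OFFSET from objdump -r, and that main.o's .rodata landed at 0x2050 (the address of "harmony os" in the dump above). The names `ENTRIES`, `expected_field` and `stored_field` are hypothetical.

```python
import struct

MAIN = 0x11c1           # address of <main> in ./bin/weharmony
RODATA_MAIN_O = 0x2050  # where main.o's .rodata starts in the linked file

# (OFFSET in main.o, target address, addend, bytes found in the linked file)
ENTRIES = [
    (0x16, RODATA_MAIN_O, -4, "75 0e 00 00"),   # .rodata-0x4  (PC32)
    (0x21, 0x4018, -4, "32 2e 00 00"),          # g_str-0x4    (PC32)
    (0x27, 0x4010, -4, "24 2e 00 00"),          # g_int-0x4    (PC32)
    (0x30, RODATA_MAIN_O, 0xc, "6b 0e 00 00"),  # .rodata+0xc  (PC32)
    (0x3a, 0x1050, -4, "51 fe ff ff"),          # printf-0x4   (PLT32)
    (0x44, 0x1149, -4, "40 ff ff ff"),          # func_int-0x4 (PLT32)
    (0x50, 0x1188, -4, "73 ff ff ff"),          # func_str-0x4 (PLT32)
]


def expected_field(offset, target, addend):
    """target + A - P, with P = MAIN + offset."""
    return target + addend - (MAIN + offset)


def stored_field(hex_bytes):
    """Decode the little-endian signed 32-bit value stored in the file."""
    return struct.unpack("<i", bytes.fromhex(hex_bytes))[0]


for offset, target, addend, raw in ENTRIES:
    assert expected_field(offset, target, addend) == stored_field(raw)
    print(hex(MAIN + offset), raw, "ok")
```

Every entry matches. For printf the target is its PLT entry at 0x1050, while for func_int and func_str, which are defined in the same executable, the disassembly shows the calls going straight to the functions at 0x1149 and 0x1188.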


  • The relocated fields from main.o are no longer 00 but hold actual data, for example:
    char *loc_str = "harmony os";
    11d4:       48 8d 05 75 0e 00 00    lea    0xe75(%rip),%rax        # 2050 <_IO_stdin_used+0x50>
    The address in the comment, # 2050 <_IO_stdin_used+0x50>, is where the string harmony os is stored, at 2050 in .rodata
  • Look at the call to func_str in main()
      1209:       48 8b 45 f8             mov    -0x8(%rbp),%rax
      120d:       48 89 c7                mov    %rax,%rdi
      1210:       e8 73 ff ff ff          callq  1188 <func_str>
    In callq 1188, 1188 is the entry address of func_str
    void func_str(char *str) {
        1188:       f3 0f 1e fa             endbr64
  • Look at the link addresses 0x4018 and 0x4010 that correspond to the global variables g_str and g_int
    1215:       48 8b 15 fc 2d 00 00    mov    0x2dfc(%rip),%rdx        # 4018 <g_str>
    121c:       8b 05 ee 2d 00 00       mov    0x2dee(%rip),%eax        # 4010 <g_int>
    Their values are provided by the .data section
      4000 00000000 00000000 08400000 00000000  .........@......
      4010 33000000 00000000 08200000 00000000  3........ ......
    The value stored at 0x4010 is 0x33 = 51
  • The printf calls in main() become callq 1050.
      1230:       e8 1b fe ff ff          callq  1050 <printf@plt>
    The content at that address is provided by the .plt.sec section, which disassembles as
      Contents of section .plt.sec:
      1050 f30f1efa f2ff2575 2f00000f 1f440000  ......%u/....D..
      Disassembly of section .plt.sec:
      0000000000001050 <printf@plt>:
          1050:       f3 0f 1e fa             endbr64
          1054:       f2 ff 25 75 2f 00 00    bnd jmpq *0x2f75(%rip)        # 3fd0 <printf@GLIBC_2.2.5>
          105b:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
    Note 3fd0: its content must be provided by the runtime environment, which is the dynamic relocation performed by the loader
  • To sum up, weharmony has completed the static relocation of all the .o files and combined them into a new executable file. Only the dynamic link part is unfinished, because it requires run-time relocation, as follows:

objdump -R ./bin/weharmony

root@5e3abe332c5a:/home/docker/test4harmony/54# objdump -R ./bin/weharmony 

./bin/weharmony:     file format elf64-x86-64

OFFSET           TYPE              VALUE
0000000000003db8 R_X86_64_RELATIVE  *ABS*+0x0000000000001140
0000000000003dc0 R_X86_64_RELATIVE  *ABS*+0x0000000000001100
0000000000004008 R_X86_64_RELATIVE  *ABS*+0x0000000000004008
0000000000004018 R_X86_64_RELATIVE  *ABS*+0x0000000000002008
0000000000003fd8 R_X86_64_GLOB_DAT  _ITM_deregisterTMCloneTable
0000000000003fe0 R_X86_64_GLOB_DAT  __libc_start_main@GLIBC_2.2.5
0000000000003fe8 R_X86_64_GLOB_DAT  __gmon_start__
0000000000003ff0 R_X86_64_GLOB_DAT  _ITM_registerTMCloneTable
0000000000003ff8 R_X86_64_GLOB_DAT  __cxa_finalize@GLIBC_2.2.5
0000000000003fd0 R_X86_64_JUMP_SLOT  printf@GLIBC_2.2.5


  • This is the diplomatic note that weharmony submits to the runtime environment. With it, the program conforms to international standards and can live in the global village
  • The other entries may look unfamiliar. Take a look at 3fd0, whose dynamic relocation type is R_X86_64_JUMP_SLOT: it tells the dynamic loader to find printf in the runtime environment and complete the dynamic relocation
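
The dynamic entries can also be related to the dumps shown earlier. The sketch below decodes the .data row at 0x4010, checks that the slot read by printf@plt is the JUMP_SLOT offset 3fd0, and applies the B+A formula to the four R_X86_64_RELATIVE entries. `BASE` is a hypothetical load address picked only for illustration (the real one is chosen by the loader at run time), and `apply_relative` is a hypothetical helper.

```python
import struct

BASE = 0x555555554000  # hypothetical load address, for illustration only

# .data row at 0x4010, copied from objdump -s ./bin/weharmony
DATA_4010 = bytes.fromhex("33000000000000000820000000000000")
g_int = struct.unpack_from("<i", DATA_4010, 0)[0]       # 4 bytes at 0x4010
g_str_link = struct.unpack_from("<Q", DATA_4010, 8)[0]  # 8 bytes at 0x4018

# printf@plt: jmpq *0x2f75(%rip); the next instruction is at 0x105b
plt_slot = 0x105b + 0x2f75

# (OFFSET, addend) of the R_X86_64_RELATIVE entries from objdump -R
RELATIVE = [(0x3db8, 0x1140), (0x3dc0, 0x1100), (0x4008, 0x4008), (0x4018, 0x2008)]


def apply_relative(base, entries):
    """Map each run-time slot address to the value B + A stored there."""
    return {base + offset: base + addend for offset, addend in entries}


patched = apply_relative(BASE, RELATIVE)
print(g_int, hex(g_str_link), hex(plt_slot))  # 51 0x2008 0x3fd0
print(hex(patched[BASE + 0x4018]))
```

The entry at 0x4018 is g_str: its link-time value 0x2008 points at "hello world" in .rodata, and at run time the stored pointer becomes the load base plus 0x2008.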

