1, Definition of symbols
The essence of linking is to splice several object files together like the pieces of a jigsaw puzzle. For the pieces to fit, the object files must follow a common rule.
The rule concerns references to addresses across object files, that is, references to the addresses of functions or variables.
For example, if object file B uses the function foo that lives in object file A, then object file A defines foo and object file B references foo.
In linking, functions and variables are collectively called symbols, and function names and variable names are symbol names.
Every object file has a symbol table that records each symbol's name and value. For functions and variables, the symbol value is the symbol's address.
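The define/reference relationship can be sketched in C. For compactness, the contents of two hypothetical source files, `a.c` and `b.c`, are shown in one block; the file names and the bodies of `foo` and `call_foo` are assumptions made for illustration. If the two parts were compiled separately, `b.o` would list `foo` as an undefined symbol and `a.o` would define it.

```c
/* ---- what b.c would contain: a declaration only, i.e. a reference ---- */
int foo(int x);              /* no body here: in b.o, foo is an undefined symbol */

int call_foo(int x)
{
    return foo(x) + 1;       /* the linker must resolve this reference to foo */
}

/* ---- what a.c would contain: the definition ---- */
int foo(int x)
{
    return x * 2;            /* a.o's symbol table records foo and its address */
}
```

Because both halves sit in one translation unit here, the compiler resolves the call itself. The linker only gets involved when the declaration and the definition end up in different object files.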
Taking SimpleSection.c as an example, the symbols in an object file fall into the following categories:
```c
int printf( const char* format, ... );

int global_init_var = 84;
int global_uninite_var;

void func1( int i )
{
    printf("%d\n", i);
}

int main()
{
    static int static_var = 85;
    static int static_var2;

    int a = 1;
    int b;

    func1(static_var + static_var2 + a + b);

    return a;
}
```
- Global symbols defined in this object file: can be referenced by other object files, such as global_init_var, global_uninite_var, func1, and main
- Global symbols referenced in this object file: defined in another object file, such as printf
- Local symbols: visible only inside this object file, such as static_var and static_var2. The automatic variables a and b live on the stack and do not appear in the symbol table at all
- Section names: generated by the compiler; the symbol value is the start address of the section
Use the nm command to list the symbols of the object file. The result is as follows:
```
[gongruiyang@localhost ws]$ nm SimpleSection.o
0000000000000000 T func1
0000000000000000 D global_init_var
0000000000000004 C global_uninite_var
0000000000000022 T main
                 U printf
0000000000000004 d static_var.1801
0000000000000000 b static_var2.1802
```
- T: the symbol is in the text (code) section
- D/d: the symbol is in the initialized data section (uppercase for a global symbol, lowercase for a local one)
- C: the symbol is a common symbol, that is, uninitialized data
- U: the symbol is undefined here and must be found in another object file
- b: the symbol is in the bss section (lowercase because it is local)
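The letter codes can be captured in a small lookup function. This is only a sketch covering the letters that appear in the listing above; `nm_type_description` and `nm_is_local` are hypothetical helper names, not part of nm or any library.

```c
#include <stddef.h>

/* Describe the nm type letters that appear in the listing above.
 * Returns NULL for letters this sketch does not cover. */
const char *nm_type_description(char letter)
{
    switch (letter) {
    case 'T': return "global symbol in the text (code) section";
    case 'D': return "global symbol in the initialized data section";
    case 'd': return "local symbol in the initialized data section";
    case 'b': return "local symbol in the bss (uninitialized data) section";
    case 'C': return "common symbol (uninitialized data)";
    case 'U': return "undefined symbol, defined in another object file";
    default:  return NULL;
    }
}

/* nm usually prints a lowercase letter for a local symbol and an
 * uppercase letter for a global one (a few special letters differ). */
int nm_is_local(char letter)
{
    return letter >= 'a' && letter <= 'z';
}
```

For instance, `nm_is_local('d')` is true for static_var.1801, while `nm_is_local('D')` is false for global_init_var.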
2, Symbol structure: Elf32_Sym
The symbol table of an ELF file is a section of the object file named .symtab. Like every section, it is described by an Elf32_Shdr section header. Its contents are an array in which each element is an Elf32_Sym structure (Elf64_Sym in 64-bit files), defined in include\linux\elf.h:
```c
typedef struct elf32_sym{
  Elf32_Word/*unsigned int*/    st_name;
  Elf32_Addr/*unsigned int*/    st_value;
  Elf32_Word/*unsigned int*/    st_size;
  unsigned char                 st_info;
  unsigned char                 st_other;
  Elf32_Half/*unsigned short*/  st_shndx;
} Elf32_Sym;

typedef struct elf64_sym {
  Elf64_Word/*unsigned int*/         st_name;   /* Symbol name, index in string tbl */
  unsigned char                      st_info;   /* Type and binding attributes */
  unsigned char                      st_other;  /* No defined meaning, 0 */
  Elf64_Half/*unsigned short*/       st_shndx;  /* Associated section index */
  Elf64_Addr/*unsigned long long*/   st_value;  /* Value of the symbol */
  Elf64_Xword/*unsigned long long*/  st_size;   /* Associated symbol size */
} Elf64_Sym;
```
- st_name: symbol name. The value is the offset of the name within the string table
- st_value: the value of the symbol, which is often its address. The meaning differs between kinds of symbols
- st_size: symbol size in bytes. For a data object this is the size of its type; for a function it is the size of its code
- st_info: symbol type and binding information. The lower 4 bits hold the symbol type and the upper 4 bits hold the binding information (st_info is a single unsigned char, so it has only 8 bits)
- st_other: documented in the header above as having no defined meaning and filled with 0; current ELF specifications use it for the symbol visibility, where 0 means default visibility
- st_shndx: index of the section that contains the symbol
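The sketch below shows how the array layout and the st_name lookup fit together. It assumes a 64-bit file and uses a stand-in struct, `Sym64`, with fixed-width types in place of Elf64_Sym so that it compiles without the kernel header. `Sym64`, `symbol_count`, and `symbol_name` are hypothetical names chosen for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Stand-in for Elf64_Sym with fixed-width types (same field order). */
typedef struct {
    uint32_t st_name;   /* offset of the name in the string table */
    uint8_t  st_info;   /* type and binding */
    uint8_t  st_other;
    uint16_t st_shndx;  /* section index */
    uint64_t st_value;
    uint64_t st_size;
} Sym64;

/* The fields pack without padding: 4 + 1 + 1 + 2 + 8 + 8 = 24 bytes. */
_Static_assert(sizeof(Sym64) == 24, "unexpected Sym64 layout");

/* Number of entries in a symbol table section of section_size bytes. */
size_t symbol_count(size_t section_size)
{
    return section_size / sizeof(Sym64);
}

/* st_name is a byte offset into the string table, where every name is
 * NUL-terminated; offset 0 holds the empty name. */
const char *symbol_name(const char *strtab, const Sym64 *sym)
{
    return strtab + sym->st_name;
}
```

With a string table whose bytes are `"\0main\0func1"`, a symbol with `st_name == 6` resolves to `func1`, and a 384-byte symbol table section holds 16 entries.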
st_info
Symbol binding information
```c
#define STB_LOCAL  0
#define STB_GLOBAL 1
#define STB_WEAK   2
```
Constant name | value | meaning |
---|---|---|
STB_LOCAL | 0 | Local symbol, not visible externally |
STB_GLOBAL | 1 | Global symbol, externally visible |
STB_WEAK | 2 | Weak symbol: externally visible like a global symbol, but a global definition of the same name takes precedence |
Symbol type
```c
#define STT_NOTYPE  0
#define STT_OBJECT  1
#define STT_FUNC    2
#define STT_SECTION 3
#define STT_FILE    4
#define STT_COMMON  5
#define STT_TLS     6
```
Constant name | value | meaning |
---|---|---|
STT_NOTYPE | 0 | unknown type |
STT_OBJECT | 1 | Data object type, such as variable, array |
STT_FUNC | 2 | Function or executable code |
STT_SECTION | 3 | The symbol represents a section |
STT_FILE | 4 | The symbol holds the name of the source file |
STT_COMMON | 5 | common data |
STT_TLS | 6 | Thread local data |
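Since st_info is one byte, the binding sits in its upper 4 bits and the type in its lower 4 bits. The following sketch packs and unpacks the two halves. The helper names `st_bind`, `st_type`, and `st_make_info` are hypothetical; system ELF headers typically provide macros such as ELF32_ST_BIND and ELF32_ST_TYPE for the same purpose. The constants are repeated as enums only to keep the block self-contained.

```c
/* Binding and type values, copied from the tables above. */
enum { STB_LOCAL = 0, STB_GLOBAL = 1, STB_WEAK = 2 };
enum { STT_NOTYPE = 0, STT_OBJECT = 1, STT_FUNC = 2, STT_SECTION = 3, STT_FILE = 4 };

/* Upper 4 bits of st_info: binding. */
unsigned st_bind(unsigned char info)
{
    return info >> 4;
}

/* Lower 4 bits of st_info: type. */
unsigned st_type(unsigned char info)
{
    return info & 0x0f;
}

/* Pack a binding and a type into one st_info byte. */
unsigned char st_make_info(unsigned bind, unsigned type)
{
    return (unsigned char)((bind << 4) | (type & 0x0f));
}
```

A global function such as main therefore has st_info equal to 0x12 (binding 1, type 2), and a local data object such as a static variable has 0x01.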
st_shndx
Section that contains the symbol
```c
/* special section indexes */
#define SHN_UNDEF     0
#define SHN_LORESERVE 0xff00
#define SHN_LOPROC    0xff00
#define SHN_HIPROC    0xff1f
#define SHN_ABS       0xfff1
#define SHN_COMMON    0xfff2
#define SHN_HIRESERVE 0xffff
```
constant | value | meaning |
---|---|---|
SHN_UNDEF | 0 | The symbol is undefined: it is referenced in this object file but defined in another one |
SHN_LORESERVE | 0xff00 | The lower bound of the reserved index number range |
SHN_LOPROC | 0xff00 | The lower limit of the index number range reserved for a specific processor custom section |
SHN_HIPROC | 0xff1f | The upper limit of the index number range reserved for a processor specific custom section |
SHN_ABS | 0xfff1 | Indicates that the symbol contains an absolute value. For example, the symbol representing the file name belongs to this type |
SHN_COMMON | 0xfff2 | Indicates that the symbol is a "COMMON block" type symbol, which is the type of uninitialized global symbols |
SHN_HIRESERVE | 0xffff | The upper limit of the index number range reserved |
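A reader of the symbol table has to separate the special index values from ordinary section indexes. The sketch below does this and produces the abbreviations readelf prints in its Ndx column (UND, ABS, COM). The function name `shndx_text` and the `reserved(...)` fallback text are assumptions made for illustration.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Special section indexes, copied from the table above. */
enum { SHN_UNDEF = 0, SHN_LORESERVE = 0xff00, SHN_ABS = 0xfff1, SHN_COMMON = 0xfff2 };

/* Write a short text for st_shndx into buf and return buf. */
const char *shndx_text(uint16_t shndx, char *buf, size_t buflen)
{
    if (shndx == SHN_UNDEF)
        snprintf(buf, buflen, "UND");          /* defined elsewhere */
    else if (shndx == SHN_ABS)
        snprintf(buf, buflen, "ABS");          /* absolute value */
    else if (shndx == SHN_COMMON)
        snprintf(buf, buflen, "COM");          /* COMMON block */
    else if (shndx >= SHN_LORESERVE)
        snprintf(buf, buflen, "reserved(0x%x)", (unsigned)shndx);
    else
        snprintf(buf, buflen, "%u", (unsigned)shndx);  /* ordinary section index */
    return buf;
}
```

Only values below SHN_LORESERVE are real indexes into the section header table; everything from 0xff00 upward must be treated as a special marker.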
st_value
Each symbol has a corresponding value. If the symbol is the definition of a function or variable, the value of the symbol is the address of the function or variable. More accurately, it can be divided into the following cases:
- In an object file, if the symbol is the definition of a function or variable and is not in a "COMMON block" (that is, st_shndx is not SHN_COMMON), st_value is the offset of the symbol within the section identified by st_shndx
- In an object file, if the symbol is the definition of a function or variable and is in a "COMMON block" (that is, st_shndx is SHN_COMMON), st_value is the alignment requirement of the symbol
- In an executable, st_value is the virtual address of the symbol
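The three cases can be written as a small decision function. This is a sketch that covers only defined symbols, as the list above does; the enum `ValueKind` and the function `st_value_kind` are hypothetical names, and the caller is assumed to know whether the file is an executable.

```c
#include <stdint.h>

enum { SHN_COMMON = 0xfff2 };   /* special index for COMMON block symbols */

typedef enum {
    VALUE_SECTION_OFFSET,   /* object file, ordinary section */
    VALUE_ALIGNMENT,        /* object file, COMMON block */
    VALUE_VIRTUAL_ADDRESS   /* executable */
} ValueKind;

/* Decide how st_value of a defined symbol should be interpreted. */
ValueKind st_value_kind(int is_executable, uint16_t st_shndx)
{
    if (is_executable)
        return VALUE_VIRTUAL_ADDRESS;
    if (st_shndx == SHN_COMMON)
        return VALUE_ALIGNMENT;
    return VALUE_SECTION_OFFSET;
}
```

For an object file, a symbol in section 1 yields `VALUE_SECTION_OFFSET`, while a symbol whose st_shndx is SHN_COMMON yields `VALUE_ALIGNMENT`.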
Taking SimpleSection.o as an example, let us analyze each symbol:
```
[gongruiyang@localhost ws]$ readelf -s SimpleSection.o

Symbol table '.symtab' contains 16 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS SimpleSection.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     6: 0000000000000004     4 OBJECT  LOCAL  DEFAULT    3 static_var.1801
     7: 0000000000000000     4 OBJECT  LOCAL  DEFAULT    4 static_var2.1802
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT    7
     9: 0000000000000000     0 SECTION LOCAL  DEFAULT    8
    10: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
    11: 0000000000000000     4 OBJECT  GLOBAL DEFAULT    3 global_init_var
    12: 0000000000000004     4 OBJECT  GLOBAL DEFAULT  COM global_uninite_var
    13: 0000000000000000    34 FUNC    GLOBAL DEFAULT    1 func1
    14: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND printf
    15: 0000000000000022    51 FUNC    GLOBAL DEFAULT    1 main
```
- Output column description:
The first column Num is the index of the symbol in the symbol table array
The second column Value corresponds to st_value
The third column Size corresponds to st_size
The fourth column Type and the fifth column Bind together correspond to st_info
The sixth column Vis (symbol visibility) is not used in this analysis
The seventh column Ndx corresponds to st_shndx
The eighth column Name corresponds to st_name
- Analysis of each symbol:
Symbol name | Section | Binding information | Symbol type |
---|---|---|---|
main / func1 | 1 (.text) | STB_GLOBAL (global symbol) | STT_FUNC (function or executable code) |
printf | UND (defined in another object file) | STB_GLOBAL (global symbol) | STT_NOTYPE (type not specified) |
global_init_var | 3(.data) | STB_GLOBAL (global symbol) | STT_OBJECT (data object type: variable) |
global_uninite_var | COM (COMMON block) | STB_GLOBAL (global symbol) | STT_OBJECT (data object type: variable) |
static_var.1801 | 3(.data) | STB_LOCAL (local symbol) | STT_OBJECT (data object type: variable) |
static_var2.1802 | 4(.bss) | STB_LOCAL (local symbol) | STT_OBJECT (data object type: variable) |
SimpleSection.c | ABS (absolute value, not relative to any section) | STB_LOCAL (local symbol) | STT_FILE (file name) |