This article answers three questions:
- How are the member variables of a class laid out in memory under the different kinds of inheritance?
- Where are the virtual function table and the virtual function table pointer stored, and where does the table sit in the executable file?
- How does the content of a class's virtual function table change under single inheritance, multiple inheritance and virtual inheritance?
The variables involved are: whether there is inheritance, whether there are virtual functions, whether there is multiple inheritance, and whether there is virtual inheritance.
Preparation
Before exploring the memory layout of a class, let's first review the concept of the virtual function table, the rules of byte alignment, and how to print the memory layout of a class.
View the memory layout of the class
We can use clang++ to view the memory layout of a class:

    # View the object layout. main must contain sizeof(class_t)
    clang++ -Xclang -fdump-record-layouts xxx.cpp
    # View the layout of the virtual function table. main must instantiate an object
    clang++ -Xclang -fdump-vtable-layouts xxx.cpp
    # or
    clang -cc1 -fdump-vtable-layouts -emit-llvm xxx.cpp

Virtual function table
- Each class that declares or inherits virtual functions has its own virtual function table. The table belongs to the class, not to any instantiated object.
- If a class declares a virtual function, then in every object of that class the first 8 bytes [0,7] (assuming a 64-bit machine) store a pointer to the virtual function table, called vtable below.
- Each element in the virtual function table is a function address, pointing to a virtual function in the code segment.
- The vtable pointer is filled in when the object is constructed (which is why a constructor cannot be declared virtual).
- Suppose B inherits from A. If we have A *a = new B() at runtime, a->vtable actually holds the address of the virtual function table of class B.
- How do we get the value of vtable? By reading the first 8 bytes of the object, i.e. *(uint64_t *)&object.

    +---------+                                          +----------------+
    | entity1 |                                          | .text segment  |
    +---------+                                          +----------------+
    | vtable  |-------+                         +------->| Entity::vfunc1 |
    | member1 |       |   +-----------------+   |  +---->| Entity::vfunc2 |
    | member2 |       |   | Entity's vtable |   |  |     | ...            |
    +---------+       |   +-----------------+   |  |     +----------------+
                      +-->| 0 : vfunc_ptr0  |---+  |     | Entity::func1  |
    +---------+       |   | 1 : vfunc_ptr1  |------+     | Entity::func2  |
    | entity2 |       |   | ...             |            | ...            |
    +---------+       |   +-----------------+            +----------------+
    | vtable  |-------+
    | member1 |
    | member2 |
    +---------+

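The points above can be checked on real objects. The sketch below is only an illustration: the classes A and B and the helper `read_vptr` are made up for this example, and it assumes the vtable pointer sits at offset 0, as it does under the Itanium C++ ABI used by clang and GCC on Linux.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct A {
    virtual ~A() = default;
    virtual int id() const { return 1; }
    long payload = 0;
};

struct B : A {
    int id() const override { return 2; }
};

// Copies the first pointer-sized word of a polymorphic object.
// Assumption: the vtable pointer is stored at offset 0 (Itanium C++ ABI).
template <class T>
std::uintptr_t read_vptr(const T& obj) {
    static_assert(sizeof(T) >= sizeof(std::uintptr_t), "object too small");
    std::uintptr_t word = 0;
    std::memcpy(&word, static_cast<const void*>(&obj), sizeof word);
    return word;
}

// Hypothetical helper that checks the bullet points on a few objects.
void check_vptr_sharing() {
    A a1, a2;
    B b;
    const A& b_as_a = b;
    assert(read_vptr(a1) == read_vptr(a2));      // same class, same table
    assert(read_vptr(b_as_a) == read_vptr(b));   // the static type does not matter
    assert(read_vptr(b_as_a) != read_vptr(a1));  // B has its own table
    assert(b_as_a.id() == 2);                    // dispatch goes through B's table
}
```

Calling `check_vptr_sharing()` from main should pass silently on such a platform. Using memcpy instead of a pointer cast avoids strict-aliasing problems when reading the raw bytes.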
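Where is the virtual function table (i.e. Entity's vtable in the figure above) stored?
One intuition is that, like static member variables, it is stored in the .data segment, because both are data shared by all objects of the class. In practice, since the table never changes after loading, compilers targeting ELF usually place it in a read-only data section such as .rodata or .data.rel.ro.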
Byte alignment
Byte alignment rule: each member is placed at the next offset that is a multiple of its own alignment (1, 2, 4 or 8 bytes for the basic types). The alignment of the whole struct is the largest alignment among its members, and sizeof is rounded up to a multiple of it. Smaller members that come later fill the resulting "gaps" where they fit.
The compiler lays out a struct / class in declaration order (scanning from front to back).
Note that the alignment rules differ slightly between compilers and platforms, but in general they are similar. The compiler used in this article is clang/clang++.
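The rule can be written down as a tiny layout simulator. This is a simplified model rather than what clang actually runs: `Field`, `Layout`, `align_up` and `lay_out` are names invented here, and the model ignores base classes, bit-fields, virtual functions and empty structs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Field {
    std::size_t size;   // sizeof the member
    std::size_t align;  // alignof the member
};

struct Layout {
    std::vector<std::size_t> offsets;  // offset of each member
    std::size_t size;                  // sizeof the whole struct
    std::size_t align;                 // alignof the whole struct
};

// Rounds n up to the next multiple of a.
inline std::size_t align_up(std::size_t n, std::size_t a) {
    return (n + a - 1) / a * a;
}

// Places the members in declaration order, as described above.
Layout lay_out(const std::vector<Field>& fields) {
    Layout out{{}, 0, 1};
    std::size_t cursor = 0;
    for (const Field& f : fields) {
        cursor = align_up(cursor, f.align);  // skip padding if needed
        out.offsets.push_back(cursor);
        cursor += f.size;
        out.align = std::max(out.align, f.align);
    }
    out.size = align_up(cursor, out.align);  // tail padding
    return out;
}

// Hypothetical check: char, double, char, int (sizes assumed for x86-64).
void check_layout_model() {
    Layout l = lay_out({{1, 1}, {8, 8}, {1, 1}, {4, 4}});
    assert(l.offsets[0] == 0 && l.offsets[1] == 8);
    assert(l.offsets[2] == 16 && l.offsets[3] == 20);
    assert(l.size == 24 && l.align == 8);
}
```

The model takes sizes and alignments as input instead of using real types, so the arithmetic is the same on every platform.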
Example 1

    struct Entity {
        char c1;
        int val;
    };
    // sizeof(Entity) = 8

- If you replace char c1 with short val0, it is still 8.
- If you replace int val with double d, it is 16.
Example 2

    struct Entity {
        char cval;
        short ival;
        double dval;
    };
    /*
    *** Dumping AST Record Layout
             0 | struct Entity
             0 |   char cval
             2 |   short ival
             8 |   double dval
               | [sizeof=16, dsize=16, align=8,
               |  nvsize=16, nvalign=8]
    */

- If short ival is replaced by int ival, ival starts at offset 4, because an int must begin at a multiple of its own alignment, sizeof(int) = 4.
Example 3

    struct Entity {
        char cval;
        double dval;
        char cval2;
        int ival;
    };
    /*
    *** Dumping AST Record Layout
             0 | struct Entity
             0 |   char cval
             8 |   double dval
            16 |   char cval2
            20 |   int ival
               | [sizeof=24, dsize=24, align=8,
               |  nvsize=24, nvalign=8]
    */

This example illustrates the "fill the gaps" behavior mentioned above. Note that three bytes, at offsets 17, 18 and 19, are left unused between cval2 and ival.
- Inserting up to three one-byte members (for example char) between cval2 and ival does not change sizeof(Entity).
- If we insert a short sval between cval2 and ival, sval is placed at offset 18.
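The offsets quoted in Examples 1 to 3 can also be checked without the clang dump, using sizeof and offsetof. The struct names below are invented so that all variants fit in one file, and the expected numbers assume a typical 64-bit target such as x86-64 Linux.

```cpp
#include <cassert>
#include <cstddef>

struct Example1 { char c1; int val; };
struct Example2 { char cval; short ival; double dval; };
struct Example3 { char cval; double dval; char cval2; int ival; };
// Example 3 with a short inserted between cval2 and ival.
struct Example3Short { char cval; double dval; char cval2; short sval; int ival; };

// Hypothetical helper; the numbers are the ones quoted in the text.
void check_examples() {
    assert(sizeof(Example1) == 8);

    assert(offsetof(Example2, ival) == 2);
    assert(offsetof(Example2, dval) == 8);
    assert(sizeof(Example2) == 16);

    assert(offsetof(Example3, cval2) == 16);
    assert(offsetof(Example3, ival) == 20);
    assert(sizeof(Example3) == 24);

    // The short lands in the gap, so the size does not grow.
    assert(offsetof(Example3Short, sval) == 18);
    assert(offsetof(Example3Short, ival) == 20);
    assert(sizeof(Example3Short) == sizeof(Example3));
}
```

offsetof is only guaranteed to work for standard-layout types, which is why these are plain structs without virtual functions.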
Example 4
What if there were virtual functions?

    class Entity {
        char cval;
        virtual void vfunc() {}
    };
    /*
    *** Dumping AST Record Layout
             0 | class Entity
             0 |   (Entity vtable pointer)
             8 |   char cval
               | [sizeof=16, dsize=9, align=8,
               |  nvsize=9, nvalign=8]
    */

On 64 bit machines, the size of a pointer is 8 bytes, so the compiler will align according to 8 bytes.
Single class
Member variable
First consider the memory layout of member variables without virtual functions.

    class A {
    private:
        short val1;
    public:
        int val2;
        double d;
        static char ch;
        void funcA1() {}
    };

    int main() {
        __attribute__((unused)) int k = sizeof(A);
    }
    // clang++ -Xclang -fdump-record-layouts test.cpp

After compiling with the above command, the output is:

    *** Dumping AST Record Layout
             0 | class A
             0 |   short val1
             4 |   int val2
             8 |   double d
               | [sizeof=16, dsize=16, align=8,
               |  nvsize=16, nvalign=8]

As can be seen from the above output:
- Static members do not occupy memory in the instantiated object (they are stored in the static data area, .data or .bss).
- Member functions do not occupy object memory (their code is stored in the code segment, .text).
- The access levels private and public do not affect the memory layout. The layout depends only on the declaration order (plus any padding required for byte alignment).
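A quick way to confirm these three observations is to compare the class with a plain struct holding the same data members. The sketch below is illustrative only: `PlainData`, `first()` and `check_class_overhead` are names added for the example, and the size of 16 assumes a typical 64-bit target.

```cpp
#include <cassert>

// The same three data members, with nothing else.
struct PlainData {
    short val1;
    int val2;
    double d;
};

class A {
    short val1 = 0;
public:
    int val2 = 0;
    double d = 0.0;
    static char ch;                       // lives outside every object
    void funcA1() {}                      // code only, no per-object storage
    short first() const { return val1; }  // added so val1 is used
};

// Hypothetical helper: statics, member functions and access levels are free.
void check_class_overhead() {
    assert(sizeof(A) == sizeof(PlainData));
    assert(alignof(A) == alignof(PlainData));
    assert(sizeof(A) == 16);  // assumption: 2-byte short, 4-byte int, 8-byte double
}
```

The static member is only declared here. Because nothing odr-uses it, no definition is needed for the example to link.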
Virtual function table

    class A {
    private:
        short val1;
    public:
        int val2;
        double d;
        static char ch;
        void funcA1() {}
        virtual void vfuncA1() {}
        virtual void vfuncA2() {}
    };

    int main() {
        __attribute__((unused)) int k = sizeof(A);
        // __attribute__((unused)) A a;
    }

The output below shows that the pointer to the virtual function table is stored at the very beginning of the object (it takes 4 or 8 bytes, depending on the word length of the machine).
Memory layout:

    clang++ -Xclang -fdump-record-layouts test.cpp

    *** Dumping AST Record Layout
             0 | class A
             0 |   (A vtable pointer)
             8 |   short val1
            12 |   int val2
            16 |   double d
               | [sizeof=24, dsize=24, align=8,
               |  nvsize=24, nvalign=8]

    clang++ -Xclang -fdump-vtable-layouts test.cpp

    Original map
    Vtable for 'A' (4 entries).
       0 | offset_to_top (0)
       1 | A RTTI
           -- (A, 0) vtable address --
       2 | void A::vfuncA1()
       3 | void A::vfuncA2()

    VTable indices for 'A' (2 entries).
       0 | void A::vfuncA1()
       1 | void A::vfuncA2()

- offset_to_top (0): the offset from the location of this vtable pointer back to the top (start address) of the complete object. Here the vtable pointer sits at the head of the object, so the offset is 0. With multiple inheritance a class may contain several vtable pointers, and the later ones have a nonzero offset.
- RTTI: Run-Time Type Information. This slot holds the address of the type_info object for the class, which typeid and dynamic_cast use for runtime type identification.
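The RTTI slot is what makes the following behavior possible. The example uses only standard C++ (typeid and dynamic_cast). The classes and the helpers `is_exactly_B`, `as_B` and `check_rtti` are made up for illustration.

```cpp
#include <cassert>
#include <typeinfo>

struct A {
    virtual ~A() = default;
    virtual void vfuncA1() {}
};

struct B : A {
    void vfuncA1() override {}
};

// typeid on a reference to a polymorphic type reports the dynamic type.
bool is_exactly_B(const A& a) {
    return typeid(a) == typeid(B);
}

// dynamic_cast yields nullptr when the object is not a B.
const B* as_B(const A& a) {
    return dynamic_cast<const B*>(&a);
}

// Hypothetical helper exercising both operators.
void check_rtti() {
    A a;
    B b;
    assert(!is_exactly_B(a));
    assert(is_exactly_B(b));
    assert(as_B(a) == nullptr);
    assert(as_B(b) == &b);
}
```

Both operators need the class to be polymorphic. Without at least one virtual function there is no vtable, so typeid would fall back to the static type.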
Single inheritance
Member variable

    class A {
    public:
        char aval;
        static int sival;
        void funcA1();
    };

    class B : public A {
    public:
        double bval;
        void funcB1();
    };

    class C : public B {
    public:
        int cval;
        void funcC1() {}
    };

Memory layout:

    clang++ -Xclang -fdump-record-layouts test.cpp

    *** Dumping AST Record Layout
             0 | class A
             0 |   char aval
               | [sizeof=1, dsize=1, align=1,
               |  nvsize=1, nvalign=1]

    *** Dumping AST Record Layout
             0 | class B
             0 |   class A (base)
             0 |     char aval
             8 |   double bval
               | [sizeof=16, dsize=16, align=8,
               |  nvsize=16, nvalign=8]

    *** Dumping AST Record Layout
             0 | class C
             0 |   class B (base)
             0 |     class A (base)
             0 |       char aval
             8 |     double bval
            16 |   int cval
               | [sizeof=24, dsize=20, align=8,
               |  nvsize=20, nvalign=8]

It can be seen that in ordinary single inheritance the member variables are arranged from the topmost base class down to the most derived class, following the byte alignment rules described above.
Virtual function table
- A has two virtual functions, vfuncA1 and vfuncA2.
- B overrides vfuncA1 and adds its own virtual function vfuncB.
- C overrides vfuncA1 and adds its own virtual function vfuncC.

    class A {
    public:
        char aval;
        static int sival;
        virtual void vfuncA1() {}
        virtual void vfuncA2() {}
    };

    class B : public A {
    public:
        double bval;
        virtual void vfuncA1() {}
        virtual void vfuncB() {}
    };

    class C : public B {
    public:
        int cval;
        virtual void vfuncA1() {}
        virtual void vfuncC() {}
    };

Member variable layout:

    clang++ -Xclang -fdump-record-layouts test.cpp

    *** Dumping AST Record Layout
             0 | class A
             0 |   (A vtable pointer)
             8 |   char aval
               | [sizeof=16, dsize=9, align=8,
               |  nvsize=9, nvalign=8]

    *** Dumping AST Record Layout
             0 | class B
             0 |   class A (primary base)
             0 |     (A vtable pointer)
             8 |     char aval
            16 |   double bval
               | [sizeof=24, dsize=24, align=8,
               |  nvsize=24, nvalign=8]

    *** Dumping AST Record Layout
             0 | class C
             0 |   class B (primary base)
             0 |     class A (primary base)
             0 |       (A vtable pointer)
             8 |       char aval
            16 |     double bval
            24 |   int cval
               | [sizeof=32, dsize=28, align=8,
               |  nvsize=28, nvalign=8]

The virtual function tables of the three classes are as follows:

    clang++ -Xclang -fdump-vtable-layouts test.cpp

    Original map
     void C::vfuncA1() -> void B::vfuncA1()
     void B::vfuncA1() -> void A::vfuncA1()
    Vtable for 'C' (6 entries).
       0 | offset_to_top (0)
       1 | C RTTI
           -- (A, 0) vtable address --
           -- (B, 0) vtable address --
           -- (C, 0) vtable address --
       2 | void C::vfuncA1()
       3 | void A::vfuncA2()
       4 | void B::vfuncB()
       5 | void C::vfuncC()

    VTable indices for 'C' (2 entries).
       0 | void C::vfuncA1()
       3 | void C::vfuncC()

    Original map
     void C::vfuncA1() -> void B::vfuncA1()
     void B::vfuncA1() -> void A::vfuncA1()
    Vtable for 'B' (5 entries).
       0 | offset_to_top (0)
       1 | B RTTI
           -- (A, 0) vtable address --
           -- (B, 0) vtable address --
       2 | void B::vfuncA1()
       3 | void A::vfuncA2()
       4 | void B::vfuncB()

    VTable indices for 'B' (2 entries).
       0 | void B::vfuncA1()
       2 | void B::vfuncB()

    Original map
     void C::vfuncA1() -> void B::vfuncA1()
     void B::vfuncA1() -> void A::vfuncA1()
    Vtable for 'A' (4 entries).
       0 | offset_to_top (0)
       1 | A RTTI
           -- (A, 0) vtable address --
       2 | void A::vfuncA1()
       3 | void A::vfuncA2()

    VTable indices for 'A' (2 entries).
       0 | void A::vfuncA1()
       1 | void A::vfuncA2()

It can be seen that in single inheritance, the virtual function table of a subclass is constructed through the following steps:
- First copy the virtual function table of the direct parent class.
- If the subclass adds its own virtual functions (for example B::vfuncB, C::vfuncC), append their addresses to the end of the table.
- If the subclass overrides a virtual function of the parent class, replace the original address (i.e. A::vfuncA1) with the new one (e.g. B::vfuncA1, C::vfuncA1).
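The observable effect of these three steps is ordinary virtual dispatch. The sketch below mirrors the A/B/C hierarchy but makes the functions return their names so the result can be checked. `call_slots` and the check function are hypothetical additions for this example.

```cpp
#include <cassert>
#include <string>

struct A {
    virtual ~A() = default;
    virtual std::string vfuncA1() const { return "A::vfuncA1"; }
    virtual std::string vfuncA2() const { return "A::vfuncA2"; }
};

struct B : A {
    std::string vfuncA1() const override { return "B::vfuncA1"; }  // replaces A's slot
    virtual std::string vfuncB() const { return "B::vfuncB"; }     // appended slot
};

struct C : B {
    std::string vfuncA1() const override { return "C::vfuncA1"; }  // replaces B's slot
    virtual std::string vfuncC() const { return "C::vfuncC"; }     // appended slot
};

// Calls the two slots inherited from A through a base reference.
std::string call_slots(const A& a) {
    return a.vfuncA1() + " " + a.vfuncA2();
}

// Hypothetical helper: overridden slots change, copied slots stay.
void check_single_inheritance_dispatch() {
    assert(call_slots(A{}) == "A::vfuncA1 A::vfuncA2");
    assert(call_slots(B{}) == "B::vfuncA1 A::vfuncA2");
    assert(call_slots(C{}) == "C::vfuncA1 A::vfuncA2");
    // No data members were added, so all three hold just one vtable pointer.
    assert(sizeof(C) == sizeof(A));
}
```

The slot for vfuncA2 is copied unchanged at every level, which is why the second half of the string never changes.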
Multiple inheritance
By now you should be familiar with the routine, so let's look directly at the member variables and virtual functions.

    class A {
        char aval;
        virtual void vfuncA1() {}
        virtual void vfuncA2() {}
    };

    class B {
        double bval;
        virtual void vfuncB1() {}
        virtual void vfuncB2() {}
    };

    class C : public A, public B {
        char cval;
        virtual void vfuncC() {}
        virtual void vfuncA1() {}
        virtual void vfuncB1() {}
    };

The memory layout is as follows (note the layout of class C):

    clang++ -Xclang -fdump-record-layouts test.cpp

    *** Dumping AST Record Layout
             0 | class A
             0 |   (A vtable pointer)
             8 |   char aval
               | [sizeof=16, dsize=9, align=8,
               |  nvsize=9, nvalign=8]

    *** Dumping AST Record Layout
             0 | class B
             0 |   (B vtable pointer)
             8 |   double bval
               | [sizeof=16, dsize=16, align=8,
               |  nvsize=16, nvalign=8]

    *** Dumping AST Record Layout
             0 | class C
             0 |   class A (primary base)
             0 |     (A vtable pointer)
             8 |     char aval
            16 |   class B (base)
            16 |     (B vtable pointer)
            24 |     double bval
            32 |   char cval
               | [sizeof=40, dsize=33, align=8,
               |  nvsize=33, nvalign=8]

Note the memory layout of class C:
- It occupies 40 bytes in total and contains 2 vtable pointers.
- Base classes are divided into the primary base and ordinary (non-primary) bases.
In fact:

    +--------+--------+---------------+
    | offset | size   | content       |
    +--------+--------+---------------+
    | 0      | 8      | vtable1       |
    | 8      | 1      | aval          |
    | 9      | 7      | aligned bytes |
    | 16     | 8      | vtable2       |
    | 24     | 8      | bval          |
    | 32     | 1      | cval          |
    | 33     | 7      | aligned bytes |
    +--------+--------+---------------+

In general, in the memory layout of the most derived class, the member variables and vtable pointers under multiple inheritance are arranged as follows:
- The first base class in the inheritance list becomes the primary base.
- The base class subobjects are arranged in the order in which they are listed, following the compiler's byte alignment rules.
- The member variables of the most derived class come last.
The virtual function table is as follows (contents of A and B are omitted):

    clang++ -Xclang -fdump-vtable-layouts test.cpp

    Original map
     void C::vfuncA1() -> void A::vfuncA1()
    Vtable for 'C' (10 entries).
       0 | offset_to_top (0)
       1 | C RTTI
           -- (A, 0) vtable address --
           -- (C, 0) vtable address --
       2 | void C::vfuncA1()
       3 | void A::vfuncA2()
       4 | void C::vfuncC()
       5 | void C::vfuncB1()
       6 | offset_to_top (-16)
       7 | C RTTI
           -- (B, 16) vtable address --
       8 | void C::vfuncB1()
           [this adjustment: -16 non-virtual] method: void B::vfuncB1()
       9 | void B::vfuncB2()

    Thunks for 'void C::vfuncB1()' (1 entry).
       0 | this adjustment: -16 non-virtual

    VTable indices for 'C' (3 entries).
       0 | void C::vfuncA1()
       2 | void C::vfuncC()
       3 | void C::vfuncB1()

As can be seen from the above, the virtual function table of C consists of two parts:
- The first part corresponds to "C inherits A" and is generated by the single inheritance rules above. From A's point of view, C::vfuncB1() is a new virtual function added by C, so this part contains four function addresses.
- The second part corresponds to "C inherits B" and is also generated by the single inheritance rules, except that C::vfuncC() is not appended again because it is already present in the first part.
It can be found that:
- The function address C::vfuncB1 appears twice in the virtual function table of C. The second occurrence is reached through a thunk that adjusts this by -16.
- Although C has two vtable pointers, it still has only one virtual function table (it can also be understood as two tables placed next to each other), and the two vtable pointers point to different positions within it. This depends on the compiler, but it is at least what clang does.
Together with the virtual function table, the memory layout of C is as follows:

                                                             +-----------------------+
                                                             |-2: offset_to_top(0)   |
                                                             |-1: C RTTI             |
    +--------+--------+---------------+                      +-----------------------+
    | offset | size   | content       |                      | class C's vtable      |
    +--------+--------+---------------+                      +-----------------------+
    | 0      | 8      | vtable1       |--------------------->| 0: C::vfuncA1_ptr     |
    | 8      | 1      | aval          |                      | 1: A::vfuncA2_ptr     |
    | 9      | 7      | aligned bytes |                      | 2: C::vfuncC_ptr      |
    | 16     | 8      | vtable2       |------------+         | 3: C::vfuncB1_ptr     |
    | 24     | 8      | bval          |            |         | 4: offset_to_top(-16) |
    | 32     | 1      | cval          |            |         | 5: C RTTI             |
    | 33     | 7      | aligned bytes |            +-------->| 6: C::vfuncB1_ptr     |
    +--------+--------+---------------+                      | 7: B::vfuncB2_ptr     |
                                                             +-----------------------+

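How can we verify this? The program below reads the two vtable pointers directly and calls the slots by hand. It relies on the clang layout shown above (64-bit, second vtable pointer at offset 16), and calling a member function through a plain function pointer without this is undefined behavior that only works here because the functions never touch this.

    #include <cstdint>
    #include <cstdio>
    #include <iostream>
    using namespace std;

    class A {
    public:
        char aval;
        virtual void vfuncA1() { cout << "A::vfuncA1()" << endl; }
        virtual void vfuncA2() { cout << "A::vfuncA2()" << endl; }
    };

    class B {
    public:
        double bval;
        virtual void vfuncB1() { cout << "B::vfuncB1()" << endl; }
        virtual void vfuncB2() { cout << "B::vfuncB2()" << endl; }
    };

    class C : public A, public B {
    public:
        char cval;
        virtual void vfuncC() { cout << "C::vfuncC()" << endl; }
        virtual void vfuncA1() { cout << "C::vfuncA1()" << endl; }
        virtual void vfuncB1() { cout << "C::vfuncB1()" << endl; }
    };

    int main() {
        __attribute__((unused)) int k = sizeof(C);
        C c;
        // First vtable pointer: the first 8 bytes of the object.
        uint64_t *cvtable = (uint64_t *)*(uint64_t *)(&c);
        // Second vtable pointer: at offset 16, the start of the B subobject.
        uint64_t *cvtable2 = (uint64_t *)*(uint64_t *)((uint8_t *)(&c) + 16);
        typedef void (*func_t)(void);

        cout << "---- vtable1 ----" << endl;
        ((func_t)(*(cvtable + 0)))(); // C::vfuncA1()
        ((func_t)(*(cvtable + 1)))(); // A::vfuncA2()
        ((func_t)(*(cvtable + 2)))(); // C::vfuncC()
        ((func_t)(*(cvtable + 3)))(); // C::vfuncB1()

        // The slot holds a signed value, so read it as int64_t before printing.
        printf("offset_to_top = %lld\n", (long long)*(int64_t *)(cvtable2 - 2)); // -16

        cout << "---- vtable2 ----" << endl;
        ((func_t)(*(cvtable2 + 0)))(); // C::vfuncB1(), same as cvtable + 6
        ((func_t)(*(cvtable2 + 1)))(); // B::vfuncB2(), same as cvtable + 7
    }
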
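The same two facts, the offset of 16 and the this adjustment of -16, can also be observed without reading raw vtable slots. In this sketch the classes mirror the ones above, but vfuncB1 is changed to return this so the adjustment becomes visible. The helper names are made up, and the value 16 assumes the x86-64 Itanium ABI layout shown earlier.

```cpp
#include <cassert>
#include <cstddef>

struct A {
    char aval = 0;
    virtual void vfuncA1() {}
    virtual void vfuncA2() {}
};

struct B {
    double bval = 0;
    virtual const void* vfuncB1() const { return this; }
    virtual void vfuncB2() {}
};

struct C : A, B {
    char cval = 0;
    virtual void vfuncC() {}
    void vfuncA1() override {}
    const void* vfuncB1() const override { return this; }  // this is a C* here
};

// Byte distance from the start of a C object to its B subobject.
std::ptrdiff_t offset_of_B_in(const C& c) {
    const B* b = &c;  // the compiler adds the offset during the conversion
    return reinterpret_cast<const char*>(b) - reinterpret_cast<const char*>(&c);
}

// Hypothetical helper showing the pointer adjustment in both directions.
void check_this_adjustment() {
    C c;
    const B* b = &c;
    assert(offset_of_B_in(c) == 16);  // assumption: layout from the dump above
    // Calling through B* enters via the thunk, which moves this back to the C.
    assert(b->vfuncB1() == static_cast<const void*>(&c));
    assert(static_cast<const void*>(b) != static_cast<const void*>(&c));
}
```

Because the call through `b` returns the address of the whole C object rather than the B subobject, the -16 adjustment listed in the thunk table must have been applied.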
Diamond inheritance and virtual inheritance
If we need a diamond-shaped inheritance chain, we should implement it through "virtual inheritance".
Suppose the inheritance chain here is:

         Base
        /    \
       A      B
        \    /
         Child

If you do not use virtual on the inheritance:

    class Base {
    public:
        int value;
    };

    class A : public Base { };

    class B : public Base { };

    class Child : public A, public B { };

    int main() {
        Child child;
        child.value;
    }

Then the access child.value fails at compile time (clang++) with an ambiguity error, because Child contains two Base subobjects and the compiler cannot tell which value is meant.
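To see that there really are two independent copies of Base, the access can be disambiguated with a qualified name. This is a small sketch of the non-virtual case only. The check function is a hypothetical helper, and the size comparison assumes no extra padding, which holds on common ABIs.

```cpp
#include <cassert>

struct Base {
    int value = 0;
};

struct A : Base {};
struct B : Base {};
struct Child : A, B {};  // no virtual: two separate Base subobjects

// Hypothetical helper: each path reaches its own copy of value.
void check_two_base_copies() {
    Child child;
    child.A::value = 1;  // the Base inside A
    child.B::value = 2;  // the Base inside B
    assert(child.A::value == 1);
    assert(child.B::value == 2);
    assert(sizeof(Child) == 2 * sizeof(Base));
}
```

Qualifying every access works, but it leaks the inheritance structure into the calling code, which is the problem virtual inheritance removes.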
Single virtual inheritance

    class Base {
        char baseval;
        virtual void vfuncBase1() {}
        virtual void vfuncBase2() {}
    };

    class A : virtual public Base {
        double aval;
        virtual void vfuncBase1() {}
        virtual void vfuncA() {}
    };

    class B : virtual public Base {
        double bval;
        virtual void vfuncBase2() {}
        virtual void vfuncB() {}
    };

Take A as an example. Member variable layout:

    clang++ -Xclang -fdump-record-layouts diamond2.cpp

    *** Dumping AST Record Layout
             0 | class A
             0 |   (A vtable pointer)
             8 |   double aval
            16 |   class Base (virtual base)
            16 |     (Base vtable pointer)
            24 |     char baseval
               | [sizeof=32, dsize=25, align=8,
               |  nvsize=16, nvalign=8]

Unlike the ordinary single inheritance above, virtual inheritance produces two vtable pointers here, and the virtual base (i.e. Base) is placed at the end of the object.
The contents of the virtual function table are as follows:

    clang++ -Xclang -fdump-vtable-layouts diamond2.cpp

    Original map
    Vtable for 'A' (11 entries).
       0 | vbase_offset (16)
       1 | offset_to_top (0)
       2 | A RTTI
           -- (A, 0) vtable address --
       3 | void A::vfuncBase1()
       4 | void A::vfuncA()
       5 | vcall_offset (0)
       6 | vcall_offset (-16)
       7 | offset_to_top (-16)
       8 | A RTTI
           -- (Base, 16) vtable address --
       9 | void A::vfuncBase1()
           [this adjustment: 0 non-virtual, -24 vcall offset offset] method: void Base::vfuncBase1()
      10 | void Base::vfuncBase2()

    Virtual base offset offsets for 'A' (1 entry).
       Base | -24

    Thunks for 'void A::vfuncBase1()' (1 entry).
       0 | this adjustment: 0 non-virtual, -24 vcall offset offset

    VTable indices for 'A' (2 entries).
       0 | void A::vfuncBase1()
       1 | void A::vfuncA()

Simplified:

    A vtable:                  B vtable:
    - A::vfuncBase1()          - B::vfuncBase2()
    - A::vfuncA()              - B::vfuncB()
    - A::vfuncBase1()          - Base::vfuncBase1()
    - Base::vfuncBase2()       - B::vfuncBase2()

As can be seen from the above:
- The first part, entries 3-4, is built by the rules for A as a "single class" (A's own version of vfuncBase1 plus its new virtual function vfuncA).
- The second part, entries 9-10, is built by the rules for A singly inheriting Base.
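The vbase_offset (16) entry can be cross-checked by converting an A to its virtual base and measuring the pointer difference. The classes below repeat the ones from this section as structs. `base_offset_in` and the check function are hypothetical helpers, and the numbers 16 and 32 assume the x86-64 Itanium ABI layout from the dump.

```cpp
#include <cassert>
#include <cstddef>

struct Base {
    char baseval = 0;
    virtual void vfuncBase1() {}
    virtual void vfuncBase2() {}
};

struct A : virtual Base {
    double aval = 0;
    void vfuncBase1() override {}
    virtual void vfuncA() {}
};

// Byte distance from an A subobject to its virtual base.
// For an arbitrary A& the offset is not fixed at compile time,
// so the conversion looks it up through the vtable (vbase_offset).
std::ptrdiff_t base_offset_in(A& a) {
    Base* base = &a;
    return reinterpret_cast<char*>(base) - reinterpret_cast<char*>(&a);
}

// Hypothetical helper comparing against the numbers in the dumps.
void check_virtual_base_offset() {
    A a;
    assert(base_offset_in(a) == 16);  // matches vbase_offset (16)
    assert(sizeof(A) == 32);          // matches sizeof=32 in the record layout
}
```

The same function would return a different value for the A subobject of a more derived class, which is why the offset is stored in the vtable instead of being hard-coded.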
Member variables under diamond inheritance

    class Child : public A, public B {
        char childval;
        virtual void vfuncC() {}
        virtual void vfuncB() {}
        virtual void vfuncA() {}
    };

The memory layout of Child member variables is as follows:

    clang++ -Xclang -fdump-record-layouts diamond.cpp

    *** Dumping AST Record Layout
             0 | class A
             0 |   (A vtable pointer)
             8 |   double aval
            16 |   class Base (virtual base)
            16 |     char baseval
               | [sizeof=24, dsize=17, align=8,
               |  nvsize=16, nvalign=8]

    *** Dumping AST Record Layout
             0 | class B
             0 |   (B vtable pointer)
             8 |   double bval
            16 |   class Base (virtual base)
            16 |     char baseval
               | [sizeof=24, dsize=17, align=8,
               |  nvsize=16, nvalign=8]

    *** Dumping AST Record Layout
             0 | class Child
             0 |   class A (primary base)
             0 |     (A vtable pointer)
             8 |     double aval
            16 |   class B (base)
            16 |     (B vtable pointer)
            24 |     double bval
            32 |   char childval
            33 |   class Base (virtual base)
            33 |     char baseval
               | [sizeof=40, dsize=34, align=8,
               |  nvsize=33, nvalign=8]

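In Child:
- The member variables and vtable pointers of A and B are arranged as in multiple inheritance.
- Child places the content of Base (the virtually inherited parent) last, after Child's own members, and keeps only one copy of the Base data. This is the purpose of virtual inheritance.
Note that in the dump above the Base subobject contains only baseval and no vtable pointer, so it appears to have been produced from a variant of Base without virtual functions. With the Base from the "Single virtual inheritance" section, Base also carries its own vtable pointer and must be 8-byte aligned, which is consistent with the offset 40 reported by the vtable dump in the next section.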
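That only one copy of Base exists can be confirmed by reaching it through both A and B and comparing the addresses. This sketch uses a simplified hierarchy with a public value member. The names `base_offset_from` and `check_shared_base` are invented, and the exact offsets are left unspecified because they depend on the ABI.

```cpp
#include <cassert>
#include <cstddef>

struct Base {
    int value = 0;
    virtual ~Base() = default;
};

struct A : virtual Base { double aval = 0; };
struct B : virtual Base { double bval = 0; };
struct Child : A, B { char childval = 0; };

// Byte distance from an A subobject to the shared Base.
std::ptrdiff_t base_offset_from(A& a) {
    Base* base = &a;
    return reinterpret_cast<char*>(base) - reinterpret_cast<char*>(&a);
}

// Hypothetical helper: both paths lead to the same Base subobject.
void check_shared_base() {
    Child child;
    child.value = 7;  // unambiguous now
    Base* via_a = static_cast<A*>(&child);
    Base* via_b = static_cast<B*>(&child);
    assert(via_a == via_b);
    assert(child.A::value == 7 && child.B::value == 7);

    // Base sits further away inside a Child than inside a stand-alone A,
    // because B and childval are placed in between.
    A lone;
    assert(base_offset_from(lone) != base_offset_from(child));
}
```

The last assertion shows why the distance to Base has to be looked up at run time: the same A subobject finds its virtual base at different offsets depending on the complete object it lives in.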
Virtual function table under diamond inheritance
The virtual function tables of A and B are as described in the section "Single virtual inheritance". The virtual function table of Child is as follows:

    clang++ -Xclang -fdump-vtable-layouts diamond.cpp

    Original map
     void Child::vfuncA() -> void A::vfuncA()
    Vtable for 'Child' (18 entries).
       0 | vbase_offset (40)
       1 | offset_to_top (0)
       2 | Child RTTI
           -- (A, 0) vtable address --
           -- (Child, 0) vtable address --
       3 | void A::vfuncBase1()
       4 | void Child::vfuncA()
       5 | void Child::vfuncC()
       6 | void Child::vfuncB()
       7 | vbase_offset (24)
       8 | offset_to_top (-16)
       9 | Child RTTI
           -- (B, 16) vtable address --
      10 | void B::vfuncBase2()
      11 | void Child::vfuncB()
           [this adjustment: -16 non-virtual] method: void B::vfuncB()
      12 | vcall_offset (-24)
      13 | vcall_offset (-40)
      14 | offset_to_top (-40)
      15 | Child RTTI
           -- (Base, 40) vtable address --
      16 | void A::vfuncBase1()
           [this adjustment: 0 non-virtual, -24 vcall offset offset] method: void Base::vfuncBase1()
      17 | void B::vfuncBase2()
           [this adjustment: 0 non-virtual, -32 vcall offset offset] method: void Base::vfuncBase2()

    Virtual base offset offsets for 'Child' (1 entry).
       Base | -24

    Thunks for 'void Child::vfuncB()' (1 entry).
       0 | this adjustment: -16 non-virtual

    VTable indices for 'Child' (3 entries).
       1 | void Child::vfuncA()
       2 | void Child::vfuncC()
       3 | void Child::vfuncB()

Review the virtual function tables of A and B:

    A vtable:                  B vtable:
    - A::vfuncBase1()          - B::vfuncBase2()
    - A::vfuncA()              - B::vfuncB()
    - A::vfuncBase1()          - Base::vfuncBase1()
    - Base::vfuncBase2()       - B::vfuncBase2()

It can be seen that Child's virtual function table has two parts:
- The first part, entries 3-6 and 10-11, follows the rules for Child multiply inheriting A and B, i.e. it merges A vtable[0-1] and B vtable[0-1] from the simplified tables above.
- The second part, entries 16-17, merges A vtable[2-3] and B vtable[2-3].
Summary
Scenario | Member variables | Virtual function table |
---|---|---|
Single class | Arranged in declaration order, following the byte alignment rules | The vtable pointer is stored in the first 8 bytes of the object |
Single inheritance | 1. Arranged level by level in inheritance order, following the byte alignment rules 2. There is only one vtable pointer | 1. Copy the virtual function table of the direct parent class 2. If the subclass adds its own virtual functions, append their addresses to the table 3. If a parent virtual function is overridden, the new address replaces the original one |
Multiple inheritance | 1. Multiple vtable pointers 2. The <vtable, members> of each parent class are arranged in the order of inheritance | Refer to the section "Multiple inheritance". |
Single virtual inheritance | Unlike ordinary single inheritance, there are multiple vtable pointers | Two parts: the first follows the "single class" rule and the second follows the "single inheritance" rule. |
Diamond inheritance | 1. Similar to multiple inheritance 2. The data of the virtual base is appended at the end | Refer to the section on the virtual function table under diamond inheritance. |