C++ object model, Chapter 3: data semantics

Posted by compsci on Sat, 20 Nov 2021 14:23:04 +0100

Chapter III data semantics

Data member binding timing

Summary:
The compiler starts to parse the body of the member function myfunc only after the whole of class A has been defined. Only then can it see the myvar declared in class A and resolve each occurrence of myvar appropriately: inside the member function the name resolves to the myvar in the class, and inside the global function it resolves to the global myvar.

To refer to the global variable from inside a member function, put two colons before the variable name (::myvar).
Member function parameter types are handled differently: the type name mytype is resolved at the point where the compiler first encounters it. If the typedef in the class comes after the member function declaration, the compiler has only seen the global typedef string mytype at that point; it has not yet seen the typedef int mytype in the class.
Conclusion: so that the mytype in the class is visible as early as possible, the typedef must be moved to the very beginning of the class definition.
When a later member function then uses mytype, the compiler applies the most recently seen declaration, which is the one in the class.

#include <iostream>
#include <string>
#include <time.h>
#include <stdio.h>
using namespace std;

//string myvar = "I Love China!"; // Global variable of type string
typedef string mytype;

//Define a class
class A
{
	typedef int mytype;

public:
	//int myfunc();
	///*{
	//	return myvar;
	//}*/
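	//If the typedef above came after this declaration, mytype here would be the global string:
	//void myfunc(mytype tmpvalue) //mytype = string
	//{
	//	m_value = tmpvalue; // Error: assigns a string to an int member
	//}

	void myfunc(mytype tmpvalue); //int: the typedef at the top of the class is already visible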

private:
	//int myvar; // Same as global variable name, but different type.		
	mytype m_value; //int	
};

void A::myfunc(mytype tmpvalue) //int
{
	m_value = tmpvalue;
}

void myfunc(mytype tmpvalue) //global function: mytype is the global typedef, i.e. string
{
	string mvalue = tmpvalue;
}


//int A::myfunc() // member function
//{
//	cout << myvar << endl;  // myvar is the member defined within the class
//	cout << ::myvar.c_str() << endl;  // ::myvar is the global variable
//	return myvar; // Here it is also A::myvar
//}

//int myfunc()
//{
//	return myvar; // myvar here is the global one, which is of string type, so an error is reported here;
//}

int main()
{		
	//The compiler parses the body of the member function myfunc only after the whole of class A has been defined;
	   //therefore the lookup and binding of myvar happens after the class definition is complete.

	//Summary:
	//Inside the member function, myvar resolves to the myvar in the class; inside the global function, it resolves to the global myvar;
	/*A aobj;
	aobj.myvar = 15;
	aobj.myfunc();*/

	//Member function parameter types: mytype is resolved where the compiler first encounters it. If the typedef in the class
	 //came after the declaration, the compiler would only have seen typedef string mytype, not the typedef int mytype in the class;
	//Conclusion: so that the mytype in the class is visible as early as possible, the typedef must be moved to the very beginning of the class.
	  //A later member function that uses mytype then gets the most recently seen declaration, which is the one in the class.


	return 0;
}

Process memory space

Linux virtual address space layout and process stack and thread stack summary

Different kinds of data are stored in different regions of memory and at different times
When an executable file is run, the operating system loads it into memory; from then on the process has a virtual address space (memory space)
Linux has the nm command, which can list the addresses of the global variables in an executable file;

Static member variables of a class are stored in the data segment

Data member layout

Observe the address rule of member variables
Boundary adjustment, byte alignment
Printing of offset values of member variables

Static member variables do not occupy space in the class object; they live in the data segment

Ordinary member variables are stored in the order in which they are defined in the class, from top to bottom;
A member variable that appears later has a higher address in memory (it is reached by adding a larger offset to the object's address);
The number of public, private and protected sections in the class definition does not affect the sizeof of the class object;
Boundary adjustment, byte alignment
Several factors can make the member variables non-contiguous in memory; this is boundary adjustment (byte alignment). Its purpose is to improve efficiency, and the compiler performs it automatically;
Adjustment: the compiler fills in some bytes between members so that the sizeof of the class object becomes an integer multiple of the alignment (typically 4 or 8);

To make the byte alignment the same everywhere, the concept of one-byte alignment (that is, no alignment padding) is introduced;
When a class has virtual functions, the compiler adds a virtual function table pointer (vptr) to the class as a hidden internal data member.

#pragma pack(1) //1-byte alignment (no padding)
#pragma pack() //cancel the specified alignment and restore the default alignment;

Printing of offset values of member variables
The offset value of a member variable is the distance between the address of the member variable and the start address of the object;

Many computer systems place restrictions on the legal addresses of basic data types, requiring the address of an object of a given type to be a multiple of some value K (usually 2, 4, or 8). This alignment restriction simplifies the design of the hardware that forms the interface between the processor and the memory system.
For example, suppose a processor always fetches 8 bytes from memory, and the address must be a multiple of 8.
If we can ensure that the addresses of all data of type double are aligned to a multiple of 8, then a value can be read or written with a single memory operation. Otherwise, we may need to perform two memory accesses, because the object may be split across two 8-byte memory blocks.

Offset value

	//Member variable pointer
	int MYACLS::*mypoint = &MYACLS::m_n; //Or print &MYACLS::m_n directly
	printf("pmyobj->m_n Offset value = %d\n", mypoint); //%d relies on compiler-specific behavior and is not portable

Data member access

1: Access to static member variables

	//A static member variable can be regarded as a global variable that is only visible within the class scope; refer to it as ClassName::staticMemberName
	//A static member variable has only one instance, which is stored in the data segment of the executable file;

	MYACLS myobj;
	MYACLS *pmyobj = new MYACLS();

	cout << MYACLS::m_si << endl; // The canonical form
	cout << myobj.m_si << endl;     // Also accepted by the syntax; the object only serves to name the class
	cout << pmyobj->m_si << endl;

2: Access to non-static (ordinary) member variables. They are stored in the class object and are accessed through the class object (or a pointer to it)

	pmyobj->myfunc();
	//From the compiler's point of view: MYACLS::myfunc(pmyobj)
	MYACLS::myfunc(MYACLS *const this)
	{
		this->m_i = 5;
		this->m_j = 5;
	}
	//To access an ordinary member, the compiler adds the member's offset value to the start address of the class object:
	//(char *)&myobj + 4 == (char *)&myobj.m_j

Data member layout under single inheritance

//(1) The content of a subclass object is the sum of its own members and the members of its parent class;
//(2) Judging by the offset values, the parent class members come first, followed by the subclass members.
FAC facobj;
MYACLS myaclobj; //The subclass object actually contains a subobject of the parent class

class Base //sizeof = 8 bytes; because of byte alignment, the last byte is padding and cannot be used
{
public:
	int m_i1;
	char m_c1;
	char m_c2;
	char m_c3;
};

The introduction of inheritance may lead to an additional increase in memory space.

class Base1
{
public:
	int m_i1;
	char m_c1;
};
class Base2 :public Base1
{
public:
	char m_c2;
};
class Base3 :public Base2
{
public:
	char m_c3;
};

Memory layout on Windows VS

Base1, Base2, Base3

Memory layout on linux g++ 5.4

//The data layout differs between Linux and Windows. Note:
//a) Compilers are constantly being improved and optimized;
//b) The implementation details of compilers from different vendors also differ;
//c) Be careful with raw memory copies;

	Base2 mybase2obj;
	Base3 mybase3obj;
	//Do not use memcpy to copy the contents of a Base2 object directly into a Base3 object;

Data member layout of virtual functions under single class and single inheritance

Introducing virtual functions into a class has additional costs:

(1) At compile time, the compiler generates a virtual function table. Refer to Chapter 3, section 5
(2) A virtual function table pointer (vptr) is added to each object and points to the virtual function table
(3) The constructor is added or extended with code that assigns the vptr, so that it points to the virtual function table (this happens at program run time)
(4) With multiple inheritance, for example two parent classes that each have virtual functions, each parent class has its own vptr, and the child class inherits both vptrs.
If the subclass has additional virtual functions of its own, the subclass shares a vptr with the first base class (Chapter 3, section 4);
Data member layout of a single class with virtual functions

Data member layout under single inheritance when the parent class has virtual functions

Data member layout under single inheritance when the parent class has no virtual functions

	cout << sizeof(MYACLS) << endl;
	printf("MYACLS::m_bi = %d\n", &MYACLS::m_bi);
	printf("MYACLS::m_i = %d\n", &MYACLS::m_i);
	printf("MYACLS::m_j = %d\n", &MYACLS::m_j);
	
	MYACLS myobj;
	myobj.m_i = 3;
	myobj.m_j = 6;
	myobj.m_bi = 9;

The printed offsets correspond to the diagram on the left; the layout actually observed at run time, by inspecting the object's address space, is the one on the right

A deeper look at multiple-inheritance data layout and this-pointer adjustment

	//1:  Data member layout under single inheritance; background on this-pointer offsets
	//Section 3 of Chapter 1: this pointer adjustment

	//2:  Data member layout under multiple inheritance when the parent classes have virtual functions
	//(1) Printing the this pointer shows that access to Base1 members needs no adjustment, while access to Base2 members requires the this pointer to be offset (skipped forward) by 8 bytes;
	//(2) The printed offset values of m_bi and m_b2i are both 4;
	//(3) The this pointer plus the offset value gives access to the corresponding member variable, for example m_b2i = this pointer + offset value

	//The conclusion from this study:
	//To access a member of a class object, the member is located by two factors: the this pointer (which the compiler adjusts automatically) and the offset value of the member;
	   //The adjustment of the this pointer is carried out by the compiler;

this offset + member offset

	Base2 *pbase2 = &myobj; //this-pointer adjustment: pbase2 actually points 8 bytes past the start of myobj
	                            //&myobj = 0x0093fad0. After this statement, pbase2 = 0x0093fad8
	//From the compiler's point of view, the line above is rewritten as:
	//Base2 *pbase2 = (Base2 *)(((char *)&myobj) + sizeof(Base1));

	Base1 *pbase1 = &myobj; //No offset is needed here

	Base2 *pbase2 = new MYACLS(); //A parent class pointer to a newly allocated subclass object. Here new allocates 24 bytes

	MYACLS *psubobj = (MYACLS *)pbase2; //8 bytes lower than the address in pbase2 (the offset is undone)

	//delete pbase2; // An exception is reported. This shows that the address held in pbase2 is not the start address of the allocation but the adjusted (offset) address.
	          //The start address actually allocated is the one held in psubobj
	delete psubobj;

More complex inheritance layout

The virtual base class problem

class Grand //Grandpa class
{
public:
	int m_grand;
};
class A1 : virtual public Grand
{
public:
	int m_a1;
};

class A2 : virtual public Grand
{
public:
	int m_a2;
};

class C1 :public A1, public A2
{
public:
	int m_c1;
};

The virtual base class (virtual inheritance / virtual derivation) problem
Traditional multiple inheritance causes a space problem, an efficiency problem, and an ambiguity problem;

With a virtual base class, the Grand class is inherited only once;

	//2:  On virtual base classes
	//Two concepts: (1) the virtual base class table (vbtable); (2) the virtual base class table pointer (vbptr)
	//If Grand is empty, sizeof(Grand) == 1, which is easy to understand;
	//With virtual inheritance, the compiler inserts a virtual base class table pointer into A1 and A2; it behaves like an extra member variable
	//A1 and A2 each occupy 4 bytes because of the virtual base class table pointer
	A1 a1;
	A2 a2;
	//The virtual base class table pointer points to the virtual base class table (to be discussed later).

A1 and A2 each occupy 4 bytes because of the virtual base class table pointer
C1 contains only one Grand subobject; it inherits one virtual base class table pointer from A1 and one from A2 (vbptr1 and vbptr2 below)

Object layout

Content analysis of virtual base class table in two-tier structure -- for VS 2017

//1: Analysis of bytes 5-8 of the virtual base class table
//The virtual base class table is generally 8 bytes, organized in 4-byte entries. Each additional virtual base class adds 4 more bytes to the table
//Because the classes have a virtual base class, the compiler adds a default constructor to A1 and A2 and inserts code into it
//that assigns a value to the vbptr (virtual base class table pointer).

The virtual base class table stores offsets relative to the virtual base class table pointer
The address of the "virtual base class table pointer" member variable + this offset equals the start address of the virtual base class subobject. By skipping forward by this offset, we can access the virtual base class subobject;

offset (from the virtual base class table) + address of the virtual base class table pointer + offset of the member within the virtual base class = address of the accessed member

a1obj.m_grand = 2; //m_grand is a member of the virtual base class

ptr is the address of the virtual base class table pointer
Step 1: load the address of the virtual base class table pointer
Step 2: load the offset from the virtual base class table (at address eax + 4). The table here is 8 bytes, of which only 4 bytes are used
Step 3: perform the assignment to the address (offset from the virtual base class table + address of the virtual base class table pointer + offset of the class member (0))

The virtual base class table is consulted only when virtual base class members are processed (for example, assigned to): the offset is read from the table and takes part in the address calculation

// project100.cpp: this file contains the "main" function. Program execution will begin and end here.
//

#include "pch.h"
#include <iostream>
#include <time.h>
#include <stdio.h>
#include <vector>

using namespace std;

class Grand //Grandpa class
{
public:
	int m_grand;
};

class Grand2 //Grandpa class
{
public:
	int m_grand2;
	//int m_grand2_1;
};

class A1 : virtual public Grand,virtual public Grand2
{
public:
	int m_a1;
};

class A2 : virtual public Grand//, virtual public Grand2
{
public:
	int m_a2;
};

class C1 :public A1, public A2
{
public:
	int m_c1;
};

int main()
{	
	//1:  Analysis of bytes 5-8 of the virtual base class table
	//The virtual base class table is generally 8 bytes, organized in 4-byte entries. Each additional virtual base class adds 4 more bytes to the table
	//Because the classes have a virtual base class, the compiler adds a default constructor to A1 and A2 and inserts code into it
	   //that assigns a value to the vbptr (virtual base class table pointer).

	


	cout << sizeof(Grand) << endl;
	cout << sizeof(A1) << endl;
	cout << sizeof(A2) << endl;
	cout << sizeof(C1) << endl;

	A1 a1obj;
	a1obj.m_grand = 2;
	a1obj.m_grand2 = 6;
	//a1obj.m_grand2_1 = 7;
	a1obj.m_a1 = 5;

	//The address of the "virtual base class table pointer" member variable + this offset equals the start address of the virtual base class subobject. By skipping forward by this offset, we can access the virtual base class subobject;

	//2:  Continue to observe other forms of inheritance
	//a) With two virtual base classes the table has three entries; the entries at +4 and +8 hold the offsets that are read when assigning to virtual base class members
	//b) The offsets in the virtual base class table are stored in inheritance order;
	//c) Virtual base class subobjects are always placed at the end of the object;

	//3:  Analysis of bytes 1-4 of the virtual base class table
	//They hold the offset between the virtual base class table pointer member variable and the start of the A1 object, that is, the address of the virtual base class table pointer - the start address of the A1 object

	//Conclusion: the virtual base class table is consulted only when virtual base class members are processed (for example, assigned to); the offset is read from it and takes part in the address calculation;


	return 0;
}

Content analysis of virtual base class table in three-tier structure

The virtual base class table is generated during compilation

Accessing m_grand through a C1 (grandchild) object: only vbptr1 is used, vbptr2 is not

Accessing m_grand through an A2 pointer: only vbptr2 is used, vbptr1 is not

	A2 *pa2 = &c1obj;
	pa2->m_grand = 8;

Load the pa2 pointer (the address of the virtual base class table pointer)
Load the address of the virtual base class table
Read the offset stored in the virtual base class table
Assign to the member variable m_grand (the member variable address is eax + edx, and the value assigned is 8)

Contents at the start address of the virtual base class

// project100.cpp: this file contains the "main" function. Program execution will begin and end here.
//

#include "pch.h"
#include <iostream>
#include <time.h>
#include <stdio.h>
#include <vector>

using namespace std;

class Grand //Grandpa class
{
public:
	int m_grand;
};

class A1 : virtual public Grand
{
public:
	int m_a1;
};

class A2 : virtual public Grand
{
public:
	int m_a2;
};

class C1 :public A1, public A2
{
public:
	int m_c1;
};

int main()
{	
	//1:  Content analysis of virtual base class table in three-tier structure
	cout << sizeof(Grand) << endl;
	cout << sizeof(A1) << endl;
	cout << sizeof(A2) << endl;
	cout << sizeof(C1) << endl;

	//A1 a1obj;
	//a1obj.m_grand = 2;
	//a1obj.m_grand2 = 6;
	//a1obj.m_grand2_1 = 7;
	//a1obj.m_a1 = 5;

	C1 c1obj;
	c1obj.m_grand = 2;
	c1obj.m_a1 = 5;
	c1obj.m_a2 = 6;
	c1obj.m_c1 = 8;
	//C1 c2obj;

	//Only vbptr1 is used here; vbptr2 is not

	//2:  Why virtual base classes are designed this way
	//Why they are designed this way is a difficult question to answer;
	//A2 *pobja2 = new C1();  
	A2 *pa2 = &c1obj;
	pa2->m_grand = 8;
	pa2->m_a2 = 9;



	return 0;
}

Member variable address, offset, pointer, etc

#include <iostream>
#include <time.h>
#include <stdio.h>
#include <vector>

using namespace std;

class MYACLS
{
public:
	int m_i;
	int m_j;
	int m_k;
};

void myfunc(int MYACLS::*mempoint, MYACLS &obj)
{
	obj.*mempoint = 260; //Note the .* syntax
}

int main()
{	
	//1:  Object member variable memory address and its pointer
	MYACLS myobj;
	myobj.m_i = myobj.m_j = myobj.m_k = 0;
	printf("myobj.m_i = %p\n", &myobj.m_i); //The member variable of the object has a real memory address;

	MYACLS *pmyobj = new MYACLS();
	printf("pmyobj->m_i = %p\n", &pmyobj->m_i);
	printf("pmyobj->m_j = %p\n", &pmyobj->m_j);

	int *p1 = &myobj.m_i;
	int *p2 = &pmyobj->m_j;

	*p1 = 15;
	*p2 = 30;

	printf("p1 address=%p,p1 value=%d\n", p1,*p1);
	printf("p2 address=%p,p2 value=%d\n", p2, *p2);

	//2:  The offset value of a member variable and its pointer (independent of any specific object)
	cout << "Print member variable offset values----------------" << endl;
	printf("MYACLS::m_i Offset value = %d\n",&MYACLS::m_i); //Print the offset value; note the %d (this relies on compiler-specific behavior)
	printf("MYACLS::m_j Offset value = %d\n", &MYACLS::m_j);
	//The offset value can also be printed through a member variable pointer; note the syntax
	//What a member variable pointer stores is actually an offset value (not an actual memory address).
	int MYACLS::*mypoint = &MYACLS::m_j;
	printf("MYACLS::m_j Offset address = %d\n",mypoint);

	mypoint = &MYACLS::m_i; //Note: when assigning, the pointer name is used on its own; the MYACLS:: qualifier is only needed in its definition
	printf("MYACLS::m_i Offset address = %d\n", mypoint);

	//3:  A member variable pointer that points to no member
	//A member variable is accessed through an object name or object pointer followed by a member variable pointer:
	myobj.m_i = 13;
	myobj.*mypoint = 22;
	pmyobj->*mypoint = 19;

	myfunc(mypoint, myobj);
	myfunc(mypoint, *pmyobj);
	cout << "sizeof(mypoint) =" <<  sizeof(mypoint) << endl; //Also 4 bytes;

	int *ptest = 0;
	int MYACLS::*mypoint2;
	mypoint2 = 0; //Null member variable pointer
	mypoint2 = NULL; //Stored internally as -1 (0xffffffff); -1 means the pointer does not refer to any (meaningful) member variable
	printf("mypoint2 = %d\n", mypoint2);

	//if(mypoint2 == mypoint) //false
	int MYACLS::*mypoint10 = &MYACLS::m_i;
	if (mypoint == mypoint10) //true
	{
		cout << "equal" << endl;
	}

	//mypoint2 += 1;  //Arithmetic on member variable pointers is not allowed
	//mypoint2++;
	//mypoint2 = ((&MYACLS::m_i) + 1);

	return 0;
}
