[C language] user defined type details

Posted by redmonkey on Sat, 29 Jan 2022 00:12:44 +0100

C has several user-defined (custom) types: structures, enumerations and unions

In this blog, let's take a look at these custom types! 😶

1. Structure

A structure is a collection of values called members. The members of a structure can be of different types

1.1 declaration of structure

Taking personal information as an example, there are several elements such as name, gender, age and height. You can define the structure as follows

struct Stu
{
	char name[20];
	char sex[5];
	int age;
	int height;
}s2, s3, s4;//s2, s3 and s4 are global variables of type struct Stu

1.2 special declaration

A structure can also be declared incompletely, by leaving out its tag

struct
{
	char c;
	int a;
	double d;
}sa;

struct
{
	char c;
	int a;
	double d;
}*ps;

Both declarations above are anonymous structure types: the structure tag is omitted.

Because an anonymous structure type has no tag, it can only be used where it is declared; no further variables of that type can be declared later

The members of these two structures are exactly the same.

The second declaration defines a structure pointer ps. Can this pointer store &sa?

int main()
{
	//The compiler treats the two anonymous structures as different types, so this assignment is invalid
	ps = &sa;
	return 0;
}

1.3 self reference of structure

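A structure can refer to itself, but only through a pointer. It cannot contain a member of its own type, because its size would then be infinite; a pointer to its own type has a fixed size and is allowed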

//Code 1 - error
struct Node
{
 int data;
 struct Node next;
};

//Correct self reference method
struct Node
{
 int data;
 struct Node* next;
};

In many cases, we use typedef to give the structure type a shorter name

  • The typedef name cannot be used inside the anonymous structure it names
typedef struct
{
 int data;
 Node* next;
}Node;
//Error: Node is not yet defined at the point where the member next is declared

//Correct writing
typedef struct Node
{
 int data;
 struct Node* next;
}Node;

1.4 definition and initialization of structure variables

struct Point
{
 int x;
 int y;
}p1; //Define the variable p1 while declaring the type
struct Point p2; //Define structure variable p2
//Initialization: define variables and assign initial values at the same time.
struct Point p3 = {1, 2}; //x = 1, y = 2
struct Stu        //Type declaration
{
 char name[15];//name
 int age;      //Age
};
struct Stu s = {"zhangsan", 20};//initialization
struct Node
{
 int data;
 struct Point p;
 struct Node* next; 
}n1 = {10, {4,5}, NULL}; //Structure nesting initialization
struct Node n2 = {20, {5, 6}, NULL};//Structure nesting initialization

1.5 structure memory alignment

Structure members are laid out in memory according to alignment rules, so a structure can be larger than the sum of its members

struct S1
{
	char c1;//1
	int i;//4
	char c2;//1
};

The members of this structure hold 6 bytes of data in total (assuming a 4-byte int), so we might expect the structure to occupy 6 bytes.

But when we use sizeof to calculate its size, we get 12 bytes.

Why?

When a structure is stored in memory, its members are aligned so that memory can be read efficiently. The rules are:

  • The first member is at offset 0 from the start of the structure variable.

  • Every other member is placed at an offset that is an integer multiple of its alignment number

Alignment number = the smaller of the compiler's default alignment number and the size of the member

The default value in VS is 8

  • The total size of the structure is an integer multiple of the largest alignment number among its members (each member has its own alignment number).

  • If a structure is nested, the nested structure is aligned to an integer multiple of its own largest alignment number, and the total size of the outer structure is an integer multiple of the largest alignment number overall (including the alignment numbers of the nested structure's members).

offsetof macro (declared in <stddef.h>): gives the offset of a structure member relative to the start of the structure

  • i is of type int and occupies 4 bytes. The default alignment number is 8, so the alignment number of i is 4. c2 occupies 1 byte, so its alignment number is 1.

  • Therefore, int i must start at an offset that is an integer multiple of 4, that is, offset 4 (the fifth byte). c2 follows i at an offset that is an integer multiple of 1, that is, offset 8 (the ninth byte).

  • The members end after 9 bytes. The total size of the structure must be an integer multiple of the largest alignment number (here 4), and 9 is not a multiple of 4, so the total size is rounded up to 12 bytes

As shown in the figure, the three members are not stored contiguously in memory. There are three padding bytes between char c1 and int i, and three more padding bytes after char c2.

Special case: array

struct Stu 
{
	int i;     //size 4; alignment number = min(4, 8) = 4
	char c[5]; //size 5; alignment number = min(1, 8) = 1
};

When calculating the alignment number of an array, we use the size of a single element, not the size of the entire array.

  • The char c[5] array occupies 5 bytes
  • Its alignment number comes from comparing the 1-byte size of char with 8, so the alignment number is 1
  • The members end after 9 bytes, so the total size is rounded up to 12, the next multiple of 4

(please ignore the subscript of char array in the figure)

So why is memory alignment needed?

  1. Platform reason (portability): not all hardware platforms can access any data at any address. Some platforms can only fetch certain types of data at certain addresses; otherwise a hardware exception is raised.

  2. Performance reason: data structures (especially stacks) should be aligned on natural boundaries as much as possible. To access misaligned memory, the processor may need two memory accesses, while an aligned access needs only one.

When designing a structure, we can place the small members next to each other as much as possible, which reduces the padding and therefore the space the structure occupies

1.6 modify the default alignment number

We can use the #pragma pack preprocessing directive to change the default alignment number

#pragma pack(1) //set the default alignment number to 1
struct S2
{
 char c1;
 int i;
 char c2;
};
#pragma pack() //cancel the setting and restore the default alignment number

1.7 passing structures as parameters

#include <stdio.h>

struct S
{
 int data[1000];
 int num;
};
struct S s = {{1,2,3,4}, 1000};
//Pass the structure by value
void print1(struct S s)
{
 printf("%d\n", s.num);
}
//Pass the address of the structure
void print2(struct S* ps)
{
 printf("%d\n", ps->num);
}
int main()
{
 print1(s);  //Pass the structure
 print2(&s); //Pass the address
 return 0;
}

When passing a structure to a function, it is better to pass its address.

When a function is called, its arguments are pushed onto the stack, which costs both time and space. If a large structure object is passed by value, the whole object has to be copied, so the overhead of pushing it is relatively large and performance suffers.

2. Bit field

A bit field is declared inside a structure and is a special kind of structure member

2.1 what is a bit field?

A bit field is declared like a structure, with two differences:

1. The member of a bit field should be int, unsigned int or signed int (other integer types such as char are accepted by many compilers, but this is implementation-defined)

2. The member name of a bit field is followed by a colon and a number.

struct A
{
 int _a:2;
 int _b:5;
 int _c:10;
 int _d:30;
};

The number after the member name is the number of bits used to store the member

In an ordinary structure, even the smallest member type, char, takes 1 byte. However, some data does not need a whole byte. In such cases, bit fields can be used to save space

int _a:2;//2 bits - 00 01 10 11 - four possible values

On a typical compiler with a 4-byte int, the size of struct A is worked out as follows: first, 4 bytes (32 bits) are allocated, and _a, _b and _c are stored in them one after another. These use 2 + 5 + 10 = 17 bits, leaving 15 bits, which is not enough for the 30 bits of _d, so a second 4-byte unit is allocated to store _d. The total size is therefore 8 bytes

2.2 storage of bit segments in memory

  • The members of a bit field can be int, unsigned int, signed int or char (all of them integer types)

  • Space for bit fields is allocated as needed, 4 bytes (int) or 1 byte (char) at a time

  • Bit fields involve many implementation-defined details and are not portable across platforms. Programs that need to be portable should avoid them

//An example
struct S
{
 char a:3;
 char b:4;
 char c:5;
 char d:4;
};
struct S s = {0};
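s.a = 10;//1010 - truncated to 3 bits - 010
s.b = 12;//1100 - fits in 4 bits - 1100
s.c = 3;//0011 - padded with a leading 0 - 00011
s.d = 4;//0100 - fits in 4 bits - 0100
  • a and b share the first byte
  • c does not fit in the 1 bit left over, so it gets a byte of its own
  • d does not fit in the 3 bits left over, so it takes another byte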

Its storage in memory is shown in the following figure:

2.3 portability problems of bit fields

Different compilers can store bit fields in very different ways

  • Whether a plain int bit field is treated as signed or unsigned is implementation-defined.
  • The maximum number of bits in a bit field depends on the platform (at most 16 on a 16-bit machine and 32 on a 32-bit machine; a width of 27 causes problems on a 16-bit machine)
  • Whether the members of a bit field are allocated from left to right or from right to left in memory is not defined by the standard
  • When a structure contains two bit fields and the second is too large to fit in the bits left over by the first, it is implementation-defined whether the leftover bits are discarded or used

Compared with an ordinary structure, a bit field can achieve the same effect while saving space, but at the cost of portability.

2.4 application of bit fields

On the Internet, data sent between machines is wrapped in a header with a fixed format, for example the IP datagram header, as shown in the figure.

The 4-bit version number, the header length and the 3-bit flags in the figure are each smaller than 1 byte. If each were stored as an ordinary structure member, a lot of space would be wasted, which increases the load on the server.

Bit fields are a good fit for storing this kind of content compactly.

3. Enumeration

As the name suggests, to enumerate is to list the possible values one by one

In everyday life, some things have a limited set of values that can be listed one by one.

For example: a person's gender, the days of the week, the 12 months.

In C, an enumeration type can be used to define such a limited set of values

3.1 definition of enumeration type

enum Day//week
{
 Mon,
 Tues,
 Wed,
 Thur,
 Fri,
 Sat,
 Sun
};
enum Sex//Gender
{
 MALE,
 FEMALE,
 SECRET
};

Note that enumeration types and structures are different.

  • The names inside {} are the possible values of the enumeration type, also called enumeration constants

  • Each enumeration constant has an integer value. By default the values start from 0 and increase by 1.

  • Enumeration constants can be used in place of plain numeric values

For example, in the enumeration type Day, each constant stands for a number, starting from 0 and increasing by 1.

enum Day//week
{
 Mon,//0
 Tues,//1
 Wed,//2
 Thur,//3
 Fri,//4
 Sat,//5
 Sun//6
};

We can also assign values explicitly when defining the type

enum Color//colour
{
 RED=1,
 GREEN=2,
 BLUE=4
};

If you assign a value to only one of the constants, the constants after it continue from that value, each increasing by 1

enum Day//week
{
 Mon,//0
 Tues,//1
 Wed=5,//5
 Thur,//6
 Fri,//7
 Sat,//8
 Sun//9
};

3.2 advantages of enumeration

We can already define constants with #define. Why use enumerations?

Advantages of enumeration:

  1. Improves the readability and maintainability of the code
  2. Unlike identifiers defined with #define, enumeration constants have a type, so they can be type-checked, which is more rigorous.
  3. Reduces naming pollution (the constants are grouped inside {})
  4. Easy to debug (enumeration constants survive preprocessing, while #define names are replaced before compilation)
  5. Easy to use: multiple constants can be defined at once

The fourth point is illustrated in the figure below:

3.3 use of enumeration

enum Color//colour
{
 RED=1,
 GREEN=2,
 BLUE=4
};
//Enumeration constants cannot be modified after they are defined!
RED = 3;//err

enum Color clr = GREEN;//Assign only enumeration constants to an enumeration variable, so that the types match.
clr = 5; //err in a .cpp file; accepted in a .c file
  • enum Color is the enumeration type, which contains the enumeration constants
  • enum Color clr is an enumeration variable

Note that assigning a plain number to an enumeration variable is not reported as an error in a .c file, but it is reported as an error in a .cpp file

  • The syntax checking for .cpp files is stricter!

So how are enumeration types used in practice?

Take the code of the calculator 👉 [blog link]

We can use enumeration constants instead of bare case 0, case 1, and so on

//This is just an example
//See the previous blog for detailed code
#include <stdio.h>

enum Options
{
	EXIT,//0
	ADD,//1
	SUB,//2
	MUL,//3
	DIV//4
};

void menu()
{
	printf("**********************************\n");
	printf("*****  1. add     2. sub     *****\n");
	printf("*****  3. mul     4. div     *****\n");
	printf("*****  0. exit               *****\n");
	printf("**********************************\n");
}

int main()
{
	int input = 0;
	int x = 0;
	int y = 0;
	int ret = 0;
	do
	{
		menu();
		printf("Please select:>");
		scanf("%d", &input);
		switch (input)
		{
		case ADD:
			//Enumeration constants replace the original numbers to make the code more readable
			break;
		case SUB:
			break;
		case MUL:
			break;
		case DIV:
			break;
		case EXIT:
			printf("Exit calculator\n");
			break;
		default:
			printf("Selection error\n");
			break;
		}
	} while (input);

	return 0;
}

4. Union

4.1 definition of union type

A union is also a special custom type

A variable of union type also contains a series of members, but these members share the same memory (which is why a union is also called a common body)

//Declaration of union type
union Un
{
 char c;
 int i;
};
//Definition of a union variable
union Un un;

When we calculate the size of this union, we find that it is only 4 bytes, not 5 bytes.

Moreover, char c and int i start at the same address, which shows that they share the same 4 bytes.

4.2 characteristics of unions

The members of a union share the same memory, so the size of a union variable is at least the size of its largest member

(because the union must be able to hold its largest member)

union Un
{
 int i;
 char c;
};
union Un un;
//What does the following code print?
un.i = 0x11223344;
un.c = 0x55;
printf("%x\n", un.i);

4.3 use a union to determine endianness

In the earlier post on data storage, we learned what the byte order (big-endian or little-endian) of a machine is, and wrote a function to determine the byte order of the current machine 👉 link

#include <stdio.h>
int check_sys()
{
	int a = 1;
	char* p = (char*)&a;//Look at the first byte of a
	if (1 == *p)
		return 1;
	else
		return 0;
}

int main()
{
	int b = check_sys();
	if (1 == b)
		printf("Little-endian\n");
	else
		printf("Big-endian\n");

	return 0;
}

The address of the int is cast to char*, so that only its first byte is read

  • On a little-endian machine, the first byte is 01
  • On a big-endian machine, the first byte is 00

Now we will improve this code by taking advantage of the fact that the members of a union share the same memory.

#include <stdio.h>
int check_sys()
{
	union
	{
		char c;
		int i;
	}u;
	u.i = 1;//Set the int member to 1
	return u.c;//Return the first byte of the int
}

int main()
{
	int ret = check_sys();
	//If 1 is returned, the machine is little-endian
	//If 0 is returned, the machine is big-endian
	if (1 == ret)
		printf("Little-endian\n");
	else
		printf("Big-endian\n");

	return 0;
}

With a union, there is no need to cast pointer types

The char c member overlaps the first byte of int i

4.4 storage of unions in memory

Like structures, unions also follow memory alignment rules

  • The size of a union is at least the size of its largest member
  • When the size of the largest member is not an integer multiple of the largest alignment number, the union size is rounded up to an integer multiple of the largest alignment number
union Un1
{
	char c[5];//size 5; alignment number = min(1, 8) = 1
	int i;    //size 4; alignment number = min(4, 8) = 4
};

union Un2
{
	short c[7];//size 14; alignment number = min(2, 8) = 2
	int i;     //size 4;  alignment number = min(4, 8) = 4
};

Note that when the alignment number is calculated for a union, an array counts by the size of one element, not the size of the whole array! (this is the same as for structures)

Alignment number of the short array in Un2:

  • short c[7] occupies 14 bytes in total
  • Each element is 2 bytes
  • The default alignment number is 8
  • So its alignment number is 2

Take Un1 as an example. Its layout in memory is shown in the figure (please ignore the subscript of char array in the figure)

  • char c[5] and int i share the first 5 bytes (int i occupies the first 4 of them)
  • Because the size must be rounded up to an integer multiple of the largest alignment number (4), the size of Un1 is 8
  • By the same rule, the 14 bytes of Un2 are rounded up to 16

epilogue

There is a lot to learn about custom types

Did you get it all?

Writing this up was not easy. If it helped, please give it a like!

Topics: C Back-end