Implementing a simple smart pointer in C with a GNU extension

Posted by sweyhrich on Mon, 31 Jan 2022 15:54:59 +0100

GNU C has an __attribute__ extension called cleanup:
https://gcc.gnu.org/onlinedocs/gcc-4.6.2/gcc/Variable-Attributes.html#Variable-Attributes

It automatically runs a bound function when a variable goes out of scope.
Leaving scope can mean reaching the closing curly brace or exiting the block in another way, such as return, break or goto.
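
To make the behaviour concrete, here is a minimal sketch of the attribute on plain int variables. The names note_exit, cleanup_calls and cleanup_demo are made up for this illustration and are not part of the header discussed in this post. The handler receives a pointer to the variable that is leaving scope.

```c
#include <assert.h>
#include <stdio.h>

static int cleanup_calls = 0;	/* hypothetical counter, only for this demo */

/* The cleanup handler receives a pointer to the variable leaving scope. */
static void note_exit(int *value)
{
	printf("leaving scope, value = %d\n", *value);
	cleanup_calls++;
}

/* Returns how many times the handler ran while this function executed. */
static int cleanup_demo(void)
{
	int before = cleanup_calls;
	{
		int a __attribute__((cleanup(note_exit))) = 1;
		(void)a;
	}	/* handler runs here, at the closing brace */
	for (int i = 0; i < 2; i++) {
		int b __attribute__((cleanup(note_exit))) = 10 + i;
		if (i == 1)
			break;	/* handler also runs when break leaves the block */
		(void)b;
	}	/* on the first iteration the handler runs at the end of the body */
	return cleanup_calls - before;
}
```

Calling cleanup_demo() from main runs the handler once for a and once per loop iteration for b, whether the block is left normally or through break.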

It has many possible uses, limited only by the programmer's imagination, but the most obvious one is to implement a smart pointer like those in C++

Here is a "smart pointer" I wrote and packaged as a header file for convenience
It is easy to use:

#include <stdio.h>
#include "autoptr.h"	/* the header listed in the Source code section */

void* test()
{
	printf("start test\n");
	autoptr_def(void*, ptr);
	autoptr_new(ptr, sizeof(int) * 20);
	autoptr_new(ptr, sizeof(int) * 200);
	printf("end test\n");
	return autoptr_cpy(ptr);
}

int main()
{
	printf("into main\n");
	autoptr_def(void*, ptr);
	ptr = test();
	printf("check\n");
}

To check for leaks, use mtrace (declared in <mcheck.h>; it logs to the file named by the MALLOC_TRACE environment variable) and add two functions:

__attribute__((constructor))
void before_main()
{
	mtrace();
	printf("trace start\n");
}

__attribute__((destructor))
void at_exit()
{
	printf("trace end\n");
	muntrace();
}

The generated log file is as follows:

= Start
@ /lib/x86_64-linux-gnu/libc.so.6:(_IO_file_doallocate+0x94)[0x7fe03d802e84] + 0x5642e5ec9690 0x400
@ ./a.out:[0x5642e5940284] + 0x5642e5ec9aa0 0x58
@ ./a.out:[0x5642e59402c8] - 0x5642e5ec9aa0
@ ./a.out:[0x5642e59402d2] + 0x5642e5ec9b00 0x328
@ ./a.out:[0x5642e594041e] - 0x5642e5ec9b00
@ /lib/x86_64-linux-gnu/libc.so.6:[0x7fe03d91996c] - 0x5642e5ec9690
= End

As you can see, there is not a single call to free in the code above, yet every allocation is released correctly

Source code

Here is the full source code first; a piece-by-piece explanation follows

#ifndef AUTOPTR_H
#define AUTOPTR_H

#include <stdlib.h>
#include <stdint.h>
#include <stddef.h>

struct __autoptr {
	size_t cnt;
	char data[0];
};

#define __autoptr_offset__(_type, _name) \
	(size_t)(((_type*)0)->_name)
#define __autoptr_container__(_type, _name, _ptr) \
	((_type*)((char*)(_ptr) - __autoptr_offset__(_type, _name)))

__attribute__((always_inline))
static inline void autoptr_cleanup(void *_ptr)
{
	void *ptr = (void *)*(void **)_ptr;
	if (ptr == NULL)
		return;
	struct __autoptr *container =
	    __autoptr_container__(struct __autoptr, data, ptr);
	if (--container->cnt == 0)
		free(container);
}

#define autoptr_def(_type, _name)\
	_type _name __attribute__((cleanup(autoptr_cleanup))) = NULL

/* Note: an old block is freed regardless of its count, and malloc failure is not checked. */
#define autoptr_new(_name, _size) \
do {\
	struct __autoptr *container = NULL;\
	if ((_name) != NULL){\
		container = __autoptr_container__(struct __autoptr, data, (_name));\
		free(container);\
	}\
	container = (struct __autoptr*)malloc (sizeof(struct __autoptr) + (_size));\
	container->cnt = 1;\
	(_name) = (__typeof__ (_name)) container->data;\
} while(0)

#define autoptr_cpy(_name) \
	(((_name) != NULL && __autoptr_container__(struct __autoptr, data, (_name))->cnt++) ? (_name) : NULL)

#endif /* AUTOPTR_H */


The disadvantage of using macros is that every use is expanded in place, which increases the size of the compiled code; the advantage is that this can be faster than calling functions

Explanation

Now let's go through the code piece by piece

struct __autoptr {
	size_t cnt;
	char data[0];
};

Here a zero-length array (a GNU extension; C99 spells the same idea as a flexible array member, char data[]) gives the structure a variable-length tail
That may not be easy to understand
The data member itself occupies no memory. It sits at the very end of the structure
Its only purpose is to mark where the structure ends and the usable memory begins

This structure is the key to our smart pointer implementation

Smart pointers need to support sharing, that is, one block of memory may be pointed to by several pointers at the same time. The most common way to handle this is reference counting
For example, when only one pointer points to the memory, the count is 1, and when two pointers do, the count is 2

However, where to store this count is a problem
Obviously, we can't put it inside the memory block handed back to the user, because the goal is to use that block through a normal pointer

If we malloc the memory separately and store its address in a management structure, we cannot find that structure given only the address of the memory

Therefore, the best approach I could think of is to make the memory part of the structure and allocate both together. The management structure and the usable space are then contiguous, so the structure is easy to find from the memory address
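
The layout described above can be sketched as follows. This is only an illustration under my own assumptions: counted_block, counted_alloc and counted_header are hypothetical names, the C99 flexible array member replaces data[0], and the standard offsetof is used to walk back to the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct counted_block {	/* hypothetical name, mirrors struct __autoptr */
	size_t cnt;
	char data[];	/* C99 flexible array member; data[0] is the older GNU spelling */
};

/* Allocates header and payload in one block; returns the payload address. */
static void *counted_alloc(size_t size)
{
	struct counted_block *blk = malloc(sizeof(*blk) + size);
	if (blk == NULL)
		return NULL;
	blk->cnt = 1;
	return blk->data;
}

/* Walks back from the payload address to the header placed in front of it. */
static struct counted_block *counted_header(void *payload)
{
	return (struct counted_block *)
	    ((char *)payload - offsetof(struct counted_block, data));
}
```

The caller only ever sees the payload address, while the count lives sizeof(size_t) bytes in front of it. To release such a block, pass counted_header(p) to free, not p itself.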

#define __autoptr_offset__(_type, _name) \
	(size_t)(((_type*)0)->_name)
#define __autoptr_container__(_type, _name, _ptr) \
	((_type*)((char*)(_ptr) - __autoptr_offset__(_type, _name)))

These are two classic macros; I only changed the names
The first macro computes the offset of a member within a structure
The second macro finds the address of the structure from the address of one of its members

First, ((_type*)0)->_name computes the offset
The compiler treats this expression as address 0 plus the member's offset
In general, type->name is compiled as *(type + offset), a load from that address
Taking the address of a member normally means locating its storage first and then taking the address. Here nothing is stored at address 0, but because data is an array, the expression decays to a pointer to its first element and no memory is read. The result has type char *
The common form is &((type *)0)->name, which yields a pointer to the member (for data, of type char (*)[0]) whose numeric value is the offset. Unlike my shortcut, it also works for members that are not arrays, and the standard offsetof macro from <stddef.h> does the same job portably
We cast the result to size_t afterwards anyway, so char * and char (*)[0] give the same value, and I was too lazy to add the address-of operator

The container macro is easy to understand: the member's address minus its offset is the address of the structure
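
For comparison, here is a sketch of the same container idea built on the standard offsetof macro, which also works for members that are not arrays. The names container_of, struct sample and sample_from_weight are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Portable variant of the container macro, built on the standard offsetof. */
#define container_of(_type, _member, _ptr) \
	((_type *)((char *)(_ptr) - offsetof(_type, _member)))

struct sample {	/* hypothetical structure, only for illustration */
	int id;
	double weight;	/* not an array, so the array-decay shortcut would not apply */
};

/* Recovers the enclosing struct sample from a pointer to its weight member. */
static struct sample *sample_from_weight(double *weight_ptr)
{
	return container_of(struct sample, weight, weight_ptr);
}
```

Given struct sample s, the call sample_from_weight(&s.weight) returns &s.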

For more on this part, see
https://blog.csdn.net/Erice_s/article/details/108549639?utm_medium=distribute.pc_relevant_bbs_down.none-task-blog-baidujs-1.nonecase&depth_1-utm_source=distribute.pc_relevant_bbs_down.none-task-blog-baidujs-1.nonecase
I only skimmed it, but it appears to go into more detail

__attribute__((always_inline))
static inline void autoptr_cleanup(void *_ptr)
{
	void *ptr = (void *)*(void **)_ptr;
	if (ptr == NULL)
		return;
	struct __autoptr *container =
	    __autoptr_container__(struct __autoptr, data, ptr);
	if (--container->cnt == 0)
		free(container);
}

This is the function that handles the automatic release

First, it uses GNU's always_inline extension, which forces the compiler to inline this function
The advantage is speed, but the disadvantage is that the compiled code becomes larger.
Another advantage is that the function does not appear in the symbol table of the executable file, as if it had never existed,
although I don't know what use that is

void *ptr = (void *)*(void **)_ptr;

The purpose of this line is explained below

	struct __autoptr *container =
	    __autoptr_container__(struct __autoptr, data, ptr);
	if (--container->cnt == 0)
		free(container);

It first obtains the start address of the structure and then decrements the reference count. If the count drops to 0, nothing is using the memory any more, and it is released automatically

Yes, we use reference counting for automatic release

#define autoptr_def(_type, _name)\
	_type _name __attribute__((cleanup(autoptr_cleanup))) = NULL

/* Note: an old block is freed regardless of its count, and malloc failure is not checked. */
#define autoptr_new(_name, _size) \
do {\
	struct __autoptr *container = NULL;\
	if ((_name) != NULL){\
		container = __autoptr_container__(struct __autoptr, data, (_name));\
		free(container);\
	}\
	container = (struct __autoptr*)malloc (sizeof(struct __autoptr) + (_size));\
	container->cnt = 1;\
	(_name) = (__typeof__ (_name)) container->data;\
} while(0)

There are two macros here. The first defines a smart pointer of the given type, and the second allocates memory for the smart pointer

autoptr_def uses the cleanup attribute. When the variable goes out of scope, the autoptr_cleanup function is called with the address of the variable as its argument

That is why void *ptr = (void *)*(void **)_ptr; dereferences the argument: it fetches the value of the original variable
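
The extra dereference is easier to see without the reference counting. The following sketch, with the made-up names free_pointee, freed_blocks and scoped_free_demo, attaches a plain free to a pointer variable. It assumes GCC or a compatible compiler, which passes the address of the variable to the handler.

```c
#include <assert.h>
#include <stdlib.h>

static int freed_blocks = 0;	/* hypothetical counter for the demo */

/* The argument is the address of the pointer variable, not the pointer itself. */
static void free_pointee(void *var_addr)
{
	void *ptr = *(void **)var_addr;	/* read the pointer stored in the variable */
	if (ptr == NULL)
		return;
	free(ptr);
	freed_blocks++;
}

/* Returns the number of blocks released while leaving the inner scope. */
static int scoped_free_demo(void)
{
	int before = freed_blocks;
	{
		char *buf __attribute__((cleanup(free_pointee))) = malloc(32);
		char *empty __attribute__((cleanup(free_pointee))) = NULL;
		assert(buf != NULL);
		(void)empty;
	}	/* both handlers run here; only buf holds memory to release */
	return freed_blocks - before;
}
```

The handler is called for both variables, but the NULL check makes it a no-op for the pointer that never received memory, which is the same guard autoptr_cleanup uses.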

autoptr_new uses do { ... } while(0) to create a statement block. Local variables can be defined inside the block without affecting other parts of the program
First, it checks whether the pointer already has memory allocated
Note that this check assumes the pointer is either NULL, as it is right after autoptr_def, or a value produced by the smart pointer macros
So if, after autoptr_def, you assign it an arbitrary value without going through autoptr_new, the program will almost certainly crash

autoptr_new first releases any memory the pointer already holds and then allocates one whole block. The second half of the block is the usable memory and the first half is the control part
Therefore, calling free(ptr) directly will typically crash the program as well, because ptr points into the middle of the allocated block rather than to its start

Once you use smart pointers, you have to use them throughout. The receiver of autoptr_cpy, described next, must itself be a smart pointer

#define autoptr_cpy(_name) \
	(((_name) != NULL && __autoptr_container__(struct __autoptr, data, (_name))->cnt++) ? (_name) : NULL)

How to track the number of references is a problem
The C language provides no mechanism for this
Therefore, I designed a manual reference counting scheme
Each time the macro is invoked, the corresponding count is incremented by one

In _name != NULL && __autoptr_container__(struct __autoptr, data, _name)->cnt++, the && operator short-circuits: if the first half is false, the second half is not executed, so a NULL pointer is never dereferenced
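
The life cycle of the count when a pointer is returned from a function can be traced with a small sketch. It uses ordinary functions instead of the macros, and every name in it (rc_block, rc_new, rc_copy, rc_drop, rc_make, rc_demo, rc_frees) is hypothetical. It relies on the return value being evaluated before the cleanup handlers run, which is the same ordering that return autoptr_cpy(ptr); depends on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct rc_block {	/* hypothetical name, same layout idea as struct __autoptr */
	size_t cnt;
	char data[];
};

static int rc_frees = 0;	/* demo-only counter of blocks actually freed */

static struct rc_block *rc_header(void *payload)
{
	return (struct rc_block *)((char *)payload - offsetof(struct rc_block, data));
}

/* Allocate a block with count 1 (what autoptr_new does for a NULL pointer). */
static void *rc_new(size_t size)
{
	struct rc_block *blk = malloc(sizeof(*blk) + size);
	if (blk == NULL)
		return NULL;
	blk->cnt = 1;
	return blk->data;
}

/* Hand out one more reference (the role of autoptr_cpy). */
static void *rc_copy(void *payload)
{
	if (payload != NULL)
		rc_header(payload)->cnt++;
	return payload;
}

/* Cleanup handler: drop one reference and free the block on the last one. */
static void rc_drop(void *var_addr)
{
	void *payload = *(void **)var_addr;
	if (payload != NULL && --rc_header(payload)->cnt == 0) {
		free(rc_header(payload));
		rc_frees++;
	}
}

static void *rc_make(void)
{
	void *local __attribute__((cleanup(rc_drop))) = rc_new(64);
	return rc_copy(local);	/* count 1 -> 2 here, then cleanup of local: 2 -> 1 */
}

/* Returns 1 if the count was 1 while owned and the block was freed exactly once. */
static int rc_demo(void)
{
	int before = rc_frees;
	size_t seen = 0;
	{
		void *owner __attribute__((cleanup(rc_drop))) = rc_make();
		if (owner != NULL)
			seen = rc_header(owner)->cnt;
	}	/* cleanup of owner: 1 -> 0, block freed */
	return seen == 1 && rc_frees - before == 1;
}
```

If rc_make returned local without going through rc_copy, the cleanup of local would free the block and the caller would receive a dangling pointer. That is exactly the mistake the manual counting scheme leaves open.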

Advantages / disadvantages

First, the advantage of using macros is that everything works as soon as the header file is included, while the disadvantage is the larger compiled code

Secondly, reference counting is done manually, which is reliable in the hands of reliable people and unreliable in the hands of unreliable people. The mechanism can be called reliable, because nothing goes wrong when it is used correctly, but it is not completely reliable, because there will always be unreliable people
You even forget to call free. Why should I believe you will remember to count manually?

The biggest disadvantage is that smart pointers and ordinary pointers cannot be mixed, which greatly limits their use
The reference count is only maintained, and the automatic release only triggered, when smart pointers are used together with other smart pointers

Finally, because the usable memory is not a separate allocation, functions such as realloc cannot be applied to it directly. They can be used only after a corresponding implementation is added to autoptr.h
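
As a rough idea of what such an addition could look like, here is a sketch of a resize helper. It is not part of the header above: rs_block, rs_header, rs_new and rs_resize are hypothetical names, and refusing to resize a shared block is my own design assumption, made because the block may move and any other copies would be left dangling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct rs_block {	/* hypothetical name, same layout idea as struct __autoptr */
	size_t cnt;
	char data[];
};

static struct rs_block *rs_header(void *payload)
{
	return (struct rs_block *)((char *)payload - offsetof(struct rs_block, data));
}

/* Allocate header and payload together, with the count set to 1. */
static void *rs_new(size_t size)
{
	struct rs_block *blk = malloc(sizeof(*blk) + size);
	if (blk == NULL)
		return NULL;
	blk->cnt = 1;
	return blk->data;
}

/*
 * Stand-in for realloc. Returns the new payload address, or NULL (leaving
 * the old block untouched) on failure or when the block is shared.
 */
static void *rs_resize(void *payload, size_t new_size)
{
	struct rs_block *blk = rs_header(payload);
	if (blk->cnt != 1)
		return NULL;	/* other copies would dangle if the block moved */
	struct rs_block *grown = realloc(blk, sizeof(*grown) + new_size);
	if (grown == NULL)
		return NULL;
	return grown->data;	/* count and old contents are carried over */
}
```

The caller has to store the returned address back into the smart pointer variable, since the old address may no longer be valid after a successful resize.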

From this point of view, the smart pointer is not very smart. Everything still has to be done by the rules
As C programmers, we should not be led astray by Java and C++. We should know exactly what we are doing, design the most effective scheme after carefully reading the documentation, and then write efficient and correct code, rather than writing whatever comes to mind without any design or plan.

Topics: C pointer GNU