Redis source code [sds] [redis source code]

Posted by zzz on Tue, 18 Jan 2022 02:37:55 +0100

Redis string source code analysis

For details, see the file sds.h in the Redis source code.

/* Note: sdshdr5 is never used, we just access the flags byte directly.
 * However is here to document the layout of type 5 SDS strings. */
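/* In all structures:
 * len      the number of bytes currently in use
 * alloc    the number of bytes allocated for buf, excluding the header and null terminator
 * flags    the lower 3 bits store the structure type; the upper 5 bits are unused
 *          (except in sdshdr5, where they store the string length) */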
struct __attribute__ ((__packed__)) sdshdr5 {
    unsigned char flags; /* 3 lsb of type, and 5 msb of string length */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr8 {
    uint8_t len; /* used */
    uint8_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr16 {
    uint16_t len; /* used */
    uint16_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr32 {
    uint32_t len; /* used */
    uint32_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr64 {
    uint64_t len; /* used */
    uint64_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};

#define SDS_TYPE_5  0
#define SDS_TYPE_8  1
#define SDS_TYPE_16 2
#define SDS_TYPE_32 3
#define SDS_TYPE_64 4
#define SDS_TYPE_MASK 7
#define SDS_TYPE_BITS 3
#define SDS_HDR_VAR(T,s) struct sdshdr##T *sh = (void*)((s)-(sizeof(struct sdshdr##T)));
#define SDS_HDR(T,s) ((struct sdshdr##T *)((s)-(sizeof(struct sdshdr##T))))
#define SDS_TYPE_5_LEN(f) ((f)>>SDS_TYPE_BITS)

/* Returns the used length of the specified sds */
static inline size_t sdslen(const sds s) {
    unsigned char flags = s[-1];
    switch(flags&SDS_TYPE_MASK) {
        case SDS_TYPE_5:
            return SDS_TYPE_5_LEN(flags);
        case SDS_TYPE_8:
            return SDS_HDR(8,s)->len;
        case SDS_TYPE_16:
            return SDS_HDR(16,s)->len;
        case SDS_TYPE_32:
            return SDS_HDR(32,s)->len;
        case SDS_TYPE_64:
            return SDS_HDR(64,s)->len;
    }
    return 0;
}
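
To make the SDS_TYPE_5 branch of sdslen concrete, here is a minimal sketch of how one flags byte can hold both a type tag and a short length. The helper names pack_type5_flags, unpack_type5_len and unpack_type are hypothetical and are not Redis functions; the constants are copied from the definitions above.

```c
#include <assert.h>
#include <stddef.h>

#define SDS_TYPE_5    0
#define SDS_TYPE_MASK 7
#define SDS_TYPE_BITS 3

/* Hypothetical helper: put a length (0..31) in the upper 5 bits and
 * the type tag in the lower 3 bits of a single byte. */
unsigned char pack_type5_flags(size_t len) {
    assert(len < 32); /* only 5 bits are available for the length */
    return (unsigned char)(SDS_TYPE_5 | (len << SDS_TYPE_BITS));
}

/* Hypothetical helper: recover the length, as SDS_TYPE_5_LEN does. */
size_t unpack_type5_len(unsigned char flags) {
    return (size_t)(flags >> SDS_TYPE_BITS);
}

/* Hypothetical helper: recover the type tag from the lower 3 bits. */
int unpack_type(unsigned char flags) {
    return flags & SDS_TYPE_MASK;
}
```

Because only 5 bits are left for the length, this layout can describe strings of at most 31 bytes; anything longer needs one of the larger headers.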

Redis designed its own string structure, the Simple Dynamic String (SDS).

Compared with plain C strings, SDS makes string operations more efficient and can also store binary data.
If Redis strings were implemented directly as char *, there would be several problems. For example, a \0 byte in the middle of the data would be treated as the end of the string, so the data would appear truncated, and functions such as strlen would return the wrong length.

This does not meet Redis's need to store arbitrary binary data.
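
The truncation problem can be shown with a small sketch. The lenbuf structure and lenbuf_from function below are hypothetical stand-ins for a length-prefixed string, not Redis code; the point is only that an explicit length survives an embedded \0 while strlen does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-prefixed buffer, standing in for an SDS string. */
struct lenbuf {
    size_t len;      /* bytes in use, embedded zero bytes included */
    char   data[16];
};

/* Copy n raw bytes and record the length explicitly. */
struct lenbuf lenbuf_from(const void *src, size_t n) {
    struct lenbuf b;
    assert(n < sizeof b.data);  /* leave room for the terminator */
    memcpy(b.data, src, n);
    b.data[n] = '\0';           /* terminator kept for C API compatibility */
    b.len = n;
    return b;
}
```

For the 5-byte payload "ab\0cd", strlen on the data stops at the embedded zero and reports 2, while the len field still reports 5 and all five bytes remain in the buffer.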

SDS structure design

First, the SDS structure contains a character array buf[], which stores the actual data. The structure also contains three metadata fields: len, the length currently used in the character array; alloc, the space allocated to the character array; and flags, the SDS type. Redis defines len and alloc with several different integer types in order to represent different types of SDS, as described later. The following figure shows the structure of SDS.

typedef char *sds;

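Redis uses typedef to define sds as an alias for char *. An sds value points at buf, and the header fields sit in memory immediately before it, which is why sdslen can read the flags byte as s[-1].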
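
The following sketch shows one way a pointer to buf can be handed out while the header stays reachable behind it. The functions tiny_sds_new, tiny_sds_len and tiny_sds_free are hypothetical and much simpler than the real Redis allocation code; they handle only the sdshdr8 layout and assume a GCC or Clang compiler for the packed attribute.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef char *sds;

struct __attribute__ ((__packed__)) sdshdr8 {
    uint8_t len;          /* used */
    uint8_t alloc;        /* excluding the header and null terminator */
    unsigned char flags;  /* 3 lsb of type, 5 unused bits */
    char buf[];
};

#define SDS_TYPE_8 1

/* Hypothetical constructor: allocate header, payload and terminator in
 * one block, then return a pointer to buf rather than to the header. */
sds tiny_sds_new(const char *init, uint8_t len) {
    assert(init != NULL);
    struct sdshdr8 *sh = malloc(sizeof(*sh) + len + 1);
    if (sh == NULL) return NULL;
    sh->len = len;
    sh->alloc = len;
    sh->flags = SDS_TYPE_8;
    memcpy(sh->buf, init, len);
    sh->buf[len] = '\0';
    return sh->buf;
}

/* Step back from buf to the header and read len without scanning. */
size_t tiny_sds_len(const sds s) {
    const struct sdshdr8 *sh = (const void *)(s - sizeof(struct sdshdr8));
    return sh->len;
}

/* The allocation starts at the header, so that is what must be freed. */
void tiny_sds_free(sds s) {
    if (s != NULL) free(s - sizeof(struct sdshdr8));
}
```

Since the returned pointer is an ordinary char * with a terminator after the payload, it can still be passed to functions that expect a C string, as long as the payload has no embedded zero bytes.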

Compared with C strings, SDS records the used length and the allocated size of the character array. This avoids traversing the string to find its length, reduces overhead, and lets many string operations, such as create, append, copy, and compare, run more efficiently.

SDS defines five header types: sdshdr5, sdshdr8, sdshdr16, sdshdr32, and sdshdr64.
The main difference between them is the data type of the two metadata fields, len (the current length) and alloc (the allocated length of the character array). sdshdr5 has neither field and keeps its length in the flags byte.

SDS uses different header structures (sdshdr8, sdshdr16, sdshdr32, sdshdr64, and so on) so that strings of different sizes can each be stored with a header no larger than necessary, which saves memory.
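
As an illustration of this idea, the sketch below picks the smallest header whose len field can represent a given length. The functions pick_sds_type and header_size are hypothetical; the thresholds are derived only from the field widths shown in the structures above (5 bits, then 8, 16, 32 and 64 bits), and the header sizes assume the packed layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SDS_TYPE_5  0
#define SDS_TYPE_8  1
#define SDS_TYPE_16 2
#define SDS_TYPE_32 3
#define SDS_TYPE_64 4

/* Hypothetical helper: smallest type whose length field fits len. */
int pick_sds_type(uint64_t len) {
    if (len < (1u << 5))    return SDS_TYPE_5;  /* 5 bits in flags */
    if (len < (1u << 8))    return SDS_TYPE_8;  /* uint8_t len     */
    if (len < (1u << 16))   return SDS_TYPE_16; /* uint16_t len    */
    if (len < (1ull << 32)) return SDS_TYPE_32; /* uint32_t len    */
    return SDS_TYPE_64;                         /* uint64_t len    */
}

/* Hypothetical helper: packed header size in bytes for each type,
 * that is len + alloc + one flags byte (flags only for type 5). */
size_t header_size(int type) {
    switch (type) {
        case SDS_TYPE_5:  return 1;
        case SDS_TYPE_8:  return 1 + 1 + 1;
        case SDS_TYPE_16: return 2 + 2 + 1;
        case SDS_TYPE_32: return 4 + 4 + 1;
        case SDS_TYPE_64: return 8 + 8 + 1;
    }
    assert(!"unknown SDS type");
    return 0;
}
```

Under these assumptions a 100-byte string carries a 3-byte header, whereas always using the 64-bit header would cost 17 bytes per string.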

As we can see, __attribute__ ((__packed__)) appears between struct and sdshdr8. It tells the compiler not to insert alignment padding when laying out the sdshdr8 structure, and to lay it out compactly instead.
By default, the compiler aligns each member to its natural boundary and pads the structure to a multiple of its strictest member alignment. For example, without packing, sdshdr64 would typically occupy 24 bytes rather than 17 on a 64-bit platform. Disabling this padding saves some memory.
If you want to reduce the memory cost of a data structure in your own programs, __attribute__ ((__packed__)) is a useful technique.
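
The effect of packing can be checked with sizeof. The two structures below are hypothetical copies of the sdshdr64 layout, one packed and one not, and the sketch assumes GCC or Clang, which support this attribute. The exact padded size depends on the platform, so the comments only state what is typical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same fields as sdshdr64, with padding disabled. */
struct __attribute__ ((__packed__)) hdr64_packed {
    uint64_t len;
    uint64_t alloc;
    unsigned char flags;
    char buf[];
};

/* Same fields, default layout: the compiler may pad after flags. */
struct hdr64_padded {
    uint64_t len;
    uint64_t alloc;
    unsigned char flags;
    char buf[];
};

/* 8 + 8 + 1 = 17 bytes, with no padding anywhere. */
size_t packed_header_size(void) { return sizeof(struct hdr64_packed); }

/* Typically 24 bytes on a 64-bit platform. */
size_t padded_header_size(void) { return sizeof(struct hdr64_padded); }

/* With packing, the header size equals the offset of buf, so stepping
 * back sizeof(header) bytes from buf lands exactly on the header. */
void check_packed_layout(void) {
    assert(offsetof(struct hdr64_packed, buf) == sizeof(struct hdr64_packed));
}
```

This second property matters for the SDS_HDR macro above, which subtracts sizeof(struct sdshdr##T) from the string pointer to find the header.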

SDS summary

Redis designed the SDS data structure specifically for strings. On top of the character array, it adds the array's current length and the size of the allocated space.
Moreover, SDS does not rely on the \0 character to find the end of the string. It treats the contents as binary data, so it can store binary data such as images.

SDS defines different types for strings of different sizes and uses the __attribute__ ((__packed__)) technique to get a compact memory layout, which saves memory.

Shortcomings of char *

  • Low efficiency: obtaining the length requires traversing the string, which is O(N)
  • Not binary safe: cannot store data containing \0

Redis's goals in designing sds:
1. Binary-safe storage and transmission (no \0 ambiguity)
2. Efficient string operations (len and alloc give the length and capacity in O(1) and allow jumping straight to the end of the string)
3. Compact memory layout (len and alloc use different integer types depending on the string size, and alignment padding is turned off, to use memory efficiently. In Redis, intset and ziplist have similar goals)
