Introduction to Redis storage: data structures in Redis

Posted by potato on Thu, 10 Feb 2022 23:22:19 +0100

Redis is an in-memory database built on key-value pairs. In a so-called KV store (or KV database), each value is stored under exactly one key, and the key is the only index. Such a store is usually built on a hash table, which is efficient: a lookup takes O(1) time on average.

However, a hash table is not a cure-all. As the amount of data grows, hash collisions become more frequent and lookups get slower, so Redis has to rehash and expand the table. As shown in the figure, Redis also supports a rich set of data structure types. How does it do this, and how are these types stored in memory?

1. Objects in redis

In Redis, each value is represented by a redisObject structure, shown below. Here we focus on the type, encoding and ptr fields.

typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                            * LFU data (least significant 8 bits frequency
                            * and most significant 16 bits access time). */
    int refcount;
    void *ptr;
} robj;

1.1 type

There are five basic object types: string, list, set, sorted set and hash.

/* The actual Redis Object */
#define OBJ_STRING 0    /* String object. */
#define OBJ_LIST 1      /* List object. */
#define OBJ_SET 2       /* Set object. */
#define OBJ_ZSET 3      /* Sorted set object. */
#define OBJ_HASH 4      /* Hash object. */

The type of a database key can be checked with the TYPE command:

127.0.0.1:6379> set key "hello world"
OK
127.0.0.1:6379> get key
"hello world"
127.0.0.1:6379> type key
string
127.0.0.1:6379> RPUSH scores 87 89 97
(integer) 3
127.0.0.1:6379> type scores
list
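
As a rough illustration of how the name printed by TYPE relates to the type field, the mapping can be sketched as a simple switch. The function names demoTypeName and demoTypeIs are made up for this example and are not part of Redis; the constants are the ones listed above, and "zset" is the name TYPE prints for a sorted set.

```c
#include <assert.h>
#include <string.h>

#define OBJ_STRING 0
#define OBJ_LIST 1
#define OBJ_SET 2
#define OBJ_ZSET 3
#define OBJ_HASH 4

/* Hypothetical helper: map the 4-bit type field to the name TYPE prints. */
const char *demoTypeName(unsigned type) {
    switch (type) {
    case OBJ_STRING: return "string";
    case OBJ_LIST:   return "list";
    case OBJ_SET:    return "set";
    case OBJ_ZSET:   return "zset";
    case OBJ_HASH:   return "hash";
    default:         return "unknown";
    }
}

/* Hypothetical helper: returns 1 if the type field corresponds to name. */
int demoTypeIs(unsigned type, const char *name) {
    assert(name != NULL);
    return strcmp(demoTypeName(type), name) == 0;
}
```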

1.2 encoding

The ptr pointer of an object points to its underlying data structure, and the encoding attribute records which data structure that is. The possible encodings are as follows:

/* Objects encoding. Some kind of objects like Strings and Hashes can be
 * internally represented in multiple ways. The 'encoding' field of the object
 * is set to one of this fields for this object. */
#define OBJ_ENCODING_RAW 0     /* Raw representation */
#define OBJ_ENCODING_INT 1     /* Encoded as integer */
#define OBJ_ENCODING_HT 2      /* Encoded as hash table */
#define OBJ_ENCODING_ZIPMAP 3  /* Encoded as zipmap */
#define OBJ_ENCODING_LINKEDLIST 4 /* No longer used: old list encoding. */
#define OBJ_ENCODING_ZIPLIST 5 /* Encoded as ziplist */
#define OBJ_ENCODING_INTSET 6  /* Encoded as intset */
#define OBJ_ENCODING_SKIPLIST 7  /* Encoded as skiplist */
#define OBJ_ENCODING_EMBSTR 8  /* Embedded sds string encoding */
#define OBJ_ENCODING_QUICKLIST 9 /* Encoded as linked list of ziplists */
#define OBJ_ENCODING_STREAM 10 /* Encoded as a radix tree of listpacks */

The encoding of a key can be checked with the OBJECT ENCODING command. An object's encoding is not fixed: for example, a string object encoded as int or embstr may later be converted into a raw-encoded string object.

127.0.0.1:6379> set hi "hello"
OK
127.0.0.1:6379> object encoding hi
"embstr"
127.0.0.1:6379> sadd numbers 87 69 90
(integer) 3
127.0.0.1:6379> object encoding numbers
"intset"

1.3 object pointer

The following figure shows a raw-encoded string object.

The following figure shows a list object encoded as a ziplist. Objects with other encodings follow the same layout.

1.4 reference counting for memory reclamation

Redis's object system uses reference counting to reclaim memory: when an object's reference count drops to 0, the object is freed.
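
The idea can be sketched in a few lines of C. This is a minimal illustration and not the Redis source: demoObject is a cut-down stand-in for redisObject, and demoCreate, demoIncrRef and demoDecrRef are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Cut-down stand-in for redisObject: only the fields needed here. */
typedef struct demoObject {
    int refcount;
    void *ptr;
} demoObject;

/* Create an object owning a heap copy of s; it starts with one reference. */
demoObject *demoCreate(const char *s) {
    demoObject *o = malloc(sizeof(*o));
    assert(o != NULL);
    size_t n = strlen(s) + 1;
    o->ptr = malloc(n);
    assert(o->ptr != NULL);
    memcpy(o->ptr, s, n);
    o->refcount = 1;
    return o;
}

/* A new holder takes a reference. */
void demoIncrRef(demoObject *o) {
    o->refcount++;
}

/* A holder drops its reference. When the count reaches 0 the payload
 * and the object itself are freed. Returns 1 if the object was freed. */
int demoDecrRef(demoObject *o) {
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        free(o->ptr);
        free(o);
        return 1;
    }
    return 0;
}
```

Every holder of the object calls demoIncrRef once when it starts using it and demoDecrRef once when it is done; whoever drops the last reference triggers the free.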

1.5 lru

The lru attribute records the last time the object was accessed. The OBJECT IDLETIME command reports how long the key has been idle, that is, the time elapsed since that last access, in seconds:

127.0.0.1:6379> OBJECT idletime key
(integer) 1942

If the maxmemory option is set and the eviction policy is volatile-lru or allkeys-lru, then when memory usage exceeds that limit, the keys that have been idle the longest are evicted first.
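
To make the idle-time idea concrete, here is a small sketch of how an idle time could be derived from a fixed-width lru field and used to choose an eviction victim. It assumes a 24-bit clock value (the width is an assumption of this example), and demoIdleTicks and demoPickEvictionVictim are hypothetical names. Redis itself is understood to approximate LRU by sampling a few keys rather than scanning all of them, so treat this only as an illustration of the comparison.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_LRU_BITS 24                               /* assumed width of the lru field */
#define DEMO_LRU_CLOCK_MAX ((1u << DEMO_LRU_BITS) - 1) /* largest value the field can hold */

/* Idle time in clock ticks between an object's last access (obj_lru)
 * and the current clock value (now), both truncated to DEMO_LRU_BITS.
 * If now < obj_lru, the clock is assumed to have wrapped around once. */
uint32_t demoIdleTicks(uint32_t now, uint32_t obj_lru) {
    assert(now <= DEMO_LRU_CLOCK_MAX && obj_lru <= DEMO_LRU_CLOCK_MAX);
    if (now >= obj_lru) return now - obj_lru;
    return now + (DEMO_LRU_CLOCK_MAX - obj_lru);
}

/* Among n candidate keys, return the index of the one that has been
 * idle the longest, which an LRU policy would evict first. */
int demoPickEvictionVictim(uint32_t now, const uint32_t *lru, int n) {
    assert(n > 0);
    int best = 0;
    for (int i = 1; i < n; i++) {
        if (demoIdleTicks(now, lru[i]) > demoIdleTicks(now, lru[best])) best = i;
    }
    return best;
}
```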

2. Dictionary

Redis uses a dictionary to manage key-value pairs. The dictionary is the underlying implementation of the database itself and also of hash keys. The following sections look at its three main data structures: the hash table, the hash table node and the dictionary.

2.1 hash table

/* This is our hash table structure. Every dictionary has two of this as we
 * implement incremental rehashing, for the old to the new table. */
typedef struct dictht {
    dictEntry **table;
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;

table is the array of buckets; size is the length of that array; sizemask is the size mask (always size - 1), which is combined with a key's hash value to decide which bucket the key goes into; used is the number of nodes currently stored in the hash table.
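
The role of sizemask is easiest to see in code. The sketch below assumes the table size is a power of two, which is what makes the mask trick work; demoBucketIndex is a hypothetical helper written for this article.

```c
#include <assert.h>
#include <stdint.h>

/* Bucket index for a hash value in a table whose size is a power of two.
 * sizemask is size - 1, so the AND keeps only the low bits of the hash
 * and gives the same result as hash % size. */
unsigned long demoBucketIndex(uint64_t hash, unsigned long size) {
    assert(size != 0 && (size & (size - 1)) == 0); /* must be a power of two */
    unsigned long sizemask = size - 1;
    return (unsigned long)(hash & sizemask);
}
```

For example, with size 8 the mask is 7 (binary 111), so a hash of 13 (binary 1101) lands in bucket 5.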

2.2 hash table node

typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;
} dictEntry;

key stores the key, v stores the value, and next points to another node, chaining together keys that hash to the same bucket. Note that v is a union: the value can be one of the basic types uint64_t, int64_t or double, or a pointer that indirectly refers to an object.
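
Collision handling through next can be sketched as follows. demoEntry is a reduced version of dictEntry with string keys and integer values only, and demoChainPush and demoChainFind are hypothetical names. New entries are pushed at the head of the chain, which keeps insertion O(1).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reduced version of dictEntry: string keys and integer values only. */
typedef struct demoEntry {
    const char *key;
    long val;
    struct demoEntry *next; /* next entry in the same bucket */
} demoEntry;

/* Push a caller-allocated entry onto the front of a bucket's chain
 * and return the new head of the chain. */
demoEntry *demoChainPush(demoEntry *head, demoEntry *e) {
    assert(e != NULL);
    e->next = head;
    return e;
}

/* Walk the chain of one bucket and return the entry whose key matches,
 * or NULL if no entry in the chain has that key. */
demoEntry *demoChainFind(demoEntry *head, const char *key) {
    for (demoEntry *e = head; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) return e;
    }
    return NULL;
}
```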

2.3 dictionary

typedef struct dictType {
    uint64_t (*hashFunction)(const void *key);// Function for calculating hash value
    void *(*keyDup)(void *privdata, const void *key);// Key copy function
    void *(*valDup)(void *privdata, const void *obj);// Value copy function
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);// Comparison function
    void (*keyDestructor)(void *privdata, void *key);// Key destructor
    void (*valDestructor)(void *privdata, void *obj);// Value destructor
} dictType;

typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
    unsigned long iterators; /* number of iterators currently running */
} dict;

type is a pointer to a dictType, which holds a set of functions for operating on key-value pairs of a particular kind; privdata is an optional argument passed to those functions. This resembles the virtual table pointer in C++ and serves the same purpose: polymorphism.
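
Here is a small sketch of that function-pointer polymorphism. demoType keeps only two of the dictType callbacks, and the two behaviours (C-string keys and long keys) plus every demo* name are invented for this example. The string hash is a simple multiplicative hash used purely for illustration, not the hash function Redis uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reduced dictType: just the hash and compare callbacks. */
typedef struct demoType {
    uint64_t (*hashFunction)(const void *key);
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
} demoType;

/* Behaviour for C-string keys. */
static uint64_t demoStrHash(const void *key) {
    uint64_t h = 5381;
    for (const unsigned char *p = key; *p != '\0'; p++) h = h * 33 + *p;
    return h;
}
static int demoStrCompare(void *privdata, const void *k1, const void *k2) {
    (void)privdata; /* unused in this sketch */
    return strcmp(k1, k2) == 0;
}

/* Behaviour for keys that point to a long. */
static uint64_t demoLongHash(const void *key) {
    return (uint64_t)*(const long *)key;
}
static int demoLongCompare(void *privdata, const void *k1, const void *k2) {
    (void)privdata; /* unused in this sketch */
    return *(const long *)k1 == *(const long *)k2;
}

const demoType demoStrType = { demoStrHash, demoStrCompare };
const demoType demoLongType = { demoLongHash, demoLongCompare };

/* Generic code: it never inspects the key itself and only calls through
 * the function table. Returns 1 if the two keys are equal. */
int demoKeysEqual(const demoType *type, const void *k1, const void *k2) {
    assert(type != NULL);
    if (type->hashFunction(k1) != type->hashFunction(k2)) return 0;
    return type->keyCompare(NULL, k1, k2);
}
```

The same demoKeysEqual works for both key kinds; only the table passed in changes, which is the same trick the dictionary plays with its type pointer.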

ht is an array of two dictht hash tables. Normally the dictionary uses only ht[0]; ht[1] is used only during a rehash, and rehashidx records how far the rehash has progressed. When no rehash is in progress, rehashidx is -1. The following figure shows the dictionary, hash table and hash table nodes in the normal state.
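
How ht[0], ht[1] and rehashidx could cooperate during a rehash is sketched below. This is a simplified model under stated assumptions, not the Redis implementation: nodes carry a precomputed hash instead of a key, table sizes are powers of two, no inserts happen while the rehash is running, and demoStartRehash and demoRehashStep are hypothetical names. Each step moves one bucket from ht[0] to ht[1]; when ht[0] is empty, ht[1] takes its place and rehashidx goes back to -1.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct demoNode {
    unsigned long hash;      /* precomputed hash of the key */
    struct demoNode *next;
} demoNode;

typedef struct demoTable {
    demoNode **table;
    unsigned long size;      /* power of two */
    unsigned long sizemask;  /* size - 1 */
    unsigned long used;
} demoTable;

typedef struct demoDict {
    demoTable ht[2];
    long rehashidx;          /* -1 when no rehash is in progress */
} demoDict;

/* Allocate ht[1] with new_size buckets and mark the rehash as started. */
void demoStartRehash(demoDict *d, unsigned long new_size) {
    assert(d->rehashidx == -1);
    assert(new_size != 0 && (new_size & (new_size - 1)) == 0);
    d->ht[1].table = calloc(new_size, sizeof(demoNode *));
    assert(d->ht[1].table != NULL);
    d->ht[1].size = new_size;
    d->ht[1].sizemask = new_size - 1;
    d->ht[1].used = 0;
    d->rehashidx = 0;
}

/* Move the nodes of the next non-empty bucket from ht[0] to ht[1].
 * Returns 1 if buckets remain, 0 once the rehash has finished. */
int demoRehashStep(demoDict *d) {
    if (d->rehashidx == -1) return 0;
    demoTable *src = &d->ht[0], *dst = &d->ht[1];

    if (src->used > 0) {
        /* Buckets before rehashidx are already empty, so used > 0
         * guarantees a non-empty bucket at or after rehashidx. */
        while (src->table[d->rehashidx] == NULL) d->rehashidx++;

        demoNode *n = src->table[d->rehashidx];
        while (n != NULL) {
            demoNode *next = n->next;
            unsigned long idx = n->hash & dst->sizemask; /* index in new table */
            n->next = dst->table[idx];
            dst->table[idx] = n;
            src->used--;
            dst->used++;
            n = next;
        }
        src->table[d->rehashidx] = NULL;
        d->rehashidx++;
    }

    if (src->used == 0) {
        /* Done: the new table becomes ht[0] and ht[1] is reset. */
        free(src->table);
        *src = *dst;
        dst->table = NULL;
        dst->size = dst->sizemask = dst->used = 0;
        d->rehashidx = -1;
        return 0;
    }
    return 1;
}
```

Spreading the work over many small steps like this avoids one long pause when a large table has to grow.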