Redis is an in-memory key-value database. In a key-value (KV) store, each value is stored under exactly one key, and the key is the only index. This structure is generally built on a hash table, which is efficient: lookups take O(1) time on average.
However, a hash table is not a cure-all. As the amount of data grows, hash collisions become more frequent and lookups slow down, so Redis has to rehash and expand the table. As the figure shows, Redis also supports a rich set of data structure types. How does it do this, and how are they stored in memory?
1. Objects in Redis
In Redis, each object is represented by a redisObject structure, shown below. Here we focus on type, encoding and ptr.
    typedef struct redisObject {
        unsigned type:4;
        unsigned encoding:4;
        unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                                * LFU data (least significant 8 bits frequency
                                * and most significant 16 bits access time). */
        int refcount;
        void *ptr;
    } robj;
1.1 type
There are five object types: string, list, set, sorted set and hash.
    /* The actual Redis Object */
    #define OBJ_STRING 0    /* String object. */
    #define OBJ_LIST 1      /* List object. */
    #define OBJ_SET 2       /* Set object. */
    #define OBJ_ZSET 3      /* Sorted set object. */
    #define OBJ_HASH 4      /* Hash object. */
The type of the value stored under a database key can be checked with the TYPE command:
    127.0.0.1:6379> set key "hello world"
    OK
    127.0.0.1:6379> get key
    "hello world"
    127.0.0.1:6379> type key
    string
    127.0.0.1:6379> RPUSH scores 87 89 97
    (integer) 3
    127.0.0.1:6379> type scores
    list
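To make the link between the type field and the TYPE command concrete, here is a minimal sketch in C. The sketch_obj struct and sketch_type_name function are hypothetical names invented for illustration, not Redis's own code; only the OBJ_* constants come from the listing above.

```c
#include <assert.h>
#include <stddef.h>

/* Type constants copied from the listing above. */
#define OBJ_STRING 0
#define OBJ_LIST 1
#define OBJ_SET 2
#define OBJ_ZSET 3
#define OBJ_HASH 4

/* Hypothetical, trimmed-down object header: only the two 4-bit fields. */
typedef struct sketch_obj {
    unsigned type:4;
    unsigned encoding:4;
} sketch_obj;

/* Map the 4-bit type field to the name a TYPE-like command would print. */
const char *sketch_type_name(const sketch_obj *o) {
    assert(o != NULL);
    switch (o->type) {
    case OBJ_STRING: return "string";
    case OBJ_LIST:   return "list";
    case OBJ_SET:    return "set";
    case OBJ_ZSET:   return "zset";
    case OBJ_HASH:   return "hash";
    default:         return "unknown";
    }
}
```

Because type is a 4-bit field it can hold at most 16 distinct values, which is plenty for the five types listed here.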
1.2 encoding
The object's ptr pointer points to the underlying data structure that holds the object's data, and the encoding attribute records which data structure that is. The available encodings are as follows:
    /* Objects encoding. Some kind of objects like Strings and Hashes can be
     * internally represented in multiple ways. The 'encoding' field of the object
     * is set to one of this fields for this object. */
    #define OBJ_ENCODING_RAW 0        /* Raw representation */
    #define OBJ_ENCODING_INT 1        /* Encoded as integer */
    #define OBJ_ENCODING_HT 2         /* Encoded as hash table */
    #define OBJ_ENCODING_ZIPMAP 3     /* Encoded as zipmap */
    #define OBJ_ENCODING_LINKEDLIST 4 /* No longer used: old list encoding. */
    #define OBJ_ENCODING_ZIPLIST 5    /* Encoded as ziplist */
    #define OBJ_ENCODING_INTSET 6     /* Encoded as intset */
    #define OBJ_ENCODING_SKIPLIST 7   /* Encoded as skiplist */
    #define OBJ_ENCODING_EMBSTR 8     /* Embedded sds string encoding */
    #define OBJ_ENCODING_QUICKLIST 9  /* Encoded as linked list of ziplists */
    #define OBJ_ENCODING_STREAM 10    /* Encoded as a radix tree of listpacks */
The encoding of a key's value can be checked with the OBJECT ENCODING command. An object's encoding is not fixed: for example, a string object encoded as int or embstr may later be converted to a raw-encoded string object.
    127.0.0.1:6379> set hi "hello"
    OK
    127.0.0.1:6379> object encoding hi
    "embstr"
    127.0.0.1:6379> sadd numbers 87 69 90
    (integer) 3
    127.0.0.1:6379> object encoding numbes
    (nil)
    127.0.0.1:6379> object encoding numbers
    "intset"
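As a rough illustration of how a string object might end up as int, embstr or raw, the sketch below picks an encoding from the string's content. It is a simplification under stated assumptions, not Redis's actual logic: the function names are hypothetical, and the 44-byte embstr cutoff is an assumption based on recent Redis versions (the text above does not state it).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Encoding constants copied from the listing above. */
#define OBJ_ENCODING_RAW 0
#define OBJ_ENCODING_INT 1
#define OBJ_ENCODING_EMBSTR 8

/* Assumption: strings up to 44 bytes are stored as embstr. */
#define SKETCH_EMBSTR_LIMIT 44

/* Hypothetical helper: 1 if s is a plain decimal integer that fits in a
 * long long, 0 otherwise. */
int sketch_is_integer(const char *s) {
    size_t len = strlen(s);
    char *end;
    if (len == 0 || len > 20) return 0;
    /* strtoll would accept leading spaces or '+'; reject them here. */
    if (s[0] != '-' && (s[0] < '0' || s[0] > '9')) return 0;
    /* Reject leading zeros such as "007": they would not round-trip. */
    if (len > 1 && (s[0] == '0' || (s[0] == '-' && s[1] == '0'))) return 0;
    errno = 0;
    (void)strtoll(s, &end, 10);
    return errno == 0 && end != s && *end == '\0';
}

/* Choose an encoding for a string value: integers first, then short
 * strings as embstr, everything else as raw. */
int sketch_string_encoding(const char *s) {
    assert(s != NULL);
    if (sketch_is_integer(s)) return OBJ_ENCODING_INT;
    if (strlen(s) <= SKETCH_EMBSTR_LIMIT) return OBJ_ENCODING_EMBSTR;
    return OBJ_ENCODING_RAW;
}
```

Under these assumptions, "12345" would be int-encoded, "hello" embstr-encoded, and a 45-byte string raw-encoded, which matches the conversion direction described above.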
1.3 object pointer
The following figure shows a raw-encoded string object.
The following figure shows a list object encoded as a ziplist. Objects with other encodings follow the same structure.
1.4 reference counting for memory reclamation
The Redis object system uses reference counting to reclaim memory. When an object's reference count drops to 0, the object is freed and its memory is reclaimed.
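The reference-counting rule can be sketched in a few lines of C. The names below (sketch_obj, sketch_create, sketch_incr, sketch_decr) are hypothetical and exist only for this illustration; the struct keeps just the refcount and ptr fields from redisObject.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object with only the fields needed for this sketch. */
typedef struct sketch_obj {
    int refcount;
    void *ptr;
} sketch_obj;

/* Create an object holding a private copy of s; it starts with one owner. */
sketch_obj *sketch_create(const char *s) {
    size_t n = strlen(s) + 1;
    sketch_obj *o = malloc(sizeof(*o));
    if (o == NULL) return NULL;
    o->ptr = malloc(n);
    if (o->ptr == NULL) { free(o); return NULL; }
    memcpy(o->ptr, s, n);
    o->refcount = 1;
    return o;
}

/* A new owner starts using the object. */
void sketch_incr(sketch_obj *o) {
    o->refcount++;
}

/* An owner is done with the object. Returns 1 if this call freed it
 * (the count reached 0), or 0 if other owners remain. */
int sketch_decr(sketch_obj *o) {
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        free(o->ptr);
        free(o);
        return 1;
    }
    return 0;
}
```

After sketch_decr returns 1 the pointer must not be used again, since both the payload and the object header have been released.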
1.5 lru
The lru attribute records when the object was last accessed. The idle time of an object is the interval between that moment and the current time, and it can be queried in seconds with the following command:
    127.0.0.1:6379> OBJECT idletime key
    (integer) 1942
If the maxmemory option is set and the eviction policy is volatile-lru or allkeys-lru, then once memory usage exceeds that limit, the keys that have been idle the longest are evicted first.
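The idea of "evict the key that has been idle the longest" can be sketched as follows. This is an illustration under assumptions, not Redis's eviction code: the 24-bit clock width is assumed (the struct above only says LRU_BITS), the function names are hypothetical, and the sketch scans every candidate, whereas Redis, to my understanding, only examines a small sample of keys.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the lru field is a 24-bit clock value that wraps around. */
#define SKETCH_LRU_BITS 24
#define SKETCH_LRU_CLOCK_MAX ((1u << SKETCH_LRU_BITS) - 1)

/* Idle time in clock ticks, tolerating one wrap-around of the clock. */
unsigned sketch_idle_ticks(unsigned now, unsigned last_access) {
    assert(now <= SKETCH_LRU_CLOCK_MAX && last_access <= SKETCH_LRU_CLOCK_MAX);
    if (now >= last_access) return now - last_access;
    return now + (SKETCH_LRU_CLOCK_MAX - last_access);
}

/* Return the index of the candidate with the largest idle time.
 * lru[i] is the last-access clock value of candidate i. */
size_t sketch_pick_victim(const unsigned *lru, size_t n, unsigned now) {
    size_t victim = 0;
    assert(lru != NULL && n > 0);
    for (size_t i = 1; i < n; i++) {
        if (sketch_idle_ticks(now, lru[i]) > sketch_idle_ticks(now, lru[victim]))
            victim = i;
    }
    return victim;
}
```

For example, with last-access values 90, 10 and 50 and a current clock of 100, the second candidate has been idle for 90 ticks and would be chosen.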
2. Dictionary
Redis uses dictionaries to manage key-value pairs. The dictionary is the underlying implementation of the database itself and also an underlying implementation of hash keys. The following sections analyze three important data structures: the hash table, the hash table node and the dictionary.
2.1 hash table
    /* This is our hash table structure. Every dictionary has two of this as we
     * implement incremental rehashing, for the old to the new table. */
    typedef struct dictht {
        dictEntry **table;
        unsigned long size;
        unsigned long sizemask;
        unsigned long used;
    } dictht;
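table is the hash table array, where each slot points to a chain of dictEntry nodes; size is the size of that array; sizemask is the size mask, always equal to size - 1, which is combined with a key's hash value to determine the slot the key belongs in; used is the number of nodes currently stored in the hash table.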
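A short sketch of how sizemask turns a hash value into a slot index. It assumes the table size is always a power of two, which is what makes the bitwise AND equivalent to a modulo; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mask for a table of the given size. Assumption: size is a power of two,
 * so size - 1 has all low bits set. */
unsigned long sketch_sizemask(unsigned long size) {
    assert(size > 0 && (size & (size - 1)) == 0);
    return size - 1;
}

/* Slot index for a hash value: keep only the low bits of the hash. */
unsigned long sketch_bucket_index(uint64_t hash, unsigned long size) {
    return (unsigned long)(hash & sketch_sizemask(size));
}
```

With a table of size 8 the mask is 7, so a hash value of 13 (binary 1101) lands in slot 5 (binary 101), the same result as 13 % 8 but computed with a single AND.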
2.2 Hash table node
    typedef struct dictEntry {
        void *key;
        union {
            void *val;
            uint64_t u64;
            int64_t s64;
            double d;
        } v;
        struct dictEntry *next;
    } dictEntry;
key stores the key, v stores the value, and next points to another node, which resolves key collisions by chaining together nodes whose keys map to the same slot. Note that v is a union: the value can be one of the basic types uint64_t, int64_t or double, or a pointer that refers indirectly to an object.
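The chaining described above can be sketched with a tiny fixed-size table. Everything here is illustrative: the sketch_* names are hypothetical, the byte-sum hash is a toy chosen so that collisions are easy to produce (it is not the hash function Redis uses), and new nodes are simply pushed onto the head of the chain.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_SIZE 4 /* tiny power-of-two table so collisions are easy to see */

/* Node layout mirroring dictEntry, with string keys for simplicity. */
typedef struct sketch_entry {
    const char *key;
    union { void *val; uint64_t u64; int64_t s64; double d; } v;
    struct sketch_entry *next;
} sketch_entry;

typedef struct sketch_table {
    sketch_entry *table[SKETCH_SIZE];
    unsigned long used;
} sketch_table;

/* Toy hash: sum of the key's bytes, so "ab" and "ba" collide. */
unsigned long sketch_hash(const char *key) {
    unsigned long h = 0;
    while (*key) h += (unsigned char)*key++;
    return h;
}

/* Insert a key with an integer value at the head of its slot's chain.
 * Returns 0 on success, -1 if allocation fails. No duplicate check. */
int sketch_add(sketch_table *t, const char *key, int64_t value) {
    unsigned long idx;
    sketch_entry *e;
    assert(t != NULL && key != NULL);
    idx = sketch_hash(key) & (SKETCH_SIZE - 1);
    e = malloc(sizeof(*e));
    if (e == NULL) return -1;
    e->key = key;
    e->v.s64 = value;
    e->next = t->table[idx]; /* the old head becomes the second node */
    t->table[idx] = e;
    t->used++;
    return 0;
}

/* Walk the chain in the key's slot until the key matches. */
sketch_entry *sketch_find(const sketch_table *t, const char *key) {
    unsigned long idx = sketch_hash(key) & (SKETCH_SIZE - 1);
    for (sketch_entry *e = t->table[idx]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) return e;
    }
    return NULL;
}
```

The table must be zero-initialized before use (for example with memset). Adding "ab" and then "ba" puts both nodes in the same slot, linked through next, and sketch_find still returns the right node for each key.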
2.3 dictionary
    typedef struct dictType {
        uint64_t (*hashFunction)(const void *key);                             // Function for calculating hash value
        void *(*keyDup)(void *privdata, const void *key);                      // Key copy function
        void *(*valDup)(void *privdata, const void *obj);                      // Value copy function
        int (*keyCompare)(void *privdata, const void *key1, const void *key2); // Comparison function
        void (*keyDestructor)(void *privdata, void *key);                      // Key destructor
        void (*valDestructor)(void *privdata, void *obj);                      // Value destructor
    } dictType;

    typedef struct dict {
        dictType *type;
        void *privdata;
        dictht ht[2];
        long rehashidx; /* rehashing not in progress if rehashidx == -1 */
        unsigned long iterators; /* number of iterators currently running */
    } dict;
type is a pointer to a dictType structure. dictType holds a set of functions that operate on key-value pairs of a particular kind, and privdata is an optional argument passed to those functions. This is similar to the virtual function table pointer in C++, and it is how the dictionary achieves polymorphism.
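The function-table idea can be sketched as follows: the same comparison call behaves differently depending on which table the dictionary points to. The sketch_* names and the two example tables (one for string keys, one for 64-bit integer keys) are hypothetical, and only two of the dictType callbacks are reproduced to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Trimmed-down function table with two of the dictType callbacks. */
typedef struct sketch_type {
    uint64_t (*hashFunction)(const void *key);
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
} sketch_type;

typedef struct sketch_dict {
    sketch_type *type;
    void *privdata;
} sketch_dict;

/* Callbacks for NUL-terminated string keys (toy multiplicative hash). */
uint64_t sketch_str_hash(const void *key) {
    const unsigned char *p = key;
    uint64_t h = 5381;
    while (*p) h = h * 33 + *p++;
    return h;
}
int sketch_str_equal(void *privdata, const void *a, const void *b) {
    (void)privdata; /* unused in this sketch */
    return strcmp(a, b) == 0;
}

/* Callbacks for keys that point to a uint64_t. */
uint64_t sketch_int_hash(const void *key) {
    return *(const uint64_t *)key;
}
int sketch_int_equal(void *privdata, const void *a, const void *b) {
    (void)privdata; /* unused in this sketch */
    return *(const uint64_t *)a == *(const uint64_t *)b;
}

sketch_type sketch_str_type = { sketch_str_hash, sketch_str_equal };
sketch_type sketch_int_type = { sketch_int_hash, sketch_int_equal };

/* Generic code only talks to the table, never to a concrete key type. */
int sketch_keys_equal(sketch_dict *d, const void *a, const void *b) {
    assert(d != NULL && d->type != NULL && d->type->keyCompare != NULL);
    return d->type->keyCompare(d->privdata, a, b);
}
```

A dictionary created with &sketch_str_type compares keys with strcmp, while one created with &sketch_int_type compares the pointed-to integers, yet sketch_keys_equal itself never changes.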
ht is an array of two dictht hash tables. Under normal circumstances the dictionary only uses ht[0]; ht[1] is used only during a rehash, and rehashidx records the current rehash progress. When no rehash is in progress, rehashidx is -1. The following figure shows the dictionary, hash table and hash table nodes in the normal state.
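The roles of ht[0], ht[1] and rehashidx during a rehash can be sketched as below. This is a simplified model under assumptions, not Redis's dict.c: the rh_* names are hypothetical, keys are unsigned integers that act as their own hash value, and each call to rh_step migrates one non-empty slot.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct rh_entry { unsigned long key; struct rh_entry *next; } rh_entry;
typedef struct rh_table {
    rh_entry **table;
    unsigned long size, sizemask, used;
} rh_table;
typedef struct rh_dict { rh_table ht[2]; long rehashidx; } rh_dict;

static int rh_table_init(rh_table *t, unsigned long size) {
    assert(size > 0 && (size & (size - 1)) == 0); /* power of two */
    t->table = calloc(size, sizeof(rh_entry *));
    if (t->table == NULL) return -1;
    t->size = size; t->sizemask = size - 1; t->used = 0;
    return 0;
}

int rh_init(rh_dict *d, unsigned long size) {
    memset(d, 0, sizeof(*d));
    d->rehashidx = -1; /* no rehash in progress */
    return rh_table_init(&d->ht[0], size);
}

/* While rehashing, new keys go straight into ht[1]. */
int rh_add(rh_dict *d, unsigned long key) {
    rh_table *t = (d->rehashidx != -1) ? &d->ht[1] : &d->ht[0];
    rh_entry *e = malloc(sizeof(*e));
    if (e == NULL) return -1;
    e->key = key;
    e->next = t->table[key & t->sizemask];
    t->table[key & t->sizemask] = e;
    t->used++;
    return 0;
}

/* Lookups must check both tables while a rehash is in progress. */
int rh_contains(const rh_dict *d, unsigned long key) {
    for (int i = 0; i < 2; i++) {
        const rh_table *t = &d->ht[i];
        if (t->table == NULL) continue;
        for (rh_entry *e = t->table[key & t->sizemask]; e != NULL; e = e->next)
            if (e->key == key) return 1;
    }
    return 0;
}

/* Begin a rehash into a new table of new_size slots. */
int rh_start(rh_dict *d, unsigned long new_size) {
    assert(d->rehashidx == -1);
    if (rh_table_init(&d->ht[1], new_size) != 0) return -1;
    d->rehashidx = 0;
    return 0;
}

/* Migrate one non-empty slot. Returns 1 if work remains, 0 when done. */
int rh_step(rh_dict *d) {
    if (d->rehashidx == -1) return 0;
    if (d->ht[0].used > 0) {
        rh_entry *e;
        /* used > 0 guarantees a non-empty slot at or after rehashidx */
        while (d->ht[0].table[d->rehashidx] == NULL) d->rehashidx++;
        e = d->ht[0].table[d->rehashidx];
        while (e != NULL) {
            rh_entry *next = e->next;
            unsigned long idx = e->key & d->ht[1].sizemask;
            e->next = d->ht[1].table[idx];
            d->ht[1].table[idx] = e;
            d->ht[0].used--;
            d->ht[1].used++;
            e = next;
        }
        d->ht[0].table[d->rehashidx++] = NULL;
    }
    if (d->ht[0].used == 0) { /* finished: ht[1] becomes the new ht[0] */
        free(d->ht[0].table);
        d->ht[0] = d->ht[1];
        memset(&d->ht[1], 0, sizeof(d->ht[1]));
        d->rehashidx = -1;
        return 0;
    }
    return 1;
}
```

A typical run would call rh_init with a small size, add some keys, call rh_start with a larger size, and then call rh_step repeatedly (interleaved with normal operations) until it returns 0. Spreading the migration over many small steps is the point of keeping two tables and a progress index.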