Data type analysis of Redis source code - ZipList
This analysis is based on Redis 6.2; details may differ in other versions.
ZipList, the compressed list, can hold any number of nodes.
Basic structures
ZipList
The compressed list. Its overall layout is <zlbytes> <zltail> <zllen> <entry> <entry> ... <entry> <zlend>.
- uint32_t zlbytes: the number of bytes the compressed list occupies, including the four bytes of the zlbytes field itself. Storing this value lets the whole structure be resized without traversing it first.
- uint32_t zltail: the offset of the last node in the compressed list. It allows pop operations at the far end of the list without a full traversal.
- uint16_t zllen: the number of nodes. When this field is less than UINT16_MAX (2^16-1), it holds the node count. When it equals UINT16_MAX, the real node count can only be obtained by traversing the whole list.
- uint8_t zlend: the special value 0xFF that marks the end of the compressed list.
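To make the layout concrete, here is a minimal sketch that reads the three header fields from a raw byte buffer. It assumes the header fields are stored in little-endian order (which is what the intrev32ifbe calls in the macros further down imply). The names zl_header_sketch, read_le_sketch, parse_header_sketch and empty_ziplist_sketch are made up for this illustration and are not part of Redis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type for this sketch, not a Redis structure. */
typedef struct {
    uint32_t zlbytes; /* total bytes, including the header and the end marker */
    uint32_t zltail;  /* offset of the last node */
    uint16_t zllen;   /* number of nodes (or UINT16_MAX when unknown) */
} zl_header_sketch;

/* Read an n-byte little-endian unsigned value (n <= 4). */
static uint32_t read_le_sketch(const unsigned char *p, int n) {
    uint32_t v = 0;
    for (int i = n - 1; i >= 0; i--) v = (v << 8) | p[i];
    return v;
}

/* Decode <zlbytes> <zltail> <zllen> from the first 10 bytes of a ziplist. */
static zl_header_sketch parse_header_sketch(const unsigned char *zl) {
    zl_header_sketch h;
    assert(zl != NULL);
    h.zlbytes = read_le_sketch(zl, 4);
    h.zltail = read_le_sketch(zl + 4, 4);
    h.zllen = (uint16_t)read_le_sketch(zl + 8, 2);
    return h;
}

/* An empty ziplist: a 10-byte header followed directly by the end marker. */
static const unsigned char empty_ziplist_sketch[11] = {
    0x0b, 0x00, 0x00, 0x00, /* zlbytes = 11 */
    0x0a, 0x00, 0x00, 0x00, /* zltail = 10, there is no node yet */
    0x00, 0x00,             /* zllen = 0 */
    0xff                    /* zlend */
};
```

Parsing empty_ziplist_sketch gives zlbytes 11, zltail 10 and zllen 0, and the byte at index zlbytes-1 is the 0xFF end marker.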
Compressed list node
Each node in the compressed list starts with metadata containing two pieces of information, followed by the entry data. The first is the length of the previous node. The second is the encoding of the current node (integer or string). The general structure is <prevlen> <encoding> <entry-data>, but sometimes the encoding holds the node data itself, which gives the simpler structure <prevlen> <encoding>.
- prevlen: the length in bytes of the previous node. The field itself takes 1 or 5 bytes. If the previous node is shorter than 254 bytes, the field is 1 byte long and that byte holds the length. If the previous node is 254 bytes or longer, the field is 5 bytes long: the first byte is set to 254 (0xFE) and the remaining four bytes hold the length.
- encoding: depends on the content of the node. When the node is a string, the first two bits of the first byte give the encoding type used to store the string length, and the rest is the actual length of the string. When the node is an integer, the first two bits are both set to 1, and the next two bits give the type of integer stored after the header. The first byte is always enough to tell the node type (integer or string).
- String, prefix 00: the encoding is one byte long. The string length is stored in 6 bits, so the maximum string length is 2^6-1 bytes.
- String, prefix 01: the encoding is two bytes long. The string length is stored in 14 bits (big endian), so the maximum string length is 2^14-1 bytes.
- String, prefix 10: the encoding is five bytes long. The string length is stored in 32 bits (big endian), so the maximum string length is 2^32-1 bytes. The lower 6 bits of the first byte are unused and set to 0.
- Integer, 11000000: an integer of type int16_t (two bytes).
- Integer, 11010000: an integer of type int32_t (four bytes).
- Integer, 11100000: an integer of type int64_t (eight bytes).
- Integer, 11110000: a 24-bit signed integer (three bytes).
- Integer, 11111110: an 8-bit signed integer (one byte).
- Integer, 1111xxxx with xxxx between 0001 and 1101: an unsigned integer from 0 to 12 stored in the encoding byte itself. The encoded value is actually 1 to 13; because 0000 and 1111 cannot be used, 1 has to be subtracted from the four-bit value to get the real value.
- 11111111: the special end marker of the compressed list.
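The encoding table above can be sketched as a few small classification helpers. This is only an illustration of the bit patterns listed; the function names are invented for this example and the literal byte values are taken from the list above.

```c
#include <assert.h>
#include <stdint.h>

/* 1 if the encoding byte describes a string (top two bits are not 11). */
static int enc_is_string_sketch(uint8_t enc) {
    return (enc & 0xc0) != 0xc0;
}

/* Number of data bytes that follow an integer encoding byte,
 * or -1 if the byte is not a valid integer encoding. */
static int enc_int_payload_sketch(uint8_t enc) {
    switch (enc) {
    case 0xc0: return 2; /* int16_t */
    case 0xd0: return 4; /* int32_t */
    case 0xe0: return 8; /* int64_t */
    case 0xf0: return 3; /* 24-bit signed */
    case 0xfe: return 1; /* 8-bit signed */
    }
    if (enc >= 0xf1 && enc <= 0xfd) return 0; /* value lives in the encoding */
    return -1; /* string encodings and the 0xff end marker land here */
}

/* Value of a 4-bit immediate integer: the low nibble minus 1. */
static int enc_imm_value_sketch(uint8_t enc) {
    assert(enc >= 0xf1 && enc <= 0xfd);
    return (enc & 0x0f) - 1;
}
```

For example, 0xf1 decodes to 0 and 0xfd decodes to 12, while 0xff is rejected because it is the end marker rather than an integer encoding.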
ziplistEntry
A standardized structure for returning the value of a compressed list node.
// Each node in the compressed list is either a string or an integer
typedef struct {
    // If it is a string, sval points to it and slen is its length
    unsigned char *sval;
    unsigned int slen;
    // If it is an integer, sval is NULL and lval holds the integer
    long long lval;
} ziplistEntry;
zlentry
A helper structure that receives information about a compressed list node. This is not how a node is actually encoded; it is only filled in to make operations easier.
typedef struct zlentry {
    unsigned int prevrawlensize; // Number of bytes used to encode the length of the previous node
    unsigned int prevrawlen;     // Length of the previous node
    unsigned int lensize;        // Number of bytes used to encode the node type and length.
                                 // For example, a string has a 1, 2 or 5 byte header,
                                 // and an integer always has a single byte.
    unsigned int len;            // Number of bytes of actual node data. For a string it is the
                                 // string length. For an integer it depends on its range.
    unsigned int headersize;     // Header size = prevrawlensize + lensize
    unsigned char encoding;      // Node encoding
    unsigned char *p;            // Pointer to the start of the node, that is, to its prevlen field
} zlentry;
Macro constant
ZIP_END
#define ZIP_END 255, the special tail node of the compressed list.
ZIP_BIG_PREVLEN
#define ZIP_BIG_PREVLEN 254. The prevlen field in front of each node fits in a single byte only for values up to ZIP_BIG_PREVLEN-1. Otherwise it takes the form FE AA BB CC DD, where AA BB CC DD is a four-byte unsigned integer holding the length of the previous node.
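The two prevlen forms can be sketched as an encode/decode pair. This is a simplified illustration, not the Redis code: it writes the four length bytes explicitly in little-endian order (the order the decoding macro ZIP_DECODE_PREVLEN reads them in), and the names BIG_PREVLEN_SKETCH, prevlen_encode_sketch and prevlen_decode_sketch are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BIG_PREVLEN_SKETCH 254

/* Write len as a prevlen field into buf (at least 5 bytes).
 * Returns the number of bytes used: 1 or 5. */
static size_t prevlen_encode_sketch(unsigned char *buf, uint32_t len) {
    assert(buf != NULL);
    if (len < BIG_PREVLEN_SKETCH) {
        buf[0] = (unsigned char)len;
        return 1;
    }
    buf[0] = BIG_PREVLEN_SKETCH;              /* 0xFE marks the 5-byte form */
    buf[1] = (unsigned char)(len & 0xff);     /* least significant byte first */
    buf[2] = (unsigned char)((len >> 8) & 0xff);
    buf[3] = (unsigned char)((len >> 16) & 0xff);
    buf[4] = (unsigned char)((len >> 24) & 0xff);
    return 5;
}

/* Read a prevlen field; *size receives the field size (1 or 5). */
static uint32_t prevlen_decode_sketch(const unsigned char *buf, size_t *size) {
    if (buf[0] < BIG_PREVLEN_SKETCH) {
        *size = 1;
        return buf[0];
    }
    *size = 5;
    return (uint32_t)buf[1] | ((uint32_t)buf[2] << 8) |
           ((uint32_t)buf[3] << 16) | ((uint32_t)buf[4] << 24);
}
```

A length of 253 still fits in one byte, while 254 already needs the five-byte form FE FE 00 00 00.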
ZIP_STR_MASK
#define ZIP_STR_MASK 0xc0, string mask (1100 0000).
ZIP_INT_MASK
#define ZIP_INT_MASK 0x30, integer mask (0011 0000).
ZIP_STR_06B
#define ZIP_STR_06B (0 << 6), a string whose length is stored in 6 bits (0000 0000).
ZIP_STR_14B
#define ZIP_STR_14B (1 << 6), a string whose length is stored in 14 bits (0100 0000).
ZIP_STR_32B
#define ZIP_STR_32B (2 << 6), a string whose length is stored in 32 bits (1000 0000).
ZIP_INT_16B
#define ZIP_INT_16B (0xc0 | 0<<4), 16-bit signed integer (int16_t) (1100 0000).
ZIP_INT_32B
#define ZIP_INT_32B (0xc0 | 1<<4), 32-bit signed integer (int32_t) (1101 0000).
ZIP_INT_64B
#define ZIP_INT_64B (0xc0 | 2<<4), 64-bit signed integer (int64_t) (1110 0000).
ZIP_INT_24B
#define ZIP_INT_24B (0xc0 | 3<<4), 24-bit signed integer (1111 0000).
ZIP_INT_8B
#define ZIP_INT_8B 0xfe, 8-bit signed integer (1111 1110).
ZIP_INT_IMM_MASK
#define ZIP_INT_IMM_MASK 0x0f, 4-bit unsigned integer mask.
ZIP_INT_IMM_MIN
#define ZIP_INT_IMM_MIN 0xf1, 4-bit unsigned integer, minimum value 0.
ZIP_INT_IMM_MAX
#define ZIP_INT_IMM_MAX 0xfd, 4-bit unsigned integer, max. 12.
INT24_MAX
#define INT24_MAX 0x7fffff, the maximum value of a 24 bit signed integer.
INT24_MIN
#define INT24_MIN (-INT24_MAX - 1), the minimum value of a 24 bit signed integer.
ZIP_ENCODING_SIZE_INVALID
#define ZIP_ENCODING_SIZE_INVALID 0xff, invalid value for encoding size.
Macro function
ZIP_IS_STR
Check whether the specified encoding enc represents a string. String nodes never have 11 as the two most significant bits of the first byte.
#define ZIP_IS_STR(enc) (((enc) & ZIP_STR_MASK) < ZIP_STR_MASK)
ZIPLIST_BYTES
Accesses the total number of bytes of the compressed list (the zlbytes field). The macro dereferences the field, so it can be both read and assigned.
#define ZIPLIST_BYTES(zl) (*((uint32_t*)(zl)))
// ziplist structure:
// zlbytes zltail zllen entry...entry zlend
ZIPLIST_TAIL_OFFSET
Accesses the offset of the last node in the compressed list (the zltail field).
#define ZIPLIST_TAIL_OFFSET(zl) (*((uint32_t*)((zl)+sizeof(uint32_t))))
// zl plus the 32-bit zlbytes field -> zltail
ZIPLIST_LENGTH
Accesses the number of nodes in the compressed list (the zllen field). If it equals UINT16_MAX, the whole list has to be traversed to count the nodes.
#define ZIPLIST_LENGTH(zl) (*((uint16_t*)((zl)+sizeof(uint32_t)*2)))
ZIPLIST_HEADER_SIZE
Compressed list header size: two 32-bit integers for the total number of bytes and the offset of the last node, plus one 16-bit integer for the number of nodes.
#define ZIPLIST_HEADER_SIZE (sizeof(uint32_t)*2+sizeof(uint16_t))
ZIPLIST_END_SIZE
Compressed list end node size. Only one byte.
#define ZIPLIST_END_SIZE (sizeof(uint8_t))
ZIPLIST_ENTRY_HEAD
Returns the pointer to the first node in the compressed list. That is, ziplist+headerSize.
#define ZIPLIST_ENTRY_HEAD(zl) ((zl)+ZIPLIST_HEADER_SIZE)
ZIPLIST_ENTRY_TAIL
Returns the pointer to the last node in the compressed list.
#define ZIPLIST_ENTRY_TAIL(zl) ((zl)+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl)))
ZIPLIST_ENTRY_END
Returns the pointer to the last byte of the compressed list, that is, the end node FF.
#define ZIPLIST_ENTRY_END(zl) ((zl)+intrev32ifbe(ZIPLIST_BYTES(zl))-1)
ZIPLIST_INCR_LENGTH
Add incr to the node count zllen in the compressed list header, unless the count has already reached UINT16_MAX.
#define ZIPLIST_INCR_LENGTH(zl,incr) { \
    if (ZIPLIST_LENGTH(zl) < UINT16_MAX) \
        ZIPLIST_LENGTH(zl) = intrev16ifbe(intrev16ifbe(ZIPLIST_LENGTH(zl))+incr); \
}
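The saturating behavior of this counter is easy to miss, so here is a small stand-alone sketch of the same rule without the byte-order handling. The function name incr_len_sketch is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Apply incr to a 16-bit node counter, but leave it untouched once it
 * has reached UINT16_MAX, which means "count unknown, traverse the list". */
static uint16_t incr_len_sketch(uint16_t zllen, int incr) {
    assert(incr >= -UINT16_MAX && incr <= UINT16_MAX);
    if (zllen < UINT16_MAX) return (uint16_t)(zllen + incr);
    return zllen;
}
```

Note that once the counter is at UINT16_MAX, a negative incr does not bring it back down either; under this rule alone the field stays saturated.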
ZIPLIST_ENTRY_ZERO
Initialize (zero) a zlentry helper structure.
#define ZIPLIST_ENTRY_ZERO(zle) { \
    (zle)->prevrawlensize = (zle)->prevrawlen = 0; \
    (zle)->lensize = (zle)->len = (zle)->headersize = 0; \
    (zle)->encoding = 0; \
    (zle)->p = NULL; \
}
ZIP_ENTRY_ENCODING
Read the encoding from the byte at ptr and store it in encoding (the encoding field of a zlentry). For string encodings only the two type bits are kept.
#define ZIP_ENTRY_ENCODING(ptr, encoding) do { \
    (encoding) = ((ptr)[0]); \
    if ((encoding) < ZIP_STR_MASK) (encoding) &= ZIP_STR_MASK; \
} while(0)
ZIP_ASSERT_ENCODING
Assert that the encoding is valid.
#define ZIP_ASSERT_ENCODING(encoding) do { \
    assert(zipEncodingLenSize(encoding) != ZIP_ENCODING_SIZE_INVALID); \
} while (0)
ZIP_DECODE_LENGTH
Decode the length information held in the encoding at ptr. lensize receives the number of bytes used by the encoding, and len receives the length of the node data (the string length, or the number of bytes of the integer). This is the counterpart of zipStoreEntryEncoding.
#define ZIP_DECODE_LENGTH(ptr, encoding, lensize, len) do { \
    if ((encoding) < ZIP_STR_MASK) { \
        /* String: the encoding is 1, 2 or 5 bytes long */ \
        if ((encoding) == ZIP_STR_06B) { \
            (lensize) = 1; \
            (len) = (ptr)[0] & 0x3f; \
        } else if ((encoding) == ZIP_STR_14B) { \
            (lensize) = 2; \
            (len) = (((ptr)[0] & 0x3f) << 8) | (ptr)[1]; \
        } else if ((encoding) == ZIP_STR_32B) { \
            (lensize) = 5; \
            (len) = ((ptr)[1] << 24) | \
                    ((ptr)[2] << 16) | \
                    ((ptr)[3] << 8) | \
                    ((ptr)[4]); \
        } else { \
            /* Invalid encoding */ \
            (lensize) = 0; \
            (len) = 0; \
        } \
    } else { \
        /* Integer: the encoding is 1 byte, the data length depends on the encoding */ \
        (lensize) = 1; \
        if ((encoding) == ZIP_INT_8B) (len) = 1; \
        else if ((encoding) == ZIP_INT_16B) (len) = 2; \
        else if ((encoding) == ZIP_INT_24B) (len) = 3; \
        else if ((encoding) == ZIP_INT_32B) (len) = 4; \
        else if ((encoding) == ZIP_INT_64B) (len) = 8; \
        else if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) \
            (len) = 0; /* 4 bit immediate */ \
        else \
            (lensize) = (len) = 0; /* Invalid encoding */ \
    } \
} while(0)
ZIP_DECODE_PREVLENSIZE
Determine the number of bytes used to encode the length of the previous node and store it in prevlensize.
#define ZIP_DECODE_PREVLENSIZE(ptr, prevlensize) do { \
    if ((ptr)[0] < ZIP_BIG_PREVLEN) { \
        (prevlensize) = 1; \
    } else { \
        (prevlensize) = 5; \
    } \
} while(0)
ZIP_DECODE_PREVLEN
Decode the length of the previous node into prevlen, and the number of bytes used to encode that length into prevlensize.
#define ZIP_DECODE_PREVLEN(ptr, prevlensize, prevlen) do { \
    ZIP_DECODE_PREVLENSIZE(ptr, prevlensize); \
    if ((prevlensize) == 1) { \
        (prevlen) = (ptr)[0]; \
    } else { /* prevlensize == 5 */ \
        (prevlen) = ((ptr)[4] << 24) | \
                    ((ptr)[3] << 16) | \
                    ((ptr)[2] << 8) | \
                    ((ptr)[1]); \
    } \
} while(0)
Internal function
zipEncodingLenSize
Returns the number of bytes used to encode the node type and length, or ZIP_ENCODING_SIZE_INVALID on error.
static inline unsigned int zipEncodingLenSize(unsigned char encoding) {
    if (encoding == ZIP_INT_16B || encoding == ZIP_INT_32B ||
        encoding == ZIP_INT_24B || encoding == ZIP_INT_64B ||
        encoding == ZIP_INT_8B)
        return 1; // An integer encoding is always one byte
    if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) // 4-bit immediate integer
        return 1;
    if (encoding == ZIP_STR_06B)
        return 1;
    if (encoding == ZIP_STR_14B)
        return 2;
    if (encoding == ZIP_STR_32B)
        return 5;
    return ZIP_ENCODING_SIZE_INVALID;
}
zipIntSize
Returns the number of bytes required to store an integer with the given encoding.
static inline unsigned int zipIntSize(unsigned char encoding) {
    switch(encoding) {
    case ZIP_INT_8B:  return 1;
    case ZIP_INT_16B: return 2;
    case ZIP_INT_24B: return 3;
    case ZIP_INT_32B: return 4;
    case ZIP_INT_64B: return 8;
    }
    if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX)
        return 0; /* The 4-bit value is stored in the encoding itself */
    redis_unreachable(); // abort
    return 0;
}
zipStoreEntryEncoding
Write the encoding header of the node to p. If p is NULL, only return the number of bytes needed to encode this length.
unsigned int zipStoreEntryEncoding(unsigned char *p, unsigned char encoding, unsigned int rawlen) {
    unsigned char len = 1, buf[5]; // The encoding is at most 5 bytes long, hence buf[5]

    if (ZIP_IS_STR(encoding)) { // String
        // An encoding is given, but it may not be set for strings,
        // so it is determined here from rawlen
        if (rawlen <= 0x3f) { // 1 byte, up to 63 (2^6-1), prefix 00
            if (!p) return len;
            buf[0] = ZIP_STR_06B | rawlen; // 2 type bits + 6 length bits
        } else if (rawlen <= 0x3fff) { // 2 bytes, up to 16383 (2^14-1), prefix 01
            len += 1;
            if (!p) return len;
            // ZIP_STR_14B is 0100 0000, 0x3f is 0011 1111
            buf[0] = ZIP_STR_14B | ((rawlen >> 8) & 0x3f); // 2 type bits + high 6 length bits
            buf[1] = rawlen & 0xff; // Low 8 length bits
        } else { // 5 bytes, prefix 10
            len += 4;
            if (!p) return len;
            buf[0] = ZIP_STR_32B; // The type takes a byte of its own
            // The remaining four bytes hold the length
            buf[1] = (rawlen >> 24) & 0xff;
            buf[2] = (rawlen >> 16) & 0xff;
            buf[3] = (rawlen >> 8) & 0xff;
            buf[4] = rawlen & 0xff;
        }
    } else { // Integer
        if (!p) return len;
        buf[0] = encoding;
    }

    memcpy(p,buf,len); // Store the encoding
    return len;
}
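As a worked example of the three string headers, the sketch below encodes a string length and decodes it again, using plain byte values instead of the Redis macros. The names strhdr_encode_sketch and strhdr_decode_sketch are invented for this illustration; integer encodings are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the string header for rawlen into buf (at least 5 bytes).
 * Returns the header size: 1, 2 or 5 bytes. */
static size_t strhdr_encode_sketch(unsigned char *buf, uint32_t rawlen) {
    assert(buf != NULL);
    if (rawlen <= 0x3f) {                    /* prefix 00, 6-bit length */
        buf[0] = (unsigned char)rawlen;
        return 1;
    }
    if (rawlen <= 0x3fff) {                  /* prefix 01, 14-bit length */
        buf[0] = (unsigned char)(0x40 | ((rawlen >> 8) & 0x3f));
        buf[1] = (unsigned char)(rawlen & 0xff);
        return 2;
    }
    buf[0] = 0x80;                           /* prefix 10, 32-bit length */
    buf[1] = (unsigned char)((rawlen >> 24) & 0xff);
    buf[2] = (unsigned char)((rawlen >> 16) & 0xff);
    buf[3] = (unsigned char)((rawlen >> 8) & 0xff);
    buf[4] = (unsigned char)(rawlen & 0xff);
    return 5;
}

/* Read a string header; *hdrsize receives 1, 2 or 5,
 * or 0 if the byte is not a string encoding. */
static uint32_t strhdr_decode_sketch(const unsigned char *buf, size_t *hdrsize) {
    switch (buf[0] & 0xc0) {
    case 0x00: *hdrsize = 1; return buf[0] & 0x3f;
    case 0x40: *hdrsize = 2; return ((uint32_t)(buf[0] & 0x3f) << 8) | buf[1];
    case 0x80: *hdrsize = 5;
        return ((uint32_t)buf[1] << 24) | ((uint32_t)buf[2] << 16) |
               ((uint32_t)buf[3] << 8) | (uint32_t)buf[4];
    default:   *hdrsize = 0; return 0; /* integer encoding, not handled here */
    }
}
```

A 63-byte string gets the single header byte 0x3f, a 64-byte string gets 40 40, and a 16384-byte string needs the five bytes 80 00 00 40 00.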
zipStorePrevEntryLengthLarge
Encode the length of the previous node and write it to p, always using the 5-byte form. It is used only where the larger encoding is required (see __ziplistCascadeUpdate).
int zipStorePrevEntryLengthLarge(unsigned char *p, unsigned int len) {
    uint32_t u32;
    if (p != NULL) {
        p[0] = ZIP_BIG_PREVLEN; // 254, marks the 5-byte form
        u32 = len;
        memcpy(p+1,&u32,sizeof(u32));
        memrev32ifbe(p+1);
    }
    return 1 + sizeof(uint32_t);
}
zipStorePrevEntryLength
Encode the length of the previous node and write it to p. If p is NULL, only return the number of bytes needed to encode this length.
unsigned int zipStorePrevEntryLength(unsigned char *p, unsigned int len) {
    if (p == NULL) {
        return (len < ZIP_BIG_PREVLEN) ? 1 : sizeof(uint32_t) + 1;
    } else {
        if (len < ZIP_BIG_PREVLEN) { // Short form or long form
            p[0] = len;
            return 1;
        } else {
            return zipStorePrevEntryLengthLarge(p,len);
        }
    }
}
zipPrevLenByteDiff
Returns the difference in the number of bytes needed to encode prevlen when the size of the previous node changes to len: the field size needed for len minus the field size currently stored at p. The result is 0, 4 or -4.
int zipPrevLenByteDiff(unsigned char *p, unsigned int len) {
    unsigned int prevlensize; // Size of the current prevlen field
    ZIP_DECODE_PREVLENSIZE(p, prevlensize); // Get the old prevlensize
    return zipStorePrevEntryLength(NULL, len) - prevlensize;
}
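The possible results can be checked with a tiny model that works on field sizes alone, without touching a real ziplist. Both function names below are made up for this sketch.

```c
#include <assert.h>

/* Size of the prevlen field needed to store a previous-node length. */
static int prevlen_size_sketch(unsigned int len) {
    return len < 254 ? 1 : 5;
}

/* Model of zipPrevLenByteDiff: new field size minus the old field size. */
static int prevlen_byte_diff_sketch(int old_field_size, unsigned int new_prev_len) {
    assert(old_field_size == 1 || old_field_size == 5);
    return prevlen_size_sketch(new_prev_len) - old_field_size;
}
```

A 1-byte field that must now describe a 300-byte neighbor yields +4, a 5-byte field that now describes a 10-byte neighbor yields -4, and every other combination yields 0.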
zipTryEncoding
Check whether the string in entry can be stored as an integer. If so, return 1 and store the integer in v and the matching encoding in encoding; otherwise return 0.
int zipTryEncoding(unsigned char *entry, unsigned int entrylen, long long *v, unsigned char *encoding) {
    // *v receives the integer value, *encoding the matching encoding
    long long value;

    if (entrylen >= 32 || entrylen == 0) return 0;
    if (string2ll((char*)entry,entrylen,&value)) { // string2ll converts a string to a long long
        // Choose the encoding from the range the integer falls in
        if (value >= 0 && value <= 12) {
            *encoding = ZIP_INT_IMM_MIN+value; // Encoding 1111 xxxx
        } else if (value >= INT8_MIN && value <= INT8_MAX) {
            *encoding = ZIP_INT_8B;
        } else if (value >= INT16_MIN && value <= INT16_MAX) {
            *encoding = ZIP_INT_16B;
        } else if (value >= INT24_MIN && value <= INT24_MAX) {
            *encoding = ZIP_INT_24B;
        } else if (value >= INT32_MIN && value <= INT32_MAX) {
            *encoding = ZIP_INT_32B;
        } else {
            *encoding = ZIP_INT_64B;
        }
        *v = value; // Store the integer
        return 1;
    }
    return 0;
}
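The range checks pick the smallest encoding that can hold the value. The sketch below isolates just that selection step, with the encoding bytes written as literals; the function name pick_int_encoding_sketch is an assumption for this example, and the string-to-integer conversion done by string2ll is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Return the encoding byte for an integer value, smallest form first. */
static unsigned char pick_int_encoding_sketch(long long value) {
    unsigned char enc;
    if (value >= 0 && value <= 12)
        enc = (unsigned char)(0xf1 + value);  /* 4-bit immediate, 0xf1..0xfd */
    else if (value >= INT8_MIN && value <= INT8_MAX)
        enc = 0xfe;                           /* 8-bit */
    else if (value >= INT16_MIN && value <= INT16_MAX)
        enc = 0xc0;                           /* 16-bit */
    else if (value >= -8388608LL && value <= 8388607LL)
        enc = 0xf0;                           /* 24-bit */
    else if (value >= INT32_MIN && value <= INT32_MAX)
        enc = 0xd0;                           /* 32-bit */
    else
        enc = 0xe0;                           /* 64-bit */
    assert((enc & 0xc0) == 0xc0);             /* every integer encoding starts with 11 */
    return enc;
}
```

So 12 is stored entirely inside the encoding byte 0xfd, 13 needs one data byte, and 32768 is the first positive value that moves from the 16-bit to the 24-bit form.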
zipSaveInteger
Save the integer value to p using the given encoding.
void zipSaveInteger(unsigned char *p, int64_t value, unsigned char encoding) {
    int16_t i16;
    int32_t i32;
    int64_t i64;
    if (encoding == ZIP_INT_8B) {
        ((int8_t*)p)[0] = (int8_t)value;
    } else if (encoding == ZIP_INT_16B) {
        i16 = value;
        memcpy(p,&i16,sizeof(i16));
        memrev16ifbe(p);
    } else if (encoding == ZIP_INT_24B) { // 24 bits
        i32 = value<<8;
        memrev32ifbe(&i32);
        memcpy(p,((uint8_t*)&i32)+1,sizeof(i32)-sizeof(uint8_t));
    } else if (encoding == ZIP_INT_32B) {
        i32 = value;
        memcpy(p,&i32,sizeof(i32));
        memrev32ifbe(p);
    } else if (encoding == ZIP_INT_64B) {
        i64 = value;
        memcpy(p,&i64,sizeof(i64));
        memrev64ifbe(p);
    } else if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) {
        // Nothing to do, the value is stored in the encoding itself
    } else {
        assert(NULL);
    }
}
zipLoadInteger
Read an integer value from p. This is the inverse of zipSaveInteger.
int64_t zipLoadInteger(unsigned char *p, unsigned char encoding) {
    int16_t i16;
    int32_t i32;
    int64_t i64, ret = 0;
    if (encoding == ZIP_INT_8B) {
        ret = ((int8_t*)p)[0];
    } else if (encoding == ZIP_INT_16B) {
        memcpy(&i16,p,sizeof(i16));
        memrev16ifbe(&i16);
        ret = i16;
    } else if (encoding == ZIP_INT_32B) {
        memcpy(&i32,p,sizeof(i32));
        memrev32ifbe(&i32);
        ret = i32;
    } else if (encoding == ZIP_INT_24B) {
        i32 = 0;
        memcpy(((uint8_t*)&i32)+1,p,sizeof(i32)-sizeof(uint8_t));
        memrev32ifbe(&i32);
        ret = i32>>8;
    } else if (encoding == ZIP_INT_64B) {
        memcpy(&i64,p,sizeof(i64));
        memrev64ifbe(&i64);
        ret = i64;
    } else if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) {
        ret = (encoding & ZIP_INT_IMM_MASK)-1; // The stored value is 1 larger than the real one
    } else {
        assert(NULL);
    }
    return ret;
}
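The 24-bit case is the least obvious one, because C has no native 3-byte integer type. As I read the shift-and-copy trick above, the result is that the three low bytes of the value end up in little-endian order. The sketch below does the same thing byte by byte and sign-extends by hand when loading; it is a portable illustration under that assumption, not the Redis code, and both function names are invented.

```c
#include <assert.h>
#include <stdint.h>

/* Store the low 24 bits of value at p, least significant byte first. */
static void save_int24_sketch(unsigned char *p, int32_t value) {
    uint32_t u = (uint32_t)value;
    assert(value >= -8388608 && value <= 8388607);
    p[0] = (unsigned char)(u & 0xff);
    p[1] = (unsigned char)((u >> 8) & 0xff);
    p[2] = (unsigned char)((u >> 16) & 0xff);
}

/* Load a 24-bit signed integer stored by save_int24_sketch. */
static int32_t load_int24_sketch(const unsigned char *p) {
    int32_t v = (int32_t)p[0] | ((int32_t)p[1] << 8) | ((int32_t)p[2] << 16);
    if (v & 0x800000) v -= 0x1000000; /* bit 23 is the sign bit: extend it */
    return v;
}
```

Storing 0x123456 produces the bytes 56 34 12, and -1 produces FF FF FF, which loads back as -1 thanks to the sign extension.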
zipEntry
Fill the structure e with the information of the node at p.
static inline void zipEntry(unsigned char *p, zlentry *e) {
    ZIP_DECODE_PREVLEN(p, e->prevrawlensize, e->prevrawlen);
    ZIP_ENTRY_ENCODING(p + e->prevrawlensize, e->encoding);
    ZIP_DECODE_LENGTH(p + e->prevrawlensize, e->encoding, e->lensize, e->len);
    assert(e->lensize != 0); // Verify that the encoding is valid
    e->headersize = e->prevrawlensize + e->lensize;
    e->p = p;
}
zipEntrySafe
The safe version of zipEntry, intended for untrusted pointers. It makes sure that no memory outside the ziplist is accessed. It returns 1 if the node is valid and 0 otherwise.
static inline int zipEntrySafe(unsigned char* zl, size_t zlbytes, unsigned char *p, zlentry *e, int validate_prevlen) {
    unsigned char *zlfirst = zl + ZIPLIST_HEADER_SIZE; // First node
    unsigned char *zllast = zl + zlbytes - ZIPLIST_END_SIZE; // End marker
    // Helper macro: is the pointer outside the node area of the ziplist?
#define OUT_OF_RANGE(p) (unlikely((p) < zlfirst || (p) > zllast))

    // If the header cannot possibly reach past the end of the ziplist, take a shortcut
    // (lensize and prevrawlensize are at most 5 bytes each)
    if (p >= zlfirst && p + 10 < zllast) {
        ZIP_DECODE_PREVLEN(p, e->prevrawlensize, e->prevrawlen);
        ZIP_ENTRY_ENCODING(p + e->prevrawlensize, e->encoding);
        ZIP_DECODE_LENGTH(p + e->prevrawlensize, e->encoding, e->lensize, e->len);
        e->headersize = e->prevrawlensize + e->lensize;
        e->p = p;
        if (unlikely(e->lensize == 0)) // Invalid encoding
            return 0;
        if (OUT_OF_RANGE(p + e->headersize + e->len)) // The node reaches past the ziplist
            return 0;
        if (validate_prevlen && OUT_OF_RANGE(p - e->prevrawlen)) // prevlen points outside the ziplist
            return 0;
        return 1;
    }

    if (OUT_OF_RANGE(p)) // The pointer itself is out of range
        return 0;

    ZIP_DECODE_PREVLENSIZE(p, e->prevrawlensize); // Set prevrawlensize
    if (OUT_OF_RANGE(p + e->prevrawlensize))
        return 0;

    // Check that the encoding is valid
    ZIP_ENTRY_ENCODING(p + e->prevrawlensize, e->encoding);
    e->lensize = zipEncodingLenSize(e->encoding);
    if (unlikely(e->lensize == ZIP_ENCODING_SIZE_INVALID))
        return 0;

    // Check that the node header stays in range
    if (OUT_OF_RANGE(p + e->prevrawlensize + e->lensize))
        return 0;

    // Decode prevlen and the node length
    ZIP_DECODE_PREVLEN(p, e->prevrawlensize, e->prevrawlen);
    ZIP_DECODE_LENGTH(p + e->prevrawlensize, e->encoding, e->lensize, e->len);
    e->headersize = e->prevrawlensize + e->lensize;

    // Check that the whole node stays in range
    if (OUT_OF_RANGE(p + e->headersize + e->len))
        return 0;

    // Check that prevlen stays in range
    if (validate_prevlen && OUT_OF_RANGE(p - e->prevrawlen))
        return 0;

    e->p = p;
    return 1;
#undef OUT_OF_RANGE // Remove the helper macro
}
zipRawEntryLengthSafe
Calculate the total number of bytes occupied by the node at p, validating the node first.
static inline unsigned int zipRawEntryLengthSafe(unsigned char* zl, size_t zlbytes, unsigned char *p) {
    zlentry e;
    assert(zipEntrySafe(zl, zlbytes, p, &e, 0));
    return e.headersize + e.len;
}
zipRawEntryLength
Calculate the total number of bytes occupied by the node at p. This is the unchecked version of zipRawEntryLengthSafe.
static inline unsigned int zipRawEntryLength(unsigned char *p) {
    zlentry e;
    zipEntry(p, &e);
    return e.headersize + e.len;
}
zipAssertValidEntry
Assert that the node at p is valid and does not reach outside the compressed list.
static inline void zipAssertValidEntry(unsigned char* zl, size_t zlbytes, unsigned char *p) {
    zlentry e;
    assert(zipEntrySafe(zl, zlbytes, p, &e, 1));
}
ziplistResize
Resize the compressed list.
unsigned char *ziplistResize(unsigned char *zl, unsigned int len) {
    zl = zrealloc(zl,len); // Reallocate the memory
    ZIPLIST_BYTES(zl) = intrev32ifbe(len); // Set the total length
    zl[len-1] = ZIP_END; // Rewrite the end marker
    return zl;
}
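Since the raw node length is headersize + len, a ziplist can be walked forward by repeatedly adding that length to the current pointer until the end marker is reached. The sketch below does this on a hand-built buffer holding the string "hi" and the integer 5. It performs no bounds checking (so it corresponds to the unchecked variant), and raw_entry_len_sketch, count_entries_sketch and two_entries_sketch are names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Total size in bytes of the node at p, or 0 for an unknown encoding. */
static size_t raw_entry_len_sketch(const unsigned char *p) {
    size_t prevlensize = (p[0] < 254) ? 1 : 5;
    const unsigned char *q = p + prevlensize; /* first encoding byte */
    switch (q[0] & 0xc0) {
    case 0x00: return prevlensize + 1 + (size_t)(q[0] & 0x3f);
    case 0x40: return prevlensize + 2 + (((size_t)(q[0] & 0x3f) << 8) | q[1]);
    case 0x80: return prevlensize + 5 + (((size_t)q[1] << 24) | ((size_t)q[2] << 16) |
                                         ((size_t)q[3] << 8) | (size_t)q[4]);
    }
    switch (q[0]) {
    case 0xfe: return prevlensize + 1 + 1; /* 8-bit integer */
    case 0xc0: return prevlensize + 1 + 2; /* 16-bit */
    case 0xf0: return prevlensize + 1 + 3; /* 24-bit */
    case 0xd0: return prevlensize + 1 + 4; /* 32-bit */
    case 0xe0: return prevlensize + 1 + 8; /* 64-bit */
    }
    if (q[0] >= 0xf1 && q[0] <= 0xfd) return prevlensize + 1; /* immediate */
    return 0;
}

/* Count nodes by walking from the first node to the 0xff end marker. */
static unsigned int count_entries_sketch(const unsigned char *zl) {
    const unsigned char *p = zl + 10; /* skip the 10-byte header */
    unsigned int n = 0;
    assert(zl != NULL);
    while (p[0] != 0xff) {
        size_t len = raw_entry_len_sketch(p);
        if (len == 0) break; /* unknown encoding: stop instead of looping */
        p += len;
        n++;
    }
    return n;
}

/* "hi" followed by the integer 5. */
static const unsigned char two_entries_sketch[17] = {
    0x11, 0x00, 0x00, 0x00, /* zlbytes = 17 */
    0x0e, 0x00, 0x00, 0x00, /* zltail = 14 */
    0x02, 0x00,             /* zllen = 2 */
    0x00, 0x02, 'h', 'i',   /* prevlen 0, 6-bit string of length 2 */
    0x04, 0xf6,             /* prevlen 4, immediate integer 6-1 = 5 */
    0xff                    /* zlend */
};
```

The first node is 4 bytes long, which is exactly the prevlen value stored in the second node, and the second node starts at offset 14, which is the value of zltail.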
__ziplistCascadeUpdate
Cascade update. When a node is inserted, the prevlen field of the next node has to be set to the length of the inserted node. It can happen that this length no longer fits in 1 byte, so the next node has to grow to hold a 5-byte prevlen. This comes for free, because it only happens while a node is already being inserted (which causes a realloc and memmove anyway). However, growing that node changes its own length, so the prevlen field of the node after it may have to grow as well. When there is a run of consecutive nodes with lengths close to ZIP_BIG_PREVLEN, this effect can ripple through the entire compressed list, so every following node has to be checked until one is found whose prevlen field is already large enough. The reverse case, where a prevlen field could shrink, is not propagated: as the code below shows, a field that is larger than necessary is simply left at 5 bytes.
unsigned char *__ziplistCascadeUpdate(unsigned char *zl, unsigned char *p) {
    // p points to the first node that does NOT need to be updated
    zlentry cur; // Scratch node information
    size_t prevlen, prevlensize, prevoffset; // Information about the last updated node
    size_t firstentrylen; // Used to handle insertion at the head
    size_t rawlen, curlen = intrev32ifbe(ZIPLIST_BYTES(zl));
    size_t extra = 0, cnt = 0, offset;
    size_t delta = 4; // Extra bytes needed to grow a node's prevlen field (5-1)
    unsigned char *tail = zl + intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl)); // Tail node

    // Empty compressed list
    if (p[0] == ZIP_END) return zl;

    // The safe version is not needed here: the input pointer was validated
    // by the function that returned it
    zipEntry(p, &cur);
    firstentrylen = prevlen = cur.headersize + cur.len; // Total length of the current node = header + content
    prevlensize = zipStorePrevEntryLength(NULL, prevlen); // Size of the prevlen field needed for it
    prevoffset = p - zl; // Offset from zl
    p += prevlen; // prevlen is the total length of the current node, so this moves to the next node

    // Walk the list to find how many extra bytes the update needs
    while (p[0] != ZIP_END) {
        assert(zipEntrySafe(zl, curlen, p, &cur, 0)); // The node must be valid

        // Stop when prevlen does not change
        if (cur.prevrawlen == prevlen) break;

        // Stop when the node's prevlen field is already large enough
        if (cur.prevrawlensize >= prevlensize) {
            if (cur.prevrawlensize == prevlensize) {
                // The field has exactly the required size
                zipStorePrevEntryLength(p, prevlen);
            } else {
                // A 1-byte field would shrink the node, which is avoided:
                // the value is written into the existing 5-byte field
                zipStorePrevEntryLengthLarge(p, prevlen);
            }
            break;
        }

        // cur.prevrawlen is 0 if cur used to be the head node. Otherwise the old
        // length of the previous node plus delta equals its new total length.
        assert(cur.prevrawlen == 0 || cur.prevrawlen + delta == prevlen);

        // Update the previous-node information and advance
        rawlen = cur.headersize + cur.len; // Old length of the current node
        prevlen = rawlen + delta; // Old length + extra bytes = new total length
        prevlensize = zipStorePrevEntryLength(NULL, prevlen); // Field size needed for the new length
        prevoffset = p - zl; // Offset of the node to be updated
        p += rawlen; // Move to the next node
        extra += delta; // Accumulate the extra bytes
        cnt++; // Number of nodes that have to grow
    }

    // extra == 0: everything is already updated, or nothing needs updating
    if (extra == 0) return zl;

    // Update the tail offset after the loop
    if (tail == zl + prevoffset) {
        // The last node to update is the tail node: update the tail offset,
        // unless it is the only node to update (then the tail offset does not change)
        if (extra - delta != 0) {
            // Other nodes in front of the tail grow too. Leaving out the bytes added
            // to the tail itself gives the bytes added in front of it.
            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+extra-delta);
        }
    } else {
        // The tail is not among the updated nodes: shift it by all the extra bytes
        ZIPLIST_TAIL_OFFSET(zl) =
            intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+extra);
    }

    // p now points at the first byte of the original list that does not change.
    // Move the data after it to its new position.
    offset = p - zl; // Everything before this offset belongs to nodes that change
    zl = ziplistResize(zl, curlen + extra); // New size = old total length + extra bytes
    p = zl + offset;
    // curlen - offset - 1 bytes are copied (the end marker is excluded)
    memmove(p + extra, p, curlen - offset - 1);
    p += extra;

    // Iterate over the nodes that need updating, from tail to head
    while (cnt) {
        zipEntry(zl + prevoffset, &cur); // The last node found by the previous loop comes first
        rawlen = cur.headersize + cur.len; // Old length of the current node
        // Move the node to its new position, leaving out its old prevlen field
        memmove(p - (rawlen - cur.prevrawlensize),
                zl + prevoffset + cur.prevrawlensize,
                rawlen - cur.prevrawlensize);
        p -= (rawlen + delta); // Step back by the new length of the node
        if (cur.prevrawlen == 0) {
            // cur used to be the head node: its prevlen becomes the length of the first node
            zipStorePrevEntryLength(p, firstentrylen);
        } else {
            // Otherwise the previous node has grown by 4 bytes
            zipStorePrevEntryLength(p, cur.prevrawlen+delta);
        }
        // Move to the previous node
        prevoffset -= cur.prevrawlen;
        cnt--;
    }
    return zl;
}
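To see how far a cascade can run, the first loop of the function can be modeled on plain arrays of node sizes instead of real ziplist memory. This is a deliberately simplified sketch: it only tracks each node's total length and the size of its prevlen field, and the name cascade_sketch and its parameters are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* lens[i] is the total size of node i, pls[i] the size of its prevlen
 * field (1 or 5). prevlen is the total size of the node in front of
 * node 0. Nodes whose prevlen field is too small grow by 4 bytes, which
 * may force the next node to grow too. Returns the extra bytes needed. */
static size_t cascade_sketch(size_t *lens, int *pls, size_t n, size_t prevlen) {
    size_t extra = 0;
    assert(lens != NULL && pls != NULL);
    for (size_t i = 0; i < n; i++) {
        int need = (prevlen < 254) ? 1 : 5;
        if (pls[i] >= need) break; /* field is big enough: the cascade stops */
        pls[i] = 5;                /* grow the prevlen field from 1 to 5 bytes */
        lens[i] += 4;
        extra += 4;
        prevlen = lens[i];         /* the next node must record the new size */
    }
    return extra;
}
```

With three 253-byte nodes followed by a 100-byte node, inserting a 300-byte node in front pushes each 253-byte node to 257 bytes, and even the 100-byte node has to grow, for 16 extra bytes in total. With a 100-byte node first, the cascade stops after a single step because 104 still fits in a 1-byte prevlen field.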
__ziplistDelete
Delete num nodes from the compressed list, starting at p. Returns a pointer to the compressed list.
unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsigned int num) {
    // i: loop variable
    // totlen: total number of bytes to delete
    // deleted: number of nodes actually deleted, deleted <= num
    unsigned int i, totlen, deleted = 0;
    size_t offset;
    int nextdiff = 0;
    zlentry first, tail; // first: the first node to delete
    size_t zlbytes = intrev32ifbe(ZIPLIST_BYTES(zl));

    zipEntry(p, &first); // Parse the first node
    for (i = 0; p[0] != ZIP_END && i < num; i++) {
        // ZIP_END may be reached early, which is why deleted can be less than num
        p += zipRawEntryLengthSafe(zl, zlbytes, p); // Advance by the total size of the node
        deleted++;
    }

    // p now points just past the last node to delete
    assert(p >= first.p);
    totlen = p-first.p; // Total bytes taken by the deleted nodes
    if (totlen > 0) {
        uint32_t set_tail;
        if (p[0] != ZIP_END) { // The end marker has not been reached
            // Storing first.prevrawlen in the node at p may need more or fewer bytes
            // than its current prevlen field. There is always room for it, because
            // it was stored before by a node that is now being deleted.
            nextdiff = zipPrevLenByteDiff(p,first.prevrawlen); // 0, 4 or -4

            // When p jumps backwards there is always room: if the new previous node
            // is large, one of the deleted nodes had a 5-byte prevlen field, so at
            // least 5 bytes are freed and only 4 are needed.
            p -= nextdiff; // Adjust for the new prevlen field size
            assert(p >= first.p && p<zl+zlbytes-1);
            zipStorePrevEntryLength(p,first.prevrawlen); // Store prevlen

            // Move the tail offset back by totlen bytes
            set_tail = intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))-totlen;

            // When the tail contains more than one node, nextdiff has to be taken into
            // account too. Otherwise a change in the prevlen size has no effect on the
            // tail offset.
            assert(zipEntrySafe(zl, zlbytes, p, &tail, 1)); // Parse the node at p
            if (p[tail.headersize+tail.len] != ZIP_END) {
                set_tail = set_tail + nextdiff;
            }

            // Move the tail to the front of the list. Because the assertion
            // p >= first.p holds, totlen >= 0, so p > first.p and this is safe and
            // cannot overflow, even if the node lengths are corrupted.
            size_t bytes_to_move = zlbytes-(p-zl)-1; // -1 excludes the end marker
            memmove(first.p,p,bytes_to_move);
        } else {
            // The whole tail was deleted, no memory needs to be moved
            set_tail = (first.p-zl)-first.prevrawlen; // Offset of the new tail node
        }

        // Resize the compressed list
        offset = first.p-zl;
        zlbytes -= totlen - nextdiff; // Subtract the deleted bytes, corrected by the prevlen size change
        zl = ziplistResize(zl, zlbytes);
        p = zl+offset;

        ZIPLIST_INCR_LENGTH(zl,-deleted); // Update the node count

        // Set the tail offset computed above
        assert(set_tail <= zlbytes - ZIPLIST_END_SIZE);
        ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(set_tail);

        // When nextdiff != 0, the raw length of the next node has changed,
        // so a cascade update through the list is needed
        if (nextdiff != 0)
            zl = __ziplistCascadeUpdate(zl,p);
    }
    return zl;
}
__ziplistInsert
Inserts a new node with content s (slen bytes) at position p in the compressed list zl.
unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {
    // curlen: total number of bytes in the current compressed list
    size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), reqlen, newlen;
    // prevlensize: size of the prevlen field; prevlen: length of the previous node
    unsigned int prevlensize, prevlen = 0;
    size_t offset;
    int nextdiff = 0;
    unsigned char encoding = 0;
    long long value = 123456789; // Initialized to avoid a compiler warning
    zlentry tail;

    // Find prevlen and prevlensize for the node being inserted
    if (p[0] != ZIP_END) { // Not inserting at the end
        ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
    } else { // Inserting at the end
        unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl); // Tail node
        if (ptail[0] != ZIP_END) {
            prevlen = zipRawEntryLengthSafe(zl, curlen, ptail);
        }
    }

    // Check whether the content can be encoded as an integer
    if (zipTryEncoding(s,slen,&value,&encoding)) {
        // encoding is set to the appropriate integer encoding
        reqlen = zipIntSize(encoding);
    } else {
        // encoding is untouched; zipStoreEntryEncoding will use the
        // string length to work out how to encode it
        reqlen = slen;
    }
    // Space is needed for the previous node's length and for the encoding header
    reqlen += zipStorePrevEntryLength(NULL,prevlen);     // Add the size of the prevlen field
    reqlen += zipStoreEntryEncoding(NULL,encoding,slen); // Add the size of the encoding field

    // When not inserting at the end, make sure the prevlen field of the
    // next node can hold the length of the new node
    int forcelarge = 0; // Force the 5-byte prevlen form
    // nextdiff = bytes needed to encode reqlen minus the current prevlensize of p
    // (each is 1 or 5); nothing to adjust when inserting at the end
    nextdiff = (p[0] != ZIP_END) ? zipPrevLenByteDiff(p,reqlen) : 0;
    if (nextdiff == -4 && reqlen < 4) {
        // Keep the next node's 5-byte prevlen instead of shrinking it
        nextdiff = 0;
        forcelarge = 1;
    }

    // Store the offset of p because realloc may change the address of zl
    offset = p-zl;
    newlen = curlen+reqlen+nextdiff; // Original length + new node length + prevlen correction
    zl = ziplistResize(zl,newlen);
    p = zl+offset; // Recompute p

    // Move memory when necessary and update the tail offset
    if (p[0] != ZIP_END) {
        // Move curlen-offset-1+nextdiff bytes from p-nextdiff to p+reqlen
        // (-1 excludes the ZIP_END byte); this frees reqlen bytes at p
        memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff);

        // Encode the raw length of the new node in the next node (at p+reqlen)
        if (forcelarge)
            zipStorePrevEntryLengthLarge(p+reqlen,reqlen);
        else
            zipStorePrevEntryLength(p+reqlen,reqlen);

        // Update the tail offset
        ZIPLIST_TAIL_OFFSET(zl) =
            intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+reqlen);

        // When more than one node follows the new node, nextdiff must be added too.
        // Otherwise, a change in the size of prevlen has no effect on the tail offset
        assert(zipEntrySafe(zl, newlen, p+reqlen, &tail, 1));
        if (p[reqlen+tail.headersize+tail.len] != ZIP_END) {
            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
        }
    } else {
        // The new node becomes the tail
        ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(p-zl);
    }

    // When nextdiff != 0, the raw length of the next node has changed,
    // so the update must be cascaded through the compressed list
    if (nextdiff != 0) {
        offset = p-zl;
        zl = __ziplistCascadeUpdate(zl,p+reqlen);
        p = zl+offset;
    }

    // Write the node
    p += zipStorePrevEntryLength(p,prevlen);     // Write prevlen
    p += zipStoreEntryEncoding(p,encoding,slen); // Write encoding
    // Write the node data
    if (ZIP_IS_STR(encoding)) {
        memcpy(p,s,slen); // Copy the string
    } else {
        zipSaveInteger(p,value,encoding);
    }
    ZIPLIST_INCR_LENGTH(zl,1); // Update the node count
    return zl;
}
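The prevlen bookkeeping above can be illustrated with a minimal sketch. It assumes only the rule described earlier (prevlen takes 1 byte when the previous node is shorter than 254 bytes, otherwise 5 bytes). The names prevlen_field_size and prevlen_byte_diff are hypothetical and only mirror the idea behind zipPrevLenByteDiff; they are not the Redis helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes needed to store the length of a previous node. */
static unsigned int prevlen_field_size(size_t len) {
    return len < 254 ? 1 : 5;
}

/* Hypothetical helper: how much the next node's prevlen field has to
 * grow (+4), shrink (-4) or stay unchanged (0) when the node in front
 * of it becomes new_len bytes long. */
static int prevlen_byte_diff(unsigned int current_field_size, size_t new_len) {
    return (int)prevlen_field_size(new_len) - (int)current_field_size;
}
```

A result of +4 or -4 is what makes newlen differ from curlen+reqlen, and a non-zero result is also what triggers the cascade update.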
uintCompare
Compares two unsigned integers. It is used as the comparison function for qsort.
int uintCompare(const void *a, const void *b) {
    return (*(unsigned int *) a - *(unsigned int *) b);
}
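As an illustration of how such a comparator is used, here is a minimal sketch that sorts an array of structs by their first member with qsort. The type pick_sketch and both function names are hypothetical. The comparator is also written with comparisons rather than subtraction, which is an assumption-free variant: the subtraction form is fine for the small indexes used in this file, but its sign would be wrong if two values differed by more than INT_MAX.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the rand_pick struct used later.
 * index must be the first member, because the comparator reads an
 * unsigned int through the pointer to the struct. */
typedef struct {
    unsigned int index;
    unsigned int order;
} pick_sketch;

/* Comparison-based variant: returns -1, 0 or 1 without subtracting. */
static int uint_compare_sketch(const void *a, const void *b) {
    unsigned int x = *(const unsigned int *)a;
    unsigned int y = *(const unsigned int *)b;
    return (x > y) - (x < y);
}

/* Sort picks by index in ascending order. */
static void sort_picks_sketch(pick_sketch *picks, size_t n) {
    qsort(picks, n, sizeof(pick_sketch), uint_compare_sketch);
}
```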
ziplistSaveValue
Helper that stores a value, either a string (val, len) or an integer (lval), into the target structure dest.
// Helper method to store a string from val, or an integer from lval, into dest
static inline void ziplistSaveValue(unsigned char *val, unsigned int len, long long lval, ziplistEntry *dest) {
    dest->sval = val;
    dest->slen = len;
    dest->lval = lval;
}
API
ziplistNew
Creates a new, empty compressed list and returns a pointer to it.
unsigned char *ziplistNew(void) {
    unsigned int bytes = ZIPLIST_HEADER_SIZE+ZIPLIST_END_SIZE; // Header + end marker: size of an empty list
    unsigned char *zl = zmalloc(bytes); // Allocate space
    ZIPLIST_BYTES(zl) = intrev32ifbe(bytes); // Set the total length of the compressed list
    ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(ZIPLIST_HEADER_SIZE); // Set the offset of the tail node
    ZIPLIST_LENGTH(zl) = 0; // Set the node count to 0
    zl[bytes-1] = ZIP_END; // Set the end marker
    return zl; // Return the zl pointer
}
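To make the resulting layout concrete, the following sketch builds the same 11 bytes by hand. It assumes the field sizes listed at the start of this article (4 + 4 + 2 bytes of header and 1 end byte) and that the fields are stored in little-endian order, which is what the intrev32ifbe conversions imply. It uses plain malloc instead of zmalloc, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Write a 32-bit value in little-endian byte order. */
static void put_u32le(unsigned char *p, uint32_t v) {
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Read a 32-bit little-endian value. */
static uint32_t get_u32le(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Build the bytes of an empty list: zlbytes=11, zltail=10, zllen=0, 0xFF.
 * The caller owns the returned buffer and must free() it. */
static unsigned char *empty_ziplist_sketch(void) {
    const uint32_t header_size = 10, end_size = 1;
    unsigned char *zl = malloc(header_size + end_size);
    if (zl == NULL) return NULL;
    put_u32le(zl, header_size + end_size); /* zlbytes */
    put_u32le(zl + 4, header_size);        /* zltail points right after the header */
    zl[8] = 0;                             /* zllen, low byte */
    zl[9] = 0;                             /* zllen, high byte */
    zl[10] = 0xFF;                         /* zlend */
    return zl;
}
```

In an empty list the tail offset points at the end marker itself, which is why several functions check p[0] != ZIP_END right after fetching the tail.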
ziplistMerge
Merges two compressed lists, first and second, by appending second to first. The list with more nodes is kept as the target and reallocated to hold the merged result; the other list is freed and its pointer is set to NULL.
unsigned char *ziplistMerge(unsigned char **first, unsigned char **second) {
    // If either parameter is NULL, abort and return NULL
    if (first == NULL || *first == NULL || second == NULL || *second == NULL)
        return NULL;

    // A list cannot be merged with itself
    if (*first == *second)
        return NULL;

    size_t first_bytes = intrev32ifbe(ZIPLIST_BYTES(*first));   // Total bytes of first
    size_t first_len = intrev16ifbe(ZIPLIST_LENGTH(*first));    // Node count of first

    size_t second_bytes = intrev32ifbe(ZIPLIST_BYTES(*second)); // Total bytes of second
    size_t second_len = intrev16ifbe(ZIPLIST_LENGTH(*second));  // Node count of second

    int append;
    unsigned char *source, *target; // Source and target
    size_t target_bytes, source_bytes;
    // Pick the list with more nodes as the target so that resizing is cheaper.
    // We must also remember whether to append or prepend to the target
    if (first_len >= second_len) {
        // Keep first and append second to it
        target = *first;
        target_bytes = first_bytes;
        source = *second;
        source_bytes = second_bytes;
        append = 1;
    } else {
        // Keep second and prepend first to it
        target = *second;
        target_bytes = second_bytes;
        source = *first;
        source_bytes = first_bytes;
        append = 0;
    }

    // Total bytes required (minus one pair of metadata: HEADER + END)
    size_t zlbytes = first_bytes + second_bytes -
                     ZIPLIST_HEADER_SIZE - ZIPLIST_END_SIZE;
    size_t zllength = first_len + second_len; // Total node count
    // zllen is a uint16_t, so the combined count is capped at UINT16_MAX
    zllength = zllength < UINT16_MAX ? zllength : UINT16_MAX;

    // Save the tail offsets before the memory is rearranged
    size_t first_offset = intrev32ifbe(ZIPLIST_TAIL_OFFSET(*first));
    size_t second_offset = intrev32ifbe(ZIPLIST_TAIL_OFFSET(*second));

    // Grow the target to the new size, then append or prepend the source
    target = zrealloc(target, zlbytes);
    if (append) {
        // Append: copy the source after the target, overwriting the target's END
        // and skipping the source's HEADER
        memcpy(target + target_bytes - ZIPLIST_END_SIZE,
               source + ZIPLIST_HEADER_SIZE,
               source_bytes - ZIPLIST_HEADER_SIZE);
    } else {
        // Prepend: move the target's nodes back to make room (source size minus END),
        // then copy the source (without its END) into the freed space
        memmove(target + source_bytes - ZIPLIST_END_SIZE,
                target + ZIPLIST_HEADER_SIZE,
                target_bytes - ZIPLIST_HEADER_SIZE); // Move the target content back as a whole
        memcpy(target, source, source_bytes - ZIPLIST_END_SIZE); // Copy the source to the front
    }

    // Update the header metadata: zlbytes, zllen, zltail
    ZIPLIST_BYTES(target) = intrev32ifbe(zlbytes);
    ZIPLIST_LENGTH(target) = intrev16ifbe(zllength);
    ZIPLIST_TAIL_OFFSET(target) = intrev32ifbe(
                                   (first_bytes - ZIPLIST_END_SIZE) +
                                   (second_offset - ZIPLIST_HEADER_SIZE));

    // Cascade update, mainly for prevlen, starting at target+first_offset
    // (the old tail of first, whose successor is now the head of second)
    target = __ziplistCascadeUpdate(target, target+first_offset);

    // Free the source list and set its pointer to NULL
    if (append) {
        zfree(*second);
        *second = NULL;
        *first = target;
    } else {
        zfree(*first);
        *first = NULL;
        *second = target;
    }
    return target;
}
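The header arithmetic of the merge can be checked in isolation. The sketch below assumes a 10-byte header and a 1-byte end marker, as described at the start of this article; the struct and function names are hypothetical and only reproduce the three formulas used for zlbytes, zllen and zltail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the three header fields of the merged list. */
typedef struct {
    size_t bytes;       /* zlbytes of the merged list */
    size_t tail_offset; /* zltail of the merged list */
    size_t len;         /* zllen of the merged list, capped at UINT16_MAX */
} merged_header_sketch;

static merged_header_sketch merge_header_sketch(size_t first_bytes, size_t first_len,
                                                size_t second_bytes, size_t second_len,
                                                size_t second_tail_offset) {
    const size_t header_size = 10, end_size = 1;
    merged_header_sketch h;
    /* Two lists share one header and one end marker after the merge. */
    h.bytes = first_bytes + second_bytes - header_size - end_size;
    /* The count saturates because zllen is only 16 bits wide. */
    h.len = first_len + second_len;
    if (h.len > UINT16_MAX) h.len = UINT16_MAX;
    /* Nodes of second start where the end marker of first used to be. */
    h.tail_offset = (first_bytes - end_size) + (second_tail_offset - header_size);
    return h;
}
```

For example, merging a 21-byte list with a 31-byte list whose tail sits at offset 20 gives a 41-byte list whose tail sits at offset 30.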
ziplistPush
Inserts the new node s into the compressed list zl. The where parameter selects the head or the end of the list; the actual work is delegated to __ziplistInsert.
unsigned char *ziplistPush(unsigned char *zl, unsigned char *s, unsigned int slen, int where) {
    unsigned char *p;
    p = (where == ZIPLIST_HEAD) ? ZIPLIST_ENTRY_HEAD(zl) : ZIPLIST_ENTRY_END(zl);
    return __ziplistInsert(zl,p,s,slen);
}
ziplistIndex
Returns the node at the given index of the compressed list. A negative index counts from the tail (-1 is the last node). Returns NULL when the index is out of range.
unsigned char *ziplistIndex(unsigned char *zl, int index) {
    unsigned char *p;
    unsigned int prevlensize, prevlen = 0;
    size_t zlbytes = intrev32ifbe(ZIPLIST_BYTES(zl));
    if (index < 0) { // Negative index: walk backward from the tail
        index = (-index)-1;
        p = ZIPLIST_ENTRY_TAIL(zl); // Tail node
        if (p[0] != ZIP_END) { // The list is not empty
            ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
            while (prevlen > 0 && index--) { // prevlen > 0 means the head node has not been reached
                p -= prevlen; // Step back to the previous node
                // Assert that p stays within the node area
                assert(p >= zl + ZIPLIST_HEADER_SIZE && p < zl + zlbytes - ZIPLIST_END_SIZE);
                ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
            }
        }
    } else { // Non-negative index: walk forward from the head
        p = ZIPLIST_ENTRY_HEAD(zl); // Head node
        while (index--) {
            p += zipRawEntryLengthSafe(zl, zlbytes, p);
            if (p[0] == ZIP_END) // Reached the end marker
                break;
        }
    }
    if (p[0] == ZIP_END || index > 0) // End marker reached, or index beyond the node count
        return NULL;
    zipAssertValidEntry(zl, zlbytes, p); // Verify that the node is valid
    return p; // Return the node
}
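The backward walk for negative indexes can be sketched on a simplified model. Here the list is represented only by an array of node lengths, and the prevlen of node i is taken to be the length of node i-1 (0 for the head), which is the property the real encoding stores inside each node. The function name and the array model are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the byte offset (relative to the first node) of the node at a
 * negative index, or -1 if the index is out of range. entry_len[i] is the
 * total length of node i in this simplified model. */
static long offset_from_tail_sketch(const unsigned int *entry_len, size_t n, int index) {
    if (n == 0 || index >= 0) return -1;
    long long steps = -(long long)index - 1; /* -1 means zero steps back */
    size_t i = n - 1;
    long pos = 0;
    /* Offset of the tail node: sum of all lengths before it. */
    for (size_t k = 0; k < i; k++) pos += (long)entry_len[k];
    /* Step back while a previous node exists, like "prevlen > 0 && index--". */
    while (steps > 0 && i > 0) {
        pos -= (long)entry_len[i - 1];
        i--;
        steps--;
    }
    /* Steps left over means the index pointed before the head. */
    return steps > 0 ? -1 : pos;
}
```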
ziplistNext
Returns the node that follows node p in the compressed list, or NULL if p is the last node.
unsigned char *ziplistNext(unsigned char *zl, unsigned char *p) {
    ((void) zl);
    size_t zlbytes = intrev32ifbe(ZIPLIST_BYTES(zl));

    // If p or the position after p is the end marker, there is no next node
    if (p[0] == ZIP_END)
        return NULL;

    p += zipRawEntryLength(p);
    if (p[0] == ZIP_END)
        return NULL;

    zipAssertValidEntry(zl, zlbytes, p);
    return p;
}
ziplistPrev
Returns the node that precedes node p in the compressed list, or NULL if p is the head node.
unsigned char *ziplistPrev(unsigned char *zl, unsigned char *p) {
    unsigned int prevlensize, prevlen = 0;

    if (p[0] == ZIP_END) {
        // p is the end marker: return the tail node, or NULL if the list is empty
        p = ZIPLIST_ENTRY_TAIL(zl);
        return (p[0] == ZIP_END) ? NULL : p;
    } else if (p == ZIPLIST_ENTRY_HEAD(zl)) {
        // The head node has no previous node
        return NULL;
    } else {
        ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
        assert(prevlen > 0); // Not the head node
        p-=prevlen; // Step back to the previous node
        size_t zlbytes = intrev32ifbe(ZIPLIST_BYTES(zl));
        zipAssertValidEntry(zl, zlbytes, p);
        return p;
    }
}
ziplistGet
Reads the node at pointer p and stores its content in *sstr (with its length in *slen) or in *sval, depending on the node encoding. Returns 0 if p is NULL or points at the end marker, and 1 otherwise.
// *sstr: string, *slen: string length
// *sval: integer
unsigned int ziplistGet(unsigned char *p, unsigned char **sstr, unsigned int *slen, long long *sval) {
    zlentry entry;
    if (p == NULL || p[0] == ZIP_END) return 0;
    if (sstr) *sstr = NULL; // Reset

    zipEntry(p, &entry);
    if (ZIP_IS_STR(entry.encoding)) { // The node holds a string
        if (sstr) {
            *slen = entry.len;
            *sstr = p+entry.headersize;
        }
    } else { // The node holds an integer
        if (sval) {
            *sval = zipLoadInteger(p+entry.headersize,entry.encoding);
        }
    }
    return 1;
}
ziplistInsert
Inserts a new node at position p, so the node that was at p ends up after the new one. It simply calls __ziplistInsert.
ziplistDelete
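Deletes the single node pointed to by *p. *p is updated in place, so the caller can keep iterating over the list while deleting nodes.
unsigned char *ziplistDelete(unsigned char *zl, unsigned char **p) {
    size_t offset = *p-zl;
    zl = __ziplistDelete(zl,*p,1);

    // Store the pointer to the current position in p, because ziplistDelete
    // may reallocate the list and invalidate the old pointer
    *p = zl+offset;
    return zl;
}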
ziplistDeleteRange
Deletes num consecutive nodes starting at the given index. The list is returned unchanged if the index is out of range.
unsigned char *ziplistDeleteRange(unsigned char *zl, int index, unsigned int num) {
    unsigned char *p = ziplistIndex(zl,index);
    return (p == NULL) ? zl : __ziplistDelete(zl,p,num);
}
ziplistReplace
Replaces the node at p with s. When the new node needs exactly the same space, it is overwritten in place; otherwise this is equivalent to a delete followed by an insert.
unsigned char *ziplistReplace(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {
    // Get the metadata of the current node
    zlentry entry;
    zipEntry(p, &entry);

    // Compute the length needed to store the new node, excluding prevlen
    unsigned int reqlen;
    unsigned char encoding = 0;
    long long value = 123456789; // Initialized to avoid a compiler warning
    if (zipTryEncoding(s,slen,&value,&encoding)) { // Try to encode s as an integer
        reqlen = zipIntSize(encoding); // Bytes required by the integer encoding
    } else { // String
        reqlen = slen; // encoding == 0
    }
    reqlen += zipStoreEntryEncoding(NULL,encoding,slen); // Add the size of the encoding field

    if (reqlen == entry.lensize + entry.len) { // Same size: overwrite in place
        p += entry.prevrawlensize; // Skip prevlen, which does not change
        p += zipStoreEntryEncoding(p,encoding,slen); // Write encoding
        // Copy the value
        if (ZIP_IS_STR(encoding)) {
            memcpy(p,s,slen);
        } else {
            zipSaveInteger(p,value,encoding);
        }
    } else { // Different size: delete, then insert
        zl = ziplistDelete(zl,&p);
        zl = ziplistInsert(zl,p,s,slen);
    }
    return zl;
}
ziplistFind
Searches from node p for a node whose content equals the given value (vstr, vlen) and returns a pointer to it, or NULL if there is none. After each comparison, skip nodes are skipped before the next comparison.
unsigned char *ziplistFind(unsigned char *zl, unsigned char *p, unsigned char *vstr, unsigned int vlen, unsigned int skip) {
    int skipcnt = 0;
    unsigned char vencoding = 0;
    long long vll = 0;
    size_t zlbytes = ziplistBlobLen(zl);

    while (p[0] != ZIP_END) { // Not the end marker
        struct zlentry e;
        unsigned char *q;

        assert(zipEntrySafe(zl, zlbytes, p, &e, 1));
        q = p + e.prevrawlensize + e.lensize; // Start of the node data

        if (skipcnt == 0) {
            // Compare the current node with the specified value
            if (ZIP_IS_STR(e.encoding)) { // String
                if (e.len == vlen && memcmp(q, vstr, vlen) == 0) {
                    // Found: return node p
                    return p;
                }
            } else { // Integer
                // Try to encode vstr only once, on the first integer node encountered
                if (vencoding == 0) {
                    if (!zipTryEncoding(vstr, vlen, &vll, &vencoding)) {
                        // vstr cannot be encoded as an integer: set vencoding to
                        // UCHAR_MAX so that we do not try again next time
                        vencoding = UCHAR_MAX;
                    }
                    // vencoding must be non-zero by now
                    assert(vencoding);
                }

                // Compare only when vencoding is not UCHAR_MAX
                // (no such encoding exists, so vstr is not a valid integer)
                if (vencoding != UCHAR_MAX) {
                    long long ll = zipLoadInteger(q, e.encoding);
                    if (ll == vll) {
                        return p;
                    }
                }
            }

            // Reset the skip counter
            skipcnt = skip;
        } else {
            // Skip this node
            skipcnt--;
        }

        p = q + e.len; // Move to the next node
    }
    return NULL;
}
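The effect of the skip parameter is easiest to see on a flat array. The sketch below is a simplified model, not the Redis function: nodes are plain integers, and find_with_skip_sketch is a hypothetical name. With skip set to 1 on a key, value, key, value layout, only the keys are compared.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the position of the first compared element equal to needle,
 * or -1. After each comparison, the next `skip` elements are not compared. */
static long find_with_skip_sketch(const long long *entries, size_t n,
                                  long long needle, unsigned int skip) {
    unsigned int skipcnt = 0;
    for (size_t i = 0; i < n; i++) {
        if (skipcnt == 0) {
            if (entries[i] == needle) return (long)i;
            skipcnt = skip; /* reset the skip counter */
        } else {
            skipcnt--;      /* skip this element */
        }
    }
    return -1;
}
```

With the layout {7, 1, 2, 7, 1, 9} (keys 7, 2, 1 and values 1, 7, 9), searching for 1 with skip 1 finds the key at position 4, whereas skip 0 would stop at the value at position 1.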
ziplistLen
Returns the number of nodes in the compressed list. If the stored count has reached UINT16_MAX, the whole list is traversed to count them.
unsigned int ziplistLen(unsigned char *zl) {
    unsigned int len = 0;
    if (intrev16ifbe(ZIPLIST_LENGTH(zl)) < UINT16_MAX) { // The stored count is below UINT16_MAX, so it is exact
        len = intrev16ifbe(ZIPLIST_LENGTH(zl));
    } else { // Count by traversing the list
        unsigned char *p = zl+ZIPLIST_HEADER_SIZE;
        size_t zlbytes = intrev32ifbe(ZIPLIST_BYTES(zl));
        while (*p != ZIP_END) {
            p += zipRawEntryLengthSafe(zl, zlbytes, p);
            len++;
        }

        // If len is below UINT16_MAX, store it back into zllen
        if (len < UINT16_MAX) ZIPLIST_LENGTH(zl) = intrev16ifbe(len);
    }
    return len;
}
ziplistBlobLen
Returns the total number of bytes in the compressed list.
size_t ziplistBlobLen(unsigned char *zl) {
    return intrev32ifbe(ZIPLIST_BYTES(zl));
}
ziplistRepr
Prints a human-readable representation of the compressed list to standard output, which is useful for debugging.
void ziplistRepr(unsigned char *zl) {
    unsigned char *p;
    int index = 0;
    zlentry entry;
    size_t zlbytes = ziplistBlobLen(zl);

    // Print the header fields
    printf(
        "{total bytes %u} "
        "{num entries %u}\n"
        "{tail offset %u}\n",
        intrev32ifbe(ZIPLIST_BYTES(zl)),
        intrev16ifbe(ZIPLIST_LENGTH(zl)),
        intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl)));
    p = ZIPLIST_ENTRY_HEAD(zl);
    while(*p != ZIP_END) {
        assert(zipEntrySafe(zl, zlbytes, p, &entry, 1));
        // Print the metadata of the node
        printf(
            "{\n"
                "\taddr 0x%08lx,\n"
                "\tindex %2d,\n"
                "\toffset %5lu,\n"
                "\thdr+entry len: %5u,\n"
                "\thdr len%2u,\n"
                "\tprevrawlen: %5u,\n"
                "\tprevrawlensize: %2u,\n"
                "\tpayload %5u\n",
            (long unsigned)p,
            index,
            (unsigned long) (p-zl),
            entry.headersize+entry.len,
            entry.headersize,
            entry.prevrawlen,
            entry.prevrawlensize,
            entry.len);
        // Print the raw bytes of the node
        printf("\tbytes: ");
        for (unsigned int i = 0; i < entry.headersize+entry.len; i++) {
            printf("%02x|",p[i]);
        }
        printf("\n");
        p += entry.headersize;
        if (ZIP_IS_STR(entry.encoding)) {
            // Print at most the first 40 bytes of a string
            printf("\t[str]");
            if (entry.len > 40) {
                if (fwrite(p,40,1,stdout) == 0) perror("fwrite");
                printf("...");
            } else {
                if (entry.len &&
                    fwrite(p,entry.len,1,stdout) == 0) perror("fwrite");
            }
        } else {
            printf("\t[int]%lld", (long long) zipLoadInteger(p,entry.encoding));
        }
        printf("\n}\n");
        p += entry.len;
        index++;
    }
    printf("{end}\n\n");
}
ziplistValidateIntegrity
Validates the integrity of the data structure. When deep is 0, only the header is checked; otherwise every node is scanned and checked as well.
int ziplistValidateIntegrity(unsigned char *zl, size_t size, int deep, ziplistValidateEntryCB entry_cb, void *cb_userdata) {
    // The buffer must be large enough to hold at least HEADER + END
    if (size < ZIPLIST_HEADER_SIZE + ZIPLIST_END_SIZE)
        return 0;

    // The size encoded in the header must match the allocated size
    size_t bytes = intrev32ifbe(ZIPLIST_BYTES(zl));
    if (bytes != size)
        return 0;

    // The last byte must be the end marker
    if (zl[size - ZIPLIST_END_SIZE] != ZIP_END)
        return 0;

    // The tail offset must not point beyond the allocated space
    if (intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl)) > size - ZIPLIST_END_SIZE)
        return 0;

    // A shallow validation ends here
    if (!deep)
        return 1;

    unsigned int count = 0;
    unsigned char *p = ZIPLIST_ENTRY_HEAD(zl); // Start at the head node
    unsigned char *prev = NULL;
    size_t prev_raw_size = 0;
    while(*p != ZIP_END) { // Until the end marker
        struct zlentry e;
        // Decode the node header and check that the node fits in the buffer
        if (!zipEntrySafe(zl, size, p, &e, 1))
            return 0;

        // The recorded size of the previous node must be accurate
        if (e.prevrawlen != prev_raw_size)
            return 0;

        // Optionally let the caller validate the node through the callback
        if (entry_cb && !entry_cb(p, cb_userdata))
            return 0;

        // Move to the next node
        prev_raw_size = e.headersize + e.len;
        prev = p;
        p += e.headersize + e.len;
        count++;
    }

    // zltail must point to the start of the last node
    if (prev != ZIPLIST_ENTRY_TAIL(zl))
        return 0;

    // The node count stored in the header must be accurate
    unsigned int header_count = intrev16ifbe(ZIPLIST_LENGTH(zl));
    if (header_count != UINT16_MAX && count != header_count)
        return 0;

    return 1;
}
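The four shallow checks can be reproduced on a raw byte buffer. The following sketch assumes the layout described at the start of this article (little-endian zlbytes at offset 0, zltail at offset 4, a 10-byte header and a 0xFF end byte). The function names are hypothetical, and the deep scan of the nodes is intentionally left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value. */
static uint32_t read_u32le(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the header of the buffer looks consistent, 0 otherwise. */
static int shallow_header_check(const unsigned char *zl, size_t size) {
    const size_t header_size = 10, end_size = 1;
    if (size < header_size + end_size) return 0;        /* too small */
    if (read_u32le(zl) != size) return 0;               /* zlbytes mismatch */
    if (zl[size - end_size] != 0xFF) return 0;          /* missing end marker */
    if (read_u32le(zl + 4) > size - end_size) return 0; /* tail out of range */
    return 1;
}
```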
ziplistRandomPair
Randomly selects one key-value pair and stores it in the key and val parameters (val may be NULL if the value is not needed).
// total_count is the precomputed number of key-value pairs, that is, half the node count
void ziplistRandomPair(unsigned char *zl, unsigned long total_count, ziplistEntry *key, ziplistEntry *val) {
    int ret;
    unsigned char *p;

    // Avoid a division by zero on a corrupted compressed list
    assert(total_count);

    // Generate an even index, because the compressed list stores key-value pairs
    int r = (rand() % total_count) * 2;
    p = ziplistIndex(zl, r);
    ret = ziplistGet(p, &key->sval, &key->slen, &key->lval);
    assert(ret != 0);

    if (!val)
        return;
    p = ziplistNext(zl, p); // The value follows the key
    ret = ziplistGet(p, &val->sval, &val->slen, &val->lval);
    assert(ret != 0);
}
ziplistRandomPairs
Randomly selects count key-value pairs and stores them in the keys and vals parameters (vals may be NULL). The same pair may be selected more than once.
void ziplistRandomPairs(unsigned char *zl, unsigned int count, ziplistEntry *keys, ziplistEntry *vals) {
    unsigned char *p, *key, *value;
    unsigned int klen = 0, vlen = 0;
    long long klval = 0, vlval = 0;

    // index must be the first member, because uintCompare reads it
    typedef struct {
        unsigned int index;
        unsigned int order;
    } rand_pick;
    rand_pick *picks = zmalloc(sizeof(rand_pick)*count);
    unsigned int total_size = ziplistLen(zl)/2;

    // The list must not be empty
    assert(total_size);

    // Create a pool of random indexes (some may be repeated)
    for (unsigned int i = 0; i < count; i++) {
        picks[i].index = (rand() % total_size) * 2; // Generate an even index
        picks[i].order = i; // Remember the order in which they were picked
    }

    // Sort by index with qsort
    qsort(picks, count, sizeof(rand_pick), uintCompare);

    // Fetch the nodes from the compressed list into the output arrays (in the original pick order)
    unsigned int zipindex = 0, pickindex = 0;
    p = ziplistIndex(zl, 0); // Head node. Why not locate it with ZIPLIST_ENTRY_HEAD?
    while (ziplistGet(p, &key, &klen, &klval) && pickindex < count) {
        p = ziplistNext(zl, p); // Move to the value
        assert(ziplistGet(p, &value, &vlen, &vlval)); // Read the value
        while (pickindex < count && zipindex == picks[pickindex].index) {
            int storeorder = picks[pickindex].order;
            ziplistSaveValue(key, klen, klval, &keys[storeorder]); // Store the key
            if (vals)
                ziplistSaveValue(value, vlen, vlval, &vals[storeorder]); // Store the value
            pickindex++;
        }
        zipindex += 2;
        p = ziplistNext(zl, p); // Move to the next key
    }

    zfree(picks);
}
ziplistRandomPairsUnique
Randomly selects up to count key-value pairs without repetition and returns the number of pairs actually selected.
unsigned int ziplistRandomPairsUnique(unsigned char *zl, unsigned int count, ziplistEntry *keys, ziplistEntry *vals) {
    unsigned char *p, *key;
    unsigned int klen = 0;
    long long klval = 0;
    unsigned int total_size = ziplistLen(zl)/2;
    unsigned int index = 0;
    if (count > total_size)
        count = total_size;

    // To iterate only once, each pair is picked with a probability equal to
    // the number of pairs still wanted divided by the number of pairs not yet
    // visited. This way every pair is equally likely to be picked.
    p = ziplistIndex(zl, 0);
    unsigned int picked = 0, remaining = count;
    while (picked < count && p) {
        double randomDouble = ((double)rand()) / RAND_MAX;
        // Pairs still wanted / pairs not yet visited
        double threshold = ((double)remaining) / (total_size - index);
        if (randomDouble <= threshold) {
            assert(ziplistGet(p, &key, &klen, &klval));
            ziplistSaveValue(key, klen, klval, &keys[picked]); // Store the key
            p = ziplistNext(zl, p); // Move to the value
            assert(p);
            if (vals) {
                assert(ziplistGet(p, &key, &klen, &klval));
                ziplistSaveValue(key, klen, klval, &vals[picked]); // Store the value
            }
            remaining--;
            picked++;
        } else {
            p = ziplistNext(zl, p); // Skip the value
            assert(p);
        }
        p = ziplistNext(zl, p); // Move to the next key
        index++;
    }
    return picked;
}
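The single-pass selection used above can be sketched without any ziplist at all. The function below picks count distinct positions out of total with the same threshold rule; pick_unique_sketch is a hypothetical name, and positions stand in for key-value pairs. Because the threshold reaches 1 as soon as the number of pairs still wanted equals the number not yet visited, the loop always ends with exactly count picks.

```c
#include <assert.h>
#include <stdlib.h>

/* Writes up to `count` distinct positions in [0, total) to out, in
 * ascending order, and returns how many were written. out must have room
 * for min(count, total) elements. */
static unsigned int pick_unique_sketch(unsigned int total, unsigned int count,
                                       unsigned int *out) {
    if (count > total) count = total;
    unsigned int picked = 0, remaining = count;
    for (unsigned int index = 0; index < total && picked < count; index++) {
        double r = ((double)rand()) / RAND_MAX;
        /* Still wanted / not yet visited. */
        double threshold = ((double)remaining) / (total - index);
        if (r <= threshold) {
            out[picked++] = index;
            remaining--;
        }
    }
    return picked;
}
```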
Summary of this chapter
After reading this far, you should have a good idea of what a ZipList is: a sequential data structure made of a series of specially encoded, contiguous memory blocks. It is compact and saves memory, and each memory block has its own meaning. Some of the operations are still not entirely clear to me; I will come back to them later.