In depth interpretation of Redis data type parsing - ZipList

Posted by jpaloyo on Thu, 04 Nov 2021 09:38:22 +0100

Data type analysis of Redis source code - ZipList

Note that this analysis is based on Redis 6.2.

A ZipList (compressed list) can contain any number of nodes.

Infrastructure

ZipList

The compressed list. Its overall layout is <zlbytes> <zltail> <zllen> <entry> <entry> ... <entry> <zlend>.

uint32_t zlbytes: the number of bytes the compressed list occupies, including the four bytes of the zlbytes field itself. Storing this value makes it possible to resize the whole structure without traversing it first.
uint32_t zltail: the offset of the last node in the compressed list. It allows a pop operation at the far end of the list without a full traversal.
uint16_t zllen: the number of nodes. When the value is less than UINT16_MAX (2^16-1), it is the actual node count. When it equals UINT16_MAX, the real count can only be obtained by traversing the whole list.
uint8_t zlend: a special marker byte, 0xFF, at the end of the compressed list.
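The header fields above can be read back from a raw buffer with a few lines of C. The following is a minimal sketch, not Redis code: the type zl_header and the functions read_le32 and zl_read_header are hypothetical names introduced here. It assumes the multi-byte fields are stored little-endian, which is what the intrev32ifbe calls later in this article imply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type, not part of Redis: a decoded ziplist header. */
typedef struct {
    uint32_t zlbytes; /* total bytes, including the header and the end marker */
    uint32_t zltail;  /* offset of the last node from the start of the buffer */
    uint16_t zllen;   /* node count, or UINT16_MAX when it must be recounted */
} zl_header;

/* Read a little-endian 32-bit value byte by byte (works on any host). */
static uint32_t read_le32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the 10-byte header. Returns 1 on success, 0 if the buffer is
 * too short, zlbytes disagrees with buflen, or the end marker is missing. */
static int zl_read_header(const unsigned char *zl, size_t buflen, zl_header *h) {
    assert(zl != NULL && h != NULL);
    if (buflen < 11) return 0;            /* 10-byte header + 1-byte zlend */
    h->zlbytes = read_le32(zl);
    h->zltail  = read_le32(zl + 4);
    h->zllen   = (uint16_t)(zl[8] | (zl[9] << 8));
    if (h->zlbytes != buflen) return 0;   /* size field must match the buffer */
    if (zl[buflen - 1] != 0xFF) return 0; /* last byte must be zlend */
    return 1;
}
```

For an empty list the buffer is 11 bytes long: zlbytes is 11, zltail is 10 (the header size), zllen is 0, and the last byte is 0xFF.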

Compressed list node

Every node in the compressed list starts with metadata made up of two pieces of information, followed by the entry data. The first is the length of the previous node. The second is the encoding of the current node: integer or string. The general structure is <prevlen> <encoding> <entry data>, but sometimes the encoding holds the node data itself, which gives the simpler structure <prevlen> <encoding>.

prevlen: the length of the previous node in bytes. The field itself is either 1 or 5 bytes long. If the previous node is shorter than 254 bytes, the field is 1 byte and stores the length directly. If the previous node is 254 bytes or longer, the field is 5 bytes: the first byte is set to 254 (0xFE) and the remaining four bytes store the length of the previous node.
encoding: depends on the content of the node. When the node is a string, the top two bits of the first byte select the encoding type used to store the string length, and the remaining bits hold the length itself. When the node is an integer, the top two bits are both set to 1, and the next two bits identify the integer type stored after the header. The first byte is enough to determine the node type (integer or string).
- String, prefix 00: the encoding is one byte long. The string length is stored in 6 bits, so the maximum string length is 2^6-1 bytes.
- String, prefix 01: the encoding is two bytes long. The string length is stored in 14 bits (big-endian), so the maximum string length is 2^14-1 bytes.
- String, prefix 10: the encoding is five bytes long. The string length is stored in 32 bits (big-endian), so the maximum string length is 2^32-1 bytes. The lower 6 bits of the first byte are unused and set to 0.
- Integer 11000000: an integer of type int16_t (two bytes).
- Integer 11010000: an integer of type int32_t (four bytes).
- Integer 11100000: an integer of type int64_t (eight bytes).
- Integer 11110000: a 24-bit signed integer (three bytes).
- Integer 11111110: an 8-bit signed integer (one byte).
- Integer 1111xxxx, with xxxx between 0001 and 1101: an unsigned integer from 0 to 12 stored in the encoding itself. The four bits actually hold 1 to 13, because 0000 and 1111 cannot be used, so 1 must be subtracted from the 4-bit value to get the real value.
- 11111111: the special end marker of the compressed list.

ziplistEntry

A standard structure for holding the value of a compressed list node.

// Each node in the compressed list is either a string or an integer
typedef struct {
    // If it is a string, sval points to it and slen is its length
    unsigned char *sval;
    unsigned int slen;
    // If it is an integer, sval is NULL, and lval saves the integer
    long long lval;
} ziplistEntry;

zlentry

A helper structure that receives the information of a compressed list node. It is not how nodes are actually encoded; it is only filled in to make nodes easier to operate on.

typedef struct zlentry {
    unsigned int prevrawlensize; // Number of bytes used to encode the length of the previous node
    unsigned int prevrawlen; // Length of the previous node
    unsigned int lensize; // Number of bytes used to encode the node type and length. For example, a string has a 1, 2, or 5 byte header, and an integer always uses one byte.
    unsigned int len; // Number of bytes of actual node data. For a string, it is the string length. For an integer, it depends on the integer range
    unsigned int headersize; // Header size = prevrawlensize + lensize
    unsigned char encoding; // Node encoding
    unsigned char *p; // Pointer to the start of the node, that is, to its prevlen field.
} zlentry;

Macro constant

ZIP_END

#define ZIP_END 255, the special tail node of the compressed list.

ZIP_BIG_PREVLEN

#define ZIP_BIG_PREVLEN 254. When the prevlen field in front of a node takes only one byte, ZIP_BIG_PREVLEN-1 is the largest length it can represent. Otherwise the field has the form FE AA BB CC DD, where the four bytes after FE are an unsigned integer holding the length of the previous node.

ZIP_STR_MASK

#define ZIP_STR_MASK 0xc0, string mask (1100 0000).

ZIP_INT_MASK

#define ZIP_INT_MASK 0x30, integer mask (0011 0000).

ZIP_STR_06B

#define ZIP_STR_06B (0 << 6), string whose length is stored in 6 bits (0000 0000).

ZIP_STR_14B

#define ZIP_STR_14B (1 << 6), string whose length is stored in 14 bits (0100 0000).

ZIP_STR_32B

#define ZIP_STR_32B (2 << 6), string whose length is stored in 32 bits (1000 0000).

ZIP_INT_16B

#define ZIP_INT_16B (0xc0 | 0<<4), 16-bit signed integer (int16_t) (1100 0000).

ZIP_INT_32B

#define ZIP_INT_32B (0xc0 | 1<<4), 32-bit signed integer (int32_t) (1101 0000).

ZIP_INT_64B

#define ZIP_INT_64B (0xc0 | 2<<4), 64-bit signed integer (int64_t) (1110 0000).

ZIP_INT_24B

#define ZIP_INT_24B (0xc0 | 3<<4), 24-bit signed integer (1111 0000).

ZIP_INT_8B

#define ZIP_INT_8B 0xfe, 8-bit signed integer (1111 1110).

ZIP_INT_IMM_MASK

#define ZIP_INT_IMM_MASK 0x0f, 4-bit unsigned integer mask.

ZIP_INT_IMM_MIN

#define ZIP_INT_IMM_MIN 0xf1, 4-bit unsigned integer, minimum value 0.

ZIP_INT_IMM_MAX

#define ZIP_INT_IMM_MAX 0xfd, 4-bit unsigned integer, max. 12.

INT24_MAX

#define INT24_MAX 0x7fffff, the maximum value of a 24 bit signed integer.

INT24_MIN

#define INT24_MIN (-INT24_MAX - 1), the minimum value of a 24 bit signed integer.

ZIP_ENCODING_SIZE_INVALID

#define ZIP_ENCODING_SIZE_INVALID 0xff, invalid value for encoding size.

Macro function

ZIP_IS_STR

Determine whether the given encoding enc represents a string. String nodes never have 11 as the two most significant bits of the first byte.

#define ZIP_IS_STR(enc) (((enc) & ZIP_STR_MASK) < ZIP_STR_MASK)

ZIPLIST_BYTES

Accesses the zlbytes field, the total number of bytes contained in the compressed list. The macro expands to an lvalue, so it can be both read and assigned.

#define ZIPLIST_BYTES(zl)       (*((uint32_t*)(zl)))
// Ziplist structure
// zlbytes zltail zllen entry...entry zlend

ZIPLIST_TAIL_OFFSET

Accesses the zltail field, the offset of the last node in the compressed list.

#define ZIPLIST_TAIL_OFFSET(zl) (*((uint32_t*)((zl)+sizeof(uint32_t))))
// Skip the 32-bit zlbytes field at zl to reach zltail

ZIPLIST_LENGTH

Accesses the zllen field, the number of nodes in the compressed list. If it equals UINT16_MAX, the whole list must be traversed to count the nodes.

#define ZIPLIST_LENGTH(zl)      (*((uint16_t*)((zl)+sizeof(uint32_t)*2)))

ZIPLIST_HEADER_SIZE

Compressed list header size: two 32-bit integers store the total number of bytes and the offset of the last node, and one 16-bit integer stores the number of nodes.

#define ZIPLIST_HEADER_SIZE     (sizeof(uint32_t)*2+sizeof(uint16_t))

ZIPLIST_END_SIZE

Compressed list end node size. Only one byte.

#define ZIPLIST_END_SIZE        (sizeof(uint8_t))

ZIPLIST_ENTRY_HEAD

Returns the pointer to the first node in the compressed list. That is, ziplist+headerSize.

#define ZIPLIST_ENTRY_HEAD(zl)  ((zl)+ZIPLIST_HEADER_SIZE)

ZIPLIST_ENTRY_TAIL

Returns the pointer to the last node in the compressed list.

#define ZIPLIST_ENTRY_TAIL(zl)  ((zl)+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl)))

ZIPLIST_ENTRY_END

Returns the pointer to the last byte of the compressed list, that is, the end node FF.

#define ZIPLIST_ENTRY_END(zl)   ((zl)+intrev32ifbe(ZIPLIST_BYTES(zl))-1)

ZIPLIST_INCR_LENGTH

Increase the length zllen in the compressed list header.

#define ZIPLIST_INCR_LENGTH(zl,incr) { \
    if (ZIPLIST_LENGTH(zl) < UINT16_MAX) \
        ZIPLIST_LENGTH(zl) = intrev16ifbe(intrev16ifbe(ZIPLIST_LENGTH(zl))+incr); \
}

ZIPLIST_ENTRY_ZERO

Initialize the compressed list node template structure.

#define ZIPLIST_ENTRY_ZERO(zle) { \
    (zle)->prevrawlensize = (zle)->prevrawlen = 0; \
    (zle)->lensize = (zle)->len = (zle)->headersize = 0; \
    (zle)->encoding = 0; \
    (zle)->p = NULL; \
}

ZIP_ENTRY_ENCODING

Get the encoding method from the ptr pointer byte and set it to the encoding attribute in the zlentry structure.

#define ZIP_ENTRY_ENCODING(ptr, encoding) do {  \
    (encoding) = ((ptr)[0]); \
    if ((encoding) < ZIP_STR_MASK) (encoding) &= ZIP_STR_MASK; \
} while(0)

ZIP_ASSERT_ENCODING

Assert that the encoding is valid.

#define ZIP_ASSERT_ENCODING(encoding) do {                                     \
    assert(zipEncodingLenSize(encoding) != ZIP_ENCODING_SIZE_INVALID);         \
} while (0)

ZIP_DECODE_LENGTH

Decode the node type and data length (the string length, or the number of bytes an integer occupies) encoded at ptr. lensize receives the number of bytes used by the encoding header, and len receives the length of the node data. This is the counterpart of zipStoreEntryEncoding.

#define ZIP_DECODE_LENGTH(ptr, encoding, lensize, len) do {                    \
    if ((encoding) < ZIP_STR_MASK) {                                           \
        /* String encoding: the header is 1, 2, or 5 bytes long */              \
        if ((encoding) == ZIP_STR_06B) {                                       \
            (lensize) = 1;                                                     \
            (len) = (ptr)[0] & 0x3f;                                           \
        } else if ((encoding) == ZIP_STR_14B) {                                \
            (lensize) = 2;                                                     \
            (len) = (((ptr)[0] & 0x3f) << 8) | (ptr)[1];                       \
        } else if ((encoding) == ZIP_STR_32B) {                                \
            (lensize) = 5;                                                     \
            (len) = ((ptr)[1] << 24) |                                         \
                    ((ptr)[2] << 16) |                                         \
                    ((ptr)[3] <<  8) |                                         \
                    ((ptr)[4]);                                                \
        } else {                                                               \
            /* Invalid encoding */                                             \
            (lensize) = 0;                                                     \
            (len) = 0;                                                         \
        }                                                                      \
    } else {                                                                   \
        /* Integer encoding: the header is 1 byte; the data size depends on encoding */ \
        (lensize) = 1;                                                         \
        if ((encoding) == ZIP_INT_8B)  (len) = 1;                              \
        else if ((encoding) == ZIP_INT_16B) (len) = 2;                         \
        else if ((encoding) == ZIP_INT_24B) (len) = 3;                         \
        else if ((encoding) == ZIP_INT_32B) (len) = 4;                         \
        else if ((encoding) == ZIP_INT_64B) (len) = 8;                         \
        else if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX)   \
            (len) = 0; /* 4 bit immediate */                                   \
        else                                                                   \
            (lensize) = (len) = 0; /* Invalid encoding */                      \
    }                                                                          \
} while(0)

ZIP_DECODE_PREVLENSIZE

Returns the number of bytes used to encode the length of the previous node. The result is stored in prevlensize.

#define ZIP_DECODE_PREVLENSIZE(ptr, prevlensize) do {                          \
    if ((ptr)[0] < ZIP_BIG_PREVLEN) {                                          \
        (prevlensize) = 1;                                                     \
    } else {                                                                   \
        (prevlensize) = 5;                                                     \
    }                                                                          \
} while(0)

ZIP_DECODE_PREVLEN

Returns the length prevlen of the previous node and the number of bytes encoding this length prevlensize.

#define ZIP_DECODE_PREVLEN(ptr, prevlensize, prevlen) do {                     \
    ZIP_DECODE_PREVLENSIZE(ptr, prevlensize);                                  \
    if ((prevlensize) == 1) {                                                  \
        (prevlen) = (ptr)[0];                                                  \
    } else { /* prevlensize == 5 */                                            \
        (prevlen) = ((ptr)[4] << 24) |                                         \
                    ((ptr)[3] << 16) |                                         \
                    ((ptr)[2] <<  8) |                                         \
                    ((ptr)[1]);                                                \
    }                                                                          \
} while(0)

Internal function

zipEncodingLenSize

Returns the number of bytes used to encode the node type and length, or ZIP_ENCODING_SIZE_INVALID on error.

static inline unsigned int zipEncodingLenSize(unsigned char encoding) {
    if (encoding == ZIP_INT_16B || encoding == ZIP_INT_32B ||
        encoding == ZIP_INT_24B || encoding == ZIP_INT_64B ||
        encoding == ZIP_INT_8B)
        return 1; // Integer has only one byte
    if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) //Special integer 4-bit
        return 1;
    if (encoding == ZIP_STR_06B)
        return 1;
    if (encoding == ZIP_STR_14B)
        return 2;
    if (encoding == ZIP_STR_32B)
        return 5;
    return ZIP_ENCODING_SIZE_INVALID;
}

zipIntSize

Returns the bytes required to store an integer encoded by encoding.

static inline unsigned int zipIntSize(unsigned char encoding) {
    switch(encoding) {
    case ZIP_INT_8B:  return 1;
    case ZIP_INT_16B: return 2;
    case ZIP_INT_24B: return 3;
    case ZIP_INT_32B: return 4;
    case ZIP_INT_64B: return 8;
    }
    if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX)
        return 0; /* The 4-bit value is stored in the encoding itself */
    redis_unreachable(); // abort
    return 0;
}

zipStoreEntryEncoding

Write the encoding header of the node to p. If p is NULL, just return the number of bytes needed to encode this length.

unsigned int zipStoreEntryEncoding(unsigned char *p, unsigned char encoding, unsigned int rawlen) {
    unsigned char len = 1, buf[5]; // The header is at most 5 bytes, hence buf[5]

    if (ZIP_IS_STR(encoding)) { // String
        // The encoding passed in may not be set precisely for strings,
        // so the header type is determined from rawlen here
        if (rawlen <= 0x3f) { // 1-byte header, prefix 00: up to 63 (2^6-1)
            if (!p) return len;
            buf[0] = ZIP_STR_06B | rawlen; // Type and length
            // 00 00 0000
            // 2 type bits + 6 length bits
        } else if (rawlen <= 0x3fff) { // 2-byte header, prefix 01: up to 16383 (2^14-1)
            len += 1;
            if (!p) return len;
            // ZIP_STR_14B 0100 0000
            // 0x3f        0011 1111
            buf[0] = ZIP_STR_14B | ((rawlen >> 8) & 0x3f); // 2 type bits + high 6 length bits
            buf[1] = rawlen & 0xff; // Low 8 length bits
            // 0100 0000 0000 0000
            // 2 type bits + 14 length bits
        } else { // 5-byte header, prefix 10
            len += 4;
            if (!p) return len;
            buf[0] = ZIP_STR_32B; // The type takes a whole byte on its own
            // The remaining four bytes store the length
            buf[1] = (rawlen >> 24) & 0xff;
            buf[2] = (rawlen >> 16) & 0xff;
            buf[3] = (rawlen >> 8) & 0xff;
            buf[4] = rawlen & 0xff;
        }
    } else { // integer
        if (!p) return len;
        buf[0] = encoding;
    }
    memcpy(p,buf,len); // storage
    return len;
}

zipStorePrevEntryLengthLarge

Encode the length of the previous node and write it to p. This function only produces the large (5-byte) encoding; it is used by __ziplistCascadeUpdate.

int zipStorePrevEntryLengthLarge(unsigned char *p, unsigned int len) {
    uint32_t u32;
    if (p != NULL) { // Only write when a destination is given
        p[0] = ZIP_BIG_PREVLEN; // 254
        u32 = len;
        memcpy(p+1,&u32,sizeof(u32));
        memrev32ifbe(p+1);
    }
    return 1 + sizeof(uint32_t);
}

zipStorePrevEntryLength

Encode the length of the previous node and write it to p. If p is NULL, return the number of bytes needed to encode this length.

unsigned int zipStorePrevEntryLength(unsigned char *p, unsigned int len) {
    if (p == NULL) {
        return (len < ZIP_BIG_PREVLEN) ? 1 : sizeof(uint32_t) + 1;
    } else {
        if (len < ZIP_BIG_PREVLEN) { // Fits in the 1-byte form
            p[0] = len;
            return 1;
        } else {
            return zipStorePrevEntryLengthLarge(p,len);
        }
    }
}

zipPrevLenByteDiff

Returns the difference in the number of bytes needed to store prevlen when the length of the previous node changes to len. The result is the size of the new prevlen field minus the size of the current one, so it is 0, 4, or -4.

int zipPrevLenByteDiff(unsigned char *p, unsigned int len) {
    unsigned int prevlensize; // Length of prevlen
    ZIP_DECODE_PREVLENSIZE(p, prevlensize); // Get old prevlensize
    return zipStorePrevEntryLength(NULL, len) - prevlensize;
}

zipTryEncoding

Checks whether a string node can be encoded as an integer. On success, the integer is stored in v and the matching encoding in encoding.

int zipTryEncoding(unsigned char *entry, unsigned int entrylen, long long *v, unsigned char *encoding) { // *v receives the integer value, *encoding the matching encoding
    long long value;

    if (entrylen >= 32 || entrylen == 0) return 0;
    if (string2ll((char*)entry,entrylen,&value)) { // string2ll string conversion to long integer
        // Judge the integer range and determine its encoding type
        if (value >= 0 && value <= 12) {
            *encoding = ZIP_INT_IMM_MIN+value; // Code 1111 xxxx
        } else if (value >= INT8_MIN && value <= INT8_MAX) {
            *encoding = ZIP_INT_8B;
        } else if (value >= INT16_MIN && value <= INT16_MAX) {
            *encoding = ZIP_INT_16B;
        } else if (value >= INT24_MIN && value <= INT24_MAX) {
            *encoding = ZIP_INT_24B;
        } else if (value >= INT32_MIN && value <= INT32_MAX) {
            *encoding = ZIP_INT_32B;
        } else {
            *encoding = ZIP_INT_64B;
        }
        *v = value; // Assignment integer
        return 1;
    }
    return 0;
}

zipSaveInteger

Save the integer value to p.

void zipSaveInteger(unsigned char *p, int64_t value, unsigned char encoding) {
    int16_t i16;
    int32_t i32;
    int64_t i64;
    if (encoding == ZIP_INT_8B) {
        ((int8_t*)p)[0] = (int8_t)value;
    } else if (encoding == ZIP_INT_16B) {
        i16 = value;
        memcpy(p,&i16,sizeof(i16));
        memrev16ifbe(p);
    } else if (encoding == ZIP_INT_24B) { // 24bits
        i32 = value<<8;
        memrev32ifbe(&i32);
        memcpy(p,((uint8_t*)&i32)+1,sizeof(i32)-sizeof(uint8_t));
    } else if (encoding == ZIP_INT_32B) {
        i32 = value;
        memcpy(p,&i32,sizeof(i32));
        memrev32ifbe(p);
    } else if (encoding == ZIP_INT_64B) {
        i64 = value;
        memcpy(p,&i64,sizeof(i64));
        memrev64ifbe(p);
    } else if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) {
        // The value is in encoding
    } else {
        assert(NULL);
    }
}

zipLoadInteger

Read an integer value from p. This is the inverse of zipSaveInteger.

int64_t zipLoadInteger(unsigned char *p, unsigned char encoding) {
    int16_t i16;
    int32_t i32;
    int64_t i64, ret = 0;
    if (encoding == ZIP_INT_8B) {
        ret = ((int8_t*)p)[0];
    } else if (encoding == ZIP_INT_16B) {
        memcpy(&i16,p,sizeof(i16));
        memrev16ifbe(&i16);
        ret = i16;
    } else if (encoding == ZIP_INT_32B) {
        memcpy(&i32,p,sizeof(i32));
        memrev32ifbe(&i32);
        ret = i32;
    } else if (encoding == ZIP_INT_24B) {
        i32 = 0;
        memcpy(((uint8_t*)&i32)+1,p,sizeof(i32)-sizeof(uint8_t));
        memrev32ifbe(&i32);
        ret = i32>>8;
    } else if (encoding == ZIP_INT_64B) {
        memcpy(&i64,p,sizeof(i64));
        memrev64ifbe(&i64);
        ret = i64;
    } else if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) {
        ret = (encoding & ZIP_INT_IMM_MASK)-1; // The 4 bits hold 1 to 13, so subtract 1
    } else {
        assert(NULL);
    }
    return ret;
}

zipEntry

Fill the structure e with the information of node p.

static inline void zipEntry(unsigned char *p, zlentry *e) {
    ZIP_DECODE_PREVLEN(p, e->prevrawlensize, e->prevrawlen);
    ZIP_ENTRY_ENCODING(p + e->prevrawlensize, e->encoding);
    ZIP_DECODE_LENGTH(p + e->prevrawlensize, e->encoding, e->lensize, e->len);
    assert(e->lensize != 0); // Verify that encoding is valid
    e->headersize = e->prevrawlensize + e->lensize;
    e->p = p;
}

zipEntrySafe

The safe version of zipEntry, intended mainly for untrusted pointers. It guarantees that memory outside the ziplist is never accessed.

static inline int zipEntrySafe(unsigned char* zl, size_t zlbytes, unsigned char *p, zlentry *e, int validate_prevlen) {
    unsigned char *zlfirst = zl + ZIPLIST_HEADER_SIZE; // Pointer to the first node
    unsigned char *zllast = zl + zlbytes - ZIPLIST_END_SIZE; // Pointer to the end marker
    // Helper macro: is the pointer outside the range [zlfirst, zllast]?
#define OUT_OF_RANGE(p) (unlikely((p) < zlfirst || (p) > zllast))

    // Shortcut: taken when the header cannot possibly run past the end of the compressed list (lensize and prevrawlensize are at most 5 bytes each)
    if (p >= zlfirst && p + 10 < zllast) {
        ZIP_DECODE_PREVLEN(p, e->prevrawlensize, e->prevrawlen);
        ZIP_ENTRY_ENCODING(p + e->prevrawlensize, e->encoding);
        ZIP_DECODE_LENGTH(p + e->prevrawlensize, e->encoding, e->lensize, e->len);
        e->headersize = e->prevrawlensize + e->lensize;
        e->p = p;
        if (unlikely(e->lensize == 0)) // lensize is 0 when the encoding is invalid
            return 0;
        if (OUT_OF_RANGE(p + e->headersize + e->len)) // The node must not extend past the ziplist
            return 0;
        if (validate_prevlen && OUT_OF_RANGE(p - e->prevrawlen)) // prevlen must not point before the first node
            return 0;
        return 1;
    }

    if (OUT_OF_RANGE(p)) // Detect whether the pointer overflows
        return 0;

    ZIP_DECODE_PREVLENSIZE(p, e->prevrawlensize); // Set prevrawlensize
    if (OUT_OF_RANGE(p + e->prevrawlensize))
        return 0;

    // Check whether the code is valid
    ZIP_ENTRY_ENCODING(p + e->prevrawlensize, e->encoding);
    e->lensize = zipEncodingLenSize(e->encoding);
    if (unlikely(e->lensize == ZIP_ENCODING_SIZE_INVALID))
        return 0;

    // Detect whether the node header code overflows
    if (OUT_OF_RANGE(p + e->prevrawlensize + e->lensize))
        return 0;

    // Decode prevlen and node header length
    ZIP_DECODE_PREVLEN(p, e->prevrawlensize, e->prevrawlen);
    ZIP_DECODE_LENGTH(p + e->prevrawlensize, e->encoding, e->lensize, e->len);
    e->headersize = e->prevrawlensize + e->lensize;

    // Detect whether the node overflows
    if (OUT_OF_RANGE(p + e->headersize + e->len))
        return 0;

    // Detect whether prevlen overflows
    if (validate_prevlen && OUT_OF_RANGE(p - e->prevrawlen))
        return 0;

    e->p = p;
    return 1;
#undef OUT_OF_RANGE // Remove the OUT_OF_RANGE macro definition
}

zipRawEntryLengthSafe

Calculate the total number of bytes occupied by the p node.

static inline unsigned int zipRawEntryLengthSafe(unsigned char* zl, size_t zlbytes, unsigned char *p) {
    zlentry e;
    assert(zipEntrySafe(zl, zlbytes, p, &e, 0));
    return e.headersize + e.len;
}

zipRawEntryLength

Calculate the total number of bytes occupied by the p node. This is the unchecked version of zipRawEntryLengthSafe.

static inline unsigned int zipRawEntryLength(unsigned char *p) {
    zlentry e;
    zipEntry(p, &e);
    return e.headersize + e.len;
}

zipAssertValidEntry

Verify that the node does not overflow the compressed list range.

static inline void zipAssertValidEntry(unsigned char* zl, size_t zlbytes, unsigned char *p) {
    zlentry e;
    assert(zipEntrySafe(zl, zlbytes, p, &e, 1));
}

ziplistResize

Resize the compressed list.

unsigned char *ziplistResize(unsigned char *zl, unsigned int len) {
    zl = zrealloc(zl,len); // Reallocate space
    ZIPLIST_BYTES(zl) = intrev32ifbe(len); // Set total length
    zl[len-1] = ZIP_END; // Reset end node
    return zl;
}

__ziplistCascadeUpdate

Cascade update. When a node is inserted, the prevlen field of the next node must be set to the length of the inserted node. That length may not fit in 1 byte, in which case the next node has to grow to hold the 5-byte prevlen encoding. This comes for free, because it only happens while a node is already being inserted (which already involves a realloc and a memmove). However, once that node has grown, the prevlen field of the node after it may need to grow as well. When there is a run of consecutive nodes whose lengths are close to ZIP_BIG_PREVLEN, this effect can cascade through the entire compressed list, so each consecutive node must be checked to see whether its prevlen can still be encoded. The same effect can also occur in reverse, when a prevlen field could shrink.

unsigned char *__ziplistCascadeUpdate(unsigned char *zl, unsigned char *p) { // It's hard
    // *p points to the first node that does not need to be updated
    zlentry cur; // Template node
    size_t prevlen, prevlensize, prevoffset; // Last update node information
    size_t firstentrylen; // Used to process header insertion
    size_t rawlen, curlen = intrev32ifbe(ZIPLIST_BYTES(zl));
    size_t extra = 0, cnt = 0, offset;
    size_t delta = 4; // The number of additional bytes required to update the prevlen attribute of a node (5-1)
    unsigned char *tail = zl + intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl)); // Tail node pointer

    // Empty compressed list
    if (p[0] == ZIP_END) return zl;
	
    zipEntry(p, &cur); // A secure version is not required because the input pointer is validated in the function that returns it
    firstentrylen = prevlen = cur.headersize + cur.len; // Total length of the current node = header length + content length
    prevlensize = zipStorePrevEntryLength(NULL, prevlen); // Size of the prevlen field needed to store that length
    prevoffset = p - zl; // Offset of the current node from zl
    p += prevlen; // prevlen is the total length of the current node, so this advances p to the next node

    // Iteratively compress the list to find the number of extra bytes needed to update it
    while (p[0] != ZIP_END) { // Up to the end node
        assert(zipEntrySafe(zl, curlen, p, &cur, 0)); // Effective node

        // Abort when prevlen is not updated
        if (cur.prevrawlen == prevlen) break;

        // Abort when the prevlensize of the node is large enough
        if (cur.prevrawlensize >= prevlensize) {
            if (cur.prevrawlensize == prevlensize) { // prevlen fields are equal in length
                zipStorePrevEntryLength(p, prevlen); // Set corresponding length
            } else {
                // This will lead to shrinkage, which we need to avoid, so set prevlen to the available bytes
                zipStorePrevEntryLengthLarge(p, prevlen);
            }
            break;
        }

        // cur.prevrawlen is 0 when cur was the head node before an insertion at the head
        // Otherwise the old length of the previous node + delta must equal its new length
        assert(cur.prevrawlen == 0 || cur.prevrawlen + delta == prevlen);

        // Record this node as the last one to update and advance the pointer
        rawlen = cur.headersize + cur.len; // Current (old) length of this node
        prevlen = rawlen + delta;  // Length of this node once its prevlen field has grown
        prevlensize = zipStorePrevEntryLength(NULL, prevlen); // Size of the prevlen field needed for the new length
        prevoffset = p - zl; // Offset from zl of the node being updated
        p += rawlen; // Advance to the next node
        extra += delta; // Accumulate the extra bytes needed
        cnt++; // Number of nodes that need to grow
    }

    // extra is 0: either everything has been updated in place or no update was needed
    if (extra == 0) return zl;

    // Update the tail offset after the loop
    if (tail == zl + prevoffset) {
        // The last node to update happens to be the tail node, so the tail offset moves unless it is the only node being updated (in that case the tail offset does not change)
        if (extra - delta != 0) { // extra is larger than delta: nodes in front of the tail also grow
            // Excluding the tail's own growth leaves the bytes added by the nodes in front of it
            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+extra-delta);
        }
    } else {
        // Otherwise the tail lies after every updated node, so it shifts by the full extra bytes
        ZIPLIST_TAIL_OFFSET(zl) =
            intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+extra);
    }

    // p now points at the first byte of the original compressed list that does not change
    // Move the data after it into place in the resized list
    offset = p - zl; // Everything before this offset belongs to nodes that change
    zl = ziplistResize(zl, curlen + extra); // New size = original total length + extra bytes
    p = zl + offset; // Re-derive p, since the reallocation may have moved the list
    // Length to copy: total length - offset - 1 (the end marker is written by ziplistResize)
    // Destination p+extra: the new position of the unchanged data
    // Source p: its old position
    memmove(p + extra, p, curlen - offset - 1);
    p += extra; // p now points at the unchanged data in its new position

    // Iterate over all nodes that need updating, from tail to head
    while (cnt) { // One iteration per node that grows
        zipEntry(zl + prevoffset, &cur); // Start from the last node recorded by the first loop and work backward
        rawlen = cur.headersize + cur.len; // Current (old) length of this node
        // Move the node toward the tail, then rewrite its prevlen
        // Copy everything except the old prevlen field: the encoding header and the data
        memmove(p - (rawlen - cur.prevrawlensize),
                zl + prevoffset + cur.prevrawlensize,
                rawlen - cur.prevrawlensize);
        p -= (rawlen + delta); // Move p back to the start of the enlarged node
        if (cur.prevrawlen == 0) { // cur was the old head node: set its prevlen to the length of the first node
            zipStorePrevEntryLength(p, firstentrylen);
        } else { // Otherwise the previous node grew by delta (4 bytes)
            zipStorePrevEntryLength(p, cur.prevrawlen+delta);
        }
        // Step back to the previous node
        prevoffset -= cur.prevrawlen;
        cnt--;
    }
    return zl;
}
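The first loop of __ziplistCascadeUpdate can be hard to picture from the pointer arithmetic alone. The following is a deliberately simplified model of it, not Redis code: nodes are reduced to two arrays (total size and prevlen field size), and cascade_grow is a hypothetical function that only counts which nodes must grow. It stops at the first node whose prevlen field is already large enough, as the real loop does, and never shrinks a field.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed by a prevlen field that stores `len`: 1 or 5. */
static size_t prevlen_field_size(size_t len) {
    return len < 254 ? 1 : 5;
}

/* Simplified model: size[i] is the total byte length of node i and
 * prevsz[i] is the current size of its prevlen field. Node `start` has
 * just changed size. Grow the following nodes as needed and return how
 * many of them had to grow by 4 bytes. */
static size_t cascade_grow(size_t *size, size_t *prevsz, size_t n, size_t start) {
    size_t grown = 0;
    assert(size != NULL && prevsz != NULL && start < n);
    for (size_t i = start + 1; i < n; i++) {
        size_t needed = prevlen_field_size(size[i - 1]);
        if (prevsz[i] >= needed) break; /* field already large enough: stop */
        prevsz[i] = 5;                  /* prevlen field goes from 1 to 5 bytes */
        size[i] += 4;                   /* so the node itself grows by delta */
        grown++;
    }
    return grown;
}
```

With node sizes 300, 250, 250, 100, 100 and 1-byte prevlen fields everywhere, the two 250-byte nodes each become 254 bytes and push the growth one node further, the first 100-byte node grows to 104 bytes, and the cascade stops there: three nodes grow, for 12 extra bytes in total.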

__ziplistDelete

From the compressed list, delete num nodes starting from p. Returns a pointer to the compressed list.

unsigned char *__ziplistDelete(unsigned char *zl, unsigned char *p, unsigned int num) {
    // i: loop counter
    // totlen: total number of bytes to delete
    // deleted: number of nodes actually deleted (deleted <= num)
    unsigned int i, totlen, deleted = 0;
    size_t offset;
    int nextdiff = 0;
    zlentry first, tail; // first: first node to delete; tail: node that follows the deleted range
    size_t zlbytes = intrev32ifbe(ZIPLIST_BYTES(zl)); 

    zipEntry(p, &first); // Parse the first node to delete
    for (i = 0; p[0] != ZIP_END && i < num; i++) { // ZIP_END may be reached before num nodes, which is why deleted can be < num
        p += zipRawEntryLengthSafe(zl, zlbytes, p); // Advance p by the raw size of the current node
        deleted++;
    }
    // p now points just past the last node to delete
    assert(p >= first.p); // p only moved forward
    totlen = p-first.p; // Total number of bytes occupied by the deleted nodes
    if (totlen > 0) { // At least one node is being deleted
        uint32_t set_tail;
        if (p[0] != ZIP_END) { // A node remains after the deleted range
            // The previous node of p changes, so its prevlen field may need more or fewer bytes than it has now.
            // There is always room for it, because the space comes from the nodes being deleted.
            // nextdiff is 0, 4 or -4
            nextdiff = zipPrevLenByteDiff(p,first.prevrawlen);

            // Moving p back always leaves enough room:
            // if the new previous node is large, one of the deleted nodes had a 5-byte prevlen header, so at least 5 bytes are freed here and only 4 are needed
            p -= nextdiff; // Adjust p for the new prevlen field size
            assert(p >= first.p && p<zl+zlbytes-1);
            zipStorePrevEntryLength(p,first.prevrawlen); // Store the new prevlen

            // The tail offset moves back by totlen bytes
            set_tail = intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))-totlen;

            // If more than one node follows the deleted range, nextdiff must also be taken into account. Otherwise p itself is the tail, and a change in its prevlen size does not affect the tail offset
            assert(zipEntrySafe(zl, zlbytes, p, &tail, 1)); // Parse the node at p
            if (p[tail.headersize+tail.len] != ZIP_END) { // p is not the last node: add nextdiff
                set_tail = set_tail + nextdiff;
            }

            // Move everything from p to the end over the deleted range
            // The assertion above guarantees p >= first.p, so the move stays inside the allocation even if node lengths are corrupted
            size_t bytes_to_move = zlbytes-(p-zl)-1; // The -1 excludes the ZIP_END byte
            memmove(first.p,p,bytes_to_move);
        } else { // Everything up to the end marker was deleted, so there is nothing to move
            set_tail = (first.p-zl)-first.prevrawlen; // The new tail is the node before first
        }
        }

        // Resize compressed list
        offset = first.p-zl;
        zlbytes -= totlen - nextdiff; // Subtract the deleted bytes, corrected by the change in prevlen size
        zl = ziplistResize(zl, zlbytes);
        p = zl+offset;

        ZIPLIST_INCR_LENGTH(zl,-deleted); // Update the node count

        // Set the tail offset calculated above
        assert(set_tail <= zlbytes - ZIPLIST_END_SIZE);
        ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(set_tail);

        // When nextdiff != 0, the raw length of the next node has changed, so the update must cascade through the compressed list
        if (nextdiff != 0)
            zl = __ziplistCascadeUpdate(zl,p);
    }
    return zl;
}
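The nextdiff value used above can only be 0, 4 or -4. A minimal sketch of that calculation, with hypothetical helper names (`prevlen_size`, `prevlen_byte_diff`) standing in for the internal Redis helpers and based on the 1-byte/5-byte rule described at the start of this article:

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to store a previous-node length: 1 below 254, otherwise 5. */
int prevlen_size(size_t len) {
    return len < 254 ? 1 : 5;
}

/* Difference between the field size needed for new_prevlen and the size of
 * the prevlen field the node has right now. The result is 0, +4 or -4. */
int prevlen_byte_diff(int current_field_size, size_t new_prevlen) {
    return prevlen_size(new_prevlen) - current_field_size;
}
```

A result of +4 means the node after the deleted range must grow, -4 means it can shrink, and any non-zero value is what triggers the cascade update.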

__ziplistInsert

Inserts a new node holding the value s (of length slen) at position p of the compressed list zl. The node that was at p, if any, is shifted back.

unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {
    // curlen: total number of bytes in the current compressed list
    size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), reqlen, newlen;
    // prevlen: length of the node before the insertion point; prevlensize: size of the field that stores it
    unsigned int prevlensize, prevlen = 0;
    size_t offset;
    int nextdiff = 0;
    unsigned char encoding = 0;
    long long value = 123456789; // Initialized only to avoid a compiler warning
    zlentry tail;

    // Find prevlen and prevlensize for the node to insert
    if (p[0] != ZIP_END) { // Inserting before an existing node: take over its prevlen
        ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
    } else { // Inserting at the end of the list
        unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl); // Current tail node
        if (ptail[0] != ZIP_END) { // The list is not empty
            prevlen = zipRawEntryLengthSafe(zl, curlen, ptail);
        }
    }

    // Check whether the value can be stored as an integer
    if (zipTryEncoding(s,slen,&value,&encoding)) {
        // encoding now holds the integer encoding that fits the value
        reqlen = zipIntSize(encoding);
    } else {
        // Not an integer: encoding stays 0, and zipStoreEntryEncoding derives the string encoding from slen
        reqlen = slen;
    }
    // Add room for the previous node's length and for the encoding header
    reqlen += zipStorePrevEntryLength(NULL,prevlen); // Size of the prevlen field
    reqlen += zipStoreEntryEncoding(NULL,encoding,slen); // Size of the encoding field

    // When not inserting at the tail, make sure the prevlen field of the next node can hold the length of the new node
    int forcelarge = 0; // Set to 1 to keep a 5-byte prevlen field for a small length
    // At the tail there is no next node to adjust.
    // nextdiff = bytes needed to store reqlen - current prevlen field size of p: 0, 4 or -4
    nextdiff = (p[0] != ZIP_END) ? zipPrevLenByteDiff(p,reqlen) : 0;
    if (nextdiff == -4 && reqlen < 4) { // Shrinking would make the list smaller than before, so keep the large field
        nextdiff = 0;
        forcelarge = 1;
    }

    // Save the offset of p, because realloc may change the address of zl
    offset = p-zl;
    newlen = curlen+reqlen+nextdiff; // Original length + new node length + prevlen correction
    zl = ziplistResize(zl,newlen);
    p = zl+offset; // Recompute p inside the resized list

    // Move memory if needed and update the tail offset
    if (p[0] != ZIP_END) {
        // Move the data starting at p-nextdiff to p+reqlen, which frees reqlen bytes at p for the new node.
        // The length is curlen-offset-1+nextdiff; the -1 excludes the ZIP_END byte
        memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff);

        // Store the raw length of the new node in the prevlen field of the next node
        if (forcelarge) // p+reqlen
            zipStorePrevEntryLengthLarge(p+reqlen,reqlen);
        else
            zipStorePrevEntryLength(p+reqlen,reqlen);

        // Update tail offset
        ZIPLIST_TAIL_OFFSET(zl) =
            intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+reqlen);

        // If more than one node follows the new node, nextdiff must also be added. Otherwise the next node is the tail itself, and a change in its prevlen size does not affect the tail offset
        assert(zipEntrySafe(zl, newlen, p+reqlen, &tail, 1));
        if (p[reqlen+tail.headersize+tail.len] != ZIP_END) {
            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
        }
    } else { // The node becomes the new tail
        ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(p-zl);
    }

    // When nextdiff is not 0, the raw length of the next node changes, so the update must cascade through the compressed list
    if (nextdiff != 0) {
        offset = p-zl;
        zl = __ziplistCascadeUpdate(zl,p+reqlen);
        p = zl+offset;
    }

    // Write node
    p += zipStorePrevEntryLength(p,prevlen); // Write prevlen
    p += zipStoreEntryEncoding(p,encoding,slen); // Write encoding
    // Write entry data
    if (ZIP_IS_STR(encoding)) { // String copy
        memcpy(p,s,slen);
    } else {
        zipSaveInteger(p,value,encoding);
    }
    ZIPLIST_INCR_LENGTH(zl,1); // Update the node count
    return zl;
}
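The reqlen calculation is the part of the insert that is easiest to check by hand. Below is a rough sketch for string values only. The function names are hypothetical, and the thresholds for the encoding header (1 byte up to 63 bytes of data, 2 bytes up to 16383, otherwise 5) reflect my reading of the ziplist string encodings, so treat them as an assumption rather than a quote from the source.

```c
#include <assert.h>
#include <stddef.h>

/* Size of the prevlen field: 1 byte below 254, otherwise 5. */
size_t prevlen_field_size(size_t prevlen) {
    return prevlen < 254 ? 1 : 5;
}

/* Size of the encoding header for a string of slen bytes
 * (assumed thresholds: 6-bit, 14-bit and 32-bit length encodings). */
size_t str_encoding_size(size_t slen) {
    if (slen <= 0x3f) return 1;
    if (slen <= 0x3fff) return 2;
    return 5;
}

/* Total bytes a string node needs: prevlen field + encoding header + data. */
size_t str_entry_reqlen(size_t prevlen, size_t slen) {
    return prevlen_field_size(prevlen) + str_encoding_size(slen) + slen;
}
```

For example, a 5-byte string inserted at the head needs 1 + 1 + 5 = 7 bytes, while a 100-byte string placed after a 300-byte node needs 5 + 2 + 100 = 107 bytes.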

uintCompare

Compares two unsigned integers. It is the comparison callback passed to qsort.

int uintCompare(const void *a, const void *b) {
    return (*(unsigned int *) a - *(unsigned int *) b);
}
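The subtraction works for the small even indexes that ziplistRandomPairs sorts, but the unsigned difference is converted to int, so it can report the wrong sign when the two values are very far apart. A sketch of a comparator that avoids the subtraction follows; `uint_compare_safe` and `sorted_sample_at` are hypothetical names that are not in Redis, and the sample assumes a 32-bit unsigned int.

```c
#include <assert.h>
#include <stdlib.h>

/* qsort comparator: orders unsigned ints ascending without subtracting,
 * so it cannot wrap around for large differences. */
int uint_compare_safe(const void *a, const void *b) {
    unsigned int x = *(const unsigned int *)a;
    unsigned int y = *(const unsigned int *)b;
    return (x > y) - (x < y);
}

/* Sorts a small fixed sample and returns the element at position pos. */
unsigned int sorted_sample_at(size_t pos) {
    unsigned int v[] = {8, 2, 4000000000u, 0, 6};
    qsort(v, 5, sizeof v[0], uint_compare_safe);
    return v[pos];
}
```

The expression `(x > y) - (x < y)` evaluates to -1, 0 or 1, which is all that qsort needs.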

ziplistSaveValue

Helper that stores a value, either a string (val, len) or an integer (lval), in the target structure dest.

/* Helper method to store a string into from val or lval into dest */
static inline void ziplistSaveValue(unsigned char *val, unsigned int len, long long lval, ziplistEntry *dest) {
    dest->sval = val;
    dest->slen = len;
    dest->lval = lval;
}

Application programming interface

ziplistNew

Creates a new, empty compressed list and returns a pointer to it.

unsigned char *ziplistNew(void) {
    unsigned int bytes = ZIPLIST_HEADER_SIZE+ZIPLIST_END_SIZE; // Size of an empty list: header + end marker
    unsigned char *zl = zmalloc(bytes); // Allocate space
    ZIPLIST_BYTES(zl) = intrev32ifbe(bytes); // Set the total length of the compressed list
    ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(ZIPLIST_HEADER_SIZE); // Set the tail offset (right after the header)
    ZIPLIST_LENGTH(zl) = 0; // Set the node count to 0
    zl[bytes-1] = ZIP_END; // Write the end marker
    return zl; // Return the zl pointer
}
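An empty compressed list is therefore only 11 bytes long: a 10-byte header (4 + 4 + 2) plus the end marker. The following stand-alone sketch builds the same layout with plain malloc. The `toy_` names are hypothetical, and the header fields are written explicitly in little-endian order, which is the byte order that intrev32ifbe produces.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_HEADER_SIZE 10 /* zlbytes (4) + zltail (4) + zllen (2) */
#define TOY_END_SIZE 1
#define TOY_END 0xFF

/* Stores a 32-bit value in little-endian byte order. */
static void put_le32(unsigned char *dst, uint32_t v) {
    dst[0] = (unsigned char)(v & 0xFF);
    dst[1] = (unsigned char)((v >> 8) & 0xFF);
    dst[2] = (unsigned char)((v >> 16) & 0xFF);
    dst[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Builds an empty list: header followed directly by the end marker. */
unsigned char *toy_ziplist_new(void) {
    uint32_t bytes = TOY_HEADER_SIZE + TOY_END_SIZE;
    unsigned char *zl = malloc(bytes);
    if (zl == NULL) return NULL;
    put_le32(zl, bytes);               /* zlbytes = 11 */
    put_le32(zl + 4, TOY_HEADER_SIZE); /* zltail = 10 */
    zl[8] = 0;                         /* zllen = 0 (2 bytes) */
    zl[9] = 0;
    zl[bytes - 1] = TOY_END;           /* end marker */
    return zl;
}

/* Returns byte i of a freshly created empty list, or -1 on error. */
int toy_byte_at(size_t i) {
    unsigned char *zl = toy_ziplist_new();
    if (zl == NULL || i >= TOY_HEADER_SIZE + TOY_END_SIZE) {
        free(zl);
        return -1;
    }
    int b = zl[i];
    free(zl);
    return b;
}
```

Note that the tail offset of an empty list points at the end marker itself, which is why the functions above check `ptail[0] != ZIP_END` before using the tail node.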

ziplistMerge

Merges two compressed lists, first and second, by appending second to first. The list with more nodes is reallocated to hold the merged result; the other list is freed and its pointer is set to NULL.

unsigned char *ziplistMerge(unsigned char **first, unsigned char **second) {
    // If either list pointer is NULL, abort and return NULL
    if (first == NULL || *first == NULL || second == NULL || *second == NULL) return NULL;

    // A list cannot be merged with itself
    if (*first == *second) return NULL;

    size_t first_bytes = intrev32ifbe(ZIPLIST_BYTES(*first)); // Total zl1 bytes
    size_t first_len = intrev16ifbe(ZIPLIST_LENGTH(*first)); // Total zl1 nodes

    size_t second_bytes = intrev32ifbe(ZIPLIST_BYTES(*second)); // Total zl2 bytes
    size_t second_len = intrev16ifbe(ZIPLIST_LENGTH(*second)); // Total zl2 nodes

    int append;
    unsigned char *source, *target; // Source & target
    size_t target_bytes, source_bytes;
    // Use the list with more nodes as the target, so that it can be resized in place
    // We also need to know whether to append to or prepend to the target
    if (first_len >= second_len) { // Keep first and append second to it
        target = *first; // target
        target_bytes = first_bytes;
        source = *second; // source
        source_bytes = second_bytes;
        append = 1;
    } else { // Keep second and prepend first to it
        target = *second;
        target_bytes = second_bytes;
        source = *first;
        source_bytes = first_bytes;
        append = 0;
    }

    // Total number of bytes of the merged list (one HEADER+END pair is dropped)
    size_t zlbytes = first_bytes + second_bytes -
                     ZIPLIST_HEADER_SIZE - ZIPLIST_END_SIZE;
    size_t zllength = first_len + second_len; // Sum of both node counts

    // zllen is only 16 bits wide, so the merged count is capped at UINT16_MAX
    zllength = zllength < UINT16_MAX ? zllength : UINT16_MAX;

    // Save the tail offsets before the memory is modified
    size_t first_offset = intrev32ifbe(ZIPLIST_TAIL_OFFSET(*first));
    size_t second_offset = intrev32ifbe(ZIPLIST_TAIL_OFFSET(*second));

    // Resize target to the merged size, then append or prepend source
    target = zrealloc(target, zlbytes); // Reallocate space zlbytes
    if (append) { // Append
        // Copy source, without its HEADER, over the END byte of target
        memcpy(target + target_bytes - ZIPLIST_END_SIZE,
               source + ZIPLIST_HEADER_SIZE,
               source_bytes - ZIPLIST_HEADER_SIZE);
    } else { // Prepend
        // Move the nodes of target (and its END byte) back to make room for the nodes of source,
        // then copy source (HEADER and nodes, without END) to the front
        memmove(target + source_bytes - ZIPLIST_END_SIZE,
                target + ZIPLIST_HEADER_SIZE,
                target_bytes - ZIPLIST_HEADER_SIZE); // Move the target nodes back as a whole
        memcpy(target, source, source_bytes - ZIPLIST_END_SIZE); // Copy source to the front
    }

    // Update header meta information zlbytes zllen zltail
    ZIPLIST_BYTES(target) = intrev32ifbe(zlbytes);
    ZIPLIST_LENGTH(target) = intrev16ifbe(zllength);
    ZIPLIST_TAIL_OFFSET(target) = intrev32ifbe(
                                   (first_bytes - ZIPLIST_END_SIZE) +
                                   (second_offset - ZIPLIST_HEADER_SIZE));

    // The prevlen of the first node of second is no longer correct, so run a cascade update starting at target+first_offset
    target = __ziplistCascadeUpdate(target, target+first_offset);

    // Free the source list and set its pointer to NULL
    if (append) {
        zfree(*second);
        *second = NULL;
        *first = target;
    } else {
        zfree(*first);
        *first = NULL;
        *second = target;
    }
    return target;
}
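The header arithmetic of the merge can be verified in isolation. This is a minimal sketch, not the Redis code: `zl_meta`, `merged_meta` and the two sample values are hypothetical, and only the three header fields are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HDR 10 /* header size: zlbytes + zltail + zllen */
#define END 1  /* end marker size */

typedef struct {
    size_t bytes;       /* zlbytes */
    size_t tail_offset; /* zltail */
    size_t len;         /* zllen */
} zl_meta;

/* Header values of the list obtained by appending second to first. */
zl_meta merged_meta(zl_meta first, zl_meta second) {
    zl_meta out;
    out.bytes = first.bytes + second.bytes - HDR - END; /* one header and one end marker disappear */
    out.tail_offset = (first.bytes - END) + (second.tail_offset - HDR);
    out.len = first.len + second.len;
    if (out.len > UINT16_MAX) out.len = UINT16_MAX;     /* zllen saturates */
    return out;
}

/* Sample headers: 21 bytes with the tail at 17, and 30 bytes with the tail at 20. */
const zl_meta META_FIRST = {21, 17, 3};
const zl_meta META_SECOND = {30, 20, 5};
```

For the two samples the merged list has 21 + 30 - 11 = 40 bytes, and its tail sits at (21 - 1) + (20 - 10) = 30: the nodes of second start where the end marker of first used to be.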

ziplistPush

Inserts the new node s at the head or at the tail of the compressed list zl, as selected by where. It is implemented by calling __ziplistInsert.

unsigned char *ziplistPush(unsigned char *zl, unsigned char *s, unsigned int slen, int where) {
    unsigned char *p;
    p = (where == ZIPLIST_HEAD) ? ZIPLIST_ENTRY_HEAD(zl) : ZIPLIST_ENTRY_END(zl);
    return __ziplistInsert(zl,p,s,slen);
}

ziplistIndex

Returns the node at the given index of the compressed list. A negative index counts from the tail (-1 is the last node). Returns NULL if the index is out of range.

unsigned char *ziplistIndex(unsigned char *zl, int index) {
    unsigned char *p;
    unsigned int prevlensize, prevlen = 0;
    size_t zlbytes = intrev32ifbe(ZIPLIST_BYTES(zl));
    if (index < 0) { // Negative index: start from the tail
        index = (-index)-1;
        p = ZIPLIST_ENTRY_TAIL(zl); // Get the tail node
        if (p[0] != ZIP_END) { // The list is not empty
            ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
            while (prevlen > 0 && index--) { // prevlen > 0 means the head node has not been reached yet
                p -= prevlen; // Step back to the previous node
                assert(p >= zl + ZIPLIST_HEADER_SIZE && p < zl + zlbytes - ZIPLIST_END_SIZE); // p must stay inside the node area
                ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
            }
        }
    } else { // Non-negative index: start from the head
        p = ZIPLIST_ENTRY_HEAD(zl); // Get the head node
        while (index--) {
            p += zipRawEntryLengthSafe(zl, zlbytes, p);
            if (p[0] == ZIP_END) // Reached the end marker
                break;
        }
    }
    if (p[0] == ZIP_END || index > 0) // End marker reached, or index out of range
        return NULL;
    zipAssertValidEntry(zl, zlbytes, p); // Verify that the node is legal
    return p; // Return node
}
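The two walking directions are easier to follow on a simplified layout. The sketch below uses a toy format that is not the real ziplist encoding: every node is `[prevlen:1][len:1][payload]`, there is no header, and the list ends with 0xFF. All names (`TOY_LIST`, `toy_index`, and so on) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_END 0xFF

/* Toy layout: each node is [prevlen:1][len:1][payload:len]. */
const unsigned char TOY_LIST[] = {
    0, 1, 'a',      /* offset 0: head node, prevlen 0 */
    3, 2, 'b', 'c', /* offset 3 */
    4, 1, 'd',      /* offset 7: tail node */
    TOY_END
};
#define TOY_TAIL_OFFSET 7

/* Returns the node at index, or NULL if the index is out of range.
 * Negative indexes count from the tail: -1 is the last node. */
const unsigned char *toy_index(const unsigned char *zl, size_t tail, int index) {
    const unsigned char *p;
    if (index < 0) {
        int steps = (-index) - 1;         /* -1 means the tail itself */
        p = zl + tail;
        if (p[0] == TOY_END) return NULL; /* empty list */
        while (steps > 0 && p[0] > 0) {   /* prevlen 0 marks the head node */
            p -= p[0];                    /* step back by prevlen */
            steps--;
        }
        if (steps > 0) return NULL;       /* ran past the head */
    } else {
        p = zl;
        while (index > 0 && p[0] != TOY_END) {
            p += 2 + p[1];                /* skip header and payload */
            index--;
        }
        if (p[0] == TOY_END) return NULL; /* ran past the tail */
    }
    return p;
}
```

Walking backwards only needs the prevlen byte of each node, while walking forwards needs the full node length. That is the reason every node stores the length of its predecessor.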

ziplistNext

Returns the next node of the specified node p in the compressed list.

unsigned char *ziplistNext(unsigned char *zl, unsigned char *p) {
    ((void) zl);
    size_t zlbytes = intrev32ifbe(ZIPLIST_BYTES(zl));
    // Return NULL if p is the end marker, or if the node after p is the end marker
    if (p[0] == ZIP_END) return NULL;
    p += zipRawEntryLength(p);
    if (p[0] == ZIP_END) return NULL;
    zipAssertValidEntry(zl, zlbytes, p);
    return p;
}

ziplistPrev

Returns the previous node of the specified node p in the compressed list.

unsigned char *ziplistPrev(unsigned char *zl, unsigned char *p) {
    unsigned int prevlensize, prevlen = 0;
    // p is the end marker: return the tail node, or NULL if the list is empty
    if (p[0] == ZIP_END) {
        p = ZIPLIST_ENTRY_TAIL(zl);
        return (p[0] == ZIP_END) ? NULL : p;
    } else if (p == ZIPLIST_ENTRY_HEAD(zl)) { // The head node has no previous node
        return NULL;
    } else {
        ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
        assert(prevlen > 0); // Non head node
        p-=prevlen; // Forward offset
        size_t zlbytes = intrev32ifbe(ZIPLIST_BYTES(zl));
        zipAssertValidEntry(zl, zlbytes, p);
        return p;
    }
}

ziplistGet

Reads the node at pointer p and stores its value in *sstr and *slen (string) or in *sval (integer), depending on the node encoding. Returns 0 if p is NULL or points at the end marker, and 1 otherwise.

// *sstr string slen string length
// sval integer
unsigned int ziplistGet(unsigned char *p, unsigned char **sstr, unsigned int *slen, long long *sval) {
    zlentry entry;
    if (p == NULL || p[0] == ZIP_END) return 0;
    if (sstr) *sstr = NULL; // Reset

    zipEntry(p, &entry);
    if (ZIP_IS_STR(entry.encoding)) { // Judge whether it is a string
        if (sstr) {
            *slen = entry.len;
            *sstr = p+entry.headersize;
        }
    } else {
        if (sval) {
            *sval = zipLoadInteger(p+entry.headersize,entry.encoding);
        }
    }
    return 1;
}

ziplistInsert

Inserts a new node at position p. It directly calls __ziplistInsert.

ziplistDelete

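Deletes the single node pointed to by *p. *p is updated in place, so that the caller can keep iterating over the list while deleting nodes.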

unsigned char *ziplistDelete(unsigned char *zl, unsigned char **p) {
    size_t offset = *p-zl;
    zl = __ziplistDelete(zl,*p,1);
    *p = zl+offset;
    return zl;
}
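The "save the offset, resize, rebuild the pointer" pattern appears in several functions above, because a pointer into the list becomes invalid as soon as the list is reallocated. A generic sketch with plain realloc follows; `resize_keep_cursor` and `cursor_byte_after_grow` are hypothetical helpers, not Redis functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Resizes *buf and returns the cursor re-based onto the (possibly moved)
 * buffer. Returns NULL and leaves *buf untouched if realloc fails. */
unsigned char *resize_keep_cursor(unsigned char **buf, size_t new_size,
                                  unsigned char *cursor) {
    size_t offset = (size_t)(cursor - *buf); /* save before realloc */
    unsigned char *moved = realloc(*buf, new_size);
    if (moved == NULL) return NULL;
    *buf = moved;
    return moved + offset;                   /* rebuild after realloc */
}

/* Demo: point at the third byte, grow the buffer, read through the cursor. */
int cursor_byte_after_grow(void) {
    unsigned char *buf = malloc(4);
    if (buf == NULL) return -1;
    memcpy(buf, "abcd", 4);
    unsigned char *cur = resize_keep_cursor(&buf, 4096, buf + 2);
    int byte = cur ? *cur : -1;
    free(buf);
    return byte;
}
```

The offset is an integer, so it stays meaningful even when the allocator moves the block to a different address.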

ziplistDeleteRange

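Deletes num consecutive nodes, starting at the node with the given index. If the index is out of range, the list is returned unchanged.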

unsigned char *ziplistDeleteRange(unsigned char *zl, int index, unsigned int num) {
    unsigned char *p = ziplistIndex(zl,index);
    return (p == NULL) ? zl : __ziplistDelete(zl,p,num);
}

ziplistReplace

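Replaces the node at p with s. This is equivalent to a delete followed by an insert, but when the new value needs exactly as many bytes as the old one, the node is rewritten in place.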

unsigned char *ziplistReplace(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {

    // Get the meta information of the current node
    zlentry entry;
    zipEntry(p, &entry);

    // Calculate the length of the node to store, excluding prevlen
    unsigned int reqlen;
    unsigned char encoding = 0;
    long long value = 123456789;
    if (zipTryEncoding(s,slen,&value,&encoding)) { // Try to encode s as an integer
        reqlen = zipIntSize(encoding); // Number of bytes required by the integer encoding
    } else { // String
        reqlen = slen; /* encoding == 0 */
    }
    reqlen += zipStoreEntryEncoding(NULL,encoding,slen); // Size of the encoding field

    if (reqlen == entry.lensize + entry.len) { // Same size as the current node
        // Simple rewrite node
        p += entry.prevrawlensize;
        p += zipStoreEntryEncoding(p,encoding,slen); // encoding
        // Copy value
        if (ZIP_IS_STR(encoding)) {
            memcpy(p,s,slen);
        } else {
            zipSaveInteger(p,value,encoding);
        }
    } else { // Delete & add
        zl = ziplistDelete(zl,&p);
        zl = ziplistInsert(zl,p,s,slen);
    }
    return zl;
}

ziplistFind

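Searches for a node equal to the given value (vstr, vlen), starting at p, and returns a pointer to it, or NULL if there is none. After each comparison, the next skip nodes are skipped without being compared.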

unsigned char *ziplistFind(unsigned char *zl, unsigned char *p, unsigned char *vstr, unsigned int vlen, unsigned int skip) {
    int skipcnt = 0;
    unsigned char vencoding = 0;
    long long vll = 0;
    size_t zlbytes = ziplistBlobLen(zl);

    while (p[0] != ZIP_END) { // Not an end node
        struct zlentry e;
        unsigned char *q;

        assert(zipEntrySafe(zl, zlbytes, p, &e, 1));
        q = p + e.prevrawlensize + e.lensize;

        if (skipcnt == 0) {
            // Compare the current node with the value searched for
            if (ZIP_IS_STR(e.encoding)) { // String
                if (e.len == vlen && memcmp(q, vstr, vlen) == 0) { // Found: return node p
                    return p;
                }
            } else { // Integer
                if (vencoding == 0) {
                    if (!zipTryEncoding(vstr, vlen, &vll, &vencoding)) {
                        // vstr cannot be encoded as an integer: set vencoding to UCHAR_MAX so that the encoding is not attempted again
                        vencoding = UCHAR_MAX;
                    }
                    // vencoding must be non-zero now
                    assert(vencoding);
                }

                // Compare only when vencoding is not UCHAR_MAX (no such encoding exists, so the value marks "not a valid integer")
                if (vencoding != UCHAR_MAX) {
                    long long ll = zipLoadInteger(q, e.encoding);
                    if (ll == vll) {
                        return p;
                    }
                }
            }

            // Reset skip statistics
            skipcnt = skip;
        } else { // Skip node
            skipcnt--;
        }
        p = q + e.len; // Node offset
    }

    return NULL;
}
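The skip parameter is what lets a caller search only the keys of a list that stores key-value pairs. A small sketch of the same counter logic on an array of C strings follows; `find_with_skip` and `PAIRS` are hypothetical, and plain strings replace the encoded nodes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sample list laid out as key, value, key, value. */
const char *const PAIRS[] = {"k1", "v1", "k2", "v2"};

/* Returns the index of the first compared item equal to needle, or -1.
 * After each comparison the next `skip` items are passed over. */
int find_with_skip(const char *const *items, size_t n, const char *needle,
                   unsigned int skip) {
    unsigned int skipcnt = 0;
    for (size_t i = 0; i < n; i++) {
        if (skipcnt == 0) {
            if (strcmp(items[i], needle) == 0) return (int)i;
            skipcnt = skip; /* reset the skip counter */
        } else {
            skipcnt--;      /* skipped item */
        }
    }
    return -1;
}
```

With skip set to 1 only the items at even positions are compared, so a value such as "v1" is never matched by accident.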

ziplistLen

Returns the number of nodes in the compressed list.

unsigned int ziplistLen(unsigned char *zl) {
    unsigned int len = 0;
    if (intrev16ifbe(ZIPLIST_LENGTH(zl)) < UINT16_MAX) { // Below UINT16_MAX the stored count is exact
        len = intrev16ifbe(ZIPLIST_LENGTH(zl));
    } else { // Otherwise count by traversing the list
        unsigned char *p = zl+ZIPLIST_HEADER_SIZE;
        size_t zlbytes = intrev32ifbe(ZIPLIST_BYTES(zl));
        while (*p != ZIP_END) {
            p += zipRawEntryLengthSafe(zl, zlbytes, p);
            len++;
        }

        // If len is less than UINT16_MAX, update zllen of compressed list
        if (len < UINT16_MAX) ZIPLIST_LENGTH(zl) = intrev16ifbe(len);
    }
    return len;
}

ziplistBlobLen

Returns the total number of bytes in the compressed list.

size_t ziplistBlobLen(unsigned char *zl) {
    return intrev32ifbe(ZIPLIST_BYTES(zl));
}

ziplistRepr

Prints a human-readable representation of the compressed list to standard output.

void ziplistRepr(unsigned char *zl) {
    unsigned char *p;
    int index = 0;
    zlentry entry;
    size_t zlbytes = ziplistBlobLen(zl);

    printf(
        "{total bytes %u} "
        "{num entries %u}\n"
        "{tail offset %u}\n",
        intrev32ifbe(ZIPLIST_BYTES(zl)),
        intrev16ifbe(ZIPLIST_LENGTH(zl)),
        intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl)));
    p = ZIPLIST_ENTRY_HEAD(zl);
    while(*p != ZIP_END) {
        assert(zipEntrySafe(zl, zlbytes, p, &entry, 1));
        printf(
            "{\n"
                "\taddr 0x%08lx,\n"
                "\tindex %2d,\n"
                "\toffset %5lu,\n"
                "\thdr+entry len: %5u,\n"
                "\thdr len%2u,\n"
                "\tprevrawlen: %5u,\n"
                "\tprevrawlensize: %2u,\n"
                "\tpayload %5u\n",
            (long unsigned)p,
            index,
            (unsigned long) (p-zl),
            entry.headersize+entry.len,
            entry.headersize,
            entry.prevrawlen,
            entry.prevrawlensize,
            entry.len);
        printf("\tbytes: ");
        for (unsigned int i = 0; i < entry.headersize+entry.len; i++) {
            printf("%02x|",p[i]);
        }
        printf("\n");
        p += entry.headersize;
        if (ZIP_IS_STR(entry.encoding)) {
            printf("\t[str]");
            if (entry.len > 40) {
                if (fwrite(p,40,1,stdout) == 0) perror("fwrite");
                printf("...");
            } else {
                if (entry.len &&
                    fwrite(p,entry.len,1,stdout) == 0) perror("fwrite");
            }
        } else {
            printf("\t[int]%lld", (long long) zipLoadInteger(p,entry.encoding));
        }
        printf("\n}\n");
        p += entry.len;
        index++;
    }
    printf("{end}\n\n");
}

ziplistValidateIntegrity

Verifies the integrity of the data structure. The header is always checked; when deep is non-zero, every node is checked as well.

int ziplistValidateIntegrity(unsigned char *zl, size_t size, int deep,
    ziplistValidateEntryCB entry_cb, void *cb_userdata) {
    // The buffer must be at least HEADER+END bytes long
    if (size < ZIPLIST_HEADER_SIZE + ZIPLIST_END_SIZE) return 0;

    // The size encoded in the header must match the allocated size
    size_t bytes = intrev32ifbe(ZIPLIST_BYTES(zl));
    if (bytes != size) return 0;

    // The last byte must be the terminator
    if (zl[size - ZIPLIST_END_SIZE] != ZIP_END) return 0;

    // The tail offset must not point outside the allocated space
    if (intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl)) > size - ZIPLIST_END_SIZE) return 0;

    // Stop here unless deep validation was requested
    if (!deep) return 1;

    unsigned int count = 0;
    unsigned char *p = ZIPLIST_ENTRY_HEAD(zl); // Positioning head node
    unsigned char *prev = NULL;
    size_t prev_raw_size = 0;
    while(*p != ZIP_END) { // Until the end marker
        struct zlentry e;
        // Decode the node and make sure it lies entirely inside the allocation
        if (!zipEntrySafe(zl, size, p, &e, 1)) return 0;

        // The prevlen stored in the node must match the real size of the previous node
        if (e.prevrawlen != prev_raw_size) return 0;

        // Optional validation by the caller's callback
        if (entry_cb && !entry_cb(p, cb_userdata)) return 0;

        // Advance to the next node
        prev_raw_size = e.headersize + e.len;
        prev = p;
        p += e.headersize + e.len;
        count++;
    }

    // Make sure that zltail points to the start of the last node
    if (prev != ZIPLIST_ENTRY_TAIL(zl)) return 0;

    // Check the node count stored in the header
    unsigned int header_count = intrev16ifbe(ZIPLIST_LENGTH(zl));
    if (header_count != UINT16_MAX && count != header_count) return 0;

    return 1;
}
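The core of the deep check is the prevlen chain: every node must report the real size of its predecessor, and the tail offset must land on the last node. Here is a sketch of that idea on a toy layout (`[prevlen:1][len:1][payload]`, no header, terminated by 0xFF). The layout and all names are hypothetical simplifications, not the real encoding.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_END 0xFF

const unsigned char GOOD_LIST[] = {0, 1, 'a', 3, 2, 'b', 'c', 4, 1, 'd', TOY_END};
/* Same list, but the second node claims a previous length of 9 instead of 3. */
const unsigned char BAD_LIST[]  = {0, 1, 'a', 9, 2, 'b', 'c', 4, 1, 'd', TOY_END};

/* Returns 1 if the toy list is consistent, 0 otherwise. */
int toy_validate(const unsigned char *zl, size_t size, size_t tail) {
    if (size < 1 || zl[size - 1] != TOY_END) return 0; /* terminator present */
    if (tail > size - 1) return 0;                     /* tail inside the buffer */

    size_t pos = 0, prev_pos = size - 1, prev_raw = 0;
    while (zl[pos] != TOY_END) {
        if (pos + 2 > size - 1) return 0;              /* node header must fit */
        size_t raw = 2 + (size_t)zl[pos + 1];
        if (pos + raw > size - 1) return 0;            /* payload must fit */
        if ((size_t)zl[pos] != prev_raw) return 0;     /* prevlen must match */
        prev_raw = raw;
        prev_pos = pos;
        pos += raw;
    }
    return prev_pos == tail;                           /* tail points at the last node */
}
```

In this toy version the tail of an empty list is expected to point at the end marker, which mirrors what ziplistNew writes into zltail.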

ziplistRandomPair

Returns a random key-value pair, stored in the key and val parameters (val may be NULL if the value is not needed).

// total_count is the number of key-value pairs, that is, half the number of nodes in the compressed list, calculated in advance by the caller
void ziplistRandomPair(unsigned char *zl, unsigned long total_count, ziplistEntry *key, ziplistEntry *val) {
    int ret;
    unsigned char *p;

    assert(total_count); // Guard against a division by zero on a corrupted compressed list

    // Generate an even index, because the list stores key-value pairs and the keys sit at even positions
    int r = (rand() % total_count) * 2;
    p = ziplistIndex(zl, r);
    ret = ziplistGet(p, &key->sval, &key->slen, &key->lval);
    assert(ret != 0);

    if (!val) return;
    p = ziplistNext(zl, p);
    ret = ziplistGet(p, &val->sval, &val->slen, &val->lval);
    assert(ret != 0);
}

ziplistRandomPairs

Returns count random key-value pairs, stored in the keys and vals parameters. The same pair may be returned more than once.

void ziplistRandomPairs(unsigned char *zl, unsigned int count, ziplistEntry *keys, ziplistEntry *vals) {
    unsigned char *p, *key, *value;
    unsigned int klen = 0, vlen = 0;
    long long klval = 0, vlval = 0;

    // index must be the first field, because uintCompare reads it through a pointer to the struct
    typedef struct {
        unsigned int index;
        unsigned int order;
    } rand_pick;
    rand_pick *picks = zmalloc(sizeof(rand_pick)*count);
    unsigned int total_size = ziplistLen(zl)/2;
    assert(total_size); // The list must not be empty

    // Create a random index pool (some may be repeated)
    for (unsigned int i = 0; i < count; i++) {
        picks[i].index = (rand() % total_size) * 2; // Generate even index
        picks[i].order = i; // Maintain the order in which they are selected
    }
    qsort(picks, count, sizeof(rand_pick), uintCompare); // Sort the picks by index with qsort

    // Walk the list once and store each picked pair in the output slot given by its original pick order
    unsigned int zipindex = 0, pickindex = 0;
    p = ziplistIndex(zl, 0); // Head node; unlike ZIPLIST_ENTRY_HEAD, ziplistIndex returns NULL for an empty list and validates the node
    while (ziplistGet(p, &key, &klen, &klval) && pickindex < count) {
        p = ziplistNext(zl, p); // Move to the value node
        assert(ziplistGet(p, &value, &vlen, &vlval)); // Get the value
        while (pickindex < count && zipindex == picks[pickindex].index) {
            int storeorder = picks[pickindex].order;
            ziplistSaveValue(key, klen, klval, &keys[storeorder]); // Store the key
            if (vals)
                ziplistSaveValue(value, vlen, vlval, &vals[storeorder]); // Store the value
            pickindex++;
        }
        zipindex += 2;
        p = ziplistNext(zl, p); // Next node
    }

    zfree(picks);
}
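The trick here is to sort the random indexes so that the list is walked only once, while the order field remembers where each result belongs in the output. A compact sketch on a plain int array follows; it is not the Redis code, single values replace the key-value pairs, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    unsigned int index; /* position to read from */
    unsigned int order; /* output slot to write to */
} pick;

static int pick_compare(const void *a, const void *b) {
    unsigned int x = ((const pick *)a)->index;
    unsigned int y = ((const pick *)b)->index;
    return (x > y) - (x < y);
}

/* Copies data[idx[i]] into out[i] for every i, walking data only once. */
void gather_one_pass(const int *data, size_t n, const unsigned int *idx,
                     size_t count, int *out) {
    pick *picks = malloc(sizeof(pick) * count);
    if (picks == NULL) return;
    for (size_t i = 0; i < count; i++) {
        picks[i].index = idx[i];
        picks[i].order = (unsigned int)i; /* remember the request order */
    }
    qsort(picks, count, sizeof(pick), pick_compare);

    size_t next = 0;
    for (size_t pos = 0; pos < n && next < count; pos++) {
        while (next < count && picks[next].index == pos) { /* duplicates allowed */
            out[picks[next].order] = data[pos];
            next++;
        }
    }
    free(picks);
}

/* Demo: request positions 4, 0, 4, 2 and return the i-th result. */
int gather_demo(size_t i) {
    const int data[] = {10, 11, 12, 13, 14};
    const unsigned int idx[] = {4, 0, 4, 2};
    int out[4] = {0};
    gather_one_pass(data, 5, idx, 4, out);
    return out[i];
}
```

Without the sort, each pick would need its own walk from the head, because a compressed list has no random access.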

ziplistRandomPairsUnique

Returns up to count random key-value pairs without repetition (the unique version). The return value is the number of pairs actually picked.

unsigned int ziplistRandomPairsUnique(unsigned char *zl, unsigned int count, ziplistEntry *keys, ziplistEntry *vals) {
    unsigned char *p, *key;
    unsigned int klen = 0;
    long long klval = 0;
    unsigned int total_size = ziplistLen(zl)/2;
    unsigned int index = 0;
    if (count > total_size) count = total_size;

    /* To only iterate once, every time we try to pick a member, the probability
     * we pick it is the quotient of the count left we want to pick and the
     * count still we haven't visited in the dict, this way, we could make every
     * member be equally picked.*/
    p = ziplistIndex(zl, 0);
    unsigned int picked = 0, remaining = count;
    // Single pass: each pair is picked with a probability equal to the number of pairs still wanted divided by the number of pairs not yet visited
    while (picked < count && p) {
        double randomDouble = ((double)rand()) / RAND_MAX;
        // Remaining count / count not yet accessed
        double threshold = ((double)remaining) / (total_size - index);
        if (randomDouble <= threshold) {
            assert(ziplistGet(p, &key, &klen, &klval));
            ziplistSaveValue(key, klen, klval, &keys[picked]);
            p = ziplistNext(zl, p);
            assert(p);
            if (vals) {
                assert(ziplistGet(p, &key, &klen, &klval));
                ziplistSaveValue(key, klen, klval, &vals[picked]);
            }
            remaining--;
            picked++;
        } else {
            p = ziplistNext(zl, p);
            assert(p);
        }
        p = ziplistNext(zl, p);
        index++;
    }
    return picked;
}
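The single-pass selection described in the comment can be tried out without a compressed list. The sketch below picks distinct positions from 0..total-1 with the same remaining/unvisited probability; `sample_unique` and the self-check helper `sample_is_valid` are hypothetical names, and the checker is limited to at most 64 positions.

```c
#include <assert.h>
#include <stdlib.h>

/* Picks count distinct positions out of 0..total-1 in one forward pass.
 * Position i is taken with probability remaining / (total - i).
 * Writes the positions to out in ascending order and returns how many. */
unsigned int sample_unique(unsigned int total, unsigned int count,
                           unsigned int *out) {
    if (count > total) count = total;
    unsigned int picked = 0, remaining = count;
    for (unsigned int i = 0; i < total && picked < count; i++) {
        double r = (double)rand() / RAND_MAX;
        double threshold = (double)remaining / (total - i);
        if (r <= threshold) {
            out[picked++] = i;
            remaining--;
        }
    }
    return picked;
}

/* Self-check: 1 if a run yields exactly min(count, total) strictly
 * increasing positions below total. */
int sample_is_valid(unsigned int total, unsigned int count) {
    unsigned int out[64];
    if (total > 64) return 0;
    unsigned int n = sample_unique(total, count, out);
    if (n != (count < total ? count : total)) return 0;
    for (unsigned int i = 0; i < n; i++) {
        if (out[i] >= total) return 0;
        if (i > 0 && out[i] <= out[i - 1]) return 0;
    }
    return 1;
}
```

The loop always delivers the full count: once the number of positions left equals the number still wanted, the threshold becomes 1 and every remaining position is taken.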

Summary of this chapter

After reading the source, the nature of the compressed list is clear: it is a sequential data structure made of a series of specially encoded, contiguous memory blocks. It is compact and saves memory, and each block has its own meaning. Some of the operations above are still not entirely clear to me; I will come back to them later.

Topics: Redis list
