Reference for this blog:

Redis Deep Adventures: Core Principles and Applied Practices

Redis internal data structure (4) – Ziplist

Redis’ ZipList ZipList

In my last blog post, I gave you a brief introduction to the secrets of SDS in Redis, and explained that Redis is so fast, and there is a very important, but often overlooked point, that is Redis carefully designed data structure. In this blog post, we will continue the topic and introduce you to another underlying data structure of Redis: Ziplist.

In Redis, there are five basic data types. In addition to String, there are list, Hash, zset, and set. List, Hash, and zset all use Ziplist either indirectly or directly, so it’s important to understand Ziplist.

What does ziplist mean

When I first started looking at Ziplist, I always felt that the word ZIP was very familiar, as if it was often seen in the daily use of computers, so I baidu the next:Oh oh, no wonder so familiar, the original is the meaning of “compression”, that ziplist can be translated into “compression list”.

Why ziplist

There are two reasons:

  • An ordinary bidirectional linked list will have two Pointers. In the case of storing very small data, the size of the actual data we store may not be as large as the memory occupied by the pointer. Isn’t it a bit of a loss for the gain? And Redis is based on memory, and is resident memory, memory is precious, so Redis developers must do all they can to optimize the memory footprint, so Ziplist appeared.
  • Linked list in memory, is generally discontinuous, traversal is relatively slow, and Ziplist can be a good solution to this problem.

Check out the existence of Ziplist

Zadd programmings 1.0 go 2.0 Python 3.0 JavaCopy the code

Create a zset with three elements and look at the data structure it takes:

debug object  programmings
"Value at:0x7f404ac30c60 refcount:1 encoding:ziplist serializedlength:36 lru:2689815 lru_seconds_idle:9"
Copy the code
HSET website google "www.g.cn
Copy the code

Create a hash with a single element, and look at the data structure it takes:

debug object website
"Value at:0x7f404ac30ac0 refcount:1 encoding:ziplist serializedlength:30 lru:2690274 lru_seconds_idle:14"
Copy the code

You can clearly see that both Zset and Hash use ziplist data structures.

Zsets and hashes no longer use Ziplist data structures when certain conditions are met:

debug object website
"Value at:0x7f404ac30ac0 refcount:1 encoding:hashtable serializedlength:180 lru:2690810 lru_seconds_idle:2"
Copy the code

As you can see, the underlying data structure of the hash becomes a HashTable.

Szet will not do the experiment, interested partners can experiment under their own.

What that transformation condition is, I’ll leave that to later.

For those of you curious, try to see what the underlying data structure of the list is, and it’s not Ziplist:

LPUSH languages python debug object languages "Value at:0x7f404c4763d0 refcount:1 encoding:quicklist serializedlength:21 Lru :2691722 LRU_SECONds_IDLE :22 QL_nodes :1 ql_AVG_node: 1.00QL_ziplist_max :-2 ql_compressed:0 ql_uncompressed_size:19"Copy the code

As you can see, the underlying data structure used by list is QuickList, not Ziplist.

In the lower version of Redis, the list uses ziplist+linkedList as the underlying data structure. In the higher version of Redis, QuickList replaces Ziplist +linkedList, and QuickList uses Ziplist. So you can say that list indirectly uses ziplist data structures. What this Quicklist is is not the content of this blog.

To explore the ziplist

Ziplist source code: Ziplist source code

Ziplist source code notes to write very clear, if English is better, you can directly look at the above notes, if your English is not too good, or there is no certain spirit of study, or look at my blog.

Ziplist layout

<zlbytes> <zltail> <zllen> <entry> <entry> ... <entry> <zlend>
Copy the code

Here is the Ziplist layout explained in the comments. Let’s look at what these fields are:

  • Zlbytes: a 32bit unsigned integer, indicating the total number of bytes (including four bytes) occupied by ziplist.
  • Zltail: a 32-bit unsigned integer that records the offset of the last entry, facilitating quick locating to the last entry.
  • Zllen: a 16-bit unsigned integer that records the number of entries.
  • Entry: A number of stored elements, which can be byte arrays or integers.
  • Zlend: The last byte of ziplist, which is a closing flag bit with a fixed value of 255.

Redis implements access to ziplist fields through the following macro definitions:

// Char *zl refers to the ziplist header // refers to the zlBytes field #define ZIPLIST_BYTES(zl) (*(uint32_t*)(zl))) // refers to the zltail field (zl+4) #define *(uint32_t*)((uint32_t) +sizeof(uint32_t))))) (*(uint16_t*)((zl)+sizeof(uint32_t)*2))) Zl + intrev32IFbe (ZIPLIST_TAIL_OFFSET(zl))) 255 (0xFF) #define ZIPLIST_ENTRY_END(zl) ((zl)+intrev32ifbe(ZIPLIST_BYTES(zl))-1)Copy the code

The structure of the entry

From the Ziplist layout, we can clearly know that our data is saved in ziplist each entry, let’s look at the composition of the entry.

<prevlen> <encoding> <entry-data>
Copy the code

Let’s see what these three fields are:

  • Prevlen: the length of the previous element in bytes to quickly find the initial address of the previous element. If the initial address of the current element is x, (x-prevlen) is the initial address of the previous element.
  • Encoding: Specifies the encoding of the current element. This field is too complex, so we will save it for later.
  • Entry-data: indicates the data actually stored.
prevlen

The prevlen field is variable length:

  • Prevlen is 1 byte if the length of the previous element is less than 254 bytes;
  • Prevlen is represented as 5 bytes if the previous element is 254 bytes or more long. In this case, the first byte of prevlen is a fixed 254 (0xFE) (as an indication of this), followed by four bytes representing the length of the previous element.
encoding

Encoding encoding Encoding Encoding Encoding Encoding Encoding Encoding Encoding Encoding Encoding Encoding Encoding Encoding Encoding Encoding Encoding Encoding Encoding Encoding Encoding Encoding Encoding Encoding Encoding If you really don’t understand, skip this paragraph.

Redis uses encoding to determine the type of data to be stored in the encoding field in order to save space.

  1. 00xxxxxxA short string of up to 63 bits, followed by six bits to store the number of bits in the string;
  2. 01xxxxxx xxxxxxxxA medium-length string, followed by 14 bits to indicate the length of the string;
  3. 10000000 aaaaaaaa bbbbbbbb cccccccc ddddddddA very large string that requires an additional 4 bytes to represent the length. The first byte prefix is10, the remaining 6 bits are not used and are uniformly set to zero;
  4. 11000000Said int16;
  5. 11010000Int32;
  6. 11100000Said int64;
  7. 11110000Said int24;
  8. 11111110Said int8;
  9. 11111111Represents the end of ziplist, which is the zlend value 0xFF;
  10. 1111xxxxRepresents a minimal integer. The value of XXXX can only be (0001 ~ 1101), that is1 ~ 13.

If it’s case 10, then the composition of the entry changes:

<prevlen> <encoding> 
Copy the code

Because the data is already stored in the Encoding field.

Redis determines whether the stored data is a string (byte array) or an integer based on the first two digits of the Encoding field. If it is a string, Redis also determines the length of the string based on the first two digits of the Encoding field. If it is shaping, the length should be determined by the following bits.

The structure of the entry

We have said so much about Entry, but what we are about to say may surprise you. We can see the structure of entry in the source code, and there is a very important comment:

/* We use this function to receive information about a ziplist entry.
 * Note that this is not how the data is actually encoded, is just what we
 * get filled by a function in order to operate more easily. */
typedef struct zlentry {
    unsigned int prevrawlensize; /* Bytes used to encode the previous entry len*/
    unsigned int prevrawlen;     /* Previous entry len. */
    unsigned int lensize;        /* Bytes used to encode this entry type/len.
                                    For example strings have a 1, 2 or 5 bytes
                                    header. Integers always use a single byte.*/
    unsigned int len;            /* Bytes used to represent the actual entry.
                                    For strings this is just the string length
                                    while for integers it is 1, 2, 3, 4, 8 or
                                    0 (for 4 bit immediate) depending on the
                                    number range. */
    unsigned int headersize;     /* prevrawlensize + lensize. */
    unsigned char encoding;      /* Set to ZIP_STR_* or ZIP_INT_* depending on
                                    the entry encoding. However for 4 bits
                                    immediate integers this can assume a range
                                    of values and must be range-checked. */
    unsigned char *p;            /* Pointer to the very start of the entry, that
                                    is, this points to prev-entry-len field. */
} zlentry;
Copy the code

Focus on the notes above. In one sentence: This structure is defined but not used because entry would take up too much memory if it were.

Ziplist storage form

Instead of encapsulating a structure to hold Ziplist, as SDS did in the previous blog post, Redis defines a series of macros to manipulate the data, meaning ziplist is a bunch of bytes of data, The ziplist layout mentioned above and the layout of entries in ziplist are just abstract concepts.

Why can’t it always be Ziplist

In the earlier part of the article, we did experiments to prove that the underlying storage structure of Zset and Hash is no longer Ziplist after certain conditions are met. Since Ziplist is so awesome, Redis developers have spent so much energy on the design of Ziplist. Why can’t the underlying storage structure of zset and Hash always be ziplist? Because Ziplist is compact storage, there is no redundant space, which means that when you insert new elements, you need to expand the memory, which can be divided into two cases:

  • Allocate new memory, copy the original data to the new memory;
  • Expand existing memory.

So Ziplist shouldn’t store large strings or too many elements.

Ziplist storage bounds

What conditions can be met so that the underlying storage structure of zset and Hash is no longer ziplist? In the configuration file, you can set:

Hash-max-ziplist-entries 512 # hash elements exceeding 512 must be stored in a standard structure for hash-max-ziplist-value 64 # hash elements exceeding 64 key/value lengths Zset-max-ziplist-entries must be stored in a standard structure if the number of 128 # zset elements exceeds 128. Zset-max-ziplist-value 64 # zset must be stored in a standard structure if the number of 128 # zset elements exceeds 64 It has to be stored in a standard structureCopy the code

For this configuration, I am just a porter, and did not go to the experiment, after all, no one will modify this bar, interested partners can test.

Too many Ziplist elements, what to do

In the introduction of ziplist layout, said that Ziplist uses two bits to record the number of elements in Ziplist, if the number of elements is too many, two bits is not enough to do? In this case, finding the number of Ziplist elements is a traversal.

See, Redis is not so simple as imagined, there are still a lot of things to study, and it is very complicated. If we don’t learn, we may feel that we have completely mastered Redis, but once we start to learn, we find that we have only scratched the surface before. It proves that the more you know, the more you don’t know.

That’s the end of this blog.