This article is the fifth in the Redis Internal Data Structure Series. In this article, we introduce a Redis internal data structure, quickList. The list data type that Redis exposes, and the internal data structure that its underlying implementation relies on is QuickList.

We will also cover two Redis configurations (in the ADVANCED CONFIG section of redis.conf) in our discussion:

list-max-ziplist-size -2
list-compress-depth 0Copy the code

We will explain the implications of these two configurations in more detail in the discussion.

Note: The QuickList implementation discussed in this article is based on the 3.2 branch of the Redis source code.

Quicklist overview

Redis exposes the upper-layer List data type, which is often used as a queue. For example, it supports the following operations:

  • lpush: Inserts data to the left (head of the list).
  • rpop: Deletes data on the right (at the end of the list).
  • rpush: Inserts data to the right (at the end of the list).
  • lpop: Deletes data on the left side (head of the list).

These operations are O(1) time.

Of course, lists also support access anywhere in the middle, such as Lindex and Linsert, but both require traversal of the list, so time complexity is high.

In summary, list has the following characteristics: it is an ordered list, which makes it easy to append and delete data at both ends of the table, and it has O(N) time complexity for accessing middle places. Isn’t that what a bidirectional linked list is all about?

The internal implementation of list, QuickList, is just a two-way list. In the header comment of the quicklist.c file, it describes quickList like this:

A doubly linked list of ziplists

It is indeed a bidirectional list, and a Ziplist bidirectional list.

What does that mean?

As we know, a bidirectional linked list is composed of multiple nodes. Each node of the QuickList is a Ziplist. Ziplist was introduced in the last article.

Ziplist itself is an ordered list, and a memory – tight list (items are contiguous in memory). For example, a quickList of three nodes, if the ziplist of each node contains four more items, then the list appears to contain a total of 12 items.

Why is quickList structured this way? To sum up, it’s probably another compromise between space and time:

  • Bidirectional linked lists are easy to push and pop on both sides of the table, but they are expensive in memory. First, it holds two Pointers on each node in addition to data; Secondly, each node of bidirectional linked list is a separate memory block, the address is not continuous, more nodes will easily generate memory fragmentation.
  • Ziplist is efficient because it is a single block of contiguous memory. However, it is not conducive to modification operations, as each data change causes an in-memory realLOC. Especially when the Ziplist length is very long, one realloc may result in a large number of data copies, further reducing performance.

Quicklist, then, combines the best of bidirectional lists and Ziplist.

However, this raises a new question: How long is a Ziplist appropriate for a QuickList node? For example, the same 12 data items can be stored in a Quicklist containing three nodes and a Ziplist of each node containing four data items, or a Quicklist containing six nodes and a Ziplist of each node containing two data items.

Again, it’s a balancing act. Let’s just analyze it from the perspective of storage efficiency:

  • The shorter the Ziplist on each QuickList node, the more memory fragmentation. A large number of memory fragments may generate many small fragments that cannot be used, reducing storage efficiency. At the extreme, the Ziplist on each QuickList node contains only one item, which is reduced to a normal two-way list.
  • The longer the Ziplist on each QuickList node, the more difficult it is to allocate large contiguous memory space for the Ziplist. It’s possible to have small chunks of free space in memory (they add up to a lot) but not enough free space to allocate to a Ziplist. This also reduces storage efficiency. At the extreme end of this scenario, the entire QuickList has only one node, and all the data items are allocated within the ziplist of that single node. This is actually a degenerate ziplist.

As you can see, the ziplist on a QuickList node is kept to a reasonable length. How long is that reasonable? This may depend on the application scenario. In fact, Redis provides a configuration parameter, list-max-ziplist-size, so that users can adjust it according to their own situation.

list-max-ziplist-size -2Copy the code

Let’s explain what this parameter means in detail. It can be positive or negative.

A positive value limits the ziplist length on each QuickList node according to the number of data items. For example, when this parameter is set to 5, the ziplist of each QuickList node contains a maximum of five items.

When negative, the ziplist length on each QuickList node is limited by the number of bytes consumed. In this case, it can only take five values from -1 to -5. Each value has the following meanings:

  • -5: The ziplist size of each QuickList node cannot exceed 64 Kb. (note: 1kb => 1024 bytes)
  • -4: The ziplist size of each QuickList node cannot exceed 32 Kb.
  • -3: The ziplist size of each QuickList node cannot exceed 16 Kb.
  • -2: The ziplist size of each QuickList node cannot exceed 8 Kb. (-2 is the default value given by Redis)
  • -1: The ziplist size of each QuickList node cannot exceed 4 Kb.

Additionally, lists are designed to be able to store long lists of data. Writing a Simple Twitter Clone with PHP and Redis, for example, uses list to store Timeline data similar to Twitter.

When the list is long, the data that is most easily accessed is likely to be at both ends, with the data in the middle being accessed infrequently (and with low performance). If this is the case, the list also provides an option to compress the data nodes in the middle, further saving memory. The Redis configuration parameter, list-compressed-depth, is used to do this.

list-compress-depth 0Copy the code

This parameter represents the number of uncompressed nodes at both ends of a QuickList. Note: The number of nodes here refers to the number of nodes in quickList, not the number of items in Ziplist. In fact, a Ziplist on a QuickList node, if compressed, is compressed as a whole.

The list-compressed-depth parameter has the following meanings:

  • 0: this is a special value, indicating that none is compressed. This is the default value for Redis.
  • 1: indicates that one node at both ends of the QuickList is not compressed, and the middle node is compressed.
  • 2: indicates that two nodes at both ends of the QuickList are not compressed, and the middle node is compressed.
  • 3: indicates that three nodes at both ends of the QuickList are not compressed, and the middle node is compressed.
  • And so on…

Since 0 is a special value, it is easy to see that the head and tail of the QuickList are always uncompressed for quick access at both ends of the table.

Redis uses LZF, a lossless compression algorithm, to compress quickList internal nodes.

Data structure definition for QuickList

Quicklist-related data structure definitions can be found in quicklist.h:

typedef struct quicklistNode {
    struct quicklistNode *prev;
    struct quicklistNode *next;
    unsigned char *zl;
    unsigned int sz;             /* ziplist size in bytes */
    unsigned int count : 16;     /* count of items in ziplist */
    unsigned int encoding : 2;   /* RAW==1 or LZF==2 */
    unsigned int container : 2;  /* NONE==1 or ZIPLIST==2 */
    unsigned int recompress : 1; /* was this node previous compressed? * /
    unsigned int attempted_compress : 1; /* node can't compress; too small */
    unsigned int extra : 10; /* more bits to steal for future usage */
} quicklistNode;

typedef struct quicklistLZF {
    unsigned int sz; /* LZF size in bytes*/
    char compressed[];
} quicklistLZF;

typedef struct quicklist {
    quicklistNode *head;
    quicklistNode *tail;
    unsigned long count;        /* total count of all entries in all ziplists */
    unsigned int len;           /* number of quicklistNodes */
    int fill : 16;              /* fill factor for individual nodes */
    unsigned int compress : 16; /* depth of end nodes not to compress; 0=off */
} quicklist;Copy the code

The quicklistNode structure represents a quickList node, where the fields have the following meanings:

  • Prev: pointer to the previous node in the list.
  • Next: a pointer to the next node in the list.
  • Zl: data pointer. If the current node’s data is not compressed, it points to a Ziplist structure; Otherwise, it points to a quicklistLZF structure.
  • Sz: represents the total size of the Ziplist that zL points to (includingzlbytes.zltail.zllen.zlendAnd individual data items). Note that if ziplist is compressed, the sZ value is still the ziplist size before compression.
  • Count: indicates the number of data items contained in ziplist. This field is only 16 bits long. We’ll figure out later if that’s enough.
  • Encoding: Indicates whether ziplist is compressed (and which compression algorithm is used). Currently, there are only two values: 2 means compressed (using LZF compression) and 1 means not compressed.
  • Container: Indicates a reserved field. It is designed to indicate whether a QuickList node is used to store data directly, ziplist, or some other structure (used as a data container, hence container). However, in the current implementation, this value is a fixed value of 2, indicating the use of ziplist as a data container.
  • Recompress: When you want to extract the compressed data using a command like lindex, set recompress=1 and wait for the opportunity to compress the data.
  • Attempted_compress: This value is only useful to Redis automated test programs. We don’t have to worry about that.
  • Extra: other extended fields. It is not currently used in the Redis implementation.

The quicklistLZF structure represents a ziplist that has been compressed. Among them:

  • Sz: indicates the ziplist size after compression.
  • Compressed: Is a flexible array member, store ziplist byte array compressed.

The data structure that really represents QuickList is the quickList struct of the same name:

  • Head: pointer to the head node (the first node on the left).
  • Tail: Pointer to the last node (the first node on the right).
  • Count: total number of all Ziplist data items.
  • Len: Indicates the number of QuickList nodes.
  • Fill: 16bit, ziplist size set, storelist-max-ziplist-sizeParameter value.
  • Compress: 16 bits. This parameter is used to compress a nodelist-compress-depthParameter value.

Above is an example of a QuickList structure diagram. The ziplist size and node compression depth configurations corresponding to the examples in the figure are as follows:

list-max-ziplist-size 3
list-compress-depth 2Copy the code

A few things to note in this example are:

  • There are two orange nodes at each end, which are not compressed. Their data pointer zL points to the real Ziplist. The other nodes in the middle are compressed, and their data pointer ZL points to a ziplist structure compressed, which is a Quicklist ZF structure.
  • There are 2 items of data in the Ziplist on the left head node, 1 item of data in the ziplist on the right tail node, and 3 items of data in the Ziplist on other nodes in the middle (including the inside of the compressed node). This means that it has been executed multiple times on both sides of the tablepushandpopA state after an operation.

Now let’s roughly calculate whether the 16 bits in the Count field in the quicklistNode structure are sufficient.

We already know that the ziplist size is limited by the list-max-ziplist-size parameter. There are two cases in terms of positive and negative values:

  • When this parameter is positive, it exactly represents the maximum number of items contained in the ziplist that zl points to in a quicklistNode structure.list-max-ziplist-sizeThe parameter is stored by the QuickList structure’s Fill field, which is 16 bits, so the value it can express can be represented in 16 bits.
  • When this parameter is negative, the maximum ziplist length that can be represented is 64 Kb. Each data item in Ziplist requires at least two bytes to represent: one byteprevrawlen, 1 bytedata(lenFields anddata2. To become one; As shown in theIn the previous). Therefore, the number of items in ziplist should not exceed 32 K, and 16 bits is sufficient.

In fact, in the current QuickList implementation, the ziplist size is subject to additional limitations and will never reach the maximum analyzed here.

Let’s go to the code analysis phase.

The creation of quicklist

When we insert data into a non-existent list for the first time using lpush or rpush, Redis first calls the quicklistCreate interface to create an empty QuickList.

quicklist *quicklistCreate(void) {
    struct quicklist *quicklist;

    quicklist = zmalloc(sizeof(*quicklist));
    quicklist->head = quicklist->tail = NULL;
    quicklist->len = 0;
    quicklist->count = 0;
    quicklist->compress = 0;
    quicklist->fill = 2 -;
    return quicklist;
}Copy the code

In many data structure books, bidirectional lists are often implemented with a spare head node, mainly for the convenience of insertion and deletion operations. As you can see from the quicklistCreate code above, quickList is a bidirectional linked list with no free head nodes (head and tail are both initialized to NULL).

Quicklist push operation

Quicklist push is implemented by calling QuickList Push.

void quicklistPush(quicklist *quicklist, void *value, const size_t sz,
                   int where) {
    if (where == QUICKLIST_HEAD) {
        quicklistPushHead(quicklist, value, sz);
    } else if(where == QUICKLIST_TAIL) { quicklistPushTail(quicklist, value, sz); }}/* Add new entry to head node of quicklist. * * Returns 0 if used existing head. * Returns 1 if new head created. */
int quicklistPushHead(quicklist *quicklist, void *value, size_t sz) {
    quicklistNode *orig_head = quicklist->head;
    if (likely(
            _quicklistNodeAllowInsert(quicklist->head, quicklist->fill, sz))) {
        quicklist->head->zl =
            ziplistPush(quicklist->head->zl, value, sz, ZIPLIST_HEAD);
        quicklistNodeUpdateSz(quicklist->head);
    } else {
        quicklistNode *node = quicklistCreateNode();
        node->zl = ziplistPush(ziplistNew(), value, sz, ZIPLIST_HEAD);

        quicklistNodeUpdateSz(node);
        _quicklistInsertNodeBefore(quicklist, quicklist->head, node);
    }
    quicklist->count++;
    quicklist->head->count++;
    return(orig_head ! = quicklist->head); }/* Add new entry to tail node of quicklist. * * Returns 0 if used existing tail. * Returns 1 if new tail created. */
int quicklistPushTail(quicklist *quicklist, void *value, size_t sz) {
    quicklistNode *orig_tail = quicklist->tail;
    if (likely(
            _quicklistNodeAllowInsert(quicklist->tail, quicklist->fill, sz))) {
        quicklist->tail->zl =
            ziplistPush(quicklist->tail->zl, value, sz, ZIPLIST_TAIL);
        quicklistNodeUpdateSz(quicklist->tail);
    } else {
        quicklistNode *node = quicklistCreateNode();
        node->zl = ziplistPush(ziplistNew(), value, sz, ZIPLIST_TAIL);

        quicklistNodeUpdateSz(node);
        _quicklistInsertNodeAfter(quicklist, quicklist->tail, node);
    }
    quicklist->count++;
    quicklist->tail->count++;
    return(orig_tail ! = quicklist->tail); }Copy the code

There are two cases of inserting data in either the header or the tail:

  • If the ziplist size on the head node (or tail node) does not exceed the limit (i.e_quicklistNodeAllowInsertReturn 1), then the new data is inserted directly into the ziplist (callziplistPush).
  • If the ziplist on the head node (or tail node) is too large, create a new QuickList Node (and a new Ziplist is created accordingly) and insert the newly created node into the QuickList bidirectionally-linked list (call)_quicklistInsertNodeAfter).

In the _quicklistInsertNodeAfter implementation, the nodes are also compressed according to the list-compressed-depth configuration. The implementation of this is a bit cumbersome, so we won’t discuss it here.

Other operations for QuickList

Quicklist has a lot of operations, and the implementation details are complicated, so I don’t want to analyze the source code, we briefly introduce some of the more important operations.

Quicklist’s POP operations are implemented by calling QuickList PopCustom. QuicklistPopCustom is implemented basically the opposite of quicklistPush, by removing the corresponding data item from the ziplist of the header or tail node. If the ziplist is empty after the deletion, the header or tail node is removed as well. Deletion may also involve decompression of nodes.

Quicklist implements inserts not only from the head or tail, but also from any specified location. QuicklistInsertAfter and quicklistInsertBefore insert data items after and before the specified location, respectively. The operation of inserting data at any given location is complicated and has many logical branches.

  • When the size of the Ziplist in the insertion position does not exceed the limit, it is good to directly insert the Ziplist;
  • If the ziplist size of the adjacent QuickList node exceeds the limit, but the ziplist size of the adjacent QuickList node does not exceed the limit, the ziplist size of the adjacent QuickList node is inserted into the ziplist of the adjacent QuickList node.
  • If the ziplist size at both ends of the Ziplist exceeds the limit and the ziplist size of the adjacent QuickList nodes also exceeds the limit, you need to create a New QuickList node to insert.
  • For other cases where the size of the ziplist at the insertion location exceeds the limit (mainly for inserting data in the middle of the Ziplist), you need to split the current Ziplist into two nodes and then insert data on one node.

The quicklistSetOptions command is used to set the Ziplist size (list-max-ziplist-size) and node compression depth (list-compressedepth) parameters. The code is simple, setting the corresponding values to the Fill and COMPRESS fields of the QuickList structure.


Stay tuned for skiplist and the Redis data type it supports, sorted Set, in the next article.