Concern public number: IT elder brother, read a dry goods article every day, a year later, you will find a different self.

preface

In Redis, there is a data type that uses two data structures to store data separately. Why is Redis doing this? Does this cause the same data to take up twice as much space?

Collection objects of the five basic types

The collection object in Redis is an unordered collection of elements of type string. The elements in the collection are unique and not repeatable.

There are two underlying data structures for collection objects: IntSet and HashTable. The internal code is used to distinguish:

Intset coding

Intset (integer set) can hold integer values of type INT16_t, int32_t, int64_t, and ensure that there are no duplicate elements in the set.

The intSet data structure is defined as follows (source inset.h) :

typedef struct intset { uint32_t encoding; // Uint32_t length; Int8_t contents[]; } intSet;Copy the code

Here is a sketch of intSet object storage:

encoding

The encoding inside intSet records the data store type of the current set of integers. There are three main types:

  • INTSET_ENC_INT16

In this case, each element in contents[] is an integer value of type INT16_t, ranging from -32768 to 32767 (-2 ^ 15 ~ 2 ^ 15 -1).

  • INTSET_ENC_INT32

In this case, each element in contents[] is an integer value of type INT32_t, ranging from -2147483648 to 2147483647 (-2 ^ 31 ~ 2 ^ 31 -1).

  • INTSET_ENC_INT64

In this case, each element in contents[] is an integer value of type INT64_t, ranging from -9223372036854775808 to 9223372036854775807 (-2 ^ 63 ~ 2 ^ 63 -1).

contents[]

Contents [] Although the structure is defined as int8_t, the actual storage type is determined by encoding above.

Upgrade of integer collection

If all the elements in the set of integers are 16 bits, and the type is INT16_T, then we need to store a 32 bit integer, then we need to upgrade the original set of integers, before we can store the 32 bit integer in the set of integers. This involves the type upgrade of the integer set, which has four main steps:

  • The size of the underlying array space is expanded according to the type of the newly added element, and the new space is allocated according to the number of bits of the existing element after the upgrade.

  • Casts existing elements and puts them back into the array one by one.

  • Place the new element at the head or end of the array (because the condition that triggers the upgrade is that the integer type of the current array cannot store the new element, so the new element is either larger or smaller than the existing element).

  • Change the Encoding property to the latest encoding, and change the Length property synchronously.

PS: Like the encoding of string objects, the type of the collection of integers remains encoded and cannot be degraded once upgraded.

To upgrade the sample

1. Suppose we have a set where encoding is INT16_t and three elements are stored internally:

2. Insert the integer 50000, which is an integer of int32_T type. Therefore, apply for new space.

3. Now there are four elements in the new array. The old array is ranked third, so we need to move the upgraded 3 to 64-95 bits.

4. Continue to move the upgraded 2 to bits 32-63.

5. Continue to move 1 to bits 0-31 after the upgrade.

6. The 50000 will then be placed in the 96-127 digits.

7. The encoding and Length attributes are modified to complete the upgrade.

Hashtable coding

The Hashtable structure was examined in detail in the previous section on hash objects. For more information, click here.

Intset and Hashtable encoding conversion

Redis chooses to use intSet encoding when a collection meets the following two conditions:

  • The collection object holds all elements as integer values.

  • Collection objects can hold 512 or less elements (this threshold can be controlled by the set-max-intset-entries configuration file).

Once the elements in the collection do not meet the above two criteria, hashTable encoding is chosen.

Common commands for collection objects

  • Sadd key member1 member2: Adds one or more element members to the set key and returns the number of successful additions. If the element already exists, it is ignored.

  • Sismember key member: Checks whether the element member is in the set key.

  • Srem key member1 MEMBER2: Removes elements in the set key. Non-existing elements are ignored.

  • Smove source dest member: Moves the element member from the collection source to dest. If the member does not exist, nothing is done.

  • Smembers Key: Returns all elements in the collection key.

To flushes the Redis database, run the flushall command to prevent interference from other key values.

Run the following commands in sequence:

Sadd num 1 2 3 // Set a set of 3 integers, Sadd name 1 2 3 test // Set the set of 3 integers and 1 string, Will use hashtable encoding type name // view the type object encoding name // view the encodingCopy the code

The following results are obtained:

As you can see, the collection uses the intSet encoding when the set element contains only integers, and the HashTable encoding when the set element contains non-integers.

An ordered collection object of five basic types

The difference between an ordered set in Redis and a set is that each element in an ordered set is associated with a number of type double, and then sorted in order of the number from smallest to largest. In other words, the order of ordered sets is determined by fractions when we set the values ourselves.

There are two underlying data structures for ordered collection objects: Skiplist and Ziplist. The interior is also distinguished by code:

Skiplist coding

Skiplist is a skiplist, sometimes referred to simply as a skiplist. Skiplist-encoded ordered collection objects use the zset structure as the underlying implementation, which contains both a dictionary and a skiplist.

Jump table

Skip list is a kind of ordered data structure, its main characteristic is to maintain multiple Pointers to other nodes in each node, so as to achieve the purpose of fast access to nodes.

In most cases, the efficiency of skip lists can be equal to that of balanced trees, but the implementation of skip lists is far simpler than that of balanced trees, so Redis chooses to use skip lists to achieve ordered sets.

Below is a normal and orderly linked list, if we want to find this element, 35 only traverse from the beginning to the end (element in the list does not support random access, so can’t use binary search, and can pass in an array subscript random access, so the binary search is generally applicable to orderly array), the time complexity is O (n).

So if we can jump directly to the middle of the linked list, we can save a lot of resources. This is the principle of a hop table, as shown in the following figure is an example of a hop table data structure:

Level1, Level2, level3 = level1, level2, level3 = level1, level2, level3 = level1, level2, level3 = level1, level2 = level3

  • The first is to execute the pointer at Level1 and iterate 7 times (1->8->9->12->15->20->35) to find element 35.

  • The second is to execute the pointer at Level2 and only need to iterate 5 times (1->9->12->15->35) to find element 35.

  • The third way is to execute level3 elements, which only need to iterate 3 times (1->12->35) to find element 35, greatly improving efficiency.

Skiplist storage structure

Each node in the skiplist is a zskiplistNode node (in source server.h) :

typedef struct zskiplistNode { sds ele; // Element double score; Struct zskiplistNode *backward; Struct zskiplistNode *forward; struct zskiplistNode *forward; // Forward pointer unsigned long span; // Span from the current node to the next node (number of nodes crossed)} level[]; } zskiplistNode;Copy the code
  • Level (level)

Level refers to the layer in the skip list, which is an array, that is, the element of a node can have multiple layers, that is, multiple Pointers to other nodes. The program can use Pointers of different levels to select the fastest path to improve access speed.

Level is a number between 1 and 32 randomly generated according to the Power law every time a new node is created.

  • Forward (forward pointer)

Each layer has a pointer to the element at the end of the list, and the forward pointer is used to traverse the element.

  • Span = span

The span records the distance between two nodes. Note that the span is 0 if it points to NULL.

  • Backward (backward pointer)

Unlike the forward pointer, there is only one backward pointer, so you can only go back to the previous node at a time (the backward pointer is not drawn in the figure above).

  • Ele (element)

The element in the skip list is an SDS object (earlier versions used a redisObject), and the element must be unique and unrepeatable.

  • Score (score)

The point value of a node is a floating point number of type double. The nodes are arranged in the order of point value from smallest to largest in the skip table. The points of different nodes can be repeated.

This is just one node in a skiplist. Multiple zskiplistNode nodes form a zskiplist object:

typedef struct zskiplist { struct zskiplistNode *header, *tail; // The head and tail Pointers of the skip list are unsigned long length; // Skip table number of nodes int level; } zskiplist;Copy the code

At this point you might think that ordered collections are implemented using this Zskiplist, but Redis doesn’t use Zskiplist directly, but instead wraps it up again with a Zset object.

typedef struct zset { dict *dict; Zskiplist * ZSL; // skip list object} zset;Copy the code

So finally, an ordered set with Skiplist encoding would have the following data structure:

In the above part, the key in the dictionary corresponds to the element (member) in the ordered set, and the value corresponds to the score. The skip table integers 1,8,9 and 12 in the lower part of the figure also correspond to elements (member), and the double number in the last row is the score.

That is, both the dictionary and the skip list point to the element we store (both data structures end up pointing to the same address, so there is no redundant storage). Why does Redis do this?

Why use dictionaries and skip lists at the same time

Ordered set use jump table directly or use the dictionary alone can achieve alone, but we think about it, if used alone jumping table to do this, so although you can use a span pointer to traverse the elements to find the data we need, but its complexity to O (logN) still, and the complexity of the access to an element in a dictionary is O (1), However, if you use the dictionary alone, it will be quick to obtain elements, but the dictionary is unordered, so if you want to search the scope, you need to sort it, which is a time-consuming operation. Therefore, Redis integrates two data structures to maximize performance, which is also the exquisite design of Redis.

Ziplist coding

Compressed lists are used in both list objects and hash objects. For more information, click here.

Blog.csdn.net/zwx900102/a…

Ziplist and Skiplist coding conversion

The Ziplist encoding is used when ordered collection objects meet both of the following conditions:

  • The number of elements held in an ordered collection object is less than 128 (this can be changed by configuring zset-max-ziplist-entries).

  • The total length of all elements held in an ordered collection object is less than 64 bytes (this can be changed by configuring zset-max-ziplist-value).

Common commands for ordered collection objects

  • Zadd Key score1 MEMBER1 SCOre2 member2: Adds one or more elements (member) and their score to the ordered set key.

  • Zscore key member: Returns the score of member members in the ordered set key.

  • Zincrby key num member: Adds member of the ordered set key to num, which can be a negative number.

  • Zcount key min Max: returns the number of members in the ordered set key whose score is in the range of [min, Max].

  • Zrange key start stop: returns all members in the range [start,stop] in the ordered set key whose score is arranged from small to large.

  • Zrevrange key start stop: returns all members in the range [start,stop] of the ordered set key whose score is arranged from large to small.

  • Zrangebyscore key min Max: Returns all elements in the range [min, Max] in the ordered set sorted byscore from smallest to largest. Note that the default is the closed interval, but you can control the open and closed interval by prefixing the values of Max and min with (or [.

  • Zrevrangebyscore key Max min: Returns all elements in [min, Max] from the ordered set sorted byscore. Note that the default is the closed interval, but you can control the open and closed interval by prefixing the values of Max and min with (or [.

  • Zrank key member: Returns the rank (from smallest to largest) of the elements in the ordered set, counting from 0.

  • Zrevrank key member: Returns the rank (from largest to smallest) of the elements in the ordered set, counting from 0.

  • Zlexcount key min Max: Returns the number of members between min and Max in the ordered set. Note that min and Max in this command must be preceded by (or [to control the open and closed intervals. the special values – and + denote negative and positive infinity, respectively.

To flushes the Redis database, run the flushall command to prevent interference from other key values.

Before executing the command, we changed the zset-max-ziplist-entries parameter in the configuration file to 2 and restarted the Redis service.

After the restart, run the following commands:

Zadd name 1 zs 2 lisi // Set 2 elements using ziplist type name // View the type object encoding name // view the encoding zadd address 1 Beijing 2 Shanghai 3 Guangzhou 4 shenzhen // Skiplist encoding type address // View the type object Encoding address // view the encodingCopy the code

The following results are obtained:

conclusion

This paper mainly analyzes the implementation principle of intset and Skiplist, the underlying storage structures of set object and ordered set object, and focuses on how to sort ordered set and why two data structures (dictionary and hop table) are used to store data at the same time.

Concern public number: IT elder brother, read a dry goods article every day, a year later, you will find a different self.