preface

  • When we talked about the list structure, we mentioned a special type of structure called a ziplist, or compressed list. It is also one of the underlying structures used by the hash and list structures. It is a structure designed to save memory.

  • When a list structure contains a small number of items and the elements are small integers or short strings, the Redis base uses ziplist to store the data

structure

  • Redis’ ziplist is a contiguous block of memory. We can simply think of it as an array, but compared to an array, it has many more descriptions of the node association, such as the total length of the array, the offset of the last address, the length of the last node, and so on.

  • The above is redis source code on the Ziplist structure of a description! The overall structure of the Ziplist can be easily understood according to the parts circled in the figure

  • So what does this mean? How do they store content in contiguic blocks of memory?

  • Each block has a fixed memory space to represent its function. Only entry is dynamic in length because it is used by the storage node.
field type The length of the role
zlbytes unit32_t Four bytes, 32 bits The number of bytes consumed by the entire Ziplist.
zltail unit32_t Four bytes, 32 bits The offset of the tail node from the start position. We need the tail node handle offset from the head
zllen unit16_t Two bytes, 16 bits Number of nodes. 16-bit indicates the maximum value of 65535. When the number of elements exceeds 65535, we can only iterate to get the number of nodes
entry … entry dynamic
zlend unit8_t One byte, eight bits Fixed value 0XFF. Used to represent a flag bit
  • Let’s look at an example

node

  • In view of redis source code, I made an explanation. His memory structure is shown in figure

  • The head is related. We can simply plan the head as a head

  • The previous_entry is the length of the previous node. From this we can reverse the value of the previous node. This also explains how we locate elements in the Ziplist. We can get the tail node based on zlbytes and zltAIL. Retrieve the previous node based on the previous_entry of the tail node.

previous_entry

  • In this part we simply understand that it takes 5 bytes (maximum case). Redis takes into account the compact size of the memory and this place will preferentially use 1 byte for the record and when 1 byte is not satisfied like intset it will use 5 bytes for the record. A fixed value of 5 bytes minus the first byte is exactly 32.
  • This also echoes the ziplist usage scenario I mentioned at the beginning. Ziplist is mainly used to store small integers or short characters. When they are very small or very short, the length of an entry node is not very large. In extreme cases, 8 bits can represent their length!
  • It also shows that Redis is really dead on memory. When the previous_entry exceeds 254redis, it is upgraded to 5 bytes, with the first byte set to a fixed value of 0XFE. The next four bytes represent the actual length of the previous node

Is it? Why is the fixed value 0XFE

  • First, only the first byte is set to a fixed value. One byte is commonly represented in hexadecimal by FF. But why not FF here. Because 0XFF is used in ziplist to indicate the ZLend flag. So here we move forward with 0XFE as the entry tag.

encoding

  • The encoding attribute records the data type and length of the node. A brief description records the characteristics of the content

  • The above is an explanation of Entry in redis source code. I made a spreadsheet of him.
coding Length (bytes) value
00aaaaaa 1 Length <=63 A string
01pppppp|qqqqqqqq 2 Length <=16383 Character string
10xxxxxx|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt 5 Length <=4294967295 The value is a character string
11000000 1 int16
11010000 1 int32
11100000 1 int64
11110000 1 A 24-bit signed integer
11111110 1 An 8-bit signed integer
1111xxxx (value range of XXXX [0000, 1101]) 1 0 ~ 12
11111111 Ziplist stop bits
  • The first two bits of all encoding except the 11111111 fixed end bit are used to indicate the data type

  • The first two digits of 11 represent integers; 00 is a string of bytes; 01 indicates a byte string. 10 indicates a 5-byte string

  • The calculated length is encoding. The number denoted by the first two digits is the length of the attribute.

00001000

  • First of all, it begins with 00. According to the manual, we can know that the entry stores a 1-byte string, so what is its length? It’s 1000 in binary which is length 8.
  • 8 bits is just one byte, which means that the internal storage of the node is 1 byte of content, which is probably a

01000001000000 0

  • It starts with 01 for a 2 byte character. Removing the first two bits translates to 256 in decimal. Represents 32 bytes of content. We can understand that this entry stores 32 English letters
  • The following examples will not be given. From encoding we can determine the content and length of the entry.

content

  • The p of the node is used to hold the actual content. From the introduction to Encoding above we know that the content can be either a string or an integer.
  • So we don’t need to look at the actual length of the entry when calculating the length of the entry because it is already in encoding.

Go back in time

  • Remember the ziplist sketch we showed above, where we labeled zlbytes, Zltail, Zllen, entryx, Zlend, etc. Now we will refine the sketch

  • In the above diagram we added a memory representation of three nodes. Here we will clearly understand the structure of an entry. But the final interior still subdivides the display of the head

Chain update

  • We know that previous_entry is used to mark the length of the previous node, and we also said that Redis will try to save memory by first using 1 byte and then expanding to 4 bytes when the length exceeds 254. This means that the length of each of our entries is dynamic. In other words, each of our entries is affected by the previous node

  • For example, now our entries are between 250 and 254 in length. The following figure

  • At this point we need to add a node of length 254 to the header. The previous_entry of our first node above will change from 1 to 5 bytes because of this node 254. This affects the entire entry to change from 250 bytes to 254 bytes. Because the first node is 254 bytes and this recursively affects the second node!

  • Redis calls this chain reaction of adding a node to a header a chain update. The same reason we will encounter a similar situation when deleting, should these two cases collectively referred to as chain update!

conclusion

  • We first parse the ziplist properties structurally, and then continue to parse the internal structure specification from the main ziplist role entry.
  • We divide the entry into three parts. Previous_entry is dependent on the previous node, which leads to an important redis issue [chain update].
  • An overview of where cascading updates occur is rare. Interlocking updates occur only when the length of each node ranges from 25 to 253, which in itself is a low profile. The second node added is 254 itself is also very low. Ziplist is mainly used for small integers or short characters.
  • And when there are very few nodes, even if the chain update occurs, we can accept it. After all, it’s in memory.
  • In conclusion! Ziplist is mainly used in small integers, short characters and other data amount is the best!!

Ask a praise, thank you big ye!!