preface

When we talked about the list structure, we mentioned a special type of structure called a ziplist, or compressed list. It is also one of the underlying structures used by the hash and list structures. It is a structure designed to save memory.
When a list structure contains a small number of items and the elements are small integers or short strings, the Redis base uses ziplist to store the data

structure

Redis’ ziplist is a contiguous block of memory. We can simply think of it as an array, but compared to an array, it has many more descriptions of the node association, such as the total length of the array, the offset of the last address, the length of the last node, and so on.

The above is redis source code on the Ziplist structure of a description! The overall structure of the Ziplist can be easily understood according to the parts circled in the figure

So what does this mean? How do they store content in contiguic blocks of memory?

Each block has a fixed memory space to represent its function. Only entry is dynamic in length because it is used by the storage node.

field	type	The length of the	role
zlbytes	unit32_t	Four bytes, 32 bits	The number of bytes consumed by the entire Ziplist.
zltail	unit32_t	Four bytes, 32 bits	The offset of the tail node from the start position. We need the tail node handle offset from the head
zllen	unit16_t	Two bytes, 16 bits	Number of nodes. 16-bit indicates the maximum value of 65535. When the number of elements exceeds 65535, we can only iterate to get the number of nodes
entry …	entry	dynamic
zlend	unit8_t	One byte, eight bits	Fixed value 0XFF. Used to represent a flag bit

Let’s look at an example

node

In view of redis source code, I made an explanation. His memory structure is shown in figure

The head is related. We can simply plan the head as a head

The previous_entry is the length of the previous node. From this we can reverse the value of the previous node. This also explains how we locate elements in the Ziplist. We can get the tail node based on zlbytes and zltAIL. Retrieve the previous node based on the previous_entry of the tail node.

previous_entry

In this part we simply understand that it takes 5 bytes (maximum case). Redis takes into account the compact size of the memory and this place will preferentially use 1 byte for the record and when 1 byte is not satisfied like intset it will use 5 bytes for the record. A fixed value of 5 bytes minus the first byte is exactly 32.
This also echoes the ziplist usage scenario I mentioned at the beginning. Ziplist is mainly used to store small integers or short characters. When they are very small or very short, the length of an entry node is not very large. In extreme cases, 8 bits can represent their length!
It also shows that Redis is really dead on memory. When the previous_entry exceeds 254redis, it is upgraded to 5 bytes, with the first byte set to a fixed value of 0XFE. The next four bytes represent the actual length of the previous node

Is it? Why is the fixed value 0XFE

First, only the first byte is set to a fixed value. One byte is commonly represented in hexadecimal by FF. But why not FF here. Because 0XFF is used in ziplist to indicate the ZLend flag. So here we move forward with 0XFE as the entry tag.

encoding

The encoding attribute records the data type and length of the node. A brief description records the characteristics of the content

The above is an explanation of Entry in redis source code. I made a spreadsheet of him.

coding	Length (bytes)	value
00aaaaaa	1	Length <=63 A string
01pppppp\|qqqqqqqq	2	Length <=16383 Character string
10xxxxxx\|qqqqqqqq\|rrrrrrrr\|ssssssss\|tttttttt	5	Length <=4294967295 The value is a character string
11000000	1	int16
11010000	1	int32
11100000	1	int64
11110000	1	A 24-bit signed integer
11111110	1	An 8-bit signed integer
1111xxxx (value range of XXXX [0000, 1101])	1	0 ~ 12
11111111	Ziplist stop bits

The first two bits of all encoding except the 11111111 fixed end bit are used to indicate the data type
The first two digits of 11 represent integers; 00 is a string of bytes; 01 indicates a byte string. 10 indicates a 5-byte string
The calculated length is encoding. The number denoted by the first two digits is the length of the attribute.

00001000

First of all, it begins with 00. According to the manual, we can know that the entry stores a 1-byte string, so what is its length? It’s 1000 in binary which is length 8.
8 bits is just one byte, which means that the internal storage of the node is 1 byte of content, which is probably a

01000001000000 0

It starts with 01 for a 2 byte character. Removing the first two bits translates to 256 in decimal. Represents 32 bytes of content. We can understand that this entry stores 32 English letters
The following examples will not be given. From encoding we can determine the content and length of the entry.

content

The p of the node is used to hold the actual content. From the introduction to Encoding above we know that the content can be either a string or an integer.
So we don’t need to look at the actual length of the entry when calculating the length of the entry because it is already in encoding.

Go back in time

Remember the ziplist sketch we showed above, where we labeled zlbytes, Zltail, Zllen, entryx, Zlend, etc. Now we will refine the sketch

In the above diagram we added a memory representation of three nodes. Here we will clearly understand the structure of an entry. But the final interior still subdivides the display of the head

Chain update

We know that previous_entry is used to mark the length of the previous node, and we also said that Redis will try to save memory by first using 1 byte and then expanding to 4 bytes when the length exceeds 254. This means that the length of each of our entries is dynamic. In other words, each of our entries is affected by the previous node
For example, now our entries are between 250 and 254 in length. The following figure

At this point we need to add a node of length 254 to the header. The previous_entry of our first node above will change from 1 to 5 bytes because of this node 254. This affects the entire entry to change from 250 bytes to 254 bytes. Because the first node is 254 bytes and this recursively affects the second node!

Redis calls this chain reaction of adding a node to a header a chain update. The same reason we will encounter a similar situation when deleting, should these two cases collectively referred to as chain update!

conclusion

We first parse the ziplist properties structurally, and then continue to parse the internal structure specification from the main ziplist role entry.
We divide the entry into three parts. Previous_entry is dependent on the previous node, which leads to an important redis issue [chain update].
An overview of where cascading updates occur is rare. Interlocking updates occur only when the length of each node ranges from 25 to 253, which in itself is a low profile. The second node added is 254 itself is also very low. Ziplist is mainly used for small integers or short characters.
And when there are very few nodes, even if the chain update occurs, we can accept it. After all, it’s in memory.
In conclusion! Ziplist is mainly used in small integers, short characters and other data amount is the best!!

Ask a praise, thank you big ye!!

mo4tech.com (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.

[Redis prequel] A data structure that saves memory to the extreme

preface

structure

node

previous_entry

Is it? Why is the fixed value 0XFE

encoding

00001000

01000001000000 0

content

Go back in time

Chain update

conclusion

[Redis prequel] A data structure that saves memory to the extreme

preface

structure

node

previous_entry

Is it? Why is the fixed value 0XFE

encoding

00001000

01000001000000 0

content

Go back in time

Chain update

conclusion

Related Posts

Here’s what you should know about Java thread pool construction parameters!

Algorithms and Interviews – How to prepare for algorithmic interviews

Want to write code quickly? With the help of dataclass