Principles of Redis data structures and basic data types

1sds

SDS data structure

What are the benefits of SDS over C strings

1 can directly obtain the length of the string at o (1) time load, while C needs to traverse, is O (n).

2. String can be modified to avoid memory overflow or leak. When SDS is modified, if the length is not enough, it will be automatically lengthened. If the string length is larger than the original length, memory overflow can occur. Furthermore, if you use trim () to remove whitespace from the C string, the C string does not automatically reduce memory usage, resulting in a memory leak

3SDS is binary safe, c strings end with ‘\0’ by default, which may not be available when storing multimedia content such as audio and video, while SDS does not have this problem

4SDS adds \0 ‘to the end so that SDS can use some C API

Memory allocation principle of SDS

When the size of SDS is smaller than 1m, the capacity of SDS is doubled each time. When the size of SDS is larger than 1m, the capacity of SDS is increased by 1m each time. The maximum capacity is 512m

2 list,

3 a dictionary

Dictionaries hold data and resolve conflicts

1 Compute the hash value

2 Save the data in the corresponding hash location and continuously append data in the following chain

Load factor calculation

Capacity calculation after capacity expansion

5*2 = 10, and the nearest 2^n is 16

How to increase

Using progressive hash, you create an expanded dictionary to coexist with the atoms, slowly move the data over, and finally delete the original data dictionary.

4 jump table

Skip table is the original data structure to add multi-level index, easy to find data

5 Set of integers

The benefits of integer sets

1Redis saves integers with 16 bits, 32 bits and 64 bits. Redis only saves integer sets in progressive upgrade mode, using the largest median of digits as the number of digits saved. If all of them are 16 bits, it uses 16 bits. If 14632 is removed it will be downgraded to 16 bits. The advantage of this is to save memory.

6 Compression table zipList

Previous_entity_length: Records the length of the last node. If the size of the last node is less than 254 bytes, previous_entity_length requires only one byte. Otherwise, it requires five bytes

Encoding: encoding

Chain update problem

If a node with a length greater than 254 is inserted at the front, the previous_entity_length of the next node increases from 1 to 5, causing the current node to change. If the length of the next node is between 251 and 253, a chain reaction occurs. All previous_entity_length lengths are expanded, resulting in performance problems. However, such performance problems are rare

7 Principle of Basic Data Type String