intset

Encoding is intSet (set of small integers) when set stores integers

typedef struct intset {
    int32 encoding;
    int32 length;
    int contents[];
}
Copy the code
field describe instructions
encoding Determines whether the integer bit width is 16, 32, or 64 bits Enumeration said
length Number of elements
contents An integer array that stores element values

Intset stores the elements in order of smallest to largest. When storing an element, the encoding is upgraded based on the integer size, finds the location where the element is to be inserted, and if it is not the last, moves the element after the location one bit, and inserts the element last. If the inserted element is not an integer, the storage becomes a hash structure.

ziplist

If hash and Zset meet the following conditions, the encoding type is ziplist (compressed list). For details, see the configuration file.

Hash-max-ziplist-entries 512 # Hash-max-ziplist-value 64 # zset-max-ziplist-entries 128 for hash keys or values less than 64 # zset-max-ziplist-value 64 # zset < 64Copy the code
typedef struct ziplist {
    int32 zlbytes;
    int32 zltail_offset;
    int16 zllength;
    T[] entries;
    int8 zlend;
}

typedef struct entry {
    int<var> prevlen;
    int<var> encoding;
    byte[] content;
}
Copy the code
field describe instructions
zlbytes Number of bytes occupied by ziplist
zltail_offset The offset of the last element from the start of the compression list It is used to quickly locate the last node and traverse in reverse order
zllength Number of elements
entries Compression element
zlend Marks the end of the compressed list Constant for the FF
field describe instructions
prevlen Length of the previous entry in bytes The first entry is always 0 and the byte length changes dynamically. If the string length is less than 254, one byte is used; otherwise, five bytes are used
encoding The encoding type The encoding type changes dynamically based on the element content, which is extremely complex and will not be described in detail in this articleZiplist encoding type
content Element content, optional

Below is a ziplist demo

  • Byte 1-4, zlbytes is 25, indicating that the compressed list occupies 25 bytes in total
  • Bytes 5-8, zltail_offset is 22, indicating that the last element starts at 22
  • Bytes 9-10, zllength 3, indicating that there are three elements
  • Bytes 11-16, the first entry: prevlen=0 because it is preceded by no data item; Encoding =4, which indicates that the following 4 bytes are stored as strings and the data value is name
  • Bytes 17-21, second entry: Prevlen =6, indicating that the previous entry occupies 6 bytes. Encoding =3, which indicates that the following 3 bytes are stored as strings. The value of the data is why
  • Prevlen =5, indicating that the previous entry occupies 5 bytes. Encoding =0xFE, which is an integer stored in the following 1byte. The value of the data is 14
  • Byte 25, zlend for FF, marks the end of the compressed list

When ziplist is used to store hash structures, the key and value are stored as one entry.

As you can see, the compressed list is very compact. When an entry length becomes 254, the prevlen of the next entry expands from 1 byte to 5 bytes. This is a cascading update

quicklist

Quicklist is used to store lists. It is a hybrid of Ziplist and LinkedList, which is similar to a two-way list structure.

By default, a Ziplist in QuickList is 8K long. If the ziplist is longer than this length, another node will be created, which can be configured in the configuration file.

# -2 indicates 8K, enumeration types can be seen in the configuration file list-max-ziplist-size-2Copy the code

Quicklist has a default compression depth of 0, that is, no compression. If the compression depth is 1, the first and last two packets are not compressed. If the compression depth is 2, the first two packets and the last two packets are not compressed. You can configure this parameter in the configuration file.

list-compress-depth 0
Copy the code

skiplist

Zset uses dict to map values to scores, but also provides sorting by score, hence skiplist.

Let’s start with a demo of Skiplist

typedef struct zsl {
    zslnode* header;
    zslnode* tail;
    int maxLevel;
}
Copy the code
typedef struct zslnode {
    sds value;
    double score;
    zslforward*[] forwards;
    zslnode* backward;
}
Copy the code
typedef struct zslforward {
    zslnode* item;
    int span;
}
Copy the code
field describe instructions
header A header pointer to the jump list Value is fixed to NULL, score is fixed to 0, and BACKWARD is fixed to NULL
tail Tail pointer to jump list
maxLevel Maximum number of layers in the current skip table The maximum is 64
value Used to store string data
score Used to store score values
backward The fallback node The ← arrow in the figure
forwards Forward node → arrows in the figure, one for each layer
span Span, which stores how many nodes are skipped between one node and the next If score1 points to score5, the span value is 4,This is how ranking works

The minimum score backward is fixed with NULL, and for each newly inserted node, a random algorithm is called to assign it a reasonable number of layers

Level1 =0.75 *0.25=0.1875 Level3 =0.1875…… Leveln has a probability of (1-0.25)* math.pow (0.25,n-1)

conclusion

Redis, as a single thread memory service, has made a lot of optimization in response and data structure, which is worth learning

Object type The encoding type
string Int, raw, embstr
list quicklist
hash Dict, ziplist
set Intset, dict
zset Ziplist, skiplist + dict

HyperLogLog

The principle of HyperLogLog is Bernoulli’s test, that is, when a coin is lost, according to the number of consecutive tails X, it is calculated that a total of 2 X power coins have been lost. When X is very large, the error between the calculated total and the actual total is very close. For details, please refer to other articles.

pfadd

Element hashes to a fixed 64-bit value. The lower 14 bits are the bucket. The lower 50 bits are the bucketCopy the code

Assume that the hash value is {omit 45 bits}01100 00000000000101

  • The lower 14 bits of binary are converted to base 10 with a value of 5 (regnum), that is, we place the data in the fifth bucket
  • The first 1 in the top 50 is 3, so the count value is 3
  • Registers [5] extracts the historical value oldcount
  • Registers [5] = count is updated if count > oldcount
  • If count <= oldcount, no processing is done

HyperLogLog uses 16,384 buckets and each bucket takes up 6 bits, so a HyperLogLog takes up 12K of memory.

pfcount

Harmonic average: If my salary is 10_000 and Jack ma's salary is 1_000_000, then the average salary of Jack ma and I is 505_000, I definitely disagree... If harmonic mean is used, it is 2/(1/10_000+1/1_000_000)=19_801 similarly, the average number of buckets is n/(1/ bucket 1 bit +1/ bucket 2 bits +... +1/ bucket n digits) average number of buckets is: math.pow (2, average number of bucket digits) Total number: Case 4: constant = 0.673 * m * m; case 4: constant = 0.673 * m * m; case 4: constant = 0.673 * m * m; Case 5: constant = 0.697 * m * m; Case 6: constant = 0.709 * m * m; Default: constant = (0.7213 / (1 + 1.079 / m)) * m * m; }Copy the code