“This article has participated in the good article call order activity, click to see: back end, big front end double track submission, 20,000 yuan prize pool for you to challenge!”

What is the set of integers in Redis? What is the internal structure when we want to add an integer to a set? What is the default range of the integer set? How is the out-of-range data stored? Will it degrade when the longest element is removed? Today, we’re going to take a look at integer sets

Why can’t a integer set after the upgrade in the relegation operations | frequency intset an upgrade

Review past

Redis redis prequel 】 【 rapid mapping dictionary + hash broken | progressive rehash of the single thread does not affect the background work

Redis eliminated + redis of being wild 】 【 expired two-way guarantee high availability | single-threaded how to achieve fast response

[redis prequel] redis list five top values basic data | how to growth from in to out further study

Handwritten redis prequel 】 【 a LRU strategy | catch the tail of the time

Redis based on distributed lock

Redis prequel zset 】 how to solve the internal chain table lookup inefficient | jump table build

preface

Integer set believe some students have not heard of, because Redis external only provides the encapsulation of five objects! The main purpose of this series is to study the internal structure of Redis. The internal structure is an important support for the five structures of Redis!

Previously, we analyzed the List, Hash and Zset data structures of Redis respectively from the internal structure of Redis. Today we will look at how the set data structure is stored inside

The basic structure

  • We find such code in SRC /t_set.c

  • From this we know that there are two data structures in a set: HashTable + IntSet. Other structures within Redis are covered in the Redis column. Instead of focusing on hashtables, we’re going to focus on intsets.

  • From the figure above, we can see that I constructed two sets: [commonSet] and [cs]. The former stores strings and the latter exclusively stores numbers.

  • When we look at the underlying data structures of the next two collections using the Object Encoding key, we find that one is a Hashtable and the other is an IntSet. This also validates our description of the basic structure of the set above.

  • The five types provided externally in Redis are actually an abstract object of Redis called RedisObject. Internally maps our data structures inside Redis

  • The internal data structure of the CommonSet and CS collections can be roughly interpreted as this

When to use intSet

  • You can simply assume that any number will be stored using intset, but I’m afraid I’m going to hit you in the head. Not really

  • The following two conditions must be met:

intset

  • Figure said in a clear, there are three kinds of encoding in the intset values represent the contents stored data types. Now some of you might be wondering if contents is of type int8_t? Why do we need Encoding? Tracking internally through source code here really has nothing to do with int8_t. And the default data type is int16_t. I don’t need to explain much about length, but remember that the number of contents elements is not the length of the contents array!
  • Intset intset intset intset intset intset intset intset In C and C++, int ranges are defined
  • The value of int8_t ranges from -128,127. Similar to Java, byte is 1 byte or 8 bits. And his range of values is 0

2 7 …… 2 7 1 namely 128 …… 127 -2^{7} \sim 2^{7}-1 \\ 即 \\ -128 \sim 127

Add elements

sadd juejin -123
sadd juejin -6
sadd juejin 12
sadd juejin 56
sadd juejin 321	
Copy the code
  • The juejin key is intset inside.

  • We added 5 elements above and all of them are less than 16 in length! So current intset encoding=INTSET_ENC_INT16. -123 is the first 16 bits of contents.

  • So the current five elements of the length of contents is 16 times 5 is 80;

  • Note that when a set stores int data, it is stored internally in order from smallest to largest.

Types of changes

  • The above question does not know whether you have considered, or have met! Intset defaults to Int16 bits, as we added the five elements above. At this point we add the sixth element which is 65535(32 bits). If 16 bits is not enough to store, what will intset do?
  • In addition, when we add the sixth element and remove 65535, the structure is the same as before! Here we take these two questions to find out!!

upgrade

  • Let’s take a look at the first problem. The original length of five elements is 16 bits, so the added length is 32 bits. Can you directly append 32 bits to 65535?
  • The answer is absolutely no, because first of all, appending directly does not guarantee the size order of the array elements! Secondly, if the first five are 16 bits and the sixth is 32 bits, there are no extra fields in the intSet structure to mark. This means that there is no way to determine whether to parse 16 bits or 32 bits.
  • Redis will upgrade the entire contents when there is a high length added in order to facilitate parsing. I’m just going to expand the entire contents, and then I’m going to repopulate it

Add in 65535

  • Firstly, according to length, it can be determined that the number of elements after expansion is 6, and each placeholder is 32, so the length of contents is 32*6=192. The first 80 bits remain the same

Old data shift

  • So once we’ve cleared up enough space, we can shift the old data so we start at the end of the old array, and before we move it, we need to specify where we’re sorting in the new array.
  • At this point, we first compare 321 to determine that it ranks fifth in the new array, so it will occupy the range of 128 to 159 in the new contents.

  • Eventually the first five elements will be moved.

  • Finally, the new elements are filled in. When an upgrade occurs it must be because the new element is longer than the original. So its value must be at either end of the new array. Negative numbers on the far left, positive numbers on the far right

demotion

  • The second problem is what to do when the redis is deleted after the addition of 65535. In this case, the element length is 16 bits, but the encoding is 32 bits. In accordance with my view should be in the implementation of relegation!
  • Unfortunately Redis didn’t, so why not? How would you achieve it if you had to

Why not implement demotion

  • It’s easy to know if you need to upgrade when you add more than the current length, but it’s hard to know if you need to downgrade when you delete a piece of data. You need to iterate again to see if the remaining elements are smaller than the current length, implementing complexity O(N). That’s one of the reasons why we don’t downgrade, right
  • You might say it’s very quick to go through it again but it’s in memory anyway, so have you ever thought that if you downgrade it and then you get an upgrade, and then you go back and forth and upgrade it and degrade the performance of our program. We know that promotion is necessary so redis is taking a strategy of neglect here

summary

Resources: Memory upgrade optimization Memory degradation