Basic data types

There are five basic data types of Redis, respectively

  • String
  • List
  • Hash
  • Set
  • SortedSet

These basic data types form the foundation of other data types, and these basic data types correspond to different underlying implementations, which are often special optimizations for different usage scenarios. The following will summarize the basic data types and sort out their relationship with the underlying implementations.

String

Redis uses strings a lot. Although they are called strings, they can also represent long, so they can be used with the INCR instruction to increment, etc. As a number type, they are relatively space saving because they reuse the pointer field in the RedisObject (8 bytes). It is divided into two parts, 8-byte Redis object metadata information, 8-byte pointer, where the metadata information of the Redis object stores the LRU information of the type, the true encoding format, etc., and if the string stores the numeric type, the 8-byte pointer position is reused.

When the stored string, the pointer points to the structure of the SDS in Redis, namely simple dynamic string, dynamic string, since it is a dynamic word shows that it has the ability to automatically expands, so in the basic structure of SDS, capacity and len the two basic attributes, to identify the current a total of how much space is available, And how much space is used.

According to the structure of SDS, Redis has two different encoding formats to store simple dynamic strings, namely EMBSTR encoding and RAW encoding. Embstr is compact, RedisObject and real string structure are stored continuously, and ROW encoding is stored separately. It relies on Pointers in RedisObject for positioning, using embstr for strings less than or equal to 44 bytes, and row for other cases. The two encoding formats are roughly as shown in the figure below.

Note that SDS adds ‘\0’ to the end of the string, which allows the reuse of C functions, but also the introduction of delimiters that do not support full binary data.

For simple dynamic strings, the capacity expansion mechanism is more important. If the string length is smaller than 1 MB, double the capacity expansion method. If the string size is larger than 1 MB, expand the string capacity by 1 MB. However, the total length of the string cannot exceed 512 MB

List

There are different encoding formats at the bottom of the list, which can be basically viewed as a linkedlist, so searching for subscripts is mostly O(n). However, when the list has very few elements, it is stored internally using ziplist (compressed list). When the list has too many elements, it is converted to linkedlist. However, to reduce malloc calls and reduce fragmentation, Redis uses multiple Ziplist implementations linked together into a linked list, known as a QuickList. Below is a brief description of Ziplist and QuickList.

A compressed list is a compactly held list in memory with some basic information and entries and a tail at the end of the ziplist that is the 0xFF flag at the end of the compressed list. Compression list the size of the list of basic information, including compression, compression and the length of the list of the last entry, the reason for this shift, is to find the last entry, and the size of an element on every entry records, knew by calculation on an entry in the address, after such a convenient from traverse forward. The structure of the compressed list is shown below.

Since memory is compact, it is easy to reallocate and copy memory when changes are made, so it is used when there are fewer elements. Not only list, zset and Hash container objects also use compressed lists when there are fewer elements.

The quicklist is a bidirectional linkedlist composed of multiple ziplists. This design can reduce the overhead of bidirectional linkedlist Pointers, both ziplist and linkedlist advantages

Hash

Similar to Java HashMap, it uses the zipper method to resolve Hash conflicts. However, for middleware such as Redis, multiple services may be in use. In terms of Rehash, to avoid the blocking caused by rehash operations performed at one time, Progressive Rehash is used. The so-called progressive Rehash is to decompose a rehash operation into multiple executions. Redis has two hashtables for Hash dictionary types.

  • Then, for each request processed, a bucket from the original HashTable is randomly assigned
  • Redis also actively relocates dictionaries in scheduled tasks

Redis Hash can be expanded or shrunk

  • Generally, when the number of elements in the hash table is the number of buckets, the hash table starts to expand. The expanded array is twice the size of the original array
  • The reduction condition is that the number of elements is less than 10% of the size of the array, that is, the number of buckets

Set

Set is a hash with a null Value. If all values are integers, intSet will be used. If non-integer values are inserted, the encoding will change from intset to Hashtable encoding. In contrast to Ziplist,intset is ordered and binary lookup is possible, whereas Ziplist is unordered

SortedSet

A Set can be sorted naturally with basic data structures, such as TreeMap AVLTree, and SkipList is used in Redis to do this. About jump lists, recommended site articles: juejin.cn/post/684490…