Preface

In the previous article, we studied the multithreading mechanism of Redis 6: the I/O threads handle reading and writing network packets, while the main thread executes commands against the database. So what is the execution flow of a command? Before we can trace it, we first need to understand the overall structure of Redis Server. In this issue, we will explore the source code around the following questions:

  • How are keys stored in the Redis source code?
  • How does Redis handle hash conflicts?
  • How does progressive rehash work?

Redis multithreaded IO review: Redis 6.0 hyper-detailed multithreaded IO source analysis

Redis database core data structure relationship overview

How are keys stored in the Redis source code?

First, the Redis Server object has the field redisDb *db, an array of 16 databases by default (the count is configurable as required). dict *dict is an important structure in Redis; most of the key space is built on it. The structure in the source is as follows:
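For reference, the core dictionary structures as they appear in dict.h of Redis 6.0 (lightly abridged, field order follows the source):

```c
#include <stdint.h>

typedef struct dictEntry {
    void *key;
    union {
        void *val;            /* for Redis keys this points to a redisObject */
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;   /* chains entries in the same bucket */
} dictEntry;

typedef struct dictType {
    uint64_t (*hashFunction)(const void *key);
    void *(*keyDup)(void *privdata, const void *key);
    void *(*valDup)(void *privdata, const void *obj);
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
    void (*keyDestructor)(void *privdata, void *key);
    void (*valDestructor)(void *privdata, void *obj);
} dictType;

typedef struct dictht {
    dictEntry **table;        /* bucket array */
    unsigned long size;       /* number of buckets, always a power of 2 */
    unsigned long sizemask;   /* size - 1, masks a hash down to an index */
    unsigned long used;       /* number of stored entries */
} dictht;

typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];             /* ht[0] normally; ht[1] only while rehashing */
    long rehashidx;           /* -1 unless a rehash is in progress */
    unsigned long iterators;  /* safe iterators currently running */
} dict;
```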

dictht ht[2] is an array of two hash tables. During normal operation, all key operations are performed on the buckets of ht[0]; during rehash, both tables are operated on at the same time.

The *val field in the union points to a redisObject structure, which records the actual data type of the value and the address where the data is stored.

Value objects in Redis do not use a fixed encoding; under certain conditions a value is converted from one encoding to another. The encodings available to each data type are shown in the following table:

How does Redis handle hash conflicts?

The mature, industry-standard methods for dealing with hash conflicts are as follows:

Open addressing

This method is also called rehashing. Its basic idea: when the hash address p = H(key) of a key conflicts, another hash address p1 is generated based on p; if p1 still conflicts, p2 is generated, and so on, until a non-conflicting hash address pi is found, where the element is stored. The probe function has the general form:

Hi = (H(key) + di) % m,  i = 1, 2, ..., n

Here H(key) is the hash function, m is the table length, and di is the increment sequence. Depending on how di is chosen, there are three main variants:

  • Linear probing

    di = 1, 2, 3, ..., m - 1. When a conflict occurs, the cells that follow in the table are inspected one by one until an empty cell is found or the whole table has been searched.

  • Quadratic probing

    di = 1², -1², 2², -2², ..., k², -k² (k <= m/2). When a conflict occurs, probes jump to cells on both sides of the original position, which is more flexible.

  • Pseudo-random probing

    di = a pseudo-random number sequence. In practice, a pseudo-random number generator is set up (for example, i = (i + p) % m) and a random number is supplied as the starting point.
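To make the probe sequence Hi = (H(key) + di) % m concrete, here is a minimal linear-probing insert. This is a generic illustration of open addressing with nonnegative integer keys; Redis itself does not use this method:

```c
#define M 8          /* table length m */
#define EMPTY -1     /* sentinel; assumes keys are nonnegative */

static long slots[M];

void probe_init(void) {
    for (int i = 0; i < M; i++) slots[i] = EMPTY;
}

/* H(key) = key % M; on conflict, probe Hi = (H(key) + i) % M, i = 1..m-1.
 * Returns the slot where the key landed, or -1 if the table is full. */
int probe_insert(long key) {
    for (int i = 0; i < M; i++) {
        int idx = (int)((key + i) % M);
        if (slots[idx] == EMPTY || slots[idx] == key) {
            slots[idx] = key;
            return idx;
        }
    }
    return -1;
}
```

Inserting 3, 11, and 19 (all of which hash to bucket 3 when m = 8) lands them in slots 3, 4, and 5 respectively, showing the sequential probing on conflict.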

Chaining (linked-list method)

That is, when a hash conflict occurs, the element at the hash bucket's address is used as the head, and the conflicting element is linked before or after it. When a lookup hits a conflict, the linked list is traversed.

Rehashing with a second hash function

That is, when a hash conflict occurs, a different hash function is used to compute the element's hash again, and this repeats until an available address is found.

Public overflow area

The hash table is divided into two parts: a primary area and an overflow area. Elements whose hashes conflict are placed in the overflow area.

Each of these methods has its advantages and disadvantages. Chaining is the most common in mainstream languages: Java and PHP, for example, both resolve conflicts with linked lists. Redis also uses chaining to handle hash conflicts, building the conflict list through the next field of dictEntry. The source for adding an element:

Figure 1-1: the hash add operation

The new entry's next is set to the current head of the list at the table index. If there is no conflict, entry->next is NULL; if there is, the newest element becomes the head of the list, so conflicting entries form a chain in turn. When deleting an entry, the chain is traversed and keys are compared.

Figure 1-2: the hash delete operation

Redis's progressive rehash

Hashtable initialization

The dict.ht[0] field is initialized in server.c through the following call chain:

  1. server.c:2919:
    server.db[j].dict = dictCreate(&dbDictType,NULL);
  2. dict.c:111:
    dict *d = zmalloc(sizeof(*d));
    _dictInit(d,type,privDataPtr);
  3. dict.c:121:
    int _dictInit(dict *d, dictType *type,
            void *privDataPtr)
    {
        _dictReset(&d->ht[0]);
        _dictReset(&d->ht[1]);
        d->type = type;
        d->privdata = privDataPtr;
        d->rehashidx = -1;
        d->iterators = 0;
        return DICT_OK;
    }
  4. dict.c:102:
    static void _dictReset(dictht *ht)
    {
        ht->table = NULL;
        ht->size = 0;
        ht->sizemask = 0;
        ht->used = 0;
    }

Hashtable Expansion policy

We can see that no memory is allocated for the table field during initialization. As the number of stored keys grows, the table must be expanded. Hash tables generally expand according to a load factor; for example, Java's HashMap defaults to 0.75. When expansion or shrinking is needed, the required memory is allocated according to the corresponding policy. Redis likewise has its own load factor and allocation strategy. The relevant source code:

  • The load factor is defined as 5, dict.c:63

    static unsigned int dict_force_resize_ratio = 5;
  • dict.c:953 determines whether expansion is needed

    static int _dictExpandIfNeeded(dict *d) {
        /* Expand when condition 1 (used >= size, i.e. at least as many
         * entries as buckets) holds, together with condition 2 (resizing
         * is allowed) or condition 3 (used/size > 5, forcing a resize). */
        if (d->ht[0].used >= d->ht[0].size &&
            (dict_can_resize ||
             d->ht[0].used/d->ht[0].size > dict_force_resize_ratio))
        {
            return dictExpand(d, d->ht[0].used*2);
        }
        return DICT_OK;
    }
  • dict.c:975 calculates the size required for the expansion

    static unsigned long _dictNextPower(unsigned long size)
    {
        unsigned long i = DICT_HT_INITIAL_SIZE;

        if (size >= LONG_MAX) return LONG_MAX + 1LU;
        while(1) {
            if (i >= size)
                return i;
            i *= 2;
        }
    }

    You can see from the source code that the table grows in powers of two: dictExpand is asked for used * 2, and _dictNextPower rounds that up to the smallest power of two that can hold it.
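A standalone copy of the sizing logic makes the doubling behavior easy to check (DICT_HT_INITIAL_SIZE is 4 in dict.h):

```c
#include <limits.h>

#define DICT_HT_INITIAL_SIZE 4

/* Standalone copy of the power-of-two sizing logic from dict.c:
 * return the smallest power of two >= size, starting from 4. */
unsigned long dict_next_power(unsigned long size) {
    unsigned long i = DICT_HT_INITIAL_SIZE;
    if (size >= LONG_MAX) return (unsigned long)LONG_MAX + 1LU;
    while (1) {
        if (i >= size) return i;
        i *= 2;
    }
}
```

For example, asking for room for 5 entries yields a table of 8 buckets, and asking for 100 yields 128.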

Rehash process

After the expansion policy computes the required size, memory of that size is allocated. The source code is as follows:

Note that Redis does not rehash all the values of ht[0] into a new table n and replace ht[0] in one pass, i.e. nothing like the following (pseudo-code):

Because Redis is a high-performance in-memory server, a single dict may hold millions of keys and occupy tens of GB of memory. Recomputing every hash and reinserting every entry in one pass would, on a large data set, require massive computation and cause service delays, while also generating excessive memory fragmentation. (Of course, unreasonable usage that packs too many elements into one value causes similar problems.) Redis therefore uses progressive rehash to avoid this. The approach is:

  1. Allocate memory for ht[1]; at this point the dict holds two tables, ht[0] and ht[1];
  2. Set dict.rehashidx = 0, indicating that the rehash operation has started (the default value is -1);
  3. During the rehash, every add, delete, update, and find operation first checks whether the dict is in the rehash state. Besides completing its own work, it also migrates the bucket at index rehashidx on ht[0] to ht[1] and increments dict.rehashidx by 1, until at some point all of ht[0] has been migrated to ht[1]; the tables are then swapped and rehashidx is set back to -1:
    if (d->ht[0].used == 0) {
        zfree(d->ht[0].table);
        d->ht[0] = d->ht[1];
        _dictReset(&d->ht[1]);
        d->rehashidx = -1;
        return 0;
    }
  4. During the rehash, newly added elements are saved directly to ht[1] (see Figure 1-1); delete, find, and update operations search both tables through a for loop (see Figure 1-2).
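The bucket-by-bucket migration of step 3 can be sketched as follows. This is a simplified model of one dictRehash step, not the actual Redis code: the index is computed as key & sizemask, and the final table swap shown above is omitted:

```c
#include <stddef.h>

#define OLD_SIZE 4
#define NEW_SIZE 8   /* new table, double the old size */

typedef struct node {
    unsigned long key;
    struct node *next;
} node;

typedef struct mini_dict {
    node *ht0[OLD_SIZE];  /* table being drained */
    node *ht1[NEW_SIZE];  /* table being filled */
    long rehashidx;       /* next ht0 bucket to migrate; -1 = not rehashing */
} mini_dict;

/* Migrate one non-empty bucket of ht0 into ht1 and advance rehashidx,
 * mirroring a single step of dictRehash. */
void rehash_step(mini_dict *d) {
    if (d->rehashidx < 0) return;
    /* Skip empty buckets. */
    while (d->rehashidx < OLD_SIZE && d->ht0[d->rehashidx] == NULL)
        d->rehashidx++;
    if (d->rehashidx >= OLD_SIZE) {       /* everything migrated */
        d->rehashidx = -1;
        return;
    }
    node *e = d->ht0[d->rehashidx];
    while (e) {
        node *next = e->next;
        unsigned long idx = e->key & (NEW_SIZE - 1);  /* hash & sizemask */
        e->next = d->ht1[idx];            /* head-insert into the new table */
        d->ht1[idx] = e;
        e = next;
    }
    d->ht0[d->rehashidx] = NULL;
    d->rehashidx++;
    if (d->rehashidx >= OLD_SIZE) d->rehashidx = -1;  /* rehash complete */
}
```

Keys 1 and 5 share bucket 1 when the old size is 4, but land in buckets 1 and 5 of the doubled table, which is exactly why rehashing is needed after an expansion.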

The process of progressive rehash is shown in the following figures (note: the figures are not original; they come from an article on Redis progressive rehash):

1. Prepare the rehash

2. Rehash the key-value pair on index 0

3. Rehash the key-value pair on index 1

4. Rehash the key-value pair on index 2

5. Rehash the key-value pair on index 3

6. Rehash the key-value pair on index 4. Rehash is complete

That wraps up our analysis of Redis's overall storage structure, its workflow, and progressive rehash. If you've read this far, don't forget to give us a like, favorite, and share, giving us the motivation to learn and move forward together. In the next part, I plan to analyze the Redis data structures and commands that interest me.