This is the third article in my Getting Started series.

1. The dictionary

1.1 Dictionary Introduction
1.2 Dictionary implementation
1.3 Hashing algorithm
1.4 Resolving key conflicts
1.5 Rehash
1.6 Progressive Rehash

1.1 Dictionary Introduction

A dictionary, also known as a symbol table, associative array, or map, is an abstract data structure used to hold key-value pairs. In a dictionary, a key can be associated with (or mapped to) a value; together they form a key-value pair, and the keys in a dictionary are unique. Dictionaries are built into many high-level programming languages, but the C language that Redis is written in has no such built-in data structure, so Redis implements its own dictionary. Dictionaries are used widely in Redis: for example, the Redis database itself uses a dictionary as its underlying implementation, and so does the Redis hash key type.

1.2 Dictionary implementation

Redis' dictionary uses a hash table as its underlying implementation. A hash table can contain multiple hash table nodes, and each node holds one of the dictionary's key-value pairs. Next, we will look at how hash tables, hash table nodes, and dictionaries are implemented in Redis.

1.2.1 Hash tables

The hash table is defined by the dict.h/dictht structure as follows:

typedef struct dictht {
    // Hash table array
    dictEntry **table;
    // Hash table size
    unsigned long size;
    // Hash table size mask, used to calculate the index value
    // is always equal to size minus 1
    unsigned long sizemask;
    // The number of existing nodes in the hash table
    unsigned long used;
} dictht;
  • The table attribute is an array, and each element in the array is a pointer to a dict.h/dictEntry structure, each of which holds a key-value pair.
  • The size attribute records the size of the hash table, which is the size of the table array.
  • The used attribute records the number of existing nodes (key-value pairs) in the hash table.
  • The value of the sizemask attribute is always equal to size-1, and this attribute, along with the hash value, determines which index in the table array a key should be placed on.



The figure shows an empty hash table of size 4 (without any key-value pairs).
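As a concrete illustration of these four fields, here is a minimal, self-contained sketch that builds the empty size-4 table from the figure (the init_empty_table helper and the calloc-based allocation are assumptions made for this example, not Redis code):

#include <stdio.h>
#include <stdlib.h>

// Forward declaration; the full dictEntry structure is shown in section 1.2.2.
typedef struct dictEntry dictEntry;

typedef struct dictht {
    dictEntry **table;      // Hash table array
    unsigned long size;     // Hash table size
    unsigned long sizemask; // Always equal to size - 1
    unsigned long used;     // Number of nodes currently stored
} dictht;

// Hypothetical helper: allocate an empty hash table of the given size.
static void init_empty_table(dictht *ht, unsigned long size) {
    ht->table = calloc(size, sizeof(dictEntry *)); // every bucket starts as NULL
    ht->size = size;
    ht->sizemask = size - 1;
    ht->used = 0;
}

int main(void) {
    dictht ht;
    init_empty_table(&ht, 4);
    printf("size=%lu sizemask=%lu used=%lu\n", ht.size, ht.sizemask, ht.used);
    free(ht.table);
    return 0;
}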

1.2.2 Hash table nodes

Hash table nodes are defined by the dictEntry structure; each node holds one key-value pair:

typedef struct dictEntry {
    // Key
    void *key;
    // Value
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v;
    // Pointer to the next hash table node, forming a linked list
    struct dictEntry *next;
} dictEntry;
  1. The key attribute holds the key of the key-value pair, while the v attribute holds its value; the value can be a pointer, a uint64_t integer, or an int64_t integer.
  2. The next attribute is a pointer to another hash table node; it chains together key-value pairs whose keys hash to the same index, which resolves collisions.
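As a small illustration of how the v union stores either a pointer or an integer value, here is a self-contained sketch (the sample keys, values, and main function are made up for the example):

#include <stdint.h>
#include <stdio.h>

typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v;
    struct dictEntry *next;
} dictEntry;

int main(void) {
    // One node stores its value as a pointer, the other as a signed integer.
    char *name = "v0";
    dictEntry e1 = { .key = "k0", .v.val = name, .next = NULL };
    dictEntry e2 = { .key = "k1", .v.s64 = -42,  .next = NULL };

    printf("%s -> %s\n",   (char *)e1.key, (char *)e1.v.val);
    printf("%s -> %lld\n", (char *)e2.key, (long long)e2.v.s64);
    return 0;
}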

1.2.3 Dictionaries

Dictionaries in Redis are defined by the dict.h/dict structure as follows:

typedef struct dict {
    // Type-specific functions
    dictType *type;
    // Private data
    void *privdata;
    // Hash tables
    dictht ht[2];
    // Rehash index
    // When no rehash is in progress, the value is -1
    int rehashidx;
} dict;
  • The type attribute is a pointer to a dictType structure. Each dictType structure holds a set of functions for operating on key-value pairs of a particular type, and Redis sets different type-specific functions for different dictionaries.
  • The privdata attribute holds optional arguments that are passed to those type-specific functions.

A dictType structure is defined as follows:

typedef struct dictType {
    // Function that computes hash values
    unsigned int (*hashFunction)(const void *key);
    // Function that duplicates keys
    void *(*keyDup)(void *privdata, const void *key);
    // Function that duplicates values
    void *(*valDup)(void *privdata, const void *obj);
    // Function that compares keys
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
    // Function that destroys keys
    void (*keyDestructor)(void *privdata, void *key);
    // Function that destroys values
    void (*valDestructor)(void *privdata, void *obj);
} dictType;
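For illustration, a dictType for plain C-string keys might be filled in roughly as follows. This is only a sketch: the stringHash and stringKeyCompare helpers are assumptions made for the example, not the type-specific functions Redis actually registers, and the snippet assumes the dictType definition above (dict.h) is in scope.

#include "dict.h"   // for the dictType definition shown above
#include <string.h>

// Hypothetical hash function for C-string keys, for illustration only.
static unsigned int stringHash(const void *key) {
    const char *s = key;
    unsigned int h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

// Hypothetical key comparison callback: returns 1 when the keys are equal.
static int stringKeyCompare(void *privdata, const void *key1, const void *key2) {
    (void)privdata; // privdata is unused in this sketch
    return strcmp(key1, key2) == 0;
}

// A set of type-specific functions for dictionaries with C-string keys.
// NULL members mean "no duplication / no destructor needed" in this sketch.
dictType stringDictType = {
    stringHash,       // hashFunction
    NULL,             // keyDup
    NULL,             // valDup
    stringKeyCompare, // keyCompare
    NULL,             // keyDestructor
    NULL              // valDestructor
};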
  • The ht attribute is an array containing two hash tables. In general, the dictionary uses only the ht[0] hash table; ht[1] is used only while ht[0] is being rehashed.
  • The rehashidx attribute records the current progress of the rehash; its value is -1 when no rehash is in progress.

The following figure shows a dictionary in its normal state (without rehash).
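A small sketch of how that flag can be checked (the dict_stub structure and the sample values are stand-ins for the example; Redis's dict.h provides a dictIsRehashing macro with this meaning):

#include <stdio.h>

// Minimal stand-in for the dict structure's rehashidx field (see 1.2.3).
struct dict_stub { int rehashidx; };

// A dictionary is mid-rehash exactly when rehashidx is not -1.
#define dictIsRehashing(d) ((d)->rehashidx != -1)

int main(void) {
    struct dict_stub normal = { .rehashidx = -1 }; // normal state, as in the figure
    struct dict_stub moving = { .rehashidx = 5  }; // rehash has progressed to bucket 5

    printf("normal: %s\n", dictIsRehashing(&normal) ? "rehashing" : "not rehashing");
    printf("moving: %s\n", dictIsRehashing(&moving) ? "rehashing" : "not rehashing");
    return 0;
}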

1.3 Hashing algorithm

To add a new key-value pair to the dictionary, the program first computes a hash value and an index value from the key-value pair's key, and then, based on the index value, places the hash table node containing the new key-value pair at the corresponding index of the hash table array.

Redis computes the hash and index values as follows:

// Calculate the hash value of the key using the hash function set by the dictionary
hash = dict->type->hashFunction(key);
// Calculate the index value using the hash table's sizemask attribute and the hash value
// ht[x] can be either ht[0] or ht[1], depending on the situation
index = hash & dict->ht[x].sizemask;

For example, to add the key-value pair k0 and v0 to a dictionary, the program first computes the key's hash value with hash = dict->type->hashFunction(k0); assuming this hash value is 8 and the table size is 4, it then computes index = hash & dict->ht[0].sizemask = 8 & 3 = 0. The index value for key k0 is therefore 0, which means the node containing k0 and v0 will be placed at index 0 of the hash table array.
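The same arithmetic as a tiny runnable sketch (the hash value 8 and the table size 4 are simply the numbers assumed in the example above):

#include <stdio.h>

int main(void) {
    unsigned long hash = 8;                // assumed result of hashFunction(k0)
    unsigned long size = 4;                // hash table size from the example
    unsigned long sizemask = size - 1;     // 3

    unsigned long index = hash & sizemask; // 8 & 3 == 0
    printf("index = %lu\n", index);        // prints: index = 0
    return 0;
}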

When a dictionary is used as the underlying implementation of a database or of a hash key, Redis uses the MurmurHash2 algorithm to compute the hash values of keys. The advantage of MurmurHash, originally invented by Austin Appleby in 2008, is that even when the input keys are regular, the algorithm still produces a good random distribution, and it computes hashes very quickly. The latest version of the algorithm is MurmurHash3.
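For reference, the 32-bit MurmurHash2 core is sketched below, following Austin Appleby's public-domain reference implementation (alignment and endianness handling are simplified with memcpy, and the seed in the usage example is an arbitrary value chosen for illustration):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

uint32_t MurmurHash2(const void *key, int len, uint32_t seed) {
    const uint32_t m = 0x5bd1e995;   // mixing constant
    const int r = 24;                // shift constant
    uint32_t h = seed ^ (uint32_t)len;
    const unsigned char *data = (const unsigned char *)key;

    // Mix 4 bytes at a time into the hash.
    while (len >= 4) {
        uint32_t k;
        memcpy(&k, data, 4);         // simplified: avoids unaligned reads
        k *= m; k ^= k >> r; k *= m;
        h *= m; h ^= k;
        data += 4; len -= 4;
    }

    // Handle the last few bytes of the input (fall-through is intentional).
    switch (len) {
    case 3: h ^= data[2] << 16; /* fall through */
    case 2: h ^= data[1] << 8;  /* fall through */
    case 1: h ^= data[0]; h *= m;
    }

    // Final mixing so the last bytes are well incorporated.
    h ^= h >> 13; h *= m; h ^= h >> 15;
    return h;
}

int main(void) {
    const char *key = "k0";
    printf("hash = %u\n", MurmurHash2(key, (int)strlen(key), 5381));
    return 0;
}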

1.4 Resolving key conflicts

When two or more keys are assigned to the same index of the hash table array, we say that a conflict (collision) occurs between those keys. Redis hash tables use separate chaining to resolve key conflicts: each hash table node has a next pointer, and the nodes assigned to the same index are linked into a singly linked list via their next pointers, which resolves the conflict.

The figure above shows a linked list being used to resolve the conflict between k2 and k1.

Because the linked list formed by dictEntry nodes has no pointer to the tail of the list, the program always adds new nodes at the head of the list (an O(1) operation) for speed, placing them ahead of any existing nodes.
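A sketch of that head insertion (the bucket_push_front helper and the sample keys are made up for the example; the dictEntry layout is the one from section 1.2.2):

#include <stdint.h>
#include <stdlib.h>

typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v;
    struct dictEntry *next;
} dictEntry;

// Insert a new node at the head of a bucket's chain: O(1), no tail pointer needed.
// `bucket` is the address of one slot in the table array (e.g. &ht->table[index]).
static dictEntry *bucket_push_front(dictEntry **bucket, void *key, void *val) {
    dictEntry *node = malloc(sizeof(*node));
    if (node == NULL) return NULL;
    node->key = key;
    node->v.val = val;
    node->next = *bucket;   // new node points at the previous head (possibly NULL)
    *bucket = node;         // the bucket now starts with the new node
    return node;
}

int main(void) {
    dictEntry *bucket = NULL;               // an empty bucket
    bucket_push_front(&bucket, "k1", "v1");
    bucket_push_front(&bucket, "k2", "v2"); // k2 ends up ahead of k1, as in the figure

    // Free the chain.
    while (bucket) { dictEntry *next = bucket->next; free(bucket); bucket = next; }
    return 0;
}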

1.5 Rehash

As operations are continually performed on the dictionary, the number of key-value pairs stored in the hash table gradually grows or shrinks. To keep the hash table's load factor within a reasonable range, the program needs to expand or shrink the hash table when the number of key-value pairs it holds becomes too large or too small. This expansion and shrinking is carried out by performing a rehash operation on the hash table: the program allocates a new table for ht[1] whose size is the first power of two greater than or equal to ht[0].used * 2 (when expanding) or to ht[0].used (when shrinking), rehashes every key-value pair from ht[0] into ht[1], and finally replaces ht[0] with the new ht[1].
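As a rough illustration of the sizing step described above, the following example picks the new table size as the first power of two that is large enough (the next_power_of_two helper, the starting size of 4, and the sample used count are assumptions made for illustration):

#include <stdio.h>

// Hypothetical helper: the smallest power of two greater than or equal to target.
static unsigned long next_power_of_two(unsigned long target) {
    unsigned long size = 4; // assumed initial hash table size
    while (size < target)
        size *= 2;
    return size;
}

int main(void) {
    unsigned long used = 5; // ht[0].used: key-value pairs currently stored

    // Expansion sizes from used * 2; contraction sizes from used.
    printf("expand to %lu\n", next_power_of_two(used * 2)); // 16
    printf("shrink to %lu\n", next_power_of_two(used));     // 8
    return 0;
}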

Hash table expansion and contraction

The program automatically begins to expand the hash table when either of the following conditions is met:

  1. The server is not currently running the BGSAVE or BGREWRITEAOF command, and the load factor of the hash table is greater than or equal to 1.
  2. The server is currently running the BGSAVE or BGREWRITEAOF command, and the load factor of the hash table is greater than or equal to 5.

Where, the load factor of the hash table can be obtained by the following formula:

// Load factor = number of nodes saved in the hash table / hash table size
load_factor = ht[0].used / ht[0].size;

The load factor required to trigger an expansion differs depending on whether the BGSAVE or BGREWRITEAOF command is being executed. This is because, while executing BGSAVE or BGREWRITEAOF, Redis needs to create a child of the current server process, and most operating systems use copy-on-write technology to optimize the child process's memory usage. While the child process exists, the server therefore raises the load factor required to trigger an expansion, avoiding hash table expansions during that period as much as possible. This avoids unnecessary memory writes and saves as much memory as possible.

On the other hand, when the load factor of the hash table is less than 0.1, the program automatically starts to perform a shrink operation on the hash table.
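Putting the thresholds and the load factor formula together, the decision could be sketched as follows (ht_stub, the function names, and the sample numbers are made up for the example; only the thresholds 1, 5, and 0.1 and the load factor formula come from the text above):

#include <stdbool.h>
#include <stdio.h>

// Minimal stand-in for one hash table's bookkeeping fields (see dictht in 1.2.1).
struct ht_stub { unsigned long size, used; };

// Sketch: should the table be expanded? The threshold is 1 normally,
// and 5 while a BGSAVE/BGREWRITEAOF child process exists (copy-on-write).
static bool needs_expand(const struct ht_stub *ht, bool child_process_running) {
    if (ht->size == 0) return true; // nothing allocated yet (sketch-only guard)
    double load_factor = (double)ht->used / ht->size;
    double threshold = child_process_running ? 5.0 : 1.0;
    return load_factor >= threshold;
}

// Sketch: should the table be shrunk? Triggered when the load factor drops below 0.1.
static bool needs_shrink(const struct ht_stub *ht) {
    if (ht->size == 0) return false;
    double load_factor = (double)ht->used / ht->size;
    return load_factor < 0.1;
}

int main(void) {
    struct ht_stub ht = { .size = 4, .used = 6 };  // load factor 1.5
    printf("expand (no child): %d\n", needs_expand(&ht, false)); // 1
    printf("expand (child):    %d\n", needs_expand(&ht, true));  // 0
    printf("shrink:            %d\n", needs_shrink(&ht));        // 0
    return 0;
}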

1.6 Progressive Rehash

Expanding or shrinking a hash table requires rehashing all key-value pairs in ht[0] into ht[1], but this rehash is not done all at once in a single, centralized step; it is done gradually, over multiple steps. The reason is that when the hash table holds a very large number of key-value pairs, say millions, tens of millions, or even hundreds of millions, rehashing all of them into ht[1] in one go would require so much computation that the server might stop serving requests for a period of time.

Therefore, to avoid any impact of rehashing on server performance, the server migrates the key-value pairs in ht[0] to ht[1] gradually, over many steps, rather than all at once; a rough sketch of what one such step involves is given below.
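This is a simplified sketch only, assuming the dict, dictht, and dictEntry definitions from the earlier sections are in scope; it is not Redis's actual dictRehash, and error handling and table sizing are omitted. Each call migrates the entries in one ht[0] bucket into ht[1] and advances rehashidx.

#include <stdlib.h>

// Simplified sketch of a single progressive-rehash step.
static void rehash_one_bucket(dict *d) {
    if (d->rehashidx == -1) return;            // no rehash in progress

    // Skip empty buckets until one with entries is found.
    while ((unsigned long)d->rehashidx < d->ht[0].size &&
           d->ht[0].table[d->rehashidx] == NULL) {
        d->rehashidx++;
    }

    if ((unsigned long)d->rehashidx < d->ht[0].size) {
        // Move every node in this bucket to its new index in ht[1].
        dictEntry *entry = d->ht[0].table[d->rehashidx];
        while (entry) {
            dictEntry *next = entry->next;
            unsigned long idx = d->type->hashFunction(entry->key) & d->ht[1].sizemask;
            entry->next = d->ht[1].table[idx];  // head insertion, as in section 1.4
            d->ht[1].table[idx] = entry;
            d->ht[0].used--;
            d->ht[1].used++;
            entry = next;
        }
        d->ht[0].table[d->rehashidx] = NULL;
        d->rehashidx++;
    }

    // When ht[0] is empty, the migration is complete: ht[1] becomes the new ht[0].
    if (d->ht[0].used == 0) {
        free(d->ht[0].table);
        d->ht[0] = d->ht[1];
        d->ht[1].table = NULL;
        d->ht[1].size = d->ht[1].sizemask = d->ht[1].used = 0;
        d->rehashidx = -1;
    }
}

The following is the detailed process for progressively rehashing a hash table: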