
In the previous section we covered the Redis Cluster architecture and design concepts and built a cluster from scratch, which gave us a preliminary understanding of Redis Cluster. This section analyzes the details of Redis Cluster data sharding in depth, to help you understand and use it better.

Data sharding mechanism

Data sharding

Unlike standalone Redis and Sentinel mode, where a single node manages the entire key space, Redis Cluster adopts a hash slot mechanism similar to consistent hashing: multiple master nodes share the work of managing all keys.

Redis Cluster uses the CRC16 algorithm to distribute the key space across 16384 hash slots, numbered 0 to 16383. Each group of master and slave nodes is responsible for only part of the hash slots. The mapping between hash slots and nodes is maintained in the cluster state and is updated as the cluster runs. In the example cluster from the previous section, the relationship between hash slots and nodes is as follows:

  • Master[0] is responsible for Slots: 0-5460
  • Master[1] is responsible for Slots: 5461-10922
  • Master[2] is responsible for Slots: 10923-16383

Whenever we perform an operation on a key through Redis Cluster, the node receiving the request first computes the hash slot for the key, then looks up the node responsible for that slot in the slot-to-node mapping. If the slot belongs to the receiving node itself, the request is processed directly; otherwise, the client is told via a redirection error to connect to the correct node.

HASH_SLOT = CRC16(key) mod 16384
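The CRC16 variant used here is CRC-16/XMODEM (polynomial 0x1021, initial value 0), and since 16384 is a power of two, mod 16384 is equivalent to a bitwise AND with 16383. A minimal C sketch of the formula above (ignoring hash tags for now; the hash_slot name is ours, not Redis's):

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* CRC-16/XMODEM, the CRC16 variant used by Redis Cluster
 * (polynomial 0x1021, initial value 0). Redis itself ships a
 * table-driven version; this bitwise form computes the same values. */
uint16_t crc16(const char *buf, int len) {
    uint16_t crc = 0;
    for (int i = 0; i < len; i++) {
        crc ^= (uint16_t)((unsigned char)buf[i]) << 8;
        for (int j = 0; j < 8; j++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* HASH_SLOT = CRC16(key) mod 16384; "& 16383" gives the same result
 * because 16384 is a power of two. */
unsigned int hash_slot(const char *key) {
    return crc16(key, (int)strlen(key)) & 16383;
}
```

For example, hash_slot("123456789") returns 12739, since the CRC-16/XMODEM check value of "123456789" is 0x31C3.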

Because of the data sharding mechanism, different keys may be stored on different nodes, so some multi-key commands of standalone Redis cannot be fully supported: keys with different hash slots end up on different nodes, and a command spanning multiple nodes would be slow and hard to make atomic. Redis Cluster therefore implements all single-key commands available in standalone Redis, while complex operations involving multiple keys, such as set unions and intersections, are available only when all the keys involved are in the same hash slot.

However, in practice a single command does sometimes involve multiple keys. To address this, Redis Cluster provides hash tags, which meet this requirement to a certain extent.

Hash tags

Redis Cluster provides hash tags to force multiple keys into the same hash slot. With a hash tag, only the substring between { and } in the key is used to compute the hash slot. For example, for the key {abcd}test, only abcd participates in the slot calculation, so {abcd}test and {abcd}prod are stored in the same hash slot. However, a key entered by the client may contain multiple { or } characters, in which case Redis Cluster applies the following rules:

  • The key contains a { character and a } character to the right of it;
  • At least one character exists between { and }.

If both conditions are met, Redis Cluster hashes the content between { and } as the effective key; otherwise the whole original key is used. Note that the matching of { and } follows the leftmost-match principle. Here are some examples:

  • {user1000}.following and {user1000}.followers: both hash user1000;
  • foo{}{bar}: the whole key foo{}{bar} is hashed, since nothing sits between the first { and };
  • foo{{bar}}zap: the substring {bar is hashed (the content between the first { and the first } after it);
  • foo{bar}{zap}: the substring bar is hashed (leftmost match).
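The rules above can be sketched as a simplified C re-implementation (modeled on the logic of Redis's keyHashSlot, not copied from the original source): take the leftmost {, then the first } after it; if both exist and the tag between them is non-empty, hash only the tag.

```c
#include <stdint.h>
#include <string.h>

/* CRC-16/XMODEM, the CRC16 variant used for slot hashing. */
static uint16_t crc16(const char *buf, int len) {
    uint16_t crc = 0;
    for (int i = 0; i < len; i++) {
        crc ^= (uint16_t)((unsigned char)buf[i]) << 8;
        for (int j = 0; j < 8; j++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* Hash-tag extraction: leftmost '{', then the first '}' after it.
 * If either is missing, or the tag is empty ("{}"), the whole key
 * is hashed; otherwise only the tag between the braces is hashed. */
unsigned int key_hash_slot(const char *key, int keylen) {
    int s, e;
    for (s = 0; s < keylen; s++)
        if (key[s] == '{') break;
    if (s == keylen)                 /* no '{': hash the whole key */
        return crc16(key, keylen) & 16383;
    for (e = s + 1; e < keylen; e++)
        if (key[e] == '}') break;
    if (e == keylen || e == s + 1)   /* no '}' after it, or empty "{}" */
        return crc16(key, keylen) & 16383;
    return crc16(key + s + 1, e - s - 1) & 16383;
}
```

With this function, key_hash_slot("{user1000}.following", 20) and key_hash_slot("{user1000}.followers", 20) return the same slot, while foo{}{bar} hashes the whole key and foo{{bar}}zap hashes only {bar.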

Resharding

When nodes in the cluster are under too much pressure, we consider scaling out so that new nodes take over some hash slots from existing nodes. When the load across the cluster is unbalanced, we consider moving some hash slots from heavily loaded nodes to lightly loaded ones.

Redis Cluster supports adding and removing nodes, and moving hash slots out of and into nodes, all without downtime. This dynamic scaling capability is valuable in production. For example, in an e-commerce scenario, daily traffic is relatively stable, and allocating resources on demand to keep a safe watermark is enough; during a big promotion with heavy traffic, we can add new resources to scale service capacity horizontally without stopping machines or affecting the business. Both operations are forms of resharding, also called live reconfiguration. Let's analyze how Redis implements it.

From the cluster-state data structure, we know that the hash slot allocation is actually an array: the array index is the hash slot number, and the value is the node responsible for that slot. In theory, redistributing hash slots simply means changing the node object at the corresponding array index, and then reaching eventual consistency across all nodes through state propagation. In the figure below, the node responsible for hash slot 1001 is changed from 7000 to 7001. In practice, more aspects need to be considered to implement this process.
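The idea can be modeled with a plain array, where reassigning a slot is just an index update. This is a toy illustration of the concept, not Redis's actual clusterState structure:

```c
#include <stddef.h>

#define CLUSTER_SLOTS 16384

/* Toy model of the cluster-state slot table: slots[i] holds the port
 * of the master node that owns hash slot i (0 = unassigned). */
static int slots[CLUSTER_SLOTS];

/* Assign a contiguous range of slots to one node. */
void assign_range(int from, int to, int node) {
    for (int i = from; i <= to; i++)
        slots[i] = node;
}

/* Reassigning one slot is just an index update; cluster state
 * propagation then spreads the new owner to every node. */
void reassign_slot(int slot, int node) {
    slots[slot] = node;
}

int slot_owner_of(int slot) {
    return slots[slot];
}
```

With the example cluster's layout loaded (0-5460 on 7000, 5461-10922 on 7001, 10923-16383 on 7002), calling reassign_slot(1001, 7001) moves slot 1001 from node 7000 to node 7001 while all other slots keep their owners.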

As we know, a key's hash slot is computed with CRC16. A hash slot is only a virtual grouping used to place keys on real nodes, and every operation ultimately comes back to keys. When responsibility for a hash slot moves from an old node to a new one, the existing keys on the old node must be migrated as well: all keys in that slot need to be transferred to the new node.

However, regardless of how many keys are in the slot or how much data each key holds, migrating keys from one node to another takes time and must be atomic. Moreover, client requests keep arriving during resharding, and Redis must respond to them correctly so that clients are unaffected.

Next, we use the sample cluster to practice resharding, and analyze Redis's implementation in depth alongside the source code. The following example migrates two hash slots from node 7002 to node 7000:

  • Run the command redis-cli --cluster reshard 127.0.0.1:7000 to initiate a resharding request against the cluster;
  • redis-cli prints the cluster's current hash slot allocation and asks how many slots to migrate: "How many slots do you want to move (from 1 to 16384)?" Enter 2 and press Enter;
  • redis-cli asks which node should receive the migrated slots: "What is the receiving node ID?" Enter the ID of node 7000 and press Enter;
  • redis-cli asks for the source of the slots: entering all takes slots evenly from all other master nodes; alternatively, enter source node IDs line by line and finish with done to migrate slots only from those nodes. Here I entered the ID of node 7002;
  • redis-cli prints the resharding plan, including the source node, target node, and the slots to migrate; enter yes to confirm execution or no to abort;
  • After yes is entered, redis-cli performs the hash slot migration.

The following figure shows the execution process:

The above process corresponds to the clusterManagerCommandReshard function in the file redis-cli.c. The code is long, and our focus is how hash slots are migrated between nodes, so only the slot migration part is shown:

static int clusterManagerCommandReshard(int argc, char **argv) {
    /* omit code */
    int opts = CLUSTER_MANAGER_OPT_VERBOSE;
    listRewind(table, &li);
    // Migrate the hash slots one by one
    while ((ln = listNext(&li)) != NULL) {
        clusterManagerReshardTableItem *item = ln->value;
        char *err = NULL;
        // Migrate the hash slot from source to target
        result = clusterManagerMoveSlot(item->source, target, item->slot,
                                        opts, &err);
        /* omit code */
    }
}

/* Move slots between source and target nodes using MIGRATE. */
static int clusterManagerMoveSlot(clusterManagerNode *source,
                                  clusterManagerNode *target,
                                  int slot, int opts, char **err)
{
    if (!(opts & CLUSTER_MANAGER_OPT_QUIET)) {
        printf("Moving slot %d from %s:%d to %s:%d: ", slot, source->ip,
               source->port, target->ip, target->port);
        fflush(stdout);
    }
    if (err != NULL) *err = NULL;
    int pipeline = config.cluster_manager_command.pipeline,
        timeout = config.cluster_manager_command.timeout,
        print_dots = (opts & CLUSTER_MANAGER_OPT_VERBOSE),
        option_cold = (opts & CLUSTER_MANAGER_OPT_COLD),
        success = 1;
    if (!option_cold) {
        // Set the hash slot on the target node to importing
        success = clusterManagerSetSlot(target, source, slot,
                                        "importing", err);
        if (!success) return 0;
        // Set the hash slot on the source node to migrating
        success = clusterManagerSetSlot(source, target, slot,
                                        "migrating", err);
        if (!success) return 0;
    }
    // Migrate the keys in the hash slot
    success = clusterManagerMigrateKeysInSlot(source, target, slot, timeout,
                                              pipeline, print_dots, err);
    if (!(opts & CLUSTER_MANAGER_OPT_QUIET)) printf("\n");
    if (!success) return 0;
    /* Set the new node as the owner of the slot in all the known nodes. */
    /* Inform every node in turn that responsibility for the slot changed */
    if (!option_cold) {
        listIter li;
        listNode *ln;
        listRewind(cluster_manager.nodes, &li);
        while ((ln = listNext(&li)) != NULL) {
            clusterManagerNode *n = ln->value;
            if (n->flags & CLUSTER_MANAGER_FLAG_SLAVE) continue;
            // Send the CLUSTER SETSLOT command to the node
            redisReply *r = CLUSTER_MANAGER_COMMAND(n,
                "CLUSTER SETSLOT %d %s %s", slot, "node", target->name);
            /* omit code */
        }
    }
    /* Update the node logical config */
    if (opts & CLUSTER_MANAGER_OPT_UPDATE) {
        source->slots[slot] = 0;
        target->slots[slot] = 1;
    }
    return 1;
}

The clusterManagerCommandReshard function first builds the list of hash slots to migrate according to the cluster's slot allocation and the migration plan, then uses clusterManagerMoveSlot to migrate them one by one. clusterManagerMoveSlot is the core method of slot migration and consists of several steps. You can follow the schematic diagrams together with the description (the source node is at the top of each picture, the target node at the bottom). The figures show how the cluster state changes as hash slot 1000 is migrated from node 7000 to node 7001. The steps are as follows:

  • Modify the migration state on the source and target nodes (the first figure):
    • Notify the target node to set the specified slot to the importing state;
    • Notify the source node to set the specified slot to the migrating state;
  • Migrate the keys in the slot from the source node to the target node (the second and third figures; this step may be time-consuming, because the node's command-processing thread is occupied while keys are moved):
    • Use CLUSTER GETKEYSINSLOT <slot> <count> to fetch the keys in the slot from the source node;
    • Use the MIGRATE command to move the keys to the target node one by one; each key is migrated atomically, blocking both nodes for its duration;
  • Notify all nodes to record the new owner of the slot, and clear the importing and migrating states on the target and source nodes (the fourth figure).

Ok, so that’s the resharding process.

Redirection

Data sharding distributes keys across different nodes, and with resharding or failover, the mapping between hash slots and nodes changes. So when a client sends a request to a node that cannot handle it, how do the cluster node and the client deal with it? Let's look at the two redirection mechanisms.

MOVED redirection

Because of the data sharding mechanism, each node in the Redis cluster is responsible for only part of the hash slots, that is, for the storage and management of only part of the keys. The client can send a command request to any node in the cluster. The receiving node computes the hash slot for the requested key and looks up the responsible node in the slot-to-node mapping. Based on the result, Redis does one of the following:

  • If the current node is responsible for the key, it executes the command immediately;
  • If another node is responsible for the key, the node returns a MOVED error to the client.

For example, connect to port 7000 with redis-cli in the normal (non-cluster) way, then execute GET TestKey:

$ redis-cli -p 7000
127.0.0.1:7000> GET TestKey
(error) MOVED 15013 127.0.0.1:7002

The result tells us that TestKey maps to hash slot 15013, which should be handled by node 7002. From the MOVED error the client can parse the IP and port of the responsible node, connect to it, and execute the command again. Testing this gives:

$ redis-cli -p 7002
127.0.0.1:7002> GET TestKey
(nil)

Why is that?

Because each Redis Cluster node keeps the mapping between hash slots and nodes, when a requested key is not within the current node's responsibility, the node does not proxy the request to the target node; instead it tells the client, via an error, which node it believes is responsible for the key. Of course, if a hash slot migration is in progress, the node's information may not be accurate, and the client may receive a MOVED or ASK error.

Therefore, the client needs redirection capability so that it can promptly connect to the correct node and reissue the command. If commands constantly bounced between the client and nodes in redirects, performance would be much worse than in standalone Redis.

What to do? Redis officially suggests two caching strategies for clients:

  • Before executing a request, the client computes the hash slot from the key. If the currently connected node can handle it, the client saves the mapping from that slot to the node (IP and port). If a redirect occurs, the client connects to the new node and retries until the command succeeds, then records the slot-to-node relationship. From then on the client can consult this cache first before sending a request, improving efficiency;
  • Query the cluster state with the CLUSTER NODES command to obtain the full mapping between hash slots and nodes and cache it locally on the client. On each request, compute the key's slot, look up the responsible node in the cache, and send the request there, which is even more efficient.

Most of the time the cluster runs stably, and the above approaches greatly improve command execution efficiency. However, because resharding may occur while the cluster is running, the client's cached information can become stale, so when the node responsible for a hash slot changes, the client should correct its cache in time.
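The MOVED handling a client performs can be sketched as: parse the error, update the local slot cache, then reconnect and retry. Below is an illustrative C fragment; the names (slot_owner, handle_moved) are made up for this sketch, not from any real client library:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical client-side slot cache mapping slot -> "ip:port". */
static char slot_owner[16384][32];

/* Parse an error reply such as "MOVED 15013 127.0.0.1:7002" and
 * update the local cache. Returns the slot number, or -1 if the
 * reply is not a well-formed MOVED error. */
int handle_moved(const char *reply) {
    int slot, port;
    char ip[24];
    if (sscanf(reply, "MOVED %d %23[^:]:%d", &slot, ip, &port) != 3)
        return -1;
    if (slot < 0 || slot > 16383)
        return -1;
    snprintf(slot_owner[slot], sizeof(slot_owner[slot]),
             "%s:%d", ip, port);
    /* A real client would now (re)connect to ip:port and retry
     * the original command there. */
    return slot;
}
```

For instance, handle_moved("MOVED 15013 127.0.0.1:7002") stores "127.0.0.1:7002" as the owner of slot 15013, so later requests for that slot can skip the redirect entirely.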

redis-cli handles MOVED redirection automatically when started in cluster mode. Connect to node 7000 as a cluster client (with -c) and run the same command; the result looks like this:

$ redis-cli -c -p 7000
127.0.0.1:7000> GET TestKey
-> Redirected to slot [15013] located at 127.0.0.1:7002
(nil)
127.0.0.1:7002>

The request went to node 7000, but after receiving its reply the client automatically connected to 7002 and re-executed the request.

What if the client requests a key (say, one whose slot is 1000) from a node while that slot is being resharded? With that question in mind, let's look at ASK redirection.

ASK redirection

During resharding, some keys in a hash slot are still on the source node while others have already been migrated to the target node. If the client sends a command to the source node at this point (especially a multi-key command), MOVED redirection cannot work correctly. The following figure shows the cluster state; let's analyze it:

To fully explain the ASK redirection process, the commands in this part involve multiple keys in the same hash slot, such as {test}1 and {test}2, referred to below simply as keys, and we assume the hash slot corresponding to test is 1000.

As mentioned above, when a client requests keys from a node, the node computes the hash slot for the keys, finds the responsible node in the slot-to-node mapping, and then either executes the request immediately or returns a MOVED error.

However, if the cluster is in the middle of resharding, the keys requested by the client may or may not have been migrated yet. Let's see what happens:

  • Not yet migrated: the client sends the request to node 7000 (directly or via a MOVED redirect); node 7000 determines that it should handle the request itself, and since the keys are still stored locally, it processes them normally;
  • Fully or partially migrated: the client sends the request to node 7000 (directly or via a MOVED redirect); node 7000 determines that it should handle the request itself, but the keys have been fully or partially moved to node 7001, so node 7000 cannot find them all and cannot process the request properly.

Therefore, the MOVED redirect is not appropriate in this case. To this end, Redis Cluster introduces ASK redirection. Let’s take a look at how ASK redirection works.

The client sends a request for keys to node 7000 based on its locally cached slot-to-node mapping. Depending on the migration progress, node 7000 handles the request as follows:

  • Hash slot 1000 is still served by node 7000:
    • If slot 1000 is not being migrated, node 7000 executes the current request and returns the result;
    • If slot 1000 is being migrated but none of the requested keys have been moved yet, node 7000 can still execute the request and returns the result;
    • If the requested keys have been fully or partially migrated to node 7001, node 7000 tells the client which node to ask via an ASK redirection error, in the format:

-ASK <slot> <ip>:<port>
# For our example, the error is:
(error) ASK 1000 127.0.0.1:7001

  • When the client receives the ASK redirection error, it sets a one-time flag forcing the next request for hash slot 1000 to go to the new node 7001, and then:
    • Sends the ASKING command to node 7001, which consumes the one-time flag;
    • Then sends the command it actually needs to run;
  • After receiving the ASKING request, node 7001 behaves as follows:
    • If it is importing hash slot 1000 and all the requested keys have already arrived, it executes the request and returns the result;
    • If it is importing hash slot 1000 but the requested keys have not been completely imported, it returns a TRYAGAIN error.

In this way, if the keys requested by the client are being migrated, the node returns an ASK redirection error and the client issues the request to the new node. Of course, there is some probability that the request still fails because the keys have not finished migrating; in that case the node replies TRYAGAIN and the client can retry later.

Once migration is complete, the client will receive a MOVED redirection error from the node, indicating that ownership of the hash slot has been transferred to the new node. The client can then update its local slot-to-node mapping and, following the MOVED redirection logic, send subsequent requests to the new node.

MOVED and ASK redirection

After the previous sections, you should have a sense of the difference between the two; here is a brief summary:

  • Both tell the client, via an error, to send the request to another node;
  • MOVED redirection: based on the slot-to-node mapping, it tells the client which node is now responsible for the hash slot. On receiving this error, the client can directly update its local slot-to-node cache; this reflects a relatively stable state;
  • ASK redirection: it tells the client that the hash slot for the requested keys is being migrated to a new node, that the current node cannot complete the request, and that the client should ask the new node. On receiving this error, the client performs a one-time (ASKING) redirect to the new node; the error does not affect subsequent requests for the same hash slot unless another ASK redirection error is received.
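The behavioral difference can be sketched as a small client-side dispatcher: MOVED updates the slot cache permanently, while ASK triggers a one-shot ASKING retry without touching the cache. This is an illustrative sketch; the type and function names are made up:

```c
#include <stdio.h>
#include <string.h>

typedef enum { REDIR_NONE, REDIR_MOVED, REDIR_ASK } redir_kind;

typedef struct {
    redir_kind kind;
    int slot;
    char addr[32];   /* "ip:port" of the node to contact next */
} redirect;

/* Classify an error reply. MOVED means "update your slot cache and
 * go there from now on"; ASK means "send ASKING plus this one
 * request there, but keep your cache unchanged". */
redirect parse_redirect(const char *reply) {
    redirect r = { REDIR_NONE, -1, "" };
    int slot, port;
    char ip[24];
    if (sscanf(reply, "MOVED %d %23[^:]:%d", &slot, ip, &port) == 3)
        r.kind = REDIR_MOVED;
    else if (sscanf(reply, "ASK %d %23[^:]:%d", &slot, ip, &port) == 3)
        r.kind = REDIR_ASK;
    else
        return r;
    r.slot = slot;
    snprintf(r.addr, sizeof(r.addr), "%s:%d", ip, port);
    return r;
}
```

A client loop would branch on the returned kind: for REDIR_MOVED, update the cache and resend; for REDIR_ASK, send ASKING followed by the original command to addr, leaving the cache alone.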

Can the cluster provide services during capacity expansion or reduction?

This is a common interview question, and it’s not difficult to answer if you understand the nature of the question. Let’s break it down:

  • Cluster expansion: if a master node is added, it is initially responsible for no hash slots. To spread the load, we must reshard, moving some hash slots to the new node, so expansion is essentially a resharding process. If a slave node is added, it only needs to start master-slave replication with its designated master.
  • Cluster reduction: reduction means removing nodes from the cluster. A master node normally serves reads and writes for some hash slots, so to remove it safely those slots must first be transferred to other nodes, which is again a resharding process. Removing a slave node is simply a removal.

Expanding or shrinking the set of master nodes is essentially resharding, which involves migrating hash slots and therefore the keys inside them. Redis Cluster provides ASK redirection to tell clients what is happening in the cluster so they can adjust: send ASKING requests or retry.

Therefore, on the whole, the cluster can normally provide services during capacity expansion or reduction.

Conclusion

Data sharding is the foundation of Redis Cluster's dynamic scaling. Although this article is somewhat long-winded, the underlying principle is relatively simple. If you focus on understanding the process and nature of resharding, everything else follows naturally.
