Redis Cluster (II): Sending Commands and Failover

Sending commands

In the previous article, the cluster came online once all 16,384 slots had been assigned:

127.0.0.1:7000> cluster info
cluster_state:ok    // cluster state: ok means the cluster is online
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:3
cluster_size:3
cluster_current_epoch:3
cluster_my_epoch:1

The client can now send commands to the nodes in the cluster.

When a client sends a node a command that involves a database key, the receiving node computes which slot the key belongs to and checks whether that slot is assigned to itself:

If the slot is assigned to the current node, the node executes the command directly.
If the slot is not assigned to the current node, the node returns a MOVED error that redirects the client to the correct node, and the client resends the command there.

Compute which slot the key belongs to

CLUSTER KEYSLOT <key>

127.0.0.1:7000> cluster keyslot "date"
(integer) 2022

This shows that the key "date" belongs to slot 2022.
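As a rough illustration of what CLUSTER KEYSLOT computes: Redis Cluster maps a key to a slot by taking the CRC16 of the key (the XMODEM variant) modulo 16384. The Python sketch below reimplements that calculation. The function names crc16_xmodem and key_slot are made up for this example, and hash tags (the {...} syntax) are deliberately left out, so treat it as a simplified model rather than the server's exact code.

```python
NUM_SLOTS = 16384


def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 with polynomial 0x1021 and initial value 0 (XMODEM)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to one of the 16384 slots (hash tags are NOT handled here)."""
    return crc16_xmodem(key.encode("utf-8")) % NUM_SLOTS


if __name__ == "__main__":
    # Compare this value with the redis-cli output for the same key.
    print(key_slot("date"))
```

Running the script prints the slot computed for "date", which can be compared against the CLUSTER KEYSLOT reply shown above.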

Once the node has computed the slot i that the key belongs to, it checks entry i of its own clusterState.slots array to determine whether it is responsible for that slot.
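The lookup described here can be sketched in a few lines of Python. This is only a simplified model: the ClusterState class, its assign and route methods, and the use of address strings as node identifiers are assumptions made for the example, not the actual Redis data structures.

```python
NUM_SLOTS = 16384


class ClusterState:
    """Hypothetical, simplified stand-in for one node's view of the cluster."""

    def __init__(self, myself: str):
        self.myself = myself              # address of the node holding this state
        self.slots = [None] * NUM_SLOTS   # slots[i] = address of the owner of slot i

    def assign(self, owner: str, first: int, last: int) -> None:
        """Record that `owner` handles slots first..last (inclusive)."""
        for i in range(first, last + 1):
            self.slots[i] = owner

    def route(self, i: int) -> str:
        """Decide what this node does with a command whose key is in slot i."""
        owner = self.slots[i]
        if owner is None:
            raise LookupError(f"slot {i} is not assigned to any node")
        if owner == self.myself:
            return "EXECUTE"              # the slot is ours: run the command
        return f"MOVED {i} {owner}"       # otherwise point the client at the owner


if __name__ == "__main__":
    state = ClusterState("127.0.0.1:7000")
    state.assign("127.0.0.1:7000", 0, 5000)
    state.assign("127.0.0.1:7001", 5001, 10000)
    print(state.route(2022))
    print(state.route(6257))
```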

MOVED error

When a node finds that the key's slot is not one it handles, it returns a MOVED error that directs the client to the node that does handle the slot.
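On the wire, a MOVED error carries the slot number and the address of the responsible node, in the form MOVED <slot> <ip>:<port>. A client might pull those fields apart as in the sketch below; parse_moved is a hypothetical helper name, and the sample reply is invented for illustration.

```python
def parse_moved(error: str):
    """Split a 'MOVED <slot> <ip>:<port>' error into (slot, host, port)."""
    kind, slot, addr = error.split()
    if kind != "MOVED":
        raise ValueError(f"not a MOVED error: {error!r}")
    # rpartition keeps everything before the last ':' as the host part.
    host, _, port = addr.rpartition(":")
    return int(slot), host, int(port)


if __name__ == "__main__":
    slot, host, port = parse_moved("MOVED 2022 127.0.0.1:7000")
    # The client would now reconnect to host:port and resend the command.
    print(slot, host, port)
```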

Node database implementation

Cluster nodes store key-value pairs and their expiration times in exactly the same way as a standalone Redis server, with one difference: a node can only use database 0, while a standalone Redis server has no such restriction.

Resharding

A Redis cluster resharding operation can reassign any number of slots from one node (the source node) to another node (the target node); the key-value pairs belonging to those slots are moved from the source node to the target node along with them.

The cluster does not need to go offline during resharding, and both the source node and the target node can continue to process command requests.

Resharding is carried out by redis-trib, Redis's cluster management tool. Redis provides all the commands that resharding needs, and redis-trib performs the resharding by sending those commands to the source and target nodes.
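To make the sequence more concrete, here is a small in-memory sketch of migrating a single slot. The Node class and migrate_slot function are hypothetical; the comments name the cluster commands that, to my understanding, a management tool issues at each step (CLUSTER SETSLOT ... IMPORTING/MIGRATING/NODE and MIGRATE), but nothing here talks to a real server.

```python
class Node:
    """Hypothetical in-memory model of a cluster node."""

    def __init__(self, name: str):
        self.name = name
        self.keys = {}        # slot -> {key: value}
        self.importing = {}   # slot -> name of the node we import from
        self.migrating = {}   # slot -> name of the node we migrate to


def migrate_slot(slot: int, source: Node, target: Node, owners: dict) -> None:
    # 1. Target prepares to import: CLUSTER SETSLOT <slot> IMPORTING <source_id>
    target.importing[slot] = source.name
    # 2. Source prepares to migrate: CLUSTER SETSLOT <slot> MIGRATING <target_id>
    source.migrating[slot] = target.name
    # 3. Move the keys one at a time (MIGRATE in a real cluster).
    src_bucket = source.keys.setdefault(slot, {})
    dst_bucket = target.keys.setdefault(slot, {})
    for key in list(src_bucket):
        dst_bucket[key] = src_bucket.pop(key)
    # 4. Record the new owner: CLUSTER SETSLOT <slot> NODE <target_id>
    del target.importing[slot]
    del source.migrating[slot]
    owners[slot] = target.name


if __name__ == "__main__":
    a, b = Node("A"), Node("B")
    a.keys[2022] = {"date": "2013-12-31"}
    owners = {2022: "A"}
    migrate_slot(2022, a, b, owners)
    print(owners, b.keys[2022])
```

Between steps 2 and 4 the slot's keys are split across the two nodes, which is why both nodes must keep answering requests during the migration.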

ASK error

During resharding, while the source node is migrating a slot to the target node, it can happen that some of the key-value pairs belonging to that slot are still stored on the source node while the rest are already stored on the target node.

When a client sends the source node a command that involves a database key, and that key belongs to a slot that is being migrated:

The source node first looks for the key in its own database and, if it finds the key, executes the command directly.
If the source node does not find the key, the key may already have been migrated to the target node, so the source node returns an ASK error that directs the client to the target node importing the slot, and the client resends the command there.
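The source node's decision can be sketched as follows. The function handle_get_on_source and its arguments are hypothetical, and the reply strings only mimic the ASK <slot> <ip>:<port> error format.

```python
def handle_get_on_source(db: dict, migrating_to: dict, slot: int, key: str):
    """Model of how a source node answers GET for a key in one of its slots.

    db           -- the node's own key-value store
    migrating_to -- slot -> address of the target node importing that slot
    """
    if key in db:
        return ("VALUE", db[key])                        # key still here: execute
    if slot in migrating_to:
        return ("ERROR", f"ASK {slot} {migrating_to[slot]}")  # maybe already moved
    return ("VALUE", None)                               # slot not migrating: key is absent


if __name__ == "__main__":
    db = {"date": "2013-12-31"}
    migrating_to = {2022: "127.0.0.1:7001"}
    print(handle_get_on_source(db, migrating_to, 2022, "date"))
    print(handle_get_on_source(db, migrating_to, 2022, "other-key"))
```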

The difference between ASK and MOVED

A MOVED error means that responsibility for the slot has been transferred from one node to another: after the client receives a MOVED error for slot i, it can send every subsequent command request for slot i directly to the node named in the MOVED error, because that node is now responsible for slot i.
An ASK error is only a temporary measure used by the two nodes while a slot is being migrated: after the client receives an ASK error for slot i, it sends only its next command request for slot i to the node named in the ASK error. The redirection does not affect later requests for slot i, which the client still sends to the node currently responsible for slot i unless another ASK error occurs.
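From the client's side, the difference comes down to whether its local slot-to-node mapping is updated. The sketch below assumes a client that keeps such a mapping in a plain dict; next_target is a hypothetical name. It also relies on a detail not covered in this article: as far as I know, a client precedes the retried command with ASKING when following an ASK redirection.

```python
def next_target(slot_cache: dict, slot: int, reply: str):
    """Return (address to retry at, whether to send ASKING first).

    slot_cache is the client's own slot -> address mapping; it is only
    modified for MOVED, never for ASK.
    """
    kind, _, addr = reply.split()
    if kind == "MOVED":
        slot_cache[slot] = addr   # permanent: remember the new owner
        return addr, False
    if kind == "ASK":
        return addr, True         # one-shot: cache is left untouched
    raise ValueError(f"unexpected reply: {reply!r}")


if __name__ == "__main__":
    cache = {2022: "127.0.0.1:7000"}
    print(next_target(cache, 2022, "ASK 2022 127.0.0.1:7001"), cache)
    print(next_target(cache, 2022, "MOVED 2022 127.0.0.1:7001"), cache)
```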

Replication and failover

The nodes in a Redis cluster are divided into master nodes and slave nodes. A master node is responsible for processing slots, while a slave node replicates a master node and, when the master it replicates goes offline, takes its place and continues to process command requests.

It feels very similar to the master-slave setup of standalone Redis, with one extra step: after the master node goes down, the slave node inherits responsibility for the master's slots.

Set the slave node in the cluster

CLUSTER REPLICATE <node_id>

This command makes the node that receives it a slave of the node specified by node_id and starts replicating that master.

Fault detection

Each node in the cluster periodically sends PING messages to the other nodes in the cluster to check whether they are online. If the node that receives a PING message does not reply with a PONG message within the specified time, the sending node marks the receiving node as suspected offline (probable fail, PFAIL).

If more than half of the master nodes responsible for slots in the cluster report a master node X as suspected offline, X is marked as offline (FAIL), and a FAIL message about X is broadcast to the cluster.
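The two thresholds described above can be written down as a pair of predicates. This is a minimal sketch: is_pfail, is_fail, and the node_timeout parameter are names chosen for the example, and timestamps are plain numbers rather than real clock readings.

```python
def is_pfail(ping_sent_at: float, last_pong_at: float,
             now: float, node_timeout: float) -> bool:
    """True if a PING is still unanswered after node_timeout has elapsed."""
    still_waiting = last_pong_at < ping_sent_at
    return still_waiting and (now - ping_sent_at) > node_timeout


def is_fail(pfail_reporters, masters_with_slots) -> bool:
    """True if more than half of the slot-holding masters report PFAIL.

    Reports from nodes that are not slot-holding masters are ignored.
    """
    masters = set(masters_with_slots)
    votes = len(set(pfail_reporters) & masters)
    return votes * 2 > len(masters)


if __name__ == "__main__":
    print(is_pfail(ping_sent_at=10.0, last_pong_at=5.0, now=26.0, node_timeout=15.0))
    print(is_fail({"A", "B"}, {"A", "B", "C"}))
```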

Failover

After a master node goes offline, its slave nodes begin a failover for it. The steps are as follows:

One of the offline master's slave nodes is selected.
The selected slave node executes the SLAVEOF no one command and becomes the new master node.
The new master node revokes all slot assignments of the offline master and assigns those slots to itself.
The new master node broadcasts a PONG message to the cluster so that the other nodes learn that it has become the new master and has taken over the slots the offline node was responsible for.
The new master node starts accepting command requests for the slots it is responsible for, and the failover is complete.
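The bookkeeping in the middle of this procedure (promotion, slot takeover, announcement) can be modelled on plain dictionaries. The sketch assumes the slave has already been chosen, since the election itself is not described here; fail_over, the roles dict, and the shape of the returned message are all hypothetical.

```python
def fail_over(slot_owner: dict, roles: dict,
              failed_master: str, chosen_slave: str) -> dict:
    """Promote chosen_slave and hand it every slot of failed_master.

    slot_owner -- slot number -> node name
    roles      -- node name -> "master" or "slave"
    Returns a dict standing in for the PONG broadcast to the cluster.
    """
    # The chosen slave stops replicating and becomes a master
    # (SLAVEOF no one in the description above).
    roles[chosen_slave] = "master"
    # All slots of the offline master are reassigned to the new master.
    taken = [slot for slot, owner in slot_owner.items() if owner == failed_master]
    for slot in taken:
        slot_owner[slot] = chosen_slave
    # The announcement other nodes would use to update their own slot tables.
    return {"type": "PONG", "sender": chosen_slave, "slots": taken}


if __name__ == "__main__":
    slot_owner = {0: "A", 1: "A", 2: "B"}
    roles = {"A": "master", "A1": "slave", "B": "master"}
    print(fail_over(slot_owner, roles, "A", "A1"))
    print(slot_owner, roles)
```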

Summary

I have only filled in a small part of the Redis cluster hole so far, and I have not covered cluster messages at all. I will leave that for later, since I need to write a paper first. After that I will start digging into NIO or Spring; I agonize over which one every day 😖