A brief description of the Redis cluster pattern

The officially recommended minimum deployment for cluster mode is six nodes: three Masters and three Slaves, as shown in Figure 00.

Key Slot and forwarding mechanism

Redis divides the key space into 16,384 slots and determines the slot of each key by the following algorithm:

CRC16(key) mod 16384

Since 16384 is 2 to the 14th power, and taking a value modulo a power of two is the same as a bitwise AND with that power of two minus one, the calculation can be optimized to:

CRC16(key) & 16383

When a key contains a hash tag (such as key{sub}1), the slot is calculated only from the string inside the braces, so key{sub}1 and key{sub}2 end up in the same slot.
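
To make the slot rule concrete, here is a minimal Python sketch of the calculation. It assumes the CRC16 variant used by Redis Cluster (CRC-16/XMODEM, polynomial 0x1021, initial value 0) and applies the hash-tag rule described above; it is an illustration, not the actual Redis implementation.

    def crc16_xmodem(data: bytes) -> int:
        # Bitwise CRC-16/XMODEM (polynomial 0x1021, initial value 0).
        crc = 0
        for byte in data:
            crc ^= byte << 8
            for _ in range(8):
                if crc & 0x8000:
                    crc = ((crc << 1) ^ 0x1021) & 0xFFFF
                else:
                    crc = (crc << 1) & 0xFFFF
        return crc

    def key_slot(key: str) -> int:
        # If the key contains a non-empty {...} hash tag, only that part is hashed.
        start = key.find('{')
        if start != -1:
            end = key.find('}', start + 1)
            if end > start + 1:
                key = key[start + 1:end]
        return crc16_xmodem(key.encode()) & 16383   # same as mod 16384

    print(key_slot('key{sub}1') == key_slot('key{sub}2'))   # True: both hash only 'sub'

The results can be cross-checked with the CLUSTER KEYSLOT command on a real node.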

A client can send a command for any slot to any instance in the cluster. If the slot belongs to the contacted instance, the command is processed there; otherwise the instance tells the client which node owns the slot. For example, if a command on a key whose slot belongs to the first Master is sent to the second Master, the reply is:

Return: MOVED slot ip:port (the address of the first Master)
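
The redirection has the form MOVED <slot> <ip>:<port>. The short sketch below parses such a reply so the client knows where to resend the command; the sample address is only illustrative, and real cluster-aware clients also cache the slot-to-node mapping rather than chasing every redirection.

    def parse_moved(reply: str):
        # Parse a redirection reply of the form "MOVED <slot> <ip>:<port>".
        kind, slot, address = reply.split()
        assert kind == 'MOVED'
        host, port = address.rsplit(':', 1)
        return int(slot), host, int(port)

    # The contacted Master does not own the slot, so it points the client at the
    # node that does; the client should reissue the command there.
    print(parse_moved('MOVED 3999 127.0.0.1:6381'))   # (3999, '127.0.0.1', 6381)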

By default, all read/write commands can only be sent to the Master. To use Slave to process read requests, run the readonly command on the client first.
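
As a small illustration, assuming the redis-py client and a Slave listening on a hypothetical address 127.0.0.1:7004, a read-only connection to a Slave could be set up like this:

    import redis   # assumes the redis-py client is installed

    replica = redis.Redis(host='127.0.0.1', port=7004)   # hypothetical Slave address
    replica.execute_command('READONLY')                  # mark this connection as read-only
    print(replica.get('key{sub}1'))                      # reads may now be served by this Slave

If the requested key's slot does not belong to this Slave's Master, the reply is still a MOVED redirection.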

Automatic master/slave switchover mechanism

If a Master fails, one of its Slaves is promoted to take over as Master.

How does the cluster decide that a Master has failed? The cluster configuration includes a heartbeat timeout, cluster-node-timeout. In the scheduled clusterCron function (github.com/redis/redis…), each node randomly selects another node to send a heartbeat to once per second. If no heartbeat response arrives within cluster-node-timeout, the unresponsive node is marked as PFAIL (possibly failed).

If more than half of the Masters in the cluster mark a node as PFAIL, that node's state is upgraded to FAIL.
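
The promotion rule amounts to a majority check. The sketch below illustrates that rule only; the real implementation counts failure reports that nodes exchange in their heartbeat messages.

    def should_mark_fail(reporting_masters: set, total_masters: int) -> bool:
        # PFAIL is upgraded to FAIL once more than half of the Masters
        # report the node as PFAIL.
        return len(reporting_masters) > total_masters // 2

    print(should_mark_fail({'master-1', 'master-2'}, 3))   # True: 2 of 3 is a majority
    print(should_mark_fail({'master-1'}, 3))               # False: no majority yet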

An automatic master/slave switchover is triggered when a Master enters the FAIL state. The switchover likewise involves an election:

  1. When a Master is marked as FAIL, its Slaves detect this in the scheduled clusterCron task. The Slave with the largest replication offset, that is, the one whose synchronization has progressed furthest and whose data is the most recent, gets to try to become the new Master first.
  2. That Slave increments its own currentEpoch by 1 (normally every node in the cluster has the same currentEpoch; each election raises it by 1, and each Master may cast only one vote per epoch), then sends a failover request to all the Masters. If a majority of the Masters agree, the Master/Slave switchover starts (a sketch follows below).
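
A simplified sketch of this election from the candidate Slave's point of view follows; the node dictionaries and the vote callables are illustrative stand-ins for the real cluster failover messages.

    def pick_candidate(slaves):
        # The Slave with the largest replication offset (the most recent data)
        # is the preferred candidate.
        return max(slaves, key=lambda s: s['repl_offset'])

    def try_failover(candidate, masters, current_epoch):
        # The candidate bumps currentEpoch; each Master may vote at most once per epoch.
        current_epoch += 1
        votes = sum(1 for m in masters if m['vote'](current_epoch, candidate))
        # The Slave is promoted only if a majority of the Masters grant their vote.
        return votes > len(masters) // 2, current_epoch

    slaves = [{'id': 's1', 'repl_offset': 100}, {'id': 's2', 'repl_offset': 250}]
    masters = [{'vote': lambda epoch, cand: True} for _ in range(3)]   # all grant their vote
    winner = pick_candidate(slaves)
    promoted, epoch = try_failover(winner, masters, current_epoch=7)
    print(winner['id'], promoted, epoch)   # s2 True 8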

When the cluster becomes unavailable

Based on the above, we can summarize the following conditions under which the cluster becomes unavailable:

  1. If both the Master and the Slave serving a slot are down, any request for a key in that slot fails with an error that the slot cannot be served.
  2. If the cluster has fewer than 3 Masters, or the number of available nodes is even so that a clear majority may be impossible, the FAIL-based automatic master/slave switchover may not work properly: both marking a node as FAIL and electing a new Master require agreement from more than half of the Masters (see the sketch below).
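
For the second condition, a minimal, illustrative check of whether an election can still reach a majority:

    def failover_possible(total_masters: int, alive_masters: int) -> bool:
        # A new Master needs votes from a majority of all Masters,
        # and only the surviving Masters can vote.
        return alive_masters > total_masters // 2

    print(failover_possible(3, 2))   # True: 2 of 3 Masters can still form a majority
    print(failover_possible(2, 1))   # False: 1 of 2 Masters is not a majority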

References

  1. Official cluster documentation: redis.io/topics/clus…
  2. Source code: github.com/redis/redis