Redis Cluster Principles, Illustrated in Depth


Preface

We have already covered the Sentinel-based Redis high availability architecture and the master-slave architecture built on read/write separation, and we know how a Sentinel cluster performs failover when the Redis master goes down.

To recap, the Sentinel cluster monitors the Redis instances in the master-slave architecture. If the master is found to be down, one Sentinel node is elected to carry out the failover: it picks one of the original slaves and promotes it to master, and the remaining slaves then replicate from the newly promoted master.

You might think this is good enough, even for a production environment. So why do we need Redis Cluster?

Why Redis Cluster

True, replication guarantees a copy of the data, and failover happens automatically when the master goes down.

So what’s the problem?

First of all, Redis Sentinel is still built on master-slave replication, and in master-slave replication every slave’s data comes entirely from the master.

If the master node has only 4 GB of memory, a slave node can also hold only 4 GB of data. As the earlier article on Redis master-slave replication explained, the master-slave architecture exists for read/write separation: adding slave nodes scales concurrent read capacity, but write capacity and storage capacity cannot be scaled this way and are capped by what the master node can hold.

So when you only need to store about 4 GB of data, master-slave replication and the Sentinel-based high availability architecture are perfectly adequate.

But what if you’re dealing with massive amounts of data? What about 16 GB, 64 GB, 256 GB or even 1 TB? In today’s Internet businesses, once you are big enough, you will almost certainly have to handle massive amounts of cached data.

This is why we need to introduce Redis Cluster.

What is a Redis Cluster

Now that we know why the Redis Cluster is needed, we can explore it.

So what is a Redis Cluster?

Quite simply, you can think of it as n master-slave groups combined to serve the outside world. A Redis Cluster needs at least three masters to form a cluster, and each master should have at least one slave node so that it can be failed over.

Thus, if one master-slave group can store 32 GB of data and the cluster contains three such groups, the entire cluster can store 96 GB of data.

We know that in the master-slave architecture we can increase read concurrency by adding slave nodes. How does this work in Redis Cluster? Although a slave node is attached to each master, by default both read and write requests in Redis Cluster are served by the master.

The slave node only serves as a data backup. When the master goes down, the slave is promoted to master and takes over serving requests.

Load balancing across nodes

Now that we know what a Redis Cluster is, we can continue our discussion.

You may have wondered: with so many master nodes, which one should a key be stored on? Load balancing of this kind is usually done with a hash algorithm. So how does that work?

First compute a hash value for the key, then take that hash modulo the number of masters. This spreads the keys evenly across the Redis nodes. That is a simple hash-based implementation.

Does Redis Cluster use the hash algorithm above? The answer is no.

Redis Cluster actually selects nodes with an algorithm similar to consistent hashing. So why not use the simple hash algorithm for node selection? And why only “similar”? Let’s move on.

With the simple hash algorithm, if a master goes down, most of the cache is effectively invalidated. Why so much of it? Suppose there were three masters, so the algorithm was hash % 3. When one master goes down, the algorithm becomes hash % 2, and most of the previously stored keys now map to a different node, so lookups for them miss. That is a fatal blow to the database protected behind the cache.
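To make the effect of going from hash % 3 to hash % 2 concrete, here is a minimal Python sketch. The key names and the use of MD5 as the hash function are assumptions made for illustration only; they are not what Redis uses.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Hash a key to a non-negative integer (MD5 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


def node_for(key: str, node_count: int) -> int:
    """Simple modulo hashing: node index = hash(key) % number of masters."""
    return stable_hash(key) % node_count


keys = [f"user:{i}" for i in range(10_000)]  # hypothetical key names
remapped = sum(1 for k in keys if node_for(k, 3) != node_for(k, 2))
remapped_share = remapped / len(keys)
print(f"{remapped_share:.0%} of keys map to a different node after 3 -> 2")
```

With a uniform hash, roughly two thirds of the keys move, because a key keeps its node only when hash % 3 happens to equal hash % 2.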

What is a consistent hash

Now that we know the drawback of balancing load across nodes with a traditional hash algorithm, let’s look at what consistent hashing is.

We mentioned above that simple hashing takes the hash modulo the number of master instances. Consistent hashing instead takes it modulo 2^32, giving values in the range [0, 2^32 - 1]. Consistent hashing treats this range as a ring, and the hash computed for a key falls somewhere on that ring.

The Redis instances are placed on the ring as well. Starting from the key’s position, we walk clockwise and pick the first Redis instance we meet; that instance stores the key. Let’s take an example.

Suppose three Redis instances A, B and C sit on the ring as shown in the figure. If a key’s hash, after taking the modulus, lands at position D, then walking clockwise the first instance we reach is B, so the key is assigned to B. Similarly, if the position is E, the instance selected is A.

Even if Redis instance B fails at this point, the caches of instances A and C will not be affected.

For example, if node B goes down, a key that hashes to position D keeps walking clockwise and finds node C. Traffic that used to go to node B automatically moves to node C, and the data already on nodes A and C is not affected at all.

This is consistent hashing: when we add or remove nodes later, the other nodes keep operating normally.
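The clockwise lookup described above can be sketched in a few lines of Python. This is an illustration under assumptions: the class name `HashRing`, the key names, and the use of MD5 to place nodes and keys on the ring are all invented for the example.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a name to a position on the ring [0, 2^32 - 1]."""
    return int.from_bytes(hashlib.md5(name.encode()).digest()[:4], "big")


class HashRing:
    def __init__(self, nodes):
        # Sorted (position, node) pairs represent the ring.
        self._points = sorted((ring_position(n), n) for n in nodes)

    def remove(self, node):
        self._points = [(p, n) for p, n in self._points if n != node]

    def node_for(self, key):
        positions = [p for p, _ in self._points]
        # First node clockwise from the key; wrap around at the end of the ring.
        i = bisect.bisect_right(positions, ring_position(key))
        return self._points[i % len(self._points)][1]


ring = HashRing(["A", "B", "C"])
keys = [f"key:{i}" for i in range(1000)]  # hypothetical key names
before = {k: ring.node_for(k) for k in keys}
ring.remove("B")  # node B goes down
after = {k: ring.node_for(k) for k in keys}
changed = [k for k in keys if before[k] != after[k]]
# Only keys that used to live on B are reassigned.
print(all(before[k] == "B" for k in changed))
```

Keys owned by A and C keep their owner because their clockwise successor on the ring has not changed.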

Virtual Node Mechanism

But consistent hashing has a small problem of its own, for example when the Redis nodes are distributed on the ring as follows.

Here the probability of data landing on node A is clearly higher than for the other two nodes, and the probability of data landing on node C is the lowest. Data storage across the cluster is then unbalanced: nodes A and B are under heavy pressure while node C’s resources are underused. To solve this problem, the consistent hash algorithm introduces the virtual node mechanism.

Virtual nodes are added to the ring for each real node, together with a mapping from each virtual node back to its real node. Suppose a key lands at position D: the first node found clockwise is the virtual node C#1, so the data actually ends up on node C.

With virtual nodes, the positions of nodes A, B and C around the ring are more even, and the probability of a key landing on each node evens out. This solves the uneven data storage described above, and it is the virtual node mechanism of consistent hashing.
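A rough way to see the effect of virtual nodes is to count how many keys each real node receives with one point per node versus many. The sketch below assumes MD5 for ring placement, the `node#i` naming for virtual nodes, and 200 virtual nodes per real node; all three are illustrative choices.

```python
import bisect
import hashlib
from collections import Counter


def ring_position(name: str) -> int:
    """Map a name to a position on the ring [0, 2^32 - 1]."""
    return int.from_bytes(hashlib.md5(name.encode()).digest()[:4], "big")


class VirtualNodeRing:
    def __init__(self, nodes, vnodes_per_node):
        # Each real node contributes several virtual points such as "C#1".
        points = sorted(
            (ring_position(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes_per_node)
        )
        self._positions = [p for p, _ in points]
        self._owners = [n for _, n in points]  # virtual point -> real node

    def node_for(self, key):
        i = bisect.bisect_right(self._positions, ring_position(key))
        return self._owners[i % len(self._owners)]


keys = [f"key:{i}" for i in range(10_000)]  # hypothetical key names
for vnodes in (1, 200):
    ring = VirtualNodeRing(["A", "B", "C"], vnodes)
    print(vnodes, dict(Counter(ring.node_for(k) for k in keys)))
```

With a single point per node the split depends entirely on where the three points happen to land; with many virtual points the three counts move much closer to one third each.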

What algorithm does Redis Cluster use

As mentioned above, Redis Cluster uses an algorithm similar to consistent hashing. It is only “similar” because the two are implemented in slightly different ways.

For example, consistent hashing takes the modulus over 2^32, whereas Redis Cluster takes it over 2^14 (16384). Redis Cluster divides the key space into 16384 slots. The hash value computed by the CRC16 algorithm is taken modulo 16384, and the result is the key’s slot. Each Redis node is then responsible for a portion of the slots, as shown in the table below.

Node	Slots handled
A	0-5000
B	5001-10000
C	10001-16383
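The key-to-slot-to-node lookup can be sketched as follows. The slot ranges are copied from the table above; the function names are invented for the example. The sketch relies on Python’s `binascii.crc_hqx` with an initial value of 0, which computes the CRC16 (XMODEM) variant, and it deliberately ignores hash tags, so treat it as an illustration rather than a client implementation.

```python
import binascii

SLOT_COUNT = 16384

# Slot ranges from the table above, inclusive on both ends.
SLOT_RANGES = {"A": (0, 5000), "B": (5001, 10000), "C": (10001, 16383)}


def key_slot(key: str) -> int:
    """CRC16 of the key, modulo 16384 (hash tags are not handled here)."""
    return binascii.crc_hqx(key.encode(), 0) % SLOT_COUNT


def owner_of(slot: int) -> str:
    """Find the node whose slot range contains the given slot."""
    for node, (first, last) in SLOT_RANGES.items():
        if first <= slot <= last:
            return node
    raise ValueError(f"slot {slot} is not assigned to any node")


slot = key_slot("foo")
print(slot, owner_of(slot))  # 12182 C
```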

Each Redis instance maintains its own mapping from slots to Redis nodes. Suppose you set a key on node A, but the slot computed from that key’s CRC16 is handled by node B; node A then replies with a MOVED redirection telling you to go to node B.
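The redirect can be modelled as a node checking its own slot map before serving a request. In Redis Cluster the redirect is reported as a MOVED error carrying the slot and the owner’s address. In the sketch below the node addresses are hypothetical and the slot ranges are the ones from the table above.

```python
# Slot ownership as each node would record it (ranges from the table above).
SLOT_OWNERS = {
    "A": range(0, 5001),
    "B": range(5001, 10001),
    "C": range(10001, 16384),
}

# Hypothetical addresses, for illustration only.
ADDRESSES = {"A": "10.0.0.1:6379", "B": "10.0.0.2:6379", "C": "10.0.0.3:6379"}


def handle(node: str, slot: int) -> str:
    """Serve the request locally, or tell the client where the slot lives."""
    if slot in SLOT_OWNERS[node]:
        return "OK"
    owner = next(n for n, slots in SLOT_OWNERS.items() if slot in slots)
    return f"MOVED {slot} {ADDRESSES[owner]}"


print(handle("A", 100))   # OK
print(handle("A", 6000))  # MOVED 6000 10.0.0.2:6379
```

A client that receives the redirect retries the command against the address it was given.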

How does Redis Cluster become highly available

If a master node in a Redis Cluster fails, how does it make the Cluster itself highly available? If the cluster needs to expand nodes at this time, which slots should it be responsible for? Let’s look at one problem at a time.

How can I expand a cluster

When a new node is added to a Cluster, how does it obtain the corresponding slot?

The answer is resharding. A reshard can migrate any number of slots already assigned to one node over to another node. It is carried out by redis-trib, the cluster management tool that ships with Redis (from Redis 5 onward the same functionality is available through redis-cli --cluster). Redis-trib implements the reshard by sending commands to the source node that gives up the slots and to the target node that receives them.

Suppose we need to add a node D to a cluster that already has nodes A, B, and C.

In this case, redis-trib asks nodes A, B, and C to prepare to migrate slots out, and asks node D to prepare to import them. Once both sides are ready, source nodes A, B, and C migrate the key/value pairs in those slots to target node D. Finally, redis-trib announces the slot changes to all master nodes in the cluster.
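The bookkeeping side of a reshard can be sketched as moving entries in a slot-to-node map. This models only who owns which slot afterwards, not the commands or the key migration itself. The function name, the round-robin choice of slots, and the figure of 4096 slots for node D are assumptions made for the example.

```python
from collections import deque


def reshard(slot_owner, sources, target, count):
    """Return a new slot map with `count` slots moved from sources to target.

    Slots are taken from the source nodes in round-robin order so that each
    source gives up roughly the same number.
    """
    new_map = dict(slot_owner)
    owned = {
        s: deque(sorted(slot for slot, n in slot_owner.items() if n == s))
        for s in sources
    }
    moved = 0
    while moved < count:
        progressed = False
        for source in sources:
            if moved == count:
                break
            if owned[source]:
                new_map[owned[source].popleft()] = target
                moved += 1
                progressed = True
        if not progressed:
            raise ValueError("sources do not own enough slots")
    return new_map


# Starting layout: A 0-5000, B 5001-10000, C 10001-16383.
slot_owner = {}
for node, (first, last) in {"A": (0, 5000), "B": (5001, 10000), "C": (10001, 16383)}.items():
    for slot in range(first, last + 1):
        slot_owner[slot] = node

after = reshard(slot_owner, ["A", "B", "C"], "D", 4096)
counts = {n: sum(1 for owner in after.values() if owner == n) for n in "ABCD"}
print(counts)
```

Every slot still has exactly one owner after the reshard, which is what lets clients keep routing by slot while nodes come and go.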

High availability and failover

Redis Cluster’s approach to high availability is essentially the same as Redis Sentinel’s, both in idea and in implementation. If you are interested, you can read my previous article about Sentinel.

In simple terms, if some node believes that node A is down, that is a subjective down. If more than half of the master nodes in the cluster believe that A is down, A is marked as objectively down.

Once node A is marked as objectively down, the cluster starts the failover. The remaining healthy masters vote to select one of node A’s slaves and switch it to be the new master. A slave wins the election when it receives votes from more than half of the masters.

After being elected, the new master executes SLAVEOF no one to stop replicating node A and become a master. It then reassigns all the slots handled by node A to itself and broadcasts PONG messages to the cluster to announce its latest status.

Under pure consistent hashing, when a node fails, its keys go to the first Redis instance found clockwise along the ring.

In Redis Cluster, by contrast, a key does not care which node it ends up on, only which slot it falls into. However nodes are migrated, you only need to find the key’s slot, then the node currently associated with that slot, and from there the actual Redis instance.
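The step from subjective to objective down is a simple majority check. In this sketch the function name, the node names and the idea of passing the reports in as a dictionary are illustrative assumptions; `total_voters` stands for however many nodes are entitled to an opinion.

```python
def is_objectively_down(reports: dict, total_voters: int) -> bool:
    """reports maps each voter to True if it subjectively considers the target down."""
    down_votes = sum(1 for believes_down in reports.values() if believes_down)
    return down_votes > total_voters // 2


# Five voters, hypothetical names: three of them think node A is down.
reports = {"n1": True, "n2": True, "n3": True, "n4": False, "n5": False}
print(is_objectively_down(reports, total_voters=5))  # True
```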
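The election and slot takeover can be sketched as a vote count followed by a rewrite of the slot map. The names `elect` and `take_over`, the slave names A1 and A2, and the tiny slot map are invented for the example. How many masters count toward the majority is passed in as a parameter rather than assumed.

```python
from collections import Counter


def elect(votes: dict, master_count: int):
    """votes maps each voting master to the slave it voted for.

    Returns the slave that received more than half of the master votes,
    or None if nobody reached a majority.
    """
    for slave, received in Counter(votes.values()).items():
        if received > master_count // 2:
            return slave
    return None


def take_over(slot_owner: dict, failed_master: str, new_master: str) -> dict:
    """Reassign every slot of the failed master to the newly elected one."""
    return {
        slot: (new_master if owner == failed_master else owner)
        for slot, owner in slot_owner.items()
    }


# Masters B and C vote; A1 and A2 are hypothetical slaves of the failed master A.
winner = elect({"B": "A1", "C": "A1"}, master_count=3)
slot_owner = {0: "A", 1: "A", 2: "B", 3: "C"}
print(winner, take_over(slot_owner, "A", winner))
```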

So what is this PONG message? Don’t worry, we’ll get to that.

Learn about the Gossip protocol

This is the protocol used by nodes in the Redis Cluster to exchange data and communicate with each other. It is called Gossip.

- Gossip Girl: Gossip Girl here

The Gossip paper is dated 1989. I have read plenty of sources saying it was published in 1987, but the date printed on the paper is definitely January 1989.

In case you’re interested, the paper is Epidemic Algorithms for Replicated Database Maintenance. The protocol was originally developed to synchronize data between replicated nodes in a distributed database. As the technology developed, gossip came to be widely used for information dissemination, failure detection and so on.

Redis Cluster uses Gossip to spread its own information. How exactly do you communicate using Gossip?

It’s very simple, just like in the picture. Every second, each Redis node sends a PING to another node, and the node that was pinged replies with a PONG.

Gossip protocol message type

In a Redis Cluster, there are five types of messages between nodes: MEET, PING, PONG, FAIL, and PUBLISH. What do these messages say? Let me summarize this briefly.

Message type	Message content
MEET	Sent to a node to ask the node receiving it to join the cluster
PING	Every second, a node samples five random nodes and sends a PING to the one it has gone longest without hearing from, to check whether that node is online. In addition, if the time since it last heard from some node exceeds half of `cluster-node-timeout`, it sends that node a PING immediately, so that data exchange is never delayed for too long
PONG	When a node receives a MEET or PING message, it returns a PONG message to the sender to confirm receipt. Nodes can also broadcast their own information to the cluster through PONG messages so that other nodes learn their latest attributes, as the new master does after a failover
FAIL	Broadcasts a node’s judgment that node A is down. When the current node judges node A to be down, it immediately broadcasts this to the Redis Cluster, and every node that receives the message marks node A as down
PUBLISH	When a node receives a PUBLISH command, it broadcasts a PUBLISH message across the cluster so that clients receive messages on the channel regardless of which node they are connected to
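The two PING rules in the table can be sketched as a target-selection function. This is one reading of those rules, stated as an assumption: the node samples five random peers and pings the one it has gone longest without a PONG from, and additionally pings any peer that has been silent for more than half of the node timeout. The function name, the node names and the timestamps are hypothetical.

```python
import random


def pick_ping_targets(last_pong, now, node_timeout, rng, sample_size=5):
    """Return the set of peers to PING in this one-second tick.

    last_pong maps each peer to the time we last received a PONG from it.
    """
    targets = set()
    if last_pong:
        # Rule 1 (assumed): sample a few random peers, ping the stalest one.
        sample = rng.sample(sorted(last_pong), min(sample_size, len(last_pong)))
        targets.add(min(sample, key=last_pong.get))
    # Rule 2: ping anyone silent for longer than half the node timeout.
    for peer, heard_at in last_pong.items():
        if now - heard_at > node_timeout / 2:
            targets.add(peer)
    return targets


last_pong = {"n1": 0.0, "n2": 9.0, "n3": 9.5}  # seconds, hypothetical
print(pick_ping_targets(last_pong, now=10.0, node_timeout=15.0, rng=random.Random(0)))
```

Here n1 is both the stalest peer in the sample and past the half-timeout threshold, so it is the only target.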

The pros and cons of using Gossip

Since Redis Cluster chose Gossip, the protocol must have some advantages.

Advantage	Description
Scalability	Nodes can be added to or removed from the network at will, and the state of a newly added node will eventually match that of the other nodes
Fault tolerance	Since each node holds a complete copy of the metadata, the failure of any single node does not affect the operation of Gossip
Robustness	Similar to fault tolerance: because all nodes hold the data, the design is decentralized, and no single node can bring down the service
Eventual consistency	When new information needs to be transmitted, it spreads quickly to all nodes, so every node eventually holds the latest data

Gossip can propagate a piece of information to all nodes in O(log N) rounds. With each PING, the current node sends its own information along with information about roughly 1/10 of the nodes in the cluster. You can think of the model like this:

You share a particularly interesting article with your circle of friends; they all like it and share it with theirs, so it goes from one to ten, from ten to a hundred, and so on. This is viral spread through your circle of friends.

Of course, Gossip still has some drawbacks. For example, a message may take many rounds to reach a particular node, which can mean significant latency. Also, because each node picks its PING targets from a random sample of nodes, a node may receive the same information repeatedly from several senders.
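The viral-spread picture can be checked with a small simulation. This is a simplified push model, not the Redis implementation: in each round, every node that already knows the news tells `fanout` randomly chosen nodes. The function name, the node count of 1000 and the fixed seed are assumptions for the example.

```python
import random


def rounds_to_spread(node_count: int, fanout: int, seed: int = 0) -> int:
    """Count rounds until every node has heard the news."""
    rng = random.Random(seed)
    informed = {0}  # node 0 starts with the news
    rounds = 0
    while len(informed) < node_count:
        newly_informed = set()
        for _ in informed:
            # Each informed node picks `fanout` random nodes to tell.
            newly_informed.update(rng.sample(range(node_count), fanout))
        informed |= newly_informed
        rounds += 1
    return rounds


for fanout in (1, 3):
    print(fanout, rounds_to_spread(1000, fanout))
```

The number of rounds grows roughly with the logarithm of the node count, because the set of informed nodes can multiply in size every round until it saturates.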

Conclusion

In general, Redis Cluster combines Redis’s master-slave architecture with the ideas behind Sentinel. Its high availability mechanism, and the way it detects failures and performs failover, all build on master-slave replication and Sentinel, which is why I said in a previous article that master-slave replication is the cornerstone of Redis high availability architecture.
