I have maintained thousands of Redis instances built on a simple master-slave structure, with sharding handled mainly by a client-side JAR. At first I did not like Redis Cluster much: its routing is too rigid and its operations are complicated. See our earlier article, "Routing rules in real life can be more complex than you think."

But an officially promoted solution is destined to be adopted more and more widely, something you can already sense in everyday conversations. Whatever its shortcomings, it is hard to resist the tide of official endorsement. Now that Redis Cluster has become increasingly stable, it is time to have a heart-to-heart with it.

Introduction

Redis Cluster is Redis's own native clustering solution, and it has made great progress in high availability and stability. From what I have seen and from public statistics, more and more companies and communities are adopting Redis Cluster, and it has become the de facto standard. Its main characteristic is decentralization: there is no proxy layer. One of its main design goals is linear scalability.

The Redis Cluster server alone cannot deliver everything it promises. A Redis cluster in the broad sense includes both the Redis server and a client implementation such as Jedis; they form a whole.

Distributed storage is ultimately about handling shards and replicas. For Redis Cluster, the core concept is the slot: once you understand it, you basically understand how the cluster is managed.

Advantages and disadvantages

Once you understand these features, operations are actually much simpler. Let’s start with the obvious pros and cons.

Advantages

1. No additional Sentinel cluster is required, which gives users a consistent solution and reduces the learning cost.
2. Decentralized architecture with peer nodes; a cluster can support thousands of nodes.
3. The slot concept is abstracted out, and O&M operations are performed on slots.
4. Replicas fail over automatically; no human intervention is needed in most cases.

Disadvantages

1. The client has to cache part of the routing data and implement the Cluster protocol, which is relatively complex.
2. Data is replicated asynchronously, so strong consistency cannot be guaranteed.
3. Resource isolation is difficult and traffic is often unbalanced, especially when multiple services share one cluster. You cannot tell where data lives, so hot data cannot be given special optimization.
4. Replicas are pure cold standbys and cannot serve reads, which is a real waste; extra work is needed to make use of them.
5. Support for multi-key operations and pipelines is limited; be careful when upgrading an old codebase.
6. Data migration is key-based rather than slot-based, which is slow.

Basic principles

Locating a key (key → slot → node) is clearly a two-level routing process.

Routing a key

Redis Cluster has little in common with ordinary consistent hashing; it uses the concept of hash slots instead. When accessing a key, the Redis client first computes a CRC16 value over the key and then takes it modulo 16384.

crc16(key) mod 16384
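If you want to see the result of this calculation without doing it yourself, any cluster node will report the slot for a key. A small sketch, assuming a node on port 7000 and an arbitrary key name; keys that share a hash tag (the part inside {}) map to the same slot:

# Ask the cluster which slot a key hashes to (crc16 mod 16384).
redis-cli -p 7000 CLUSTER KEYSLOT user:1000

# Keys sharing the same hash tag land in the same slot, which is what
# makes multi-key operations on them possible.
redis-cli -p 7000 CLUSTER KEYSLOT {user}:1000
redis-cli -p 7000 CLUSTER KEYSLOT {user}:2000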

How the server side works, briefly

As mentioned above, the Redis Cluster defines a total of 16384 slots, and all cluster operations are encoded around this slot data. The server uses a simple array to store this information.

A bitmap is the most space-efficient way to store this kind of present-or-absent information. Redis Cluster uses an array called slots to record whether the current node owns each slot.

The array is 16384 / 8 = 2048 bytes long, and each bit (0 or 1) indicates whether this node owns the corresponding slot.
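You can inspect the resulting slot assignment from any node; a quick look, assuming the 6-node cluster that is assembled later in this article:

# Show which slot ranges are served by which master (and its replicas).
redis-cli -p 7000 CLUSTER SLOTS

# Count the keys currently stored in a given slot on the node that owns it.
redis-cli -p 7000 CLUSTER COUNTKEYSINSLOT 100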

When all 16384 slots in the database are handled by some node, the cluster is online (ok). Conversely, if any slot is left unhandled, the cluster is offline (fail).

When a client sends a command to a node, the node receiving the command calculates which slot the key belongs to and checks whether the slot is assigned to it. If it is not its own, it directs the client to the correct node.

Therefore, the client can connect to any machine in the cluster to complete the operation.
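For illustration, this is roughly what the redirection looks like from the command line; the slot number and target address are made up and will differ in a real cluster:

# Without cluster mode, the wrong node answers with a MOVED redirection:
redis-cli -p 7000 GET user:1000
(error) MOVED 5474 127.0.0.1:7001

# With -c, redis-cli follows MOVED/ASK redirections automatically:
redis-cli -c -p 7000 GET user:1000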

Install a 6-node cluster

Preparation

Suppose we want to assemble a cluster of 3 shards, each with one replica. That makes 3 * 2 = 6 node instances in total. Redis is started with a configuration file, so all we have to do is prepare the configuration files.

Make six copies of the default configuration file.

for i in {0..5}; do
  cp redis.conf redis-700$i.conf
done

Then modify the configuration files. Taking redis-7000.conf as an example, we need to enable cluster mode.

cluster-enabled yes
port 7000
cluster-config-file nodes-7000.conf

nodes-7000.conf stores cluster state for the current node, so each instance must have its own file.
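A couple of other cluster-related options in redis.conf are worth knowing about. They are not part of the setup above; the values shown are the defaults:

# How long a node can be unreachable before it is considered failing (ms).
cluster-node-timeout 15000

# If set to no, the cluster keeps serving the slots that are still covered
# even when some slots have no owner.
cluster-require-full-coverage yes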

Start & Stop

Again, we use scripts to launch it.

for i in {0..5}; do
  nohup ./redis-server redis-700$i.conf &
done

For demonstration purposes, we simply force-kill them.

ps -ef | grep redis | awk '{print $2}' | xargs kill -9

Assembling the cluster

We use redis-cli to assemble the cluster; Redis does the whole thing automatically by sending a series of commands to each node.

./redis-cli --cluster create 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 --cluster-replicas 1
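After the command completes, it is worth confirming that every slot is covered and the cluster considers itself healthy; a quick sanity check using the ports above:

# Verify slot coverage and replica assignment.
redis-cli --cluster check 127.0.0.1:7000

# cluster_state should be ok and cluster_slots_assigned should be 16384.
redis-cli -p 7000 CLUSTER INFO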

Several advanced principles

Node failure

Each node in the cluster periodically sends PING messages to the other nodes to check whether they are online. If a node does not reply with PONG within the specified time, the sender marks it as possibly failed (PFAIL).

If more than half of the master nodes report a master X as possibly failed, X is marked as failed (FAIL). The node that marks X as FAIL broadcasts a FAIL message about X to the cluster, and every node that receives it immediately marks X as FAIL as well.

You may notice that this is similar to how ES and ZK judge node failures: a decision requires more than half of the nodes, which is why the number of master nodes is usually odd. Since there is no minimum group-size configuration, a split brain is theoretically possible (I have not encountered one yet).
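You can watch these flags change directly; the command below simply lists each node's view of the cluster, and the flags column is where PFAIL/FAIL show up:

# Each line shows a node's ID, address, role and flags; look for
# "fail?" (suspected, PFAIL) or "fail" (confirmed) in the flags column.
redis-cli -p 7000 CLUSTER NODES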

Master-slave switch

When the replicas of a master notice that it is in the FAIL state, one of them is selected, runs SLAVEOF NO ONE, and becomes the new master.

Once the new master has taken over the failed node's slots, it broadcasts a PONG message to the cluster so that the other nodes learn about the change immediately. In effect it announces: I am now a master, I have taken over the failed node and become its stand-in.

Redis reuses these already-defined commands heavily for cluster management, so they are not just for us to type at the command line; Redis itself relies on them internally.
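The same switchover can also be triggered by hand, which is useful for planned maintenance. A sketch, assuming 7003 is a replica (the command is only accepted on replicas):

# Run on a replica: promote it and demote its master, coordinating with
# the old master so that no acknowledged writes are lost.
redis-cli -p 7003 CLUSTER FAILOVER

# FORCE skips the handshake with an unreachable master (may lose writes).
redis-cli -p 7003 CLUSTER FAILOVER FORCE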

Data synchronization

When a replica connects to the master, it sends a SYNC command. On receiving it, the master starts a background save and, once that finishes, transfers the entire database file to the replica, completing the first full synchronization.

The master then forwards the write commands it receives to its replicas, keeping them in sync. Since Redis 2.8, master/replica replication supports resuming from a breakpoint: if the connection drops during replication, it can continue from where it left off instead of starting over.
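The replication state, and the backlog buffer that makes partial resynchronization possible, can be inspected and tuned; the sizes below are only illustrative values, not recommendations from this article:

# Shows the node's role, its connected replicas and the replication offsets.
redis-cli -p 7000 INFO replication

# In redis.conf: a larger backlog lets a replica survive a longer outage
# and still do a partial resync instead of a full one.
#   repl-backlog-size 64mb
#   repl-backlog-ttl 3600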

Data loss

Replication between Redis cluster nodes is asynchronous; there is no ack concept as in Kafka. Nodes exchange state through the Gossip protocol, and promoting a replica to master is done by voting, which inevitably takes time. So there are easily time windows during a failure in which written data is lost. For example, there are two cases.

1. A write has reached the master but has not yet been replicated to the replica, and the master has already replied OK to the client. If the master crashes at this point, the data is lost. Replying without waiting for replicas spares Redis a lot of trouble, but it is intolerable for systems that demand high data reliability (a partial mitigation is sketched at the end of this section).

2. Because the routing table is cached in the client, it can become stale. If a network partition makes a master unreachable and a replica is promoted, but the original master is actually still alive (no real failure), then a client whose routing table has not been updated will keep writing to the wrong node, and that data is lost.

Therefore, Redis Cluster works well in general, but in extreme cases the loss of some values is a problem that remains unsolved.
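For the first case, the generic WAIT command can narrow the window at the cost of latency; it does not make replication synchronous or strongly consistent, and the output below is only illustrative:

127.0.0.1:7000> SET order:42 paid
OK
127.0.0.1:7000> WAIT 1 100
(integer) 1

WAIT 1 100 blocks the connection until at least one replica has acknowledged the preceding writes, or until 100 ms have passed, and returns the number of replicas that did.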

Complex operation and maintenance

Operating and maintaining Redis Cluster is complicated. Although much has been abstracted away, the process is still not simple. For some commands you need to understand in detail how they are implemented before you can really use them with confidence.

Below are some of the commands used for scaling out. In practice they may have to be issued many times while their progress is monitored, so running them purely by hand is hardly practical.

There are two entry points for operations. The first is to connect to any node with redis-cli and issue commands that start with CLUSTER; most of them operate on slots. When you assemble a cluster, it is exactly these commands that get called over and over under the hood.

The other entry point is redis-cli with the --cluster option plus a subcommand. This mode manages cluster node information, such as adding and removing nodes, and is the recommended one.

The commands Redis Cluster provides are complex and hard to remember, so a management tool such as CacheCloud is recommended.
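redis-cli also carries built-in documentation for all of its --cluster subcommands, which is the quickest reminder of the exact arguments:

redis-cli --cluster help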

Here are a few examples.


By sending the CLUSTER MEET command to node A, the client can ask the receiving node A to add another node B to the CLUSTER where node A is currently located:

CLUSTER MEET  127.0.0.1 7006

Using the Cluster addslots command, you can assign one or more slots to a node.

127.0.0.1:7000> CLUSTER ADDSLOTS 0 1 2 3 4 ... 5000

Set a node as a replica of another:

CLUSTER REPLICATE <node_id>

redis-cli --cluster

redis-trib.rb used to be the official Redis Cluster management tool, but recent versions recommend using redis-cli for these operations instead.

Add a new node to the cluster

redis-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000

Delete a node from a cluster

redis-cli --cluster del-node 127.0.0.1:7006 54abb85ea9874af495057b6f95e0af5776b35a52

Migrate slots to new nodes

redis-cli --cluster reshard 127.0.0.1:7006 --cluster-from 54abb85ea9874af495057b6f95e0af5776b35a52 --cluster-to 895e1d1f589dfdac34f8bdf149360fe9ca8a24eb --cluster-slots 108

There are many similar commands.

reshard: migrate slots between nodes online
rebalance: balance the number of slots across nodes
add-node: add a new node to the cluster
del-node: remove a node from the cluster
set-timeout: set the timeout period for a node
call: run a command on all nodes in the cluster
import: import external Redis data into the cluster
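For example, after adding an empty master you would typically rebalance so that it receives its share of slots; empty nodes are skipped unless you ask for them explicitly. The port is just the one used in this article's examples:

# Move slots around so every master holds roughly the same number,
# including masters that currently hold no slots.
redis-cli --cluster rebalance 127.0.0.1:7000 --cluster-use-empty-masters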

Overview of other options

Master-slave mode

Redis first supported the master-slave (M-S) mode: one master with many replicas. A single Redis instance can reach 100,000+ QPS, but in some high-traffic scenarios that is still not enough. Read/write splitting is then usually applied: add replicas to take reads and reduce the pressure on the master.

Being a master-slave architecture, it faces the synchronization problem. Synchronization in Redis master-slave mode is divided into full synchronization and partial synchronization. When a replica is first created, a full synchronization is unavoidable; once it completes, replication enters the incremental phase. This is no different from Redis Cluster.

This model is stable, but it requires some extra work. Users have to build the master-slave switching themselves, using sentinel-style probes to monitor the health of each instance and then issuing commands to change the replication topology.
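One ready-made way to get that behaviour, rather than writing your own probes, is Redis Sentinel. A minimal sketch with made-up addresses and a master named mymaster; a real deployment needs at least three Sentinel processes:

# On the replica (redis.conf): follow the master.
replicaof 127.0.0.1 6379

# In sentinel.conf: watch the master; 2 Sentinels must agree it is down.
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 30000
sentinel failover-timeout mymaster 180000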

As the cluster grows, the plain master-slave model quickly hits a bottleneck, so client-side hashing is typically used to scale out, including the consistent hashing familiar from memcached.

However, by keeping the sharding configuration in a central place and pushing changes actively, for example through ZooKeeper notifications, the risk can be reduced considerably. The thousands of Redis nodes I have maintained are managed this way.

The proxy pattern

Proxy-based solutions such as Codis were popular before Redis Cluster. The proxy layer presents itself to clients as if it were Redis, receives their requests, and shards and migrates data according to its own routing logic, with no code changes on the business side. Besides smooth scaling, master/replica switchover and failover are also handled in the proxy layer, and the client barely notices. Products of this kind are often called distributed middleware.

In a typical implementation, the Redis setups behind the proxy can even be a mix of different architectures.

Multiple proxies are usually fronted by a load balancer such as LVS. If the proxy layer has only a few machines while the back-end Redis traffic is heavy, the network card becomes the main bottleneck.

Nginx can also act as the proxy layer for Redis, in what is sometimes called a Smart Proxy. It is a somewhat unorthodox approach, but if you are familiar with Nginx, it can be an elegant one.

Usage restrictions and pitfalls

Redis is extremely fast, and as a rule, the faster something is, the bigger the consequences when it goes wrong. Not long ago I wrote a Redis specification, "This is probably the most pertinent Redis specification." Like an architecture, a specification works best when it fits your own company's environment, but it offers some basic ideas.

The strictly forbidden items are generally the pits that predecessors have already stepped in. In addition to that specification, a few points specific to Redis Cluster are added below.

1. Redis Cluster claims to support 1,000 nodes, but you had better not go that far. Once the number of nodes grows past ten or so, you can already feel some jitter in the cluster. A cluster that large proves your business is doing great, so consider client-side sharding instead.

2. Avoid creating hot spots. If all traffic lands on a single node, the consequences are usually severe.

3. Do not put big keys into Redis; they generate large numbers of slow queries and affect normal ones.

4. Unless you are using Redis as storage rather than as a cache, always set an expiration time. A key that squats on memory like a dog in the manger is very unpleasant.

5. Under heavy traffic, do not enable AOF; use RDB only (see the snippet after this list).

6. When operating a Redis cluster, use pipelines and multi-key commands sparingly; they can produce a lot of unpredictable results.
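As a concrete illustration of points 4 and 5, the relevant settings look roughly like this; the values are examples, not tuned recommendations:

# redis.conf: rely on RDB snapshots and keep AOF off under heavy write traffic.
appendonly no
save 900 1
save 300 10

# Give cache keys a TTL when writing them, e.g.:
#   SET session:abc123 "..." EX 3600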

The points above are just a supplement; for the rest, refer to the specification "This is probably the most pertinent Redis specification."

End

Redis's codebase is tiny, so it certainly cannot implement very complex distributed capabilities. Redis is positioned for performance, horizontal scalability and availability, which is enough for simple applications with moderate traffic. Production environments are never trivial, though, and complex, highly concurrent applications are destined to end up with a combination of optimized solutions.