Before Redis 3.0, the Sentinel mechanism was used to monitor the status of nodes and provide high availability.

Redis Cluster is the distributed solution for Redis, officially released in version 3.0, and it effectively addresses the needs of Redis in distributed scenarios. When a single machine hits bottlenecks in memory, concurrency, or network traffic, a Cluster architecture can be used to achieve load balancing.

This article will introduce Redis Cluster in terms of cluster schemes, data distribution, cluster setup, node communication, cluster scaling, request routing, failover, and cluster operation and maintenance.

Redis cluster solution

Redis cluster mode typically provides high availability, scalability, distribution, and fault tolerance. Redis distributed schemes generally fall into the following categories:

1. Client partition scheme

The client itself decides which Redis node data will be written to or read from. The main idea is to hash the key of the Redis data with a hash algorithm; the hash function maps a given key to a specific Redis node.



The representative client partition scheme is Redis Sharding, which was the multi-instance Redis cluster method most widely used in the industry before Redis Cluster appeared. The Java Redis client library Jedis supports Redis Sharding through ShardedJedis and ShardedJedisPool, which combines sharding with a connection pool.
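
The sketch below shows what client-side sharding looks like in code. It is a minimal illustration, assuming a pre-4.0 Jedis version (ShardedJedis was removed in Jedis 4); the hosts and ports are placeholders:

import java.util.Arrays;
import java.util.List;

import redis.clients.jedis.JedisPoolConfig;
import redis.clients.jedis.JedisShardInfo;
import redis.clients.jedis.ShardedJedis;
import redis.clients.jedis.ShardedJedisPool;

public class ShardedJedisExample {
    public static void main(String[] args) throws Exception {
        // Each JedisShardInfo describes one independent Redis instance.
        List<JedisShardInfo> shards = Arrays.asList(
                new JedisShardInfo("127.0.0.1", 6379),
                new JedisShardInfo("127.0.0.1", 6380));

        // ShardedJedisPool combines the sharding logic with a connection pool.
        ShardedJedisPool pool = new ShardedJedisPool(new JedisPoolConfig(), shards);

        try (ShardedJedis jedis = pool.getResource()) {
            // The client hashes the key and picks the shard itself;
            // the Redis servers know nothing about each other.
            jedis.set("user:1000", "alice");
            System.out.println(jedis.get("user:1000"));
        }
        pool.close();
    }
}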

advantages

No third-party middleware is required, the partition logic is controllable, configuration is simple, nodes are independent of one another, and linear expansion is easy, giving strong flexibility.

disadvantages

Clients cannot dynamically add or remove service nodes; clients must maintain the distribution logic themselves; and if connections are not shared between clients, connections are wasted.

2. Proxy partition scheme

The client sends the request to a proxy component, which parses the client’s data, forwards the request to the correct node, and finally returns the result to the client.

Advantages: simplified distributed logic on the client, transparent access for the client, low switching cost, and separation of proxy forwarding from storage. Disadvantages: an additional proxy layer, which increases deployment complexity and performance overhead.



The main schemes of proxy partition are Twemproxy and Codis.

2.1. Twemproxy

Twemproxy, also known as Nutcracker, is an open-source intermediate proxy server for Redis and Memcached developed by Twitter. As a proxy, Twemproxy can accept access from multiple programs and, according to its routing rules, forward requests to the backend Redis servers and return the results along the original route. Twemproxy itself is a single point of failure, so it must be combined with LVS and Keepalived to build a high-availability solution.



Advantages: wide applicability, high stability, and high availability of the intermediate proxy layer. Disadvantages: no smooth horizontal scaling in or out, no visual management interface, unfriendly operation and maintenance, and no automatic failover.

2.2. Codis 

Codis is a distributed Redis solution. For upper-layer applications, there is no difference between connecting to Codis-Proxy and connecting directly to a native Redis server. The Codis layer handles request forwarding, non-stop data migration, and other work. Codis uses a stateless proxy layer, and everything is transparent to the client.



advantages

It provides high availability for both the upper Proxy layer and the underlying Redis, data sharding and automatic rebalancing, a command-line interface and a RESTful API, a monitoring and management interface, and dynamic addition and removal of Redis nodes.

disadvantages

The deployment architecture and configuration are complex, and it does not support cross-data-center deployment, multi-tenancy, or authentication management.

3. Query routing scheme

The client sends a request to any random Redis instance, and Redis forwards the request to the correct Redis node. Redis Cluster implements a hybrid form of query routing: instead of forwarding a request directly from one Redis node to another, it redirects the client to the correct Redis node.

advantages

There is no central node; data is distributed across multiple Redis instances according to slots; nodes can be added or removed smoothly; high availability and automatic failover are provided; and operation and maintenance costs are low.

disadvantages

It relies heavily on the redis-trib tool and lacks built-in monitoring and management; it requires a Smart Client (which maintains connections, caches the routing table, and supports MultiOp and Pipeline); failover detection is slower than ZooKeeper-based detection; Gossip messages carry some overhead; and hot and cold data cannot be distinguished statistically.
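
As an illustration of such a Smart Client, the minimal sketch below uses JedisCluster (host and port are placeholders). It caches the slot-to-node routing table locally and routes each command itself:

import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;

public class SmartClientExample {
    public static void main(String[] args) throws Exception {
        // Any reachable cluster node is enough as a seed: the client pulls
        // the full slot-to-node routing table and caches it locally.
        JedisCluster cluster = new JedisCluster(new HostAndPort("127.0.0.1", 6379));

        // The client computes slot = CRC16(key) & 16383 itself and sends the
        // command straight to the owning node; on a MOVED redirect it
        // refreshes the cached routing table.
        cluster.set("hello", "world");
        System.out.println(cluster.get("hello"));

        cluster.close();
    }
}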

Data distribution

1. Data distribution theory

A distributed database must first solve the problem of mapping the entire data set to multiple nodes according to partition rules; that is, the data set is divided across multiple nodes, and each node is responsible for a subset of the overall data.



Data distribution is usually divided into hash partition and sequential partition. The comparison is as follows:



Since Redis Cluster uses hash partitioning rules, we will focus on hash partitioning here. There are several common hash partitioning rules, which are described below:

1.1. Node modulo partitioning

Take specific data, such as a Redis key or a user ID, and compute hash(key) % N, where N is the number of nodes; the result determines which node the data maps to.



advantages

The outstanding advantage of this method is its simplicity, and it is often used in database sharding rules (splitting across databases and tables). Pre-partitioning is generally used: the number of partitions is planned in advance based on the expected data volume, for example 512 or 1024 tables, to guarantee capacity for a period of time, and tables are later migrated to other databases as load grows. When expanding, the node count is usually doubled to avoid disrupting the data mapping and triggering a full migration.

disadvantages

When the number of nodes changes, for example when nodes are added or removed, the data-to-node mapping must be recalculated, which causes data migration.
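
The migration cost is easy to demonstrate. The following minimal sketch (an illustration, not production code) maps 10,000 keys with hash(key) % N and counts how many keys move when the node count grows from 4 to 5:

public class ModuloPartitionDemo {
    // Map a key to one of n nodes by taking the remainder of its hash.
    static int nodeFor(String key, int n) {
        return Math.floorMod(key.hashCode(), n);
    }

    public static void main(String[] args) {
        int moved = 0, total = 10000;
        for (int i = 0; i < total; i++) {
            String key = "user:" + i;
            // Compare the mapping with 4 nodes against the mapping with 5.
            if (nodeFor(key, 4) != nodeFor(key, 5)) {
                moved++;
            }
        }
        // Roughly 80% of the keys change nodes after adding a single node,
        // which is why modulo partitioning forces large migrations.
        System.out.printf("%d of %d keys moved (%.1f%%)%n",
                moved, total, 100.0 * moved / total);
    }
}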

1.2. Consistent hash partitioning

Consistent hashing is a good solution to the stability problem. All storage nodes are arranged on a hash ring whose ends are joined. After its hash is computed, each key is assigned clockwise to the adjacent storage node and stored there. When a node joins or leaves, only the keys between that node and its clockwise neighbor on the hash ring are affected.



advantages

Adding or removing a node affects only the adjacent node in the clockwise direction on the hash ring; other nodes are not affected.

disadvantages

Adding or removing nodes can cause some data on the hash ring to miss. When only a few nodes are used, a node change greatly affects the data mapping on the ring, so this scheme is not suitable for distributed systems with few data nodes. In addition, ordinary consistent hashing needs to double or halve the number of nodes when scaling in order to keep data and load balanced.

Note: Because of these disadvantages of consistent hash partitioning, some distributed systems, such as Dynamo, use virtual slots to improve on consistent hashing.
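
A minimal Java sketch of such a hash ring follows (an illustration only: this is not Redis code, and real systems such as Dynamo map many virtual nodes per physical node onto the ring to smooth the distribution):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;
import java.util.TreeMap;

public class ConsistentHashRing {
    // Ring position -> node name, sorted so we can search clockwise.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(String node) { ring.put(hash(node), node); }

    void removeNode(String node) { ring.remove(hash(node)); }

    // Walk clockwise from the key's position to the nearest node.
    String nodeFor(String key) {
        Map.Entry<Long, String> e = ring.ceilingEntry(hash(key));
        return e != null ? e.getValue() : ring.firstEntry().getValue();
    }

    // Use the first 8 bytes of MD5 as the ring position.
    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xFF);
            return h;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ConsistentHashRing r = new ConsistentHashRing();
        r.addNode("redis-1");
        r.addNode("redis-2");
        r.addNode("redis-3");
        System.out.println(r.nodeFor("user:1000"));
        // Removing redis-2 only remaps keys that previously fell between
        // redis-2's counter-clockwise neighbor and redis-2 itself.
        r.removeNode("redis-2");
        System.out.println(r.nodeFor("user:1000"));
    }
}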

1.3. Virtual slot partitioning

Virtual slot partitioning makes clever use of the hash space, using a well-distributed hash function to map all data into a fixed range of integers, which are defined as slots.

Generally, the range is much larger than the number of nodes; for example, the slot range of Redis Cluster is 0 to 16383. A slot is the basic unit of data management and migration within a cluster. The main purpose of adopting a wide slot range is to facilitate data splitting and cluster expansion. Each node is responsible for a certain number of slots, as shown in the figure:

The cluster in the figure has five nodes, and each node is responsible for about 3276 slots on average. Because the hash algorithm is of high quality, the data mapped to each slot is usually fairly uniform, so the data is divided evenly across the 5 nodes. Redis Cluster uses this virtual slot partitioning scheme.

Node 1: contains hash slots 0 to 3276.

Node 2: contains hash slots 3277 to 6553.

Node 3: contains hash slots 6554 to 9830.

Node 4: contains hash slots 9831 to 13107.

Node 5: contains hash slots 13108 to 16383.

This structure makes it easy to add or remove nodes. If a node 6 is added, some slots are moved from nodes 1 to 5 to node 6. To remove node 1, the slots on node 1 are moved to nodes 2 to 5, and then node 1 is removed from the cluster.

Since moving a hash slot from one node to another does not stop service, adding or removing nodes, or changing the number of hash slots a node holds, does not make the cluster unavailable.
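
The five slot ranges listed above can be reproduced with a few lines of arithmetic (a sketch for illustration; in practice redis-trib assigns the ranges when the cluster is created):

public class SlotRanges {
    public static void main(String[] args) {
        // Split the 16384 slots as evenly as possible among n nodes.
        int slots = 16384, n = 5, start = 0;
        for (int i = 0; i < n; i++) {
            // The first (slots % n) nodes each take one extra slot.
            int size = slots / n + (i < slots % n ? 1 : 0);
            System.out.printf("Node %d: slots %d to %d%n",
                    i + 1, start, start + size - 1);
            start += size;
        }
    }
}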

2. Redis data partition

Redis Cluster uses virtual slot partitioning. All keys are mapped by a hash function to integer slots 0 to 16383: slot = CRC16(key) & 16383. Each node is responsible for maintaining a portion of the slots and the key-value data mapped to those slots, as shown in the figure:
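
The slot formula can be reproduced in a few lines of Java. The sketch below uses a bit-by-bit CRC16-CCITT (XModem) implementation, the same polynomial Redis Cluster uses (Redis itself uses a table-driven C version), and also applies the hash-tag rule, under which only the part of the key between { and } is hashed:

public class Crc16Slot {
    // CRC16-CCITT (XModem), the variant Redis Cluster hashes keys with.
    static int crc16(byte[] bytes) {
        int crc = 0x0000;
        for (byte b : bytes) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    static int slotFor(String key) {
        // Hash-tag rule: if the key contains a non-empty {...} section,
        // only that section is hashed, so related keys can share a slot.
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes()) & 16383;
    }

    public static void main(String[] args) {
        System.out.println(slotFor("foo"));              // 12182, matching CLUSTER KEYSLOT foo
        System.out.println(slotFor("{user:1000}.name")); // same slot as the next key
        System.out.println(slotFor("{user:1000}.age"));
    }
}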

Redis virtual slot partitioning features

Decoupling the relationship between data and nodes simplifies node expansion and contraction.

Nodes maintain the slot mapping themselves; neither the client nor a proxy service needs to maintain slot partition metadata.

Queries of the mappings among nodes, slots, and keys are supported, which facilitates data routing and online scaling.
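
These mappings can be queried with the standard CLUSTER KEYSLOT and CLUSTER NODES commands; a minimal Jedis sketch (host and port are placeholders):

import redis.clients.jedis.Jedis;

public class SlotQuery {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            // CLUSTER KEYSLOT: which slot does this key map to?
            System.out.println(jedis.clusterKeySlot("user:1000"));
            // CLUSTER NODES: which node owns which slot ranges?
            System.out.println(jedis.clusterNodes());
        }
    }
}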

3. Functional limitations of Redis cluster

Compared with standalone Redis, Redis Cluster has some functional limitations that developers need to know in advance and work around.

Support for batch key operations is limited.

For example, mset and mget can only be executed as a batch on keys that map to the same slot. Batch operations on keys that map to different slots are not supported, because a command such as mget might otherwise span multiple nodes.
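
A common workaround is the hash-tag syntax shown earlier: wrapping the shared part of the keys in {} forces them into the same slot, so multi-key commands become legal again. A minimal sketch assuming the JedisCluster client:

import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;

public class HashTagExample {
    public static void main(String[] args) throws Exception {
        JedisCluster cluster = new JedisCluster(new HostAndPort("127.0.0.1", 6379));

        // All three keys are hashed only on "user:1000", so they share one
        // slot and a single MSET against that slot's node is allowed.
        cluster.mset("{user:1000}.name", "alice",
                     "{user:1000}.email", "alice@example.com",
                     "{user:1000}.age", "30");

        // This would fail with a CROSSSLOT error: the keys map to
        // different slots on different nodes.
        // cluster.mset("user:1000", "a", "user:1001", "b");

        cluster.close();
    }
}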

Limited support for key transaction operations.

Only transactions whose keys all reside on the same node are supported. When the keys are distributed across different nodes, the transaction function cannot be used.

A key is the minimum granularity of data partitioning

A single large key-value object, such as a hash or a list, cannot be mapped across different nodes.

Multiple database spaces are not supported

A standalone Redis server supports 16 databases (DB0 to DB15). In cluster mode, only one database space, DB0, can be used.

The replication structure supports only one layer

A slave node can only replicate its master node; nested tree replication structures are not supported.

Redis cluster setup

Redis Cluster is the official Redis high-availability solution. A Redis Cluster has 2^14 = 16384 slots; after a cluster is created, the slots are evenly distributed among the Redis nodes.

Below, six Redis instances are started and a cluster of three masters and three slaves is created with redis-trib.rb. Setting up the cluster requires the following three steps:

1. Prepare nodes

A Redis cluster usually consists of multiple nodes; at least six nodes are required for a complete, highly available cluster. cluster-enabled yes must be set on each node so that Redis runs in cluster mode.

Redis cluster node planning is as follows:



Note: It is recommended to use a unified directory layout for all nodes in the cluster, generally with three directories: conf, data, and log, storing the configuration, data, and log files respectively. Save the configurations of the six nodes in the conf directory.

1.1. Create redis instance directories

$ sudo mkdir -p /usr/local/redis-cluster

$ cd /usr/local/redis-cluster

$ sudo mkdir conf data log

$ sudo mkdir -p data/redis-6379 data/redis-6389 data/redis-6380 data/redis-6390 data/redis-6381 data/redis-6391

1.2 Redis configuration file management

Configure the redis.conf file of each instance based on the following template. This is the basic configuration required to build the cluster and can be adjusted to the actual situation.

# Run Redis in the background

daemonize yes

# Host address to bind

bind 127.0.0.1

# Data storage directory

dir /usr/local/redis-cluster/data/redis-6379

# Process PID file

pidfile /var/run/redis-cluster/${custom}.pid

# log file

logfile /usr/local/redis-cluster/log/${custom}.log

# port

port 6379

# Enable cluster mode

cluster-enabled yes

# Cluster configuration file, generated automatically on first start

cluster-config-file /usr/local/redis-cluster/conf/${custom}.conf

# Node timeout in milliseconds (10 seconds)

cluster-node-timeout 10000

# Enable AOF persistence; every write operation is appended to the log

appendonly yes

redis-6379.conf

daemonize yes

bind 127.0.0.1

dir /usr/local/redis-cluster/data/redis-6379

pidfile /var/run/redis-cluster/redis-6379.pid

logfile /usr/local/redis-cluster/log/redis-6379.log

port 6379

cluster-enabled yes

cluster-config-file /usr/local/redis-cluster/conf/node-6379.conf

cluster-node-timeout 10000

appendonly yes

redis-6389.conf

daemonize yes

bind 127.0.0.1

dir /usr/local/redis-cluster/data/redis-6389

pidfile /var/run/redis-cluster/redis-6389.pid

logfile /usr/local/redis-cluster/log/redis-6389.log

port 6389

cluster-enabled yes

cluster-config-file /usr/local/redis-cluster/conf/node-6389.conf

cluster-node-timeout 10000

appendonly yes

redis-6380.conf

daemonize yes

bind 127.0.0.1

dir /usr/local/redis-cluster/data/redis-6380

pidfile /var/run/redis-cluster/redis-6380.pid

logfile /usr/local/redis-cluster/log/redis-6380.log

port 6380

cluster-enabled yes

cluster-config-file /usr/local/redis-cluster/conf/node-6380.conf

cluster-node-timeout 10000

appendonly yes

redis-6390.conf

daemonize yes

bind 127.0.0.1

dir /usr/local/redis-cluster/data/redis-6390

pidfile /var/run/redis-cluster/redis-6390.pid

logfile /usr/local/redis-cluster/log/redis-6390.log

port 6390

cluster-enabled yes

cluster-config-file /usr/local/redis-cluster/conf/node-6390.conf

cluster-node-timeout 10000

appendonly yes

redis-6381.conf

daemonize yes

bind 127.0.0.1

dir /usr/local/redis-cluster/data/redis-6381

pidfile /var/run/redis-cluster/redis-6381.pid

logfile /usr/local/redis-cluster/log/redis-6381.log

port 6381

cluster-enabled yes

cluster-config-file /usr/local/redis-cluster/conf/node-6381.conf

cluster-node-timeout 10000

appendonly yes

redis-6391.conf

daemonize yes

bind 127.0.0.1

dir /usr/local/redis-cluster/data/redis-6391

pidfile /var/run/redis-cluster/redis-6391.pid

logfile /usr/local/redis-cluster/log/redis-6391.log

port 6391

cluster-enabled yes

cluster-config-file /usr/local/redis-cluster/conf/node-6391.conf

cluster-node-timeout 10000

appendonly yes

2. Environment preparation

2.1 Installing the Ruby Environment

$ sudo brew install ruby

2.2. Prepare the RubyGems Redis dependency

$ sudo gem install redis
Password:
Fetching: redis-4.0.2.gem (100%)
Successfully installed redis-4.0.2
Parsing documentation for redis-4.0.2
Installing ri documentation for redis-4.0.2
Done installing documentation for redis after 1 seconds
1 gem installed

2.3. Copy redis-trib.rb to the cluster root directory

redis-trib.rb is the official Redis Cluster management tool. It ships in the src directory of the Redis source tree and wraps the cluster commands into a simple, convenient, and practical operation tool.

$ sudo cp /usr/local/redis-4.0.11/src/redis-trib.rb /usr/local/redis-cluster

Check that the redis-trib.rb command works. The usage is displayed as follows:

$ ./redis-trib.rb

Usage: redis-trib <command> <options> <arguments ...>

  create          host1:port1 ... hostN:portN
                  --replicas <arg>
  check           host:port
  info            host:port
  fix             host:port
                  --timeout <arg>
  reshard         host:port
                  --from <arg>
                  --to <arg>
                  --slots <arg>
                  --yes
                  --timeout <arg>
                  --pipeline <arg>
  rebalance       host:port
                  --weight <arg>
                  --auto-weights
                  --use-empty-masters
                  --timeout <arg>
                  --simulate
                  --pipeline <arg>
                  --threshold <arg>
  add-node        new_host:new_port existing_host:existing_port
                  --slave
                  --master-id <arg>
  del-node        host:port node_id
  set-timeout     host:port milliseconds
  call            host:port command arg arg .. arg
  import          host:port
                  --from <arg>
                  --copy
                  --replace
  help            (show this help)

For check, fix, reshard, del-node, set-timeout you can specify the host and port of any working node in the cluster.

redis-trib.rb was written in Ruby by the Redis authors. The redis-trib.rb command-line tool provides the following functions:



3. Install the cluster

3.1 Starting the Redis Service Node

Run the following command to start six Redis nodes:

sudo redis-server conf/redis-6379.conf

sudo redis-server conf/redis-6389.conf

sudo redis-server conf/redis-6380.conf

sudo redis-server conf/redis-6390.conf

sudo redis-server conf/redis-6381.conf

sudo redis-server conf/redis-6391.conf

After startup, Redis runs in cluster mode. Check the process status of each Redis node:

$ ps -ef | grep redis-server
    0  1908     1   0  4:59PM ??  0:00.01 redis-server *:6379 [cluster]
    0  1911     1   0  4:59PM ??  0:00.01 redis-server *:6389 [cluster]
    0  1914     1   0  4:59PM ??  0:00.01 redis-server *:6380 [cluster]
    0  1917     1   0  4:59PM ??  0:00.01 redis-server *:6390 [cluster]
    0  1920     1   0  4:59PM ??  0:00.01 redis-server *:6381 [cluster]
    0  1923     1   0  4:59PM ??  0:00.01 redis-server *:6391 [cluster]

In each Redis node's redis.conf file, we configured the cluster-config-file path. When the cluster starts, a new cluster node configuration file is generated in the conf directory. The file list is as follows:

$ tree -L 3 .
.
├── appendonly.aof
├── conf
│   ├── node-6379.conf
│   ├── node-6380.conf
│   ├── node-6381.conf
│   ├── node-6389.conf
│   ├── node-6390.conf
│   ├── node-6391.conf
│   ├── redis-6379.conf
│   ├── redis-6380.conf
│   ├── redis-6381.conf
│   ├── redis-6389.conf
│   ├── redis-6390.conf
│   └── redis-6391.conf
├── data
│   ├── redis-6379
│   ├── redis-6380
│   ├── redis-6381
│   ├── redis-6389
│   ├── redis-6390
│   └── redis-6391
└── log
    ├── redis-6379.log
    ├── redis-6380.log
    ├── redis-6381.log
    ├── redis-6389.log
    ├── redis-6390.log
    └── redis-6391.log

9 directories, 19 files

3.2. Use redis-trib to associate the cluster nodes

In the command below, the six Redis nodes are listed from left to right with the three master nodes first, followed by the three slave nodes.

$ sudo ./redis-trib.rb create --replicas 1 127.0.0.1:6379 127.0.0.1:6380 127.0.0.1:6381 127.0.0.1:6389 127.0.0.1:6390 127.0.0.1:6391

After the cluster is created, redis-trib first allocates the 16384 hash slots to the three master nodes redis-6379, redis-6380, and redis-6381, and then points each slave node at its master node for data synchronization.

>>> Creating cluster

>>> Performing hash slots allocation on 6 nodes...

Using 3 masters:

127.0.0.1:6379

127.0.0.1:6380

127.0.0.1:6381

Adding replica 127.0.0.1:6390 to 127.0.0.1:6379

Adding replica 127.0.0.1:6391 to 127.0.0.1:6380

Adding replica 127.0.0.1:6389 to 127.0.0.1:6381

>>> Trying to optimize slaves allocation for anti-affinity

[WARNING] Some slaves are in the same host as their master
M: ad4b9ffceba062492ed67ab336657426f55874b7 127.0.0.1:6379
   slots:0-5460 (5461 slots) master
M: df23c6cad0654ba83f0422e352a81ecee822702e 127.0.0.1:6380
   slots:5461-10922 (5462 slots) master
M: ab9da92d37125f24fe60f1f33688b4f8644612ee 127.0.0.1:6381
   slots:10923-16383 (5461 slots) master
S: 25cfa11a2b4666021da5380ff332b80dbda97208 127.0.0.1:6389
   replicates ad4b9ffceba062492ed67ab336657426f55874b7
S: 48e0a4b539867e01c66172415d94d748933be173 127.0.0.1:6390
   replicates df23c6cad0654ba83f0422e352a81ecee822702e
S: d881142a8307f89ba51835734b27cb309a0fe855 127.0.0.1:6391
   replicates ab9da92d37125f24fe60f1f33688b4f8644612ee

Then enter yes, and redis-trib.rb starts the node handshake and slot allocation, with the following output:

Can I set the above configuration? (type 'yes' to accept): yes

>>> Nodes configuration updated

>>> Assign a different config epoch to each node

>>> Sending CLUSTER MEET messages to join the cluster

Waiting for the cluster to join....
>>> Performing Cluster Check (using node 127.0.0.1:6379)
M: ad4b9ffceba062492ed67ab336657426f55874b7 127.0.0.1:6379
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
M: ab9da92d37125f24fe60f1f33688b4f8644612ee 127.0.0.1:6381
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: 48e0a4b539867e01c66172415d94d748933be173 127.0.0.1:6390
   slots: (0 slots) slave
   replicates df23c6cad0654ba83f0422e352a81ecee822702e
S: d881142a8307f89ba51835734b27cb309a0fe855 127.0.0.1:6391
   slots: (0 slots) slave
   replicates ab9da92d37125f24fe60f1f33688b4f8644612ee
M: df23c6cad0654ba83f0422e352a81ecee822702e 127.0.0.1:6380
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 25cfa11a2b4666021da5380ff332b80dbda97208 127.0.0.1:6389
   slots: (0 slots) slave
   replicates ad4b9ffceba062492ed67ab336657426f55874b7
[OK] All nodes agree about slots configuration.
>>> Check for open slots...

>>> Check slots coverage...

[OK] All 16384 slots covered.

It then performs a cluster check, verifying the number of hash slots each Redis node holds and the slot coverage. Of the 16384 slots, the master nodes redis-6379, redis-6380, and redis-6381 hold 5461, 5462, and 5461 slots respectively.

3.3. Redis master node logs

The log shows that the slave node redis-6389 synchronizes data from the master node redis-6379 through a background BGSAVE, that is, an asynchronous full resynchronization.

$ cat log/redis-6379.log

1907:C 05 Sep 16:59:52.960 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1907:C 05 Sep 16:59:52.961 # Redis version=4.0.11, bits=64, commit=00000000, modified=0, pid=1907, just started
1907:C 05 Sep 16:59:52.961 # Configuration loaded
1908:M 05 Sep 16:59:52.964 * Increased maximum number of open files to 10032 (it was originally set to 256).
1908:M 05 Sep 16:59:52.965 * No cluster configuration found, I'm ad4b9ffceba062492ed67ab336657426f55874b7
1908:M 05 Sep 16:59:52.967 * Running mode=cluster, port=6379.
1908:M 05 Sep 16:59:52.967 # Server initialized
1908:M 05 Sep 16:59:52.967 * Ready to accept connections
1908:M 05 Sep 17:01:17.782 # configEpoch set to 1 via CLUSTER SET-CONFIG-EPOCH
1908:M 05 Sep 17:01:17.812 # IP address for this node updated to 127.0.0.1
1908:M 05 Sep 17:01:22.740 # Cluster state changed: ok
1908:M 05 Sep 17:01:23.681 * Slave 127.0.0.1:6389 asks for synchronization
1908:M 05 Sep 17:01:23.681 * Partial resynchronization not accepted: Replication ID mismatch (Slave asked for '4c5afe96cac51cde56039f96383ea7217ef2af41', my replication IDs are '037b661bf48c80c577d1fa937ba55367a3692921' and '0000000000000000000000000000000000000000')
1908:M 05 Sep 17:01:23.681 * Starting BGSAVE for SYNC with target: disk
1908:M 05 Sep 17:01:23.682 * Background saving started by pid 1952
1952:C 05 Sep 17:01:23.683 * DB saved on disk
1908:M 05 Sep 17:01:23.749 * Background saving terminated with success
1908:M 05 Sep 17:01:23.752 * Synchronization with slave 127.0.0.1:6389 succeeded

3.4. Redis cluster integrity check

Run the redis-trib.rb check command to verify that the cluster created earlier is healthy. The check command only needs the address of any node in the cluster to check the entire cluster.

$ ./redis-trib.rb check 127.0.0.1:6379

When the following information is displayed, all slots in the cluster have been allocated to nodes:

[OK] All nodes agree about slots configuration.

>>> Check for open slots...

>>> Check slots coverage...

[OK] All 16384 slots covered.

This article introduced Redis cluster solutions, data distribution, and cluster setup.

Cluster schemes include the client partition scheme, the proxy partition scheme, and the query routing scheme. In the data distribution part, node modulo partitioning, consistent hash partitioning, and virtual slot partitioning were briefly described and compared.

Finally, a three-master, three-slave virtual slot cluster was built with redis-trib as an example.