Redis multi-machine capabilities: replication, Sentinel, and Cluster

Redis supports distributed deployment, so its multi-machine features are particularly important. This article gives a brief analysis of three of them: replication, Sentinel, and Cluster.

Replication

In Redis, a user can make one server replicate another by executing the SLAVEOF command or setting the slaveof option. The server being replicated is called the master, and the server that replicates it is called the slave.

A master and its slaves hold the same data, which is conceptually called "consistency." Here is a quick note on how Redis replicates the database using PSYNC.

PSYNC

Prior to Redis 2.8, the SYNC command was used for synchronization. Its drawback was that every reconnection after a disconnect triggered a full resynchronization. The PSYNC command, which replaced it, supports both full and partial resynchronization:

Full resynchronization handles the initial replication: the master creates and sends an RDB file, then sends the write commands it buffered in the meantime, bringing the slave up to date.
Partial resynchronization handles replication after a disconnection: when the slave reconnects to the master, the master sends only the write commands executed during the disconnection, if conditions permit.

Partial resynchronization

Partial resynchronization relies on three pieces, each briefly introduced below:

Replication offset
Replication backlog
Server run ID

Replication offset

Both the master and the slave maintain a replication offset, which advances by N each time N bytes of data are propagated or received. By comparing the two offsets, Redis can easily determine whether the master and slave are in the same state:

If the master and slave are in a consistent state, their offsets are the same.
Conversely, if their offsets differ, the master and slave are not in a consistent state.

Replication backlog

The replication backlog is a fixed-length first-in, first-out queue maintained by the master, with a default size of 1MB. When the master propagates commands, it not only sends them to all slaves but also appends them to the replication backlog, so the backlog always holds the most recently propagated write commands.

When the slave reconnects to the master, it passes its replication offset to the master via PSYNC, and the master decides:

If the data after that offset is still in the replication backlog, the master performs a partial resynchronization;
Otherwise, the master performs a full resynchronization of the slave.
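
The decision above can be sketched in a few lines of Python. This is only an illustrative model, not Redis's implementation: the class name ReplicationBacklog and its methods are hypothetical, and the backlog is kept as a bounded deque of bytes.

```python
from collections import deque


class ReplicationBacklog:
    """Toy model of the master's fixed-length FIFO backlog (hypothetical names)."""

    def __init__(self, size=1024 * 1024):
        self.buf = deque(maxlen=size)  # oldest bytes are dropped once full
        self.master_offset = 0         # total bytes propagated so far

    def feed(self, data: bytes) -> None:
        """Record propagated bytes and advance the master's offset."""
        self.buf.extend(data)
        self.master_offset += len(data)

    def read_from(self, slave_offset: int):
        """Return the bytes the slave is missing, or None if they are gone."""
        first_kept = self.master_offset - len(self.buf)
        if first_kept <= slave_offset <= self.master_offset:
            return bytes(list(self.buf)[slave_offset - first_kept:])
        return None  # data already evicted: a full resynchronization is needed


if __name__ == "__main__":
    backlog = ReplicationBacklog(size=10)
    backlog.feed(b"abcdefgh")
    print(backlog.read_from(5))   # slave missed the last three bytes
    backlog.feed(b"ijklmnop")     # pushes the oldest bytes out of the backlog
    print(backlog.read_from(5))   # None, so a full resynchronization is required
```

A larger backlog tolerates longer disconnections before a full resynchronization becomes necessary, at the cost of more memory on the master.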

Server run ID

Every running Redis server has its own run ID. During the initial replication, the master sends its run ID to the slave, and the slave saves it. When the slave reconnects after a disconnection, it sends the saved run ID to the master it is connecting to. If that ID does not match the master's own run ID (for example, after a master-slave switch has promoted a different server to master), a full resynchronization is performed; otherwise a partial resynchronization is attempted.
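
Putting the three pieces together, the master's choice between the two modes might look like the following sketch. The function name and parameters are hypothetical, and the offset arithmetic is simplified; the reply strings +CONTINUE and +FULLRESYNC are the ones a master uses to answer PSYNC.

```python
def handle_psync(master_run_id, master_offset, backlog_len,
                 slave_run_id, slave_offset):
    """Simplified model of a master answering PSYNC <run_id> <offset>."""
    oldest_kept = master_offset - backlog_len  # earliest offset still in the backlog
    same_master = slave_run_id == master_run_id
    in_backlog = oldest_kept <= slave_offset <= master_offset
    if same_master and in_backlog:
        return "+CONTINUE"  # partial resynchronization
    # Unknown run ID or data no longer in the backlog: full resynchronization.
    return f"+FULLRESYNC {master_run_id} {master_offset}"


if __name__ == "__main__":
    # A slave replicating for the first time sends "PSYNC ? -1".
    print(handle_psync("abc123", 5000, 1000, "?", -1))
    # A slave that was briefly disconnected from the same master.
    print(handle_psync("abc123", 5000, 1000, "abc123", 4800))
```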

Heartbeat detection

During the command propagation phase, the slave sends the following command to the master once per second:

 REPLCONF ACK 101 # 101 is the slave's current replication offset

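If the master does not receive a REPLCONF ACK command from a slave for more than one second, it knows that the connection to that slave is faulty.

Heartbeats also detect lost commands: if the offset a slave reports is lower than the master's own offset, the master knows which data the slave is missing and resends it from the replication backlog.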
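
As a rough illustration of what the master can derive from each heartbeat, the sketch below tracks the last acknowledged offset and the time it arrived. The class SlaveState is hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
class SlaveState:
    """What a master might remember about one slave (hypothetical model)."""

    def __init__(self, now: float):
        self.acked_offset = 0
        self.last_ack = now

    def on_ack(self, offset: int, now: float) -> None:
        """Handle REPLCONF ACK <offset> received at time `now` (seconds)."""
        self.acked_offset = offset
        self.last_ack = now

    def is_suspect(self, now: float, limit: float = 1.0) -> bool:
        """True if no heartbeat has arrived for more than `limit` seconds."""
        return now - self.last_ack > limit

    def missing(self, master_offset: int) -> int:
        """Number of propagated bytes the slave has not acknowledged."""
        return max(0, master_offset - self.acked_offset)


if __name__ == "__main__":
    slave = SlaveState(now=0.0)
    slave.on_ack(101, now=0.5)
    print(slave.is_suspect(now=2.0))         # True: 1.5 s without a heartbeat
    print(slave.missing(master_offset=150))  # 49 bytes to resend
```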

Sentinel

Sentinel is the high-availability solution for Redis. A Sentinel system consists of one or more Sentinel instances that can monitor any number of masters and all the slaves of those masters. When a monitored master goes offline, Sentinel automatically promotes one of its slaves to be the new master, which then continues to process command requests. This process is called failover.

The Sentinel cluster

A Sentinel system can consist of a single Sentinel or of several. Multiple Sentinels can monitor the same server, each connecting to it through a command connection.

Sentinels that monitor the same server also exchange information with one another over command connections.

Subjectively down and objectively down

By default, a Sentinel sends a PING command once per second to every instance with which it has created a command connection (masters, slaves, and other Sentinels) and judges from the reply whether the instance is online.

The down-after-milliseconds option in the Sentinel configuration file specifies how long an instance may go without a valid reply before the Sentinel considers it offline. If an instance gives no valid reply for down-after-milliseconds, the Sentinel marks it as subjectively down.

After a Sentinel marks a master as subjectively down, it asks the other Sentinels monitoring that master whether they agree, in order to confirm that the master is really offline. Once enough Sentinels agree, the master is marked as objectively down and a failover is started. With N Sentinel instances, it is best to require at least N / 2 + 1 of them to judge the master subjectively down before it is considered objectively down.
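
The N / 2 + 1 rule is simple integer arithmetic. The sketch below uses hypothetical helper names and treats each Sentinel's opinion as a boolean.

```python
def recommended_quorum(sentinel_count: int) -> int:
    """N / 2 + 1 with integer division: a strict majority of N Sentinels."""
    return sentinel_count // 2 + 1


def is_objectively_down(opinions, quorum: int) -> bool:
    """opinions holds one boolean per Sentinel: True means 'subjectively down'."""
    return sum(opinions) >= quorum


if __name__ == "__main__":
    quorum = recommended_quorum(5)
    print(quorum)                                                         # 3
    print(is_objectively_down([True, True, False, True, False], quorum))  # True
    print(is_objectively_down([True, False, False, True, False], quorum)) # False
```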

Electing the leader Sentinel

When a master is judged to be objectively down, the Sentinels monitoring it negotiate to elect a leader: the Sentinel that receives votes from more than half of the Sentinels in the system becomes the leader, and the leader performs the failover of the offline master.
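
The majority rule can be modeled as a simple vote count. This sketch only tallies votes that have already been cast; how each Sentinel decides whom to vote for is not modeled, and the names used are hypothetical.

```python
from collections import Counter


def elect_leader(votes: dict, total_sentinels: int):
    """votes maps each voting Sentinel's ID to the candidate it voted for.

    Returns the winning candidate, or None if nobody has more than half
    of all Sentinels' votes.
    """
    if not votes:
        return None
    candidate, count = Counter(votes.values()).most_common(1)[0]
    return candidate if count > total_sentinels // 2 else None


if __name__ == "__main__":
    votes = {"s1": "s1", "s2": "s1", "s3": "s1", "s4": "s4", "s5": "s4"}
    print(elect_leader(votes, total_sentinels=5))  # s1, with 3 of 5 votes
```

If no candidate reaches a majority, the function returns None, which corresponds to an election round that produced no leader and has to be repeated.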

Selecting the new master

During failover, if the offline master has several slaves, how is the new master chosen among them?

The leader Sentinel first filters the candidates:

Remove slaves that are offline or disconnected.
Remove slaves that have not replied to the leader Sentinel's INFO command within the last 5 seconds.
Remove slaves whose connection to the offline master was broken for more than down-after-milliseconds * 10 milliseconds.

From the remaining slaves, the leader Sentinel then selects the one with the highest priority.

If several slaves share the highest priority, the leader Sentinel selects the one with the largest replication offset (the one whose data is closest to the master's).

If several slaves share both the highest priority and the largest offset, the one with the smallest run ID is selected as the new master.
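
The filter-then-sort procedure above might be sketched as follows. The Slave record and its field names are hypothetical. One assumption worth stating: the sketch follows Redis's slave-priority setting, in which a lower number means a higher priority.

```python
from dataclasses import dataclass


@dataclass
class Slave:
    run_id: str
    priority: int            # lower value = higher priority (assumption, see above)
    offset: int              # replication offset
    online: bool
    info_reply_age_ms: int   # time since the last INFO reply
    master_link_down_ms: int # how long the link to the old master was broken


def select_new_master(slaves, down_after_ms: int):
    """Apply the three filters, then pick by priority, offset, and run ID."""
    candidates = [
        s for s in slaves
        if s.online
        and s.info_reply_age_ms <= 5000
        and s.master_link_down_ms <= down_after_ms * 10
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda s: (s.priority, -s.offset, s.run_id))


if __name__ == "__main__":
    slaves = [
        Slave("aaa", 100, 500, True, 1000, 0),
        Slave("bbb", 100, 900, True, 1000, 0),
        Slave("ccc", 100, 900, True, 1000, 0),
        Slave("ddd", 10, 100, False, 1000, 0),  # offline, filtered out
    ]
    print(select_new_master(slaves, down_after_ms=30000).run_id)  # bbb
```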

Cluster

Redis Cluster is the distributed database solution provided by Redis. A cluster shares data across its nodes through sharding.

Cluster basics

A Redis cluster usually consists of multiple nodes, but initially every node is independent. To form a working cluster, the separate nodes must be connected to one another with the CLUSTER MEET command.

Slot assignment

A Redis cluster stores data by sharding. The whole database of the cluster is divided into 16384 slots. When every slot is assigned to a node, the cluster is in the online state (ok). Conversely, if any slot has no node handling it, the cluster is in the offline state (fail).
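
To show how a key ends up in one of the 16384 slots, here is a minimal sketch based on the Redis Cluster specification, which maps a key to CRC16(key) mod 16384 using the XMODEM variant of CRC16. Python's binascii.crc_hqx with an initial value of 0 computes that variant. The function name key_slot is hypothetical, and hash tags (the {...} syntax) are deliberately not handled.

```python
import binascii

SLOT_COUNT = 16384


def key_slot(key: str) -> int:
    """Map a key to its hash slot (hash tags are not handled in this sketch)."""
    crc = binascii.crc_hqx(key.encode("utf-8"), 0)  # CRC16, XMODEM variant
    return crc % SLOT_COUNT


if __name__ == "__main__":
    for key in ("foo", "bar", "user:1000"):
        print(key, key_slot(key))
```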

Slots are assigned to a node with the CLUSTER ADDSLOTS command. For example, the 16384 slots, numbered 0 to 16383, can be divided among three nodes:

Assign slots 0-5000 to server1;
Assign slots 5001-10000 to server2;
Assign slots 10001-16383 to server3;
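
The relationship between slot assignment and cluster state can be modeled with a plain list that records the owner of each slot. This is a toy model with hypothetical function names; it does not talk to a real cluster.

```python
SLOT_COUNT = 16384  # slots are numbered 0 to 16383


def add_slots(slot_owner, node: str, first: int, last: int) -> None:
    """Assign the inclusive slot range [first, last] to node."""
    for slot in range(first, last + 1):
        slot_owner[slot] = node


def cluster_state(slot_owner) -> str:
    """'ok' only when every slot has an owner, otherwise 'fail'."""
    return "ok" if all(owner is not None for owner in slot_owner) else "fail"


if __name__ == "__main__":
    slot_owner = [None] * SLOT_COUNT
    add_slots(slot_owner, "server1", 0, 5000)
    add_slots(slot_owner, "server2", 5001, 10000)
    print(cluster_state(slot_owner))  # fail: slots 10001-16383 are unassigned
    add_slots(slot_owner, "server3", 10001, 16383)
    print(cluster_state(slot_owner))  # ok
```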

Resharding

A resharding operation can reassign any number of slots from one node (the source) to another (the target). The key-value pairs in those slots are moved from the source node to the target node along with them.

Resharding can be performed online: the cluster does not need to go offline during the process.

Replication and failover

Nodes in a Redis cluster are either masters or slaves. A master node handles slots. A slave node replicates a master node and takes over processing command requests when the master it replicates goes offline.

When a slave node finds that the master node it is replicating has gone offline, the slave node will start failover of the offline master:

Of all the slave nodes replicating the offline master node, one is selected.
The selected slave node executes the SLAVEOF no one command to become the new master node.
The new master node revokes all slot assignments of the offline master node and assigns those slots to itself.
The new master node broadcasts a message to the cluster to inform the other nodes that it has become a master node.
The new master node starts receiving command requests for the slots it is now responsible for.
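
The promotion and slot takeover steps above can be sketched against a simple in-memory model. The function fail_over and the roles dictionary are hypothetical, and selecting the slave and broadcasting to the cluster are not modeled.

```python
def fail_over(roles, slot_owner, failed_master: str, chosen_slave: str) -> int:
    """Promote chosen_slave and move the failed master's slots to it.

    roles maps node name -> "master" or "slave"; slot_owner is a list
    indexed by slot number. Returns the number of slots taken over.
    """
    roles[chosen_slave] = "master"  # the effect of SLAVEOF no one
    moved = 0
    for slot, owner in enumerate(slot_owner):
        if owner == failed_master:
            slot_owner[slot] = chosen_slave  # revoke and reassign the slot
            moved += 1
    return moved


if __name__ == "__main__":
    roles = {"server1": "master", "server2": "master", "replica1": "slave"}
    slot_owner = ["server1"] * 5001 + ["server2"] * (16384 - 5001)
    moved = fail_over(roles, slot_owner, "server1", "replica1")
    print(moved, roles["replica1"], slot_owner[0])
```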

Conclusion

With replication, Sentinel, and Cluster, Redis implements its multi-machine capabilities and provides a highly available multi-machine database.