【Redis】 An introduction to Sentinel and its working principles

Redis Sentinel is a Redis server that runs in a special mode and does not support read and write operations. It works alongside Redis replication to monitor the master and slave nodes, to fail over when the master node goes offline, and to notify clients of the change.

Consider the following questions:

What is the main function of Sentinel?

How do Sentinels find other Sentinels?

How does a Sentinel find the slave nodes?

How does Sentinel monitor the Redis service?

How does a Sentinel determine that the master node is offline?

Why do the Sentinels elect a Leader?

How does Sentinel select the new master node?

How does Sentinel switch to the new master node?

How does Sentinel notify the client of a master node change?

What happens if a Sentinel goes down?

Note:

Everything in this article has been verified in practice by the author. If you find a mistake, corrections are welcome.


Official documentation

Redis Sentinel Documentation — Redis

Introduction to Sentinel

The presence or absence of sentinels does not affect the Redis master-slave replication service. Sentinels act as an additional layer of monitoring rather than being interwoven with the Redis service.

Sentinel is not a single process but a distributed system made up of multiple Sentinel services. The Sentinels use gossip protocols to propagate messages and agreement protocols to decide whether to perform an automatic failover and which node to select as the new master.

Architecture

The sentinel cluster is independent of the Redis cluster. The sentinels connect with each other to jointly monitor and manage all the Redis nodes.

Roles

Monitoring: Sentinel monitors the status of all Redis nodes.
Failover: When a Sentinel finds that the master node has gone offline, one of the slave nodes is selected as the new master and all the other nodes are pointed at it. The original master is demoted to a slave and its configuration is changed to replicate the new master, so when it comes back online it automatically works as a slave.
Notification: When the Sentinels elect a new master node, clients can be notified through the API.

Building a Sentinel architecture

[Redis] Deploying Sentinel cluster mode with Docker Compose

How Sentinel works

Slave nodes

The Sentinel configuration only needs the information of the master node. After connecting to the master, the Sentinel calls the INFO command to obtain the master's information, parses out the slave nodes attached to that master, and then establishes connections with those slaves in order to monitor them.

Replication information in INFO:

# Replication
role:master
connected_slaves:2
slave0:ip=172.25.0.102,port=6379,state=online,offset=258369,lag=1
slave1:ip=172.25.0.103,port=6379,state=online,offset=258508,lag=0
master_failover_state:no-failover
master_replid:a4a6a7f3b2e15d9a43c01d4ba6c842539e582d6a
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:258508
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:258508

The Sentinel sends the INFO command to all nodes every 10 seconds to obtain up-to-date topology information from each node. If a new node joins, the Sentinel starts monitoring it as well.
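The discovery step can be illustrated with a minimal sketch that pulls the slave entries out of INFO replication text, assuming the `slaveN:ip=...,port=...,state=...` line layout shown in the output above. `InfoReplicationParser` and `parseSlaves` are hypothetical names used only for illustration; they are not part of Redis or of any client library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: extracts slave entries from INFO replication text. */
final class InfoReplicationParser {

    /** Returns one field map (ip, port, state, offset, lag) per "slaveN:" line. */
    static List<Map<String, String>> parseSlaves(String info) {
        List<Map<String, String>> slaves = new ArrayList<>();
        for (String line : info.split("\\r?\\n")) {
            line = line.trim();
            // Only lines such as "slave0:ip=...,port=..." describe a slave.
            if (!line.matches("slave\\d+:.*")) {
                continue;
            }
            Map<String, String> fields = new HashMap<>();
            String body = line.substring(line.indexOf(':') + 1);
            for (String pair : body.split(",")) {
                String[] kv = pair.split("=", 2);
                if (kv.length == 2) {
                    fields.put(kv[0], kv[1]);
                }
            }
            slaves.add(fields);
        }
        return slaves;
    }
}
```

A caller would then open a monitoring connection to each returned `ip`/`port` pair; how that connection is made depends on the client library in use.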

Publish/subscribe mechanism

Sentinels discover each other through publish/subscribe (pub/sub) after they connect to the same master node.

Publish/subscribe (pub/sub) is a messaging pattern whose main purpose is to decouple message publishers from message subscribers. Redis acts as the pub/sub server and routes messages between publishers and subscribers. Subscribers use the SUBSCRIBE and PSUBSCRIBE commands to subscribe to the message types they are interested in; Redis calls these message types channels. When a publisher sends a message to a channel with the PUBLISH command, all subscribers of that channel receive it. Delivery is many-to-many: a client can subscribe to multiple channels and can also publish to multiple channels. [1]

In Sentinel mode, the Sentinels subscribe to a channel called __sentinel__:hello on each Redis service in order to discover and communicate with each other.

After subscribing, each Sentinel publishes a hello message containing its own information to this channel every 2 seconds. This lets every Sentinel learn the state of the other Sentinels, which master node they monitor, and whether new Sentinels have joined:

127.0.0.1:6371> SUBSCRIBE __sentinel__:hello
Reading messages... (press Ctrl-C to quit)
1) "subscribe"
2) "__sentinel__:hello"
3) (integer) 1
1) "message"
2) "__sentinel__:hello"
3) "172.25.0.202,26379,5134e342cc62ac76494c140b66b7fda80340e3a8,0,mymaster,172.25.0.101,6379,0"
1) "message"
2) "__sentinel__:hello"
3) "172.25.0.203,26379,5f5ce54a6f22f71c7d273cfb9eb14377b103d4ad,0,mymaster,172.25.0.101,6379,0"
1) "message"
2) "__sentinel__:hello"
3) "172.25.0.201,26379,4fa3486dfbaca9abc62b2976e821d18e697ab2db,0,mymaster,172.25.0.101,6379,0"
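To make the hello payload easier to read, here is a small sketch that splits one message into named fields. The field names are my reading of the eight comma-separated values (Sentinel IP, Sentinel port, Sentinel run ID, current epoch, master name, master IP, master port, master configuration epoch); `SentinelHello` is a hypothetical type for illustration, not an API of Redis or Jedis.

```java
/** Hypothetical value type for one payload published on __sentinel__:hello. */
final class SentinelHello {
    final String sentinelIp;
    final int sentinelPort;
    final String sentinelRunId;
    final long currentEpoch;
    final String masterName;
    final String masterIp;
    final int masterPort;
    final long masterConfigEpoch;

    private SentinelHello(String[] f) {
        sentinelIp = f[0];
        sentinelPort = Integer.parseInt(f[1]);
        sentinelRunId = f[2];
        currentEpoch = Long.parseLong(f[3]);
        masterName = f[4];
        masterIp = f[5];
        masterPort = Integer.parseInt(f[6]);
        masterConfigEpoch = Long.parseLong(f[7]);
    }

    /** Parses "ip,port,runId,epoch,masterName,masterIp,masterPort,configEpoch". */
    static SentinelHello parse(String payload) {
        String[] f = payload.split(",");
        if (f.length != 8) {
            throw new IllegalArgumentException("expected 8 fields, got " + f.length);
        }
        for (int i = 0; i < f.length; i++) {
            f[i] = f[i].trim();
        }
        return new SentinelHello(f);
    }
}
```

A Sentinel receiving such a message from an unknown run ID would treat the sender as a newly discovered Sentinel for that master.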

Monitoring

After establishing a TCP connection to a Redis node, the Sentinel periodically sends it the PING command (every 1 second by default) to check whether the node is healthy. If the node gives no valid response within the down-after-milliseconds period, the Sentinel considers it offline.
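The timeout check itself is simple. A minimal sketch, assuming the Sentinel records the time of the last valid PING reply in milliseconds (`SubjectiveDownCheck` and its parameter names are hypothetical):

```java
/** Hypothetical helper showing the down-after-milliseconds comparison. */
final class SubjectiveDownCheck {

    /**
     * @param lastValidReplyMillis time of the last valid PING reply
     * @param nowMillis            current time
     * @param downAfterMillis      the configured down-after-milliseconds value
     * @return true when the node has been silent for longer than the limit
     */
    static boolean isSubjectivelyDown(long lastValidReplyMillis, long nowMillis, long downAfterMillis) {
        return nowMillis - lastValidReplyMillis > downAfterMillis;
    }
}
```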

Subjective offline

When a Sentinel finds that a node it is connected to has stopped responding, it marks that node as subjectively offline (+sdown). This applies to the master node, slave nodes, and other Sentinels alike.

1:X 19 Aug 2021 07:26:29.837 # +sdown slave 172.25.0.103:6379 172.25.0.103 6379 @ mymaster 172.25.0.101 6379
# Sentinel subjectively offline
1:X 19 Aug 2021 08:19:19.799 # +sdown sentinel 5134e342cc62ac76494c140b66b7fda80340e3a8 172.25.0.202 26379 @ mymaster 172.25.0.101 6379
1:X 19 Aug 2021 08:24:06.612 # +sdown master mymaster 172.25.0.101 6379

When the node reconnects, the Sentinel clears its subjective offline flag (-sdown).

1:X 19 Aug 2021 08:20:04.811 # -sdown sentinel 5134e342cc62ac76494c140b66b7fda80340e3a8 172.25.0.202 26379 @ mymaster 172.25.0.101 6379

If a Sentinel determines that a slave node or another Sentinel is subjectively offline, it takes no further action. If the master node is subjectively offline, the Sentinel takes steps to determine whether the master is really down and, if so, performs a failover.

Objective offline

The process by which Sentinels confirm whether the master node is really down is called objective offline confirmation. If the master node is confirmed to be down, the Sentinel marks it as objectively offline (+odown).

1:X 19 Aug 2021 08:24:06.612 # +sdown master mymaster 172.25.0.101 6379
1:X 19 Aug 2021 08:24:06.685 # +odown master mymaster 172.25.0.101 6379 #quorum 2/2

To determine that the master node is objectively offline, a Sentinel needs to reach agreement with other Sentinels. It asks them for their view, and once the number of Sentinels (itself included) that consider the master subjectively offline reaches the configured quorum, it can confirm that the master is objectively offline. The #quorum 2/2 at the end of the +odown log line above shows this count against the configured value. Otherwise, the Sentinel cannot treat the master node as objectively offline. A majority of the Sentinels is additionally required later, when a Leader is elected to carry out the failover.

The quorum is the number of Sentinels that must agree that the master is unreachable. It is set in the Sentinel configuration:

# sentinel.conf
sentinel monitor <master-name> <ip> <redis-port> <quorum>

Objective offline voting process

When a Sentinel finds that the master node is offline, it marks the master node as being in the sdown state.

The Sentinel sends the SENTINEL is-master-down-by-addr command to the other Sentinels to ask whether they also consider the master node offline.

SENTINEL is-master-down-by-addr <ip> <port> <epoch> <runId>
> ip: IP address of the master node that the Sentinel considers offline.
> port: port of the master node that the Sentinel considers offline.
> epoch: the Sentinel's epoch, which can be understood as a term number; it is incremented by one for each round of failover.
> runId: run ID of the Sentinel.
# This command is used during the objective offline confirmation phase and the Leader election phase of failover.
SENTINEL is-master-down-by-addr 172.25.0.101 6379 3 *

The other Sentinels, on receiving the request, check the state of the master node in their own local records and reply (1 means offline, 0 means normal).
The Sentinel that sent the query adds up the number of "offline" replies it receives.
When the number of "offline" votes is not less than quorum, the master node is marked as being in the odown state and preparation for failover begins.

Note that a Sentinel that marks the master node as odown does not notify the other Sentinels; this gives it a better chance of carrying out the failover itself.
The Sentinel that initiated the vote runs a countdown. If there are still not enough votes when it expires, the Sentinel abandons the objective offline vote and keeps trying to reconnect to the master node.

If several Sentinels find the master node offline during the same period, each of them starts its own vote. The result of a vote only lets the Sentinel that initiated it confirm that the master node is offline; it is not shared with the other Sentinels. The offline confirmation is therefore carried out by multiple Sentinels at the same time.
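The tally can be sketched as follows. This is a simplified illustration that applies only the quorum threshold, which is the figure the `#quorum 2/2` suffix of the +odown log line reports; the majority requirement comes into play in the Leader election. `ObjectiveDownCheck` and its parameters are hypothetical names.

```java
import java.util.List;

/** Hypothetical helper: decides odown from the replies to is-master-down-by-addr. */
final class ObjectiveDownCheck {

    /**
     * @param selfSeesDown whether this Sentinel has already marked the master sdown
     * @param replies      replies from the other Sentinels (true = "offline")
     * @param quorum       the quorum value from the "sentinel monitor" line
     */
    static boolean isObjectivelyDown(boolean selfSeesDown, List<Boolean> replies, int quorum) {
        if (!selfSeesDown) {
            return false; // only a Sentinel that sees sdown starts asking
        }
        int agree = 1; // this Sentinel counts itself
        for (boolean reply : replies) {
            if (reply) {
                agree++;
            }
        }
        return agree >= quorum;
    }
}
```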

Failover

Once a Sentinel marks the master node as odown, it starts trying to fail over.

Failover is driven mainly by the sentinelFailoverStateMachine(sentinelRedisInstance) function [2]. This function is a state machine with five states, which means failover is divided into five major steps:

| SENTINEL_FAILOVER_STATE | Description | Function invoked |
| :--- | :--- | :--- |
| `WAIT_START` | Leader election | sentinelFailoverWaitStart(sentinelRedisInstance) |
| `SELECT_SLAVE` | Master selection | sentinelFailoverSelectSlave(sentinelRedisInstance) |
| `SEND_SLAVEOF_NOONE` | Slave identity removal | sentinelFailoverSendSlaveOfNoOne(sentinelRedisInstance) |
| `WAIT_PROMOTION` | Promotion to Master | sentinelFailoverWaitPromotion(sentinelRedisInstance) |
| `RECONF_SLAVES` | Slave node configuration | sentinelFailoverReconfNextSlave(sentinelRedisInstance) |

SENTINEL_FAILOVER_STATE is the prefix for all states.
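As a reading aid, the five steps can be modelled as an ordered enum. This is only a sketch of the ordering described in this article; the real state machine lives in Redis's C source, and `FailoverState` is a hypothetical name.

```java
/** Hypothetical model of the five failover steps, in execution order. */
enum FailoverState {
    WAIT_START,          // Leader election
    SELECT_SLAVE,        // choose the slave to promote
    SEND_SLAVEOF_NOONE,  // remove the slave identity
    WAIT_PROMOTION,      // wait until the node reports role master
    RECONF_SLAVES;       // point the remaining slaves at the new master

    /** Returns the following step, or null after the last one. */
    FailoverState next() {
        int i = ordinal() + 1;
        return i < values().length ? values()[i] : null;
    }
}
```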

Leader election

The Sentinel enters the WAIT_START state to prepare for failover and waits until it becomes the Leader of the Sentinel cluster. If it does not become the Leader within the timeout period, it calls the sentinelAbortFailover() function and ends the failover.

Before a Sentinel can fail over, it needs the support of a majority of the Sentinels. In addition, several Sentinels may try to start a failover at the same time, so a round of election is held first. The Sentinel with the most votes becomes the Leader, and only the Leader may perform the failover.

Election principle

Sentinel uses the Raft algorithm for Leader election, so the election process is essentially the same as in Raft; for details see: [distributed] Raft algorithm – Leader election. Anything not explained clearly below is covered in that article.

Sentinel properties

According to the Raft algorithm, each Sentinel needs to store two pieces of information: the current term and the candidate it supports. In Redis these are defined as current_epoch and leader; the leader field stores the runId of the supported candidate.

Vote request

The Sentinel still uses the SENTINEL is-master-down-by-addr command for its vote request, but in addition to its epoch it also puts its own runId in the parameters to identify the initiator of the vote. For example:

SENTINEL is-master-down-by-addr 172.26.0.101 6379 4 9effe0cdc338e245391055caa45a05adf61fed37

Voting method

According to the Raft algorithm, a Sentinel's voting rule is to vote for whichever Sentinel its leader field points to.

1. When a Sentinel wants to run for Leader, it increments its current_epoch field by one and points its leader field at itself.
2. When a Sentinel receives a vote request whose epoch is less than or equal to its own current_epoch, it votes for the Sentinel indicated by its own leader field. If the epoch is greater than its own current_epoch, it updates its current_epoch, sets its leader field to the runId in the request, and votes for that Sentinel.
3. It follows from points 1 and 2 that when a campaigning Sentinel receives a vote request from another Sentinel whose epoch equals its own, it always votes for itself.

Thus, within one epoch's election round, a Sentinel that is not running always votes for the first Sentinel that sent it a request, and a Sentinel that is running always votes for itself. Each voter can therefore vote for only one candidate per epoch, which keeps the vote correct and fair.
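The voting rule above can be sketched in a few lines. This follows the article's simplified description rather than the exact logic in the Redis source, and `SentinelVoter` with its methods is a hypothetical class for illustration.

```java
/** Hypothetical model of one Sentinel's voting state (current_epoch and leader). */
final class SentinelVoter {
    private final String myRunId;
    private long currentEpoch;
    private String leader; // run ID this Sentinel votes for; null until it has voted

    SentinelVoter(String myRunId) {
        this.myRunId = myRunId;
    }

    /** Start a campaign: move to the next epoch and vote for ourselves. */
    void startCampaign() {
        currentEpoch++;
        leader = myRunId;
    }

    /** Handle a vote request and return the run ID this Sentinel votes for. */
    String onVoteRequest(long requestEpoch, String candidateRunId) {
        if (requestEpoch > currentEpoch) {
            // A newer epoch: adopt it and support the requester.
            currentEpoch = requestEpoch;
            leader = candidateRunId;
        }
        // Same or older epoch: keep the vote already recorded.
        return leader;
    }

    long currentEpoch() {
        return currentEpoch;
    }
}
```

With this rule, a non-candidate supports whichever request reaches it first in a given epoch, and a candidate that has already bumped its epoch keeps voting for itself.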

The election process

The election process among the Sentinels can be summarized as follows:

Once a Sentinel confirms that the master node is odown, it increments its current_epoch by one, points its leader field at itself, and sends a vote request to the other Sentinels.

# Sentinel-1: current_epoch++ becomes 1
1:X 19 Aug 2021 08:24:06.612 # +sdown master mymaster 172.25.0.101 6379
1:X 19 Aug 2021 08:24:06.685 # +odown master mymaster 172.25.0.101 6379 #quorum 2/2
1:X 19 Aug 2021 08:24:06.685 # +new-epoch 1
1:X 19 Aug 2021 08:24:06.685 # +try-failover master mymaster 172.25.0.101 6379

When another Sentinel receives a vote request, it checks whether the epoch in the request is greater than its own current_epoch. If it is, the Sentinel updates its current_epoch, points its leader field at the sender, and votes for the sender. If the epoch is less than or equal to its own, it votes for the Sentinel its leader field already points to (possibly itself).

# Log (Sentinel-1): having just entered the campaign, it receives the vote request from Sentinel-2 and votes for itself.
1:X 19 Aug 2021 08:24:06.707 # +vote-for-leader 4fa3486dfbaca9abc62b2976e821d18e697ab2db 1

# Log (Sentinel-2): having just entered the campaign, it sends out its vote request and votes for itself.
1:X 19 Aug 2021 08:24:06.707 # +vote-for-leader 9effe0cdc338e245391055caa45a05adf61fed37 1

# Sentinel-3: has only just found out that the master node is down
1:X 19 Aug 2021 08:24:06.589 # +sdown master mymaster 172.25.0.101 6379
1:X 19 Aug 2021 08:24:06.719 # +new-epoch 1
1:X 19 Aug 2021 08:24:06.731 # +vote-for-leader 4fa3486dfbaca9abc62b2976e821d18e697ab2db 1

Each time a Sentinel receives a response, it records the other Sentinel's vote and updates its own tally (the number of votes cast for it, including its own). When it holds more than half of the votes and not fewer than quorum, it becomes the Leader and announces the result to all Sentinels.
```
# Sentinel-1: records the votes of the other Sentinels
1:X 19 Aug 2021 08:24:06.707 # 9effe0cdc338e245391055caa45a05adf61fed37 voted for 9effe0cdc338e245391055caa45a05adf61fed37 1
1:X 19 Aug 2021 08:24:06.731 # 5f5ce54a6f22f71c7d273cfb9eb14377b103d4ad voted for 4fa3486dfbaca9abc62b2976e821d18e697ab2db 1
```

If a Sentinel does not have enough votes by the time the voting timer expires, it treats the election as lost and enters a random waiting period, after which it may stand for election again.

The Sentinel does not need to check whether anyone else has won, because any Sentinel that wins will announce its success.

# Sentinel-1: the vote was two to one, but there were four Sentinels in total; one of them was offline and did not vote.
1:X 19 Aug 2021 08:24:17.589 # -failover-abort-not-elected master mymaster 172.25.0.101 6379
1:X 19 Aug 2021 647 # Next failover delay: I will not start a failover before Thu Aug 19 08:30:07 2021

If no Sentinel declares victory during the waiting period, the Sentinel stands for election again once the waiting period ends, starting over from the first step.

# Sentinel-2: current_epoch++ becomes 2
1:X 19 Aug 2021 08:30:07.412 # +new-epoch 2
1:X 19 Aug 2021 08:30:07.412 # +try-failover master mymaster 172.25.0.101 6379
1:X 19 Aug 2021 08:30:07.443 # +vote-for-leader 9effe0cdc338e245391055caa45a05adf61fed37 2
1:X 19 Aug 2021 08:30:07.500 # 5f5ce54a6f22f71c7d273cfb9eb14377b103d4ad voted for 9effe0cdc338e245391055caa45a05adf61fed37 2
1:X 19 Aug 2021 08:30:07.507 # 4fa3486dfbaca9abc62b2976e821d18e697ab2db voted for 9effe0cdc338e245391055caa45a05adf61fed37 2
1:X 19 Aug 2021 08:30:07.520 # +elected-leader master mymaster 172.25.0.101 6379
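The winning condition described above (more than half of the Sentinels and not fewer than quorum) can be written as a small check. `LeaderElectionTally` is a hypothetical helper; the total number of Sentinels is assumed to include those that are currently offline, which is why two votes out of four were not enough in the first round of the logs above.

```java
/** Hypothetical helper: decides whether a candidate has won the Leader election. */
final class LeaderElectionTally {

    /**
     * @param votesForMe     votes received, including the candidate's own vote
     * @param totalSentinels all known Sentinels for this master, online or not
     * @param quorum         the configured quorum
     */
    static boolean hasWon(int votesForMe, int totalSentinels, int quorum) {
        int majority = totalSentinels / 2 + 1; // strictly more than half
        return votesForMe >= majority && votesForMe >= quorum;
    }
}
```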

Master selection

After being elected Leader, the Sentinel enters the SELECT_SLAVE state and selects a new master node.

1:X 19 Aug 2021 08:30:07.520 # +failover-state-select-slave master mymaster 172.25.0.101 6379

The new master node is selected according to the following rules:

Exclude:
- Slave nodes that are offline (sdown, odown).
- Disconnected nodes (PING timeout, disconnected state).
- Nodes configured never to become master (replica-priority = 0).
- Slave nodes that have been disconnected from the master node for too long (more than 10 times down-after-milliseconds).
Then choose by the following criteria, in descending order of precedence:
- The node with the highest priority (smallest replica-priority).
- The node with the largest replication offset.
- A node that reports a runId.
- The node whose runId is smallest in lexicographic order.

If no suitable node can be selected, the selection is retried until a new master node is chosen.
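The filtering and ordering rules above can be sketched as a filter followed by a three-level comparison. This is an illustration under stated assumptions: `SlaveSelector`, `Slave`, and all field names are hypothetical, every slave is assumed to report a runId, and the "disconnected too long" limit is taken as exactly 10 times down-after-milliseconds.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of choosing the slave to promote. */
final class SlaveSelector {

    /** Hypothetical snapshot of what a Sentinel knows about one slave. */
    static final class Slave {
        final String runId;
        final int priority;              // replica-priority; 0 means never promote
        final long replOffset;           // replication offset
        final boolean down;              // sdown or odown
        final boolean disconnected;      // link down or PING timeout
        final long masterLinkDownMillis; // time disconnected from the old master

        Slave(String runId, int priority, long replOffset,
              boolean down, boolean disconnected, long masterLinkDownMillis) {
            this.runId = runId;
            this.priority = priority;
            this.replOffset = replOffset;
            this.down = down;
            this.disconnected = disconnected;
            this.masterLinkDownMillis = masterLinkDownMillis;
        }
    }

    /** Lower priority value first, then larger offset, then smaller runId. */
    static int compare(Slave a, Slave b) {
        if (a.priority != b.priority) {
            return Integer.compare(a.priority, b.priority);
        }
        if (a.replOffset != b.replOffset) {
            return Long.compare(b.replOffset, a.replOffset);
        }
        return a.runId.compareTo(b.runId);
    }

    /** Returns the best candidate, or empty if every slave is excluded. */
    static Optional<Slave> select(List<Slave> slaves, long downAfterMillis) {
        return slaves.stream()
                .filter(s -> !s.down && !s.disconnected)
                .filter(s -> s.priority != 0)
                .filter(s -> s.masterLinkDownMillis <= 10 * downAfterMillis)
                .min(SlaveSelector::compare);
    }
}
```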

Slave identity removal

Once the new master node has been chosen, the Sentinel enters the SEND_SLAVEOF_NOONE state and removes the slave identity of that node.

1:X 19 Aug 2021 08:30:07.587 * +failover-state-send-slaveof-noone slave 172.25.0.102:6379 172.25.0.102 6379 @ mymaster 172.25.0.101 6379

The Sentinel sends the SLAVEOF NO ONE command to the slave node. The node disconnects from the original master node, resets its replication ID, performs a persistence rewrite, and begins changing its replication role to master.

Promotion to Master

After sending the command, the Sentinel enters the WAIT_PROMOTION state and waits for the node to be promoted to master.

1:X 19 Aug 2021 08:30:07.679 * +failover-state-wait-promotion slave 172.25.0.102:6379 172.25.0.102 6379 @ mymaster 172.25.0.101 6379

While waiting, the Sentinel sends the INFO command to the node every second until its role becomes master.

Configuring slave nodes

Once the node has been promoted to master, the Sentinel enters the RECONF_SLAVES state and updates the configuration of all slave nodes so that they replicate the new master.

1:X 19 Aug 2021 08:30:08.374 # +promoted-slave slave 172.25.0.102:6379 172.25.0.102 6379 @ mymaster 172.25.0.101 6379
1:X 19 Aug 2021 08:30:08.374 # +failover-state-reconf-slaves master mymaster 172.25.0.101 6379
1:X 19 Aug 2021 08:30:08.374 # +failover-state-reconf-slaves master mymaster 172.25.0.101 6379

The Sentinel changes a slave's replication configuration by sending it the SLAVEOF <ip> <port> command, which makes the slave replicate the new master.

Notification

After the Sentinel completes a failover, it notifies clients that the master node has been replaced so that they can connect to the new master node.

When using Sentinel mode, clients should normally use the Sentinel support of their Redis connection library, such as JedisSentinelPool in Jedis. The client asks the Sentinels for the master node information and then connects, rather than hard-coding the master node in a configuration file, because the master node can change.

Sentinel also notifies clients through the publish/subscribe mechanism. Every client connected to a Sentinel subscribes to that Sentinel's +switch-master channel. When the Leader finishes the failover, it sends the new master node configuration to the other Sentinels. All Sentinels then publish the master switch on the +switch-master channel, at which point the client sees the change and connects to the new master.

# log
1:X 21 Aug 2021 08:16:05.963 # +failover-end master mymaster 172.25.0.101 6379
1:X 21 Aug 2021 08:16:05.963 # +switch-master mymaster 172.25.0.101 6379 172.25.0.103 637

# sentinel-1 (Leader)
127.0.0.1:26371> PSUBSCRIBE *
1) "pmessage"
2) "*"
3) "+failover-end"
4) "master mymaster 172.25.0.102 6379"
1) "pmessage"
2) "*"
3) "+switch-master"
4) "mymaster 172.25.0.102 6379 172.25.0.103 6379"

# sentinel-2
127.0.0.1:26372> PSUBSCRIBE *
1) "pmessage"
2) "*"
3) "+config-update-from"
4) "sentinel 3fbda0aa37fbc1eb6dfde66677361a4ef09a40e3 172.25.0.201 26379 @ mymaster 172.25.0.102 6379"
1) "pmessage"
2) "*"
3) "+switch-master"
4) "mymaster 172.25.0.102 6379 172.25.0.103 6379"

# sentinel-3
127.0.0.1:26373> PSUBSCRIBE *
1) "pmessage"
2) "*"
3) "+config-update-from"
4) "sentinel 3fbda0aa37fbc1eb6dfde66677361a4ef09a40e3 172.25.0.201 26379 @ mymaster 172.25.0.102 6379"
1) "pmessage"
2) "*"
3) "+switch-master"
4) "mymaster 172.25.0.102 6379 172.25.0.103 6379"

Client code reference:

// Excerpt from redis.clients.jedis.JedisSentinelPool
jedis.subscribe(new JedisPubSub() {
    @Override
    public void onMessage(String channel, String message) {
        log.debug("Sentinel {} published: {}.", hostPort, message);
        // Message example: mymaster 172.25.0.102 6379 172.25.0.103 6379
        String[] switchMasterMsg = message.split(" ");
        if (switchMasterMsg.length > 3) {
            if (masterName.equals(switchMasterMsg[0])) {
                // Index 3 => 172.25.0.103 (new master IP), index 4 => 6379 (new master port)
                initMaster(toHostAndPort(Arrays.asList(switchMasterMsg[3], switchMasterMsg[4])));
            } else {
                log.debug("Ignoring message on +switch-master for master name {}, our master name is {}", switchMasterMsg[0], masterName);
            }
        } else {
            log.error("Invalid message received on Sentinel {} on channel +switch-master: {}", hostPort, message);
        }
    }
}, "+switch-master");

A background thread in the client subscribes to the +switch-master channel. When a message arrives, it parses the message and calls initMaster() to re-initialize the global master node.

[1] Redis publish/subscribe | xiao technology blog (lanjingling.github.io)
[2] Redis Sentinel Principle and Implementation (II) – Cloud + Community – Tencent Cloud (Tencent.com)