Moment For Technology

Deep analysis of Redis series (2) - Redis sentinel mode and high availability cluster

Posted on Dec. 1, 2022, 8:40 a.m. by Patricia Parsons
Category: Back-end  Tags: back-end, Redis, server, load balancing

Preface

In Redis's master-slave replication mode, once the primary node can no longer provide service, a secondary node has to be promoted to primary manually and the clients have to be told to update the primary's address. For many deployments this manual procedure is unacceptable. Since version 2.8, Redis has provided the Sentinel mechanism to solve this problem.

Main text

1. Redis High Availability Overview

For a web server, high availability refers to the proportion of time the server is accessible, measured by how long it provides normal service (99.9%, 99.99%, 99.999%, and so on). At the Redis level the notion is broader: besides keeping the service available (for example through master/slave separation and fast disaster recovery), data capacity expansion and data safety must also be considered.

In Redis, the techniques for achieving high availability are persistence, replication, Sentinel, and cluster. Here is a brief description of what each does and which problems it solves:

  • Persistence: Persistence is the simplest high-availability technique. It is mainly used for data backup: data is stored on disk so that it is not lost when the process exits.

  • Replication: Replication is the foundation of Redis high availability. Both Sentinel and cluster build on it. Replication provides multi-machine data backup, load balancing for read operations, and simple fault recovery. Its drawbacks are that fault recovery cannot be automated, write operations cannot be load balanced, and storage capacity is limited to a single machine.

  • Sentinel: Building on replication, Sentinel adds automatic failure recovery. Its drawbacks are that write operations cannot be load balanced and storage capacity is limited to a single machine.

  • Cluster: With cluster, Redis removes the limits on write load balancing and single-machine storage capacity, giving a relatively complete high-availability solution.

2. Basic concept of Redis Sentinel

Redis Sentinel is Redis's high-availability implementation. Sentinel is a tool that manages multiple Redis instances and provides monitoring, notification, and automatic failover for them. Its basic concepts are introduced briefly below.

Basic terms:

Term | Logical structure | Physical structure
Redis data node | Master and slave nodes | Processes of the master and slave nodes
Master node (master) | Redis master database | A standalone Redis process
Slave node (slave) | Redis slave database | A standalone Redis process
Sentinel node | Monitors Redis data nodes | A standalone Sentinel process
Sentinel node set | Abstract group of several Sentinel nodes | Several Sentinel node processes
Redis Sentinel | Redis high-availability implementation | The Sentinel node set plus the Redis data node processes
Application client | One or more clients | One or more client processes or threads

Figure: schematic diagrams of the Redis master-slave replication mode and the Sentinel high-availability architecture.

3. Problems with Redis master/slave replication

Redis primary-secondary replication synchronizes data from the primary node to the secondary node. The secondary node has two functions:

  1. If the primary node goes down, the secondary node can act as a backup of the primary node at any time.
  2. Expand the read capability of the primary node to share the read pressure.

The following problems exist in the primary and secondary replication:

  1. When the primary node goes down, a secondary node has to be promoted to primary, the primary address used by the application has to be changed, and all other secondary nodes have to be ordered to replicate from the new primary. The whole process requires manual intervention.

  2. The write capability of the primary node is limited by the single node.

  3. The storage capacity of the primary node is limited by the single node.

  4. The drawbacks of native replication were more prominent in earlier versions. For example, after a replication link breaks, the slave node issues psync; if partial resynchronization fails, a full synchronization is performed, and while the primary produces the full snapshot it may stall for milliseconds or even seconds.

4. Redis Sentinel in depth

4.1. Architecture of Redis Sentinel

4.2. Main functions of Redis Sentinel

Sentinel provides primary node survival detection, primary/secondary running status detection, failover, and primary/secondary switchover. Redis' Sentinel minimum configuration is one master and one slave.

Redis' Sentinel system can be used to manage multiple Redis servers. The system can perform four tasks:

  • Monitoring

Sentinel constantly checks whether the primary and secondary servers are running properly.

  • Notification

When a monitored Redis server has a problem, Sentinel can notify administrators or other applications through its API or a script.

  • Automatic failover

When the primary node stops working properly, Sentinel starts an automatic failover: it promotes one of the failed primary's slave nodes to be the new master node and points the other slave nodes at the new master node.

  • Configuration provider

In Redis Sentinel mode, a client application connects to the set of Sentinel nodes at initialization and obtains the address of the primary node from them.

4.3. Subjective offline and objective offline

By default, each Sentinel node sends PING to the Redis nodes and to the other Sentinel nodes once per second to determine whether they are online.

  • Subjective offline

Subjective offline applies to all primary and secondary nodes. If Sentinel does not receive a valid reply from the target node within down-after-milliseconds, it marks that node as subjectively offline.

  • Objective offline

Objective offline applies only to the primary node. When a Sentinel node finds the primary node faulty, it uses the SENTINEL is-master-down-by-addr command to ask the other Sentinel nodes for their judgment of that node. If the number of Sentinel nodes that consider the primary unreachable reaches the configured quorum, the Sentinel node marks the primary as objectively offline.
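The objective-offline decision can be sketched in a few lines of Python. This is an illustrative model only, not Sentinel's implementation: the function name and the boolean vote list are hypothetical, and it assumes the primary is marked objectively offline once the number of agreeing Sentinels (the asking Sentinel included) reaches the configured quorum.

```python
from typing import List


def is_objectively_down(own_vote: bool, peer_votes: List[bool], quorum: int) -> bool:
    """Model of the ODOWN decision made by one Sentinel.

    own_vote   -- True if this Sentinel already marked the primary SDOWN
    peer_votes -- hypothetical replies from the other Sentinels (True = "down")
    quorum     -- the quorum value from `sentinel monitor`
    """
    if not own_vote:
        # A Sentinel only asks its peers after it marked the primary SDOWN itself.
        return False
    agreeing = 1 + sum(peer_votes)  # this Sentinel plus every agreeing peer
    return agreeing >= quorum


# Three Sentinels, quorum 2: one agreeing peer is enough.
print(is_objectively_down(True, [True, False], quorum=2))
# No peer agrees, so the primary stays only subjectively offline.
print(is_objectively_down(True, [False, False], quorum=2))
```

With the three-Sentinel, quorum-2 layout used later in this article, a single Sentinel losing its own network link cannot trigger a failover by itself.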

4.4. Communication commands of Sentinel

When a Sentinel node connects to a Redis instance, it creates two connections: a command (cmd) connection and a pub/sub connection. Sentinel sends commands to Redis over the cmd connection, and uses the pub/sub connection to discover the other Sentinel instances that monitor the same Redis instance.

Commands for Sentinel to interact with Redis master and slave nodes include:

Command | Purpose
PING | Sentinel sends PING to a Redis node to check the node's status
INFO | Sentinel sends INFO to a Redis node to obtain information about its slave nodes
PUBLISH | Sentinel publishes its own information and the master-related configuration to the __sentinel__:hello channel of the Redis nodes it monitors
SUBSCRIBE | Sentinel subscribes to the __sentinel__:hello channel of the Redis master and slave nodes to discover other Sentinel nodes monitoring the same service

Commands for Sentinel to interact with Sentinel mainly include:

Command | Purpose
PING | Sentinel sends PING to the other Sentinel nodes to check their status
SENTINEL is-master-down-by-addr | Negotiates the state of the master node with the other Sentinels; if the master node is in the SDOWN state, a new master node is elected automatically by vote

4.5. Working principle of Redis Sentinel

Each Sentinel node is required to perform the following tasks on a regular basis:

  1. Once per second, each Sentinel sends a PING command to every primary server, slave server, and other Sentinel instance it knows of.

  2. If the time since an instance's last valid reply to PING exceeds down-after-milliseconds, Sentinel marks that instance as subjectively offline.

  3. If a primary server is marked as subjectively offline, all Sentinel nodes monitoring that primary server confirm, once per second, whether it has really entered the subjectively offline state.

  4. If a primary server is marked as subjectively offline and a sufficient number of Sentinels (at least the number specified in the configuration file) agree with this judgment within the specified time range, the primary server is marked as objectively offline.

  5. Normally, each Sentinel sends an INFO command to every primary and slave server it knows of once every 10 seconds. When a primary server is marked as objectively offline, the frequency at which Sentinel sends INFO to all slaves of that offline primary changes from once every 10 seconds to once per second.

  6. Sentinel negotiates the state of the master node with the other Sentinels. If the master node is in the SDOWN state, a new master node is elected automatically by vote, and the remaining slave nodes are pointed at the new master node for data replication.

  7. When not enough Sentinels agree that the primary server is offline, its objectively offline status is removed. When the primary server again returns a valid reply to a Sentinel's PING command, its subjectively offline status is removed.

Note: A valid PING reply is one of +PONG, -LOADING, or -MASTERDOWN. If the server returns anything else, or does not reply to the PING command within the specified time, Sentinel treats the reply as invalid.
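The PING rule in the note and the down-after-milliseconds check can be modelled as two small predicates. This is a hedged sketch: the function names and the millisecond timestamps are hypothetical, and the real Sentinel keeps this state per instance inside its own event loop.

```python
from typing import Optional

# The three reply prefixes that the note above lists as valid.
VALID_PING_REPLIES = ("+PONG", "-LOADING", "-MASTERDOWN")


def is_valid_ping_reply(reply: Optional[str]) -> bool:
    """True if the reply counts as valid; None models 'no reply in time'."""
    return reply is not None and reply.upper().startswith(VALID_PING_REPLIES)


def is_subjectively_down(last_valid_reply_ms: int, now_ms: int, down_after_ms: int) -> bool:
    """True once the last valid reply is older than down-after-milliseconds."""
    return now_ms - last_valid_reply_ms > down_after_ms


print(is_valid_ping_reply("+PONG"))             # a healthy instance
print(is_valid_ping_reply("-ERR unknown"))      # any other reply is invalid
print(is_subjectively_down(0, 5001, 5000))      # silent for longer than 5000 ms
```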

5. Redis Sentinel construction

5.1. Instructions for deployment of Redis Sentinel

  1. A robust Redis Sentinel cluster should use at least three Sentinel instances and be sure to place them on different machines and even in different physical areas.

  2. Sentinel cannot guarantee strong consistency.

  3. Common client application libraries support Sentinel.

  4. Sentinel requires constant testing and observation to ensure high availability.
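The recommendation of at least three Sentinel instances can be illustrated with some simple arithmetic. The sketch below assumes, as the Redis Sentinel documentation describes, that a failover must be authorized by a majority of the Sentinel processes; the helper names are hypothetical.

```python
def majority(sentinel_count: int) -> int:
    """Smallest number of Sentinels that forms a majority."""
    return sentinel_count // 2 + 1


def tolerated_failures(sentinel_count: int) -> int:
    """How many Sentinels may be lost while a majority can still be formed."""
    return sentinel_count - majority(sentinel_count)


for count in (1, 2, 3, 5):
    print(count, majority(count), tolerated_failures(count))
```

Under this assumption one or two Sentinels tolerate no Sentinel failure at all, while three tolerate one, which is why three is the usual minimum.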

5.2. Redis Sentinel profile

# The default port on which Sentinel instances run is 26379
port 26379
# Sentinel working directory
dir ./

# Sentinel monitors the Redis primary node
## ip: IP address of the primary node
## redis-port: port of the primary node
## master-name: a name you choose for the master node. It may only consist of the letters A-Z, the digits 0-9, and the characters ".-_".
## quorum: when this many Sentinels consider the master unreachable, the master is considered objectively offline
# sentinel monitor master-name ip redis-port quorum
sentinel monitor mymaster 127.0.0.1 6379 2

# When requirepass is enabled in a Redis instance, all clients connecting to that instance must provide a password.
# sentinel auth-pass master-name password  
sentinel auth-pass mymaster 123456  

# Specifies the maximum interval for a primary node to respond to sentinel. After this interval, sentinel will consider the primary node offline. The default is 30 seconds
# sentinel down-after-milliseconds master-name milliseconds
sentinel down-after-milliseconds mymaster 30000  

# Specifies the maximum number of slaves that may synchronize with the new master at the same time during a failover. The smaller the number, the longer the failover takes; the larger the number, the more slaves are unavailable at once because of replication. Setting the value to 1 ensures that only one slave at a time is unable to process command requests.
# sentinel parallel-syncs master-name numslaves
sentinel parallel-syncs mymaster 1  

# Fail-over timeout (default: 3 minutes)
## 1. The interval between two failover of the same Sentinel and the same Master.
## 2. Starts when a slave synchronizes data from an incorrect master and ends when the slave is corrected to synchronize data from the correct master.
## 3. The time required when you want to cancel an ongoing failover.
## 4. The maximum time allowed for all slaves to be reconfigured to point to the new master during a failover. Even after this timeout the slaves will still be reconfigured to point to the new master, but no longer following the parallel-syncs rule.
# sentinel failover-timeout master-name milliseconds  
sentinel failover-timeout mymaster 180000

# Notification script: called when Sentinel generates any warning-level event (such as a Redis instance becoming subjectively or objectively offline). A script may run for at most 60 seconds; if it exceeds this time it is terminated with SIGKILL and then re-executed.
# The following rules apply to the results of the script:
## 1. If the script returns 1, the script will be executed again later. The default number of repeats is 10.
## 2. If the script returns 2, or a value higher than 2, the script will not be repeated.
## 3. If the script was terminated during execution due to a system interrupt signal, it will behave the same as if the return value was 1.
# sentinel notification-script master-name script-path  
sentinel notification-script mymaster /var/redis/notify.sh

# This script should be generic, can be called multiple times, not targeted.
# sentinel client-reconfig-script master-name script-path
sentinel client-reconfig-script mymaster /var/redis/reconfig.sh
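The return-code rules for the notification script in the configuration above can be turned into a small script skeleton. This is only a sketch under assumptions: it assumes Sentinel passes the event type and the event description as the two command-line arguments (as the comments in the stock sentinel.conf describe), and the helper names and the message format are hypothetical.

```python
import sys

EXIT_OK = 0        # event handled
EXIT_RETRY = 1     # Sentinel will run the script again later
EXIT_NO_RETRY = 2  # 2 or higher: Sentinel will not repeat the script


def build_message(event_type: str, description: str) -> str:
    """Format one Sentinel event for a log line or an alert (hypothetical format)."""
    return f"[sentinel] {event_type}: {description}"


def exit_code(delivered: bool, transient_error: bool) -> int:
    """Map the delivery result onto the return-code rules from the config comments."""
    if delivered:
        return EXIT_OK
    return EXIT_RETRY if transient_error else EXIT_NO_RETRY


def main(argv: list) -> int:
    if len(argv) != 3:
        # Wrong usage: running the script again would not help.
        return EXIT_NO_RETRY
    print(build_message(argv[1], argv[2]), file=sys.stderr)
    return exit_code(delivered=True, transient_error=False)


# When saved as /var/redis/notify.sh (or any executable path), finish the file with:
#     sys.exit(main(sys.argv))
```

Returning 1 only for transient problems (for example a mail server that is temporarily unreachable) keeps Sentinel from retrying a call that can never succeed.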

5.3. Node planning of Redis Sentinel

Role | IP address | Port
Redis Master | 10.206.20.231 | 16379
Redis Slave1 | 10.206.20.231 | 26379
Redis Slave2 | 10.206.20.231 | 36379
Redis Sentinel1 | 10.206.20.231 | 16380
Redis Sentinel2 | 10.206.20.231 | 26380
Redis Sentinel3 | 10.206.20.231 | 36380

5.4. Setup of Redis Sentinel

5.4.1. Configuration management of redis-server

Copy three redis.conf files to the /usr/local/redis-sentinel directory. The three configuration files correspond to the startup configuration of the Master, slave1, and slave2 Redis nodes.

$ sudo cp /usr/local/redis-4.0.11/redis.conf /usr/local/redis-sentinel/redis-16379.conf
$ sudo cp /usr/local/redis-4.0.11/redis.conf /usr/local/redis-sentinel/redis-26379.conf
$ sudo cp /usr/local/redis-4.0.11/redis.conf /usr/local/redis-sentinel/redis-36379.conf

Modify the following three configuration files:

  • Primary node: redis-16379.conf
daemonize yes
pidfile /var/run/redis-16379.pid
logfile /var/log/redis/redis-16379.log
port 16379
bind 0.0.0.0
timeout 300
databases 16
dbfilename dump-16379.db
dir ./redis-workdir
masterauth 123456
requirepass 123456

  • Secondary node 1: redis-26379.conf
daemonize yes
pidfile /var/run/redis-26379.pid
logfile /var/log/redis/redis-26379.log
port 26379
bind 0.0.0.0
timeout 300
databases 16
dbfilename dump-26379.db
dir ./redis-workdir
masterauth 123456
requirepass 123456
slaveof 127.0.0.1 16379

  • Secondary node 2: redis-36379.conf
daemonize yes
pidfile /var/run/redis-36379.pid
logfile /var/log/redis/redis-36379.log
port 36379
bind 0.0.0.0
timeout 300
databases 16
dbfilename dump-36379.db
dir ./redis-workdir
masterauth 123456
requirepass 123456
slaveof 127.0.0.1 16379

For automatic failover to work, it is recommended to set masterauth in every redis.conf file, including the primary's. An automatic failover rewrites only the master-slave relationship (slaveof), not masterauth, so a former primary that rejoins as a slave needs the password already in place. If Redis has no password, this can be ignored.

5.4.2. Starting verification for redis-server

Start the Redis nodes 16379, 26379, and 36379 in sequence. The start commands and the startup logs are as follows:

Redis start command:

$ sudo redis-server /usr/local/redis-sentinel/redis-16379.conf
$ sudo redis-server /usr/local/redis-sentinel/redis-26379.conf
$ sudo redis-server /usr/local/redis-sentinel/redis-36379.conf

View the running Redis processes:

$ ps -ef | grep redis-server
    0  7127     1   0  2:16PM ??  0:01.84 redis-server 0.0.0.0:16379
    0  7133     1   0  2:16PM ??  0:01.73 redis-server 0.0.0.0:26379
    0  7137     1   0  2:16PM ??  0:01.70 redis-server 0.0.0.0:36379

To view the startup logs of Redis:

  • Node redis-16379
$ cat /var/log/redis/redis-16379.log
7126:C 22 Aug 14:16:38.907 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
7126:C 22 Aug 14:16:38.908 # Redis version=4.0.11, bits=64, commit=00000000, modified=0, pid=7126, just started
7126:C 22 Aug 14:16:38.908 # Configuration loaded
7127:M 22 Aug 14:16:38.910 * Increased maximum number of open files to 10032 (it was originally set to 256).
7127:M 22 Aug 14:16:38.912 * Running mode=standalone, port=16379.
7127:M 22 Aug 14:16:38.913 # Server initialized
7127:M 22 Aug 14:16:38.913 * Ready to accept connections
7127:M 22 Aug 14:16:48.416 * Slave 127.0.0.1:26379 asks for synchronization
7127:M 22 Aug 14:16:48.416 * Full resync requested by slave 127.0.0.1:26379
7127:M 22 Aug 14:16:48.416 * Starting BGSAVE for SYNC with target: disk
7127:M 22 Aug 14:16:48.416 * Background saving started by pid 7134
7134:C 22 Aug 14:16:48.433 * DB saved on disk
7127:M 22 Aug 14:16:48.487 * Background saving terminated with success
7127:M 22 Aug 14:16:48.494 * Synchronization with slave 127.0.0.1:26379 succeeded
7127:M 22 Aug 14:16:51.848 * Slave 127.0.0.1:36379 asks for synchronization
7127:M 22 Aug 14:16:51.849 * Full resync requested by slave 127.0.0.1:36379
7127:M 22 Aug 14:16:51.849 * Starting BGSAVE for SYNC with target: disk
7127:M 22 Aug 14:16:51.850 * Background saving started by pid 7138
7138:C 22 Aug 14:16:51.862 * DB saved on disk
7127:M 22 Aug 14:16:51.919 * Background saving terminated with success
7127:M 22 Aug 14:16:51.923 * Synchronization with slave 127.0.0.1:36379 succeeded

The following two log lines show that redis-16379 is the Redis primary node and that redis-26379 and redis-36379 are secondary nodes synchronizing data from it.

7127:M 22 Aug 14:16:48.416 * Slave 127.0.0.1:26379 asks for synchronization
7127:M 22 Aug 14:16:51.848 * Slave 127.0.0.1:36379 asks for synchronization
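When a primary's log is long, the same check can be scripted. The sketch below is an illustration only: it matches the "asks for synchronization" wording seen in the Redis 4.0.11 log lines of this article, which may differ in other versions, and the function name is hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# Matches e.g. "* Slave 127.0.0.1:26379 asks for synchronization"
SYNC_REQUEST = re.compile(r"\* Slave (\d+\.\d+\.\d+\.\d+):(\d+) asks for synchronization")


def replicas_from_log(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Return the distinct (ip, port) pairs that asked the primary for a sync."""
    seen: List[Tuple[str, int]] = []
    for line in lines:
        match = SYNC_REQUEST.search(line)
        if match:
            address = (match.group(1), int(match.group(2)))
            if address not in seen:
                seen.append(address)
    return seen


sample = [
    "7127:M 22 Aug 14:16:48.416 * Slave 127.0.0.1:26379 asks for synchronization",
    "7127:M 22 Aug 14:16:51.848 * Slave 127.0.0.1:36379 asks for synchronization",
]
print(replicas_from_log(sample))
```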
  • Node redis-26379
$ cat /var/log/redis/redis-26379.log
7132:C 22 Aug 14:16:48.407 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
7132:C 22 Aug 14:16:48.408 # Redis version=4.0.11, bits=64, commit=00000000, modified=0, pid=7132, just started
7132:C 22 Aug 14:16:48.408 # Configuration loaded
7133:S 22 Aug 14:16:48.410 * Increased maximum number of open files to 10032 (it was originally set to 256).
7133:S 22 Aug 14:16:48.412 * Running mode=standalone, port=26379.
7133:S 22 Aug 14:16:48.413 # Server initialized
7133:S 22 Aug 14:16:48.413 * Ready to accept connections
7133:S 22 Aug 14:16:48.413 * Connecting to MASTER 127.0.0.1:16379
7133:S 22 Aug 14:16:48.413 * MASTER <-> SLAVE sync started
7133:S 22 Aug 14:16:48.414 * Non blocking connect for SYNC fired the event.
7133:S 22 Aug 14:16:48.414 * Master replied to PING, replication can continue...
7133:S 22 Aug 14:16:48.415 * Partial resynchronization not possible (no cached master)
7133:S 22 Aug 14:16:48.417 * Full resync from master: 211d3b4eceaa3af4fe5c77d22adf06e1218e0e7b:0
7133:S 22 Aug 14:16:48.494 * MASTER <-> SLAVE sync: receiving 176 bytes from master
7133:S 22 Aug 14:16:48.495 * MASTER <-> SLAVE sync: Flushing old data
7133:S 22 Aug 14:16:48.496 * MASTER <-> SLAVE sync: Loading DB in memory
7133:S 22 Aug 14:16:48.498 * MASTER <-> SLAVE sync: Finished with success
  • Node redis-36379
$ cat /var/log/redis/redis-36379.log
7136:C 22 Aug 14:16:51.839 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
7136:C 22 Aug 14:16:51.840 # Redis version=4.0.11, bits=64, commit=00000000, modified=0, pid=7136, just started
7136:C 22 Aug 14:16:51.841 # Configuration loaded
7137:S 22 Aug 14:16:51.843 * Increased maximum number of open files to 10032 (it was originally set to 256).
7137:S 22 Aug 14:16:51.845 * Running mode=standalone, port=36379.
7137:S 22 Aug 14:16:51.845 # Server initialized
7137:S 22 Aug 14:16:51.846 * Ready to accept connections
7137:S 22 Aug 14:16:51.846 * Connecting to MASTER 127.0.0.1:16379
7137:S 22 Aug 14:16:51.847 * MASTER <-> SLAVE sync started
7137:S 22 Aug 14:16:51.847 * Non blocking connect for SYNC fired the event.
7137:S 22 Aug 14:16:51.847 * Master replied to PING, replication can continue...
7137:S 22 Aug 14:16:51.848 * Partial resynchronization not possible (no cached master)
7137:S 22 Aug 14:16:51.850 * Full resync from master: 211d3b4eceaa3af4fe5c77d22adf06e1218e0e7b:14
7137:S 22 Aug 14:16:51.923 * MASTER <-> SLAVE sync: receiving 176 bytes from master
7137:S 22 Aug 14:16:51.923 * MASTER <-> SLAVE sync: Flushing old data
7137:S 22 Aug 14:16:51.924 * MASTER <-> SLAVE sync: Loading DB in memory
7137:S 22 Aug 14:16:51.927 * MASTER <-> SLAVE sync: Finished with success

5.4.3. Configuration management of Sentinel

Copy three sentinel.conf files to the /usr/local/redis-sentinel directory. The three configuration files are the startup configurations of the three Sentinel nodes that monitor the master, slave1, and slave2 Redis nodes.

$ sudo cp /usr/local/redis-4.0.11/sentinel.conf /usr/local/redis-sentinel/sentinel-16380.conf
$ sudo cp /usr/local/redis-4.0.11/sentinel.conf /usr/local/redis-sentinel/sentinel-26380.conf
$ sudo cp /usr/local/redis-4.0.11/sentinel.conf /usr/local/redis-sentinel/sentinel-36380.conf

  • Node 1: sentinel-16380.conf
protected-mode no
bind 0.0.0.0
port 16380
daemonize yes
sentinel monitor master 127.0.0.1 16379 2
sentinel down-after-milliseconds master 5000
sentinel failover-timeout master 180000
sentinel parallel-syncs master 1
sentinel auth-pass master 123456
logfile /var/log/redis/sentinel-16380.log

  • Node 2: sentinel-26380.conf
protected-mode no
bind 0.0.0.0
port 26380
daemonize yes
sentinel monitor master 127.0.0.1 16379 2
sentinel down-after-milliseconds master 5000
sentinel failover-timeout master 180000
sentinel parallel-syncs master 1
sentinel auth-pass master 123456
logfile /var/log/redis/sentinel-26380.log

  • Node 3: sentinel-36380.conf
protected-mode no
bind 0.0.0.0
port 36380
daemonize yes
sentinel monitor master 127.0.0.1 16379 2
sentinel down-after-milliseconds master 5000
sentinel failover-timeout master 180000
sentinel parallel-syncs master 1
sentinel auth-pass master 123456
logfile /var/log/redis/sentinel-36380.log
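The three Sentinel files differ only in the port and the log file name, so they can also be generated from one template. The following is a sketch, not part of the original setup: the template reuses the values from this article, while the function name and its defaults are hypothetical.

```python
SENTINEL_TEMPLATE = """\
protected-mode no
bind 0.0.0.0
port {port}
daemonize yes
sentinel monitor {name} {master_ip} {master_port} {quorum}
sentinel down-after-milliseconds {name} 5000
sentinel failover-timeout {name} 180000
sentinel parallel-syncs {name} 1
sentinel auth-pass {name} {password}
logfile /var/log/redis/sentinel-{port}.log
"""


def render_sentinel_conf(port: int, name: str = "master", master_ip: str = "127.0.0.1",
                         master_port: int = 16379, quorum: int = 2,
                         password: str = "123456") -> str:
    """Render the text of one sentinel-<port>.conf file."""
    return SENTINEL_TEMPLATE.format(port=port, name=name, master_ip=master_ip,
                                    master_port=master_port, quorum=quorum,
                                    password=password)


for sentinel_port in (16380, 26380, 36380):
    print(f"--- sentinel-{sentinel_port}.conf ---")
    print(render_sentinel_conf(sentinel_port))
```

Note that sentinel monitor must come before the other sentinel lines for the same master name, which the template preserves.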

5.4.4. Sentinel startup verification

Start the three Sentinel nodes 16380, 26380, and 36380 in sequence. The start commands and the startup logs are as follows:

$ sudo redis-sentinel /usr/local/redis-sentinel/sentinel-16380.conf
$ sudo redis-sentinel /usr/local/redis-sentinel/sentinel-26380.conf
$ sudo redis-sentinel /usr/local/redis-sentinel/sentinel-36380.conf

View the running Sentinel processes:

$ ps -ef | grep redis-sentinel
    0  7954     1   0  3:30PM ??  0:00.05 redis-sentinel 0.0.0.0:16380 [sentinel]
    0  7957     1   0  3:30PM ??  0:00.05 redis-sentinel 0.0.0.0:26380 [sentinel]
    0  7960     1   0  3:30PM ??  0:00.04 redis-sentinel 0.0.0.0:36380 [sentinel]

To view the startup logs of Sentinel:

  • Node sentinel-16380
$ cat /var/log/redis/sentinel-16380.log
7953:X 22 Aug 15:30:27.245 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
7953:X 22 Aug 15:30:27.245 # Redis version=4.0.11, bits=64, commit=00000000, modified=0, pid=7953, just started
7953:X 22 Aug 15:30:27.245 # Configuration loaded
7954:X 22 Aug 15:30:27.247 * Increased maximum number of open files to 10032 (it was originally set to 256).
7954:X 22 Aug 15:30:27.249 * Running mode=sentinel, port=16380.
7954:X 22 Aug 15:30:27.250 # Sentinel ID is 69d05b86a82102a8919231fd3c2d1f21ce86e000
7954:X 22 Aug 15:30:27.250 # +monitor master master 127.0.0.1 16379 quorum 2
7954:X 22 Aug 15:30:32.286 # +sdown sentinel fd166dc66425dc1d9e2670e1f17cb94fe05f5fc7 127.0.0.1 36380 @ master 127.0.0.1 16379
7954:X 22 Aug 15:30:34.588 # -sdown sentinel fd166dc66425dc1d9e2670e1f17cb94fe05f5fc7 127.0.0.1 36380 @ master 127.0.0.1 16379

The Sentinel ID of the sentinel-16380 node is 69d05b86a82102a8919231fd3c2d1f21ce86e000, and the node has joined the Sentinel cluster under this ID.

  • Node sentinel-26380
$ cat /var/log/redis/sentinel-26380.log
7956:X 22 Aug 15:30:30.900 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
7956:X 22 Aug 15:30:30.901 # Redis version=4.0.11, bits=64, commit=00000000, modified=0, pid=7956, just started
7956:X 22 Aug 15:30:30.901 # Configuration loaded
7957:X 22 Aug 15:30:30.904 * Increased maximum number of open files to 10032 (it was originally set to 256).
7957:X 22 Aug 15:30:30.905 * Running mode=sentinel, port=26380.
7957:X 22 Aug 15:30:30.906 # Sentinel ID is 21e30244cda6a3d3f55200bcd904d0877574e506
7957:X 22 Aug 15:30:30.906 # +monitor master master 127.0.0.1 16379 quorum 2
7957:X 22 Aug 15:30:30.907 * +slave slave 127.0.0.1:26379 127.0.0.1 26379 @ master 127.0.0.1 16379
7957:X 22 Aug 15:30:30.911 * +slave slave 127.0.0.1:36379 127.0.0.1 36379 @ master 127.0.0.1 16379
7957:X 22 Aug 15:30:36.311 * +sentinel sentinel fd166dc66425dc1d9e2670e1f17cb94fe05f5fc7 127.0.0.1 36380 @ master 127.0.0.1 16379

The Sentinel ID of the sentinel-26380 node is 21e30244cda6a3d3f55200bcd904d0877574e506, and the node has joined the Sentinel cluster under this ID. At this point the Sentinel cluster contains the sentinel-16380 and sentinel-26380 nodes.

  • Node sentinel-36380
$ cat /var/log/redis/sentinel-36380.log
7959:X 22 Aug 15:30:34.273 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
7959:X 22 Aug 15:30:34.274 # Redis version=4.0.11, bits=64, commit=00000000, modified=0, pid=7959, just started
7959:X 22 Aug 15:30:34.274 # Configuration loaded
7960:X 22 Aug 15:30:34.276 * Increased maximum number of open files to 10032 (it was originally set to 256).
7960:X 22 Aug 15:30:34.277 * Running mode=sentinel, port=36380.
7960:X 22 Aug 15:30:34.278 # Sentinel ID is fd166dc66425dc1d9e2670e1f17cb94fe05f5fc7
7960:X 22 Aug 15:30:34.278 # +monitor master master 127.0.0.1 16379 quorum 2
7960:X 22 Aug 15:30:34.279 * +slave slave 127.0.0.1:26379 127.0.0.1 26379 @ master 127.0.0.1 16379
7960:X 22 Aug 15:30:34.283 * +slave slave 127.0.0.1:36379 127.0.0.1 36379 @ master 127.0.0.1 16379
7960:X 22 Aug 15:30:34.993 * +sentinel sentinel 21e30244cda6a3d3f55200bcd904d0877574e506 127.0.0.1 26380 @ master 127.0.0.1 16379

The Sentinel ID of the sentinel-36380 node is fd166dc66425dc1d9e2670e1f17cb94fe05f5fc7, and the node has joined the Sentinel cluster under this ID. At this point the Sentinel cluster contains the sentinel-16380, sentinel-26380, and sentinel-36380 nodes.

5.4.5. Sentinel configuration refresh

  • Node 1: sentinel-16380.conf

The following configuration items are appended to the sentinel-16380.conf file:

# Generated by CONFIG REWRITE
dir "/usr/local/redis-sentinel"
sentinel config-epoch master 0
sentinel leader-epoch master 0
sentinel known-slave master 127.0.0.1 36379
sentinel known-slave master 127.0.0.1 26379
sentinel known-sentinel master 127.0.0.1 26380 21e30244cda6a3d3f55200bcd904d0877574e506
sentinel known-sentinel master 127.0.0.1 36380 fd166dc66425dc1d9e2670e1f17cb94fe05f5fc7
sentinel current-epoch 0

Note that when sentinel-16380.conf is refreshed, all secondary nodes associated with the Redis primary node (redis-26379 and redis-36379) are written to it, together with the IP address, port number, and Sentinel ID of the other two Sentinel nodes, sentinel-26380 and sentinel-36380.

  • Node 2: sentinel-26380.conf

# Generated by CONFIG REWRITE
dir "/usr/local/redis-sentinel"
sentinel config-epoch master 0
sentinel leader-epoch master 0
sentinel known-slave master 127.0.0.1 26379
sentinel known-slave master 127.0.0.1 36379
sentinel known-sentinel master 127.0.0.1 36380 fd166dc66425dc1d9e2670e1f17cb94fe05f5fc7
sentinel known-sentinel master 127.0.0.1 16380 69d05b86a82102a8919231fd3c2d1f21ce86e000
sentinel current-epoch 0

Note that when sentinel-26380.conf is refreshed, all secondary nodes associated with the Redis primary node (redis-26379 and redis-36379) are written to it, together with the IP address, port number, and Sentinel ID of the other two Sentinel nodes, sentinel-36380 and sentinel-16380.

  • Node 3: sentinel-36380.conf

# Generated by CONFIG REWRITE
dir "/usr/local/redis-sentinel"
sentinel config-epoch master 0
sentinel leader-epoch master 0
sentinel known-slave master 127.0.0.1 36379
sentinel known-slave master 127.0.0.1 26379
sentinel known-sentinel master 127.0.0.1 16380 69d05b86a82102a8919231fd3c2d1f21ce86e000
sentinel known-sentinel master 127.0.0.1 26380 21e30244cda6a3d3f55200bcd904d0877574e506
sentinel current-epoch 0

Note that when sentinel-36380.conf is refreshed, all secondary nodes associated with the Redis primary node (redis-26379 and redis-36379) are written to it, together with the IP address, port number, and Sentinel ID of the other two Sentinel nodes, sentinel-16380 and sentinel-26380.
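To check what a Sentinel has discovered without reading the file by eye, the rewritten section can be parsed. This is a minimal sketch that assumes the known-slave and known-sentinel line layout shown in this article (Redis 4.0.11); the function name is hypothetical.

```python
from typing import List, Tuple


def parse_known_nodes(conf_text: str) -> Tuple[List[Tuple[str, int]], List[Tuple[str, int, str]]]:
    """Extract (ip, port) of known slaves and (ip, port, sentinel_id) of known Sentinels."""
    slaves: List[Tuple[str, int]] = []
    sentinels: List[Tuple[str, int, str]] = []
    for line in conf_text.splitlines():
        parts = line.split()
        # Layout: sentinel known-slave <master-name> <ip> <port>
        if parts[:2] == ["sentinel", "known-slave"] and len(parts) == 5:
            slaves.append((parts[3], int(parts[4])))
        # Layout: sentinel known-sentinel <master-name> <ip> <port> <sentinel-id>
        elif parts[:2] == ["sentinel", "known-sentinel"] and len(parts) == 6:
            sentinels.append((parts[3], int(parts[4]), parts[5]))
    return slaves, sentinels


sample_conf = """\
sentinel known-slave master 127.0.0.1 36379
sentinel known-slave master 127.0.0.1 26379
sentinel known-sentinel master 127.0.0.1 26380 21e30244cda6a3d3f55200bcd904d0877574e506
sentinel known-sentinel master 127.0.0.1 36380 fd166dc66425dc1d9e2670e1f17cb94fe05f5fc7
sentinel current-epoch 0
"""
known_slaves, known_sentinels = parse_known_nodes(sample_conf)
print(known_slaves)
print(known_sentinels)
```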

5.5. Sentinel client commands

  • Checks the status of another Sentinel node; a PONG reply means the node is healthy.
 PING sentinel

  • Displays all monitored master nodes and their status.
 SENTINEL masters

  • Displays the information and status of the specified primary node.
 SENTINEL master master_name

  • Displays all the slave nodes of the specified master node and their state.
 SENTINEL slaves master_name

  • Returns the IP address and port of the specified primary node. If a failover is in progress or has completed, the IP address and port of the secondary node promoted to primary are returned.
 SENTINEL get-master-addr-by-name master_name

  • Resets the status of all primary nodes whose names match the pattern, clearing their previous state and their lists of secondary nodes.
 SENTINEL reset pattern

  • Forces the current Sentinel node to perform a failover without asking the other Sentinel nodes for agreement. After the failover, the latest configuration is sent to the other Sentinel nodes.
 SENTINEL failover master_name
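Clients send these commands over the normal Redis wire protocol (RESP), in which every command is an array of bulk strings. The sketch below shows how such a request is encoded and how it could be sent with a plain socket; it is an illustration under assumptions, with hypothetical function names, and it returns the raw reply bytes rather than parsing them. Production code would normally use a client library with Sentinel support instead.

```python
import socket


def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def query_sentinel(host: str, port: int, *args: str, timeout: float = 1.0) -> bytes:
    """Send one command to a Sentinel and return the raw, unparsed reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(encode_command(*args))
        return conn.recv(4096)


print(encode_command("SENTINEL", "get-master-addr-by-name", "master"))
# Against the setup in this article (requires a running Sentinel):
#     query_sentinel("127.0.0.1", 16380, "SENTINEL", "get-master-addr-by-name", "master")
```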

6. Redis Sentinel failover and recovery

6.1. Tracing the Redis CLI client

The above log shows that redis-16379 is the primary node and its process ID is 7127. To simulate a primary Redis node failure, force the process to be killed.

$ kill -9 7127

Use redis-cli to connect to the sentinel-16380 node and check the status of the Redis nodes.

$ redis-cli -p 16380

  • View the master node information of the Redis master-slave cluster. You can see that redis-26379 has been promoted to the new master node.
127.0.0.1:16380> SENTINEL master master
 1) "name"
 2) "master"
 3) "ip"
 4) "127.0.0.1"
 5) "port"
 6) "26379"
 7) "runid"
 8) "b8ca3b468a95d1be5efe1f50c50636cafe48c59f"
 9) "flags"
10) "master"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "588"
19) "last-ping-reply"
20) "588"
21) "down-after-milliseconds"
22) "5000"
23) "info-refresh"
24) "9913"
25) "role-reported"
26) "master"
27) "role-reported-time"
28) "663171"
29) "config-epoch"
30) "1"
31) "num-slaves"
32) "2"
33) "num-other-sentinels"
34) "2"
35) "quorum"
36) "2"
37) "failover-timeout"
38) "180000"
39) "parallel-syncs"
40) "1"
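The reply above is a flat list that alternates field names and values. When processing it in a program, it is convenient to fold it into a dictionary first. A minimal sketch (the function name is hypothetical, and the sample holds only a few of the fields shown above):

```python
from typing import Dict, List


def reply_to_dict(flat_reply: List[str]) -> Dict[str, str]:
    """Turn [name1, value1, name2, value2, ...] into {name1: value1, ...}."""
    if len(flat_reply) % 2 != 0:
        raise ValueError("expected an even number of items (field/value pairs)")
    return dict(zip(flat_reply[0::2], flat_reply[1::2]))


sample_reply = ["name", "master", "ip", "127.0.0.1", "port", "26379",
                "flags", "master", "num-slaves", "2", "quorum", "2"]
info = reply_to_dict(sample_reply)
print(info["ip"], info["port"], info["flags"])
```

All values arrive as strings, so numeric fields such as port or quorum need an explicit int() conversion before comparison.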

6.2. Redis Sentinel log tracking

Check the logs of any Sentinel node as follows:

7954: X 22 Aug 18:40:22. 504# +tilt #tilt mode entered7954: X 22 Aug 18:40:32. 197# +tilt #tilt mode entered7954: X 22 Aug 18:41:02. 241# -tilt #tilt mode exited7954: X 22 Aug 18:48:24. 550# +sdown master 127.0.0.1 163797954: X 22 Aug 18:48:24. 647# +new-epoch 17954: X 22 Aug 18:48:24. 651# +vote-for-leader fd166dc66425dc1d9e2670e1f17cb94fe05f5fc7 17954: X 22 Aug 18:48:25. 678# +odown master master 127.0.0.1 16379 # Quorum 3/27954: X 22 Aug 18:48:25. 678# Next failover delay: I will not start a failover before Wed Aug 22 18:54:24 20187954: X 22 Aug 18:48:25. 709# + config - update - the from sentinel fd166dc66425dc1d9e2670e1f17cb94fe05f5fc7 127.0.0.1 36380 @ master 127.0.0.1 163797954: X 22 Aug 18:48:25. 710# +switch-master master 127.0.0.1 16379 127.0.0.1 263797954:x 22 Aug 18:48:25.710 * +slave slave 127.0.0.1:36379 127.0.0.1 [email protected] 127.0.0.1 26379 7954:X 22 Aug 18:48:25.711 * +slave slave 127.0.0.1:16379 127.0.0.1 [email protected] 127.0.0.1 26379 7954:X 22 Aug 18:48:30.738# +sdown slave 127.0.0.1:16379 127.0.0.1 16379 @master 127.0.0.1 263797954: X 22 Aug 19:38:23. 479# -sdown slave 127.0.0.1:16379 127.0.0.1 16379 @master 127.0.0.1 26379
  • Analyzing the logs, you can see that the redis-16379 node first enters the sdown (subjective offline) state.
+sdown master master 127.0.0.1 16379
  • Sentinel detects that redis-16379 has failed and enters a new epoch, moving from 0 to 1.
+new-epoch 1
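  • The three Sentinel nodes begin negotiating the state of the master node to determine whether it needs to be marked objective offline. The log entry records this Sentinel's vote for the Sentinel that will lead the failover in epoch 1.
+vote-for-leader fd166dc66425dc1d9e2670e1f17cb94fe05f5fc7 1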
  • At least quorum Sentinel nodes consider the master node to have failed (here 3 against a quorum of 2), so the redis-16379 node enters the odown (objective offline) state.
+odown master master 127.0.0.1 16379 #quorum 3/2
  • Sentinel performs automatic failover and, through negotiation, selects the redis-26379 node as the new master node.
+switch-master master 127.0.0.1 16379 127.0.0.1 26379
  • The redis-36379 node and the already objective offline redis-16379 node become slave nodes of redis-26379.
7954:X 22 Aug 18:48:25.710 * +slave slave 127.0.0.1:36379 127.0.0.1 36379 @ master 127.0.0.1 26379
7954:X 22 Aug 18:48:25.711 * +slave slave 127.0.0.1:16379 127.0.0.1 16379 @ master 127.0.0.1 26379

6.3. Redis configuration file

View the configuration files of the three Redis nodes. When a master/slave switchover occurs, the redis.conf files are rewritten automatically.

  • Node redis-16379
daemonize yes
pidfile "/var/run/redis-16379.pid"
logfile "/var/log/redis/redis-16379.log"
port 16379
bind 0.0.0.0
timeout 300
databases 16
dbfilename "dump-16379.db"
dir "/usr/local/redis-sentinel/redis-workdir"
masterauth "123456"
requirepass "123456"
  • Node redis-26379
daemonize yes
pidfile "/var/run/redis-26379.pid"
logfile "/var/log/redis/redis-26379.log"
port 26379
bind 0.0.0.0
timeout 300
databases 16
dbfilename "dump-26379.db"
dir "/usr/local/redis-sentinel/redis-workdir"
masterauth "123456"
requirepass "123456"
  • Node redis-36379
daemonize yes
pidfile "/var/run/redis-36379.pid"
logfile "/var/log/redis/redis-36379.log"
port 36379
bind 0.0.0.0
timeout 300
databases 16
dbfilename "dump-36379.db"
dir "/usr/local/redis-sentinel/redis-workdir"
masterauth "123456"
requirepass "123456"
slaveof 127.0.0.1 26379

Analysis: The slaveof configuration on the redis-26379 node has been removed, and the node has been promoted to the master node. The redis-16379 node is down. The slaveof configuration of redis-36379 has been updated to 127.0.0.1 26379, making it a slave node of redis-26379.

Restart the redis-16379 node. Once it has started normally, check its redis.conf file again. The configuration is now as follows:

daemonize yes
pidfile "/var/run/redis-16379.pid"
logfile "/var/log/redis/redis-16379.log"
port 16379
bind 0.0.0.0
timeout 300
databases 16
dbfilename "dump-16379.db"
dir "/usr/local/redis-sentinel/redis-workdir"
masterauth "123456"
requirepass "123456"
# Generated by CONFIG REWRITE
slaveof 127.0.0.1 26379

A new slaveof directive has been added to the redis-16379 configuration file. It points to redis-26379, making redis-16379 a slave of the new master node.
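To check which role a node's configuration file currently describes, it is enough to look for a slaveof line. The sketch below is one possible way to do that, not an official tool: `replication_target` is a hypothetical helper, and the two sample strings are abbreviated copies of the configurations shown above. It also accepts `replicaof`, the name newer Redis versions use for the same directive.

```python
def replication_target(conf_text):
    """Return (host, port) from the last slaveof/replicaof directive, or None."""
    target = None
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        if parts[0].lower() in ("slaveof", "replicaof") and len(parts) == 3:
            target = (parts[1], int(parts[2]))
    return target


# Abbreviated copies of the configurations shown above.
master_conf = """\
port 26379
masterauth "123456"
requirepass "123456"
"""

slave_conf = """\
port 16379
masterauth "123456"
requirepass "123456"
# Generated by CONFIG REWRITE
slaveof 127.0.0.1 26379
"""

print(replication_target(master_conf))
print(replication_target(slave_conf))
```

A result of `None` means the file contains no replication directive, which is what the promoted redis-26379 node's file looks like after the failover.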

summary

This article first described the several ways Redis achieves high availability and pointed out the shortcomings of Redis master-slave replication. It then introduced the concepts behind Redis Sentinel mode and explained the functions and basic principles of Redis Sentinel in depth. Finally, it walked through a high-availability setup and verified automatic failover.

Of course, Redis Sentinel only solves the problem of high availability. Redis Cluster mode is still needed to address problems such as single-point writes on the master node and the inability to expand capacity beyond a single node.

reference

Redis Development and Operations

