Statement

This article was first published on my public account: "a flower is not romantic". Please indicate the source when reprinting!

Interested readers can follow the personal public account: a flower is not romantic

Preface

Redis' Sentinel system is used to manage multiple Redis servers (instances). The system performs the following three tasks:

  • Monitoring: Sentinel continuously checks whether your primary and secondary servers are functioning properly.
  • Notification: Sentinel can send notifications to administrators or other applications via the API when a monitored Redis server has a problem.
  • Automatic failover: When a primary server fails, Sentinel starts an automatic failover operation. It promotes one of the failed primary's secondary servers to be the new primary, and makes the other secondary servers of the failed primary replicate from the new one. When a client tries to connect to the failed primary server, the cluster also returns the address of the new primary server to the client, allowing the cluster to use the new primary in place of the failed server.

Redis Sentinel is a distributed system: you can run multiple Sentinel processes in a single architecture. These processes use gossip protocols to exchange information about whether the master server is offline, and agreement protocols to decide whether to perform an automatic failover and which slave to select as the new master.
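To run a Sentinel, you start a Redis server in Sentinel mode; the two invocations below are equivalent (the configuration file path is illustrative):

redis-sentinel /path/to/sentinel.conf
# or, equivalently:
redis-server /path/to/sentinel.conf --sentinel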

What is high availability?

Architecturally, when we talk about high availability, we generally mean 99%, 99.9%, 99.99% uptime, and so on.

For example, 99.99% availability over 365 days means roughly 364.96 days of service provided externally, which translates into at most 3,153.6 s of downtime per year, i.e. 52.56 min, or about 0.876 h.

In general, system unavailability falls into the following categories:

  1. The machine is down
  2. JVM process OOM
  3. Machine 100% CPU
  4. The disk is full, and the system reports various I/O errors
  5. other

For example, suppose Redis uses a master-slave architecture, with a single master and multiple slaves. What should we do if the master goes down? If we use Sentinel mode, a slave will automatically be promoted to master and the system continues to provide services.
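For reference, a slave is pointed at its master with a single line of configuration (the address below is illustrative):

# redis.conf on each slave: replicate from the master at this address
slaveof 192.168.1.1 6379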

Sentinel mode

Sentinel (sentry) is a very important component of a Redis cluster. Its main functions are as follows:

  1. Cluster monitoring: monitors whether the Redis master and slave processes are working properly
  2. Message notification: if a Redis instance fails, the sentinel is responsible for sending an alarm notification to the administrator
  3. Failover: if the master node fails, the master role is automatically transferred to a slave node
  4. Configuration center: if a failover occurs, notifies clients of the new master address

The sentinels themselves are also distributed, operating as a cluster of sentinels, working cooperatively with each other

  1. During failover, determining that a master node is down requires the agreement of a majority of the sentinels, which involves a distributed election
  2. Even if some sentinels fail, the sentinel cluster still works

Sentinel configuration

The Redis source code includes a file called sentinel.conf, which is an example sentinel configuration file with detailed comments. The minimum configuration required to run a Sentinel is as follows:

sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1

sentinel monitor resque 192.168.1.3 6380 4
sentinel down-after-milliseconds resque 10000
sentinel failover-timeout resque 180000
sentinel parallel-syncs resque 5

The first line instructs Sentinel to monitor a master server named mymaster, with IP address 127.0.0.1 and port 6379, using a quorum of 2: at least two Sentinels must agree that the master has failed (as long as the number of agreeing Sentinels is insufficient, automatic failover will not be performed).

However, no matter how many Sentinels you require to agree that a server has failed, a Sentinel needs the support of a majority of the Sentinels in the system to initiate an automatic failover, and to be granted a unique configuration epoch (a configuration epoch is the version number of the new master server's configuration).

In other words, Sentinel cannot perform automatic failover if only a minority of Sentinel processes are working properly.

Sentinel mode operation mechanism

Consider a three-node deployment: master M1 with sentinel S1 on one machine, and slaves with sentinels S2 and S3 on two other machines. If the machine where M1 is located fails, there are two sentinels left; S2 and S3 can agree that the master is down, and then elect one of themselves to perform the failover.

Since the majority of three sentinels is 2, and two sentinels are still running, failover is allowed.

Here again, consider a two-node sentinel deployment instead: master M1 with sentinel S1 on one machine, and slave R1 with sentinel S2 on another.

When quorum = 1 is configured and the master goes down, a switch can be triggered as long as one of S1 and S2 thinks the master is down, and one of S1 and S2 will be elected to perform the failover. The election itself, however, requires a majority, and the majority of 2 sentinels is 2.

If only the master process M1 dies while sentinel S1 keeps running, failover can proceed normally. But if the entire machine running M1 and S1 fails, only one sentinel (S2) is left, which is less than the majority of 2, so failover is not allowed, even though R1 is still available.

In other words, the number of running sentinel nodes must be greater than or equal to the majority for failover to be possible.

Here are two more concepts:

  1. Quorum: for example, if the sentinel cluster has five nodes and quorum is set to 3, the cluster considers the master down once three nodes think it is down. A single sentinel deciding on its own that the master is down is a subjective outage (sdown); the sentinel cluster as a whole deciding that the master is down is called an objective outage (odown)
  2. Majority: the majority of the sentinels. For example, the majority of 2 sentinels is 2, the majority of 3 is 2, the majority of 4 is 2, and the majority of 5 is 3

Data loss during the active/standby switchover in Sentinel mode

During the active/standby switchover, data may be lost in the following two cases:

  1. Data loss due to asynchronous replication: master -> slave replication is asynchronous, so some data may be lost if the master breaks down before that data has been replicated to the slave

  2. Data loss due to split-brain: split-brain occurs when the machine running a master is suddenly partitioned from the normal network and can no longer communicate with its slaves, while the master process itself is still running. The sentinels may decide that the master is down, initiate an election, and switch one of the slaves to master.

    At this point, there are two masters in the cluster, which is what is called split-brain

    In this case, a slave has been promoted to master, but clients may have continued writing data to the old master before switching over to the new one. That data may be lost

    Therefore, when the original master recovers, it is attached to the new master as a slave, its own data is cleared, and it copies data from the new master again

Sentinel mode data loss solution

Starting with Redis 2.8, to ensure data security, you can configure the primary server to execute write commands only when there are at least N currently connected secondary servers.

However, because Redis uses asynchronous replication, the write data sent by the master server is not necessarily received by the slave server, so there is still the possibility of data loss.

Here’s how this feature works:

  • The secondary server pings the primary server once per second and reports on the processing of the replicated stream.

  • The master server records when each slave server last sent a PING to it.

  • The user can specify, via min-slaves-max-lag, the maximum allowed replication lag in seconds, and via min-slaves-to-write, the minimum number of connected slaves required for write operations.

If there are at least min-slaves-to-write slave servers, and all of them have a lag value of less than min-slaves-max-lag seconds, then the master server will perform the write requested by the client.

You can think of this feature as a relaxed version of the C (consistency) condition of the CAP theorem: while write durability is not guaranteed, at least the window for losing data is limited to a given number of seconds.

On the other hand, if the conditions specified by min-slaves-to-write and min-slaves-max-lag are not met, the write operation will not be performed, and the master server will return an error to the client requesting the write.

Here are the two options for this feature and the parameters they require:

  • min-slaves-to-write <number of slaves>
  • min-slaves-max-lag <number of seconds>

For more information, see the redis.conf sample file that comes with the Redis source code.
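As a concrete sketch, the master-side configuration might look like this (the values are illustrative; note that Redis 5.0 renamed these options to min-replicas-to-write and min-replicas-max-lag, keeping the old names as aliases):

# redis.conf on the master: refuse writes unless at least 1 slave
# is connected and lagging no more than 10 seconds behind
min-slaves-to-write 1
min-slaves-max-lag 10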

The above two configurations can reduce the data loss caused by asynchronous replication and split-brain:

  1. Reducing data loss from asynchronous replication: with min-slaves-max-lag configured, once a slave's replication and ack lag grows too long, the master assumes that too much data would be lost if it went down, and rejects further write requests. This caps the amount of data lost because the master failed before syncing some data to the slaves

  2. Reducing data loss from split-brain: if a master is partitioned away and loses contact with its slaves, the above two configurations ensure that a master which cannot keep replicating to the specified number of slaves, and which has received no ack from any slave for more than 10 seconds, will reject client write requests

In this way, the old master will not accept new data from the client, thus avoiding data loss

With min-slaves-max-lag set to 10, for example, this configuration ensures that a master that has lost its slaves and received no ack from any of them for 10 seconds will reject all new write requests.

So in the split-brain scenario, at most 10 seconds of data is lost.

Analysis of some other mechanisms and principles of Sentinel mode

Sdown and odown conversion mechanisms

Sdown (subjectively down) and odown (objectively down) are two failure states:

  • sdown is a subjective outage: a single sentinel that thinks the master is down declares a subjective outage
  • odown is an objective outage: if a quorum of sentinels think a master is down, then it is an objective outage

The condition for sdown is very simple: if a sentinel's pings to a master go unanswered for more than the number of milliseconds specified by down-after-milliseconds, that sentinel subjectively considers the master down

For example, we can configure:

sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1

If the master does not answer pings for more than 5 s, it is considered sdown
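To observe these states on a running system, you can query a sentinel directly (assuming the default sentinel port 26379):

redis-cli -p 26379 SENTINEL master mymaster

The reply includes a flags field that normally reads master, and changes to master,s_down (and eventually master,o_down) when the master is considered down.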

Automatic discovery mechanism for sentinel cluster

A Sentinel can be connected to several other sentinels, and each Sentinel can check each other’s availability and exchange information.

You don’t need to set the addresses of other Sentinels for each Sentinel you run, because Sentinel can automatically find other Sentinels that are monitoring the same master server through the publish-subscribe function. This is done by sending a message to channel __sentinel__:hello.

Similarly, you don’t have to manually list all the slave servers under the master server, because Sentinel can query the master server to get information about all slave servers.

  • Every two seconds, each Sentinel publishes a message to the __sentinel__:hello channel of every master and slave server it monitors, via the publish/subscribe function. The message contains the Sentinel's IP address, port number, and run ID (runid).

  • Each Sentinel subscribes to the __sentinel__:hello channel of every master and slave server it monitors, looking for Sentinels that have not appeared before (looking for unknown Sentinels). When a Sentinel discovers a new Sentinel, it adds the new Sentinel to its list of all other known Sentinels monitoring the same master server.

  • Sentinel also sends the complete current configuration of the master server. If a Sentinel holds a configuration for a master that is older than the one sent by another Sentinel, it is immediately upgraded to the newer configuration.

  • Before adding a new Sentinel to the list for a monitored master, a Sentinel checks whether the list already contains a Sentinel with the same run ID or the same address (IP address and port number) as the one to be added. If so, the existing entry is removed from the list first, and then the new Sentinel is added.
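You can watch this discovery traffic yourself by subscribing to the channel on any monitored Redis instance (the port is illustrative):

redis-cli -p 6379 SUBSCRIBE __sentinel__:hello

Roughly every two seconds a message from each sentinel should appear, carrying its IP, port, and runid along with the current master configuration.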

Automatic correction of slave configurations

The sentinels are responsible for automatically correcting some of the slaves' configuration. For example, if a slave is a potential master candidate, the sentinels ensure that it is replicating data from the current master; if slaves are connected to the wrong master, such as after a failover, the sentinels ensure that they are reconnected to the correct one

Slave -> Master Election algorithm

If a master is considered odown, and a majority of the sentinels authorize a master/slave switchover, one sentinel will be elected to perform the switchover, and it must then elect a slave to become the new master

Some information about the slave is considered:

  • Duration of disconnection from master
  • Slave priority
  • Replication offset
  • run id

If a slave has been disconnected from the master for more than ten times down-after-milliseconds, plus the length of time the master has been in the sdown state, the slave is considered unfit to be elected master:

(down-after-milliseconds * 10) + milliseconds_since_master_is_in_SDOWN_state

The slaves are then sorted:

  • Sort by slave priority: the lower the slave-priority value, the higher the priority
  • If the slave priorities are the same, compare the replication offsets to see which slave has replicated more data: the greater the offset, the higher the priority
  • If both of the above are the same, select the slave with the smaller run ID
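The priority used in the first rule comes from each slave's own configuration. A minimal sketch, with the default value shown (in Redis 5.0+ the option is named replica-priority):

# redis.conf on a slave: a lower value makes this slave more likely
# to be chosen in elections; 0 means it will never be promoted
slave-priority 100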

The quorum and majority

Each time the sentinels perform a master/standby switchover, a quorum of sentinels must first consider the master odown, and then a sentinel is elected to perform the switchover; that sentinel must also be authorized by a majority of the sentinels.

If quorum < majority (for example, five sentinels, where the majority is 3 and quorum is set to 2), then the authorization of three sentinels, i.e. the majority, is enough to perform the switch.

However, if quorum >= majority, then at least quorum sentinels must authorize the switch. For example, with five sentinels and a quorum of 5, all five sentinels must agree before the switch can take place.

configuration epoch

The sentinels monitor a set of Redis master + slaves and hold the corresponding monitoring configuration.

The sentinel performing the switchover obtains a configuration epoch for the new master (slave -> master) it is switching to. This is a version number, and it must be unique for each switchover.

If the first elected sentinel fails to complete the switchover, the other sentinels wait for failover-timeout and then take over the switchover, obtaining a new configuration epoch as the new version number.

Configuration spread

After the switchover is complete, the sentinel updates the master configuration locally and synchronizes it to the other sentinels via pub/sub messaging.

The version number from the previous section matters here: because configurations are published and listened to through a channel, when a sentinel completes a new switchover, the new master configuration carries the new version number.

The other sentinels update their own master configuration according to which version number is larger.
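This propagated configuration is also how clients find the current master: they ask any sentinel instead of hard-coding an address. For example (the reply below assumes the mymaster setup from earlier; after a failover it changes to the new master's address):

redis-cli -p 26379 SENTINEL get-master-addr-by-name mymaster
1) "127.0.0.1"
2) "6379"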

failover

A failover operation consists of the following steps:

  • The primary server was found to be offline.
  • Increment the current epoch (see Raft leader election for details) and try to get elected in that epoch.
  • If the election fails, try again after twice the set failover timeout. If elected, perform the following steps.
  • Select a slave server and upgrade it to the master server.
  • Send the SLAVEOF NO ONE command to the selected slave server to convert it into the primary server.
  • The updated configuration is propagated to all other Sentinels, who update their own configurations, through the publish-subscribe function.
  • Send the SLAVEOF host port command to the other slaves of the offline master so that they replicate from the new master server.
  • Lead Sentinel terminates the failover operation when all slave servers have started replicating the new master.
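To exercise this sequence without actually killing the master, you can ask a sentinel to force a failover (the SENTINEL FAILOVER subcommand does not require the agreement of the other sentinels):

redis-cli -p 26379 SENTINEL failover mymaster

This triggers the steps above as if the master were unreachable, which makes it a convenient way to observe the election and the configuration propagation.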

Reference: redisdoc.com/topic/senti…