One, discovering the monitored objects

1. Find the master

  • When a sentinel starts, it learns which masters to monitor from its configuration (a single sentinel can monitor multiple masters)
    • Sentinel establishes two connections with each master: a command connection and a subscription connection

2. Find the slaves

  • Every 10s, the sentinel sends the INFO command to each master over the command connection. The reply lists the master's slaves, which is how the sentinel discovers them
    • Sentinel establishes two connections with each slave: a command connection and a subscription connection
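
The slave discovery step above can be sketched as a small parser. This is a minimal illustration, assuming the replication section of the INFO reply lists each slave on a line of the form slaveN:ip=...,port=...,state=...,offset=...,lag=... (the format used by current Redis versions). The function name parse_slaves and the sample text are hypothetical and are not taken from the Sentinel source.

```python
def parse_slaves(info_text):
    """Extract slave entries from the text of an INFO reply (illustrative only)."""
    slaves = []
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Keep only keys such as "slave0", "slave1"; skip e.g. "slave_priority".
        if not sep or not key.startswith("slave") or not key[5:].isdigit():
            continue
        fields = dict(item.split("=", 1) for item in value.split(","))
        slaves.append({
            "ip": fields["ip"],
            "port": int(fields["port"]),
            "state": fields.get("state"),
            "offset": int(fields.get("offset", 0)),
        })
    return slaves


# Hypothetical INFO excerpt used only to exercise the parser.
SAMPLE_INFO = (
    "# Replication\r\n"
    "role:master\r\n"
    "connected_slaves:2\r\n"
    "slave0:ip=127.0.0.1,port=6380,state=online,offset=1500,lag=0\r\n"
    "slave1:ip=127.0.0.1,port=6381,state=online,offset=1480,lag=1\r\n"
)

if __name__ == "__main__":
    for slave in parse_slaves(SAMPLE_INFO):
        print(slave)
```

For every address found this way, the sentinel would then open the command connection and the subscription connection described above.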

3. Find other sentinels

  • Every 2s, the sentinel uses the PUBLISH command to send a message to the __sentinel__:hello channel of each master and slave identified above. Every sentinel subscribed to that channel on the same master or slave receives the message (including the sender itself), and this is how sentinels discover each other. The message carries the IP and port of the sending sentinel and the IP and port of the master being monitored.
    • When a sentinel learns of another sentinel, it establishes a command connection with it (no subscription connection is needed). Because discovery is mutual, the sentinels monitoring the same master end up fully interconnected.
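
To make the discovery message concrete, here is a minimal sketch of how a received publish message could be decoded. It assumes the payload is eight comma-separated fields (sentinel IP, port, run ID and epoch, followed by the master's name, IP, port and config epoch), which is the layout used on the __sentinel__:hello channel in current Redis versions. The names Hello, parse_hello and is_own_message are hypothetical.

```python
from typing import NamedTuple


class Hello(NamedTuple):
    sentinel_ip: str
    sentinel_port: int
    sentinel_runid: str
    current_epoch: int
    master_name: str
    master_ip: str
    master_port: int
    master_config_epoch: int


def parse_hello(payload: str) -> Hello:
    """Decode one hello payload; raises ValueError on an unexpected shape."""
    parts = payload.split(",")
    if len(parts) != 8:
        raise ValueError("expected 8 comma-separated fields, got %d" % len(parts))
    s_ip, s_port, s_runid, s_epoch, m_name, m_ip, m_port, m_epoch = parts
    return Hello(s_ip, int(s_port), s_runid, int(s_epoch),
                 m_name, m_ip, int(m_port), int(m_epoch))


def is_own_message(hello: Hello, my_runid: str) -> bool:
    """A sentinel also receives its own publish messages and should skip them."""
    return hello.sentinel_runid == my_runid
```

A message whose run ID differs from the receiver's own identifies another sentinel, which is the trigger for opening a command connection to it.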

Two, failure detection

1. Subjective offline

The sentinel sends the PING command to every master, slave, and other sentinel once per second. If an object returns no valid reply for a continuous period of down-after-milliseconds, the sentinel marks that object as subjectively offline.

  • Valid replies: +PONG, -LOADING, -MASTERDOWN
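
The subjective offline check can be sketched as a small tracker. This is an illustration only, assuming times are tracked in milliseconds and that any reply starting with one of the three prefixes above counts as valid. PingTracker and its methods are hypothetical names, not part of Sentinel.

```python
VALID_REPLY_PREFIXES = ("+PONG", "-LOADING", "-MASTERDOWN")


def is_valid_reply(reply: str) -> bool:
    """True if the PING reply is one of the replies treated as valid."""
    return reply.upper().startswith(VALID_REPLY_PREFIXES)


class PingTracker:
    """Tracks the last valid PING reply of one monitored object."""

    def __init__(self, down_after_ms: int, now_ms: int):
        self.down_after_ms = down_after_ms
        self.last_valid_reply_ms = now_ms

    def on_reply(self, reply: str, now_ms: int) -> None:
        # Invalid replies are ignored, so they do not reset the timer.
        if is_valid_reply(reply):
            self.last_valid_reply_ms = now_ms

    def is_subjectively_down(self, now_ms: int) -> bool:
        return now_ms - self.last_valid_reply_ms > self.down_after_ms
```

With down_after_ms set to 30000, an object whose last valid reply arrived at t=1000 is still considered online at t=31000 and subjectively offline from t=31001 on.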

2. Objective offline

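When a sentinel considers a master subjectively offline, it asks the other sentinels monitoring that master whether they also see it as offline (using the SENTINEL is-master-down-by-addr command). If the number of sentinels that agree reaches the configured quorum, the master is considered objectively offline. The quorum is set per master in the configuration and does not have to be more than half of the sentinels; a majority is only required later, for the leader election.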
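
The vote count behind this decision can be sketched as follows. The threshold is passed in as a parameter; in Redis it is the quorum given in the sentinel monitor configuration directive. The function name and the boolean representation of the other sentinels' answers are assumptions made for illustration.

```python
from typing import Iterable


def is_objectively_down(own_opinion: bool,
                        peer_opinions: Iterable[bool],
                        quorum: int) -> bool:
    """Decide objective offline from this sentinel's view plus its peers' answers."""
    if not own_opinion:
        # The question is only asked once this sentinel itself sees the object as down.
        return False
    votes = 1 + sum(1 for opinion in peer_opinions if opinion)
    return votes >= quorum
```

For example, with five sentinels and a quorum of 2, one agreeing peer is enough: is_objectively_down(True, [True, False, False, False], 2) is True.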

Three, failover

A master that has been marked objectively offline is failed over by the leader sentinel.

1. The leader election

The leader is elected with a Raft-style algorithm. Raft will be covered in a separate article.

2. Failover

Let’s say the master is down.

  • Select one slave as the new master and send it SLAVEOF no one.
    • Filter out slaves that are subjectively offline
    • Filter out slaves that have not replied to the leader sentinel's INFO command within the last 5s
    • Filter out slaves whose connection to the failed master has been down for more than down-after-milliseconds * 10 milliseconds, so that the remaining slaves hold reasonably fresh data
    • From the remaining slaves, select the one with the highest priority; break ties by the largest replication offset, then by the smallest run ID
  • Reconfigure the other slaves to replicate from the new master
  • Make the old master a slave of the new master when it comes back online
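
The selection of the new master can be sketched as a filter followed by a sort. This is a minimal illustration under stated assumptions: the SlaveInfo fields are hypothetical names, times are in milliseconds, the disconnection limit is taken to be down-after-milliseconds * 10, and priority follows the Redis convention that a smaller number means a higher priority. Special cases such as a priority of 0 (never promote) are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SlaveInfo:
    run_id: str
    priority: int             # smaller number = higher priority (assumed convention)
    repl_offset: int
    subjectively_down: bool
    last_info_reply_ms: int   # time of the last INFO reply seen by the leader
    master_link_down_ms: int  # how long the link to the old master has been down


def select_new_master(slaves: List[SlaveInfo], now_ms: int,
                      down_after_ms: int) -> Optional[SlaveInfo]:
    """Return the slave to promote, or None if no slave qualifies."""
    candidates = [
        s for s in slaves
        if not s.subjectively_down
        and now_ms - s.last_info_reply_ms <= 5000
        and s.master_link_down_ms <= down_after_ms * 10
    ]
    if not candidates:
        return None
    # Highest priority first, then largest offset, then smallest run ID.
    return min(candidates, key=lambda s: (s.priority, -s.repl_offset, s.run_id))
```

The leader would then send SLAVEOF no one to the returned slave and reconfigure the remaining slaves to replicate from it.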

Four, data structures

A sentinel keeps track of the masters it monitors in sentinelState.

Each Redis instance it monitors is represented by a sentinelRedisInstance.

The slaves of a master are stored in a field of that master's sentinelRedisInstance.

What about the other sentinels? They are the sentinels that monitor the same Redis instance, so they are also stored in a field of sentinelRedisInstance. Because slaves are represented by the same structure, a slave's sentinelRedisInstance has this sentinel field as well.
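
The relationship between these structures can be sketched as follows. This is a simplified Python model, not the real C definitions: only the three containers discussed above are shown. The field names masters, slaves and sentinels mirror the dictionaries in the Redis source, while the keys and the remaining fields are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SentinelRedisInstance:
    name: str
    ip: str
    port: int
    # For a master: its slaves, keyed here by "ip:port" (illustrative key choice).
    slaves: Dict[str, "SentinelRedisInstance"] = field(default_factory=dict)
    # Other sentinels monitoring the same instance, keyed here by run ID.
    sentinels: Dict[str, "SentinelRedisInstance"] = field(default_factory=dict)


@dataclass
class SentinelState:
    # All monitored masters, keyed by master name.
    masters: Dict[str, SentinelRedisInstance] = field(default_factory=dict)


def build_example_state() -> SentinelState:
    """Build a tiny example: one master, one slave, one other sentinel."""
    state = SentinelState()
    master = SentinelRedisInstance("mymaster", "127.0.0.1", 6379)
    master.slaves["127.0.0.1:6380"] = SentinelRedisInstance(
        "127.0.0.1:6380", "127.0.0.1", 6380)
    master.sentinels["abc123"] = SentinelRedisInstance(
        "abc123", "127.0.0.1", 26380)
    state.masters[master.name] = master
    return state
```

Using one structure for masters, slaves and sentinels is what makes the slave entries carry a sentinels field too, even though it stays empty in this sketch.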