
Analysis of the Redis master-slave architecture

A single Redis instance can handle roughly ten thousand to several tens of thousands of QPS. A cache is typically used to support high read concurrency.

Therefore, Redis is deployed in a master-slave architecture with one master and multiple slaves. The master handles writes and replicates data to the slave nodes, while the slaves handle reads: all read requests go to the slave nodes. This also makes horizontal scaling easy, which supports high read concurrency.
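
The read/write split described above can be sketched on the client side as follows. This is a minimal illustration, not how any particular Redis client does it: the `ReadWriteRouter` class, the node addresses, and the tiny set of write commands are all assumptions made for the example.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads across the slaves (round-robin)."""

    # Hypothetical, deliberately small set of write commands for the example.
    WRITE_COMMANDS = {"SET", "DEL", "INCR", "EXPIRE"}

    def __init__(self, master, slaves):
        if not slaves:
            raise ValueError("at least one slave is required for read traffic")
        self.master = master
        self._slaves = itertools.cycle(slaves)

    def node_for(self, command):
        if command.upper() in self.WRITE_COMMANDS:
            return self.master
        return next(self._slaves)


router = ReadWriteRouter("master:6379", ["slave1:6379", "slave2:6379"])
targets = [router.node_for(c) for c in ("SET", "GET", "GET", "GET")]
print(targets)
```

Adding another slave to the list increases read capacity without touching the write path, which is the horizontal scaling the text refers to.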


The core mechanism of Redis Replication

  • Redis replicates data to slave nodes asynchronously. Starting with Redis 2.8, however, slave nodes periodically acknowledge to the master how much data they have replicated.

  • A master node can be configured with multiple slave nodes.

  • Slave nodes can also connect to other slave nodes.

  • While replicating, the slave node does not block its own query operations; it keeps serving requests from the old data set. Once replication is complete, however, the old data set must be deleted and the new one loaded, and the slave briefly stops serving requests while it does so.

  • The slave node is used for horizontal capacity expansion and read/write separation. The expanded slave node improves read throughput.

  • If you use a master/slave architecture, enable persistence on the master node. Do not rely on the slave nodes as the master's only hot backup: if persistence is turned off on the master, the master may come back with an empty data set after a crash and restart, and as soon as the slaves replicate from it, their data is lost as well.

  • In addition, the master needs its own backup schemes. If all local files are lost, pick an RDB file from the backups to restore the master, so that it has data when it starts. Even with the high availability mechanism described later, in which a slave node can automatically take over from the master, the master may restart automatically before Sentinel detects its failure, and that can still wipe the data on all of its slave nodes.


The core principles of Redis master-slave replication

When a slave node starts, it sends a PSYNC command to the master node. If this is the first time the slave has connected to the master, a full resynchronization (full replication) is triggered. The master forks a background child process that generates an RDB snapshot file, and meanwhile buffers in memory all new write commands received from clients. Once the RDB file has been generated, the master sends it to the slave. The slave first writes the file to local disk and then loads it from disk into memory. The master then sends the buffered write commands to the slave, which applies them to catch up. If the slave is disconnected from the master because of a network fault, it reconnects to the master automatically, and after reconnecting the master copies only the missing data to the slave.
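
To make the ordering of the full resynchronization concrete, here is a small in-memory simulation. It is only a sketch: the `Master` and `Slave` classes are invented for the example, a plain dictionary copy stands in for the RDB file, and no real network or disk I/O takes place.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.snapshot_in_progress = False
        self.pending_writes = []   # writes received while the snapshot is in flight

    def write(self, key, value):
        self.data[key] = value
        if self.snapshot_in_progress:
            self.pending_writes.append((key, value))

    def begin_snapshot(self):
        # Stand-in for the RDB file: a point-in-time copy of the data set.
        self.snapshot_in_progress = True
        self.pending_writes = []
        return dict(self.data)

    def end_snapshot(self):
        self.snapshot_in_progress = False
        buffered, self.pending_writes = self.pending_writes, []
        return buffered


class Slave:
    def __init__(self):
        self.data = {"stale": "old"}

    def full_resync(self, snapshot, buffered_writes):
        self.data = dict(snapshot)          # drop the old data set, load the snapshot
        for key, value in buffered_writes:  # then replay the writes buffered meanwhile
            self.data[key] = value


master, slave = Master(), Slave()
master.write("a", 1)
rdb = master.begin_snapshot()
master.write("b", 2)                        # arrives while the snapshot is being sent
slave.full_resync(rdb, master.end_snapshot())
print(slave.data == master.data)
```

The write to `b` is not in the snapshot, so the slave only converges with the master because the buffered commands are replayed after the snapshot is loaded.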


Resumable master/slave replication

Since Redis 2.8, master/slave replication can resume after an interruption. If the network connection drops during replication, replication continues from the point it had reached rather than starting from scratch.

The master node maintains a backlog in memory. Both master and slave keep a replication offset and the master's run ID, and the offsets index into the backlog. If the network connection between master and slave drops, the slave asks the master to continue replication from its last replication offset. If the master can no longer find that offset in the backlog, a full resynchronization is performed.

Identifying the master node by host and IP address alone is unreliable. If the master restarts or its data set changes, the slave should tell the old and new instances apart by their run IDs, which differ.
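
The interplay of backlog, offset, and run ID can be sketched as below. This is a simplified model rather than Redis's actual data structure: the `ReplicationBacklog` class, the 16-byte capacity, the run ID strings, and the `;`-separated command format are assumptions chosen to keep the example readable.

```python
class ReplicationBacklog:
    """Fixed-size buffer of the most recent replicated bytes, addressed by offset."""

    def __init__(self, run_id, capacity):
        self.run_id = run_id
        self.capacity = capacity
        self.buffer = b""
        self.master_offset = 0          # total bytes ever written to the stream

    def feed(self, payload):
        self.master_offset += len(payload)
        self.buffer = (self.buffer + payload)[-self.capacity:]

    def psync(self, slave_run_id, slave_offset):
        """Return ("CONTINUE", missing_bytes) or ("FULLRESYNC", None)."""
        oldest = self.master_offset - len(self.buffer)
        same_master = slave_run_id == self.run_id
        if same_master and oldest <= slave_offset <= self.master_offset:
            return "CONTINUE", self.buffer[slave_offset - oldest:]
        return "FULLRESYNC", None


backlog = ReplicationBacklog(run_id="abc123", capacity=16)
backlog.feed(b"SET a 1;")      # stream offsets 0..8
backlog.feed(b"SET b 2;")      # stream offsets 8..16
backlog.feed(b"SET c 3;")      # stream offsets 16..24, the first command is overwritten

print(backlog.psync("abc123", 16))   # offset still in the backlog
print(backlog.psync("abc123", 4))    # offset already overwritten
print(backlog.psync("zzz999", 16))   # slave remembers a different master run ID
```

Only the first request can be served from the backlog. The other two fall back to a full resynchronization, either because the requested offset is gone or because the run ID no longer matches.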


Diskless replication

The master creates the RDB in memory and sends it straight to the slave instead of first writing it to its own disk. To enable this, set repl-diskless-sync yes in the configuration file.

    repl-diskless-sync yes
    # Wait 5 s before starting replication, so that more slaves have time to reconnect
    repl-diskless-sync-delay 5

Handling expired Keys

The slave does not expire keys itself; it waits for the master to expire them. When the master expires a key or evicts one via LRU, it synthesizes a DEL command and sends it to the slave.
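
A toy model of this behaviour is shown below. The `Master` and `Slave` classes, the integer timestamps, and the `expire_cycle` method are hypothetical names used only for illustration. The point is that the slave never looks at expiry times and removes a key only when told to.

```python
class Slave:
    def __init__(self):
        self.data = {}

    def apply(self, command, key, value=None):
        if command == "SET":
            self.data[key] = value
        elif command == "DEL":
            self.data.pop(key, None)


class Master:
    def __init__(self, slave):
        self.data = {}          # key -> (value, expire_at or None)
        self.slave = slave

    def set(self, key, value, expire_at=None):
        self.data[key] = (value, expire_at)
        self.slave.apply("SET", key, value)

    def expire_cycle(self, now):
        # Only the master decides that a key is gone; it tells the slave via DEL.
        expired = [k for k, (_, t) in self.data.items() if t is not None and t <= now]
        for key in expired:
            del self.data[key]
            self.slave.apply("DEL", key)


slave = Slave()
master = Master(slave)
master.set("session", "xyz", expire_at=100)
master.set("config", "on")

before = dict(slave.data)
master.expire_cycle(now=150)
after = dict(slave.data)
print(before, after)
```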


The complete process of replication

When the slave node starts, it saves the master node's information, including the master's host and port, but replication does not begin yet.

The slave node has a scheduled task that checks every second whether there is a new master node to connect to and replicate from. If so, the slave opens a socket connection to the master and sends it a PING command. If the master has requirepass set, the slave must send the password configured in masterauth to authenticate. On the first connection the master performs a full replication and sends all of its data to the slave. After that, the master asynchronously replicates each subsequent write command to the slave.


Full replication

  • The master runs BGSAVE to generate an RDB snapshot file locally.

  • The master node sends the RDB snapshot file to the slave node. If the transfer takes longer than 60 seconds (repl-timeout), the slave considers the replication to have failed; increase this parameter where appropriate. (At a typical transfer rate of about 100 MB per second, a 6 GB file takes about 60 seconds and can easily exceed the timeout.)

  • While generating the RDB, the master buffers all new write commands in memory. Once the slave has saved the RDB, the master replicates those buffered write commands to the slave.

  • If the replication output buffer stays above 64 MB for 60 consecutive seconds, or exceeds 256 MB at any moment, the master stops the replication and it fails.

    client-output-buffer-limit slave 256MB 64MB 60
  • While the transfer is in progress, the slave keeps serving requests from its old data version. After receiving the RDB, it clears its old data and loads the RDB into its own memory.

  • If AOF is enabled on the slave node, it immediately runs BGREWRITEAOF to rewrite the AOF.
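
The hard and soft limits in `client-output-buffer-limit slave 256MB 64MB 60` can be illustrated with a small check over buffer-size samples. This is a sketch of the rule as described above, not Redis's internal code: the `first_violation` function and the sample data are made up for the example.

```python
MB = 1024 * 1024


def first_violation(samples, hard=256 * MB, soft=64 * MB, soft_seconds=60):
    """samples: list of (timestamp_seconds, buffer_bytes) in time order.

    Returns the timestamp at which the slave connection would be closed, or None.
    """
    soft_since = None
    for ts, size in samples:
        if size > hard:
            return ts                       # hard limit: disconnect at once
        if size > soft:
            if soft_since is None:
                soft_since = ts             # soft limit crossed: start the clock
            elif ts - soft_since >= soft_seconds:
                return ts                   # stayed above the soft limit too long
        else:
            soft_since = None               # dropped back below: reset the clock
    return None


spike = [(0, 10 * MB), (5, 300 * MB)]
sustained = [(0, 70 * MB), (30, 80 * MB), (60, 90 * MB)]
recovered = [(0, 70 * MB), (30, 20 * MB), (90, 70 * MB)]

print(first_violation(spike), first_violation(sustained), first_violation(recovered))
```

A short spike above 256 MB fails immediately, a buffer that sits above 64 MB for a full minute fails at the 60 second mark, and a buffer that dips back under 64 MB in between is left alone.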


Incremental replication

  • If the master-slave network connection breaks during the full replication, incremental replication is triggered when the slave reconnects to the master.

  • The master takes the missing part of the data directly from its own backlog and sends it to the slave node. The backlog is 1 MB by default.

    repl-backlog-size 1mb
  • The master locates the data in the backlog using the offset carried in the PSYNC command sent by the slave.
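
Whether incremental replication succeeds depends on whether the backlog still holds the data the slave missed. The arithmetic below gives a rough feel for the sizing. The write rate of 256 KB per second and the 60 second outage are assumed figures for illustration, not measurements.

```python
def covered_seconds(backlog_bytes, write_bytes_per_second):
    """How long a disconnect the backlog can absorb before old data is overwritten."""
    return backlog_bytes / write_bytes_per_second


def required_backlog(disconnect_seconds, write_bytes_per_second):
    """Backlog size needed to survive a disconnect of the given length."""
    return disconnect_seconds * write_bytes_per_second


MB = 1024 * 1024
write_rate = 256 * 1024          # assumed: 256 KB of write traffic per second

print(covered_seconds(1 * MB, write_rate))       # seconds covered by a 1 MB backlog
print(required_backlog(60, write_rate) / MB)     # MB needed to cover a 60 s outage
```

Under these assumptions a 1 MB backlog covers only a few seconds of disconnection, so a busy master may need a larger repl-backlog-size to avoid falling back to full replication.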


Heartbeat

The master and slave nodes send heartbeats to each other.

By default, the master sends a heartbeat every 10 seconds, and the slave node sends a heartbeat every 1 second.

Asynchronous replication

After receiving a write command, the master first applies the write locally and then asynchronously sends it to the slave nodes.


How Redis achieves high availability

A system is highly available if it is available 99.99% of the time within 365 days.
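
To see what the 99.99% figure means in practice, the downtime budget can be computed directly. The helper name `allowed_downtime_minutes` is just for this example, and the 99.9% row is included only for comparison.

```python
def allowed_downtime_minutes(availability, days=365):
    """Maximum downtime (in minutes) compatible with the given availability."""
    return days * 24 * 60 * (1 - availability)


for target in (0.999, 0.9999):
    print(f"{target:.2%}: {allowed_downtime_minutes(target):.2f} minutes per year")
```

At 99.99% the system may be unavailable for less than an hour over the whole year, which is why master failures have to be handled automatically rather than by hand.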

The failure of one slave does not affect availability, as other slaves provide the same external query service with the same data.

But what happens if the master node dies? Data can no longer be written, so every write to the cache fails. And what use are the slave nodes then? With no master replicating data to them, the system is effectively unusable.

The high availability mechanism of Redis is called failover, also known as master/slave switchover.

When a master node fails, the failure is detected automatically and a slave node is promoted to master. This is called master/standby switchover, and it is how high availability is achieved under the Redis master-slave architecture.

Introduction to Sentinel

Sentinel is a very important component of the Redis cluster architecture. It has the following functions:

  • Cluster monitoring: monitors whether the Redis master and slave processes are working properly.

  • Message notification: if a Redis instance fails, Sentinel sends an alarm notification to the administrator.

  • Failover: if the master node fails, its role is automatically transferred to a slave node.

  • Configuration center: if a failover occurs, Sentinel notifies clients of the new master address.

Sentinel provides high availability for a Redis cluster, and it is itself distributed: it runs as a cluster of sentinels that cooperate with one another. During failover, deciding that a master node is down requires the agreement of a majority of the sentinels, which involves a distributed election. Even if some sentinels fail, the Sentinel cluster keeps working. A failover system that is a key part of the high availability mechanism would be of little use if it were itself a single point of failure.
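
The majority rule can be sketched as follows. This is a deliberate simplification that follows the description above: the function names and the vote dictionaries are invented for the example, and a real Sentinel deployment has additional settings and election steps that are not modelled here.

```python
def majority(total_sentinels):
    """Smallest number of votes that is more than half of the sentinels."""
    return total_sentinels // 2 + 1


def master_considered_down(votes, total_sentinels):
    """votes: dict mapping sentinel name -> True if it sees the master as down.

    Sentinels missing from `votes` are treated as failed and cannot vote.
    """
    down_votes = sum(1 for sees_down in votes.values() if sees_down)
    return down_votes >= majority(total_sentinels)


# Three sentinels, one of which has crashed: the remaining two still form a majority.
print(master_considered_down({"s1": True, "s2": True}, total_sentinels=3))
# Only one sentinel thinks the master is down: not enough to trigger a failover.
print(master_considered_down({"s1": True, "s2": False, "s3": False}, total_sentinels=3))
```

With three sentinels the cluster tolerates the loss of one of them and can still reach a decision, which is the property the paragraph above is describing.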