The first three chapters analyzed the characteristics and core principles of Redis in detail. Starting with this chapter, we turn to the deployment structures and operating modes of Redis. In a real production environment we do not serve traffic from a single Redis node; at a minimum we use a master-slave structure in sentinel or cluster mode to ensure the reliability of the Redis service. This article takes a closer look at the master-slave synchronization mechanism of Redis.

I. The two structural models of Redis master-slave replication

1.1 Primary/Secondary Replication

The one-master, N-slave replication structure has only one level of replication and is the most widely used form. Redis usually builds sentinel or cluster deployments on top of this structure. Because every slave holds a copy of the master's data, a slave can take over when the master fails, so the service stays available through a master-slave switchover.

1.2 Cascading Replication

In a cascading replication structure, the replication relationship can span multiple levels: a slave of the master node can itself act as the master of lower-level slaves. This structure is seldom used, but when there are many slave nodes it can relieve some of the replication pressure on the master node.

II. Establishing the Redis master-slave relationship

Master/slave synchronization in Redis starts with the SLAVEOF host port command, which establishes the master/slave relationship. SLAVEOF changes the behavior of the replication function dynamically while Redis is running: SLAVEOF host port turns the current server into a slave of the specified server. If the current server is already a slave of some master, SLAVEOF host port makes it stop synchronizing with the old master, discard the old data set, and start synchronizing with the new master. Executing SLAVEOF NO ONE on a slave turns replication off and switches the server back to a master without discarding the data set it has already synchronized. For that reason, when the master fails and neither sentinel nor cluster is in use, a slave can be promoted to be the new master with SLAVEOF NO ONE.

The following figure shows the master-slave relationship establishment process:

Note:

Running SLAVEOF host port on a node that already has a master-slave relationship ends the existing relationship and empties all data on the node, which makes it a risky operation in a production environment. Is there a safer way? The SLAVEOF command also accepts the NO ONE argument: SLAVEOF NO ONE only ends the master/slave replication relationship and does not empty the data.
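The difference between the two forms of SLAVEOF can be sketched with a toy model. This is only an illustration in Python, not Redis code: the ToyNode class and its method names are hypothetical, and "synchronization" is reduced to copying a dictionary.

```python
class ToyNode:
    """Toy model of a Redis node (hypothetical; not Redis code)."""

    def __init__(self, name, data=None):
        self.name = name
        self.data = dict(data or {})
        self.master = None  # None means the node is acting as a master

    def slaveof(self, master):
        """Model of SLAVEOF host port: old data is dropped, then resynchronized."""
        self.master = master
        self.data = dict(master.data)  # the previous data set does not survive

    def slaveof_no_one(self):
        """Model of SLAVEOF NO ONE: replication stops, the data set is kept."""
        self.master = None


old_master = ToyNode("old", {"k": "v1"})
new_master = ToyNode("new", {"other": "x"})
replica = ToyNode("replica")

replica.slaveof(old_master)
after_first = dict(replica.data)    # copy of old_master's data
replica.slaveof(new_master)
after_switch = dict(replica.data)   # old data gone, new master's data present
replica.slaveof_no_one()
after_promote = dict(replica.data)  # unchanged by the promotion
```

In this model, switching masters replaces the data set, while SLAVEOF NO ONE leaves it untouched, which is why it is the safer way to end replication.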

III. Data synchronization

After the master-slave relationship is established, master-slave data synchronization begins. There are three main situations: full data synchronization right after the relationship is established; command propagation once the initial synchronization is complete; and choosing a synchronization mode, either full or incremental, when the connection is interrupted and then re-established.

3.1 Full Synchronization

  1. After the slave node starts, or after it reconnects without meeting the conditions for incremental synchronization, it sends the SYNC command to the master node.

  2. After receiving the SYNC command, the master node starts saving a snapshot in the background (that is, RDB persistence, which is triggered unconditionally during master/slave replication) and caches the write commands it receives while the snapshot is being saved.

  3. After RDB persistence is complete, the master node sends the RDB snapshot file to all slave nodes and keeps recording the write commands executed while the snapshot is being sent.

  4. After receiving the snapshot file, the slave node discards all of its old data and loads the received snapshot.

  5. After the master node has sent the snapshot and the slave node has loaded it, the master node sends the write commands in its buffer to the slave node.

  6. The slave node finishes loading the snapshot, starts accepting command requests, and executes the write commands from the master node's buffer. (Initialization of the slave node is now complete.)

  7. From then on, every time the master node executes a write command, it sends the same write command to the slave node, which receives and executes it. (This is command propagation, the normal operation after the slave node has been initialized.)

The full synchronization process is shown as follows:

Prior to Redis 2.8, a slave node performed a full synchronization both after initialization and after every disconnection. Redis 2.8 introduced the PSYNC command, which decides whether incremental synchronization can be used after a disconnection.
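The full-synchronization steps of section 3.1 can be sketched as a small simulation. This is a simplified illustration, not how Redis is implemented: ToyMaster and ToySlave are hypothetical names, a dictionary copy stands in for the RDB file, and a single shared list stands in for the buffer of cached write commands.

```python
class ToySlave:
    """Hypothetical slave: holds a data set and applies write commands."""

    def __init__(self, data=None):
        self.data = dict(data or {})

    def load_snapshot(self, snapshot):
        self.data = dict(snapshot)  # step 4: discard old data, load snapshot

    def apply(self, key, value):
        self.data[key] = value


class ToyMaster:
    """Hypothetical master: snapshots, buffers, then propagates writes."""

    def __init__(self):
        self.data = {}
        self.buffer = []        # writes received while a snapshot is in flight
        self.snapshotting = False
        self.slaves = []

    def write(self, key, value):
        self.data[key] = value
        if self.snapshotting:
            self.buffer.append((key, value))  # steps 2-3: cache the write
        for slave in self.slaves:
            slave.apply(key, value)           # step 7: command propagation

    def begin_snapshot(self):
        self.snapshotting = True
        self.buffer = []
        return dict(self.data)  # point-in-time copy, standing in for the RDB file

    def finish_sync(self, slave, snapshot):
        slave.load_snapshot(snapshot)         # step 4
        for key, value in self.buffer:        # steps 5-6: replay buffered writes
            slave.apply(key, value)
        self.buffer = []
        self.snapshotting = False
        self.slaves.append(slave)             # from now on: step 7


master = ToyMaster()
master.write("a", 1)
slave = ToySlave({"stale": 0})
snapshot = master.begin_snapshot()  # steps 1-2: SYNC received, snapshot starts
master.write("b", 2)                # arrives during the snapshot, so it is buffered
master.finish_sync(slave, snapshot)
master.write("c", 3)                # propagated directly
```

After the run, the slave's stale key is gone and its data set matches the master's, including the write that arrived while the snapshot was being taken.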

3.2 Incremental Synchronization

PSYNC provides two modes: full resynchronization and partial (incremental) resynchronization.

  1. Full resynchronization: the process is basically the same as the original SYNC replication. This is known as full replication.
  2. Partial resynchronization: when the slave reconnects after a disconnection, the master sends it only the write commands that were executed while it was disconnected. This is known as incremental replication.

There are three important concepts in PSYNC execution: runid, offset (replication offset), and the replication backlog buffer.

1. runid

Each Redis server has an ID (runid) that identifies it. The ID sent in PSYNC is the runid of the previously connected Master. If no such ID has been saved, the slave sends “PSYNC ? -1” to the Master, indicating that full replication is required.

2. Offset (replication offset)

During replication, both the Master and the Slave maintain an offset. When the Master sends a command of N bytes, it adds N to its offset; when the Slave receives a command of N bytes, it adds N to its own offset. If the Master and Slave are in the same state, their offsets are equal.
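As a small illustration of this byte accounting, the sketch below encodes one command in the Redis wire format (RESP) and advances both offsets by its length. The encode_command helper and the two offset variables are hypothetical names used only for this example.

```python
def encode_command(*parts):
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(parts)]
    for part in parts:
        raw = part.encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(raw), raw))
    return b"".join(out)


master_offset = 0
slave_offset = 0

payload = encode_command("SET", "k", "v")
master_offset += len(payload)  # the Master sent N bytes
slave_offset += len(payload)   # the Slave received the same N bytes

in_sync = master_offset == slave_offset
```

If the Slave had missed the payload, its offset would lag behind by exactly len(payload) bytes, which is the gap that partial resynchronization has to fill.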

3. Replication backlog buffer

The replication backlog buffer is a fixed-length circular queue (FIFO) maintained by the Master to cache the commands it has propagated. When the Master propagates a command, it not only sends the command to all slaves but also writes it to the replication backlog buffer. The difference between PSYNC and SYNC is that with PSYNC the need for a full synchronization is determined when the slave connects; when a full synchronization is needed, its logic is the same as with SYNC. PSYNC performs the following steps:

  1. The client sends the SLAVEOF command to the slave server. When the slave then sends a connection request to the master, it determines whether this is its first connection based on whether a master runid has been saved.

  2. If this is the first synchronization, the slave sends PSYNC ? -1 to the Master to perform a full synchronization. If it is a reconnection, the slave sends PSYNC runid offset to the Master. (runid is the Master's ID, and offset is the slave's global replication offset, that is, how far it has synchronized.)

  3. After receiving the PSYNC command, the Master checks whether the runid matches its own ID. If it does, the Master checks whether the gap between the slave's offset and its own offset exceeds the size of the replication backlog buffer. If not, the Master replies CONTINUE, and the Slave only needs to wait for the Master to send the commands it missed during the disconnection. If the runid does not match, or the offset gap exceeds the size of the replication backlog buffer, the Master replies FULLRESYNC runid offset, and the Slave saves the runid and performs a full synchronization.

During command propagation, the master sends each write command to the slaves, stores it in the backlog queue, and records the global offset of the commands currently held in the queue. When a slave reconnects, the master uses the offset passed by the slave to find, in the circular backlog queue, the commands executed during the disconnection and sends them to the slave, which achieves incremental synchronization.
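The decision between CONTINUE and FULLRESYNC can be sketched as follows. This is a simplified model under stated assumptions: ReplicationBacklog and handle_psync are hypothetical names, the buffer is modeled as a sliding window over the most recent bytes rather than a true ring, and the slave's offset is taken to mean the number of bytes it has already processed.

```python
class ReplicationBacklog:
    """Toy fixed-size backlog addressed by global replication offset."""

    def __init__(self, size):
        self.size = size
        self.buf = bytearray()
        self.master_offset = 0  # total bytes ever propagated

    def feed(self, data):
        """Record propagated bytes, evicting the oldest ones when full."""
        self.buf += data
        self.master_offset += len(data)
        overflow = len(self.buf) - self.size
        if overflow > 0:
            del self.buf[:overflow]

    @property
    def first_offset(self):
        """Global offset of the oldest byte still held in the backlog."""
        return self.master_offset - len(self.buf)


def handle_psync(backlog, master_runid, runid, offset):
    """Return the master's reply to PSYNC runid offset in this toy model."""
    same_master = runid == master_runid
    in_window = backlog.first_offset <= offset <= backlog.master_offset
    if same_master and in_window:
        missing = bytes(backlog.buf[offset - backlog.first_offset:])
        return ("CONTINUE", missing)
    return ("FULLRESYNC", master_runid, backlog.master_offset)


backlog = ReplicationBacklog(size=10)
backlog.feed(b"0123456789")  # global offsets 0..9
backlog.feed(b"abcde")       # global offsets 10..14; bytes 0..4 are evicted

first_sync = handle_psync(backlog, "run-1", "?", -1)
partial = handle_psync(backlog, "run-1", "run-1", 12)
too_old = handle_psync(backlog, "run-1", "run-1", 3)
wrong_id = handle_psync(backlog, "run-1", "run-0", 12)
```

Only the reconnecting slave whose offset still falls inside the backlog window receives the missing bytes; a first connection, a stale offset, or a different runid all fall back to full synchronization.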

The PSYNC execution process is shown as follows:

As the PSYNC execution process shows, when a slave node reconnects after a disconnection, whether incremental synchronization can be used depends on whether the gap between the slave's offset and the master's offset exceeds the size of the replication backlog buffer. The replication backlog buffer is essentially a fixed-length circular queue, 1 MB by default. You can set its size in the configuration file:

repl-backlog-size 1mb
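A common rule of thumb, stated here as an assumption rather than an official formula, is that the backlog should hold at least as many bytes as the master writes during the longest disconnection you want to survive without a full synchronization. The helper below is a hypothetical sketch of that arithmetic; the traffic figures are made-up inputs, not measurements.

```python
DEFAULT_BACKLOG_BYTES = 1024 * 1024  # the 1mb default mentioned above


def min_backlog_bytes(write_bytes_per_sec, max_disconnect_sec, safety_factor=2):
    """Estimate a backlog size that covers a disconnection of the given length.

    safety_factor is an arbitrary margin chosen for this sketch.
    """
    return write_bytes_per_sec * max_disconnect_sec * safety_factor


# Assumed workload: 50,000 bytes/s of replicated writes, 60 s outages tolerated.
needed = min_backlog_bytes(50_000, 60)
default_is_enough = needed <= DEFAULT_BACKLOG_BYTES
```

Under these assumed numbers the default would be too small, so a reconnecting slave would fall back to full synchronization unless repl-backlog-size were raised.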

Redis also lets you configure how long the master waits, once no slaves are connected, before releasing the replication backlog (in seconds; the default is one hour):

repl-backlog-ttl 3600

IV. Master/slave replication policy

Redis adopts an optimistic replication strategy: it tolerates some inconsistency between the master and slave databases for a period of time, but maintains eventual consistency between them. Specifically, master/slave replication in Redis is asynchronous. After the master completes a client request, it immediately returns the result to the client and asynchronously propagates the command to the slaves, without waiting for the slaves to finish synchronizing. This keeps replication from hurting performance, but it also opens a time window in which the data is inconsistent. If the network is suddenly disconnected during this window, the master and slaves end up with different data. This is the default behavior when no other policy is configured. To keep master/slave inconsistency under control, Redis provides the following two parameters:

min-slaves-to-write 3
min-slaves-max-lag 10
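
When fewer than min-slaves-to-write slaves are connected with a replication lag of at most min-slaves-max-lag seconds, the master stops accepting write operations. With the values above, the master accepts writes only while at least 3 slaves have a lag of 10 seconds or less.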
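The rule can be written down as a small predicate. This is a sketch of the check only, not the Redis implementation: master_accepts_writes is a hypothetical name, and the lag values are supplied by hand instead of being measured from live slaves.

```python
def master_accepts_writes(slave_lags, min_slaves_to_write, min_slaves_max_lag):
    """Return True if enough slaves are within the allowed lag (in seconds).

    slave_lags holds one lag value per connected slave.
    Setting either option to 0 disables the check.
    """
    if min_slaves_to_write == 0 or min_slaves_max_lag == 0:
        return True
    good_slaves = sum(1 for lag in slave_lags if lag <= min_slaves_max_lag)
    return good_slaves >= min_slaves_to_write


# With min-slaves-to-write 3 and min-slaves-max-lag 10:
healthy = master_accepts_writes([1, 2, 3], 3, 10)       # three slaves within 10 s
one_lagging = master_accepts_writes([1, 2, 15], 3, 10)  # only two within 10 s
too_few = master_accepts_writes([1, 2], 3, 10)          # only two slaves connected
disabled = master_accepts_writes([], 0, 10)             # check switched off
```

The master refuses writes in the second and third cases, which bounds how many acknowledged writes can be lost when slaves fall behind or disconnect.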

There is another parameter that affects the latency between the master and slave:

repl-disable-tcp-nodelay:

If set to yes, Redis merges small TCP packets to save bandwidth, at the cost of a higher synchronization delay, which widens the window of master-slave inconsistency. If set to no, the Redis Master sends replication data immediately, with little delay.
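At the socket level, this option corresponds to the TCP_NODELAY flag. The sketch below shows the mapping with Python's standard socket module; configure_replication_socket is a hypothetical helper for illustration, and the socket is never connected.

```python
import socket


def configure_replication_socket(sock, repl_disable_tcp_nodelay):
    """Apply the option to a socket and return the resulting TCP_NODELAY value.

    yes (True): TCP_NODELAY is off, so small packets may be merged before sending.
    no (False): TCP_NODELAY is on, so data is sent immediately.
    """
    nodelay = 0 if repl_disable_tcp_nodelay else 1
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, nodelay)
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    low_latency = configure_replication_socket(sock, repl_disable_tcp_nodelay=False)
    batched = configure_replication_socket(sock, repl_disable_tcp_nodelay=True)
```

The double negative is easy to misread: disabling "no delay" is what introduces the delay.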

In any scenario, the master-slave synchronization of Redis can be abstracted into the following steps:

1. Establish a socket connection

Using the configured master address and port, the slave server creates a socket connection to the master server. After accepting the connection, the master server creates the corresponding client state for that socket and treats the slave server as one of its clients. From this point on, the slave server has two identities: it is both a server and a client.

2. Run the PING command

The PING command serves two purposes. First, the socket has been created but not yet used, so PING checks whether it can be read from and written to normally. Second, PING checks whether the master server can process command requests normally; if it can, the master server replies PONG.

3. Authentication

After receiving the “PONG” reply from the master server, the slave server decides whether authentication is needed. If the masterauth option is set on the slave server, authentication is performed; if it is not set, authentication is skipped.

4. Send port information

After the authentication step, the slave server executes REPLCONF listening-port <port> to send its own listening port number to the master server.

5. Synchronize data

The slave server sends the SYNC or PSYNC command to the master server to perform synchronization.

6. Command propagation

Once synchronization is complete, the master server and the slave server enter the command propagation phase. The master server only needs to send every write command it executes to the slave server, and the slave server only needs to receive and execute the write commands sent by the master server.
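The commands a slave sends during these steps can be listed in order with a short sketch. The handshake_commands function is a hypothetical helper: it only builds the command sequence described above and does not open a connection or talk to a real server.

```python
def handshake_commands(listening_port, masterauth=None, runid="?", offset=-1):
    """Return, in order, the commands a slave sends after connecting.

    runid="?" and offset=-1 model a first synchronization (PSYNC ? -1).
    """
    commands = [("PING",)]                                            # step 2
    if masterauth is not None:
        commands.append(("AUTH", masterauth))                         # step 3
    commands.append(("REPLCONF", "listening-port", str(listening_port)))  # step 4
    commands.append(("PSYNC", runid, str(offset)))                    # step 5
    return commands


first_connection = handshake_commands(6380)
with_password = handshake_commands(6380, masterauth="secret")
reconnection = handshake_commands(6380, runid="run-1", offset=1024)

for command in first_connection:
    print(" ".join(command))
```

The port, password, runid, and offset values are placeholders; what matters is the order of the steps and that AUTH appears only when masterauth is configured.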

V. Conclusion

This article describes in detail the master/slave synchronization mechanism of Redis and the choice of synchronization strategy in different scenarios, which is also the cornerstone of Redis high availability. On this basis, the next article will analyze the implementation of Redis high availability.