This post continues from “Redis master-slave synchronization (1): the approach before version 2.8”. As mentioned at the end of that post, the pre-2.8 approach has a flaw, as you may remember. Today we look at how master-slave synchronization was improved starting with version 2.8.

Let's first review the shortcomings of Redis master-slave synchronization before version 2.8:

  • Suppose the initial SYNC has completed and the master and slave servers are in the command propagation stage. Some commands are propagated, and then the connection between the master and the slave drops
  • After the connection is re-established, the slave starts a SYNC with the master all over again
  • For this SYNC, the master generates an RDB file that again contains the keys from the first synchronization and from the commands propagated afterwards
  • The SYNC does bring the slave up to date, but many of those keys did not need to be synchronized again
  • Because the master has to generate an RDB file during SYNC, the operation uses a lot of the master's resources: CPU, memory, and IO
  • Transferring the RDB file from the master to the slave consumes network bandwidth
  • Loading the RDB file on the slave also uses a lot of server resources: CPU, memory, and IO
  • Therefore, unnecessary SYNC operations waste server resources and degrade service performance
In summary, the old master-slave synchronization was very inefficient when a slave reconnected after a disconnection, so starting with version 2.8 Redis uses the PSYNC command in place of the SYNC command to perform synchronization

PSYNC command

  • The PSYNC command has two modes: full resynchronization and partial resynchronization
Full resynchronization mode
  • Full resynchronization handles the initial replication case
  • Its execution process is the same as that of the SYNC command. Let's review it:

  • 1. The slave server, after being given the SLAVEOF command, sends a synchronization request to the master server
  • 2. The master server receives the request
  • 3. The master server runs the BGSAVE command
  • 4. It generates an RDB file in the background
  • 5. It uses a buffer to record all write commands executed from this point on
  • 6. The BGSAVE command finishes on the master server
  • 7. The master server sends the RDB file generated by BGSAVE to the slave server
  • 8. The slave server receives the RDB file and loads it
  • 9. The slave's database state now matches the master's state at the time BGSAVE ran
  • 10. The master server sends all write commands recorded in the buffer to the slave server
  • 11. The slave server runs these commands, which completes the synchronization
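The steps above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `Master` class, `full_resync`, and the use of a plain dict as the database are all made up for illustration, and the "RDB file" is just a deep copy of that dict.

```python
import copy


class Master:
    """Toy master: a dict as the database plus a buffer for writes made during the snapshot."""

    def __init__(self):
        self.db = {}
        self.buffer = []           # write commands recorded while the snapshot is running
        self.snapshotting = False

    def write(self, key, value):
        self.db[key] = value
        if self.snapshotting:      # step 5: record writes that arrive during BGSAVE
            self.buffer.append((key, value))

    def begin_snapshot(self):
        # Stands in for BGSAVE (steps 3-4): freeze a copy of the data set.
        self.snapshotting = True
        self.buffer = []
        return copy.deepcopy(self.db)

    def end_snapshot(self):
        # Step 6: the snapshot is done; hand over the buffered writes.
        self.snapshotting = False
        buffered, self.buffer = self.buffer, []
        return buffered


def full_resync(master, slave_db, writes_during_snapshot):
    rdb = master.begin_snapshot()
    for key, value in writes_during_snapshot:   # clients keep writing meanwhile
        master.write(key, value)
    buffered = master.end_snapshot()
    slave_db.clear()
    slave_db.update(rdb)                        # steps 7-9: slave loads the snapshot
    for key, value in buffered:                 # steps 10-11: slave replays buffered writes
        slave_db[key] = value
    return slave_db


master = Master()
master.write("k1", "v1")
slave_db = full_resync(master, {}, [("k2", "v2")])
print(slave_db == master.db)
```

The point of the buffer is that the snapshot alone would miss `k2`, which was written while the snapshot was being taken.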
Partial resynchronization mode
  • Partial resynchronization mainly handles replication after a disconnection and reconnection
  • Other than that it behaves like SYNC. The difference is that after reconnecting, the master sends the slave only the commands that were executed during the disconnection
Let's look at how the PSYNC command proceeds when the connection drops and is re-established:
Time | Master server | Slave server
T0 | Master and slave are in sync | Master and slave are in sync
T1 | Executes set k1 v1 and propagates it | Executes set k1 v1 propagated from the master
T2 | Executes set k2 v2 and propagates it | Executes set k2 v2 propagated from the master
T3 | Master and slave are disconnected | Master and slave are disconnected
T4 | Executes set k3 v3 | Disconnected from the master, so set k3 v3 is not propagated
T5 | Executes set k4 v4 | Disconnected from the master, so set k4 v4 is not propagated
T6 | The fault is fixed and the connection is re-established | The fault is fixed and the connection is re-established
T7 | | Sends the PSYNC command to the master
T8 | Replies to the slave, confirming partial resynchronization | Receives the master's confirmation and prepares for partial resynchronization
T9 | Sends set k3 v3 and set k4 v4 to the slave |
T10 | | Receives set k3 v3 and set k4 v4 from the master and executes them
T11 | Partial resynchronization is complete | Partial resynchronization is complete
  • This process relies on three components: the replication offsets of the master and slave servers, the replication backlog of the master server, and the server run ID
Replication offsets of the master and slave servers
  • Both the master and the slave maintain a replication offset
  • Each time the master propagates N bytes of data to a slave, it adds N to its own offset
  • Each time the slave receives N bytes of data propagated from the master, it adds N to its own offset
  • An example of master and slave offsets:
Master server: offset=100
Slave server A: offset=100
Slave server B: offset=100
  • In this case, the master server and slave servers A and B are all in sync
  • Now suppose the master propagates 100 bytes of data to the slaves while slave B is disconnected
Master server: offset=200
Slave server A: offset=200
Slave server B: offset=100 (disconnected from the master server)
  • In this case, the master server and slave server A are in sync
  • Slave server B missed the data because it was disconnected, so its offset is still 100. Its offset differs from the master's, so we can tell that B is out of sync
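The offset bookkeeping described above can be sketched as follows. The `Node` class and the `propagate` and `in_sync` helpers are hypothetical names invented for this illustration; they are not part of Redis.

```python
class Node:
    """A server that tracks only its replication offset and connection state."""

    def __init__(self, name, offset=0):
        self.name = name
        self.offset = offset
        self.connected = True


def propagate(master, slaves, payload: bytes):
    # The master adds the number of bytes it propagates to its own offset.
    n = len(payload)
    master.offset += n
    # Each connected slave adds the number of bytes it receives to its own offset.
    for slave in slaves:
        if slave.connected:
            slave.offset += n


def in_sync(master, slave):
    # Equal offsets mean the slave has seen everything the master propagated.
    return master.offset == slave.offset


master = Node("master", 100)
a = Node("A", 100)
b = Node("B", 100)
b.connected = False                      # slave B drops off the network
propagate(master, [a, b], b"x" * 100)    # the master propagates 100 bytes
print(master.offset, a.offset, b.offset)
print(in_sync(master, a), in_sync(master, b))
```

Comparing two integers is enough to detect that B has fallen behind, which is why the offset is the first thing the slave reports when it reconnects.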
Replication backlog of the master server
  • The replication backlog is a queue maintained by the master server, with a default size of 1MB
  • The queue is fixed-length and first in, first out
  • When the master propagates commands, it not only sends them to all of its slaves
  • It also writes them into the replication backlog

  • Because the replication backlog is a queue, every byte of every command the master propagates is written into it. Each entry consists of two parts: the offset and the byte value
  • The queue structure is as follows:
Node1:
offset=1
val=s
Node2:
offset=2
val=e
Node3:
offset=3
val=t
Node4:
offset=4
val=k
Node5:
offset=5
val=1
Node6:
offset=6
val=v
Node7:
offset=7
val=1
  • As the queue structure shows, each byte of the master's command set k1 v1 is stored in the replication backlog together with its offset (the spaces are left out of this simplified listing)
  • When a slave reconnects to the master, it sends its own replication offset to the master via the PSYNC command. The master uses that offset to decide between full resynchronization and partial resynchronization, as follows:
Time | Master server | Slave server
T0 | Master and slave are disconnected | Master and slave are disconnected
T1 | The fault is fixed and the connection is re-established | The fault is fixed and the connection is re-established
T2 | | Sends the PSYNC command to the master, including its own replication offset
T3 | Receives the replication offset from the slave and evaluates it |
T4 | If the data after that offset is still in the replication backlog, performs partial resynchronization and reads the data from the backlog |
T5 | If the data after that offset is no longer in the backlog (the queue is first in, first out, so old data is evicted), performs full resynchronization |
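A minimal sketch of this decision, assuming the simplified per-byte queue model from this post rather than Redis's actual internal layout. `ReplicationBacklog`, `feed`, and `read_after` are hypothetical names, and the 32-byte size is chosen only so that eviction is visible.

```python
from collections import deque


class ReplicationBacklog:
    """Fixed-length FIFO of (offset, byte) entries, as in the queue listing above."""

    def __init__(self, size):
        self.buf = deque(maxlen=size)   # the oldest entries fall out once the queue is full
        self.master_offset = 0

    def feed(self, data: bytes):
        # Every propagated byte gets the next offset and is appended to the queue.
        for byte in data:
            self.master_offset += 1
            self.buf.append((self.master_offset, byte))

    def read_after(self, slave_offset):
        """Return the bytes after slave_offset, or None if a full resync is needed."""
        if slave_offset == self.master_offset:
            return b""                  # nothing was missed
        if slave_offset > self.master_offset or not self.buf:
            return None                 # offset is invalid for this backlog
        oldest = self.buf[0][0]
        if slave_offset + 1 < oldest:
            return None                 # some missed bytes were already evicted
        return bytes(b for off, b in self.buf if off > slave_offset)


backlog = ReplicationBacklog(size=32)
for command in (b"set k1 v1", b"set k2 v2", b"set k3 v3", b"set k4 v4"):
    backlog.feed(command)

print(backlog.read_after(18))   # partial resync: the bytes of the last two commands
print(backlog.read_after(0))    # None: the first bytes were evicted, full resync
```

With a backlog that is too small for the amount of data written during an outage, `read_after` returns `None` and the slave falls back to a full resynchronization, which is the trade-off behind choosing the backlog size.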
Run ID of the Redis server
  • Every Redis server gets its own run ID when it starts
  • Run IDs are effectively never repeated: each one is a randomly generated string of 40 hexadecimal characters
  • When a slave replicates from a master for the first time, the slave saves the master's run ID
  • After reconnecting, the slave sends the saved run ID along with the PSYNC command, and the master compares it with its own. If the IDs are the same, the master goes on to check the replication offset to decide whether full or partial resynchronization is needed. If the IDs differ, the slave was replicating from a different master, so full resynchronization is performed
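The run ID check and the offset check combine into one decision, sketched below. `new_run_id` and `choose_sync_mode` are hypothetical helpers; `backlog_first_offset` is assumed to be the offset of the oldest byte still held in the master's backlog. The 40-character hexadecimal format mirrors Redis run IDs, but the way the ID is generated here is only an illustration.

```python
import secrets


def new_run_id():
    # 20 random bytes rendered as 40 hexadecimal characters.
    return secrets.token_hex(20)


def choose_sync_mode(master_run_id, master_offset, backlog_first_offset,
                     slave_known_run_id, slave_offset):
    """Return "full" or "partial" for a reconnecting slave."""
    if slave_known_run_id != master_run_id:
        return "full"      # the slave was replicating from a different master
    if slave_offset > master_offset:
        return "full"      # the slave claims data this master never sent
    if slave_offset + 1 < backlog_first_offset:
        return "full"      # the missed bytes are no longer in the backlog
    return "partial"


run_id = new_run_id()
print(len(run_id))
print(choose_sync_mode(run_id, 36, 1, run_id, 18))
print(choose_sync_mode(run_id, 36, 1, "some-other-master", 18))
```

The run ID is checked first because an offset is only meaningful relative to the command stream of one specific master.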
Detailed PSYNC execution process
Time | Master server | Slave server
T0 | Starts and generates a unique run ID (say 1 for this example) | Starts and generates a unique run ID (say 2 for this example)
T1 | Initial replication is complete (assume no commands have been executed yet); master offset=0 and the replication backlog is empty | Initial replication is complete (assume no commands have been executed yet); offset=0
T2 | Executes set k1 v1, propagates it, and writes it to the replication backlog; master offset=9 | Executes set k1 v1 propagated from the master; offset=9
T3 | Executes set k2 v2, propagates it, and writes it to the replication backlog; master offset=18 | Executes set k2 v2 propagated from the master; offset=18
T4 | Master and slave are disconnected | Master and slave are disconnected
T5 | Executes set k3 v3 and writes it to the replication backlog; master offset=27 | Disconnected from the master, so set k3 v3 is not propagated
T6 | Executes set k4 v4 and writes it to the replication backlog; master offset=36 | Disconnected from the master, so set k4 v4 is not propagated
T7 | The fault is fixed and the connection is re-established | The fault is fixed and the connection is re-established
T8 | | Sends the PSYNC command to the master with the saved master run ID=1 and its own offset=18
T9 | Receives the PSYNC command and finds that the run ID matches its own, so this slave was replicating from it before the disconnection |
T10 | Checks the offset sent by the slave, offset=18, and determines whether the data after offset=18 still exists in the replication backlog |
T11 | Replies to the slave, confirming partial resynchronization | Receives the master's confirmation and prepares for partial resynchronization
T12 | Sends the two commands after offset=18, set k3 v3 and set k4 v4, from the replication backlog to the slave |
T13 | | Receives set k3 v3 and set k4 v4 from the master, executes them, and updates its offset to offset=36
T14 | Partial resynchronization is complete; both offsets are offset=36 | Partial resynchronization is complete; both offsets are offset=36
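The whole timeline can be replayed with a small simulation. This is a sketch under several assumptions: `Master`, `Slave`, and their methods are invented for illustration, the backlog stores whole commands with their starting offsets instead of single bytes, a command's size is taken as the length of its text (which gives the 9, 18, 27, 36 used in the table), and the reply names only loosely mirror the real protocol.

```python
from collections import deque


class Master:
    def __init__(self, run_id, backlog_commands=100):
        self.run_id = run_id
        self.offset = 0
        self.db = {}
        # (start_offset, command) pairs; old commands are evicted first.
        self.backlog = deque(maxlen=backlog_commands)

    def execute(self, command):
        _, key, value = command.split()
        self.db[key] = value
        self.backlog.append((self.offset, command))
        self.offset += len(command.encode())
        return command                       # what would be propagated to slaves

    def psync(self, run_id, offset):
        oldest = self.backlog[0][0] if self.backlog else self.offset
        if run_id == self.run_id and oldest <= offset <= self.offset:
            missed = [cmd for start, cmd in self.backlog if start >= offset]
            return "CONTINUE", missed        # partial resynchronization
        return "FULLRESYNC", dict(self.db)   # full resynchronization


class Slave:
    def __init__(self):
        self.master_run_id = None
        self.offset = 0
        self.db = {}

    def apply(self, command):
        _, key, value = command.split()
        self.db[key] = value
        self.offset += len(command.encode())

    def sync(self, master):
        reply, payload = master.psync(self.master_run_id, self.offset)
        if reply == "CONTINUE":
            for command in payload:
                self.apply(command)
        else:
            self.db = payload
            self.offset = master.offset
            self.master_run_id = master.run_id
        return reply


master, slave = Master(run_id="1"), Slave()
first = slave.sync(master)                   # T0-T1: initial replication
slave.apply(master.execute("set k1 v1"))     # T2: both offsets become 9
slave.apply(master.execute("set k2 v2"))     # T3: both offsets become 18
master.execute("set k3 v3")                  # T5: slave is disconnected
master.execute("set k4 v4")                  # T6: slave is disconnected
second = slave.sync(master)                  # T8-T13: reconnect and PSYNC
print(first, second, master.offset, slave.offset)
```

Constructing the master with `backlog_commands=1` makes the same reconnect fall back to a full resynchronization, because `set k3 v3` has already been evicted by the time the slave asks for it.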

That's all for today on Redis master-slave synchronization from version 2.8 onward. Comments are welcome, and please point out any mistakes in the post so I can deepen my understanding. May your code be bug-free. Thank you!