Preface

With the previous article on the principles of AOF, the Redis persistence series is complete. It consists of two articles (AOF and RDB); if you are unfamiliar with Redis persistence, read those two first, and remember to like and follow afterward. The next series covers Redis high availability, starting with basic master-slave replication and moving on to Sentinel and Cluster. It is currently planned as three articles, some of which may be long, and all of them are fairly hands-on. So, brothers, as always, don't be lazy.

Overview

In telecommunications, master-slave synchronization means placing a high-precision reference clock in a switching node and sending its signal to every slave node over the transmission links; each slave node uses a phase-locked loop to lock its local clock frequency to the reference, so that clocks across the network stay synchronized. Database replication follows the same idea. Put simply, data from the master database is sent over the network to another server, where a database environment identical to the master, called the slave database, is created. When data in the master database changes, the change is propagated to the slave in time (the slave keeps a connection open to the master). Redis also provides replication. The previous two articles dealt mainly with Redis on a single node; replication addresses high availability in more complex environments. Sentinel and Cluster, covered in later chapters, are both built on top of replication.

Usage

1 Setting up the Configuration

Replication involves master nodes and slave nodes. Each slave can have only one master, while a master can have multiple slaves at the same time. Replicated data flows one way, from master to slave only. Replication is initiated entirely on the slave node; nothing needs to be done on the master. It can be configured through the configuration file, a startup option, or a client command, and the three methods are equivalent.

1.1 Configuration File

Add slaveof <masterip> <masterport> to the configuration file of the slave server.

1.2 Startup Commands

Append --slaveof <masterip> <masterport> to the redis-server startup command.


1.3 Client Commands

On the client, run slaveof <masterip> <masterport>.
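When the command is issued from a client, it travels to the server as a RESP array of bulk strings. The following is a minimal sketch of that encoding, not code taken from any Redis client; the helper name encode_command is made up for this illustration.

```python
def encode_command(*parts: str) -> bytes:
    """Encode a command as a RESP array of bulk strings (hypothetical helper)."""
    out = [b"*%d\r\n" % len(parts)]              # array header: number of arguments
    for part in parts:
        data = part.encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))  # bulk string: length, then bytes
    return b"".join(out)


# The same command used in the demonstration later in this article.
payload = encode_command("SLAVEOF", "127.0.0.1", "6379")
print(payload)
```

Sending these bytes to port 6380 would have the same effect as typing the command in redis-cli, assuming the instance accepts the connection.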

2 Preparing An Example

Since I don't have many servers, I run two instances on a single server for this demonstration. The master listens on port 6379 and the slave on port 6380. If you are not familiar with basic operations such as starting Redis, see the article on installing and configuring Redis on Linux and Docker.

3 Creating replication

Any of the three methods above will work. Here I use method 1.3, the client command, to set up replication.

## Connect to the slave
redis-cli -p 6380
## Get the key hello on the slave: nothing there yet
127.0.0.1:6380> get hello
(nil)
## Get the same key on the master
127.0.0.1:6379> get hello
"world"
## Make 6380 a slave of 6379
127.0.0.1:6380> slaveof 127.0.0.1 6379
OK
## Query hello on the slave again: the data has been replicated
127.0.0.1:6380> get hello
"world"
## Write commands sent to the slave are rejected with this error
(error) READONLY You can't write against a read only slave.
## Delete hello on the master
127.0.0.1:6379> del hello
(integer) 1
## Query hello on the slave again: it has been deleted
127.0.0.1:6380> get hello
(nil)

4 Disconnecting Replication

Replication is disconnected with slaveof no one. Note that after the slave disconnects from replication, its existing data is not deleted; it simply stops receiving new data changes from the master. The process has two steps: first the slave breaks its replication relationship with the master, then the slave is promoted to a master.

127.0.0.1:6380> slaveof no one
OK
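The behaviour demonstrated in sections 3 and 4 can be mimicked with a toy in-memory model. This is only a sketch for building intuition: the Node class, its methods, and ReadOnlyError are invented for this example and have nothing to do with Redis internals.

```python
class ReadOnlyError(Exception):
    """Raised when a write is attempted on a slave (hypothetical)."""


class Node:
    """Toy stand-in for one Redis instance; not how Redis is implemented."""

    def __init__(self, port):
        self.port = port
        self.data = {}
        self.master = None   # set while this node is a slave
        self.slaves = []     # slaves attached to this node

    def slaveof(self, master):
        # Replication is initiated from the slave side.
        self.master = master
        master.slaves.append(self)
        self.data = dict(master.data)   # initial copy of the master's data

    def slaveof_no_one(self):
        # Existing data is kept; the node just stops following the master.
        if self.master is not None:
            self.master.slaves.remove(self)
            self.master = None

    def _check_writable(self):
        if self.master is not None:
            raise ReadOnlyError("READONLY You can't write against a read only slave.")

    def set(self, key, value):
        self._check_writable()
        self.data[key] = value
        for slave in self.slaves:        # one-way flow: master to slave
            slave.data[key] = value

    def delete(self, key):
        self._check_writable()
        removed = 1 if key in self.data else 0
        self.data.pop(key, None)
        for slave in self.slaves:
            slave.data.pop(key, None)
        return removed

    def get(self, key):
        return self.data.get(key)


master, slave = Node(6379), Node(6380)
master.set("hello", "world")
slave.slaveof(master)
replicated = slave.get("hello")            # the slave now sees the master's value

try:
    slave.set("hello", "redis")
    write_rejected = False
except ReadOnlyError:
    write_rejected = True

slave.slaveof_no_one()
master.delete("hello")
kept_after_disconnect = slave.get("hello")  # the old data stays on the former slave
```

After slaveof_no_one the former slave keeps "world" even though the master has deleted the key, which matches the note in section 4.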

Principle

As the flow chart above shows, master-slave replication proceeds in six steps. Each step is discussed below.

1. Save the configuration

The slave node's server state maintains the masterhost and masterport fields, which store the IP and port of the master node (this makes sense, because they are needed to establish a connection). The slaveof command itself only saves this information and returns OK; the actual replication work starts afterward.

2. Establish a connection

The slave establishes the connection by periodically checking its master configuration. This internal logic is driven by a scheduled task that runs once per second. When the task finds a new master node, it tries to open a network connection to that master. If the socket is created successfully, the slave registers a file event handler on it dedicated to replication, which handles all subsequent work such as receiving the RDB file and receiving propagated commands. If the connection fails, the scheduled task retries indefinitely until the connection succeeds or slaveof no one is run to cancel replication.
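The retry behaviour described above can be sketched as a small loop. This is an illustration under stated assumptions, not the Redis implementation: connect_with_retry, its parameters, and the injected connect, should_stop, and sleep callables are all hypothetical names chosen so the loop can be run without a real server.

```python
import time
from typing import Callable, Optional


def connect_with_retry(connect: Callable[[], object],
                       should_stop: Callable[[], bool],
                       interval: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> Optional[object]:
    """Try connect() once per interval until it succeeds or replication is cancelled."""
    while not should_stop():          # should_stop models "slaveof no one was run"
        try:
            return connect()          # success: hand the connection to the caller
        except OSError:
            sleep(interval)           # failure: wait for the next scheduled run
    return None


# Demo with a fake connect function that fails twice, then succeeds.
attempts = {"n": 0}

def flaky_connect():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionRefusedError("master not reachable yet")
    return "socket"

result = connect_with_retry(flaky_connect, should_stop=lambda: False,
                            sleep=lambda seconds: None)   # no real waiting in the demo
cancelled = connect_with_retry(flaky_connect, should_stop=lambda: True)
```

Passing the sleep function in as a parameter keeps the demo instant; with the default time.sleep and interval=1.0 the loop would pace itself at one attempt per second, like the scheduled task in the text.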

3. Run the ping command

Once the connection is established, the slave sends a ping request as its first message. The ping checks whether the socket is usable and whether the master can currently process commands. If the slave receives no pong reply or the request times out, for example because of a network timeout or because the master is blocked and cannot respond, the slave closes the replication connection and the next scheduled task initiates a reconnection.
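As a rough illustration of this check, the sketch below sends PING over a socket and treats anything other than a timely +PONG as a failure. It is a simplification under assumptions: ping_ok is a hypothetical helper, and the demo uses a local socket pair as a fake master instead of a real Redis server.

```python
import socket


def ping_ok(sock: socket.socket, timeout: float = 1.0) -> bool:
    """Return True if the peer answers PING with +PONG within the timeout."""
    sock.settimeout(timeout)
    try:
        sock.sendall(b"PING\r\n")
        reply = sock.recv(64)
    except OSError:                   # includes timeouts and broken connections
        return False
    return reply.startswith(b"+PONG")


# Demo 1: the fake master has a PONG reply ready.
slave_side, master_side = socket.socketpair()
master_side.sendall(b"+PONG\r\n")
healthy = ping_ok(slave_side)
slave_side.close()
master_side.close()

# Demo 2: the fake master never answers, so the ping times out.
slave_side, master_side = socket.socketpair()
blocked = ping_ok(slave_side, timeout=0.05)
slave_side.close()
master_side.close()
```

In the second case the caller would close the connection and leave the reconnect to the next scheduled run, as described above.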

4. Verify permissions

If the masterauth option is set on the slave, the slave must authenticate to the master; if it is not set, no authentication is performed. The slave authenticates by sending the auth command to the master, with the value of masterauth from its configuration file as the argument. If the master's password setting is consistent with the slave's masterauth (meaning both are set to the same password, or neither is set), authentication succeeds and replication continues. If they are inconsistent, the slave closes the socket and waits for the next scheduled task to reconnect.
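The consistency rule in the last two sentences can be written down as a small predicate. This is only a sketch of the rule as stated in the text; the function name auth_consistent and its parameter names are assumptions, with None standing for "not set".

```python
import hmac
from typing import Optional


def auth_consistent(master_password: Optional[str],
                    slave_masterauth: Optional[str]) -> bool:
    """True when both sides have the same password, or neither has one."""
    if master_password is None or slave_masterauth is None:
        # Consistent only if the password is missing on both sides.
        return master_password is None and slave_masterauth is None
    # Constant-time comparison of the two configured passwords.
    return hmac.compare_digest(master_password.encode(), slave_masterauth.encode())


both_unset = auth_consistent(None, None)
both_same = auth_consistent("s3cret", "s3cret")
only_master = auth_consistent("s3cret", None)
mismatch = auth_consistent("s3cret", "other")
```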

5. Synchronize data

Once the master and slave can communicate normally, data synchronization can begin. When replication is being established for the first time, the master sends all of its data to the slave, which is the most time-consuming step. Since version 2.8, Redis uses the new psync command to synchronize data; the original sync command is still supported for compatibility between old and new versions. With psync, synchronization is divided into full synchronization and partial synchronization.

5.1 Full Synchronization

Early versions of Redis supported only full replication, which sends all of the master's data to the slave in one go. With a large data set this puts heavy load on the master, the slave, and the network.

  1. The slave issues the synchronization command internally. Because this is the first replication, the slave has neither a replication offset nor the master's run ID, so it sends psync ? -1.
  2. From psync ? -1 the master determines that this is a full copy, and replies +FULLRESYNC together with its own runId and offset.
  3. The slave receives the master's response and saves the runId and offset, along with basic information about the master.
  4. After receiving the full replication command, the master runs bgsave (asynchronously) to generate an RDB file (a snapshot) in the background, and uses a buffer (called the replication buffer) to record every write command executed from that point on (similar to the two buffers in the AOF rewrite described in the previous article).
  5. The master sends the RDB file to the slave.
  6. While the slave is receiving the RDB snapshot, the master still serves read and write commands, so it stores the write commands from this period in the replication client buffer. Once the slave has finished loading the RDB file, the master sends the buffered data to the slave to keep the two consistent.
  7. After receiving all the data from the master, the slave clears its own old data.
  8. After clearing its data, the slave starts loading the RDB file. For a large RDB file this step is still fairly time-consuming.
  9. After the slave has loaded the RDB file successfully, if AOF persistence is enabled on it, it immediately runs bgrewriteaof so that the AOF file is usable right after the full copy.
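The data movement in the steps above can be simulated with plain dictionaries. This is a sketch for intuition only: full_sync is a hypothetical function, a dict copy stands in for the RDB snapshot, and a list of (key, value) pairs stands in for the buffered write commands.

```python
def full_sync(master_data: dict, writes_during_transfer: list, slave_data: dict) -> dict:
    """Simulate a full synchronization and return the slave's resulting data."""
    # Step 4: bgsave takes a snapshot; writes from now on also go to a buffer.
    snapshot = dict(master_data)
    replication_buffer = []
    for key, value in writes_during_transfer:
        master_data[key] = value                 # the master keeps serving writes
        replication_buffer.append((key, value))

    # Steps 5, 7, 8: the slave receives the snapshot, clears old data, loads it.
    slave_data.clear()
    slave_data.update(snapshot)

    # Step 6: the buffered writes are replayed so the slave catches up.
    for key, value in replication_buffer:
        slave_data[key] = value
    return slave_data


master = {"hello": "world"}
slave = {"stale": "x"}
full_sync(master, [("k1", "v1"), ("hello", "redis")], slave)
print(slave == master)
```

Without the buffer, the slave would end up with only the snapshot contents and miss the two writes made during the transfer.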

5.2 Partial Synchronization

An attentive reader will notice that the I/O cost of full replication is huge. Partial replication is Redis's optimization for that cost, implemented with the psync {runId} {offset} command. When a slave is replicating from a master and something goes wrong, such as a brief network disconnection or lost commands, the slave asks the master to resend the missing command data. If the master's replication backlog buffer still contains that data, the master sends it straight to the slave, which keeps the two consistent. The resent data is usually far smaller than the full data set, so the cost is low.

  1. When the network between master and slave becomes unstable and the disruption lasts longer than repl-timeout, the master considers the slave faulty and breaks the replication connection.
  2. While the connection is down, the master keeps serving commands but cannot send them to the slave. The master's replication backlog buffer, however, still holds the most recent write commands; by default it holds up to 1MB.
  3. The slave keeps retrying, and once the network recovers it reconnects to the master.
  4. The slave sends the runId and offset it saved earlier to the master and runs the psync command to synchronize.
  5. On receiving psync, the master first checks whether the runId parameter matches its own; if it does, the slave was previously replicating from this master. It then looks up the offset parameter in its replication backlog buffer. If the data after that offset is still in the buffer, it replies +CONTINUE to the slave, indicating that partial replication can proceed.
  6. The master sends the data in the replication backlog buffer, starting from the offset, to the slave, which brings master-slave replication back to its normal state.
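The decision in steps 4 to 6 can be sketched with a small bounded buffer. This is a simplified model, not Redis's data structure: ReplicationBacklog and its methods are invented for this example, the offset here means "bytes the slave has already received", and the return value is a plain tuple rather than a real protocol reply.

```python
class ReplicationBacklog:
    """Toy fixed-size backlog that keeps only the most recent bytes (hypothetical)."""

    def __init__(self, run_id: str, size: int):
        self.run_id = run_id
        self.size = size
        self.buf = bytearray()
        self.master_offset = 0            # total bytes ever written by the master

    def feed(self, data: bytes) -> None:
        self.buf += data
        self.master_offset += len(data)
        if len(self.buf) > self.size:     # drop the oldest bytes when full
            del self.buf[:len(self.buf) - self.size]

    def psync(self, run_id: str, offset: int):
        first_available = self.master_offset - len(self.buf)
        if (run_id != self.run_id
                or offset < first_available
                or offset > self.master_offset):
            return ("FULLRESYNC", b"")    # the missing data is gone: full copy needed
        return ("CONTINUE", bytes(self.buf[offset - first_available:]))


backlog = ReplicationBacklog(run_id="abc", size=10)
backlog.feed(b"SET a 1;")                 # the slave received these 8 bytes
backlog.feed(b"SET b 2;")                 # written while the slave was disconnected

partial = backlog.psync("abc", 8)         # slave missed only the second command
too_old = backlog.psync("abc", 2)         # bytes 2..5 were already dropped
wrong_master = backlog.psync("other", 8)  # run ID does not match
```

The too_old case shows why the backlog size matters: a slave that stays disconnected long enough for its offset to fall out of the buffer has to go through full replication again.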

6. Keep listening

This is the continuous command replication phase. Once the master has synchronized its current data to the slave, the initial replication process is complete. From then on, the master keeps sending its write commands to the slave to keep the data on both sides consistent.

Points to note

  1. Besides addressing high availability in complex environments, as mentioned above, master-slave replication is also used for read/write separation.
  2. If you use a master-slave architecture, there is one problem you must consider: inconsistent data between master and slave. There is also the problem of inconsistent configuration between the master and slave servers.
  3. Avoid full replication wherever possible when running Redis master-slave replication. As mentioned earlier, full replication is a very resource-intensive operation.

Conclusion

Some of the points to note, read/write separation in particular, are well worth understanding in depth; I will see later whether I can find the time to write an article on them. That is about it for the first article in the high availability series. The overall process and principles should be clear by now, and working through the flow chart carefully should make the whole process easy to follow.

That's all for this issue. Feel free to leave your thoughts in the comments section. Follows and likes are appreciated.
