An overview of the four Redis modes

As an efficient caching middleware, Redis comes up constantly in daily development. Today we will talk about its four deployment modes: standalone, master-slave replication, sentinel, and cluster.

For most companies, the standalone version can basically solve the problem: the official Redis site quotes roughly 100,000 QPS for a single instance, which is more than enough for an ordinary company. If it is not, master-slave mode comes next, which adds read/write splitting and greatly improves performance.

However, as aspiring programmers we should not stop at standalone and master-slave CRUD; at the very least we need to understand how sentinel and cluster mode work, so that we can hold our own against the interviewer.

I have written quite a few Redis articles before, covering the basic data types and their underlying implementations, transactions, persistence, distributed locks, and publish/subscribe. Taken together they form a fairly complete tutorial, and I will compile them into a PDF to share with you.

Here is an outline of Redis. Some parts may be missing; if so, please point them out in the comments and I will add them later.

Standalone mode

Jedis is the Redis client library recommended on the official website. In Spring Boot you can pull it in with the following dependency:

<dependency>
	  <groupId>redis.clients</groupId>
	  <artifactId>jedis</artifactId>
	  <version>${jedis.version}</version>
</dependency>

advantages

The standalone version has plenty of advantages: it is simple to implement, easy to maintain and deploy, and carries very low maintenance cost with no extra expense.

disadvantages

However, because it is a single instance, the standalone version has many problems, the most obvious being the single point of failure: if that one Redis goes down, all requests hit the DB directly.

A single Redis can also only withstand so much concurrency, and it has to handle both reads and writes; once traffic ramps up, it gets overwhelmed. Storage is limited too, and restarting a Redis holding a large data set is very slow, so the constraints are considerable.

Environment setup

There are plenty of thorough standalone setup tutorials online; it is mostly point-and-click work, especially locally, and with yum a few commands get it done. Here is one setup tutorial I recommend: www.cnblogs.com/zuidongfeng/p/8032505.html.

The tutorial above is very detailed. Setting up environments is strictly ops work, but it is still worth doing yourself as a programmer, and this kind of thing is largely once and for all: build it once, and you may not do it again until your next computer or virtual machine.

The redis.conf configuration is as follows:

daemonize yes                    # run in the background; in daemon mode Redis writes its pid to /var/run/redis.pid by default
port 6379                        # default port
bind 127.0.0.1                   # 0.0.0.0 allows access from any host; 127.0.0.1 allows local access only
timeout 900                      # close a client connection after it has been idle this many seconds; 0 disables the feature
logfile "./redis7001.log"        # log file
databases 1                      # number of databases
# RDB snapshot conditions -- any one of them triggers a save:
save 900 1                       # at least 1 change within 900 seconds (15 minutes)
save 300 10                      # at least 10 changes within 300 seconds (5 minutes)
save 60 10000                    # at least 10000 changes within 60 seconds
stop-writes-on-bgsave-error yes  # if RDB snapshots are enabled (at least one save rule) and the latest
                                 # background save failed, stop accepting writes, so users know that
                                 # data is not being persisted to disk correctly
rdbcompression yes               # compress RDB files
rdbchecksum yes                  # since RDB version 5, a CRC64 checksum is placed at the end of the
                                 # file, which makes the format more resistant to corruption
dbfilename dump.rdb              # name of the local database file
dir ./                           # directory where the local database is stored
tcp-keepalive 60                 # set SO_KEEPALIVE and send ACKs to idle clients
masterauth testMaster123         # password of the master (used by replicas)
# When a slave loses its connection to the master, or replication is still in progress:
# 1) slave-serve-stale-data yes (the default): the slave keeps answering client requests,
#    possibly with normal data, stale data, or empty values not yet received
# 2) slave-serve-stale-data no: the slave replies "SYNC with master in progress" to every
#    request except INFO and SLAVEOF
slave-serve-stale-data yes
slave-read-only yes              # make slaves read-only
# yes: Redis uses fewer TCP packets and less bandwidth to send data to slaves, but this
# adds a delay (up to 40 ms with the default Linux kernel configuration); no: lower
# replication latency but more bandwidth
repl-disable-tcp-nodelay no
requirepass testMaster123        # password of this instance
maxmemory 3gb                    # memory limit; once reached, Redis evicts keys according to maxmemory-policy
# maxmemory-policy options:
#   volatile-lru    -> evict keys with an expire set, using an LRU algorithm
#   allkeys-lru     -> evict any key, using an LRU algorithm
#   volatile-random -> evict random keys with an expire set
#   allkeys-random  -> evict random keys
#   volatile-ttl    -> evict keys with an expire set, shortest time-to-live first
#   noeviction      -> never evict; return an error on write operations
maxmemory-policy volatile-lru
appendonly no                    # enable AOF
appendfilename "appendonly.aof"  # AOF file name
# fsync() tells the operating system to actually write data to disk instead of waiting
# for more data in the output buffer. Some operating systems really flush to disk right
# away; others just try to do so as soon as possible. Redis supports three modes:
#   no:       never fsync explicitly, let the OS decide when. Fastest.
#   always:   fsync after every write. Slow, but safest.
#   everysec: fsync once per second. A compromise.
appendfsync everysec
# If the fsync policy is "always" or "everysec" and a background save (BGSAVE or an AOF
# rewrite) causes a lot of disk I/O, some Linux configurations make Redis block for a
# long time on the fsync() call. There is no perfect fix, and even fsync() from another
# thread can block our synchronous write(2) calls. This option prevents the main process
# from calling fsync() while a BGSAVE or BGREWRITEAOF is in progress, which means that in
# the worst case up to 30 seconds of log data (default Linux settings) can be lost. Set
# it to "yes" if you have latency problems; "no" is the safest for durability.
no-appendfsync-on-rewrite yes
# automatic AOF rewrite
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
# The AOF file may be truncated at the end (this happens when the OS dies, especially
# with ext4 mounted without the data=ordered option; Redis dying on its own does not
# cause it). With "yes" (the default), Redis logs the problem and loads as much data as
# possible; with "no", the server refuses to start and the file must be repaired with
# redis-check-aof. The option only matters when the server wants to read more data but
# cannot find it.
aof-load-truncated yes
slowlog-log-slower-than 10000    # slow-query threshold, in microseconds
slowlog-max-len 128              # slow log length; it only costs memory, reclaim it with SLOWLOG RESET
# client output buffer limits
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
aof-rewrite-incremental-fsync yes  # while a child process rewrites the AOF file, fsync every 32mb generated

Because the standalone version cannot cope with high concurrency while also demanding high performance and reliability, the master-slave mode was introduced.

Master-slave mode

Principle

The principle of master and slave is relatively simple. The master database can read and write, and the slave database can only read.

However, the master/slave mode generally implements read/write separation, and the master database only writes, which reduces the pressure on the master database. The following figure illustrates the principle of the master/slave mode:

If the master-slave principle is that simple, what about the process it performs? Here’s another picture:

When the master-slave mode is enabled, its specific working mechanism is as follows:

  1. When the slave starts, it sends a SYNC command to the master. After receiving the command, the master saves a snapshot with bgsave (RDB persistence), and buffers the write commands executed in the meantime.
  2. The master then sends the saved snapshot file to the slave, while continuing to buffer write commands.
  3. After receiving the snapshot, the slave loads it into its own database.
  4. Finally, the master sends the buffered commands to the slave, and the slave executes them on receipt, bringing master and slave data into line.
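The four full-sync steps above can be sketched as a toy simulation. This is my own illustration, not Redis source code; the class and method names are hypothetical:

```python
class Master:
    def __init__(self):
        self.data = {}
        self.repl_buffer = []   # commands executed while a sync is in flight
        self.syncing = False

    def set(self, key, value):
        self.data[key] = value
        if self.syncing:
            self.repl_buffer.append(("set", key, value))

    def sync(self, slave):
        self.syncing = True
        snapshot = dict(self.data)       # step 1: bgsave-style snapshot
        self.set("during", "sync")       # a write arriving mid-sync is buffered
        slave.load_snapshot(snapshot)    # steps 2-3: send the snapshot, slave loads it
        for cmd, key, value in self.repl_buffer:   # step 4: replay buffered commands
            slave.data[key] = value
        self.repl_buffer.clear()
        self.syncing = False


class Slave:
    def __init__(self):
        self.data = {}

    def load_snapshot(self, snapshot):
        self.data = dict(snapshot)


master, slave = Master(), Slave()
master.set("name", "LDC")
master.sync(slave)
print(master.data == slave.data)  # True: even the mid-sync write reached the slave
```

The point of the buffer is step 4: without it, any write that lands between the snapshot and its delivery would be lost on the slave.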

advantages

The reason master-slave is used is that it solves, to a degree, the request latency and Redis downtime that heavy concurrency causes for a single instance.

The secondary database shares the read pressure of the primary database. If the primary database is in write-only mode, then the read and write separation is implemented, so that the read pressure of the primary database is eliminated.

It also addresses the standalone version's single point of failure: if the master goes down, a slave can take over at any time. In short, master-slave mode improves the system's availability and performance to a certain extent, and it is the foundation on which sentinel and cluster are built.

Master/slave synchronization synchronizes in an asynchronous manner, during which Redis can still respond to queries and updates submitted by the client.

disadvantages

Master-slave mode is good, but it has its own shortcomings, such as data consistency: after a write completes on the master, the data still has to be copied to the slaves, and if a read request arrives before that copy finishes, it sees stale data.

If a network fault occurs during the synchronization process, the synchronization fails and data consistency problems occur.

Master-slave mode also has no automatic failover or recovery: once the master goes down, promoting a slave to master requires human intervention, which raises maintenance costs; and the master node's write and storage capacity remain limited.

Environment setup

Let’s set up master-slave mode. It is fairly simple: I have a CentOS 7 virtual machine here, and I will run multiple Redis instances on it to build the master-slave setup.

To enable multiple instances in Redis, create a folder to store the configuration files of the Redis cluster:

mkdir redis

Then copy the redis.conf configuration file for each instance:

cp /root/redis-4.0.6/redis.conf /root/redis/redis-6379.conf
cp /root/redis-4.0.6/redis.conf /root/redis/redis-6380.conf
cp /root/redis-4.0.6/redis.conf /root/redis/redis-6381.conf

Three configuration files are copied: one master and two slaves. Port 6379 serves as the master database, and ports 6380 and 6381 serve as the slave databases.

First, configure the main database configuration file: vi redis-6379.conf:

bind 0.0.0.0                      # 0.0.0.0 means any IP address may connect
protected-mode no                 # disable protected mode; rely on password access
port 6379                         # port; set 6380 and 6381 for the two slaves
daemonize yes                     # run in the background
pidfile /var/run/redis_6379.pid   # pid file, one per instance
logfile /root/redis/log/6379.log  # log file, one per instance
save 900 1
save 300 10
save 60 10000
rdbcompression yes
dbfilename dump.rdb               # RDB file name
dir /root/redis/datas             # RDB file directory
appendonly yes                    # use AOF incremental persistence
appendfsync everysec              # always / everysec / no; everysec is the recommended setting
requirepass 123456                # password

Next, modify the slave configuration files. In each slave's configuration file, add the following:

slaveof 127.0.0.1 6379    # master IP and port
masterauth 123456         # master password
slave-serve-stale-data no

To start the three Redis instances, cd into the Redis src directory and run:

./redis-server /root/redis/redis-6379.conf
./redis-server /root/redis/redis-6380.conf
./redis-server /root/redis/redis-6381.conf

Check the started Redis processes with the command ps -aux | grep redis. As the figure above shows, all three started successfully, so the testing phase can begin.

test

I use SecureCRT as the client, starting three sessions to connect to the three Redis instances at the same time, specifying the port and password:

./redis-cli -p 6379 -a 123456

Enter set name 'LDC' on the master (6379), then run get name on the slaves (6380 and 6381): both return 'LDC', so replication is working. Two pitfalls are worth mentioning.

The first is not setting bind in redis.conf, which filters out non-local IP addresses; 0.0.0.0 is generally used.

The other is not configuring the password (requirepass 123456), which causes IO connection exceptions. This is a pit I hit myself; it worked once the password was configured.

There are also two warning messages in the Redis startup log. They do not affect master-slave synchronization; annoyingly, some people run into them and some do not.

But I have a touch of OCD about these things. The fixes are easy to find online, so I will not cover them here; solve them yourself. I only mention it because some people never look at the log at all and assume a successful start means everything is fine. That is a bad habit.

Sentinel mode

Principle

Sentinel mode is an upgrade of master-slave mode, because when the master fails, master-slave does not recover automatically and needs human intervention, which is painful.

Built on top of master-slave, sentinel mode monitors the health of the master and slaves, like a sentry standing guard: whenever something abnormal happens, it raises the alarm and handles the exception.

So, in summary, sentinel mode has the following capabilities:

  1. Monitoring: Monitors whether the master and slave are running properly, and the sentinels also monitor each other
  2. Automatic fault recovery: When the master fails, a slave will be automatically elected as the master.

Sentinel monitoring is configured with the directive sentinel monitor <master-name> <ip> <port> <quorum>, for example:

# mymaster gives the master database a name, followed by the master IP and port; the
# trailing 1 means at least one sentinel must agree before the master is considered failed
sentinel monitor mymaster 127.0.0.1 6379 1

Node communication

Other configuration options will come up when we build the environment. When a sentinel starts, it connects to the master and subscribes to the master's __sentinel__:hello channel.

This channel is used to get information about other sentries monitoring the master. In addition, a connection is set up to periodically send the INFO command to the master to obtain the master information.

When the sentinel is connected to the master, it sends the INFO command to the master and slave at regular intervals (every 10 seconds). If the master is marked as offline, the frequency changes to once a second.

The sentinel also periodically publishes its own information to __sentinel__:hello so that other sentinels can subscribe to it: its IP and port, its run ID, its configuration version, and the master's name, IP, port, and configuration version.

Periodically send the PING command to the master, slave, and other sentinels (once per second) to check whether the object is alive. If the sentinels receive the PING command, they reply to the PONG command if no fault occurs.

Therefore, the sentinel establishes these two connections and communicates with the sentinel and the sentinel with the master by sending the INFO and PING commands periodically.

There are a few concepts to understand first, namely INFO, PING, and PONG, followed by MEET and FAIL, plus subjective and objective offline:

  1. INFO: Using this command, you can obtain the latest information of the primary and secondary databases and discover new nodes
  2. PING: This command is used most frequently. This command encapsulates the status data of the node and other nodes.
  3. PONG: When a node receives MEET and PING, it replies with the PONG command and sends its status to the other node.
  4. MEET: When a new node joins the cluster, it sends this command to the old nodes to introduce itself as a newcomer
  5. FAIL: When a node goes offline, the message is broadcast to the cluster.

Online and offline

If the master fails to reply to PING within the time set by sentinel down-after-milliseconds <master-name> <milliseconds>, the sentinel that sent the PING considers the master Subjectively Down (SDOWN).

Since the problem may be the network between that sentinel and the master rather than the master itself, the sentinel also asks the other sentinels whether they believe the master is down. If the number that agree reaches a certain threshold (the quorum field configured earlier), the node is considered Objectively Down (ODOWN).

If not enough sentinels agree that the master is down, the objective-offline mark is removed; and if the master replies to a sentinel's PING again, the subjective-offline mark is removed as well.
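The SDOWN/ODOWN decision can be sketched in a few lines. This is a hypothetical helper of my own, not Sentinel's implementation; the parameter names are assumptions:

```python
def master_state(last_pong_ms, now_ms, down_after_ms, peer_sdown_votes, quorum):
    """Return (sdown, odown) for a master as seen by one sentinel.

    sdown: this sentinel has not heard a PONG within down-after-milliseconds.
    odown: this sentinel's vote plus the peers' votes reach the quorum.
    """
    sdown = (now_ms - last_pong_ms) > down_after_ms
    odown = sdown and (1 + sum(peer_sdown_votes)) >= quorum
    return sdown, odown


# One peer agrees, quorum is 2: subjectively and objectively down.
print(master_state(0, 40001, 30000, [True], 2))   # (True, True)
# No peer agrees: only subjectively down.
print(master_state(0, 40001, 30000, [], 2))       # (True, False)
```

Note the asymmetry the text describes: SDOWN is a local timeout judgment, while ODOWN additionally needs agreement from enough peers.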

Election algorithm

When the master is judged objectively down, how is the fault recovered? The sentinels first elect a leader among themselves to carry out the failover, using the Raft algorithm:

  1. The sentinel that discovered the master was offline (call it sentinel A) sends a request to the other sentinels asking them to vote for it as leader.
  2. A target sentinel votes for A if it has not already voted for another sentinel in this round.
  3. If more than half of the sentinels vote for A (the majority rule), A becomes the leader.
  4. If several sentinels campaign at once and no one wins a majority, each waits a random time and then launches a new round of voting, until a leader is elected.

Once the leader is chosen, it performs the failover and selects one slave to become the new master. The selection rules are:

  1. Of all the slaves, the one with the highest slave-priority is chosen (in Redis a smaller slave-priority value means higher priority, and 0 excludes a slave from promotion).
  2. If priorities are equal, the slave with the largest replication offset is chosen, because the offset records how much data has been replicated: the larger the offset, the more complete the data.
  3. If those are equal too, the slave with the smallest run ID is chosen.
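The three selection rules collapse into a single sort. This is my own sketch, not Redis code; the field names in the dicts are hypothetical:

```python
def pick_new_master(slaves):
    """slaves: list of dicts with 'priority', 'offset', 'run_id'.

    Rule 1: smaller slave-priority value wins (0 means never promote).
    Rule 2: tie broken by the largest replication offset.
    Rule 3: tie broken by the lexicographically smallest run ID.
    """
    candidates = [s for s in slaves if s["priority"] > 0]
    return min(candidates, key=lambda s: (s["priority"], -s["offset"], s["run_id"]))


slaves = [
    {"priority": 100, "offset": 200, "run_id": "b"},
    {"priority": 100, "offset": 300, "run_id": "c"},
    {"priority": 0,   "offset": 999, "run_id": "a"},   # priority 0: never promoted
]
print(pick_new_master(slaves)["run_id"])  # c (same priority, larger offset wins)
```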

The selected slave is promoted to master, and the other slaves replicate from the new master. If the downed master comes back online, it runs as a slave.

advantages

Sentinel mode is the upgrade version of master-slave mode, so it improves the availability, performance and stability of the system at the system level. When the Master breaks down, the system automatically recovers the fault without human intervention.

The sentinels can carry out timely monitoring and heartbeat detection between the sentinels and the master, and discover the problems of the system in time, which makes up for the shortcomings of the master and slave.

disadvantages

A one-master-many-slaves sentinel setup still has a write bottleneck, and if the master goes down, failure recovery takes time, during which the write service is affected.

Adding sentinels also increases the complexity of the system, since the sentinel deployment itself now needs maintaining.

Environment setup

Finally, let’s set up the sentinel mode. It is easy to configure the sentinel mode. On the basis of the master/slave mode configured above, create a folder for the three sentinel configuration files:

mkdir /root/redis/sentinel
cp /root/redis-4.0.6/sentinel.conf /root/redis/sentinel/sentinel1.conf
cp /root/redis-4.0.6/sentinel.conf /root/redis/sentinel/sentinel2.conf
cp /root/redis-4.0.6/sentinel.conf /root/redis/sentinel/sentinel3.conf

Add the following configuration to each of the three files:

daemonize yes                                # run in the background
port 26379                                   # use 36379 and 46379 for the other two sentinels
sentinel monitor mymaster 127.0.0.1 6379 2   # monitor the master at this IP and port; quorum of 2
sentinel auth-pass mymaster 123456           # master password
sentinel parallel-syncs mymaster 2           # at most two slaves resync with the new master at a time after failover
sentinel failover-timeout mymaster 100000    # the failover fails if it is not finished within 100 s

After the configuration, start the three sentries respectively:

./redis-server sentinel1.conf --sentinel
./redis-server sentinel2.conf --sentinel
./redis-server sentinel3.conf --sentinel

Then check again with ps -aux | grep redis: you can see that the three Redis instances and the three sentinels have all started properly. Now log in to 6379 and inspect the master information with INFO replication:


As you can see, 6380 currently holds the master role, and it is readable and writable rather than read-only, which means our sentinels are working. Build it yourself if you are interested; you may well step into a few pits along the way.

Cluster mode

Finally, Cluster, the real cluster mode. Sentinel solved master-slave's inability to recover from failure automatically, but it still has other problems: capacity is hard to scale, a single node's storage and write capability are limited, and before cluster mode every Redis held the full data set, so every instance was redundant and memory was heavily wasted.

The cluster mode realizes distributed storage of Redis data and data sharding. Each Redis node stores different contents and solves the problems of online node shrinkage (offline) and expansion (online).

The cluster pattern really makes the system highly available and high performance, but the cluster also further makes the system more and more complex. Let’s look at how clusters work in detail.

Data Partitioning Principles

The Redis cluster is divided into 16,384 slots (0-16,383) by the virtual slot partitioning algorithm.

For example, as shown in the following figure, three masters split the 0-16383 slot range into three parts, (0-5000), (5001-11000), and (11001-16383), assigning one range to each cache node.

When a client request arrives, the key's slot is first computed as its CRC16 checksum modulo 16384, that is CRC16(key) % 16384, and the data is then read from or written to the node responsible for that slot. This is how data access and updates are routed.

The reason for slot-partitioning storage is to slice a whole heap of data to prevent the performance of a single redis from being affected by too much data.

Node communication

How do the nodes communicate with each other? The commands involved are much the same as the sentinel ones.

First, the newly online node sends a Meet message to the old member through the Gossip protocol, indicating that it is a new member.

After receiving the Meet message, a healthy old member replies with a PONG message, welcoming the new node. After that initial Meet, members send periodic PING messages to keep communicating with each other.

During the communication, a TCP channel is opened for each node to communicate with each other, and then a scheduled task is performed to continuously send PING messages to other nodes. The purpose of this task is to know the metadata storage and health status between nodes, so that problems can be discovered.

Data request

Redis maintains an unsigned char myslots[CLUSTER_SLOTS/8] array that holds slots for each node.

Because it is a binary array, it only stores 0 and 1 values, as shown below:

In this way the array only records whether this node stores the data for each slot: 1 means the slot's data is present, 0 means it is not. Lookups are therefore very fast, much like a bitmap or, in spirit, a Bloom filter.

For example, if cluster node 1 is responsible for slots 0 to 5000 but only slots 0, 1, and 2 currently hold data, only bits 0, 1, and 2 are set to 1.

In addition, each Redis layer also maintains a clusterNode array with the size of 16384, which is used to store the IP address, port and other information of the node responsible for the corresponding slot. In this way, each node maintains the metadata information of other nodes, so as to facilitate timely finding the corresponding node.
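The slot bitmap can be sketched like this. SlotBitmap is a hypothetical Python stand-in for the unsigned char myslots[CLUSTER_SLOTS/8] array:

```python
CLUSTER_SLOTS = 16384


class SlotBitmap:
    """16384 slots packed into 16384/8 = 2048 bytes, one bit per slot."""

    def __init__(self):
        self.bits = bytearray(CLUSTER_SLOTS // 8)

    def set(self, slot):
        self.bits[slot // 8] |= 1 << (slot % 8)

    def has(self, slot):
        return bool(self.bits[slot // 8] & (1 << (slot % 8)))


node1 = SlotBitmap()
for slot in (0, 1, 2):   # as in the example: only slots 0, 1, 2 hold data
    node1.set(slot)
print(node1.has(2), node1.has(3))  # True False
```

Packing one bit per slot is what keeps the per-node bookkeeping tiny: 2 KB covers the entire slot space.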

When a new node joins or shrinks, you can use the PING command to update the metadata information in the clusterNode array in time. In this way, you can find the corresponding node in time when the request comes.

There are two further cases where a request arrives after data has been migrated, for example when a new node joins and data moves from an old cache node to the new one.

If a request reaches the old node for a key whose slot has finished migrating to the new node, the old node, which knows the new node's IP and port from its clusterNode array, replies to the client with a MOVED redirect: the data now lives on the new node, and the client must retry against that node's IP and port to get it.

And if the migration is still in progress? The old node sends the client an ASK redirect, returning the IP and port of the target node the slot is migrating to.
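Client-side handling of these redirects can be sketched as a small parser. This is my own illustration of the reply format, not redis-py internals; parse_redirect is a hypothetical helper:

```python
def parse_redirect(error_line: str):
    """Parse '-MOVED 3999 127.0.0.1:6381' or '-ASK 3999 127.0.0.1:6381'.

    Returns (kind, slot, host, port), or None if the error is not a redirect.
    """
    parts = error_line.lstrip("-").split()
    if len(parts) != 3 or parts[0] not in ("MOVED", "ASK"):
        return None
    host, _, port = parts[2].rpartition(":")
    return parts[0], int(parts[1]), host, int(port)


print(parse_redirect("-MOVED 3999 127.0.0.1:6381"))
# -> ('MOVED', 3999, '127.0.0.1', 6381)
```

The difference matters on the client side: a MOVED reply means the slot mapping has changed permanently, so the client updates its slot table; an ASK reply is one-shot, and the client resends just that command to the target node, prefixed with ASKING.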

Expansion and contraction

Scaling out and in refers to bringing nodes online and offline. When a node fails, the automatic fault recovery effectively shrinks the cluster (the node goes offline).

When a node is shrunk or expanded, the system recalculates the slot range of each node and updates the corresponding data to the corresponding node based on the virtual slot algorithm.

As we said before, when a node is added, it sends a Meet message first, so you can see more about that, it’s basically the same pattern.

As for failures: the leader election, the re-election of the master node, and the promotion of a slave to master follow the sentinel election process described earlier.

advantages

Cluster mode is a decentralized architecture: data is sharded and divided among the slots, and each node stores different content.

Cluster mode also adds the ability to scale horizontally and vertically, with nodes able to join and leave; and as the upgraded version of sentinel mode, it retains all of sentinel's advantages.

disadvantages

The biggest problem with caching is data consistency. When balancing data consistency between performance and business requirements, most of the solutions are final consistency, not strong consistency.

In addition, cluster mode sharply increases the node count: a cluster needs at least six machines, because the majority rule for elections has to be satisfiable, which also adds architectural complexity.

The slave acts as a cold standby and cannot relieve the read pressure on the master.

Field building

To deploy in cluster mode, add the following configuration information to the redis.conf file:

daemonize yes                         # run in the background
port 6379                             # use ports 6379, 6380, 6381, 6382, 6383, 6384 for the six instances
pidfile /var/run/redis_6379.pid       # pid file, one per instance
cluster-enabled yes                   # enable cluster mode
masterauth 123456                     # required if a password is set
cluster-config-file nodes_6379.conf   # cluster configuration file, one per node
cluster-node-timeout 10000            # node request timeout, in milliseconds

Start all six instances, then join them into a cluster with the following command:

./redis-cli --cluster create --cluster-replicas 1 127.0.0.1:6379 127.0.0.1:6380 127.0.0.1:6381 127.0.0.1:6382 127.0.0.1:6383 127.0.0.1:6384 -a 123456

With that, the cluster is built and this piece is done. At around 17,000 characters in total, original writing is not easy.