Do you know how hard Redis has worked on the road to high availability?

Original: Curly Braces MC (WeChat official account: Huakuohao-MC). Focused on Java programming fundamentals and big data, with an emphasis on sharing experience and personal growth.

Introducing myself

I am Redis, an in-memory database. I offer more features than memcached, and I am now the leader in this field.

Regular databases

The regular databases here are those that read from and write to disk, such as Oracle, MySQL, and MongoDB. Because the data lives on disk, these databases can effectively guarantee the high availability of data. In this article, high availability means that data is not lost after the operating system or the database crashes (strictly speaking, this property is durability), which is also the most basic requirement for a database.

In-memory database

A disk-based database can guarantee that data survives a crash, but its reads and writes are slow, which is an inherent property of disk I/O. Switching to SSDs improves performance significantly, but it also raises the cost, and SSDs have a limited write lifespan. To solve this problem, Redis took a different approach: it reads and writes in memory. All data is stored in memory, which greatly improves read and write speed.

memcached

No discussion of in-memory databases is complete without memcached. Memcached predates Redis and also stores its data in memory. A dozen years ago, memcached was the common caching solution. It supports key-value storage, but only with string values. It does not support more complex data structures, and it has no built-in clustering. Data is lost whenever the operating system or memcached restarts, which is the biggest disadvantage of memory-based data storage.

Redis

Redis inherits memcached's strengths and improves on many of its weaknesses. Like memcached, Redis operates on data in memory, but it supports more data types, such as List and Set. Most importantly, and the focus of this article, Redis supports high availability of data: data can survive a restart of Redis or of the operating system.

Redis's journey toward high availability resembles that of a determined young person who keeps working hard and keeps improving.

Single-node persistence

Data stored in memory is notoriously volatile. To prevent data loss when the system or Redis restarts, Redis provides two data persistence schemes.

1. Snapshot backup (RDB)

Redis periodically backs up the data in the database to disk. When the database restarts, it restores the data from the most recent snapshot file on disk. In this way, Redis keeps reads and writes fast while also protecting the data against loss.

2. AOF (append-only file)

Snapshot backup has one disadvantage: some data may be lost. For example, if the system fails before a new snapshot file is generated, every write made after the last snapshot is lost. Redis introduced AOF to solve this problem. With AOF, each command that writes data is appended to a file. After a system failure, Redis replays the commands in this file to rebuild the data. How much can still be lost depends on how often the file is flushed to disk, but it is far less than with snapshots alone.
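The difference between the two schemes can be sketched with a toy key-value store. This is only an illustration, not how Redis is implemented: `ToyStore`, its `snapshot` and `aof` fields, and the `recover_*` methods are hypothetical names, and the "files" are plain Python objects rather than data on disk.

```python
import copy


class ToyStore:
    """Toy key-value store comparing snapshot and AOF recovery (hypothetical)."""

    def __init__(self):
        self.data = {}      # live in-memory data
        self.snapshot = {}  # stands in for the snapshot file on disk
        self.aof = []       # stands in for the append-only file on disk

    def set(self, key, value):
        # Log the write command, then apply it in memory.
        self.aof.append(("SET", key, value))
        self.data[key] = value

    def save_snapshot(self):
        # Periodic backup: copy the whole dataset at this point in time.
        self.snapshot = copy.deepcopy(self.data)

    def recover_from_snapshot(self):
        # Only writes made before the last snapshot come back.
        return copy.deepcopy(self.snapshot)

    def recover_from_aof(self):
        # Replay every logged write command in order.
        restored = {}
        for command, key, value in self.aof:
            if command == "SET":
                restored[key] = value
        return restored


store = ToyStore()
store.set("a", 1)
store.save_snapshot()
store.set("b", 2)  # written after the last snapshot, then the process "crashes"

from_snapshot = store.recover_from_snapshot()
from_aof = store.recover_from_aof()
print(from_snapshot)  # the write to "b" is missing
print(from_aof)       # both writes are restored
```

The write to "b" happens after the last snapshot, so only the replayed command log brings it back.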

However, if there are many write operations, the AOF file grows too large. To solve this problem, Redis can automatically rewrite the AOF: it compacts the file by dropping redundant commands and keeping only what is needed to rebuild the current data, which keeps the file size under control.
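The idea behind shrinking the file can be sketched as follows. This is a simplified illustration under assumed names (`replay`, `rewrite_aof`, and the tuple-based command log are hypothetical); the real rewrite in Redis handles many more command types and data structures.

```python
def replay(log):
    """Rebuild the dataset by applying each logged command in order."""
    data = {}
    for command, key, *args in log:
        if command == "SET":
            data[key] = args[0]
        elif command == "INCR":
            data[key] = data.get(key, 0) + 1
        elif command == "DEL":
            data.pop(key, None)
    return data


def rewrite_aof(log):
    """Return a shorter log that rebuilds the same final dataset."""
    final_state = replay(log)
    # One SET per surviving key replaces the full history of that key.
    return [("SET", key, value) for key, value in final_state.items()]


log = [
    ("SET", "counter", 0),
    ("INCR", "counter"),
    ("INCR", "counter"),
    ("SET", "tmp", "x"),
    ("DEL", "tmp"),
    ("SET", "name", "redis"),
]
compact = rewrite_aof(log)
print(len(log), "->", len(compact))
```

Replaying `compact` yields the same data as replaying `log`, but the deleted key and the intermediate counter values no longer take up space.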

Master-slave replication

The two persistence schemes above are almost always sufficient for single-node Redis. But our systems keep growing, and their demands keep rising. Sometimes a single Redis node cannot handle the system's traffic. For this case, Redis provides master-slave mode.

Master-slave mode consists of one master node, which handles reads and writes, and one or more slave nodes, which synchronize data from the master so that master and slaves hold consistent data. Note: a slave node does not accept write operations, but it does serve reads. When any one of these nodes fails, the data still exists on the others, so it is not lost. Moreover, the read load can be spread over multiple nodes, which supports a larger volume of traffic.
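A minimal sketch of this arrangement, assuming a hypothetical `Node` class and `ReadOnlyError` exception. It copies writes to the slaves immediately, whereas real Redis replication is asynchronous, so treat it as a model of the roles rather than of the protocol.

```python
class ReadOnlyError(Exception):
    """Raised when a client tries to write to a slave node."""


class Node:
    def __init__(self, name):
        self.name = name
        self.read_only = False
        self.data = {}
        self.slaves = []

    def attach_slave(self, slave):
        # Initial sync: copy the existing data, then receive new writes.
        slave.data = dict(self.data)
        slave.read_only = True
        self.slaves.append(slave)

    def set(self, key, value):
        if self.read_only:
            raise ReadOnlyError(f"{self.name} is a read-only slave")
        self.data[key] = value
        # Propagate the write to every attached slave.
        for slave in self.slaves:
            slave.data[key] = value

    def get(self, key):
        return self.data.get(key)


master = Node("master")
slave_a = Node("slave-a")
slave_b = Node("slave-b")

master.set("page:home", "v1")   # written before the slaves attach
master.attach_slave(slave_a)
master.attach_slave(slave_b)
master.set("page:about", "v1")  # written afterwards, copied to both slaves

try:
    slave_a.set("page:home", "v2")
    write_rejected = False
except ReadOnlyError:
    write_rejected = True

print(slave_a.get("page:home"), slave_b.get("page:about"), write_rejected)
```

Reads can go to `master`, `slave_a`, or `slave_b`, while every write has to go through `master`.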

Sentinel mode

The biggest pain point of master-slave mode is that when the master node fails, a slave node is not automatically promoted to master. Programs that write to Redis then report errors, although read operations continue to work. This falls short of the high availability requirement. To switch masters automatically after a failure, Redis provides sentinel mode.

Sentinel mode adds sentinel nodes, typically three (these are also Redis instances, but they store no data), to monitor all the Redis nodes (the actual data storage nodes) in the master-slave setup. The client program obtains the current master's address from the sentinel nodes. When the master node fails, the sentinels automatically promote one of the slave nodes to master so that the client program can keep writing. After the failed master recovers, it automatically becomes a slave of the new master.
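The failover flow can be sketched with a toy model. Everything here is an assumption made for illustration: `RedisNode`, `SentinelGroup`, the `quorum` of 2 out of 3, and the rule of promoting the first healthy slave are hypothetical simplifications, and the real Sentinel protocol involves more steps than this.

```python
class RedisNode:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.role = "slave"


class SentinelGroup:
    """Toy model of sentinels watching one master and its slaves."""

    def __init__(self, master, slaves, quorum=2):
        master.role = "master"
        self.master = master
        self.slaves = list(slaves)
        self.quorum = quorum

    def current_master(self):
        # Clients ask the sentinels for the master instead of hard-coding it.
        return self.master.name

    def check(self, down_reports):
        """down_reports holds one boolean per sentinel: True means
        'I cannot reach the master'. Returns True if a failover happened."""
        if sum(down_reports) < self.quorum:
            return False  # not enough sentinels agree
        candidates = [s for s in self.slaves if s.alive]
        if not candidates:
            return False  # nobody to promote
        new_master = candidates[0]
        self.slaves.remove(new_master)
        # The old master is demoted and rejoins as a slave once it recovers.
        self.master.role = "slave"
        self.slaves.append(self.master)
        new_master.role = "master"
        self.master = new_master
        return True


node1, node2, node3 = RedisNode("node-1"), RedisNode("node-2"), RedisNode("node-3")
group = SentinelGroup(master=node1, slaves=[node2, node3])

# One sentinel losing contact is not enough to trigger a failover.
first_result = group.check([True, False, False])
master_after_first = group.current_master()

# The master really goes down and two of three sentinels notice.
node1.alive = False
second_result = group.check([True, True, False])
master_after_second = group.current_master()

# The old master comes back and stays a slave of the new master.
node1.alive = True
print(master_after_first, master_after_second, node1.role)
```

Requiring agreement from several sentinels keeps a single sentinel with a bad network link from triggering an unnecessary switchover.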

Cluster mode

As you may have noticed, neither master-slave replication nor sentinel mode solves the problem of distributed writes. Up to this point, every scheme writes data to a single node, so storage capacity is limited by that one node. Sentinel mode only adds the automatic switchover that master-slave replication lacks. To solve the problem of distributed writes, Redis provides cluster mode.

A Redis cluster enables distributed writes. Nodes in a cluster are either master nodes or slave nodes. Master nodes handle reads and writes and maintain cluster information, while each slave node synchronizes data from its master.

A Redis cluster shards its data. It hashes the key being operated on (CRC16 of the key, modulo 16384 hash slots) and uses the resulting slot to determine which master node stores the key. This spreads writes across multiple master nodes. A read works the same way: the key's slot is calculated first, and the request then goes to the master node that owns that slot.
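The key-to-node mapping can be sketched as follows. Redis Cluster computes the slot as CRC16 of the key modulo 16384. The three master names and the even split of slots between them are assumptions for illustration, and this version deliberately ignores hash tags.

```python
import binascii

HASH_SLOTS = 16384  # Redis Cluster divides the key space into 16384 slots

# Assumed layout: three masters, each owning a contiguous range of slots.
SLOT_RANGES = {
    "master-a": range(0, 5461),
    "master-b": range(5461, 10923),
    "master-c": range(10923, 16384),
}


def key_slot(key: str) -> int:
    # crc_hqx with an initial value of 0 is the CRC16 (XMODEM) variant.
    return binascii.crc_hqx(key.encode("utf-8"), 0) % HASH_SLOTS


def master_for(key: str) -> str:
    slot = key_slot(key)
    for name, slots in SLOT_RANGES.items():
        if slot in slots:
            return name
    raise ValueError(f"slot {slot} is not assigned to any master")


for key in ("foo", "hello", "user:1000"):
    print(key, key_slot(key), master_for(key))
```

Because the slot depends only on the key, any client can work out where a key lives without asking a central coordinator.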

Unfortunately, clustering is not perfect either. For example, multi-key (batch) operations are restricted: they work only when all the keys involved are in the same slot. The KEYS command is similarly limited: it runs only against the node it is sent to, not across the whole cluster. All of these drawbacks stem from distributed writes, because the data is stored separately on different Redis nodes.
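Whether a batch of keys satisfies the same-slot rule can be checked before sending the command. The sketch below also applies the hash tag rule from the Redis Cluster specification, which the text above does not cover: when a key contains a non-empty `{...}` section, only that section is hashed, so related keys can be placed in one slot. The helper name `same_slot` is hypothetical.

```python
import binascii

HASH_SLOTS = 16384


def key_slot(key: str) -> int:
    raw = key.encode("utf-8")
    # Hash tag rule: take the first "{" and the first "}" after it; if there
    # is at least one character between them, hash only that part.
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return binascii.crc_hqx(raw, 0) % HASH_SLOTS


def same_slot(*keys: str) -> bool:
    """True if a multi-key command over these keys would stay in one slot."""
    return len({key_slot(k) for k in keys}) == 1


print(same_slot("foo", "bar"))
print(same_slot("{user:1000}.following", "{user:1000}.followers"))
```

The two `{user:1000}` keys share a slot because only `user:1000` is hashed for both of them, which is what makes a multi-key operation over them possible.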

Conclusion

Redis has progressed from single-node persistence to master-slave replication, then to sentinel mode, and finally to cluster mode. Like a game character who levels up by defeating monsters along the way, it has kept improving itself.
