Preface

This article walks through the core of Redis again, covering its data structures, memory model, IO model, RDB and AOF persistence, the principles of master-slave replication, Sentinel, and Cluster.

I have also summarized a Redis knowledge map to share with you.

Why is Redis so fast?

Many people only know that Redis is a K/V NoSQL in-memory database and that it is single-threaded. That answer comes from a lack of comprehensive understanding of Redis and leaves no room for further questions.

This question is foundational. We can answer it from several angles: the underlying data structures behind each Redis data type, the fully in-memory design, the IO multiplexing network model, the threading model, and progressive rehash.

How fast?

We can start by talking about how fast it is. According to official data, Redis can reach about 100,000 QPS (requests per second). See redis.io/topics/benc…

The benchmark

The horizontal axis is the number of connections, and the vertical axis is QPS.

This chart conveys the order of magnitude; quantifying the answer shows the interviewer that you have read the official documentation and take the subject seriously.

Memory-based implementation

Redis is a memory-based database, which spares it the disk I/O that makes disk-based databases comparatively slow.

Both read and write operations are performed in memory. Let’s compare memory operations with disk operations.

Disk call

Memory operations

The memory is directly controlled by the CPU, that is, the memory controller integrated within the CPU. Therefore, the memory is directly connected to the CPU and enjoys the optimal bandwidth for communication with the CPU.

Finally, the various latencies in a system are quantified in one chart (some figures are quoted from Brendan Gregg):

Efficient data structures

When learning MySQL, we saw that the B+Tree data structure is used to speed up retrieval, so Redis's speed should likewise be related to its data structures.

Redis has five data types: String, List, Hash, Set, and SortedSet.

Different data types are supported by one or more data structures in order to achieve higher speed.

Note: we can explain the advantages of the underlying data structures behind each data type one by one. Many people only know the data types; even the names of the underlying data structures will come as a surprise to them.

Advantages of SDS (Simple Dynamic String)

C strings vs. SDS

1. The len field in SDS stores the string length, so querying the length takes O(1) time (a C string needs an O(n) scan for the terminating '\0').

2. Space pre-allocation: when an SDS is modified, the program allocates not only the space the SDS needs but also extra unused space.

3. Lazy space release: when an SDS is shortened, the program does not reclaim the excess memory; instead, the free field records the spare bytes without releasing them. If an append is needed later, the unused space recorded in free is used directly, reducing memory allocations.

ZipList compressed list

Compressed lists are one of the underlying implementations of the List, Hash, and sorted Set data types.

When a list has only a small amount of data, and each list item is either a small integer value or a short string, Redis uses compressed lists for the underlying implementation of list keys.

ziplist

This layout keeps memory compact and saves space.

Later versions revamped the list data structure, using QuickList in place of ZipList and LinkedList.

QuickList is a hybrid of ZipList and LinkedList: it splits the linked list into segments, stores each segment compactly as a ziplist, and links the ziplists together with bidirectional pointers.

SkipList (skip list)

The sorted set type is sorted using a “skip list” data structure.

A skip list is an ordered data structure that achieves fast node access by maintaining, in each node, multiple pointers to other nodes.

On top of a plain linked list, the skip list adds multiple levels of indexes, so a piece of data can be located quickly through a few jumps across the index levels, as shown in the figure below:

Skip list

IntSet (integer set)

When a set contains only integer elements and the number of elements is small, Redis uses an integer set (intset) as the underlying implementation of the set key to save memory.

Single threaded model

Note that the single thread of Redis refers to Redis network IO (after 6.x network IO uses multiple threads) and key/value reads and writes are performed by a single thread. Redis persistence, cluster data synchronization, asynchronous deletion, etc. are all performed by other threads.

Never say that Redis has only one thread.

Single-threaded means that Redis key-value read/write instructions are executed single-threaded.

Give the official answer first; it sounds rigorous, unlike merely reciting some blog post.

Official answer: because Redis is a memory-based system, the CPU is not its bottleneck; the bottleneck is most likely machine memory or network bandwidth. Since single-threading is simple to implement and the CPU is not a bottleneck, a single-threaded design makes sense. See redis.io/topics/faq

Why not use multiple threads to make the most of your CPU?

Before running each task, the CPU needs to know where the task is loaded and where it starts executing. That is, the system must set up the CPU registers and program counter in advance; this state is called the CPU context.

Switching contexts requires significant work and is a resource-intensive operation.

With the introduction of multithreading, it is necessary to use synchronization primitives to protect concurrent reads and writes of shared resources, which increases the code complexity and debugging difficulty.

What are the benefits of single threading?

No performance cost from thread creation;

No CPU consumed by context switches: there is no multithreaded switching overhead;

No contention between threads: no acquiring locks, releasing locks, or deadlocks to worry about;

The code is clearer and the processing logic simpler.

I/O multiplexing model

Redis uses I/O multiplexing to handle connections concurrently: epoll combined with its own simple event framework.

Reads, writes, closes, and connection accepts are all converted into events, and epoll's multiplexing capability ensures no time is wasted blocking on IO.

High performance IO multiplexing

The Redis thread does not block on a particular listener or connected socket, that is, on a particular client request processing. Because of this, Redis can connect to multiple clients at the same time and process requests, increasing concurrency.

Redis global hash dictionary

Redis as a whole is a hash table that holds all key-value pairs, regardless of the data type. A hash table is essentially an array. Each element is called a hash bucket, and each entry in the bucket holds a pointer to the actual value, regardless of the data type.

Redis global hash table

The time complexity of a hash table lookup is O(1): we only need to compute a key's hash value to know the position of its hash bucket, then locate the entry in the bucket to find the data. This is another reason why Redis is fast.

Redis uses redisObject to represent the key value in the database. When we create a key-value pair in Redis, we create at least two objects, one for the key object and the other for the value object of the key-value pair.

That is, each entry stores a redisObject of key-value pairs. The corresponding data can be found through the pointer to the redisObject.

```c
typedef struct redisObject {
    unsigned type:4;     /* data type */
    unsigned encoding:4; /* underlying encoding */
    void *ptr;           /* pointer to the underlying data structure */
    /* ... */
} robj;
```

What about Hash conflicts?

Redis resolves conflicts with chained hashing: elements that fall into the same bucket are stored in a linked list. However, when a linked list grows too long, lookup performance degrades, so in pursuit of speed Redis maintains two global hash tables, used for rehash operations that increase the number of hash buckets and reduce collisions.

Hash table 1 is used to store key-value pair data by default. Hash Table 2 is not allocated at this time. When more data triggers the rehash operation, the following operations are performed:

1. Allocate more space to hash table 2;

2. Copy the data from hash table 1 into hash table 2;

3. Release the space of hash table 1.

It is important to note that the remapping of data from hash table 1 to hash table 2 is not done in a single pass; copying everything at once would block Redis and leave it unable to serve requests.

Instead, a progressive rehash is used, starting with the first index in “Hash table 1” and copying all data at that location to “Hash Table 2” for each client request, thus spreading the Rehash over multiple requests to avoid time-consuming blocking.

How does Redis persist? How can data be recovered after an outage?

Redis data persistence uses the “RDB data snapshot” method to achieve fast recovery from downtime. However, performing full data snapshots too frequently has two serious performance costs:

1. RDB files are frequently generated and written to disk, putting heavy pressure on the disk; the next RDB starts before the previous one has finished, in an endless loop.

2. Forking the bgsave child process blocks the main thread, and the larger Redis's memory, the longer the block.

Therefore, Redis also designed AOF, a write-after log that records the commands that modify memory.

Interviewer: What is an RDB memory snapshot?

As Redis executes the write command, the memory data changes. The so-called memory snapshot refers to the state data of data in Redis memory at a certain moment.

Just like time is fixed in a certain moment, when we take photos, we can completely record the instant picture of a certain moment through photos.

Redis is similar in that it takes a snapshot of a moment’s worth of data as a file and writes it to disk. This snapshot file is called an RDB file, which stands for Redis DataBase.

RDB memory snapshot

During data restoration, RDB files are directly read into the memory to complete data restoration.

Interviewer: Can Redis process write requests simultaneously during RDB generation?

Yes, Redis uses the multi-process copy-on-write (COW) technology of the operating system to implement snapshot persistence and ensure data consistency.

Redis calls glibc’s fork during persistence to generate a child process. Snapshot persistence is left entirely to the child process, and the parent process continues to process client requests.

When the main thread modifies data by executing write instructions, a copy of the data is made, and the BGSave child reads the copy and writes it to the RDB file.

This ensures snapshot integrity and allows the main thread to modify data at the same time, avoiding impact on normal services.

Copy-on-write ensures that data can be modified during snapshots

Interviewer: What’s AOF?

The AOF log records all sequences of modified instructions since the Redis instance was created, so it is possible to restore the state of the in-memory data structure of the current Redis instance by executing all instructions sequentially on an empty Redis instance, known as “replay.”

The appendfsync configuration item, the AOF write-back policy, directly determines the efficiency and safety of AOF persistence.

There is no one-size-fits-all strategy, and we need to make a trade-off between performance and reliability.
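Concretely, the policy is chosen with the appendfsync directive in redis.conf; the three values below trade durability against throughput:

```conf
appendonly yes          # enable AOF
appendfsync everysec    # fsync once per second: the default trade-off,
                        # at most ~1 second of writes lost on a crash
# appendfsync always    # fsync after every write: safest, slowest
# appendfsync no        # let the OS decide when to flush: fastest, least safe
```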

Interviewer: Since RDB has two performance problems, why not use AOF?

The AOF write-after log records every write operation. It does not incur the performance cost of a full RDB snapshot, but replaying it is not as fast as loading an RDB snapshot, and a large log file can itself become a performance problem.

Therefore, Redis designed a killer feature, the AOF rewrite mechanism, and provides the bgrewriteaof command to slim down the AOF log.

The principle is to fork a child process that traverses memory and converts it into a series of Redis operation commands, serialized into a new AOF log file. After serialization, the incremental AOF entries generated during the rewrite are appended to the new file, which then immediately replaces the old AOF log file, and the slimming is done.

AOF rewrite mechanism (3 to 1)

Interviewer: How can you achieve as little data loss as possible while still maintaining performance?

When rebooting Redis, we rarely use RDB to restore memory state because a lot of data is lost. We usually use AOF log replay, but replay AOF log performance is much slower than RDB, so it can take a long time to start up with a large Redis instance.

Redis 4.0 addresses this problem with a new persistence option, hybrid persistence. Store the contents of the RDB file with the incremental AOF log file. Here the AOF log is no longer the full log, but rather the incremental AOF log that occurs between the beginning of persistence and the end of persistence, which is usually small.

Therefore, when Redis restarts, the contents of RDB can be loaded first, and then the incremental AOF log can completely replace the previous full AOF file replay, so the restart efficiency is greatly improved.
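Hybrid persistence is controlled by a single redis.conf switch (available since Redis 4.0, on by default since 5.0):

```conf
# When rewriting the AOF, start the file with an RDB snapshot and append
# only the incremental AOF commands after it, so restarts load quickly.
aof-use-rdb-preamble yes
```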

Redis master/slave architecture data synchronization

Redis provides a master-slave mode to copy redundant data to other Redis servers through master-slave replication.

Interviewer: How to ensure consistency of data between master and slave?

To ensure the consistency of duplicate data, master/slave architectures adopt read/write separation.

Read operation: master and slave library can execute;

Write operations: the master performs first and then synchronizes the write operations to the slave.

Redis reads and writes separate

Interviewer: Is there any other use for master-slave replication?

Fault recovery: when the master node goes down, the other nodes can still provide service;

Load balancing: the master handles write requests while the slaves handle reads, sharing the load;

Cornerstone of high availability: master-slave replication is the foundation on which Sentinel and Cluster are built.

Interviewer: How is master-slave replication implemented?

Synchronization can be divided into three cases:

The first full replication between master and slave;

Synchronization during normal operation;

Re-synchronization after the network between master and slave is broken and reconnected.

Interviewer: How is the first synchronization achieved?

The first replication between master and slave can be divided into three stages: establishing the connection (the preparation stage), the master synchronizing data to the slave, and the master sending the new write commands received during synchronization to the slave.

Redis full synchronization

1. Establish the connection: the secondary database establishes the connection with the primary database. The secondary database executes replicaof and sends the psync command to inform the primary database that the synchronization is about to take place.

2. The master synchronizes data to the slave: the master runs bgsave to generate an RDB file and sends it to the slave. At the same time, the master opens a replication buffer for each slave to record every write command received after the RDB file was generated. The slave saves the RDB file, flushes its current database, and then loads the RDB data into memory.

3. Send the new write commands received after the RDB to the slave: write operations performed after the RDB file is generated are not recorded in the RDB file, so to keep master and slave consistent, the master records them in the replication buffer and sends them on to the slave.

Interviewer: What if the network between the master and slave is interrupted? Does a full copy have to be done again after reconnecting?

Prior to Redis 2.8, if the master and slave libraries had network interruptions during command propagation, the master and slave libraries would have to do a full copy again, which was very expensive.

Starting with Redis 2.8, when the network is down, the master and slave libraries continue to synchronize using incremental replication.

Incremental replication: Used for replication after a network interruption. Only the write commands executed by the primary node during the interruption are sent to the secondary node. Compared with full replication, incremental replication is more efficient.

The master records write operations in repl_backlog_buffer at all times. Because memory is limited, repl_backlog_buffer is a fixed-length circular buffer; when it is full, it wraps around and overwrites the earliest content.

The master uses master_repl_offset to record the offset it writes to, and the slave uses slave_repl_offset to record the offset it reads.

repl_backlog_buffer

After the master and slave disconnect and then reconnect, the slave first sends a psync command to the master carrying its runID and slave_repl_offset.

The master then only needs to send the commands between master_repl_offset and slave_repl_offset to the slave.

The following figure shows the incremental replication process:

Redis incremental replication

Interviewer: How do you synchronize data during normal operation after full synchronization?

When the master and slave libraries complete full replication, a network connection is maintained between them. The master library synchronizes subsequent command operations to the slave library through this connection. This process is also known as command propagation based on long connections.

Do you know how the Sentinel cluster works?

Sentinel is a running mode of Redis that focuses on monitoring the running state of Redis instances (master and slave nodes). When the master fails, it can elect a new master and perform the switchover through a series of mechanisms, achieving failover and ensuring the availability of the whole Redis system.

Its architecture diagram is as follows:

Redis Sentinel cluster

Redis Sentinel has the following capabilities:

Monitoring: continuously checks whether the master and slaves are working in the expected state;

Automatic master switchover: when the master fails, Sentinel starts the automatic failover process and promotes one of the slaves to be the new master;

Notification: instructs the slaves to execute replicaof to synchronize with the new master, and notifies clients to establish connections to the new master.

Interviewer: How do the sentinels know about each other?

Each sentinel establishes communication with the master and uses the master's publish/subscribe mechanism to publish its own information, such as height, weight, whether it is single, IP address, and port number…

The master has a dedicated channel, __sentinel__:hello, for publishing and subscribing to messages among the sentinels. It works like a WeChat group: the sentinels post their own messages to the group set up via the master and follow the messages posted by the other sentinels.

Interviewer: The sentinels are connected with each other, but they still need connections to the slaves, otherwise they cannot monitor them. How do they learn about the slaves and monitor them?

The key is again to use the master. The sentinel sends the INFO command to the master, and the master knows all the slaves under its control, so on receiving the command it tells the sentinel the slave list.

The sentinel then connects to each slave according to the slave list in the master's response and monitors the slaves continuously over these connections.

The INFO command is used to obtain slave information

Cluster

Interviewer: Besides sentry, are there any other high availability tools?

A Redis deployment monitored by a Sentinel cluster is still a master-slave architecture, which cannot scale horizontally. Redis Cluster mainly solves the various slowdowns caused by storing large amounts of data, and also makes horizontal expansion easy.

For user bases in the millions or tens of millions, a scaled-out Redis sharded cluster is a great choice.

Interviewer: What is a Cluster?

Redis clustering is a distributed database solution that manages data through sharding (a practice of “divide-and-conquer” thinking) and provides replication and failover capabilities.

Divide the data into 16384 slots. Each node is responsible for a number of slots. Slot information is stored in each node.

It is decentralized, as shown in the figure. The cluster consists of three Redis nodes, each of which is responsible for a part of the data of the whole cluster, and each node may be responsible for different amounts of data.

Redis cluster architecture

The three nodes are connected to form a peer cluster. They exchange cluster information with each other through the Gossip protocol. Finally, each node stores the slots allocation of other nodes.

Interviewer: How does a hash slot map to a Redis instance?

1. The CRC16 algorithm computes a 16-bit value from the key of the key-value pair.

2. The 16-bit value is taken modulo 16384; the result, a number from 0 to 16383, is the hash slot for the key.

3. The instance is located from the slot information.

The mapping between key-value pairs, hash slots, and Redis instances is as follows:

Mapping of data, slots, and instances

Interviewer: How does a Cluster fail over?

Redis cluster nodes use the Gossip protocol to broadcast their status and changes in their perceptions of the entire cluster. For example, if a node discovers that a node has lost contact (PFail), it will broadcast this information to the whole cluster, and other nodes will receive this information.

If a node learns that the number of nodes reporting another node as lost (PFail count) has reached a majority of the cluster, it can mark that node as failed (Fail), broadcast this to the entire cluster, force the other nodes to accept that the node is offline, and immediately initiate a master/slave switchover for the failed node.

Interviewer: How does the client determine on which instance the accessed data is distributed?

Redis instances send their hash slot information to other instances in the cluster through the Gossip protocol, spreading hash slot allocation information.

Thus, each instance in the cluster has information about the mapping between all the hash slots and the instance.

When a client connects to any instance, the instance responds with the hash slot mapping to the client, and the client caches the hash slot and instance mapping information locally.

When a client makes a request, it calculates the hash slot corresponding to the key, locates the instance where the data resides based on the hash slot instance mapping information cached locally, and then sends the request to the corresponding instance.

The Redis client locates the node where the data resides

Interviewer: What is Redis redirection?

The mapping between hash slots and instances changes when a new instance is added or slots are rebalanced for load. If a client sends a request to an instance that no longer holds the data, the Redis instance tells the client to send the request to another instance.

Redis does this through the MOVED and ASK errors.

MOVED

MOVED error (load balancing: the data has been migrated to another instance): when a client sends a key-value operation to an instance that is not in charge of the key's slot, the instance returns a MOVED error directing it to the node that is in charge of that slot.

At the same time, the client also updates the local cache to correct the mapping between the slot and the Redis instance.


ASK

If a slot has a lot of data, some of it is migrated to the new instance and some of it is not.

Execute the command if the requested key is found on the current node, otherwise an ASK error response is required.

If a slot is being migrated from instance 1 to instance 2 and the requested key is no longer on instance 1, instance 1 returns an ASK message to the client: the hash slot of the requested key is being migrated to instance 2. The client should first send an ASKING command to instance 2 and then send the actual command.

For example, suppose the client requests the key "public account: Code Byte", which hashes to slot 16330, on instance 172.17.18.1. If node 1 can find it, it executes the command directly; otherwise it replies with an ASK error, directing the client to 172.17.18.2, the target node of the migration.

ASK error

Note: unlike MOVED, the ASK error does not update the client's cached hash slot allocation information.
