preface

The current mainstream NoSQL database is Redis. Because its read and write speeds are very fast, it is generally used to cache hot data and speed up queries. We have all dealt with Redis at work, but when it comes to *why* Redis is fast, beyond reciting interview boilerplate, few of us have a particularly deep understanding.

Today, let’s have an in-depth understanding of Redis:

Efficient data structures

Redis has six underlying data structures: the simple dynamic string (SDS), the doubly linked list, the ziplist (compressed list), the hash table, the skip list, and the integer set. Their correspondence to the Redis data types is shown in the following figure:

This article will not go through them one by one for now; a source-level deep dive into each of these data structures will follow later.

Single-threaded vs. multi-threaded

When studying operating systems, you must have run into this question: is multithreading always faster than a single thread? I trust you, dear readers, won't fall into the trap as naively as Ne Zha does above.

Multithreading is sometimes faster than single-threading, but there are many times it’s not as fast. Let’s start by illustrating the difference between concurrency and parallelism with a diagram that a three-year-old can understand:

  • Concurrency: only one instruction executes at any instant, but instructions from multiple processes are rotated through execution so rapidly that, at the macro level, the processes appear to run simultaneously; at the micro level they do not, the time is merely sliced so each process executes in turn.
  • Parallelism: multiple instructions execute simultaneously on multiple processors at the same instant, so from both the micro and the macro point of view they truly run together.

It’s not hard to see that under concurrency only one instruction executes at any given moment; processes (threads) are rotated on and off the CPU so quickly that they give the impression of “running simultaneously,” when in fact only one instruction is executing at a time. In reality, if an application uses many threads, the rotation and context switching between them can consume a great deal of time.

Talk is cheap, show me the code

The following code demonstrates the same accumulation operations performed serially and concurrently:

public class ConcurrencyTest {
    private static final long count = 1000000000;

    public static void main(String[] args) {
        try {
            concurrency();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        serial();
    }

    private static void concurrency() throws InterruptedException {
        long start = System.currentTimeMillis();
        Thread thread = new Thread(new Runnable() {
            @Override
            public void run() {
                int a = 0;
                for (long i = 0; i < count; i++) {
                    a += 5;
                }
            }
        });
        thread.start();
        int b = 0;
        for (long i = 0; i < count; i++) {
            b--;
        }
        thread.join();
        long time = System.currentTimeMillis() - start;
        System.out.println("concurrency : " + time + "ms, b=" + b);
    }

    private static void serial() {
        long start = System.currentTimeMillis();
        int a = 0;
        for (long i = 0; i < count; i++) {
            a += 5;
        }
        int b = 0;
        for (long i = 0; i < count; i++) {
            b--;
        }
        long time = System.currentTimeMillis() - start;
        System.out.println("serial : " + time + "ms, b=" + b);
    }
}

The execution times are shown in the following table. It is not hard to see that when the accumulation runs no more than about a million times, concurrent execution is no faster, and can even be slower, than serial execution.

| Cycles | Serial execution time /ms | Concurrent execution time /ms | Concurrent vs. serial |
| --- | --- | --- | --- |
| 100 million | 130 | 77 | about 1x faster |
| 10 million | 18 | 9 | about 1x faster |
| 1 million | 5 | 5 | about the same |
| 100,000 | 4 | 3 | about the same |
| 10,000 | 0 | 1 | slower |

Concurrent execution can be slower than serial because of the overhead of thread creation and context switching.

Context switch

Multiple threads can execute on single-core or multi-core CPUs; even a single core supports multithreaded code. The CPU achieves this by allocating a time slice to each thread: to run multiple threads, it constantly switches the executing thread, so that every thread gets some execution time within a given period.

The period of execution the CPU allocates to each thread is called its time slice, typically a few tens of milliseconds. The CPU cycles through tasks using a time-slice allocation algorithm: the current task runs for one slice, then the CPU switches to the next task.

However, the state of the previous task is saved before the switch, so that it can be loaded again when the CPU switches back to it. The process of saving a task's state and later reloading it is a context switch.

From the perspective of thread states: in a multithreaded environment, when a thread goes from Runnable to a non-runnable state (Blocked, Waiting, Timed_Waiting), its context (the contents of the CPU registers, the program counter, and so on at that point in time) must be saved, so that when it becomes Runnable again it can resume from where it left off. Conversely, moving from a non-runnable state back to Runnable may involve restoring that saved context. This saving and restoring of a thread's context is called a context switch.
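To make those state transitions concrete, here is a small, hypothetical Java sketch (the class and names are my own, for illustration only): two threads play ping-pong over a shared lock with wait/notify, so every hand-off forces one thread from Runnable to Waiting (context saved) and the other from Waiting back to Runnable (context restored):

```java
public class PingPong {
    private static final Object lock = new Object();
    private static int turns = 0;

    // Two threads alternate taking turns; every hand-off is a forced context switch.
    static int run(int total) throws InterruptedException {
        Runnable player = () -> {
            synchronized (lock) {
                try {
                    while (turns < total) {
                        turns++;                         // take a turn
                        lock.notify();                   // other thread: Waiting -> Runnable
                        if (turns < total) lock.wait();  // this thread: Runnable -> Waiting
                    }
                    lock.notify();                       // release any last waiter
                } catch (InterruptedException ignored) { }
            }
        };
        Thread a = new Thread(player);
        Thread b = new Thread(player);
        a.start(); b.start();
        a.join(); b.join();
        return turns;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("total turns taken: " + run(10));
    }
}
```

Run it with a large `total` and time it: most of the elapsed time goes into the save/restore hand-offs, not into the trivial increments.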

Based on the memory

Take MySQL as a contrast: its data and indexes are persisted on disk. So when we execute a query with a SQL statement, if the target index has not been loaded into memory, it must first be loaded; then, after some addressing and disk I/O, the disk blocks holding the data are loaded into memory, and only then is the data actually read.

If the disk is a mechanical disk, you need to locate the data location, that is, the disk address to be read. Take a look at this diagram:

To read data from a hard disk, the first step is to find the target track. A track is a ring centered on the spindle; the head must be moved to the corresponding track, a process called seeking.

Then we must wait for the platter to rotate until the start of the data we want passes under the head; this wait is called rotational delay. When we say a hard drive's rotational speed matters, this is mainly where the time goes. And rotation is one-way: if you miss the start of the data, you must wait for the platter to come around again before reading can begin.

Finally, the magnetic head begins to read and record the data on the disk. This works in a similar way to the optical disc. Since there is a layer of magnetic media on the track, when the magnetic head sweeps over a specific location, the magnetic signal can be converted into an electrical signal by sensing the magnetic state at different locations.

As you can see, both the movement of the magnetic head and the rotation of the disk are mechanical in nature, which is why this kind of hard disk is called a mechanical hard disk, and the efficiency of mechanical movement is the bottleneck of disk read and write.
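The rotational-delay cost described above is easy to put a number on. A minimal sketch (assuming a typical 7200 RPM desktop disk; the class name is made up for illustration):

```java
public class DiskLatency {
    // On average the platter must rotate half a revolution before the data's
    // start position reaches the head; one revolution takes 60/rpm seconds.
    static double avgRotationalDelayMs(int rpm) {
        return 0.5 * (60.0 / rpm) * 1000.0;
    }

    public static void main(String[] args) {
        // ~4.17 ms at 7200 RPM, before even counting seek time. A DRAM access
        // takes on the order of tens of nanoseconds: several orders of magnitude faster.
        System.out.printf("avg rotational delay at 7200 RPM: %.2f ms%n",
                avgRotationalDelayMs(7200));
    }
}
```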

We've drifted a bit; back to Redis. If data lives in memory, as it does in Redis, reads and writes operate on the database directly, skipping the disk-read step entirely, and that step is precisely the bottleneck of computer I/O.

Reading data in memory is essentially the transmission of electrical signals, much faster than mechanical motion.

So it’s safe to say that Redis is so fast, of course, because it runs on memory. But that is far from the whole story.

Redis FAQ

Faced with a single-threaded Redis, you may have a question: Aobing, doesn't my multi-core CPU go to waste? Don't worry, Redis has an answer to this question.

It is not common for the CPU to become the performance bottleneck for Redis, which is usually constrained by memory or the network. For example, a pipelined Redis can serve up to a million requests per second on an average Linux system, so if your application primarily uses O(N) or O(log(N)) commands, it hardly uses much CPU.

However, to maximize CPU utilization, you can start multiple Redis instances on the same node and treat them as different Redis services. In some cases a single node may not be enough, so if you want to use multiple CPUs, you can start thinking about some of the earlier sharding methods.

You can find more information about using multiple Instances of Redis on the Partitioning page.

However, in Redis 4.0, we started to make Redis more threaded. This is currently limited to deleting objects in the background and blocking commands implemented through the Redis module. For future releases, our plan is to make Redis more and more multithreaded.

Note: when we say that Redis is single-threaded, we mean that a single thread handles all network requests. A running Redis server certainly has more than one thread! For example, Redis forks a child process to perform persistence.

Four IO models

When a network IO occurs (let’s say read), it involves two system objects, the process that calls the IO and the system kernel.

When a read operation occurs, it goes through two phases:

① Wait for the data to be prepared.

② Copy the data from the kernel to the process.

In order to solve the problems in network IO, four network IO models are proposed:

  • Blocking IO model
  • Non-blocking IO model
  • Multiplexing IO model
  • Asynchronous IO model

The concepts of blocking and non-blocking describe how user threads invoke kernel I/O operations: blocking means the I/O call returns to user space only after the operation has completely finished; non-blocking means the call returns a status value to the user immediately, without waiting for the I/O operation to complete.

Blocking IO model

In Linux, all sockets are blocked by default. A typical read operation is shown below:

When the application process issues the recvfrom system call, the kernel begins the first stage of I/O: preparing the data.

With network IO, the data often has not arrived yet when the call is made (for example, a complete TCP packet has not been received), so the kernel must wait for enough data to arrive. On the user side, the entire process is blocked.

When the kernel waits until the data is ready, it copies the data from the kernel to the user’s memory. Then the kernel returns the result, and the user process unblocks and starts running again. Thus, the blocking IO model is characterized by the fact that both phases of IO execution (waiting for data and copying data) are blocked.

Non-blocking IO model

In Linux, you can set the socket to make the I/O non-blocking. When a read operation is performed on a non-blocking socket, the flow of the read operation is as follows:

As you can see from the figure, when the user process issues a read operation, if the data in the kernel is not ready, it does not block the user process, but immediately returns an error.

From the user process’s point of view, when it initiates a read operation, it does not wait, but gets a result immediately. When the user process determines that the result is an error, it knows that the data is not ready, and it can send the read operation again.

As soon as the data is ready in the kernel and the system call from the user process is received again, it copies the data into user memory and returns the correct return value.

So, in non-blocking IO, the user process actually needs to constantly actively ask the kernel if the data is ready. The significant difference between a non-blocking interface and a blocking interface is that it returns immediately after being called.
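The polling behavior can be sketched in Java with an NIO channel set to non-blocking mode; here a `Pipe` stands in for the network socket (an assumption made so the example is self-contained). The first read returns immediately with 0 bytes because no data is ready, and the process simply keeps asking until data arrives:

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

public class NonBlockingDemo {
    // Returns {bytes from the first read (before any data exists),
    //          total bytes eventually read after data is written}.
    static int[] poll() throws Exception {
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false);      // make the read end non-blocking
        ByteBuffer buf = ByteBuffer.allocate(16);

        int first = pipe.source().read(buf);         // nothing ready: returns 0 immediately

        pipe.sink().write(ByteBuffer.wrap("ready".getBytes()));
        int total = 0;
        while (total < 5) {                          // keep asking until all 5 bytes arrive
            int n = pipe.source().read(buf);
            if (n > 0) total += n;
        }
        pipe.source().close();
        pipe.sink().close();
        return new int[] { first, total };
    }

    public static void main(String[] args) throws Exception {
        int[] r = poll();
        System.out.println("before data: " + r[0] + " bytes, after data: " + r[1] + " bytes");
    }
}
```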

Multiplexing IO model

Multiplexing IO is sometimes called event-driven IO (the Reactor design pattern). The basic idea is that a single function polls all the sockets it is responsible for, and when data arrives on any of them, it notifies the user process. The flow of the multiplexing model is shown in the figure below:

When a user process calls select, the entire process blocks while the kernel “watches” all the sockets select is responsible for, returning as soon as the data in any of them is ready. The user process then calls the read operation to copy the data from the kernel to the user process.

This model is not all that different from the blocking IO model; in fact it looks worse, because it takes two system calls (select and recvfrom) where blocking IO takes only one (recvfrom). The advantage of select, however, is that it can handle many connections at once. So if the number of connections on the system is not very high, a web server using select/epoll will not necessarily perform better than one using multithreaded blocking I/O, and its latency may even be greater. The strength of select/epoll is not processing a single connection faster, but handling more connections.
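Here is a minimal Java sketch of the multiplexing model using java.nio's `Selector` (which is backed by select/epoll/kqueue depending on the platform); again a `Pipe` stands in for a client socket, an assumption made so the example runs standalone. The `select()` call blocks until a registered channel is readable, after which the process performs the read:

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class SelectDemo {
    // Blocks in select() until the registered channel has data, then reads it.
    static int waitForReadable() throws Exception {
        Selector selector = Selector.open();
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false);        // registered channels must be non-blocking
        pipe.source().register(selector, SelectionKey.OP_READ);

        // A writer thread plays the role of a client that sends data a bit later.
        new Thread(() -> {
            try {
                Thread.sleep(100);
                pipe.sink().write(ByteBuffer.wrap("hi".getBytes()));
            } catch (Exception ignored) { }
        }).start();

        selector.select();                             // the process blocks here until readiness
        ByteBuffer buf = ByteBuffer.allocate(16);
        int total = 0;
        while (total < 2) {                            // then copies the data out
            int n = pipe.source().read(buf);
            if (n > 0) total += n;
        }
        selector.close();
        pipe.source().close();
        pipe.sink().close();
        return total;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("read " + waitForReadable() + " bytes after select() returned");
    }
}
```

A real server would register many channels with the same selector; that is exactly where this model beats one-thread-per-connection blocking IO.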

If select() finds that a handle has caught a “readable event,” the server program should promptly call recv() on it, prepare the data to be sent based on what was received, and add the handle to writefds in preparation for the next select() check for a “writable event.” Similarly, if select() finds that a handle has caught a writable event, the program should call send() on it and be ready for the next round of readable-event detection.

The following figure illustrates the event-driven working model, where handlers sense and execute different events as they occur, like a multiplexer switch.

IO multiplexing is the most commonly used IO model, but it is not asynchronous enough because it uses a select system call that blocks threads. Therefore, IO multiplexing can only be called asynchronous blocking IO, not true asynchronous IO.

Asynchronous IO model

“True” asynchronous IO requires stronger support from the operating system. The following shows the flow of the asynchronous IO model (Proactor design mode) :

Once a user process initiates a read operation, it can immediately start doing other things; On the other hand, from the kernel’s point of view, when it receives an asynchronous read request operation, it first returns immediately, so there is no blocking for the user process.

The kernel then waits for the data to be ready, then copies the data into user memory, and when all this is done, the kernel sends a signal to the user process that the read operation is complete.

IO Model Summary

Calling blocking IO blocks the process until the operation is complete, while non-blocking IO returns immediately while the kernel is still preparing data.

The distinguishing feature of synchronous IO is that the process is blocked while the actual IO operation is performed. By this definition, the blocking IO, non-blocking IO, and multiplexing IO described earlier are all synchronous IO; the actual IO operation here is the recvfrom system call in our example.

Non-blocking IO does not block on the recvfrom system call while the kernel data is not ready. But once the kernel is ready, recvfrom copies the data from the kernel into user memory, and that copy does block the process.

Asynchronous I/O is different. When a process initiates an I/O operation, it returns directly until the kernel sends a signal telling the process that the I/O has completed, and the process is not blocked at all.
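As a sketch of the asynchronous model in Java (using `AsynchronousFileChannel`, since a self-contained file read is easier to demonstrate than a network read; the file and names here are illustrative assumptions): the process initiates the read and returns at once, is free to do other work, and the completion handler runs only after the data has already been copied into the user-space buffer:

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class AsyncReadDemo {
    static int asyncRead() throws Exception {
        Path tmp = Files.createTempFile("aio-demo", ".txt");
        Files.write(tmp, "hello".getBytes());

        AtomicInteger bytesRead = new AtomicInteger(-1);
        CountDownLatch done = new CountDownLatch(1);
        AsynchronousFileChannel ch = AsynchronousFileChannel.open(tmp, StandardOpenOption.READ);
        ByteBuffer buf = ByteBuffer.allocate(16);

        // Initiate the read and return immediately; no blocking here.
        ch.read(buf, 0, null, new CompletionHandler<Integer, Void>() {
            @Override public void completed(Integer n, Void attachment) {
                bytesRead.set(n);       // data is already in our buffer when this runs
                done.countDown();
            }
            @Override public void failed(Throwable exc, Void attachment) {
                done.countDown();
            }
        });

        // ... the process could do other work here ...
        done.await();                   // only waiting so the demo can report the result
        ch.close();
        Files.delete(tmp);
        return bytesRead.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("completion handler saw " + asyncRead() + " bytes");
    }
}
```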

The comparison of each IO model is as follows:

Application in Redis

The Redis server is an event-driven program, and the server needs to handle two types of events:

  • File events: A Redis server connects to a client (or other Redis server) over a socket, and file events are the server’s abstraction of socket operations. Communication between a server and a client (or other server) generates file events, and the server performs a series of network communication operations by listening for and processing these events.
  • Time events: some operations in the Redis server (such as the serverCron function) need to run at given points in time; time events are the server’s abstraction of such timed operations.

I/O multiplexing program

All of Redis’s I/O multiplexing functionality is implemented by wrapping the common select, epoll, evport, and kqueue multiplexing libraries.

Because Redis implements the same API for each I/O multiplexing library, the underlying implementation of an I/O multiplexing program is interchangeable.

The source of Redis’s I/O multiplexing program defines rules with #include macros so that, at compile time, the program automatically selects the highest-performance I/O multiplexing library available on the system as the underlying implementation (in the ae.c file):

/* Include the best multiplexing layer supported by this system. * The following should be ordered by performances, descending. */
#ifdef HAVE_EVPORT
#include "ae_evport.c"
#else
    #ifdef HAVE_EPOLL
    #include "ae_epoll.c"
    #else
        #ifdef HAVE_KQUEUE
        #include "ae_kqueue.c"
        #else
        #include "ae_select.c"
        #endif
    #endif
#endif

File event handler

Redis developed its own network event handler based on the Reactor pattern; this handler is called the file event handler:

  • File event handlers use I/O multiplexers to listen for multiple sockets at the same time and associate different event handlers to the socket based on the task the socket is currently performing.
  • When a monitored socket is ready to perform a connection reply (accept), read, write, or close operation, a file event corresponding to that operation is generated, and the file event handler calls the event handler associated with the socket to process it.

The following figure shows the four components of a file event handler: a socket, an I/O multiplexer, a file event dispatcher, and an event handler.

A file event is an abstraction of a socket operation that is generated whenever a socket is ready to perform connection reply, write, read, close, and so on. Because a server typically connects to multiple sockets, it is possible for multiple file events to occur concurrently. The I/O multiplexer listens for multiple sockets and passes those sockets that generated the event to the file event dispatcher.

Ne Zha asks a great question here. Think of a crowd of people heading to the canteen for dinner; what the serving auntie says most often is: queue up! Queue up! Everyone gets their turn!

That’s right, everything from life! Redis’s I/O multiplexer always puts all event-generating sockets in a queue and sends sockets to the file event dispatcher in an ordered, synchronous, one-socket at a time manner through this queue. After the event generated by the previous socket has been processed, the I/O multiplexer continues to transmit the next socket to the file event dispatcher.
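The queue-and-dispatch idea can be sketched in a few lines of Java. This is a toy model, not Redis's actual implementation: sockets (represented here as plain file descriptor numbers) that produced events are queued, and each is handed, one at a time and in arrival order, to the handler previously associated with it:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.IntConsumer;

public class EventDispatcher {
    private final Queue<Integer> readySockets = new ArrayDeque<>(); // the queue from the analogy
    private final Map<Integer, IntConsumer> handlers = new HashMap<>();
    final List<String> log = new ArrayList<>();

    void associate(int fd, IntConsumer handler) { handlers.put(fd, handler); }

    // The multiplexer enqueues every socket that produced an event.
    void socketReady(int fd) { readySockets.add(fd); }

    // The dispatcher hands out sockets one at a time, in arrival order.
    void dispatchAll() {
        Integer fd;
        while ((fd = readySockets.poll()) != null) {
            handlers.get(fd).accept(fd);
        }
    }

    public static void main(String[] args) {
        EventDispatcher d = new EventDispatcher();
        d.associate(3, fd -> d.log.add("command request on fd " + fd));
        d.associate(4, fd -> d.log.add("command reply on fd " + fd));
        d.socketReady(3);   // client on fd 3 sent a command
        d.socketReady(4);   // client on fd 4 is ready to receive a reply
        d.dispatchAll();
        d.log.forEach(System.out::println);
    }
}
```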

Redis has written multiple handlers for file event handlers that are used to meet different network communication requirements:

  • To respond to the various clients connecting to the server, the server associates the listening socket with a connection response handler;
  • To accept command requests from a client, the server associates the client socket with a command request handler;
  • To return the result of command execution to the client, the server associates the client socket with a command reply handler;
  • When master and slave servers perform replication, both need to associate sockets with handlers written specifically for replication, the replication handlers.

Connection response handler

The networking.c/acceptTcpHandler function is Redis's connection response handler; it is responsible for replying to clients that connect to the server's listening socket, and is implemented as a wrapper around the sys/socket.h/accept function.

When the Redis server initializes, the program associates this connection response handler with the AE_READABLE event of the server's listening socket. When a client connects to the listening socket with sys/socket.h/connect, the socket generates an AE_READABLE event, which causes the connection response handler to execute and perform the corresponding socket reply operation.

Command request handler

The networking.c/readQueryFromClient function is Redis's command request handler; it is responsible for reading the command request sent by the client from the socket, and is implemented as a wrapper around the unistd.h/read function.

When a client successfully connects to the server through the connection response handler, the server associates the AE_READABLE event of the client socket with the command request handler. When the client sends a command request to the server, the socket generates AE_READABLE events, which cause the command request handler to execute and perform the appropriate socket read operation.

The server will continue to associate the command request handler for the client socket AE_READABLE event the entire time the client connects to the server.

Command reply handler

The networking.c/sendReplyToClient function is Redis's command reply handler; it is responsible for returning the reply produced by executing a command to the client through the socket, and is implemented as a wrapper around the unistd.h/write function.

When the server has a command reply to send to the client, it associates the AE_WRITABLE event of the client socket with the command reply handler. When the client is ready to receive the reply, the AE_WRITABLE event is generated, which causes the command reply handler to execute and perform the corresponding socket write operation.

When the command reply has been sent, the server disassociates the command reply handler from the AE_WRITABLE event of the client socket.

A small summary

To summarize IO multiplexing in Redis: Redis puts all the sockets that generate events into a queue and sends them to the file event dispatcher in an ordered, synchronous manner, one socket at a time. The file event dispatcher selects the handler that responds to each socket's event, thereby achieving efficient handling of network requests.

Redis custom protocol

The Redis client communicates with the Redis server using RESP (REdis Serialization Protocol). The protocol is simple to implement, fast to parse, and human-readable.

RESP supports the following data types: simple strings, errors, integers, bulk strings, and arrays.

RESP is used as a request-response protocol in Redis in the following ways:

  • The client sends a command to the Redis server as a RESP array of bulk strings.
  • The server replies with one of the RESP types according to the command implementation.

In RESP, the type of the data is determined by the first byte:

  • For simple strings, the first byte of the reply is “+”
  • For errors, the first byte of the reply is “-“
  • For integers, the first byte of the reply is “:”
  • For bulk strings, the first byte of the reply is “$”
  • For arrays, the first byte of the reply is “*”

In addition, RESP can represent a Null value using a special variant of the bulk string or array types, specified later. In RESP, the different parts of the protocol are always terminated with “\r\n” (CRLF).
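Dispatching on that first byte makes a RESP reply parser almost trivial. Below is a minimal Java sketch (class and method names are my own) handling the three single-line types; bulk strings and arrays additionally need the length-prefix logic shown later in the article:

```java
public class RespParser {
    // Minimal sketch: dispatch on the first byte, as the protocol specifies.
    static Object parse(String reply) {
        String body = reply.substring(1, reply.length() - 2); // strip type byte and trailing \r\n
        switch (reply.charAt(0)) {
            case '+': return body;                            // simple string
            case '-': return new RuntimeException(body);      // error -> exception for the caller
            case ':': return Long.parseLong(body);            // integer
            default:  throw new IllegalArgumentException("unsupported type byte");
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("+OK\r\n"));
        System.out.println(parse(":1000\r\n"));
        System.out.println(((Exception) parse("-ERR unknown command 'foobar'\r\n")).getMessage());
    }
}
```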

The following is a brief introduction to simple-string encoding and error encoding. For details, see the detailed explanation of RESP on the Redis official website.

Simple string

It is encoded as follows: a “+” character, followed by a string that cannot contain “\r\n”, terminated by “\r\n”. Simple strings are used to transmit short, non-binary-safe strings with minimal overhead. For example, many Redis commands reply simply “OK” on success, which in RESP encoding is 5 bytes:

"+OK\r\n"

To send binary-safe strings, use RESP bulk strings instead. When Redis returns a simple string, the client library should hand the caller the string after the “+” (exclusive) and up to the CRLF (exclusive).

RESP error

RESP has a type designed specifically for errors. In fact, the error type is just like the RESP simple string type, except that the first character is “-”. The real difference between the two is that the client treats an error type as an exception, with the string it carries as the exception message. The format is:

"-Error message\r\n"

An error type is returned only when something actually goes wrong, for example when you attempt an operation against the wrong data type, or when the command does not exist. The client library should raise an exception when an error type is returned. Here are examples of the error type:

-ERR unknown command 'foobar'
-WRONGTYPE Operation against a key holding the wrong kind of value

The word after “-” and before the first space or newline represents the kind of error returned; this is a convention, not part of the RESP format. For example, ERR is a generic error, while WRONGTYPE is a more specific error indicating that the client attempted an operation against the wrong type. These are called error prefixes and make it easier for clients to identify the kind of error.

The client may return different exceptions for different errors, or it may provide only a generic method to catch errors and provide error names. You cannot rely on these features provided by the client, however, because some clients only return generic errors, such as false.

High performance Redis protocol analyzer

Although Redis’s protocol is very human-readable and simple to define, it can still be implemented as fast as binary protocols.

Because Redis puts the length of the data before the data body, applications neither need to scan the entire payload for a special character, as with JSON, nor need to escape the payload sent to the server.

The program can look for CR characters and calculate the length of batch replies or multiple batch replies while processing individual characters in the protocol text, like this:

#include <stdio.h>

int main(void) {
    const char *p = "$123\r\n";
    int len = 0;

    p++;                 /* skip the '$' type byte */
    while (*p != '\r') {
        len = (len * 10) + (*p - '0');
        p++;
    }

    /* Now p points at '\r', and len holds the bulk length. */
    printf("%d\n", len);
    return 0;
}

After obtaining the length of a bulk reply or multi-bulk reply, the program simply calls read to pull the entire reply body into memory in one go, without performing any processing on the data; the trailing CR and LF are simply discarded.
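The same length-prefix idea, carried through to the end, can be sketched in Java (class and method names are my own): parse the length, then slice out exactly that many bytes as the body, with no scanning or escaping of the payload:

```java
import java.nio.charset.StandardCharsets;

public class BulkReply {
    // Parse "$<len>\r\n<payload>\r\n": read the length digits up to CR, then take
    // exactly len bytes as the body without inspecting them further.
    static String parseBulk(byte[] reply) {
        int i = 1, len = 0;                    // skip the '$' type byte
        while (reply[i] != '\r') {
            len = len * 10 + (reply[i] - '0'); // accumulate the decimal length
            i++;
        }
        i += 2;                                // skip "\r\n"; the body starts here
        return new String(reply, i, len, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] reply = "$5\r\nhello\r\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(parseBulk(reply));
    }
}
```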

The implementation performance of the Redis protocol is comparable to that of the binary protocol, and due to the simplicity of the Redis protocol, most high-level languages can easily implement the protocol, which greatly reduces the number of bugs in the client software.

Trivia: How fast is Redis?

After successfully installing Redis, you can test performance with its bundled redis-benchmark command, which simulates a scenario in which N clients send requests simultaneously and measures how long Redis takes to process them.

According to the official documentation, Redis has been benchmarked with more than 60,000 connections and still sustained around 50,000 requests per second under those conditions; a comparable volume of requests would overwhelm MySQL outright. For this reason, Redis is often placed in front of the database as a cache.

So Redis's reputation for throughput on the order of a hundred thousand requests per second is no exaggeration. After reading this, you can casually mention this order of magnitude in interviews; many people throw around "a hundred thousand" and "a million" carelessly, and being precise about it is a plus.

I’m Aobing, the more you know, the more you don’t know, thank you for your talent: likes, favorites and comments, we’ll see you next time!