0 Introduction

This post summarizes the second part of the book “Redis Design and Implementation” by Huang Jianhong. For more detail, please read the book itself.

1 RDB Persistence

RDB persistence, which can be performed either manually or periodically depending on server configuration options, saves the database state at a point in time to an RDB file.

Database state: We collectively refer to the non-empty databases and their key-value pairs in the server as database state.

The RDB file generated by RDB persistence is a compressed binary file from which the database state at the time the file was generated can be restored.

1.1 RDB file creation and loading

Two commands can be used to generate RDB files: SAVE and BGSAVE.

The SAVE command blocks the Redis server process until the RDB file has been created. BGSAVE, by contrast, spawns a child process that creates the RDB file while the server process continues to process command requests.
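The difference between the two commands can be sketched with fork(). This is only an illustration under stated assumptions, not Redis's actual code: dump_snapshot, save_blocking, save_background and wait_for_child are hypothetical helpers, and the "database state" is just a string.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for writing the database state to a snapshot file. */
static int dump_snapshot(const char *path, const char *state) {
    assert(path != NULL && state != NULL);
    FILE *fp = fopen(path, "w");
    if (fp == NULL) return -1;
    int ok = fputs(state, fp) >= 0;
    if (fclose(fp) != 0) ok = 0;
    return ok ? 0 : -1;
}

/* SAVE-like: the caller is blocked until the file has been written. */
static int save_blocking(const char *path, const char *state) {
    return dump_snapshot(path, state);
}

/* BGSAVE-like: fork a child to write the file; the parent returns at once.
 * Returns the child's pid, or -1 if fork() failed. */
static pid_t save_background(const char *path, const char *state) {
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: works on its own copy of the parent's memory. */
        _exit(dump_snapshot(path, state) == 0 ? 0 : 1);
    }
    return pid; /* Parent: free to keep serving command requests. */
}

/* Reap the child and report whether it exited successfully. */
static int wait_for_child(pid_t pid) {
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

In this sketch the parent only learns the outcome when it later calls wait_for_child; between the fork and that call it is not blocked by the file write.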

Unlike creation, loading is performed automatically at server startup, so Redis has no command for loading RDB files: if the server detects an RDB file when it starts, it loads the file automatically.

1.2 Flowchart of triggering the BGSAVE command

2 AOF persistence

Unlike RDB persistence, which records database state by saving key-value pairs in the database, AOF persistence records database state by saving write commands executed by the Redis server.

The implementation of AOF persistence can be divided into three steps: command addition, file writing and file synchronization.

2.1 Command Addition

When AOF persistence is enabled, each time the server executes a write command it appends that command to the end of the aof_buf buffer in the server state:

struct redisServer {
    // ...
    // AOF buffer
    sds aof_buf;
    // ...
};
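As a rough illustration of the append step: Redis stores commands in the AOF in its request protocol format (an argument count, then each argument prefixed by its length). The sketch below assumes a hypothetical aof_buffer type in place of the real sds string and uses strlen, so unlike Redis it is not binary-safe.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer standing in for the sds aof_buf field. */
typedef struct {
    char *data;
    size_t len;
    size_t cap;
} aof_buffer;

/* Append n raw bytes, growing the buffer when needed. Returns 0 on success. */
static int buf_append(aof_buffer *b, const char *src, size_t n) {
    if (b->len + n + 1 > b->cap) {
        size_t cap = b->cap ? b->cap : 64;
        while (cap < b->len + n + 1) cap *= 2;
        char *p = realloc(b->data, cap);
        if (p == NULL) return -1;
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    b->data[b->len] = '\0';
    return 0;
}

/* Encode one command as "*<argc>\r\n", then "$<len>\r\n<arg>\r\n" per argument,
 * and append it to the end of the buffer. */
static int aof_append_command(aof_buffer *b, int argc, const char **argv) {
    assert(b != NULL && argc > 0 && argv != NULL);
    char header[32];
    int n = snprintf(header, sizeof header, "*%d\r\n", argc);
    if (buf_append(b, header, (size_t)n) != 0) return -1;
    for (int i = 0; i < argc; i++) {
        size_t arglen = strlen(argv[i]);
        n = snprintf(header, sizeof header, "$%zu\r\n", arglen);
        if (buf_append(b, header, (size_t)n) != 0) return -1;
        if (buf_append(b, argv[i], arglen) != 0) return -1;
        if (buf_append(b, "\r\n", 2) != 0) return -1;
    }
    return 0;
}
```

The caller starts from a zero-initialized aof_buffer and frees data when done.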

2.2 Writing and synchronization of AOF files

Here, writing means copying the contents of aof_buf into the operating system's buffer (os_buffer), and synchronization means flushing the contents of os_buffer to disk. Only synchronization actually puts the data from aof_buf on disk.

Because disk and memory read/write speeds differ by several orders of magnitude, modern operating systems provide a buffer (os_buffer) to make file writes more efficient. The operating system holds written data in os_buffer and only writes it to disk once the buffer fills up or a specified threshold is exceeded.

While this improves efficiency, it also creates a new problem: data safety. If the machine goes down, any written data still held in os_buffer is lost. To guard against this, the system provides two synchronization functions, fsync and fdatasync, which force the operating system to write the data in os_buffer to disk immediately.
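The two steps can be seen in a few lines of POSIX C. This is a minimal sketch; append_and_sync is a hypothetical helper name, and error handling is reduced to a single return code.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Append data to a file and force it to disk.
 * write() alone may only reach the OS buffer; fsync() blocks until the
 * kernel reports the data as transferred to the storage device.
 * Returns 0 on success, -1 on failure. */
static int append_and_sync(const char *path, const char *data) {
    assert(path != NULL && data != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) return -1;

    size_t left = strlen(data);
    while (left > 0) {
        ssize_t n = write(fd, data, left); /* step 1: into the OS buffer */
        if (n < 0) {
            close(fd);
            return -1;
        }
        data += n;
        left -= (size_t)n;
    }

    int rc = fsync(fd);                    /* step 2: OS buffer to disk */
    if (close(fd) != 0) rc = -1;
    return rc;
}
```

Dropping the fsync call would leave the timing of the disk write entirely to the operating system.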

The Redis server process is an event loop in which file events are responsible for receiving client command requests and sending command replies to the client.

Because handling command requests may append content to the aof_buf buffer, the server calls flushAppendOnlyFile before the end of each event loop to decide whether the contents of aof_buf need to be written and saved to the AOF file.

The behavior of the flushAppendOnlyFile function is determined by the value of the appendfsync option, and the behavior of the different values is as follows:

| appendfsync value | Behavior of flushAppendOnlyFile | Efficiency | Safety |
| --- | --- | --- | --- |
| always | Before the end of each event loop, writes the entire contents of aof_buf to os_buffer and synchronizes it to the AOF file | Slowest | Safest: even after a crash, only the command data produced in a single event loop is lost |
| everysec | Writes the entire contents of aof_buf to os_buffer; if more than 1s has passed since the AOF file was last synchronized, synchronizes it again. The synchronization is performed by a separate thread | Fast enough | All data written since the last synchronization is lost (in theory, no more than 1s of data) |
| no | Writes the entire contents of aof_buf to os_buffer without synchronizing; the operating system decides when to synchronize | Fastest writes. Because data accumulates in os_buffer for a while, a single synchronization usually takes the longest in this mode; amortized, its efficiency is similar to everysec | After a crash, all write commands since the last AOF file synchronization are lost |
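The decision in the table can be condensed into a small function. This is a sketch of the rule only, assuming a hypothetical fsync_policy enum and millisecond timestamps; it is not the body of flushAppendOnlyFile.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names mirroring the three appendfsync values. */
typedef enum { FSYNC_ALWAYS, FSYNC_EVERYSEC, FSYNC_NO } fsync_policy;

/* Decide whether this flush should also synchronize the AOF file.
 * now_ms and last_sync_ms are timestamps in milliseconds. */
static bool should_sync(fsync_policy policy, long long now_ms, long long last_sync_ms) {
    assert(now_ms >= last_sync_ms);
    switch (policy) {
    case FSYNC_ALWAYS:
        return true;                          /* sync on every flush */
    case FSYNC_EVERYSEC:
        return now_ms - last_sync_ms > 1000;  /* more than 1s since last sync */
    case FSYNC_NO:
        return false;                         /* leave it to the OS */
    }
    return false;
}
```

In all three modes the write into os_buffer still happens; only the synchronization step is conditional.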

2.3 AOF rewrite

Because AOF persistence records database state by saving the write commands that are executed, the AOF file keeps growing as the server runs. Left unchecked, an oversized AOF file can affect the Redis server or even the entire host, and the larger the file, the longer it takes to restore data from it. Moreover, loading data does not require the historical states: the AOF only needs the commands that reproduce the latest state.

To solve the problem of bloated AOF files, Redis provides AOF file rewriting. With this feature, the Redis server can create a new AOF file to replace the existing AOF file. The old and new AOF files hold the same database state, but the new AOF file does not contain any redundant commands that waste space.

2.3.1 Implementation of AOF rewrite

AOF file rewrite does not require any reading, parsing, or writing of existing AOF files. This function is implemented by reading the server’s current database state.

Consider the case where the server executes the following commands on the list key:

redis> RPUSH list "A" "B"            // ["A", "B"]
(integer) 2
redis> RPUSH list "C"                // ["A", "B", "C"]
(integer) 3
redis> RPUSH list "D" "E"            // ["A", "B", "C", "D", "E"]
(integer) 5
redis> LPOP list                     // ["B", "C", "D", "E"]
"A"
redis> LPOP list                     // ["C", "D", "E"]
"B"
redis> RPUSH list "F" "G"            // ["C", "D", "E", "F", "G"]
(integer) 5

To record the current state of the list key, the server then has to write all six commands to the AOF file.

If the server wants to record the state of the list key with as few commands as possible, the simplest and most efficient approach is not to read and analyze the existing AOF file but to read the value of the list key directly from the database, and then replace the six commands stored in the AOF file with a single RPUSH list "C" "D" "E" "F" "G" command. This reduces the number of commands needed to save the list key from six to one.

This is the principle behind AOF rewriting: read the current value of each key from the database, then record the key-value pair with a single command in place of the multiple commands that previously built it.
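For the list example, the rewrite step might look like the following sketch. rewrite_list is a hypothetical helper; it emits the command in human-readable form rather than the on-disk format, and it assumes a non-empty list whose items contain no quote characters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a single RPUSH command that recreates the list's current contents.
 * Writes at most outsz bytes (including the terminator) into out.
 * Returns 0 on success, -1 if out is too small. */
static int rewrite_list(const char *key, const char **items, size_t count,
                        char *out, size_t outsz) {
    assert(key != NULL && items != NULL && out != NULL && count > 0);
    int n = snprintf(out, outsz, "RPUSH %s", key);
    if (n < 0 || (size_t)n >= outsz) return -1;
    size_t used = (size_t)n;
    for (size_t i = 0; i < count; i++) {
        /* Each current element becomes one quoted argument. */
        n = snprintf(out + used, outsz - used, " \"%s\"", items[i]);
        if (n < 0 || (size_t)n >= outsz - used) return -1;
        used += (size_t)n;
    }
    return 0;
}
```

The function never looks at the command history: its only input is the list as it currently exists in the database.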

2.3.2 AOF background rewrite

Because the Redis server processes command requests in a single thread, and an AOF rewrite performs a large amount of writing, calling the rewrite directly would block the server thread for a long time, leaving it unable to process client requests during the rewrite.

To keep the rewrite from blocking request handling, Redis performs the AOF rewrite in a child process. This serves two purposes at once:

  1. The server process can continue processing command requests while the child process does the AOF rewrite
  2. The child process works on a copy of the server process's data; using a child process rather than a thread keeps the data safe without any locking

However, during the AOF rewrite using the child process, the server process continues to process command requests, and the new command may modify the existing database state so that the current database state of the server is inconsistent with the database state stored in the rewritten AOF file.

To solve this problem, the Redis server sets up an AOF rewrite buffer, which comes into use once the child process has been created. From then on, whenever Redis executes a write command, it sends the command to both the AOF buffer and the AOF rewrite buffer.

This ensures that:

  1. The AOF buffer is still written and synchronized to the existing AOF file as usual, so introducing the rewrite buffer does not affect the handling of the existing AOF file
  2. From the time the child process is created, all write commands executed by the server are logged into the AOF rewrite buffer

When the child completes the AOF rewrite, it sends a signal to the parent. On receiving the signal, the parent calls a signal handler; while the handler runs, the parent does not process client command requests, and the handler does the following:

  1. Write all the contents of the AOF rewrite buffer to the new AOF file, after which the database state stored in the new AOF file matches the server's current database state
  2. Rename the new AOF file so that it atomically overwrites the existing AOF file, completing the replacement of the old file by the new one

After this signal handler completes, the parent process can continue receiving client command requests.
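The bookkeeping around the two buffers can be sketched as follows. Everything here is an assumption made for illustration: server_state, feed_write_command and finish_rewrite are hypothetical names, the buffers are fixed-size strings, and the child process that produces the new file is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, fixed-size stand-in for the relevant server state. */
typedef struct {
    char aof_buf[256];        /* pending writes for the existing AOF file */
    char rewrite_buf[256];    /* writes made while a rewrite is running   */
    bool rewrite_in_progress; /* true from child creation until the swap  */
} server_state;

/* Append cmd plus a newline to buf. Returns -1 if it does not fit. */
static int append_line(char *buf, size_t bufsz, const char *cmd) {
    size_t used = strlen(buf);
    int n = snprintf(buf + used, bufsz - used, "%s\n", cmd);
    return (n < 0 || (size_t)n >= bufsz - used) ? -1 : 0;
}

/* Called for every executed write command: always feed the AOF buffer,
 * and also feed the rewrite buffer while a rewrite is running. */
static int feed_write_command(server_state *s, const char *cmd) {
    assert(s != NULL && cmd != NULL);
    if (append_line(s->aof_buf, sizeof s->aof_buf, cmd) != 0) return -1;
    if (s->rewrite_in_progress)
        return append_line(s->rewrite_buf, sizeof s->rewrite_buf, cmd);
    return 0;
}

/* What the parent does once the child reports completion: append the
 * rewrite buffer to the new file, then atomically replace the old file. */
static int finish_rewrite(server_state *s, const char *new_path, const char *aof_path) {
    assert(s != NULL && new_path != NULL && aof_path != NULL);
    FILE *fp = fopen(new_path, "a");
    if (fp == NULL) return -1;
    int ok = fputs(s->rewrite_buf, fp) >= 0;
    if (fclose(fp) != 0) ok = 0;
    if (!ok || rename(new_path, aof_path) != 0) return -1;
    s->rewrite_buf[0] = '\0';
    s->rewrite_in_progress = false;
    return 0;
}
```

The rename step is what makes the swap atomic on POSIX systems: readers see either the old file or the complete new one, never a partial mix.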

3 Reference reading

  1. Redis Design and Implementation, second edition, Part 2, by Huang Jianhong