Hello everyone. Today's topic is Redis persistence, so grab a notebook and take notes.

Preface

As you know, Redis is an in-memory database: data is stored in memory, which is one of the reasons Redis is so fast. The price of that speed is that data kept only in memory is easy to lose. If the server shuts down or crashes, the data in memory is gone. To solve this problem, Redis provides two persistence mechanisms: RDB and AOF.

RDB

What is RDB persistence?

RDB persistence can generate point-in-time snapshots of data sets at specified intervals.

Advantages of RDB?

  • An RDB file is a compact single file that represents the Redis data at a point in time, which makes it well suited for backups. For example, you might archive an RDB file every hour for the latest 24 hours and keep a daily RDB snapshot for 30 days. This lets you easily restore different versions of the dataset in case of disaster.
  • RDB is ideal for disaster recovery, since it is a compact single file that can be transferred to a remote data center.
  • RDB maximizes Redis performance, because the only work the Redis parent process needs to do in order to persist is fork a child that does all the rest. The parent process never performs disk I/O or similar operations.
  • RDB allows faster restarts with large datasets than AOF.

Disadvantages of RDB?

  • RDB is not a good fit if you need to minimize data loss when Redis stops working (for example, after a power outage). You can configure different save points at which an RDB file is written (for example, after at least 5 minutes and 100 writes to the dataset, and you can have multiple save points). However, you will usually create an RDB snapshot every 5 minutes or more, so if Redis stops working for any reason without a proper shutdown, you should be prepared to lose the last few minutes of data.
  • RDB needs to fork() often in order to persist to disk using a child process. fork() can be time-consuming if the dataset is large, and Redis may stop serving clients for some milliseconds, or even for a second, if the dataset is very large and CPU performance is weak. AOF also needs to fork(), but less often, and you can tune how often the log is rewritten without any trade-off on durability.

RDB file creation and loading

Two Redis commands can be used to generate RDB files: SAVE and BGSAVE. The SAVE command blocks the Redis server process until the RDB file has been created, and while it is blocked the server cannot process any command requests.

SAVE // Wait until the RDB file is created

OK

Unlike SAVE, which blocks the server process directly, BGSAVE forks a child process that creates the RDB file while the server process (the parent) continues to handle command requests. The operating system (on Unix-like systems) uses copy-on-write when forking: at the moment of the fork, the parent and child share the same memory. When the parent process needs to change a piece of memory (for example, when executing a write command), the operating system first makes a copy of that data so that the child process is not affected. The new RDB file therefore contains the in-memory data as it was at the moment of the fork.

BGSAVE // Forks a child process, which creates the RDB file

Background saving started
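The fork-and-snapshot idea can be tried out in a few lines of Python on a Unix-like system. This is only a sketch of the principle, not how Redis itself is written: the function name bgsave, the JSON output format, and the ".tmp" suffix are all assumptions made for the illustration.

```python
import json
import os


def bgsave(db, path):
    """Fork a child that writes `db` as it was at fork time; return the child's pid."""
    pid = os.fork()  # Unix-only
    if pid == 0:
        # Child: sees memory as of the fork, unaffected by later parent writes.
        status = 1
        try:
            tmp = path + ".tmp"
            with open(tmp, "w") as f:
                json.dump(db, f)  # JSON stands in for the RDB encoding
            os.replace(tmp, path)  # atomic rename: no partial file is visible
            status = 0
        finally:
            os._exit(status)  # exit without running the parent's cleanup code
    # Parent: returns immediately and keeps serving requests.
    return pid
```

If the parent changes db right after calling bgsave, the file written by the child still holds the values from the moment of the fork. The parent should eventually call os.waitpid(pid, 0) to collect the child's exit status.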

RDB files can be generated in two ways: manually, as described above, or automatically. The automatic process works as follows. Redis lets users have the server run BGSAVE automatically by setting the save option in the server configuration. Multiple save conditions can be set with save directives in the SNAPSHOTTING section of the redis.conf configuration file, and the server executes BGSAVE as soon as any one of them is met. For example:

save 900 1
save 300 10
save 60 10000

With this configuration, BGSAVE is executed when any of the following conditions is met:

  • The server has made at least one change to the database within 900 seconds.
  • The server made at least 10 changes to the database in 300 seconds.
  • The server made at least 10,000 changes to the database in 60 seconds.

If the save option is not configured manually, the server uses default save conditions. In the Redis versions this article is based on, they are:

save 900 1
save 300 10
save 60 10000

The server then sets the saveparams attribute of the server state structure, redisServer, according to the save option:

struct redisServer {
    // ...
    // Array of save conditions
    struct saveparam *saveparams;
    // ...
};

The saveparams attribute is an array. Each element is a saveparam structure, and each saveparam structure holds one save condition set by the save option:

struct saveparam {
    // Number of seconds
    time_t seconds;
    // Number of changes
    int changes;
};

In addition to the saveparams array, the server state maintains a dirty counter and a lastsave attribute:

struct redisServer {
    // ...
    // Counter of changes since the last save
    long long dirty;
    // Time of the last save
    time_t lastsave;
    // ...
};

The dirty counter records how many changes (including writes, deletes, updates, and so on) the server has made to the database state (all databases on the server) since the last SAVE or BGSAVE command was successfully executed.

The lastsave attribute is a UNIX timestamp that records the last time the server successfully executed the SAVE or BGSAVE command.

Checking whether the save conditions are met

The Redis server's periodic function serverCron runs every 100 milliseconds by default. It maintains the running server, and one of its jobs is to check whether any save condition set by the save option has been met and, if so, to execute BGSAVE.

The program iterates over all the save conditions in the saveparams array, and the server executes BGSAVE as soon as any one of them is met. The background save itself is started by the rdbSaveBackground function.
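The check described above can be sketched in Python as follows. The class and function names (SaveParam, ServerState, should_bgsave, mark_saved) are made up for this illustration; only the fields seconds, changes, dirty, and lastsave come from the structures shown earlier.

```python
import time
from dataclasses import dataclass, field


@dataclass
class SaveParam:
    seconds: int  # length of the time window
    changes: int  # minimum number of changes required


@dataclass
class ServerState:
    saveparams: list
    dirty: int = 0  # changes since the last successful save
    lastsave: float = field(default_factory=time.time)


def should_bgsave(server, now):
    """Return True if at least one save condition is satisfied."""
    elapsed = now - server.lastsave
    for sp in server.saveparams:
        if server.dirty >= sp.changes and elapsed > sp.seconds:
            return True
    return False


def mark_saved(server, now):
    """Reset the bookkeeping after a successful save."""
    server.dirty = 0
    server.lastsave = now
```

With the three conditions from the example configuration, a server that has 5 unsaved changes triggers a save only once more than 900 seconds have passed since lastsave, while 10 unsaved changes are enough after 300 seconds.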

RDB file structure

The following figure shows the various parts of a complete RDB file. In order, they are: REDIS, db_version, databases, EOF, and check_sum.

An RDB file begins with the REDIS part, which is 5 bytes long and holds the five characters “REDIS”. With these five characters, the program can quickly check whether a file being loaded is an RDB file.

The db_version part is 4 bytes long. Its value is an integer written as a string that records the version number of the RDB file. For example, “0006” means the RDB file is version 6.

The databases part contains zero or more databases, along with the key-value pairs in each database:

If the server’s database state is empty (all databases are empty), then this part is also empty and has a length of 0 bytes.

If the server’s database state is non-empty (at least one database is non-empty), then this section is also non-empty, and the length of this section varies depending on the number, type, and content of key/value pairs held by the database.

The “EOF” constant is 1 byte long. This constant marks the end of the body of the RDB file. When the reader encounters this value, it knows that all key-value pairs for all databases have been loaded.

check_sum is an 8-byte unsigned integer holding a checksum computed over the REDIS, db_version, databases, and EOF parts. When loading an RDB file, the server compares the checksum computed from the loaded data with the check_sum recorded in the file to detect errors or corruption. For example, an RDB file containing database 0 and database 3 starts with “REDIS”, which identifies it as an RDB file, followed by “0006”, which indicates RDB version 6, then the two databases, then the EOF marker, and finally check_sum.
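To make the layout concrete, here is a toy Python sketch that builds and validates a file with the same five parts. It is not a real RDB encoder: the databases part is treated as opaque bytes, the EOF marker is assumed to be the single byte 0xFF, and the checksum is a stand-in (CRC-32 widened to 8 bytes) rather than the algorithm Redis actually uses. The function names are invented for the example.

```python
import struct
import zlib

MAGIC = b"REDIS"
EOF_MARKER = b"\xff"  # assumed 1-byte end-of-body marker


def toy_checksum(data):
    # Stand-in only: CRC-32 stored in 8 bytes, NOT the checksum Redis uses.
    return struct.pack("<Q", zlib.crc32(data))


def build_file(version, databases):
    """Assemble REDIS | db_version | databases | EOF | check_sum."""
    body = MAGIC + b"%04d" % version + databases + EOF_MARKER
    return body + toy_checksum(body)


def check_file(blob):
    """Validate magic, EOF marker, and checksum; return the version number."""
    if len(blob) < 18 or blob[:5] != MAGIC:
        raise ValueError("not an RDB-style file")
    version = int(blob[5:9])
    body, stored = blob[:-8], blob[-8:]
    if body[-1:] != EOF_MARKER:
        raise ValueError("missing EOF marker")
    if toy_checksum(body) != stored:
        raise ValueError("checksum mismatch")
    return version
```

An empty databases part gives the smallest possible file in this scheme: 5 + 4 + 0 + 1 + 8 = 18 bytes.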

AOF persistence

What is AOF persistence?

AOF persistence records every write operation the server receives. When the server restarts, these commands are executed again to rebuild the original dataset. Each write operation is appended to the end of the file in the Redis protocol format. Redis can also rewrite the AOF file in the background so that it does not grow too large.

What are the advantages of AOF?

  • Using AOF makes Redis much more durable. You can choose among different fsync policies: no fsync, fsync every second, or fsync on every write. With the default policy of fsync every second, Redis still performs well (fsync is handled by a background thread, and the main thread tries hard to keep serving client requests), and you lose at most about one second of data in the event of a failure.
  • The AOF file is an append-only log, so writing it involves no seeks. Even if the log ends with a half-written command for some reason (the disk is full, the server goes down mid-write, and so on), the redis-check-aof tool can fix it.
  • Redis can automatically rewrite the AOF in the background when the file becomes too large. The rewritten AOF contains the minimum set of commands needed to rebuild the current dataset. The rewrite is safe because Redis keeps appending commands to the existing AOF file while it creates the new one, so the existing file is not lost even if an outage occurs during the rewrite. Once the new AOF file is ready, Redis switches from the old file to the new one and starts appending to it.
  • An AOF file stores all writes to the database in order, in the Redis protocol format, so its contents are easy to read and parse. Exporting an AOF file is also simple. For example, if you accidentally run FLUSHALL, then as long as the AOF file has not been rewritten in the meantime, you can stop the server, remove the FLUSHALL command at the end of the AOF file, and restart Redis to restore the dataset to its state before FLUSHALL.

Disadvantages of AOF?

  • AOF files are usually larger than RDB files for the same data set.
  • Depending on the fsync policy used, AOF may be slower than RDB. With fsync every second, performance is still very high under normal conditions, and with fsync disabled AOF should be as fast as RDB even under high load. Still, RDB can provide stronger guarantees on maximum latency under heavy write load.

Implementation of AOF persistence

The implementation of AOF persistence can be divided into three steps: command append, file write, and file sync.

Command append

When AOF persistence is enabled, after executing a write command the server appends that command, in protocol format, to the end of the aof_buf buffer in the server state:

struct redisServer {
    // ...
    // AOF buffer
    sds aof_buf;
    // ...
};

If the client sends the following command to the server:

 set KEY VALUE

OK

After executing the SET command, the server appends the following protocol content to the end of the aof_buf buffer:

*3\r\n$3\r\nSET\r\n$3\r\nKEY\r\n$5\r\nVALUE\r\n
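The protocol content follows a simple pattern: "*" plus the number of arguments, then for each argument "$" plus its length in bytes, followed by the argument itself, with every line ending in \r\n. A small Python helper (the name encode_command is an assumption for this example) reproduces the bytes shown above:

```python
def encode_command(*args):
    """Encode a command as an array of bulk strings in the Redis protocol."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        # Accept bytes as-is; convert anything else through its string form.
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)
```

Calling encode_command("SET", "KEY", "VALUE") yields exactly the sequence appended to aof_buf in the example.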

AOF file writing and synchronization

The Redis server process is an event loop. File events in the loop receive command requests from clients and send replies, while time events run functions that need to execute periodically, such as serverCron. Because the server may execute write commands while handling file events, which appends content to the aof_buf buffer, it calls the flushAppendOnlyFile function before the end of each iteration of the event loop to decide whether the contents of aof_buf need to be written and synced to the AOF file. This process can be represented by the following pseudocode:

def eventLoop():
    while True:
        # Handle file events: receive command requests and send replies.
        # New content may be appended to aof_buf while handling them.
        processFileEvents()
        # Handle time events
        processTimeEvents()
        # Decide whether to write and sync the contents of aof_buf to the AOF file
        flushAppendOnlyFile()

The behavior of the flushAppendOnlyFile function is determined by the value of the appendfsync option configured on the server.

If the user does not set appendfsync explicitly, its default value is everysec. Two terms are worth distinguishing here. Write: copy the data in aof_buf to the AOF file. Sync: call fsync or fdatasync to force the data in the AOF file onto disk. In plain terms, writing puts the data into the file, where it may still sit in an operating system cache, whereas syncing makes sure the data actually reaches the disk. The appendfsync option controls when the sync happens: on every write (always), once per second (everysec), or never explicitly, leaving it to the operating system (no). With the default everysec policy, a sync is performed once per second, so AOF persistence loses at most about one second of data.
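The difference between writing and syncing, and the effect of the three appendfsync values, can be sketched in Python. This is a simplified model: the AofWriter class and its fields are invented for the example, the sync is done inline rather than in a background thread, and the injectable clock exists only to make the behavior easy to observe.

```python
import os
import time


class AofWriter:
    def __init__(self, path, appendfsync="everysec", clock=time.monotonic):
        if appendfsync not in ("always", "everysec", "no"):
            raise ValueError("unknown appendfsync value")
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.policy = appendfsync
        self.clock = clock
        self.buf = bytearray()  # plays the role of aof_buf
        self.last_fsync = clock()
        self.fsync_calls = 0

    def append(self, data):
        self.buf += data

    def flush(self):
        """Write the buffer to the file, then sync according to the policy."""
        if self.buf:
            # Write: hand the data to the OS (it may still be cached).
            os.write(self.fd, bytes(self.buf))
            self.buf.clear()
        now = self.clock()
        if self.policy == "always" or (
            self.policy == "everysec" and now - self.last_fsync >= 1.0
        ):
            # Sync: force the data onto disk.
            os.fsync(self.fd)
            self.last_fsync = now
            self.fsync_calls += 1
        # Policy "no": never sync explicitly; the OS decides when.

    def close(self):
        os.close(self.fd)
```

With "everysec", anything written since the last sync is what can be lost in a crash, which is where the bound of about one second comes from.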

AOF file loading and data restoration

Because the AOF file contains all the write commands needed to restore the database state, the server simply reads and re-executes the write commands saved in the AOF file to restore the database state before the server was shut down. Redis reads the AOF file and restores the database state as follows:

  1. Create a fake client with no network connection. Redis commands can only be executed in the context of a client, and the commands used when loading an AOF file come from the file rather than from a network connection, so the server uses a fake client without a network connection to execute the write commands saved in the AOF file. A command executed by the fake client has exactly the same effect as one executed by a client with a network connection.
  2. Parse and read one write command from the AOF file.
  3. Use the fake client to execute the write command that was just read.
  4. Repeat steps 2 and 3 until all write commands in the AOF file have been processed.

When the above steps are complete, the database state saved in the AOF file has been restored in its entirety, as shown in the figure below.
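The loading steps above can be sketched in Python. This is only an illustration: the names read_command, FakeClient, and load_aof are invented, the database is a plain dict, and only SET and DEL are supported.

```python
def read_command(stream):
    """Read one command (an array of bulk strings); return None at end of file."""
    header = stream.readline()
    if not header:
        return None
    if not header.startswith(b"*"):
        raise ValueError("expected array header")
    argc = int(header[1:].strip())
    args = []
    for _ in range(argc):
        length_line = stream.readline()
        if not length_line.startswith(b"$"):
            raise ValueError("expected bulk string header")
        length = int(length_line[1:].strip())
        data = stream.read(length + 2)  # payload plus trailing \r\n
        args.append(data[:length])
    return args


class FakeClient:
    """A client with no network connection; executes commands against a dict."""

    def __init__(self, db):
        self.db = db

    def execute(self, args):
        name = args[0].upper()
        if name == b"SET":
            self.db[args[1]] = args[2]
        elif name == b"DEL":
            for key in args[1:]:
                self.db.pop(key, None)
        else:
            raise ValueError("unsupported command in this sketch")


def load_aof(stream):
    db = {}
    client = FakeClient(db)  # step 1
    while True:
        args = read_command(stream)  # step 2
        if args is None:
            break
        client.execute(args)  # step 3, repeated until the file ends (step 4)
    return db
```

Passing a binary file object opened on an AOF-style file (or an io.BytesIO holding the same bytes) returns the rebuilt key-value state.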

AOF rewrite

Because AOF persistence records the database state by saving the write commands that were executed, the AOF file keeps growing as the server runs. Left unchecked, a very large AOF file can affect the Redis server and even the whole host, and the larger the file, the longer it takes to restore data from it. For example, suppose a client runs the following commands:

 rpush list "A" "B"

(integer) 2

 rpush list "C"

(integer) 3

 rpush list "D"

(integer) 4

 rpush list "E" "F"

(integer) 6

So just to record the state of the list key, the AOF file needs to hold four commands. In real use, write commands are executed far more often than in this simple example, and the resulting problem is much more serious. To solve the problem of bloated AOF files, Redis provides the AOF rewrite feature. With it, the Redis server can create a new AOF file to replace the existing one. The old and new AOF files hold the same database state, but the new file contains no redundant commands that waste space, so it is usually much smaller than the old one. In the following sections, we'll look at how AOF rewrite works, as well as how the BGREWRITEAOF command works.

Although Redis calls the replacement of the old AOF file with a new one “AOF file rewrite”, the rewrite does not actually read, analyze, or write the existing AOF file at all. It works by reading the server's current database state. In the example above, the server can simply replace the four commands with one:

 rpush list "A" "B" "C" "D" "E" "F"

Besides the list key shown above, all other types of keys can be handled in the same way to reduce the number of commands in the AOF file. First read the current value of the key from the database, then record the key-value pair with a single command instead of the multiple commands that were recorded before. This is the principle behind the AOF rewrite feature. In practice, to avoid overflowing the client input buffer when the commands are executed, the rewrite program checks the number of elements in a key before processing lists, hash tables, sets, and sorted sets, all of which may contain many elements. If the number of elements exceeds the value of the redis.h/REDIS_AOF_REWRITE_ITEMS_PER_CMD constant, the rewrite program uses multiple commands to record the value of the key instead of just one. In the version described here, the value of REDIS_AOF_REWRITE_ITEMS_PER_CMD is 64, which means that if a set key contains more than 64 elements, the rewrite uses multiple SADD commands to record the set, each carrying at most 64 elements.
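A minimal Python sketch of this principle is shown below. It assumes the database is a dict whose values are strings, lists, or sets; the function names and the ITEMS_PER_CMD constant are made up for the example, with the constant mirroring the value 64 mentioned in the text.

```python
ITEMS_PER_CMD = 64  # mirrors REDIS_AOF_REWRITE_ITEMS_PER_CMD from the text


def chunks(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def rewrite_commands(db):
    """Yield commands that rebuild `db`, derived from current values only."""
    for key, value in db.items():
        if isinstance(value, str):
            yield ["SET", key, value]
        elif isinstance(value, list):
            for part in chunks(value, ITEMS_PER_CMD):
                yield ["RPUSH", key, *part]
        elif isinstance(value, set):
            # Sorting only makes the output deterministic for this example.
            for part in chunks(sorted(value), ITEMS_PER_CMD):
                yield ["SADD", key, *part]
        else:
            raise TypeError("type not covered by this sketch")
```

For the six-element list from the example this produces a single RPUSH command, while a set with 130 members is split into three SADD commands carrying 64, 64, and 2 members.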

AOF background rewrite

An AOF rewrite performs a large amount of writing, which would block the thread that runs it, so Redis has a child process carry out the rewrite. This serves two purposes:

  • The server process (the parent) can continue to handle command requests while the child process performs the AOF rewrite.
  • The child process has its own copy of the server process's data, so using a child process rather than a thread keeps the data safe without using locks.

There is one problem, however: while the child process is rewriting, the parent process keeps handling commands that change the data, so the current database state can diverge from the state captured in the rewritten AOF file. To address this inconsistency, the Redis server sets up an AOF rewrite buffer, which comes into use once the server has created the child process. From then on, whenever the Redis server executes a write command, it sends that command to both the AOF buffer and the AOF rewrite buffer, as shown in the diagram below:

This means that the server process needs to do the following three things during AOF rewrite by the child process:

  1. Execute the commands sent by clients.
  2. Append each executed write command to the AOF buffer.
  3. Append each executed write command to the AOF rewrite buffer.

This ensures that:

The contents of the AOF buffer are periodically written and synchronized to AOF files, and processing of existing AOF files will proceed as usual.

All write commands executed by the server from the time the child process is created are recorded in the AOF rewrite buffer.

When the child completes the AOF rewrite, it sends a signal to the parent. Upon receiving the signal, the parent calls a signal handler and performs the following:

Write all the contents of the AOF rewrite buffer to the new AOF file, and the state of the database stored in the new AOF file will be the same as the current database state of the server.

Rename the new AOF file, overwrite the existing AOF file atomically, and complete the replacement of the old and new AOF files.

Once the signal handler completes, the parent process resumes accepting command requests as usual. Throughout the AOF background rewrite, the server process (the parent) is blocked only while the signal handler runs. At all other times the rewrite does not block it, which keeps the impact of AOF rewriting on server performance to a minimum.
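The interplay of the two buffers can be simulated in a single Python process. This is a simplified model, not Redis code: the Server class is invented, the fork is replaced by copying the dict, the AOF is a list of tuples, and only SET is supported.

```python
class Server:
    def __init__(self):
        self.db = {}
        self.aof = []  # existing AOF file, modeled as a list of commands
        self.rewrite_buf = None  # AOF rewrite buffer; None when no rewrite runs
        self.snapshot = None

    def execute(self, key, value):
        self.db[key] = value
        cmd = ("SET", key, value)
        self.aof.append(cmd)  # normal AOF path continues as usual
        if self.rewrite_buf is not None:
            self.rewrite_buf.append(cmd)  # also record it for the new file

    def start_rewrite(self):
        # Stand-in for fork(): the "child" works on a frozen copy of the data.
        self.snapshot = dict(self.db)
        self.rewrite_buf = []

    def finish_rewrite(self):
        # Child's output: one command per key, taken from the snapshot.
        new_aof = [("SET", k, v) for k, v in self.snapshot.items()]
        # Parent's part: append the buffered writes, then swap the files.
        new_aof.extend(self.rewrite_buf)
        self.aof = new_aof
        self.rewrite_buf = None
        self.snapshot = None


def replay(aof):
    """Rebuild a database by executing every command in the AOF in order."""
    db = {}
    for _, key, value in aof:
        db[key] = value
    return db
```

Writes that arrive between start_rewrite and finish_rewrite end up at the tail of the new file, so replaying the new AOF gives the same state as the live database.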

Redis hybrid persistence

Redis can also use AOF and RDB persistence at the same time. In that case, when Redis restarts it prefers the AOF file for restoring the dataset, because the AOF file usually holds a more complete dataset than the RDB file. However, restoring from AOF is slow, so Redis 4.0 introduced “hybrid persistence”.

Hybrid persistence stores the contents of the RDB file together with an incremental AOF log. The AOF log here is no longer the full log, only the incremental part recorded between the start and the end of the persistence operation, which is usually small.

As a result, when Redis restarts it can load the RDB content first and then replay the incremental AOF log, which replaces the earlier replay of the full AOF file and greatly improves restart efficiency.

Well, that's the end of today's article. I hope it helps those of you in front of the screen who were confused.
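The load order for hybrid persistence can be sketched in Python as follows. The file format here is purely hypothetical: a JSON line stands in for the RDB part and one JSON line per command stands in for the incremental AOF log. The function names are also invented for the example.

```python
import json


def write_hybrid(snapshot_db, incremental_cmds):
    """Serialize a snapshot followed by the commands logged after it."""
    head = json.dumps({"snapshot": snapshot_db})  # stand-in for the RDB part
    tail = [json.dumps(cmd) for cmd in incremental_cmds]  # incremental log
    return "\n".join([head, *tail]) + "\n"


def load_hybrid(text):
    lines = text.splitlines()
    # 1. Load the snapshot part in one go.
    db = json.loads(lines[0])["snapshot"]
    # 2. Replay only the commands recorded after the snapshot.
    for line in lines[1:]:
        name, key, *rest = json.loads(line)
        if name == "SET":
            db[key] = rest[0]
        elif name == "DEL":
            db.pop(key, None)
        else:
            raise ValueError("unsupported command in this sketch")
    return db
```

Because the bulk of the data comes from the snapshot, only the short tail of commands has to be replayed one by one.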