All Redis data and state live in memory. To avoid losing them when the process exits, they need to be saved to disk.

There are two ways to do this:

  1. Treat Redis as a state machine and record every operation applied to it, that is, every state transition. To restore, replay the recorded operations from the initial state. This method is called logical backup
  2. Save the complete state of Redis and load it back when needed. This method is called physical backup

Redis implements both types of persistence, as AOF and RDB respectively

AOF

AOF records database status by saving write commands executed by the Redis server.

AOF configuration

Example configuration file in the Redis source code: redis.conf

# Example of AOF configuration
# https://github.com/redis/redis/blob/48e24d54b736b162617112ce27ec724b9290592e/redis.conf#L489

# Important parameters:
# Whether to enable AOF. If AOF is enabled, it is used preferentially and RDB is skipped during subsequent database recovery
appendonly yes
# fsync policy: always, everysec, or no
appendfsync everysec
# AOF file name
appendfilename appendonly.aof

AOF logs are written only after the command is executed

AOF is a post-write log. In contrast to a write-ahead log (WAL), an AOF entry is recorded only after the write command has been executed. The reason is that, for performance, AOF records commands without checking their syntax first, so logging after execution ensures that only commands that actually ran are recorded. Post-write logging has two advantages:

  1. The commands recorded in the log are known to be valid
  2. A command is logged only after it is executed, so logging does not block the current write operation

Risks:

  1. If the system crashes after a command is executed but before its log entry reaches the disk, the command and its data may be lost
  2. Logging does not block the current command, but because the log is written by the main thread, a slow disk write may block the next command
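The execute-then-log order can be sketched with a toy store. This is only an illustration, not Redis code: the toy_store type, the toy_execute function, and the INCR/DECR-only command set are assumptions made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy store: one counter plus a text buffer standing in for the AOF. */
typedef struct {
    long counter;
    char log[256];
    size_t log_len;
} toy_store;

/* Execute first; append to the log only if execution succeeded.
 * Returns 0 on success, -1 if the command is rejected. */
int toy_execute(toy_store *s, const char *cmd) {
    if (strcmp(cmd, "INCR") == 0) {
        s->counter++;
    } else if (strcmp(cmd, "DECR") == 0) {
        s->counter--;
    } else {
        return -1; /* invalid command: rejected, so it never reaches the log */
    }
    int n = snprintf(s->log + s->log_len, sizeof(s->log) - s->log_len, "%s\n", cmd);
    assert(n > 0 && (size_t)n < sizeof(s->log) - s->log_len);
    s->log_len += (size_t)n;
    return 0;
}
```

Because logging follows execution, the log never needs its own validation pass. The cost is the window between the two steps, which is the first risk listed above.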

AOF persistence steps

  1. After executing a command, the server appends it to the end of the sds aof_buf buffer of struct redisServer
  2. At the end of each event loop iteration (the loop that handles client requests), the Redis process calls void flushAppendOnlyFile to check whether the commands in the buffer should be written to the AOF file

Rules for deciding when to write the AOF file

In flushAppendOnlyFile, the appendfsync parameter in the configuration file determines how the AOF file is written. Writing commands from aof_buf to the AOF file is a two-step process:

  1. Call the OS write function to save the commands in aof_buf to the OS memory buffer
  2. The OS writes the memory buffer to disk

If only the first step has been performed, then from Redis's point of view the data has been written to the file, but it has not actually reached the disk. If the machine stops at this point, the data is still lost. The fsync and fdatasync functions provided by the OS can be used to force the buffered data to disk

appendfsync option | flushAppendOnlyFile behavior
always | Writes the contents of the aof_buf buffer to the memory buffer and always synchronizes to the AOF file
everysec | Writes the contents of the aof_buf buffer to the memory buffer and synchronizes to the AOF file if more than one second has passed since the last synchronization
no | Writes only to the memory buffer, with the OS later deciding when to synchronize to the AOF file
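The two steps and the three policies can be sketched with plain POSIX calls. This is a simplified illustration, not the Redis implementation: the toy_aof type, toy_open, toy_flush, and the fsync_calls counter are hypothetical names, and the fsync here runs synchronously, whereas Redis hands the everysec fsync to a background thread.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

typedef enum { SYNC_NO, SYNC_EVERYSEC, SYNC_ALWAYS } sync_policy;

/* Hypothetical state for the sketch. */
typedef struct {
    int fd;
    sync_policy policy;
    time_t last_fsync;
    int fsync_calls; /* counted only so the behavior can be observed */
} toy_aof;

int toy_open(toy_aof *a, const char *path, sync_policy policy) {
    assert(a != NULL);
    a->fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    a->policy = policy;
    a->last_fsync = 0;
    a->fsync_calls = 0;
    return a->fd < 0 ? -1 : 0;
}

/* Step 1: write() hands the bytes to the OS buffer.
 * Step 2: fsync() forces them to disk, depending on the policy.
 * Returns the number of bytes written, or -1 on error. */
ssize_t toy_flush(toy_aof *a, const char *buf, size_t len, time_t now) {
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(a->fd, buf + done, len - done);
        if (n < 0) return -1;
        done += (size_t)n;
    }
    if (a->policy == SYNC_ALWAYS ||
        (a->policy == SYNC_EVERYSEC && now > a->last_fsync)) {
        if (fsync(a->fd) != 0) return -1;
        a->last_fsync = now;
        a->fsync_calls++;
    }
    return (ssize_t)done;
}
```

With SYNC_EVERYSEC, two flushes within the same second trigger only one fsync, so at most about one second of writes is at risk. With SYNC_NO the function never calls fsync and the OS decides when the data reaches the disk.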

The decision logic is as follows:

void flushAppendOnlyFile(int force) {
    ssize_t nwritten;
    ...
    // Call write to write the AOF file
    nwritten = write(server.aof_fd,server.aof_buf,sdslen(server.aof_buf));
    ...
    // After a successful write
    server.aof_current_size += nwritten;
    ...
    // Decide whether to synchronize to the AOF file according to the appendfsync setting
    if (server.aof_fsync == AOF_FSYNC_ALWAYS) {
        ...
        // aof_fsync is used to force synchronization because aof_fsync is defined as fsync
        // Location in config.h: https://github.com/redis/redis/blob/48e24d54b736b162617112ce27ec724b9290592e/src/config.h#L89
        aof_fsync(server.aof_fd);
        ...
        // Record the time for the next synchronization condition check
        server.aof_last_fsync = server.unixtime;
    } else if ((server.aof_fsync == AOF_FSYNC_EVERYSEC &&
                server.unixtime > server.aof_last_fsync)) {
        // Execute in the background in another thread
        if (!sync_in_progress) aof_background_fsync(server.aof_fd);
        server.aof_last_fsync = server.unixtime;
    }
}

Loading the AOF file

  1. Redis creates a fake client with no network connection
  2. Commands are read from the AOF file and handed to the fake client for execution. This process is similar to a normal Redis client reading commands from the network and then executing them
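The reading half of this process can be sketched as a small parser. The sketch assumes an AOF file without an RDB preamble, in which each command is stored in the Redis protocol request format (an argument count prefixed with *, then each argument prefixed with $ and its length). The toy_cmd type, the toy_read_command function, and the size limits are hypothetical, and the parser treats the input as a NUL-terminated string, so unlike Redis it is not binary safe.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ARGS 8
#define MAX_ARG_LEN 64

/* Hypothetical container for one parsed command. */
typedef struct {
    int argc;
    char argv[MAX_ARGS][MAX_ARG_LEN];
} toy_cmd;

/* Parse "<prefix><number>\r\n"; return a pointer past the CRLF, or NULL. */
static const char *read_count(const char *p, char prefix, long *out) {
    if (*p != prefix) return NULL;
    char *end;
    *out = strtol(p + 1, &end, 10);
    if (end == p + 1 || end[0] != '\r' || end[1] != '\n') return NULL;
    return end + 2;
}

/* Parse one command starting at p. Returns a pointer to the next command,
 * or NULL at the end of input or on malformed or truncated data. */
const char *toy_read_command(const char *p, toy_cmd *cmd) {
    long n, len;
    assert(p != NULL && cmd != NULL);
    p = read_count(p, '*', &n);
    if (p == NULL || n < 1 || n > MAX_ARGS) return NULL;
    cmd->argc = (int)n;
    for (long i = 0; i < n; i++) {
        p = read_count(p, '$', &len);
        if (p == NULL || len < 0 || len >= MAX_ARG_LEN) return NULL;
        if (strlen(p) < (size_t)len + 2) return NULL; /* truncated file */
        memcpy(cmd->argv[i], p, (size_t)len);
        cmd->argv[i][len] = '\0';
        if (p[len] != '\r' || p[len + 1] != '\n') return NULL;
        p += len + 2;
    }
    return p;
}
```

A loader would call toy_read_command in a loop and pass each toy_cmd to the fake client for execution, stopping when the function returns NULL.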

AOF rewrite

The AOF file records the write commands sent by clients in order. Under a heavy write load, the AOF file grows rapidly, so it needs to be rewritten into a more compact set of commands.

AOF rewriting does not read the original AOF file. Instead, it generates a new AOF file directly from the current state of the database, similar to generating INSERT statements when exporting SQL data.

For keys with many elements, such as large lists and sets, merging all elements into a single command may produce a command so large that it overflows the client input buffer when the command is later executed. Redis therefore defines a constant REDIS_AOF_REWRITE_ITEMS_PER_CMD and splits a key's elements across multiple commands when there are more elements than this limit
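The splitting can be sketched as follows. This is an illustration rather than the Redis code: toy_rewrite_list is a hypothetical function, the commands are written as plain text lines instead of the Redis protocol format, and ITEMS_PER_CMD is set to 3 only to keep the example small (the constant in the Redis source is 64).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ITEMS_PER_CMD 3 /* kept small for the demonstration */

/* Write RPUSH commands that rebuild a list, with at most ITEMS_PER_CMD
 * elements per command. Returns the number of commands written,
 * or -1 if out is too small. */
int toy_rewrite_list(const char *key, const char **items, size_t count,
                     char *out, size_t out_size) {
    size_t used = 0;
    int commands = 0;
    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        int n;
        if (i % ITEMS_PER_CMD == 0) {
            /* Start a new command, ending the previous one with a newline. */
            n = snprintf(out + used, out_size - used, "%sRPUSH %s",
                         i ? "\n" : "", key);
            if (n < 0 || (size_t)n >= out_size - used) return -1;
            used += (size_t)n;
            commands++;
        }
        n = snprintf(out + used, out_size - used, " %s", items[i]);
        if (n < 0 || (size_t)n >= out_size - used) return -1;
        used += (size_t)n;
    }
    return commands;
}
```

A list of seven elements therefore becomes three RPUSH commands of three, three, and one element, none of which exceeds the per-command limit.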

AOF rewrite buffer

During an AOF rewrite, the Redis server still needs to accept write requests from clients. To keep the data safe, the rewrite is performed by a child process. However, if a write command is executed during the rewrite, the child process does not see the modification made by the parent process, so when the rewrite completes, the data in the new AOF file would be inconsistent with the data in the actual database. Therefore, during an AOF rewrite, commands received from clients are written to the AOF rewrite buffer in addition to the AOF buffer

When the AOF rewrite is complete, the child sends a completion signal to the parent. The parent process receives it and appends the contents of the AOF rewrite buffer to the new AOF file, then renames the new file so that it replaces the original AOF file
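The double write and the final append can be sketched like this. It is a simplified model under stated assumptions: toy_rewrite_state, toy_feed_command, and toy_finish_rewrite are hypothetical names, the buffers are fixed-size strings, and the new AOF file is represented by an in-memory buffer of BUF_SIZE bytes.

```c
#include <assert.h>
#include <string.h>

#define BUF_SIZE 256

/* Hypothetical server state for the sketch. */
typedef struct {
    char aof_buf[BUF_SIZE];     /* regular AOF buffer */
    char rewrite_buf[BUF_SIZE]; /* AOF rewrite buffer */
    int rewrite_in_progress;
} toy_rewrite_state;

static void append(char *dst, const char *src) {
    assert(strlen(dst) + strlen(src) < BUF_SIZE);
    strcat(dst, src);
}

/* Every executed write command goes to the AOF buffer; while a rewrite
 * is running it is also recorded in the rewrite buffer. */
void toy_feed_command(toy_rewrite_state *s, const char *cmd) {
    append(s->aof_buf, cmd);
    if (s->rewrite_in_progress) append(s->rewrite_buf, cmd);
}

/* Called by the parent when the child reports completion: new_aof holds
 * what the child wrote and receives the commands that arrived meanwhile. */
void toy_finish_rewrite(toy_rewrite_state *s, char *new_aof) {
    append(new_aof, s->rewrite_buf);
    s->rewrite_buf[0] = '\0';
    s->rewrite_in_progress = 0;
}
```

After toy_finish_rewrite, the new file contains both the state captured by the child and every command executed during the rewrite, so it matches the live database.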

RDB

Manually perform persistence

Redis RDB persistence uses the SAVE and BGSAVE commands to generate compressed binary RDB files that can be used to restore the database state at the time the file was generated.

SAVE blocks the main thread, which cannot process any requests until the RDB file has been generated. BGSAVE forks a child process to create the RDB file, and the parent process can still handle client commands. However, while BGSAVE is running, new SAVE and BGSAVE commands are rejected to avoid race conditions, and a BGREWRITEAOF command is delayed until BGSAVE finishes. Conversely, while BGREWRITEAOF is running, the BGSAVE command is rejected. This is only for performance reasons; there is no actual conflict between the two

Although Redis processed requests in a single thread prior to Redis 6.0, the Redis server already had other threads working in the background, for example to flush the AOF every second and to close file descriptors asynchronously

Both SAVE and BGSAVE call rdb.c/rdbSave to perform the real persistence process.

When Redis starts, it loads the RDB file located by the dir and dbfilename settings in the /etc/redis/redis.conf configuration file. If AOF persistence is enabled, Redis preferentially uses the AOF file to restore the database. For example:

# RDB configuration example
# https://github.com/redis/redis/blob/48e24d54b736b162617112ce27ec724b9290592e/redis.conf#L125

# Important parameters:
dbfilename dump.rdb
dir /var/lib/redis

The actual loading is done by rdb.c/rdbLoad, and the main thread is blocked while it runs.

Automatic persistence

Redis can save automatically according to user-defined save conditions. Adding save <seconds> <changes> to the /etc/redis/redis.conf configuration file means that if at least <changes> changes have been made to the database within <seconds> seconds, the BGSAVE command is executed. These settings are loaded into the struct saveparam *saveparams field of struct redisServer. saveparams is an array; when multiple save conditions are configured, each one is added to it.

How is the automatic save condition checked?

The long long dirty field of struct redisServer records how many changes have been made to the database since the last successful RDB persistence. For example, an SADD command that adds three members adds 3 to dirty. The time_t lastsave field records the time at which the last RDB persistence completed



Redis uses the int serverCron function to perform scheduled tasks such as checking the automatic save conditions, updating timestamps, and updating the LRU clock. By default serverCron runs every 100 ms. The code that checks the automatic save conditions is as follows:

// https://github.com/redis/redis/blob/48e24d54b736b162617112ce27ec724b9290592e/src/redis.c#L1199

// Before checking the automatic save conditions, check whether an RDB or AOF child process is running in the background
if (server.rdb_child_pid != -1 || server.aof_child_pid != -1) {
    // There is a background RDB or AOF process
} else {
    // Iterate through all the save conditions configured in saveparams
    for (j = 0; j < server.saveparamslen; j++) {
        struct saveparam *sp = server.saveparams+j;

        /* The automatic save condition is met when:
         * 1. The number of database changes (dirty) since the last RDB
         *    completion has reached the changes value in the save configuration
         * 2. The time since the last RDB completion (lastsave) has exceeded
         *    the seconds value in the save configuration
         * 3. The last RDB attempt was successful, or the time since the last
         *    RDB attempt (lastbgsave_try) has exceeded the configured retry
         *    delay (REDIS_BGSAVE_RETRY_DELAY) */
        if (server.dirty >= sp->changes &&
            server.unixtime-server.lastsave > sp->seconds &&
            (server.unixtime-server.lastbgsave_try >
             REDIS_BGSAVE_RETRY_DELAY ||
             server.lastbgsave_status == REDIS_OK))
        {
            redisLog(REDIS_NOTICE,"%d changes in %d seconds. Saving...",
                sp->changes, (int)sp->seconds);
            rdbSaveBackground(server.rdb_filename);
            break;
        }
    }
}
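The core of this check can be isolated into a small self-contained function. This is a sketch, not the Redis code: toy_saveparam and toy_should_save are hypothetical names, and the retry-delay check on the last BGSAVE attempt is left out to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical counterpart of one save condition. */
typedef struct {
    time_t seconds;
    int changes;
} toy_saveparam;

/* Returns the index of the first save condition that is satisfied,
 * or -1 if none is. */
int toy_should_save(const toy_saveparam *params, size_t n,
                    long long dirty, time_t now, time_t lastsave) {
    assert(params != NULL || n == 0);
    for (size_t j = 0; j < n; j++) {
        if (dirty >= params[j].changes && now - lastsave > params[j].seconds)
            return (int)j;
    }
    return -1;
}
```

With the conditions save 900 1, save 300 10, and save 60 10000, a single change triggers a save only after more than 900 seconds, while 10000 changes trigger one after just over a minute.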

RDB file format (version 0006 as an example)

The RDB file consists of five parts: the REDIS magic string, the db_version, the databases section, the EOF marker, and the check_sum.

All database data is stored in the databases section. Each database starts with the SELECTDB constant (octal 376, decimal 254) followed by the database number, which indicates the database into which the subsequently loaded data is written when the RDB file is read.

key_values holds all key-value pairs, including the key, the value, and the value type. For keys with an expiration time, it also holds the EXPIRETIME_MS constant (octal 374, decimal 252) and the expiration time as a Unix timestamp. The type can be one of the values in the following table, each corresponding to a Redis data structure:
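Reading the start of such a file can be sketched as below. The sketch relies on two format details beyond the text above, stated here as assumptions: the file begins with the 5-byte string REDIS followed by a 4-digit ASCII version, and in version 0006 the first database record follows immediately, with a database number below 64 stored in a single byte. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RDB_OPCODE_SELECTDB 0xFE /* octal 376 */

/* Returns the RDB version from the 9-byte header, or -1 if it is invalid. */
int toy_rdb_version(const unsigned char *buf, size_t len) {
    assert(buf != NULL);
    if (len < 9 || memcmp(buf, "REDIS", 5) != 0) return -1;
    int version = 0;
    for (int i = 5; i < 9; i++) {
        if (buf[i] < '0' || buf[i] > '9') return -1;
        version = version * 10 + (buf[i] - '0');
    }
    return version;
}

/* Returns the number of the first database, or -1 if the byte after the
 * header is not SELECTDB or the number does not fit the 1-byte encoding. */
int toy_rdb_first_db(const unsigned char *buf, size_t len) {
    assert(buf != NULL);
    if (len < 11 || buf[9] != RDB_OPCODE_SELECTDB) return -1;
    if ((buf[10] & 0xC0) != 0) return -1; /* longer length encodings not handled */
    return buf[10] & 0x3F;
}
```

A loader would first validate the header with toy_rdb_version, then use the database number to decide where the following key-value pairs are written.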

Data structure | Type constant
String | REDIS_RDB_TYPE_STRING, the value is 0
List | REDIS_RDB_TYPE_LIST, the value is 1
Set | REDIS_RDB_TYPE_SET, the value is 2
Sorted set | REDIS_RDB_TYPE_ZSET, the value is 3
Hash | REDIS_RDB_TYPE_HASH, the value is 4
List implemented as a ziplist | REDIS_RDB_TYPE_LIST_ZIPLIST
Set implemented as an intset | REDIS_RDB_TYPE_SET_INTSET
Sorted set implemented as a ziplist | REDIS_RDB_TYPE_ZSET_ZIPLIST
Hash implemented as a ziplist | REDIS_RDB_TYPE_HASH_ZIPLIST

The values of these type constants can be found in rdb.h

The type determines how the value that follows is interpreted when the file is read, while the key is always treated as REDIS_RDB_TYPE_STRING

The corresponding value structure of each type is as follows:

Type | Value structure | Example | Note
REDIS_RDB_TYPE_STRING | Encoding, value | REDIS_RDB_ENC_INT8, 123 | A string that can be represented as an 8-bit integer
REDIS_RDB_TYPE_STRING | Encoding, value | REDIS_ENCODING_RAW, 5, "hello" | A plain string
REDIS_RDB_TYPE_LIST | Number of elements, list elements | 2, 5, "hello", 5, "world" | The length of each element is recorded
REDIS_RDB_TYPE_SET | Number of elements, set elements | 2, 5, "hello", 5, "world" | The length of each element is recorded
REDIS_RDB_TYPE_HASH | Number of key-value pairs, key-value pairs | 2, 1, "a", 5, "apple", 1, "b", 6, "banana" | The length of each key and each value is recorded
REDIS_RDB_TYPE_ZSET | Number of elements, member and score pairs | 2, 2, "pi", 4, "3.14", 1, "e", 3, "2.7" | The length of each member is recorded; the member comes before the score
REDIS_RDB_TYPE_SET_INTSET | An intset converted to a string object | | Converted back to an intset when the RDB file is read
REDIS_RDB_TYPE_LIST_ZIPLIST | A ziplist converted to a string object | | Converted to a list when read
REDIS_RDB_TYPE_HASH_ZIPLIST | A ziplist converted to a string object | | Converted to a hash when read
REDIS_RDB_TYPE_ZSET_ZIPLIST | A ziplist converted to a string object | | Converted to a sorted set when read

How are writes handled correctly during a snapshot?

BGSAVE relies on the operating system's copy-on-write (COW) mechanism. After fork, the child process shares the memory pages of the main process. When the main thread modifies data, the affected page is copied first, so the main thread's change lands on its own copy while the child process keeps seeing the data as it was at the moment of the fork and writes that to the RDB file
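The isolation that fork provides can be observed with a short experiment. The function name is hypothetical and a pipe is used only to order the two processes: the parent changes a value after forking, then the child reports what it sees through its exit status.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the value the child observed after the parent's write,
 * or -1 on error. */
int toy_snapshot_after_parent_write(void) {
    static volatile int value;
    int fds[2];
    if (pipe(fds) != 0) return -1;
    value = 1; /* state at the moment of the snapshot */
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        char go;
        close(fds[1]);
        /* Wait until the parent has modified its copy. */
        if (read(fds[0], &go, 1) != 1) _exit(255);
        _exit(value); /* the child still sees the value from fork time */
    }
    close(fds[0]);
    value = 2; /* the parent's write goes to its own copy of the page */
    ssize_t w = write(fds[1], "x", 1);
    close(fds[1]);
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) return -1;
    if (w != 1) return -1;
    assert(value == 2);
    return WEXITSTATUS(status);
}
```

The child reports 1 even though the parent has already written 2, which is the property that lets the snapshot stay consistent while the main thread keeps serving writes.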

Problems with frequent full snapshots

  1. Each full snapshot writes all data to disk, which puts the disk under heavy pressure. If snapshots are taken too frequently, a new snapshot starts before the previous one has finished. As a result, snapshot tasks compete for disk bandwidth, causing a vicious cycle
  2. The fork operation itself blocks the main thread, and the more memory the main process uses, the longer it blocks, because the page table has to be copied

Solution: after a full snapshot is created, record only incremental changes. However, this requires recording in memory which data has been modified until the next full snapshot is created. Redis 4.0 therefore introduced a mix of AOF and full snapshots, enabled with aof-use-rdb-preamble yes. In this way, changes made between full snapshots are recorded in the AOF file

RDB backup risks in write-heavy, read-light scenarios

  1. Memory resource risk: if the parent process writes a large number of new keys while the snapshot is being generated, the machine will soon run out of memory. If swap is enabled on the machine, part of the Redis data is moved to disk, and Redis performs poorly when accessing that data. If swap is not enabled, the OOM killer is triggered and the parent and child processes may be killed by the system.
  2. CPU resource risk: although RDB persistence is done by the child process, generating the RDB snapshot consumes a lot of CPU. As a result, the parent process takes longer to process requests, the child process takes longer to generate the snapshot, and the performance of the Redis server deteriorates.
  3. CPU affinity risk: if the Redis process is bound to a CPU, the child process inherits the CPU affinity of the parent process and inevitably competes with the parent for the same CPU, which affects the performance of the entire Redis server. Therefore, if Redis needs scheduled RDB snapshots or AOF rewriting, the process must not be bound to a CPU.

Ref

  1. Redis-RDB-Dump-File-Format
  2. Redis design and implementation