Introduction

Redis is a key-value database server. A server instance can contain any number of databases, and each non-empty database can contain any number of key-value pairs. As shown below:

Redis serves data access requests from data held in memory, so that data disappears when Redis restarts. To allow data to be recovered after a restart, Redis can persist it to local disk.

Redis supports three persistence modes: full mode (RDB), incremental mode (AOF), and mixed persistence.

Full Persistence (RDB)

As a stateful node, Redis's state can be defined as the key-value data of all databases in the instance. Every time Redis processes a write command that modifies key-value data, its state changes. Full persistence stores a snapshot (the RDB file) of all key-value data in all databases at the moment persistence is triggered. As shown below:

The RDB file is a compressed binary file from which the database state at the time the file was generated can be restored.

The default RDB file name is dump.rdb.

# The filename where to dump the DB
dbfilename dump.rdb

RDB persistence: the write process

Redis can write a full snapshot in two ways, each a Redis command: SAVE and BGSAVE.

SAVE

SAVE can be triggered explicitly by a client or when Redis shuts down; either way, it runs as an ordinary command on the single main thread. SAVE blocks the server process, which cannot handle any other command request until the save finishes. The data state is therefore guaranteed not to change while the snapshot is written, but if the data volume is large, the server stays blocked for a long time.

BGSAVE

BGSAVE can be triggered explicitly by a client command, by a scheduled task, or on behalf of a slave node in a master-slave deployment.

When BGSAVE executes, the server forks a child process, and the child creates the RDB file while the server process keeps handling command requests (asynchronous persistence in a forked child). The child holds a copy of the parent's database state as of the fork. That copy does not change afterwards, and the child writes it to the RDB file concurrently with the parent's work.
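The fork-based snapshot described above can be illustrated with a minimal POSIX sketch. It is not Redis code: the single integer db_value stands in for the whole database, and the function name snapshot_with_fork and the text file format are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the in-memory database: a single integer. */
static int db_value = 0;

/* Sets the "database" to 42, forks, and lets the child write the value it
 * sees to `path` while the parent changes its own copy to 100.
 * Returns the value stored in the snapshot file, or -1 on error. */
int snapshot_with_fork(const char *path) {
    assert(path != NULL);
    db_value = 42;                      /* state at the moment of the fork */

    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        /* Child: works on its own copy of the parent's memory. */
        FILE *out = fopen(path, "w");
        if (out == NULL) _exit(1);
        fprintf(out, "%d\n", db_value);
        if (fclose(out) != 0) _exit(1);
        _exit(0);
    }

    /* Parent: keeps "serving writes" while the child saves. */
    db_value = 100;

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) return -1;

    /* Read back what the child persisted. */
    FILE *in = fopen(path, "r");
    if (in == NULL) return -1;
    int saved = -1;
    if (fscanf(in, "%d", &saved) != 1) saved = -1;
    fclose(in);
    return saved;
}
```

Whatever the parent writes after the fork, the file contains the value the database had when the fork happened, which is the property BGSAVE relies on.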

As mentioned earlier, BGSAVE can be triggered by a scheduled task through the save option in the server configuration. Because BGSAVE does not block the server process, the save option can be used to have the server run BGSAVE automatically at intervals.

The default Redis server save option is configured as follows:

#
# save <seconds> <changes>
#
# Will save the DB if both the given number of seconds and the given
# number of write operations against the DB occurred.
#
# In the example below the behaviour will be to save:
# after 900 sec (15 min) if at least 1 key changed
# after 300 sec (5 min) if at least 10 keys changed
# after 60 sec if at least 10000 keys changed
#
# Note: you can disable saving completely by commenting out all "save" lines.
#
# It is also possible to remove all the previously configured save
# points by adding a save directive with a single empty string argument
# like in the following example:
#
# save ""

save 900 1
save 300 10
save 60 10000

If any of the above conditions is met, BGSAVE runs automatically to take a backup. For example, save 900 1 means BGSAVE runs if the database has been modified at least once within 900 seconds. The configuration can be customized as needed.
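The save-point rule can be sketched as a small check over the configured pairs. This is a simplified illustration, not the Redis implementation: the struct save_point type, the function should_bgsave, and the exact comparison operators are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* One "save <seconds> <changes>" line from redis.conf. */
struct save_point {
    long seconds;
    long changes;
};

/* Returns 1 if at least one save point is satisfied, 0 otherwise.
 * dirty:   number of modifications since the last successful save
 * elapsed: seconds since the last successful save */
int should_bgsave(const struct save_point *points, size_t n,
                  long dirty, long elapsed) {
    assert(points != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (dirty >= points[i].changes && elapsed >= points[i].seconds) {
            return 1;
        }
    }
    return 0;
}
```

With the default save points, 10 changes after 300 seconds trigger a save, while 9 changes after 300 seconds do not, because neither the 900-second nor the 300-second rule is fully satisfied.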

Comparison of SAVE and BGSAVE

The advantage of BGSAVE over SAVE is that the server can keep serving reads and writes while persistence is in progress. However, forking the child process involves duplicating the parent process's memory, which increases the server's memory overhead. If memory use grows enough that the system falls back on virtual memory (swap), the fork can leave the server unavailable for more than a second. Using BGSAVE therefore requires sufficient free memory.

RDB persistence: the recovery process

At startup, Redis loads the previously persisted file from local disk into memory and then handles data access commands from clients. The loading function, loadDataFromDisk, is in server.c.

/* Function called at startup to load RDB or AOF file in memory. */
void loadDataFromDisk(void) {
    long long start = ustime();
    if (server.aof_state == AOF_ON) {
        /* loadAppendOnlyFile loads the AOF file into memory. */
        if (loadAppendOnlyFile(server.aof_filename) == C_OK)
            serverLog(LL_NOTICE,"DB loaded from append only file: %.3f seconds",(float)(ustime()-start)/1000000);
    } else {
        rdbSaveInfo rsi = RDB_SAVE_INFO_INIT;
        if (rdbLoad(server.rdb_filename,&rsi) == C_OK) {
            serverLog(LL_NOTICE,"DB loaded from disk: %.3f seconds",
                (float)(ustime()-start)/1000000);

            /* Restore the replication ID / offset from the RDB file. */
            if (server.masterhost &&
                rsi.repl_id_is_set &&
                rsi.repl_offset != -1 &&
                /* Note that older implementations may save a repl_stream_db
                 * of -1 inside the RDB file in a wrong way, see more information
                 * in function rdbPopulateSaveInfo. */
                rsi.repl_stream_db != -1)
            {
                memcpy(server.replid,rsi.repl_id,sizeof(server.replid));
                server.master_repl_offset = rsi.repl_offset;
                /* If we are a slave, create a cached master from this
                 * information, in order to allow partial resynchronizations
                 * with masters. */
                replicationCacheMasterUsingMyself();
                selectDb(server.cached_master,rsi.repl_stream_db);
            }
        } else if (errno != ENOENT) {
            serverLog(LL_WARNING,"Fatal error loading the DB: %s. Exiting.",strerror(errno));
            exit(1);
        }
    }
}

Because RDB persistence is a full backup, it should not run too frequently. As a consequence, a sudden outage can lose a lot of data: everything written between the last snapshot and the moment of the outage is not backed up.

Since full persistence can lose a lot of data when the server goes down, is there a way to reduce the loss? Yes: incremental persistence, introduced next. It differs significantly from RDB persistence. RDB persistence stores a snapshot of Redis's key-value data, whereas incremental persistence records the database state by saving write commands.

Incremental Persistence (AOF)

AOF is enabled by setting appendonly to yes in redis.conf. The default is no (disabled).

# By default Redis asynchronously dumps the dataset on disk. This mode is
# good enough in many applications, but an issue with the Redis process or
# a power outage may result into a few minutes of writes lost (depending on
# the configured save points).
#
# The Append Only File is an alternative persistence mode that provides
# much better durability. For instance using the default data fsync policy
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check http://redis.io/topics/persistence for more information.

appendonly no

AOF (append-only file) persistence records the database state by appending each write command the Redis server executes to a file. If AOF persistence is enabled, the in-memory data can be recovered at startup by re-executing the write commands in the AOF file.

AOF persistence: the write process

AOF persistence can be divided into three steps: command append, file write and file synchronization. Let’s go through them one by one.

Command append

When AOF persistence is enabled, after executing a write command the server appends that command, in protocol format, to the end of the aof_buf buffer in the server state (the aof_buf field of the redisServer object).
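To make "protocol format" concrete, here is a small sketch that encodes a command as a Redis protocol array of bulk strings, the textual form in which commands appear in an AOF file. The function name encode_command is hypothetical, and the sketch assumes NUL-terminated arguments, whereas real Redis arguments are binary-safe.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Encodes argv[0..argc-1] into `out` as "*<argc>\r\n" followed by
 * "$<len>\r\n<arg>\r\n" for each argument.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if `out` is too small. */
int encode_command(char *out, size_t cap, int argc, const char **argv) {
    assert(out != NULL && argv != NULL && argc >= 0);
    size_t used = 0;

    int n = snprintf(out, cap, "*%d\r\n", argc);
    if (n < 0 || (size_t)n >= cap) return -1;
    used += (size_t)n;

    for (int i = 0; i < argc; i++) {
        n = snprintf(out + used, cap - used, "$%zu\r\n%s\r\n",
                     strlen(argv[i]), argv[i]);
        if (n < 0 || (size_t)n >= cap - used) return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

For example, the command SET msg hello is encoded as `*3\r\n$3\r\nSET\r\n$3\r\nmsg\r\n$5\r\nhello\r\n`, and a string of this shape is what gets appended to the buffer.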

Write and synchronize AOF files

The Redis server process is an event loop. File events receive client requests and send command replies; time events run timed functions. Handling a file event may execute write commands, which appends content to the aof_buf buffer. On the next iteration, before entering the I/O multiplexing call (such as select), the main loop calls flushAppendOnlyFile to write the contents of aof_buf to the AOF file. That write only places the data in the operating system's cache, and when it actually reaches the disk is up to the operating system. Only an explicit call to fsync() forces the operating system to flush the data to disk.
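The difference between writing and synchronizing can be sketched with plain POSIX calls. This is an illustration only, not the body of flushAppendOnlyFile: the function name append_to_log and its do_fsync flag are assumptions, and interrupted writes are not retried.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Appends `len` bytes from `buf` to the file at `path`.
 * write() only hands the data to the operating system; when `do_fsync`
 * is non-zero, fsync() additionally asks the OS to flush it to disk.
 * Returns 0 on success, -1 on error. */
int append_to_log(const char *path, const char *buf, size_t len, int do_fsync) {
    assert(path != NULL && buf != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) return -1;

    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }

    if (do_fsync && fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Both variants leave the same bytes in the file as seen by readers; what differs is whether the data is guaranteed to survive a power failure right after the call returns.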

The behavior of the flushAppendOnlyFile function is determined by the appendfsync option set in the server configuration.

# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".

# appendfsync always
appendfsync everysec
# appendfsync no

The appendfsync option selects one of these three AOF synchronization policies.

The default AOF file name is appendonly.aof.
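The three policies can be summarized as a decision function. This is a deliberately simplified sketch under the assumption that time is tracked in whole seconds; the enum and the function should_fsync are hypothetical names, and the real server handles the everysec case with more machinery than shown here.

```c
#include <assert.h>

/* The three values accepted by the appendfsync option. */
enum fsync_policy { FSYNC_NO, FSYNC_EVERYSEC, FSYNC_ALWAYS };

/* Decides whether fsync() should be called after the buffer was written.
 * now and last_fsync are timestamps in seconds. */
int should_fsync(enum fsync_policy policy, long now, long last_fsync) {
    assert(now >= last_fsync);
    switch (policy) {
    case FSYNC_ALWAYS:
        return 1;                          /* sync after every write */
    case FSYNC_EVERYSEC:
        return (now - last_fsync) >= 1;    /* at most once per second */
    case FSYNC_NO:
        return 0;                          /* leave it to the OS */
    }
    return 0;
}
```

The trade-off is visible in the return values: always pays for a sync on every write, everysec bounds the loss window to about one second, and no leaves the window up to the operating system.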

# The name of the append only file (default: "appendonly.aof")

appendfilename "appendonly.aof"

AOF persistence: the recovery process

Likewise, when Redis starts with AOF enabled, the AOF file is used to restore the in-memory data by re-executing the commands it contains. An AOF file usually holds more complete data than an RDB file, so it is preferred for recovery.

AOF persistence records the database state by saving the write commands that were executed, so the AOF file keeps growing. A large file can affect the Redis server, and restoring data from it can be slow. Does Redis offer a way to optimize this? Yes: AOF rewriting, which reduces the size of the AOF file.

AOF rewrite

Let’s take a look at the changes to AOF files before and after the rewrite:

From the figure above, we can see that after the AOF file is rewritten, write commands for expired data are removed, and the commands for each key are replaced by a single command that records its latest data (one new command replaces multiple historical commands).

We can summarize the principle of AOF rewriting: first read the current value of the key from the database, and then record the key-value pair with a single command in place of the multiple commands that previously recorded it.
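The principle can be shown with a toy example. The sketch below assumes a hypothetical database holding one integer key; the types counter_db and op and the functions execute and rewrite_key are invented for illustration. Commands change the in-memory value as they run, and the rewrite looks only at the current value, never at the command history.

```c
#include <assert.h>
#include <stdio.h>

enum op_kind { OP_SET, OP_INCR, OP_DECR };

struct op {
    enum op_kind kind;
    long value;              /* used only by OP_SET */
};

/* Hypothetical database holding a single integer key. */
struct counter_db {
    long value;
};

/* Normal command execution: each command would also be appended to the AOF. */
void execute(struct counter_db *db, struct op op) {
    assert(db != NULL);
    switch (op.kind) {
    case OP_SET:  db->value = op.value; break;
    case OP_INCR: db->value += 1;       break;
    case OP_DECR: db->value -= 1;       break;
    }
}

/* Rewrite: read the current value and emit one command that recreates it.
 * Returns the number of characters written, or -1 if `out` is too small. */
int rewrite_key(const struct counter_db *db, const char *key,
                char *out, size_t cap) {
    assert(db != NULL && key != NULL && out != NULL);
    int n = snprintf(out, cap, "SET %s %ld", key, db->value);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

After SET 10, INCR, INCR, DECR, the original log holds four commands, while the rewrite emits the single command SET counter 11.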

If the aof_rewrite function were called by the main process, the server could not handle client command requests during the rewrite. That is undesirable because a rewrite involves a large number of write operations, so Redis performs the AOF rewrite in a child process. This background rewrite is the same mechanism the BGREWRITEAOF command uses.

So under what conditions is an AOF rewrite triggered? The defaults are defined in redis.conf (and can of course be changed as needed):

# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
#
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (if no rewrite has happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.

auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

The auto-aof-rewrite-min-size option sets the minimum size the AOF file must reach before it is rewritten. To reduce the number of rewrites, set it to a larger value. My personal recommendation is more than 6 GB.

The auto-aof-rewrite-percentage option is the growth percentage, relative to the AOF file size after the last rewrite, at which a rewrite is triggered. A value of 100 means the file is rewritten once it reaches twice the size it had after the last rewrite (provided it is also larger than auto-aof-rewrite-min-size).
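The two options combine into a single check, sketched below. It follows the description in the configuration comments; the function name should_rewrite_aof and its parameter names are assumptions, and the real server additionally checks conditions such as whether a child process is already running.

```c
#include <assert.h>

/* Returns 1 if an automatic rewrite should start, 0 otherwise.
 * current_size: current AOF size in bytes
 * base_size:    AOF size after the last rewrite (or at startup)
 * percentage:   auto-aof-rewrite-percentage (0 disables the feature)
 * min_size:     auto-aof-rewrite-min-size in bytes */
int should_rewrite_aof(long long current_size, long long base_size,
                       int percentage, long long min_size) {
    assert(current_size >= 0 && base_size >= 0);
    if (percentage == 0) return 0;             /* automatic rewrite disabled */
    if (current_size <= min_size) return 0;    /* file is still too small */

    long long base = base_size > 0 ? base_size : 1;   /* avoid dividing by zero */
    long long growth = (current_size * 100 / base) - 100;
    return growth >= percentage;
}
```

With the defaults (100 and 64mb), a file that was 64 MB after the last rewrite is rewritten when it reaches 128 MB, but not at 100 MB, where the growth is only about 56 percent.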

A flowchart of the AOF rewrite mechanism is shown below:

From the figure above, we can work out the steps of an AOF rewrite:

1) The main loop's timed task checks whether the rewrite conditions are met. If they are, it forks a child process through the rewriteAppendOnlyFileBackground function. The child starts with a copy of the main process's data state at that moment (it reads the in-memory state directly and does not depend on the old AOF file).

2) The child process writes that state to a new, rewritten AOF file. While the child is running, the Redis main process continues to serve requests, and new write commands are also appended to aof_rewrite_buf_blocks in the redisServer object.

3) When the child process finishes, it sends a signal to the parent process, which then calls a signal handler.

4) The signal handler appends the entire contents of the AOF rewrite buffer to the rewritten file and then atomically replaces the existing AOF file with it. The server process is blocked during this step, so the new AOF file reflects the server's current data.

5) After these steps are complete, subsequent write commands are written to the new AOF file as usual.
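Step 4 can be sketched with POSIX calls: append the buffered commands to the temporary file, flush it, and swap it in with rename(), which replaces the target atomically on POSIX file systems. The function name finish_rewrite and the flat byte buffer are assumptions; this is an illustration of the idea, not the handler Redis uses.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Appends the rewrite buffer to the temporary rewritten file at `tmp_path`,
 * then atomically replaces `aof_path` with it.
 * Returns 0 on success, -1 on error. */
int finish_rewrite(const char *tmp_path, const char *aof_path,
                   const char *rewrite_buf, size_t len) {
    assert(tmp_path != NULL && aof_path != NULL);
    assert(rewrite_buf != NULL || len == 0);

    int fd = open(tmp_path, O_WRONLY | O_APPEND);
    if (fd < 0) return -1;

    /* Append the commands that arrived while the child was rewriting. */
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, rewrite_buf + done, len - done);
        if (n < 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }

    /* Make sure the new file is on disk before it becomes the AOF. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    if (close(fd) != 0) return -1;

    /* Readers see either the old file or the new one, never a mix. */
    return rename(tmp_path, aof_path);
}
```

Because the old file is replaced in one step, a crash during the rewrite leaves the previous AOF file intact.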

RDB persistence versus AOF persistence

Based on the analysis above: RDB stores the data itself and recovers quickly, which suits large-scale data recovery and cold backups. However, RDB is a full snapshot and runs less frequently than AOF, so AOF loses less data than RDB when the server goes down. AOF also writes in incremental append mode, which gives it higher write performance. On the other hand, AOF is not well suited to cold backups: the file to recover from may be large, recovery may be slow, and because recovery works by re-executing commands, it may be less stable.

Mixed Persistence (AOF+RDB)

Redis has supported mixed persistence since version 4.0. To use it, enable the following option (on by default since Redis 5.0):

# When rewriting the AOF file, Redis is able to use an RDB preamble in the
# AOF file for faster rewrites and recoveries. When this option is turned
# on the rewritten AOF file is composed of two different stanzas:
#
#   [RDB file][AOF tail]
#
# When loading Redis recognizes that the AOF file starts with the "REDIS"
# string and loads the prefixed RDB file, and continues loading the AOF
# tail.
aof-use-rdb-preamble yes

When aof-use-rdb-preamble is enabled, the rewritten AOF file consists of RDB data in its first part and AOF data in its second part ([RDB file][AOF tail]). When loading, Redis recognizes that the AOF file starts with the string “REDIS”, loads the RDB preamble, and then continues with the AOF tail.

Mixed persistence is also performed through BGREWRITEAOF. The difference is that the forked child process writes its full copy of memory to the AOF file in RDB format, after which the contents of the aof_rewrite_buf_blocks rewrite buffer are appended to the file. The rest of the process is the same as an ordinary AOF rewrite.
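The detection step amounts to comparing the first five bytes of the file with the string "REDIS", as the configuration comment describes. A minimal sketch, with the hypothetical function name has_rdb_preamble operating on bytes already read into memory:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the data starts with the 5-byte signature "REDIS",
 * meaning the file begins with an RDB preamble; 0 otherwise. */
int has_rdb_preamble(const char *data, size_t len) {
    assert(data != NULL || len == 0);
    return len >= 5 && memcmp(data, "REDIS", 5) == 0;
}
```

A file that begins with a protocol-format command (for example `*2\r\n$6\r\nSELECT`) fails the check and is loaded as a plain AOF file from the first byte.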

What are the benefits?

  • Rewriting and loading on recovery are faster (RDB data makes up most of the AOF file after a rewrite)
  • The AOF file ends with incremental write commands, which reduces data loss in special cases such as downtime

The downside of mixed persistence is that the AOF file becomes somewhat more complex, because its content is no longer in a single format.

How to choose between RDB and AOF in practice

RDB and AOF each have their strengths. Using AOF persistence alone is not recommended: RDB stores the data itself rather than write commands, which makes it better suited to backups, and it recovers data faster. AOF provides more complete data and fast incremental appends, but recovery (re-executing write commands) is slow when the data volume is large.

In production, it is advisable to enable RDB and AOF together and to use RDB for disaster recovery (DR) backups. If your Redis version is 4.0 or later, mixed persistence is recommended.

Reference books: Redis Design and Implementation, Deep Into Distributed Caching