Redis data persistence

As an in-memory database, Redis keeps its data in memory, so once the Redis server process exits, the data in the server is gone. To solve this problem, Redis provides persistence mechanisms that save in-memory data to disk so that it is not lost unexpectedly.

Redis provides two persistence schemes: RDB persistence and AOF persistence. The first is based on snapshots and the second on an append-only log.

RDB snapshot persistence

RDB persistence works by taking snapshots: at configured intervals, a snapshot of the in-memory data set is written to disk. After a snapshot is created, you can back it up, copy it to another server to create a replica with the same data, or use it to restore the data after a server restart. RDB is the default Redis persistence mode.

Snapshot persistence

RDB persistence generates an RDB file, a compressed binary file that can be used to restore the database to its state at the moment the snapshot was taken. By default the RDB file is dump.rdb in the current working directory. You can set its name and location with dbfilename and dir in the configuration file:

# Set the file name of dump
dbfilename dump.rdb

# Working directory
# dbfilename specifies only the name of the file,
# but it will be written to this directory. The configuration item must be a directory, not a file name.
dir ./

When a snapshot is triggered

  • Running the save or bgsave command
  • Setting save <seconds> <changes> rules in the configuration file, which run the bgsave command automatically at intervals
  • During primary/secondary replication, when a secondary synchronizes data from the primary in full replication mode, the primary runs bgsave
  • Running the flushall command to clear server data
  • Running the shutdown command to stop Redis, which runs the save command

The save and bgsave commands

You can run the save and bgsave commands to manually trigger a snapshot and generate an RDB file. The two commands differ as follows.

The save command blocks the Redis server process. The server cannot process any command requests until the RDB file has been created.

127.0.0.1:6379> save
OK

Unlike save, the bgsave command forks a child process, which creates the RDB file while the server process continues to handle command requests.

127.0.0.1:6379> bgsave
Background saving started

fork() is a function provided by the operating system that creates a copy of the current process as a child process.

The forked child process writes the data set to a temporary file. Once the data has been written successfully, the temporary file replaces the previous RDB file, so the RDB file on disk is always a complete snapshot.
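The "write to a temporary file, then replace" step can be illustrated with a minimal Python sketch. This is not Redis's implementation: the function names `save_snapshot` and `load_snapshot` are hypothetical, JSON stands in for the binary RDB format, and the fork is left out so that only the replacement step is shown.

```python
import json
import os
import tempfile


def save_snapshot(data: dict, path: str) -> None:
    """Write `data` to `path` through a temporary file and an atomic rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-", suffix=".rdb")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)   # JSON is an assumption; real RDB files are binary
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before swapping files
        os.replace(tmp_path, path)  # readers see either the old or the new snapshot
    except BaseException:
        # The previous snapshot is untouched; only the temporary file is cleaned up.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_snapshot(path: str) -> dict:
    """Read a snapshot written by save_snapshot."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```

Because the old file is only replaced after the new one is fully written, a crash in the middle of a save leaves the previous snapshot intact.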

Automatic interval trigger

You can set save <seconds> <changes> rules in the configuration file to run the bgsave command automatically at intervals.

################################ SNAPSHOTTING ################################
#
# Save the DB on disk:
#
# save <seconds> <changes>
#
# Will save the DB if both the given number of seconds and the given
# number of write operations against the DB occurred.
#
# In the example below the behaviour will be to save:
# after 900 sec (15 min) if at least 1 key changed
# after 300 sec (5 min) if at least 10 keys changed
# after 60 sec if at least 10000 keys changed
#
# Note: you can disable saving completely by commenting out all "save" lines.
#
# It is also possible to remove all the previously configured save
# points by adding a save directive with a single empty string argument
# like in the following example:
#
# save ""

save 900 1
save 300 10
save 60 10000

save <seconds> <changes>

Indicates that the bgsave command is triggered automatically if at least <changes> changes occur within <seconds> seconds.

  • save 900 1: after 900 seconds, if at least 1 key has changed, bgsave is triggered automatically to create a snapshot
  • save 300 10: after 300 seconds, if at least 10 keys have changed, bgsave is triggered automatically to create a snapshot
  • save 60 10000: after 60 seconds, if at least 10,000 keys have changed, bgsave is triggered automatically to create a snapshot
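The way these rules combine can be sketched in a few lines of Python. This is an illustration of the rule semantics described above, not Redis's own code; `should_bgsave` and `DEFAULT_SAVE_POINTS` are hypothetical names, and the exact boundary comparison Redis uses internally may differ.

```python
from typing import List, Tuple

# Each save point is (seconds, changes), mirroring "save <seconds> <changes>".
DEFAULT_SAVE_POINTS: List[Tuple[int, int]] = [(900, 1), (300, 10), (60, 10000)]


def should_bgsave(
    save_points: List[Tuple[int, int]],
    seconds_since_last_save: int,
    changes_since_last_save: int,
) -> bool:
    """Return True if any single rule has BOTH its conditions met."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )
```

For example, `should_bgsave(DEFAULT_SAVE_POINTS, 300, 10)` is true because the second rule is satisfied, while 60 seconds with 9,999 changes satisfies none of the three rules.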

AOF persistence

In addition to RDB persistence, Redis provides AOF (Append Only File) persistence, which records changes to the data by appending each write command to the end of the AOF file. AOF persistence is not enabled by default. When it is enabled, every command that changes Redis data is appended to the AOF file. This can reduce Redis performance, but in most cases the impact is acceptable.

You can enable AOF persistence by configuring the redis.conf file:

# The appendonly parameter enables AOF persistence (the default is no; set it to yes to enable AOF)
appendonly no

# AOF file name. The default is appendonly.aof
appendfilename "appendonly.aof"

# The AOF file is saved in the same location as the RDB file, which is set by the dir parameter
dir ./

# Sync policy
# appendfsync always
appendfsync everysec
# appendfsync no

# Whether to skip fsync on the AOF file while a rewrite is in progress (no = keep syncing)
no-appendfsync-on-rewrite no

# Rewrite trigger configuration
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

# How to handle a truncated AOF file when loading it
aof-load-truncated yes

# File rewrite policy
aof-rewrite-incremental-fsync yes

How AOF is implemented

AOF records every Redis write command in three steps: append, write, and sync.

Command append

After AOF persistence is enabled, each time a write command is executed the server appends it, in protocol format, to the end of the aof_buf buffer instead of writing it directly to the file. Because the server does not go to disk for every command, the number of disk I/O operations is reduced.
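As a rough illustration of "append the command to a buffer in protocol format", the sketch below encodes a command as a RESP array of bulk strings and collects it in an in-memory buffer. `encode_command` and `AofBuffer` are hypothetical names for this example and do not correspond to Redis internals.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]  # "*<number of arguments>"
    for arg in args:
        data = arg.encode("utf-8")
        # "$<length in bytes>" followed by the argument itself
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


class AofBuffer:
    """In-memory stand-in for aof_buf: commands accumulate here before any disk write."""

    def __init__(self) -> None:
        self.buf = bytearray()

    def append(self, *args: str) -> None:
        self.buf += encode_command(*args)

    def drain(self) -> bytes:
        """Return the buffered bytes and empty the buffer."""
        data = bytes(self.buf)
        self.buf.clear()
        return data
```

Several commands can be appended before `drain()` is called once, which is the point of buffering: many commands, one write.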

File write and file sync

Redis provides several policies that control when the contents of the aof_buf buffer are written and synced to the AOF file.

  • appendfsync always: all contents of the aof_buf buffer are written and synced to the AOF file, so every write command is synchronously flushed to disk
  • appendfsync everysec: the contents of the aof_buf buffer are written to the AOF file, and the file is synced once per second by a dedicated thread
  • appendfsync no: the contents of the aof_buf buffer are written to the AOF file, but the operating system decides when they are synced to disk

The default value of the appendfsync option is everysec, which means the file is synced once per second.

The AOF sync policy relies on the write and fsync functions of the operating system, as described in Redis Design and Implementation:

In modern operating systems, when a user calls the write function to write data to a file, the operating system will temporarily store the data in a memory buffer, and then write the data to disk when the buffer space is full or exceeds a specified period of time.

While this improves efficiency, it also creates a safety problem for written data: if the computer crashes or loses power, the data in the memory buffer is lost. Therefore, the system provides the fsync and fdatasync synchronization functions, which force the operating system to write the buffered data to the hard disk immediately, thus ensuring that written data is safe.

As described above, the operating system does not necessarily flush written data to disk immediately, which is why Redis provides the appendfsync option. With appendfsync always, data safety is highest, but every write is synced to disk, so Redis throughput is limited by disk performance. appendfsync everysec balances data safety and write performance: the AOF file is synced once per second, so at most about one second of data is lost if the system crashes. With appendfsync no, Redis does not sync the AOF file itself and leaves the timing to the operating system. Redis performance is not affected, but an indeterminate amount of data may be lost if the system crashes.
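The difference between writing and syncing can be made concrete with a small Python sketch built on `os.write` and `os.fsync`. `AofWriter` is a hypothetical class for illustration only. In particular, it performs the everysec check inline during a flush, whereas Redis hands the per-second sync to a separate thread.

```python
import os
import time


class AofWriter:
    """Writes buffered bytes to a file and calls fsync according to a policy."""

    def __init__(self, path: str, policy: str = "everysec") -> None:
        if policy not in ("always", "everysec", "no"):
            raise ValueError("unknown policy: " + policy)
        self.policy = policy
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.last_fsync = time.monotonic()
        self.fsync_count = 0  # kept only so the behaviour is observable

    def _write_all(self, data: bytes) -> int:
        # write() hands data to the OS page cache; it may accept fewer bytes than given.
        view = memoryview(data)
        total = 0
        while total < len(view):
            total += os.write(self.fd, view[total:])
        return total

    def _fsync(self, now: float) -> None:
        os.fsync(self.fd)  # force the OS to push cached data to the disk
        self.last_fsync = now
        self.fsync_count += 1

    def flush(self, aof_buf: bytearray) -> None:
        """Write the buffer to the file, then sync if the policy asks for it."""
        wrote = self._write_all(bytes(aof_buf))
        aof_buf.clear()
        now = time.monotonic()
        if self.policy == "always":
            if wrote:
                self._fsync(now)
        elif self.policy == "everysec":
            if now - self.last_fsync >= 1.0:
                self._fsync(now)
        # policy "no": never call fsync; the OS decides when to flush

    def close(self) -> None:
        os.close(self.fd)
```

With "always", every flush that wrote data costs one fsync. With "no", the data still reaches the file through `write`, but nothing guarantees when it reaches the disk.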

AOF rewrite

Before looking at the AOF rewrite, let's first look at what is stored in the AOF file by performing two writes:

127.0.0.1:6379> set s1 hello
OK
127.0.0.1:6379> set s2 world
OK

Then we open the appendonly.aof file and see the following:

*3
$3
set
$2
s1
$5
hello
*3
$3
set
$2
s2
$5
world

These commands are stored in the Redis Serialization Protocol (RESP) format. *3 means that the command has three arguments, and $3 means that the argument that follows is 3 bytes long.
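To show how the *N and $N markers drive decoding, here is a minimal Python parser for a byte string in this layout. It is a sketch that handles only arrays of bulk strings, which is all the listing above contains; `parse_commands` is a hypothetical helper, not part of any Redis client library.

```python
from typing import List


def parse_commands(data: bytes) -> List[List[str]]:
    """Decode a sequence of RESP arrays of bulk strings into argument lists."""
    commands = []
    pos = 0
    while pos < len(data):
        if data[pos:pos + 1] != b"*":
            raise ValueError("expected '*' at offset %d" % pos)
        end = data.index(b"\r\n", pos)
        argc = int(data[pos + 1:end])  # "*3" -> three arguments follow
        pos = end + 2
        args = []
        for _ in range(argc):
            if data[pos:pos + 1] != b"$":
                raise ValueError("expected '$' at offset %d" % pos)
            end = data.index(b"\r\n", pos)
            length = int(data[pos + 1:end])  # "$5" -> the argument is 5 bytes long
            start = end + 2
            args.append(data[start:start + length].decode("utf-8"))
            pos = start + length + 2  # skip the argument and its trailing CRLF
        commands.append(args)
    return commands
```

Applied to the file contents above (with each line ending in CRLF), this yields the two original commands: set s1 hello and set s2 world.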

Looking at the AOF file contents above, it is easy to see that as Redis executes more write commands over time, the AOF file keeps growing. A very large AOF file can affect the Redis server, and restoring data from it takes longer.

Over time, the AOF file also accumulates redundant commands, such as commands for data that has since expired, commands that no longer have any effect (repeated set, delete), and multiple commands that could be merged into one (batch commands). So there is room to compact the AOF file.

The purpose of the AOF rewrite is to reduce the size of the AOF file. Note, however, that the rewrite does not read, analyze, or write to the existing AOF file. Instead, it is done by reading the current database state of the server.

File rewriting can be triggered manually or automatically. Manual triggering uses the bgrewriteaof command, whose execution is similar to that of the bgsave command.
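The idea of rebuilding the log from the current state, rather than from the old log, can be sketched in Python. The helper names `apply_command` and `rewrite_from_state` are hypothetical, and the example models only string keys with set and del.

```python
from typing import Dict, List


def apply_command(state: Dict[str, str], command: List[str]) -> None:
    """Apply a single write command to an in-memory key/value state."""
    name = command[0].lower()
    if name == "set":
        state[command[1]] = command[2]
    elif name == "del":
        for key in command[1:]:
            state.pop(key, None)
    else:
        raise ValueError("unsupported command: " + name)


def rewrite_from_state(state: Dict[str, str]) -> List[List[str]]:
    """Produce the shortest log that rebuilds `state`: one set per live key.

    Only the current state is consulted; the old command log is never read.
    """
    return [["set", key, value] for key, value in state.items()]
```

If the history is set s1 a, set s1 b, set s2 world, del s2, set s3 x, the state ends up holding only s1 and s3, so the rewritten log contains two commands instead of five.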

127.0.0.1:6379> bgrewriteaof
Background append only file rewriting started

Automatic triggering runs bgrewriteaof based on the auto-aof-rewrite-percentage and auto-aof-rewrite-min-size settings.

# When the AOF file is larger than 64 MB and has grown by 100% (that is, doubled) relative to its size after the last rewrite, bgrewriteaof is executed
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
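The two settings combine roughly as in the following Python sketch. `should_auto_rewrite` is a hypothetical function written from the description above, not taken from the Redis source, and the treatment of a zero base size is an assumption.

```python
MB = 1024 * 1024


def should_auto_rewrite(
    current_size: int,
    base_size: int,
    percentage: int = 100,
    min_size: int = 64 * MB,
) -> bool:
    """Decide whether an automatic rewrite is due.

    current_size: size of the AOF file now, in bytes
    base_size:    size of the AOF file after the last rewrite, in bytes
    """
    if percentage == 0:
        return False  # a percentage of 0 disables automatic rewriting
    if current_size <= min_size:
        return False  # small files are never rewritten automatically
    base = base_size if base_size > 0 else 1  # assumption: avoid dividing by zero
    growth = (current_size - base) * 100 / base
    return growth >= percentage
```

With the defaults, a file that was 64 MB after the last rewrite triggers a rewrite once it reaches 128 MB, while a 60 MB file never does, no matter how much it has grown.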

Here is what happens when the bgrewriteaof command is executed:

  • A rewrite involves a lot of write operations, so the server process forks a child process to create the new AOF file
  • During the rewrite, the server process continues to process command requests. Each write command is appended to aof_buf and, at the same time, to aof_rewrite_buf, the AOF rewrite buffer
  • When the child process finishes the rewrite, it signals the parent process. The parent writes the contents of the AOF rewrite buffer to the new temporary AOF file and then renames it to replace the old file, so that the new AOF file is consistent with the current database state
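The role of the rewrite buffer in the steps above can be modelled with a small single-process simulation in Python. `MiniServer` is hypothetical: a dictionary copy stands in for the memory image the forked child would see, lists stand in for files, and only set commands are modelled.

```python
from typing import Dict, List, Optional, Tuple

Command = Tuple[str, str, str]


class MiniServer:
    """Toy model of the rewrite steps; not how Redis is actually structured."""

    def __init__(self) -> None:
        self.state: Dict[str, str] = {}
        self.aof: List[Command] = []  # stands in for the AOF file
        self.rewrite_buf: Optional[List[Command]] = None
        self._child_view: Dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self.state[key] = value
        command = ("set", key, value)
        self.aof.append(command)  # normal AOF path
        if self.rewrite_buf is not None:
            self.rewrite_buf.append(command)  # also remembered for the new file

    def start_rewrite(self) -> None:
        self._child_view = dict(self.state)  # what the forked child would see
        self.rewrite_buf = []

    def finish_rewrite(self) -> None:
        # The child's output: one command per key in its view of the data.
        new_aof: List[Command] = [("set", k, v) for k, v in self._child_view.items()]
        # The parent appends everything that happened during the rewrite...
        new_aof.extend(self.rewrite_buf or [])
        # ...and swaps the new log in, like renaming the temporary file.
        self.aof = new_aof
        self.rewrite_buf = None
```

Without the rewrite buffer, a command executed between `start_rewrite()` and `finish_rewrite()` would be missing from the new log, because the child only ever sees the data as it was at fork time.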

Data recovery

Redis 4.0 and later supports mixed RDB and AOF persistence (enabled with the aof-use-rdb-preamble option).

  • If the Redis process crashes, restart it and restore the data from the AOF log file
  • If the machine running Redis goes down, restart the machine and then the Redis process. If the AOF file is damaged, repair it with the redis-check-aof --fix command
  • If there is no AOF file, the RDB file is loaded
  • If the latest AOF and RDB files are both lost or corrupted, try to restore the data from the most recent RDB backup copy available on the machine

RDB vs AOF

Now that both RDB and AOF persistence have been introduced, let's look at the advantages and disadvantages of each and at how to choose a persistence scheme.

Advantages and disadvantages of RDB and AOF

redis.io/topics/pers…

RDB

Advantages:

  • An RDB snapshot is a compact, compressed file that holds the data set at a single point in time, which makes it well suited to backups and disaster recovery
  • To save the RDB file, the server process only needs to fork a child process, which creates the file. The parent process performs no disk I/O
  • Restoring a large data set from RDB is faster than from AOF

Disadvantages:

  • RDB offers weaker data safety than AOF, and saving the entire data set is a heavier operation than appending to the AOF. Depending on the configuration, snapshots may be several minutes apart, so if the server goes down, several minutes of data may be lost
  • When the data set is large, the fork and the snapshot can be CPU-intensive and time-consuming

AOF

Advantages:

  • More complete data and higher safety, with data loss measured in seconds (depending on the fsync policy; with everysec, at most about 1 second of data is lost)
  • An AOF file is an append-only log in which write operations are saved in Redis protocol format. The content is readable, which makes it suitable for emergency recovery after an accidental deletion

Disadvantages:

  • For the same data set, the AOF file is larger than the RDB file, and data recovery is slower
  • Depending on the fsync policy used, AOF may be slower than RDB. In general, however, performance with fsync once per second is still very high

How to select RDB and AOF

  • If the data is not sensitive and can be regenerated from elsewhere, you can turn off persistence
  • If the data is important and you don't want to regenerate it from elsewhere, but you can tolerate losing a few minutes of it (a cache, for example), you can use RDB alone
  • When Redis is used as an in-memory database, it is recommended to enable both RDB and AOF, or to run bgsave periodically for snapshot backups. RDB is better suited to data backup, and AOF keeps data loss to a minimum