How does Redis avoid data loss when it goes down?

  • AOF
  • RDB

AOF

AOF (append-only file) is a write-after log: Redis first executes the command and writes the data to memory, and only then records the command in the log. To avoid extra checking overhead, Redis does not check the syntax of a command when logging it to the AOF. If the log were written before the command was executed, an invalid command could be recorded, and Redis could hit an error when it later used the log to recover data.

AOF log format:

Take the AOF entry recorded after Redis receives the “set key1 value1” command as an example. “*3” indicates that the command has three parts. Each part starts with “$ + number”, followed by the specific command, key, or value, where “number” is the number of bytes in that command, key, or value. For example, “$3 set” means that this part has three bytes, namely the “set” command.
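The entry format can be reproduced with a few lines of Python. This is a minimal sketch, not Redis code: the function name encode_command is hypothetical, and it assumes the entry uses the Redis serialization protocol, in which "*" prefixes the number of parts and "$" prefixes the byte length of each part.

```python
def encode_command(*parts: str) -> bytes:
    """Encode a command as '*<count>' followed by '$<length>' + payload per part."""
    chunks = [f"*{len(parts)}\r\n".encode()]
    for part in parts:
        data = part.encode("utf-8")
        # The length is counted in bytes, not characters.
        chunks.append(f"${len(data)}\r\n".encode())
        chunks.append(data + b"\r\n")
    return b"".join(chunks)


if __name__ == "__main__":
    entry = encode_command("set", "key1", "value1")
    # Show one protocol line per output line for readability.
    print(entry.decode().replace("\r\n", "\n"))
```

For “set key1 value1” the parts are 3, 4, and 6 bytes long, which is what the length prefixes in the entry report.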

Benefits:

  • Post-write logging prevents incorrect commands from being logged.
  • It logs after the command is executed, so it does not block the current write operation.

Risk:

  • If Redis goes down just after a command is executed but before it is logged, the command and its data are at risk of being lost. If Redis is used as a cache, the data can be re-read from the back-end database, but if Redis is used directly as a database, the command was never logged, so the log cannot be used to recover it.
  • While AOF avoids blocking the current command, it may block the next one. The AOF log is also written by the main thread, so if the disk is under heavy write pressure while the log is being written, the write will be slow and subsequent operations cannot proceed until it finishes.

Write-back policy (appendfsync)
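The appendfsync option accepts three values: always (sync to disk after every write), everysec (sync about once per second), and no (leave the timing to the operating system). The trade-off between them can be sketched in Python as below. This is only an illustration under stated assumptions: the AofWriter class and its fsync_calls counter are hypothetical, and the everysec check is done inline on each append, whereas Redis hands that sync to a background thread.

```python
import os
import time


class AofWriter:
    """Hypothetical append-only log writer with three write-back policies."""

    def __init__(self, path: str, policy: str = "everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.file = open(path, "ab")
        self.last_sync = time.monotonic()
        self.fsync_calls = 0  # bookkeeping for the example only

    def append(self, entry: bytes) -> None:
        self.file.write(entry)
        self.file.flush()  # hand the bytes to the OS page cache
        if self.policy == "always":
            self._sync()  # safest, slowest: every write waits for the disk
        elif self.policy == "everysec":
            if time.monotonic() - self.last_sync >= 1.0:
                self._sync()  # at most about one second of writes at risk
        # policy "no": never sync explicitly, the OS decides when to flush

    def _sync(self) -> None:
        os.fsync(self.file.fileno())
        self.last_sync = time.monotonic()
        self.fsync_calls += 1

    def close(self) -> None:
        self.file.close()
```

The more often the log is synced, the less data can be lost in a crash and the more each write costs, which is the blocking risk described above.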

What if the log file is too large?

Existing problems:

  • The file system itself has a limit on the file size and cannot save large files
  • If the file is too large, it becomes inefficient to append command records to it later
  • In the event of an outage, the commands recorded in the AOF must be re-executed one by one for recovery. If the log file is too large, the recovery process will be very slow, which will affect the normal use of Redis

The AOF rewrite mechanism solves this problem: simply put, multiple commands that operate on the same key are merged into a single command.
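The effect of merging commands for the same key can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the function name rewrite_aof is hypothetical, only "set" and "del" are modeled, and the sketch replays the old log to find the final state, whereas Redis generates the new log from the data currently in the database.

```python
def rewrite_aof(commands):
    """Collapse a command log into one 'set' per key that still exists."""
    state = {}
    for cmd in commands:
        op, key = cmd[0].lower(), cmd[1]
        if op == "set":
            state[key] = cmd[2]   # a later set overrides earlier ones
        elif op == "del":
            state.pop(key, None)  # a deleted key needs no entry at all
        else:
            raise ValueError(f"unsupported command: {op}")
    return [("set", key, value) for key, value in state.items()]


if __name__ == "__main__":
    old_log = [
        ("set", "k", "1"),
        ("set", "k", "2"),
        ("set", "j", "x"),
        ("del", "j"),
        ("set", "k", "3"),
    ]
    print(rewrite_aof(old_log))
```

Five logged commands shrink to a single entry for the one key that survives, which is why the rewritten file is smaller and faster to replay.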

Does AOF rewriting block?

Unlike the AOF log, which is written back by the main thread, the rewrite is carried out by the background child process bgrewriteaof, so that it does not block the main thread and degrade database performance.

Rewrite process: one copy, two logs

“One copy”: Each time a rewrite is performed, the main thread forks the background bgrewriteaof child process. The fork gives the child a copy of the main thread’s memory, which contains the latest data in the database. The bgrewriteaof child can then write the copied data to the rewrite log as operations, one by one, without affecting the main thread.

“Two logs”: Since the main thread is not blocked, it can still process incoming operations. If a write operation arrives during the rewrite, the first log involved is the AOF log currently in use: Redis writes the operation to its buffer. This way, even if there is an outage, the operations in the current AOF log are still complete and can be used for recovery.

The second log is the new AOF rewrite log. The same operation is also written to the rewrite log’s buffer, so the rewrite log does not miss the latest operations. Once the child has finished rewriting all of the copied data, the operations held in the rewrite buffer are also written to the new AOF file so that it reflects the latest state of the database. At this point, the new AOF file can replace the old one.
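The “one copy, two logs” flow can be traced with a toy simulation. This is a sketch under stated assumptions, not how Redis is implemented: the class RewriteSim and its method names are hypothetical, a deep copy of a dict stands in for the memory copy produced by fork, and everything runs in one thread.

```python
import copy


class RewriteSim:
    """Toy model of an AOF rewrite: one data copy, two logs."""

    def __init__(self):
        self.db = {}
        self.aof = []            # log 1: the AOF currently in use
        self.rewrite_buf = None  # log 2: exists only while a rewrite runs
        self.snapshot = None     # the "one copy" handed to the child

    def set(self, key, value):
        self.db[key] = value
        self.aof.append(("set", key, value))
        if self.rewrite_buf is not None:
            # A rewrite is in progress: record the operation a second time.
            self.rewrite_buf.append(("set", key, value))

    def start_rewrite(self):
        self.snapshot = copy.deepcopy(self.db)  # stands in for fork
        self.rewrite_buf = []

    def finish_rewrite(self):
        # The child turns the copied data into one operation per key ...
        new_aof = [("set", k, v) for k, v in self.snapshot.items()]
        # ... then the operations that arrived meanwhile are appended.
        new_aof.extend(self.rewrite_buf)
        self.aof = new_aof  # the new file replaces the old one
        self.snapshot = None
        self.rewrite_buf = None


if __name__ == "__main__":
    sim = RewriteSim()
    sim.set("a", "1")
    sim.set("a", "2")
    sim.start_rewrite()
    sim.set("b", "3")  # arrives while the rewrite is running
    sim.finish_rewrite()
    print(sim.aof)
```

The write to “b” lands in both logs, so neither the old AOF (if the rewrite fails) nor the new one (if it succeeds) is missing it.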

RDB

When you use AOF for fault recovery, every logged operation has to be executed one by one. If there are too many operations in the log, Redis recovers slowly, which affects normal use. A faster recovery method is the memory snapshot.

A memory snapshot is a record of the state of the data in memory at a certain moment. It is similar to a photo: when you take a picture of a friend, a single photo captures exactly what that friend looked like at that instant. Redis achieves the same effect by writing the state at a given moment to disk as a file, known as a snapshot (the RDB file). This way, even if Redis goes down, the snapshot file is not lost, and data reliability is guaranteed.

Two questions need to be answered:

  • Can data be added, deleted, or changed while a snapshot is being taken? This determines whether Redis is blocked during the snapshot or can keep handling requests normally.

  • How often should snapshots be taken?

Redis provides two commands to generate RDB files, save and bgsave.

  • save: executed in the main thread, so it blocks.
  • bgsave: creates a child process dedicated to writing the RDB file, which avoids blocking the main thread. This is also the default way Redis generates RDB files.

Pausing writes for the duration of a snapshot is certainly not acceptable. Instead, Redis relies on the copy-on-write (COW) technology provided by the operating system so that write operations can be processed while a snapshot is being taken. In simple terms, the bgsave child process is forked from the main thread and shares all of the main thread’s memory data. Once the bgsave child runs, it starts reading that memory data and writing it to the RDB file. If the main thread only reads the data (for example, key-value pair A in the figure), the main thread and the bgsave child do not affect each other. However, if the main thread modifies a piece of data (such as key-value pair C in the figure), that piece of data is copied first. The bgsave child keeps writing the version that existed when the snapshot started to the RDB file, while the main thread is free to apply its modification.

In the redis.conf configuration file, the save option can be configured as follows:

  • save 900 1: if at least 1 key changes within 900 seconds, a snapshot is saved
  • save 300 10: if at least 10 keys change within 300 seconds, a snapshot is saved
  • save 60 10000: if at least 10000 keys change within 60 seconds, a snapshot is saved

The dirty counter and the lastsave attribute

The dirty counter records how many changes (writes, deletes, updates, and so on) have been made to the Redis server since the save or bgsave command was last executed successfully.

The lastsave attribute is a timestamp that records the last time the save or bgsave command was executed successfully.

With these two attributes, the dirty counter increases by 1 each time the server successfully performs a change, and lastsave records when save or bgsave last completed. The Redis server also has a periodic function, serverCron, which runs every 100 milliseconds by default. This function iterates through all the save conditions in the saveparams array and executes bgsave if any one of them is met.

After the snapshot completes, the dirty counter is reset to 0 and lastsave is updated to the time at which the command was executed.
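The check that the periodic function performs can be sketched as follows. This is an illustrative model rather than Redis source: the SaveParam and Server classes and the cron method are hypothetical names, the snapshot itself is omitted, and the sketch assumes a condition is met once more than the configured number of seconds has elapsed with at least the configured number of changes.

```python
import time
from dataclasses import dataclass, field


@dataclass
class SaveParam:
    seconds: int  # time window from a "save <seconds> <changes>" line
    changes: int  # minimum number of changes within that window


@dataclass
class Server:
    saveparams: list
    dirty: int = 0
    lastsave: float = field(default_factory=time.time)

    def write(self) -> None:
        """Every successful change bumps the dirty counter."""
        self.dirty += 1

    def should_bgsave(self, now: float) -> bool:
        elapsed = now - self.lastsave
        return any(
            self.dirty >= p.changes and elapsed > p.seconds
            for p in self.saveparams
        )

    def cron(self, now: float) -> bool:
        """One periodic tick: trigger a snapshot if any condition is met."""
        if self.should_bgsave(now):
            # A real server would start bgsave here; the sketch only
            # performs the bookkeeping described in the text.
            self.dirty = 0
            self.lastsave = now
            return True
        return False


if __name__ == "__main__":
    params = [SaveParam(900, 1), SaveParam(300, 10), SaveParam(60, 10000)]
    server = Server(saveparams=params, lastsave=0.0)
    for _ in range(10):
        server.write()
    print(server.cron(now=200.0))  # 10 changes, but only 200 s elapsed
    print(server.cron(now=301.0))  # "save 300 10" is now satisfied
```

Passing the current time in as an argument keeps the sketch deterministic and easy to test.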

Compared with AOF, recovery from a snapshot is faster. However, the right snapshot frequency is hard to determine. If snapshots are too infrequent, more data is lost when Redis goes down between two snapshots. If they are too frequent, they add extra overhead. Is there a way to keep RDB’s fast recovery while losing as little data as possible at low cost?

Redis 4.0 introduced a hybrid approach that combines AOF logging with memory snapshots. In simple terms, memory snapshots are taken at a certain frequency, and the AOF log records all command operations between two snapshots. This way, snapshots do not have to be taken as often, which avoids the impact of frequent forks on the main thread. In addition, the AOF log only records the operations between snapshots rather than all operations, so the file does not grow too large and the overhead of rewriting is avoided. As shown in the following figure, changes made at time T1 and time T2 are recorded in the AOF log. After the second full snapshot is taken, the AOF log can be cleared, because those changes are already included in the snapshot and the log is no longer needed for recovery.
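Recovery under the hybrid approach amounts to loading the most recent snapshot and then replaying only the operations logged after it. The sketch below illustrates that idea under stated assumptions: the function name recover is hypothetical, the snapshot is modeled as a plain dict, and only "set" and "del" operations are handled.

```python
def recover(snapshot: dict, aof_tail: list) -> dict:
    """Rebuild the dataset from a snapshot plus the operations logged after it."""
    db = dict(snapshot)  # fast path: load the full state in one step
    for cmd in aof_tail:  # slow path: replay only the short tail
        op = cmd[0]
        if op == "set":
            db[cmd[1]] = cmd[2]
        elif op == "del":
            db.pop(cmd[1], None)
        else:
            raise ValueError(f"unsupported command: {op}")
    return db


if __name__ == "__main__":
    snapshot = {"a": "1", "b": "2"}
    tail = [("set", "a", "9"), ("del", "b"), ("set", "c", "3")]
    print(recover(snapshot, tail))
```

Because the tail only covers the interval since the last snapshot, the replay stays short no matter how long the server has been running.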