
Redis serves requests from data held in memory. If the data exists only in memory, all of it is lost when Redis restarts.

1 Introduction to persistence

1.1 What is Persistence

Redis keeps all data in memory and asynchronously saves updates to disk. Persistence exists mainly for disaster recovery and data recovery, so it can be considered part of high availability.

If your Redis goes down, the first priority is to make it available again as soon as possible!

You restart Redis so that it can serve requests again. But if you made no data backup, the data is gone even though Redis is running. What is it supposed to serve?

A large number of requests may then arrive, miss the cache because Redis no longer has the data, and fall through to the MySQL database. This is a cache avalanche: MySQL suddenly has to absorb high concurrency, and it crashes!

With MySQL down, you cannot even find data to restore into Redis. Where does the Redis data come from? From MySQL!

If you have a sound plan for Redis persistence, backup, and recovery, then even when Redis fails you can restore quickly from the backup and resume serving requests as soon as the restore finishes.

1.2 Persistence Mode

Redis provides two methods of persistence:

Redis RDB – Snapshot

RDB takes point-in-time snapshots of the data set at specified intervals, similar to a MySQL dump.

Redis AOF – Command log

AOF logs every write operation received by the server. When Redis restarts, the log is replayed to rebuild the original data set. Commands are logged in the same format as the Redis protocol itself, in an append-only fashion. Redis can rewrite the log in the background when it becomes too large. This is similar to the MySQL binlog and the HBase HLog.
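To make the log format concrete, here is a minimal sketch of how a write command looks once it is encoded in the Redis protocol (RESP), the format the text above says AOF uses. The function name `encode_resp` is made up for this illustration and is not part of Redis or of any client library.

```python
def encode_resp(*args: str) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    # "*<n>\r\n" announces an array with n elements.
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        # "$<len>\r\n<bytes>\r\n" is one bulk string.
        out.append(b"$%d\r\n" % len(data))
        out.append(data + b"\r\n")
    return b"".join(out)


# The bytes that would represent "SET hello world" in the log.
encoded = encode_resp("SET", "hello", "world")
print(encoded)
```

Because each argument carries its own length prefix, the log stays easy to read and to parse even when values contain spaces or newlines.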

If you want Redis to be used only as a pure memory cache, you can also disable RDB and AOF.

You can use both AOF and RDB in the same instance. In this case, when Redis restarts, the AOF file will be used to rebuild the original data set, as it is guaranteed to be the most complete.

The most important thing is to understand the different tradeoffs between RDB and AOF persistence.

2 RDB – Full write

The key-value pairs that the Redis server stores across its databases can be understood as the state of Redis. Every write moves Redis from one state to another. Full persistence writes all Redis data to disk at some point in time to form a snapshot. When Redis restarts, it can be restored to the last persisted state by loading the latest snapshot.

2.1 Trigger Mode

The save command

SAVE can be triggered explicitly by a client or when Redis shuts down. SAVE runs synchronously on the single main thread, so the Redis server may stall for a long time when the data set is large. Because no other command is executed during the backup, the data stays in a consistent state throughout.

If an old RDB file exists, the new one replaces it. The time complexity is O(N).

bgsave

BGSAVE can be triggered in several ways:

  • Explicitly by a client
  • Automatically, by configured save rules
  • By slave nodes in a master-slave architecture

When BGSAVE is executed, Redis forks a child process. As soon as the fork completes, Redis returns a response to the client, and the child performs the backup asynchronously in the background without affecting normal request handling.

With BGSAVE, the child process forked by the parent works on a copy of the memory state as of the moment of the fork. Changes made while the snapshot is being written are not reflected in the backup.
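The snapshot semantics of fork can be illustrated with a small Unix-only sketch. It is not how Redis is implemented (Redis is written in C). The function name and the two pipes used for synchronization are assumptions made purely for the demonstration.

```python
import os


def snapshot_seen_by_child():
    """Fork, modify data in the parent, and report what the child still sees."""
    data = {"hello": "world"}      # in-memory state before the fork
    go_r, go_w = os.pipe()         # parent -> child: "I have written"
    out_r, out_w = os.pipe()       # child -> parent: value seen by the child

    pid = os.fork()
    if pid == 0:
        # Child: owns a copy-on-write view of memory as of the fork.
        os.close(go_w)
        os.close(out_r)
        os.read(go_r, 1)           # wait until the parent has changed its copy
        os.write(out_w, data["hello"].encode())
        os._exit(0)

    # Parent: keeps accepting "writes" after the fork.
    os.close(go_r)
    os.close(out_w)
    data["hello"] = "java"
    os.write(go_w, b"x")
    seen_by_child = os.read(out_r, 64).decode()
    os.waitpid(pid, 0)
    os.close(go_w)
    os.close(out_r)
    return seen_by_child, data["hello"]
```

Calling `snapshot_seen_by_child()` returns the child's view first and the parent's view second. The child still holds the value from before the fork, which is why writes that arrive during a background save do not end up in that snapshot.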

Triggering through configuration instead of commands

In the default configuration of Redis, BGSAVE is triggered automatically as soon as any one of the following conditions is met:

| configuration | seconds | changes |
|---|---|---|
| save | 900 | 1 |
| save | 300 | 10 |
| save | 60 | 10000 |
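The rule evaluation can be sketched as below. This is a simplified illustration, not Redis source code: the names `SAVE_RULES` and `should_bgsave` are hypothetical, and the exact boundary comparison used inside Redis may differ.

```python
# (seconds, changes) pairs taken from the save lines above.
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]


def should_bgsave(seconds_since_last_save, changes_since_last_save,
                  rules=SAVE_RULES):
    """Return True if at least one save rule is satisfied."""
    return any(
        seconds_since_last_save >= seconds
        and changes_since_last_save >= changes
        for seconds, changes in rules
    )
```

For example, `should_bgsave(61, 10000)` is satisfied by the `save 60 10000` rule, while `should_bgsave(61, 9999)` satisfies none of the three rules.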

The advantage of BGSAVE over SAVE is asynchronous execution: the command does not block the commands that follow it. However, forking the child process involves copying memory from the parent process, which increases the server's memory overhead. When memory pressure is high enough that virtual memory is used, the fork for BGSAVE blocks and can make Redis unavailable for seconds. Using BGSAVE therefore requires that the server has enough free memory.

| Command | save | bgsave |
|---|---|---|
| IO type | synchronous | asynchronous |
| Blocking | blocking | non-blocking (blocks at fork) |
| Complexity | O(N) | O(N) |
| Advantages | No extra memory is consumed | Does not block client commands |
| Disadvantages | Blocks client commands | Forks a child process, which costs memory |

RDB optimal configuration

Disable automatic RDB:

dbfilename dump-${port}.rdb
dir /redisDataPath
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes

Triggers that are easy to overlook

  • Full replication: the master node runs BGSAVE during master/slave replication
  • debug reload
  • shutdown
  • flushdb, flushall

RDB summary

  1. RDB is a snapshot of Redis memory written to disk for persistence
  2. SAVE usually blocks Redis
  3. BGSAVE does not block Redis, but it forks a new process
  4. With automatic save rules, a snapshot is taken as soon as any one rule is satisfied

Advantages of RDB

  • RDB generates multiple data files, each of which represents all the data in Redis at one point in time. This makes it ideal for cold backup. Such complete data files can be sent to remote storage, such as ODPS distributed storage, to back up Redis data periodically under a predetermined backup policy
  • RDB has very little impact on the read and write service that Redis provides, so Redis keeps its high performance: the main Redis process only has to fork a child process, and the child writes the RDB file
  • Restarting and restoring a Redis process directly from an RDB file is faster than doing so from an AOF

RDB shortcomings

  • Time-consuming: O(N)
  • fork(): memory consumption under the copy-on-write policy

Each time RDB forks a child process to generate the snapshot file, if the data set is very large, service to clients may pause for milliseconds, or even seconds.

  • Uncontrollable, easy to lose data

An RDB snapshot is generally generated every 5 minutes or longer. If Redis goes down in between, the writes made since the last snapshot are lost.

2.2 Restoration Process

When Redis restarts, it loads the previously persisted file from the local disk. Once recovery is complete, it starts processing requests.

3 AOF (Append only File) – Incremental mode

RDB records the full data of a state, while AOF records each write command. Replaying all of the write commands recovers the final data state.


3.1 Write Process

AOF's three strategies

always

  • Every write command triggers a synchronous flush of the buffer to disk. This policy reduces Redis throughput, because every write operation triggers synchronization, but it has the highest fault tolerance.

everysec

  • Synchronization is triggered asynchronously once per second. This is the Redis default.

no

  • The operating system decides when to synchronize. In this mode Redis cannot decide when data reaches the disk, so durability is not under its control.

Comparison

| Command | always | everysec | no |
|---|---|---|---|
| Advantages | No data loss | fsync runs once per second; at most about 1 second of data is lost | Nothing to manage |
| Disadvantages | High IO overhead; a typical SATA disk handles only a few hundred TPS | Up to 1 second of data is lost | Uncontrollable |
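The difference between the three policies comes down to when fsync is called. The sketch below shows that decision in a hypothetical `AofWriter` class, which is not a Redis API. One simplification is labeled in the code: here the everysec check happens when a command is appended, whereas Redis performs the per-second fsync in the background.

```python
import os
import time


class AofWriter:
    """Append commands to a file and fsync according to a policy."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        if policy not in ("always", "everysec", "no"):
            raise ValueError("unknown policy: %s" % policy)
        self.file = open(path, "ab")
        self.policy = policy
        self.clock = clock              # injectable so the sketch is testable
        self.last_sync = clock()
        self.fsync_calls = 0

    def append(self, command: bytes) -> None:
        self.file.write(command)
        self.file.flush()               # hand the bytes to the OS page cache
        if self.policy == "always":
            self._fsync()               # durable before the next command
        elif self.policy == "everysec":
            # Simplification: checked on write, not by a background task.
            if self.clock() - self.last_sync >= 1.0:
                self._fsync()
        # policy "no": the operating system decides when to write back

    def _fsync(self) -> None:
        os.fsync(self.file.fileno())    # force the data onto the disk
        self.last_sync = self.clock()
        self.fsync_calls += 1

    def close(self) -> None:
        self.file.close()
```

With `always`, the `fsync_calls` counter grows with every command, which is where the throughput cost comes from. With `everysec`, it grows at most once per second regardless of write volume.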

3.2 Playback Process

AOF playback also happens when Redis starts. If an AOF file exists, Redis chooses incremental playback.

Because incremental persistence writes to disk continuously, its data is more complete than that of full persistence. Playback executes the commands stored in the AOF again. After that, Redis resumes accepting new commands from clients.

AOF rewrite optimization

As Redis continues to run, a large amount of incremental data is appended to AOF files. To reduce hard disk storage and speed recovery, Redis uses the rewrite mechanism to merge historical AOF records. As follows:

Native AOF

set hello world
set hello java
set hello hehe
incr counter
incr counter
rpush mylist a
rpush mylist b
rpush mylist c
(stale data)

AOF rewrite

set hello hehe
set counter 2 
rpush mylist a b c

The role of AOF rewrite

  • Reduce disk usage
  • Speed up recovery
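The effect of the rewrite on the example above can be sketched in a few lines. The helper `rewrite_aof` is hypothetical and handles only the three commands used in the example. Note the assumption in the first half: replaying the old log here only stands in for the state that Redis already holds in memory.

```python
def rewrite_aof(commands):
    """Collapse a command log into the minimal commands for the final state."""
    state = {}
    # Stand-in for the in-memory data set: replay the log once.
    for line in commands:
        op, key, *args = line.split()
        if op == "set":
            state[key] = args[0]
        elif op == "incr":
            state[key] = str(int(state.get(key, "0")) + 1)
        elif op == "rpush":
            state.setdefault(key, []).extend(args)
        else:
            raise ValueError("unsupported command in this sketch: " + op)

    # Emit one command per key, based only on the final state.
    rewritten = []
    for key, value in state.items():
        if isinstance(value, list):
            rewritten.append("rpush " + " ".join([key, *value]))
        else:
            rewritten.append("set %s %s" % (key, value))
    return rewritten


native_aof = [
    "set hello world", "set hello java", "set hello hehe",
    "incr counter", "incr counter",
    "rpush mylist a", "rpush mylist b", "rpush mylist c",
]
for line in rewrite_aof(native_aof):
    print(line)
```

Eight logged commands shrink to three, one per key, which is why a rewritten file is both smaller and faster to replay.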

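3.3 Two ways to trigger an AOF rewrite

The bgrewriteaof command

AOF rewrite configuration

Configuration items

  • auto-aof-rewrite-min-size: the minimum AOF file size required for a rewrite

  • auto-aof-rewrite-percentage: the AOF file growth rate required for a rewrite

Statistics

  • aof_current_size: the current AOF size (in bytes)

  • aof_base_size: the AOF size when Redis last started or last rewrote the file (in bytes)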

Automatic trigger conditions (both must hold)

aof_current_size > auto-aof-rewrite-min-size
(aof_current_size - aof_base_size) / aof_base_size > auto-aof-rewrite-percentage
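As a worked example, the check can be written out as a small function. This is a sketch of the two conditions as stated here, not Redis source code. The function name is hypothetical, the percentage is treated as a percent value (100 means the file has doubled), and the guard against a zero base size is an assumption added so the division is always defined.

```python
MB = 1024 * 1024


def should_rewrite_aof(aof_current_size, aof_base_size,
                       min_size=64 * MB, percentage=100):
    """Return True when both automatic rewrite conditions hold."""
    # Condition 1: the file must be larger than the minimum size.
    if aof_current_size <= min_size:
        return False
    # Condition 2: growth relative to the base size, in percent.
    base = max(aof_base_size, 1)    # assumption: avoid dividing by zero
    growth_percent = (aof_current_size - base) * 100 / base
    return growth_percent > percentage
```

With the defaults, a 100 MB file whose base size was 40 MB has grown by 150% and qualifies. The same file with a 60 MB base has grown by about 67% and does not.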

3.4 AOF rewrite process

AOF rewrite configuration

Modifying a Configuration File

appendonly yes
appendfilename "appendonly-$(port).aof"
appendfsync everysec
dir /opt/soft/redis/data
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
no-appendfsync-on-rewrite yes

The advantages of AOF

  • Better protection against data loss

In general, AOF performs an fsync once per second through a background thread, so at most about 1 second of data is lost.

  • Append-only writes

Because the log is written in append-only mode, there is no disk seek overhead and write performance is high. The file is also hard to corrupt, and even if the tail of the file is damaged, it is easy to repair.

  • Even when the log file grows too large and a background rewrite runs, client reads and writes are not affected

During a rewrite, the commands in the log are compacted into the minimal log needed to recover the data. While the new log is being created, the old log file continues to be written as usual. Once the new, merged log file is ready, the old and new files are swapped!

  • Commands are recorded in a very readable format

This makes AOF ideal for emergency recovery from catastrophic accidental deletions. For example, if someone clears all data with flushall by mistake, then as long as no background rewrite has happened yet, you can copy the AOF file, delete the trailing flushall command, and put the file back so that the recovery mechanism restores all the data.

Disadvantages of AOF

  • For the same data, AOF logs are generally larger than RDB snapshots
  • With AOF enabled, write QPS is lower than with RDB, because AOF is typically configured to fsync the log once per second, although performance is still very high
  • In the past, AOF had bugs where replaying the recorded log did not reproduce exactly the same data

More complex command-log, merge, and playback approaches such as AOF are somewhat more bug-prone than the RDB approach of persisting a complete snapshot each time. However, the AOF rewrite is designed to avoid such bugs: instead of merging the old command log, each rewrite regenerates the commands from the data in memory at that moment, which is more robust.

4 Selection and best practice

| Command | RDB | AOF |
|---|---|---|
| Startup priority | low | high |
| File size | small | large |
| Recovery speed | fast | slow |
| Data security | loses data | depends on the policy |
| Weight | heavyweight | lightweight |

4.1 Optimal RDB Policy

  • Turn automatic RDB off
  • Manage RDB operations manually and centrally
  • Enable automatic saving on the slave node, but do not run RDB frequently

4.2 Optimal AOF Policy

  • Enabling AOF is recommended, unless Redis is used purely as a cache
  • Manage AOF rewrites centrally
  • Use the everysec policy

4.3 Choice between RDB and AOF

  1. Don't use only RDB, because you will lose a lot of data
  2. Don't use only AOF either, because that has two problems
    • Restoring from an AOF cold backup is slower than restoring from an RDB cold backup
    • RDB simply generates a complete snapshot each time, which is more robust and avoids the bugs of complex backup and recovery mechanisms such as AOF
  3. Use AOF and RDB together
    • Use AOF as the first choice for data recovery, so that data is not lost
    • Use RDB for cold backups at different intervals; if the AOF files are lost or damaged, RDB still allows fast data recovery

4.4 Some best practices

  • Small shard

For example, set the maxmemory parameter so that each Redis instance stores only about 4 GB of data, which keeps every operation from becoming too slow

  • Monitoring (hard disk, memory, load, network)
  • Enough memory
