Redis is built on a single-threaded model: one thread handles all client requests. Although Redis uses non-blocking I/O and most of its commands are O(1), single-threaded execution means any slow operation stalls every client, so performance is demanding. This article covers optimization methods to make Redis run more efficiently.

This article covers the following methods for improving the running speed of Redis:

  1. Shorten the storage length of key-value pairs;
  2. Use the lazy free feature;
  3. Set expiration times on key values;
  4. Disable time-consuming query commands;
  5. Use the slowlog command to find slow operations;
  6. Batch operations with pipeline;
  7. Avoid expiring large amounts of data at the same time;
  8. Optimize client usage;
  9. Limit the Redis memory size;
  10. Run Redis on physical machines instead of virtual machines;
  11. Check the data persistence strategy;
  12. Disable the THP feature;
  13. Use a distributed architecture to increase read and write throughput.

1. Shorten the storage length of key/value pairs

The length of key-value pairs is inversely related to performance: the longer the values, the slower operations become. For example, a set of write benchmarks with increasing value sizes shows that larger values take progressively longer to write.

With the key held constant, the larger the value, the slower the operation. One reason is that Redis uses different internal encodings for the same data type. Strings, for example, have three internal encodings: int (integer encoding), embstr (an optimized encoding for short strings), and raw (a dynamic string encoding for longer values). The author of Redis chose different encodings to balance efficiency and space, but the larger the data, the more complex the internal encoding it requires, and the more complex the encoding, the lower the storage performance.

That is just write speed; large key-value pairs also cause several other problems:

  • The larger the content, the longer persistence takes and the longer the main thread may stall, lowering Redis performance;
  • The larger the content, the more data is transferred over the network, the longer it takes, and the lower the overall speed;
  • The larger the content, the more memory it consumes and the more often the eviction policy is triggered, adding load to Redis.

Therefore, while preserving the semantics, we should shorten the storage length of key-value pairs as much as possible. If necessary, data should be serialized and compressed before being stored. Taking Java as an example, protostuff or Kryo can be used for serialization, and Snappy for compression.
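Kryo and Snappy are third-party libraries, so as a minimal, self-contained sketch of the compress-before-store idea, here is a GZIP round trip using only the JDK (class and method names here are my own, not from the article):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ValueCompressor {
    // Compress a value before storing it in Redis (as a byte[] value)
    public static byte[] compress(String value) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(value.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    // Decompress a value read back from Redis
    public static String decompress(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[1024];
            int n;
            while ((n = gzip.read(buf)) > 0) {
                out.write(buf, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // A long, repetitive JSON-like value compresses well
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("{\"id\":").append(i).append(",\"status\":\"active\"},");
        }
        String value = sb.toString();
        byte[] compressed = compress(value);
        System.out.println(value.length() + " chars -> " + compressed.length + " bytes");
    }
}
```

The same pattern applies with Snappy or LZ4: compress on write, decompress on read, trading a little CPU for shorter values.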

2. Use lazy free

The lazy free feature, introduced in Redis 4.0, can be understood as lazy or delayed deletion: a key is deleted asynchronously, with its memory released in a BIO (Background I/O) thread, reducing blocking of the main Redis thread.

Lazy free corresponds to four scenarios, all of which are turned off by default:

lazyfree-lazy-eviction no
lazyfree-lazy-expire no
lazyfree-lazy-server-del no
slave-lazy-flush no

They mean the following:

  • lazyfree-lazy-eviction: whether to use lazy free when evicting keys because Redis memory exceeds maxmemory.
  • lazyfree-lazy-expire: whether to use lazy free when deleting keys that have reached their expiration time.
  • lazyfree-lazy-server-del: some commands implicitly delete an existing key. For example, rename deletes the target key if it already exists, and if that key is large, the deletion blocks. This option controls whether lazy free is used in that scenario.
  • slave-lazy-flush: during full resynchronization, before loading the master's RDB file the slave runs flushall to clear its own data; this option controls whether that flush uses lazy free.

It is recommended to enable lazyfree-lazy-eviction, lazyfree-lazy-expire, and lazyfree-lazy-server-del to improve the execution efficiency of the main thread.
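Assuming a standard redis.conf, enabling the recommended options looks like this (note that slave-lazy-flush was renamed replica-lazy-flush in Redis 5.0):

```conf
lazyfree-lazy-eviction yes
lazyfree-lazy-expire yes
lazyfree-lazy-server-del yes
# left off: a synchronous flush during full resync is usually acceptable
slave-lazy-flush no
```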

3. Set the expiration time of key values

We should set a reasonable expiration time for key values according to the actual business situation. That way, Redis automatically clears expired key-value pairs, saving memory and avoiding an excessive accumulation of keys that frequently triggers the eviction policy.

4. Disable the time-consuming query commands

Most Redis read and write commands have a time complexity between O(1) and O(N).

To avoid the impact of O(N) command on Redis, you can start with the following aspects:

  • Forbid the keys command in production;
  • Instead of querying all members at once, use the scan command for batched, cursor-based traversal;
  • Strictly control the size of Hash, Set, Sorted Set, and other collection structures;
  • Perform sorting, unions, intersections, and similar aggregations on the client side to reduce load on the Redis server;
  • Deleting (del) a large piece of data can take a long time, so use the asynchronous unlink instead, which hands the actual deletion to a background thread without blocking the main Redis thread.
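One way to enforce the first point is to disable dangerous commands outright in redis.conf via rename-command (in Redis 6 and later, ACLs offer a finer-grained alternative):

```conf
# Renaming a command to the empty string disables it entirely
rename-command KEYS ""
rename-command FLUSHALL ""
rename-command FLUSHDB ""
```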

5. Run the slowlog command to optimize time consumption

We can use the slowlog function to find out the most time-consuming Redis commands for optimization to improve the running speed of Redis. Slow query has two important configuration items:

  • slowlog-log-slower-than: Sets the evaluation time for slow query. That is, commands that exceed this parameter are recorded in slow query logs as slow operations. The execution unit is microseconds (1 second equals 1,000,000 microseconds).
  • slowlog-max-len: Sets the maximum number of slow query logs.

We can configure these according to the actual service situation. Slow query logs are stored in reverse order of insertion (newest first). We can run slowlog get n to retrieve the most recent n entries, then find the business operations behind them and optimize those.
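For example, to log any command slower than 5 milliseconds and keep the last 128 entries, the redis.conf settings would be (the threshold here is an arbitrary example value, not a recommendation):

```conf
# Commands taking longer than 5,000 microseconds (5 ms) are logged
slowlog-log-slower-than 5000
# Keep at most 128 slow log entries (oldest are dropped first)
slowlog-max-len 128
```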

6. Batch operation data using Pipeline

Pipeline is a batch processing technology provided by the client to process multiple Redis commands at once, thereby improving the performance of the overall interaction.

We can use Java code to compare the performance of pipeline and ordinary operations. The pipeline test code is as follows:

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;

public class PipelineExample {
    public static void main(String[] args) {
        Jedis jedis = new Jedis("127.0.0.1", 6379);
        // Record the execution start time
        long beginTime = System.currentTimeMillis();
        // Get the Pipeline object
        Pipeline pipe = jedis.pipelined();
        // Queue multiple Redis commands
        for (int i = 0; i < 100; i++) {
            pipe.set("key" + i, "val" + i);
            pipe.del("key" + i);
        }
        // Send all queued commands at once and wait for the replies
        pipe.sync();
        // Record the execution end time
        long endTime = System.currentTimeMillis();
        System.out.println("Execution time: " + (endTime - beginTime) + " ms");
        jedis.close();
    }
}

The execution results of the above programs are as follows:

Execution time: 297 ms

The common operation code is as follows:

import redis.clients.jedis.Jedis;

public class NormalExample {
    public static void main(String[] args) {
        Jedis jedis = new Jedis("127.0.0.1", 6379);
        // Record the execution start time
        long beginTime = System.currentTimeMillis();
        // Each command is a separate network round trip
        for (int i = 0; i < 100; i++) {
            jedis.set("key" + i, "val" + i);
            jedis.del("key" + i);
        }
        // Record the execution end time
        long endTime = System.currentTimeMillis();
        System.out.println("Execution time: " + (endTime - beginTime) + " ms");
        jedis.close();
    }
}

The execution results of the above programs are as follows:

Execution time: 17276 ms

As you can see from the results, pipeline execution took 297 ms while ordinary command execution took 17,276 ms; the pipeline version is roughly 58 times faster.

7. Avoid expiring a large amount of data at the same time

Redis scans for expired keys on a schedule controlled by the hz configuration item in redis.conf (default hz 10, i.e. 10 times per second). In each pass, Redis randomly samples 20 keys that have an expiration set and deletes any that have expired; if more than 25% of the sample was expired, it immediately repeats the process.

If a large number of cached keys expire at the same time, Redis will keep scanning and deleting from the expires dictionary, over and over, until expired keys become sparse, and reads and writes will visibly stall throughout. Another cause of the stall is that the memory allocator must frequently reclaim pages, which also consumes CPU.

To avoid this, we need to prevent large numbers of keys from expiring at the same moment. A simple solution is to add a random offset within a specified range on top of the base expiration time.
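A minimal sketch of the random-offset idea (the class and method names are my own; the Jedis call in the comment is illustrative and assumes a running server, so the helper itself only computes the TTL):

```java
import java.util.concurrent.ThreadLocalRandom;

public class ExpireJitter {
    /**
     * Returns the base TTL plus a random offset in [0, maxJitterSeconds],
     * so that keys written together do not all expire in the same instant.
     */
    public static long ttlWithJitter(long baseSeconds, long maxJitterSeconds) {
        return baseSeconds + ThreadLocalRandom.current().nextLong(maxJitterSeconds + 1);
    }

    public static void main(String[] args) {
        // e.g. a one-hour cache entry now expires somewhere between 3600s and 3900s
        long ttl = ttlWithJitter(3600, 300);
        System.out.println(ttl);
        // With Jedis this would then be used as:
        // jedis.setex("cache:user:1", ttl, value);
    }
}
```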

8. Optimize client usage

On the client side, besides using pipeline, we should use a Redis connection pool rather than frequently creating and destroying connections, which reduces network round trips and unnecessary commands.

9. Limit the Redis memory size

On 64-bit operating systems, Redis memory is unlimited by default, i.e. the maxmemory <size> configuration item is commented out, which leads to swap space being used when physical memory runs out. When the operating system moves memory pages used by Redis into swap, the Redis process is blocked, causing latency and hurting overall performance. Therefore, we need to limit Redis memory to a fixed value; when usage reaches it, the eviction policy is triggered. As of Redis 4.0 there are 8 eviction policies; the original 6 are:

  1. noeviction: evicts nothing; new writes return an error when memory is insufficient. This is the default policy;
  2. allkeys-lru: evicts the least recently used key among all keys;
  3. allkeys-random: randomly evicts any key;
  4. volatile-lru: evicts the least recently used key among keys with an expiration set;
  5. volatile-random: randomly evicts a key among keys with an expiration set;
  6. volatile-ttl: preferentially evicts the keys that will expire soonest.

In Redis 4.0, two new elimination strategies have been added:

  1. volatile-lfu: evicts the least frequently used key among keys with an expiration set;
  2. allkeys-lfu: evicts the least frequently used key among all keys.

Here, allkeys-xxx means eviction candidates are drawn from all keys, while volatile-xxx means only keys with an expiration set are candidates.

You can set the policy based on actual business conditions. Note that the default policy evicts nothing, so writes will fail with an error once memory is full.
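Assuming a machine where Redis may use up to 4GB, the corresponding redis.conf settings might look like this (the size and policy are example choices, not recommendations for every workload):

```conf
# Cap Redis memory; eviction starts when this limit is reached
maxmemory 4gb
# Evict the least recently used key across all keys
maxmemory-policy allkeys-lru
```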

10. Use physical machines instead of VMS

Running Redis on a virtual machine performs worse in memory access and network latency, because the VM shares the physical network port with the host and one physical machine may run multiple VMs. You can check intrinsic latency with ./redis-cli --intrinsic-latency 100. If Redis performance matters, deploy the Redis server directly on a physical machine.

11. Check the data persistence policy

The Redis persistence strategy copies memory data to disk for disaster recovery or data migration, but maintaining persistence carries a significant performance overhead.

After Redis 4.0, there are three ways to persist Redis:

  • RDB (Redis DataBase): writes a point-in-time snapshot of memory to disk in a binary format.
  • AOF (Append Only File): records every write command, appending them to a file as text.
  • Mixed persistence, new in Redis 4.0, combines the advantages of RDB and AOF: on rewrite, the current data is first written to the start of the file in RDB form, and subsequent commands are appended in AOF format. This preserves fast restarts while reducing the risk of data loss.

RDB and AOF persistence each have pros and cons: RDB may lose data within a time window, while AOF files are large and can slow Redis startup. To get the benefits of both, Redis 4.0 added the mixed persistence mode, which should be preferred whenever persistence is required.

To check whether mixed persistence is enabled, run the config get aof-use-rdb-preamble command. If it is not enabled, there are two ways to turn it on:

  • Enable it on the command line
  • Enable it by modifying the Redis configuration file

(1) Enable it on the CLI

Run the config set aof-use-rdb-preamble yes command; it returns OK on success.

(2) Enable it by modifying the Redis configuration file

Open the redis.conf file and change aof-use-rdb-preamble no to aof-use-rdb-preamble yes.
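Put together, a redis.conf fragment enabling AOF with the mixed RDB preamble might look like this (the fsync policy is an example; everysec is the common middle ground between safety and speed):

```conf
# Turn on AOF persistence
appendonly yes
# Write the RDB snapshot as the AOF preamble (mixed persistence)
aof-use-rdb-preamble yes
# fsync once per second: bounded data loss, modest overhead
appendfsync everysec
```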

Note that for services that do not need persistence, you can turn it off entirely; this effectively improves Redis speed and avoids intermittent stalls.

12. Disable THP

Transparent Huge Pages (THP), added in Linux kernel 2.6.38, supports allocating memory in 2MB huge pages and is enabled by default.

With THP enabled, fork becomes slower, and each copy-on-write page grows from 4KB to 2MB, greatly increasing the parent process's memory consumption during rewriting. The unit copied on each write also grows 512 times, slowing write commands enough that even simple commands like incr can appear in the slow log. Redis therefore recommends disabling this feature:

echo never >  /sys/kernel/mm/transparent_hugepage/enabled

To keep the THP setting effective after a reboot, append echo never > /sys/kernel/mm/transparent_hugepage/enabled to /etc/rc.local.

13. Use a distributed architecture to increase read and write speed

A distributed Redis architecture has three important approaches:

  • Master-slave replication
  • Sentinel mode
  • Redis Cluster

With master-slave replication, we can send writes to the master and reads to the slaves, handling more requests per unit of time and improving overall Redis throughput.

Sentinel mode is an upgrade to the master-slave setup: when the master node crashes, it automatically fails over and restores normal service without human intervention.

Redis Cluster was officially introduced in Redis 3.0. It distributes the database across multiple nodes to balance the load on each node.

Redis Cluster uses virtual hash slot partitioning. All keys are mapped to integer slots from 0 to 16,383 according to the hash function. The calculation formula is slot = CRC16(key) & 16383. This allows Redis to spread the read-write load from one server to multiple servers, resulting in a significant performance improvement.
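The slot formula can be sketched in plain Java. Redis Cluster uses the CRC16-CCITT (XMODEM) variant (polynomial 0x1021, initial value 0x0000); the class below is my own illustration, not code from Redis, and it omits the {hash tag} rule that real cluster clients also honor:

```java
import java.nio.charset.StandardCharsets;

public class KeySlot {
    // CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0x0000
    static int crc16(byte[] data) {
        int crc = 0x0000;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF; // keep it a 16-bit value
            }
        }
        return crc;
    }

    // slot = CRC16(key) & 16383, mapping every key to one of 16,384 slots
    public static int keySlot(String key) {
        return crc16(key.getBytes(StandardCharsets.UTF_8)) & 16383;
    }

    public static void main(String[] args) {
        // "123456789" is the standard CRC16-XMODEM check input (CRC = 0x31C3)
        System.out.println(keySlot("123456789"));
    }
}
```

Because the slot depends only on the key bytes, any client can compute which node owns a key without asking the server.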

Of these three, if we only use one, Redis Cluster should without doubt be the preferred choice: it automatically spreads read and write pressure across more servers and has built-in failover.
