Redis design optimization

Estimate Redis memory usage

To estimate the size of the memory occupied by data in Redis, you need a comprehensive understanding of the Redis memory model

Suppose there are 90,000 key-value pairs, each key is 12 bytes long, and each value is 12 bytes long (and neither key nor value is an integer).

Let’s estimate the space taken up by these 90,000 key/value pairs. Before estimating, first determine the encoding the string type will use: at these lengths it is embstr

The memory occupied by the 90,000 key-value pairs falls into two parts: the space occupied by the 90,000 dictEntry structures, and the bucket array space the hash table needs for them

The space occupied by each dictEntry consists of:

  • A dictEntry structure: 24 bytes (on a 64-bit operating system a pointer is 8 bytes, and a dictEntry consists of three pointers), for which jemalloc allocates a 32-byte memory block

  • The key: 12 bytes, so SDS(key) requires 12+4=16 bytes (SDS size = 4-byte header + string length), for which jemalloc allocates a 16-byte memory block

  • A redisObject: 16 bytes (type 4 bits + encoding 4 bits + lru 24 bits + refcount 4 bytes + ptr 8 bytes = 16 bytes), for which jemalloc allocates a 16-byte memory block

  • The value: 12 bytes, so SDS(value) requires 12+4=16 bytes (SDS size = 4-byte header + string length), for which jemalloc allocates a 16-byte memory block

  • In total, one dictEntry occupies 32+16+16+16=80 bytes

The smallest power-of-two bucket array size greater than 90,000 is 2^17 = 131072; each bucket element stores a pointer and is therefore 8 bytes (pointers are 8 bytes on 64-bit systems)

Therefore, the estimated memory occupied by the 90,000 key-value pairs is: 90000 × 80 + 131072 × 8 = 8,248,576 bytes

As a comparison, if the length of key and value each grows from 12 bytes to 13 bytes, the corresponding SDS becomes 17 bytes and jemalloc allocates 32 bytes, so each dictEntry grows from 80 bytes to 112 bytes. In that case, the memory occupied by the 90,000 key/value pairs is estimated as: 90000 × 112 + 131072 × 8 = 11,128,576 bytes
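The model above is mechanical enough to put into code. Below is a minimal Java sketch of the same estimate; the size-class table is a simplified assumption covering only the small jemalloc classes relevant here, not jemalloc's full class list:

  public class RedisMemoryEstimate {
      // Simplified jemalloc small size classes (an assumption; the real list is longer)
      static final int[] SIZE_CLASSES = {8, 16, 32, 48, 64, 80, 96, 112, 128};

      static long allocSize(int bytes) {
          for (int c : SIZE_CLASSES) if (bytes <= c) return c;
          throw new IllegalArgumentException("size beyond this sketch's table");
      }

      static long estimate(long pairs, int keyLen, int valueLen) {
          long dictEntry   = allocSize(24);           // three 8-byte pointers
          long redisObject = allocSize(16);           // type+encoding+lru+refcount+ptr
          long key         = allocSize(keyLen + 4);   // SDS: 4-byte header + string
          long value       = allocSize(valueLen + 4);
          long buckets = Long.highestOneBit(pairs - 1) * 2 * 8; // next power of two, 8-byte pointers
          return pairs * (dictEntry + redisObject + key + value) + buckets;
      }

      public static void main(String[] args) {
          System.out.println(estimate(90000, 12, 12)); // 8248576
          System.out.println(estimate(90000, 13, 13)); // 11128576
      }
  }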

Optimize memory footprint

Understanding the Redis memory model helps in optimizing the Redis memory footprint. The following describes several optimization scenarios and methods

  • Optimization using jemalloc characteristics

    Since jemalloc allocates memory in discrete size classes rather than continuously, a one-byte change in a key/value string can cause a large change in memory usage. Take advantage of this at design time

For example, if the key is 13 bytes long, its SDS takes 17 bytes and jemalloc allocates 32 bytes. If the key length is reduced to 12 bytes, the SDS is 16 bytes and jemalloc allocates only 16 bytes, halving the space occupied by each key

  • Use an integer/long integer

    If a value is an integer/long integer, Redis saves space by using the int encoding, which stores the number directly in the redisObject's 8-byte pointer field instead of in a separate string. Therefore, use long integers instead of strings wherever the scenario allows (see the sketch below)
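A quick way to observe the encoding difference, using the Jedis client (assumes a local Redis; the key names are arbitrary):

  import redis.clients.jedis.Jedis;

  public class EncodingDemo {
      public static void main(String[] args) {
          try (Jedis jedis = new Jedis("localhost", 6379)) {
              jedis.set("counter", "12345");
              System.out.println(jedis.objectEncoding("counter")); // "int": stored in the pointer field
              jedis.set("name", "hello redis");
              System.out.println(jedis.objectEncoding("name"));    // "embstr": stored as a short string
          }
      }
  }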

  • Shared objects

    With shared objects, you can reduce object creation (and hence redisObject creation) and save memory. Currently, the shared objects in Redis are only the 10,000 integers 0-9999; you can increase their number by raising the OBJ_SHARED_INTEGERS constant (it is fixed at compile time, so this requires rebuilding Redis)

For example, if OBJ_SHARED_INTEGERS is set to 20000, objects for 0 through 19999 can be shared. Suppose a forum site stores each post's view count in Redis and most counts fall between 0 and 20,000; appropriately raising OBJ_SHARED_INTEGERS lets those counts reuse shared objects and saves memory

  • Shorten the storage length of key-value pairs

The length of key-value pairs is inversely proportional to performance: in write performance tests, with the key held constant, the larger the value, the slower the operation

One reason is that Redis uses different internal encodings for the same data type. Strings, for example, have three internal encodings: int (integer encoding), embstr (a compact encoding for short strings, allocated in a single block together with the redisObject), and raw (the plain SDS encoding used for longer strings, allocated separately). The Redis authors balance efficiency and space through these encodings, but the larger the data, the more complex the encoding required to store it, and the more complex the encoding, the lower the storage performance

That’s just write speed; several other problems arise when key-value pairs are large

  • The larger the content, the longer persistence takes and the longer Redis hangs, so performance degrades

  • The larger the content, the more data must be transferred over the network, the longer that takes, and the slower overall performance becomes

  • The larger the content, the more memory it consumes and the more frequently eviction is triggered, putting a heavier running burden on Redis

Therefore, while preserving semantic integrity, shorten the storage length of key-value pairs as much as possible; where necessary, serialize and compress data before storing it. Taking Java as an example, protostuff or Kryo can be used for serialization, and Snappy for compression, as in the sketch below
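A minimal sketch of the serialize-then-compress step, assuming the Kryo (com.esotericsoftware) and snappy-java (org.xerial) libraries are on the classpath:

  import com.esotericsoftware.kryo.Kryo;
  import com.esotericsoftware.kryo.io.Input;
  import com.esotericsoftware.kryo.io.Output;
  import org.xerial.snappy.Snappy;
  import java.io.ByteArrayOutputStream;

  public class ValueCodec {
      // Kryo instances are not thread-safe; real code would pool them (e.g. per thread)
      private static Kryo newKryo() {
          Kryo kryo = new Kryo();
          kryo.setRegistrationRequired(false); // register classes explicitly in production
          return kryo;
      }

      // Serialize with Kryo, then compress with Snappy, before SET
      public static byte[] encode(Object value) throws Exception {
          ByteArrayOutputStream bos = new ByteArrayOutputStream();
          try (Output out = new Output(bos)) {
              newKryo().writeClassAndObject(out, value);
          }
          return Snappy.compress(bos.toByteArray());
      }

      // Decompress with Snappy, then deserialize with Kryo, after GET
      public static Object decode(byte[] stored) throws Exception {
          try (Input in = new Input(Snappy.uncompress(stored))) {
              return newKryo().readClassAndObject(in);
          }
      }
  }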

View Redis memory statistics

Run the INFO memory command to view memory statistics. An abbreviated, annotated example:

used_memory:853464               # bytes allocated by the Redis allocator
used_memory_rss:12247040         # resident memory as seen by the operating system (bytes)
mem_fragmentation_ratio:15.07    # used_memory_rss divided by used_memory; below 1 suggests swapping
mem_allocator:jemalloc-5.1.0     # allocator chosen at compile time

used_memory covers both the actual cached data and the memory Redis itself needs to run (metadata, Lua, and so on). It is allocated through Redis's memory allocator, so it does not account for memory wasted by fragmentation.
  • used_memory

    The used_memory field reports the amount of memory allocated by the Redis allocator, in bytes

    used_memory_human is the same value in a human-readable form

  • used_memory_rss

    Records the Redis process memory allocated by the operating system, including fragments inside Redis memory that jemalloc can no longer reuse (in bytes)

used_memory vs. used_memory_rss

The former is measured from Redis's own perspective, the latter from the operating system's. They differ partly because of memory fragmentation and partly because the Redis process itself needs memory to run, which generally makes the former smaller than the latter

Because the amount of data in a real Redis deployment is usually large, the memory the running process itself occupies is much smaller than the data plus fragmentation. The ratio of used_memory_rss to used_memory has therefore become the parameter for measuring Redis's memory fragmentation rate: mem_fragmentation_ratio

  • mem_fragmentation_ratio

    The memory fragmentation ratio, i.e. the ratio of used_memory_rss to used_memory

    mem_fragmentation_ratio is generally greater than 1, and a larger value indicates a larger proportion of memory fragmentation

    A mem_fragmentation_ratio below 1 deserves attention:

    The value is the ratio between the process's resident set size (RSS, as measured by the OS) and the total bytes Redis has allocated through its allocator. If more memory was allocated with libc (compared with jemalloc or tcmalloc), or if some other process on the system consumed more memory during benchmarking, Redis memory can be swapped out by the operating system. That lowers RSS (because part of Redis memory is no longer in main memory), so the resulting fragmentation ratio falls below 1. In other words, this ratio is only meaningful if you are certain the operating system is not swapping Redis memory (if it is, there will be performance problems anyway)

    In general, a mem_fragmentation_ratio around 1.03 is healthy (for jemalloc). Right after startup the ratio is large because little data has been stored yet, and the memory the Redis process itself needs to run makes used_memory_rss much larger than used_memory

  • mem_allocator

    The memory allocator used by Redis, chosen at compile time; it can be libc, jemalloc, or tcmalloc. The default is jemalloc
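To pull these fields from application code, a small sketch with the Jedis client (assumes a local Redis):

  import redis.clients.jedis.Jedis;

  public class MemoryStats {
      public static void main(String[] args) {
          try (Jedis jedis = new Jedis("localhost", 6379)) {
              // INFO memory returns the whole section shown above as one string
              for (String line : jedis.info("memory").split("\r\n")) {
                  if (line.startsWith("used_memory:")
                          || line.startsWith("used_memory_rss:")
                          || line.startsWith("mem_fragmentation_ratio:")) {
                      System.out.println(line);
                  }
              }
          }
      }
  }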

Redis performance optimization

Set expiration times for key-value pairs

We should set reasonable expiration times on key values according to the actual business. That way, Redis automatically clears expired key-value pairs for you, saving memory and preventing excessive key accumulation from frequently triggering the eviction policy

Redis has four commands for setting a key's time to live (how long the key may exist) or its expiration time (the moment at which the key will be removed); a usage sketch follows the list

  • The EXPIRE command is used to set the lifetime of a key to TTL seconds

  • The PEXPIRE command is used to set the lifetime of a key to TTL milliseconds

  • The EXPIREAT <timestamp> command is used to set the expiration time of a key to the specified timestamp, in seconds

  • The PEXPIREAT <timestamp> command is used to set the expiration time of a key to the specified timestamp, in milliseconds
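A minimal Jedis sketch of the four commands (method signatures as in recent Jedis versions; assumes a local Redis):

  import redis.clients.jedis.Jedis;

  public class ExpireDemo {
      public static void main(String[] args) {
          try (Jedis jedis = new Jedis("localhost", 6379)) {
              jedis.set("session:1", "data");
              jedis.expire("session:1", 60);            // live for 60 seconds
              jedis.pexpire("session:1", 60_000);       // live for 60,000 milliseconds
              long at = System.currentTimeMillis() / 1000 + 60;
              jedis.expireAt("session:1", at);          // expire at a Unix timestamp (seconds)
              jedis.pexpireAt("session:1", at * 1000);  // expire at a Unix timestamp (milliseconds)
          }
      }
  }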

Using the Lazy Free feature

The lazy free feature, a very useful addition in Redis 4.0, can be understood as lazy or delayed deletion: keys are deleted asynchronously, with their memory released in a BIO (Background I/O) thread, to reduce blocking of the main Redis thread

Lazy free covers four scenarios, all of which are turned off by default

lazyfree-lazy-eviction no 
lazyfree-lazy-expire no 
lazyfree-lazy-server-del no 
slave-lazy-flush no

They mean the following

  • lazyfree-lazy-eviction: whether to enable lazy free when Redis evicts keys because its running memory exceeds maxmemory

  • lazyfree-lazy-expire: whether to enable the lazy free mechanism when keys with an expire time actually expire

  • lazyfree-lazy-server-del: some commands carry an implicit del of an existing key. For example, with the rename command, Redis removes the target key if it already exists; if that target is a big key, the deletion blocks. This option controls whether lazy free is used in that scenario

  • slave-lazy-flush: during full resynchronization, the slave runs flushall to clear its own data before loading the master's RDB file. This option controls whether that flush uses the lazy free mechanism

It is recommended to enable lazyfree-lazy-eviction, lazyfree-lazy-expire, and lazyfree-lazy-server-del to improve the execution efficiency of the main thread
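These settings can also be changed at runtime with CONFIG SET; a sketch through Jedis (assumes a local Redis where runtime configuration is permitted):

  import redis.clients.jedis.Jedis;

  public class LazyFreeConfig {
      public static void main(String[] args) {
          try (Jedis jedis = new Jedis("localhost", 6379)) {
              jedis.configSet("lazyfree-lazy-eviction", "yes");
              jedis.configSet("lazyfree-lazy-expire", "yes");
              jedis.configSet("lazyfree-lazy-server-del", "yes");
          }
      }
  }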

Limit Redis memory size and set memory elimination policy

The maximum cache size is set with the maxmemory directive, which accepts several formats

 maxmemory 1048576 
 maxmemory 1048576B 
 maxmemory 1000KB 
 maxmemory 100MB 
 maxmemory 1GB

If the maximum cache is not specified, Redis on a 32-bit system will crash when newly added data exceeds the maximum memory, so be sure to set this value. A common recommendation is about 75% of physical memory, or about 60% when writes are heavy

LRU principle

The LRU (Least Recently Used) algorithm evicts data based on historical access records. Its core idea: “if data has been accessed recently, it is more likely to be accessed again in the future”
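The principle is easy to see in code. A minimal in-process LRU cache in Java, shown only to illustrate the algorithm (Redis itself uses an approximate, sampling-based LRU rather than a linked structure like this):

  import java.util.LinkedHashMap;
  import java.util.Map;

  public class LruCache<K, V> extends LinkedHashMap<K, V> {
      private final int capacity;

      public LruCache(int capacity) {
          super(16, 0.75f, true); // accessOrder=true: iteration order is least recently used first
          this.capacity = capacity;
      }

      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
          return size() > capacity; // evict the least recently used entry when over capacity
      }
  }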

LFU principle

LFU stands for Least Frequently Used: the data used least often over a period of time is evicted. LFU is a caching algorithm for managing computer memory: it records and tracks how many times each block is used, and when the cache is full and more space is needed, the system clears out the block with the lowest usage count. The simplest implementation assigns a counter to each block loaded into the cache; every reference increments the counter, and when the cache is at capacity and a new block is waiting to be inserted, the system finds the block with the lowest counter and removes it

LRU and LFU emphasize different things: LRU reflects when an element was last used, while LFU reflects how often it is used. The drawback of LFU is that data accessed very frequently over a short period is immediately promoted to hot data and, protected from eviction, stays in memory even though it is never accessed again for a long time. That transient burst inflates its reference count, while newly added entries are easily deleted quickly because their reference counts are still low

Redis cache eviction policies

When the Redis in-memory data set grows to a certain size, a data eviction policy is applied

The policy is set with maxmemory-policy (for example, maxmemory-policy volatile-lru) and can be changed at runtime. Since Redis 4.0 there are 8 eviction policies

  • noeviction evicts nothing; new write operations report an error when memory is insufficient. This is Redis's default policy

  • allkeys-lru evicts the least recently used key across the entire keyspace

  • allkeys-random randomly evicts any key

  • volatile-lru evicts the least recently used key among keys that have an expiration set

  • volatile-random randomly evicts a key among keys that have an expiration set

  • volatile-ttl preferentially evicts keys that expire sooner

Two more eviction policies were added in Redis 4.0:

  • volatile-lfu: evicts the least frequently used key among keys that have an expiration set

  • allkeys-lfu: evicts the least frequently used key across the entire keyspace

Here allkeys-xxx means evicting across all keys, while volatile-xxx means evicting only among keys with an expiration set
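For example, a redis.conf fragment that caps memory and evicts cold keys across the whole keyspace (the values are illustrative):

maxmemory 100mb
maxmemory-policy allkeys-lru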

Disable the query commands that take a long time

The time complexity of most Redis read and write commands ranges from O(1) to O(N)

O(1) means a command is safe to use; O(N) calls for caution, because N is unbounded: the more data, the slower the query may be. And because Redis uses a single thread to serve queries, long-running commands block Redis and cause serious latency

To avoid the impact of O(N) commands on Redis, you can start from the following aspects (see the sketch after this list)

  • Disallow the keys command

  • Instead of querying all members at once, use the scan command for batched, cursor-style traversal

  • Strictly control the data size of Hash, Set, and Sorted Set structures

  • Perform sorting, union, intersection, and similar operations on the client side to reduce the load on the Redis server

  • Deleting (del) a large value can take a long time, so prefer the asynchronous unlink, which lets a background thread reclaim the target data without blocking the Redis main thread
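A sketch combining the scan and unlink advice: traverse matching keys in batches and delete them without blocking the main thread. Import paths and the cursor accessor follow Jedis 4.x; the key pattern is an arbitrary example:

  import redis.clients.jedis.Jedis;
  import redis.clients.jedis.params.ScanParams;
  import redis.clients.jedis.resps.ScanResult;

  public class ScanAndUnlink {
      public static void main(String[] args) {
          try (Jedis jedis = new Jedis("localhost", 6379)) {
              ScanParams params = new ScanParams().match("cache:*").count(100);
              String cursor = ScanParams.SCAN_POINTER_START; // "0"
              do {
                  ScanResult<String> page = jedis.scan(cursor, params);
                  for (String key : page.getResult()) {
                      jedis.unlink(key); // asynchronous delete; memory is freed in the background
                  }
                  cursor = page.getCursor();
              } while (!"0".equals(cursor));
          }
      }
  }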

Redis 6.0 introduced multithreading

Why did Redis choose the single-threaded model in the first place?

  • IO multiplexing

    This is part of Redis's top-level design

    An FD is a file descriptor, whose state indicates whether the file is currently readable, writable, or in an exceptional condition. The I/O multiplexing mechanism lets a single thread listen for the readable and writable states of many file descriptors at once

    Once a network request is received, it is processed quickly in memory, which is very fast because most operations are pure memory operations

    This means that in single-threaded mode, even with many connections to process, I/O multiplexing keeps the network handling negligible next to high-speed in-memory processing

  • High maintainability

    Although the multithreaded model performs well in some respects, it introduces nondeterminism in execution order and brings a series of concurrent read/write problems. In single-threaded mode, debugging and testing are much easier

  • Memory-based, single-threaded efficiency is still high

    Multithreading can make fuller use of CPU resources, but for Redis, memory is so fast that a single instance can serve on the order of 100,000 requests per second. If 100,000 per second is not enough, Redis sharding can spread the load across different Redis servers. This approach avoids introducing heavy multithreading inside a single Redis service

    And being memory-based, there is essentially no disk I/O involved unless AOF persistence is running; reads and writes of the data happen only in memory, so processing is very fast. A multithreaded model for handling all external requests may therefore not be a good solution

Based on memory and I/O multiplexing, a single thread is fast while still delivering the concurrency that multithreading would provide, so there was no need for multithreading

Why did Redis add multithreading after 6.0? (In some cases single threading has drawbacks that multithreading can solve)

Because the read/write system calls on network sockets consume most of the CPU time during Redis execution, making network reads and writes multithreaded can greatly improve performance
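Network-I/O threading is disabled by default and is turned on in redis.conf; the directives below are from Redis 6.0 (the thread count is an example):

io-threads 4              # number of I/O threads; 1 (the default) disables threading
io-threads-do-reads yes   # use the threads for reads as well as writes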

Redis deletes an element with the del command. If the element is very large, perhaps tens or hundreds of megabytes, the deletion cannot finish in a short time, so asynchronous multithreaded support is needed

With multithreading, the deletion work can be done in the background

Summary: Redis originally chose the single-threaded model mainly because CPU was not the bottleneck of the Redis server: the performance gain from a multithreaded model would not offset its development and maintenance costs, and the system's bottleneck lay mainly in network I/O. Multithreading was later introduced for performance reasons too: releasing the memory of large key-value pairs in background threads, instead of blocking, reduces stalls on the Redis main thread and improves execution efficiency

(Asynchronous background threads for such deletions have existed since Redis 4.0)

Use slowlog to find time-consuming commands

You can use the slowlog feature to find the most time-consuming Redis commands and optimize them, improving Redis's speed. Slow query has two important configuration items

  • slowlog-log-slower-than: sets the threshold for what counts as a slow query; commands that exceed it are logged as slow operations. The unit is microseconds (1 second equals 1,000,000 microseconds)

  • slowlog-max-len: sets the maximum number of slow-query log entries kept

Configure these according to your actual workload. Entries are stored in the slow query log in reverse order of insertion; use slowlog get n to fetch the latest n entries, then find and optimize the corresponding business operations
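An illustrative redis.conf fragment (the thresholds here are examples, not recommendations):

slowlog-log-slower-than 10000    # log commands slower than 10,000 microseconds (10 ms)
slowlog-max-len 128              # keep the latest 128 entries

With this configuration, slowlog get 10 would return the ten most recent slow commands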

Avoid having large amounts of data expire at the same time

Redis's periodic cleanup of expired keys is driven by the hz parameter in redis.conf (default: hz 10). In each pass, Redis randomly samples 20 keys that carry a TTL and deletes the expired ones; if more than 25% of the sampled keys turn out to be expired, it repeats the process

If a large number of cache entries in a big system expire at the same moment, Redis keeps scanning and deleting from the expired dictionary round after round until the remaining expired keys become sparse, and reads and writes stall noticeably for the whole duration. Another cause of the stalls is that the memory manager must frequently reclaim memory pages, which also consumes CPU

To avoid this, prevent mass simultaneous expiration: a simple solution is to add a random offset within a specified range on top of the base expiration time
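A minimal sketch of TTL jitter using the Jedis client (the base TTL, jitter range, and key are arbitrary examples; setex takes seconds as a long in recent Jedis versions):

  import java.util.concurrent.ThreadLocalRandom;
  import redis.clients.jedis.Jedis;

  public class JitteredExpire {
      public static void main(String[] args) {
          try (Jedis jedis = new Jedis("localhost", 6379)) {
              long baseTtl = 3600;                                      // 1 hour
              long jitter = ThreadLocalRandom.current().nextLong(300);  // spread expirations over 5 minutes
              jedis.setex("report:today", baseTtl + jitter, "cached-payload");
          }
      }
  }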

Batch manipulation of data using Pipeline

A Pipeline is a batching technique provided by the client

It executes a group of commands in one batch and returns all the results at once, reducing frequent request/response round trips
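A minimal Jedis pipeline sketch (assumes a local Redis; key names and count are arbitrary):

  import redis.clients.jedis.Jedis;
  import redis.clients.jedis.Pipeline;

  public class PipelineDemo {
      public static void main(String[] args) {
          try (Jedis jedis = new Jedis("localhost", 6379)) {
              Pipeline p = jedis.pipelined();
              for (int i = 0; i < 1000; i++) {
                  p.set("key:" + i, String.valueOf(i)); // queued locally, not yet sent
              }
              p.sync(); // send all queued commands in one batch and read the replies
          }
      }
  }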

Client usage optimization

On the client side, besides using pipelining, also use a Redis connection pool wherever possible instead of frequently creating and destroying connections, which reduces network transmissions and unnecessary commands

  import redis.clients.jedis.JedisPool; 
  import redis.clients.jedis.JedisPoolConfig;
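A minimal pool-usage sketch building on those imports (host, port, and pool sizes are placeholder values):

  import redis.clients.jedis.Jedis;
  import redis.clients.jedis.JedisPool;
  import redis.clients.jedis.JedisPoolConfig;

  public class PoolDemo {
      public static void main(String[] args) {
          JedisPoolConfig config = new JedisPoolConfig();
          config.setMaxTotal(32); // maximum connections held by the pool
          config.setMaxIdle(8);
          try (JedisPool pool = new JedisPool(config, "localhost", 6379)) {
              try (Jedis jedis = pool.getResource()) { // borrow; close() returns it to the pool
                  jedis.set("hello", "world");
              }
          }
      }
  }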

Use a distributed architecture to increase read and write speed

The Redis distributed architecture rests on three important pillars

  • Master-slave synchronization

  • Sentinel mode

  • Redis Cluster

With master-slave synchronization, writes can go to the master library while reads are offloaded to the slave services, so more requests are handled per unit of time and overall Redis speed improves

Sentinel mode is an upgrade to the master/slave setup: when the master node crashes, it automatically restores Redis to normal operation without human intervention

Redis Cluster, officially introduced in Redis 3.0, balances the load by spreading the database across multiple nodes

Disable the THP feature

Transparent Huge Pages (THP), added in Linux kernel 2.6.38 to support allocating memory in 2MB huge pages, is enabled by default

When THP is enabled, fork becomes slower, and each copy-on-write memory page grows from 4KB to 2MB, greatly increasing the parent process's memory consumption during rewriting. The unit of memory copied for each write command also grows 512-fold, which slows down write operations and floods the slow log with write entries; even commands like incr show up in slow queries. Therefore, Redis recommends disabling this feature

echo never > /sys/kernel/mm/transparent_hugepage/enabled

To keep the THP setting effective after a reboot, append echo never > /sys/kernel/mm/transparent_hugepage/enabled to /etc/rc.local