Introduction

From the explanation of how the Buffer Pool works, we know that a query operation may involve a hash lookup, loading data pages from disk into the Buffer Pool, and coordination among the cache pages, the free chain, and the flush chain. If the free cache pages in the Buffer Pool are insufficient to support the operation, the LRU policy is used to evict cache pages and free up enough space to complete the query. Does the LRU approach have any downsides? How does the InnoDB storage engine address them? That is what this article is about.

Review the LRU principle

The principle

LRU (Least Recently Used): as a cache eviction policy, the core idea of LRU is to evict the memory pages that have not been accessed for the longest time. The principle is shown in the figure below:

In the figure, green represents free memory pages and red represents evicted memory pages.

Implementation

To keep memory pages ordered while remaining efficient, LRU is generally implemented as a doubly linked list plus a hash table: insertions and deletions on the doubly linked list complete in O(1) time, and the hash table makes it possible to look up a memory page in O(1) time as well.
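To make that structure concrete, here is a minimal hand-rolled LRU cache in Java built from a HashMap and a doubly linked list. It is only a sketch to illustrate the O(1) bookkeeping; the class and method names are made up for this example:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal LRU cache: a HashMap gives O(1) lookup, a doubly linked list gives
// O(1) reordering and eviction. Head side = most recently used, tail side = least.
public class SimpleLruCache<K, V> {

    private static class Node<K, V> {
        K key;
        V value;
        Node<K, V> prev, next;
        Node(K key, V value) { this.key = key; this.value = value; }
    }

    private final int capacity;
    private final Map<K, Node<K, V>> index = new HashMap<>();
    private final Node<K, V> head = new Node<>(null, null); // sentinel before the MRU node
    private final Node<K, V> tail = new Node<>(null, null); // sentinel after the LRU node

    public SimpleLruCache(int capacity) {
        this.capacity = capacity;
        head.next = tail;
        tail.prev = head;
    }

    public V get(K key) {
        Node<K, V> node = index.get(key);
        if (node == null) return null;
        moveToHead(node);                  // a hit makes this page "recently used"
        return node.value;
    }

    public void put(K key, V value) {
        Node<K, V> node = index.get(key);
        if (node != null) {
            node.value = value;
            moveToHead(node);
            return;
        }
        if (index.size() == capacity) {    // evict the page unused for the longest time
            Node<K, V> victim = tail.prev;
            unlink(victim);
            index.remove(victim.key);
        }
        node = new Node<>(key, value);
        index.put(key, node);
        linkAfterHead(node);
    }

    private void moveToHead(Node<K, V> node) { unlink(node); linkAfterHead(node); }

    private void unlink(Node<K, V> node) {
        node.prev.next = node.next;
        node.next.prev = node.prev;
    }

    private void linkAfterHead(Node<K, V> node) {
        node.prev = head;
        node.next = head.next;
        head.next.prev = node;
        head.next = node;
    }
}
```

Every hit moves the node to the head, so the node sitting at the tail is always the page that has gone unused for the longest time and is the one evicted when capacity is exceeded.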

In the Java collections framework, LinkedHashMap is exactly this structure, and it can also keep entries ordered by access. We simply override the removeEldestEntry method; when that method returns true, the eldest entry is removed. So we can use this structure to implement an LRU cache quickly, as shown in the figure below:
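For comparison, a minimal sketch of the LinkedHashMap-based version might look like the following; the class name and capacity are arbitrary, and the third constructor argument is what switches the map to access order:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LinkedHashMap with accessOrder = true keeps entries ordered by access;
// overriding removeEldestEntry turns it into a fixed-capacity LRU cache.
public class LruCache<K, V> extends LinkedHashMap<K, V> {

    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true);       // true = iterate in access order, not insertion order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;     // once over capacity, drop the least recently accessed entry
    }
}
```

For example, with new LruCache<Integer, String>(3), putting keys 1 through 4 evicts key 1, unless key 1 was read again before key 4 was inserted.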

Potential risks of the LRU policy in the Buffer Pool

Hidden danger 1: the MySQL prefetch mechanism

MySQL prefetch (read-ahead): when a data page is loaded from disk into the Buffer Pool, data pages adjacent to the target page are usually loaded into the Buffer Pool along with it.

Why does MySQL have a read-ahead mechanism?

It is really about improving performance. The reason is simple: if you read data page 01 and data page 02, it is very likely that you will read data page 03 next.

Each load from disk into the Buffer Pool is a disk I/O, which is relatively expensive.

Therefore, MySQL designed a prefetch mechanism: in some cases it prefetches adjacent data pages into the Buffer Pool so that the next time you actually want to access those pages, they can be read directly from the Buffer Pool instead of being loaded from disk. This improves efficiency.

Prefetch trigger scenarios: having covered what the prefetch mechanism is, let's look at which scenarios trigger it.

  1. innodb_read_ahead_threshold: a configurable parameter whose default value is 56. If at least that many data pages in an extent are accessed sequentially, linear prefetch is triggered and all data pages in the adjacent next extent are prefetched into the Buffer Pool.
  2. innodb_random_read_ahead: OFF by default. When enabled, if 13 consecutive pages of an extent are already cached in the Buffer Pool and these pages are accessed frequently, the remaining pages of that extent are prefetched into the Buffer Pool.

Pitfall: on the one hand this improves performance; on the other hand, the prefetched data pages may end up almost never being accessed, yet because of how LRU works they sit near the head of the linked list after being prefetched, and some data pages that are themselves frequently accessed may be evicted instead.

Hidden danger 2: full table scan

A full table scan may load infrequently accessed pages into the Buffer Pool, thereby flushing out some of the frequently accessed pages. This affects performance because frequently accessed cached pages have to be reloaded from disk.

LRU strategy based on hot and cold isolation

Reviewing the problems above, the biggest issue is that cache pages which should not be evicted (frequently accessed pages, i.e. hot data) get evicted, while pages that should be evicted (rarely accessed pages, i.e. cold data) are retained. This is clearly not what we want.

To solve this problem, the LRU chain is split into two parts: one stores hot data and the other stores cold data. The proportion of cold data is controlled by the parameter innodb_old_blocks_pct, which defaults to 37%.

The process by which data pages are loaded into the cache

When a data page is first loaded into the Buffer Pool, it is placed at the head of the cold chain. If it is accessed again more than 1s after being loaded, the cache page is moved to the head of the hot chain. This interval is controlled by the parameter innodb_old_blocks_time, which defaults to 1000ms (1s).

Taking LRU chain performance optimization to the extreme

InnoDB also has one more small optimization for the LRU chain: if a cache page is accessed and it is already in the first quarter of the hot chain, it is not moved to the head. The goal, again, is performance, because even with the doubly linked list + hash table structure, moving cache pages still has a cost.
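Putting the two rules together (cold-chain insertion with a promotion window, plus leaving pages alone when they are already in the first quarter of the hot chain), here is a rough Java sketch of the idea. It only models the bookkeeping: the class is invented for this article, it omits the 37% midpoint ratio, and it is not how InnoDB actually implements its lists:

```java
import java.util.LinkedList;

// Toy model of a hot/cold split LRU chain. The 1000 ms window mirrors the
// innodb_old_blocks_time default mentioned above; the 37% midpoint ratio and
// InnoDB's real data structures are deliberately left out.
public class HotColdLru {

    static class Page {
        final long pageNo;
        final long firstLoadTime;            // when the page entered the cold chain
        Page(long pageNo, long now) { this.pageNo = pageNo; this.firstLoadTime = now; }
    }

    private final LinkedList<Page> hot = new LinkedList<>();   // frequently accessed pages
    private final LinkedList<Page> cold = new LinkedList<>();  // newly loaded pages
    private final int capacity;
    private final long promoteAfterMs = 1000;                  // innodb_old_blocks_time default

    public HotColdLru(int capacity) { this.capacity = capacity; }

    // A page read from disk goes to the head of the cold chain, never straight to hot.
    public Page load(long pageNo, long now) {
        if (hot.size() + cold.size() >= capacity) {
            if (!cold.isEmpty()) cold.removeLast();   // evict from the tail of the cold chain
            else hot.removeLast();
        }
        Page page = new Page(pageNo, now);
        cold.addFirst(page);
        return page;
    }

    // A later access promotes the page only if the 1000 ms window has passed.
    public void access(Page page, long now) {
        if (cold.remove(page)) {
            if (now - page.firstLoadTime >= promoteAfterMs) {
                hot.addFirst(page);                   // becomes hot data
            } else {
                cold.addFirst(page);                  // still counts as cold data
            }
            return;
        }
        int pos = hot.indexOf(page);
        // Pages already in the first quarter of the hot chain are left where they are.
        if (pos >= 0 && pos >= hot.size() / 4) {
            hot.remove(pos);
            hot.addFirst(page);
        }
    }
}
```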

The LRU chain structure is as follows:

The cache page is flushed back to disk

First, regardless of whether the LRU chain separates hot and cold data, as we said earlier there are background threads that periodically flush cache pages back to disk. So which cache pages get flushed? The obvious answer: the cache pages at the tail of the cold chain.

Second, when memory is insufficient, which cache pages need to be evicted? Again, the obvious answer: the cache pages at the tail of the cold chain.

How to solve the above hidden dangers

How does this resolve the hidden danger introduced by the prefetch mechanism? Newly loaded data pages are placed at the head of the cold chain and are only promoted to the head of the hot chain if they are accessed again after 1s. Data pages that are prefetched but rarely accessed therefore never enter the hot chain, and since eviction starts from the tail of the cold chain, hot data is not affected. The same reasoning applies to the pages loaded by a full table scan.

Summary

Like pooling, separating hot and cold data is a good idea. We can also consider using similar ideas when designing our own caches.

We often use Redis as a cache. When we need to cache a large amount of data and there is a clear difference between hot and cold data, we cannot cache everything in Redis. Instead, we can cache only the hot data in Redis to improve the cache hit ratio, improve performance, and save resources. This is also called cache preloading of hot data: a set of hot data is computed and preloaded into Redis ahead of time, usually during periods of low user traffic, so that performance holds up during peak traffic.

Speaking of Redis, what cache eviction strategies does Redis, as a high-performance key-value database, provide?

Redis cache eviction strategies

When Redis's actual memory usage exceeds the configured maxmemory, Redis provides several policies that let the user decide how to free space so that it can continue serving reads and writes. The policy is set with the parameter maxmemory-policy. The available policies are as follows:

  1. noeviction: write requests are no longer served (DEL and read requests are still served). No data is lost, but the online business can no longer write. This is the default eviction policy;
  2. volatile-lru: evicts the least recently used key from among the keys that have an expiration time set;
  3. allkeys-lru: unlike volatile-lru, evicts the least recently used key from among all keys;
  4. volatile-random: randomly evicts keys from among the keys that have an expiration time set;
  5. allkeys-random: randomly evicts keys from among all keys;
  6. volatile-ttl: evicts the key with the smallest remaining TTL from among the keys that have an expiration time set.

LRU algorithm in Redis

Similarly, Redis does not use the original LRU algorithm directly, but an approximate LRU algorithm. The original LRU algorithm would impose a significant additional memory cost, because the existing Redis data structures would need to be modified.

The approximate LRU algorithm evicts elements by random sampling on top of the existing data structures. To approximate LRU, Redis adds an extra 24 bits to each key to store the timestamp of the last time the key was accessed.

In Redis, LRU eviction is lazy. That is, when Redis performs a write operation and finds that memory exceeds maxmemory, it performs LRU eviction. The method is simple: randomly sample several keys (the sample size is controlled by maxmemory-samples), evict the oldest of them, and if memory still exceeds maxmemory, keep evicting until it drops below maxmemory.
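To make the sampling idea concrete, here is a rough Java sketch of approximate LRU eviction: every entry carries a last-access timestamp (standing in for Redis's 24-bit per-key clock), and a write that would overflow the limit repeatedly samples a few random keys and evicts the one idle the longest. The class name, the entry count standing in for maxmemory, and the sample size are assumptions made for illustration; this is not Redis's actual code:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Approximate LRU: no global ordering is maintained; eviction samples a handful
// of random keys and removes the one that has been idle the longest.
public class ApproxLru<K, V> {

    static class Entry<V> {
        V value;
        long lastAccess;                   // stands in for Redis's 24-bit per-key clock
        Entry(V value, long now) { this.value = value; this.lastAccess = now; }
    }

    private final Map<K, Entry<V>> data = new HashMap<>();
    private final int maxEntries;          // stands in for maxmemory
    private final int sampleSize = 5;      // like the maxmemory-samples default
    private final Random random = new Random();

    public ApproxLru(int maxEntries) { this.maxEntries = maxEntries; }

    public V get(K key) {
        Entry<V> e = data.get(key);
        if (e == null) return null;
        e.lastAccess = System.currentTimeMillis();   // record the access time
        return e.value;
    }

    public void put(K key, V value) {
        if (!data.containsKey(key)) {
            while (data.size() >= maxEntries) {
                evictOne();                // lazy eviction, triggered by the write
            }
        }
        data.put(key, new Entry<>(value, System.currentTimeMillis()));
    }

    private void evictOne() {
        List<K> keys = new ArrayList<>(data.keySet());   // O(n) copy, fine for a sketch
        K victim = null;
        long oldest = Long.MAX_VALUE;
        for (int i = 0; i < sampleSize && !keys.isEmpty(); i++) {
            K candidate = keys.get(random.nextInt(keys.size()));
            long t = data.get(candidate).lastAccess;
            if (t < oldest) { oldest = t; victim = candidate; }
        }
        if (victim != null) data.remove(victim);
    }
}
```

Because only a few keys are examined per eviction, the victim is only approximately the least recently used key, which is exactly the trade-off Redis makes to avoid maintaining a full linked list.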

Redis 3.0 also introduced an eviction pool, which is essentially a max-heap: newly sampled keys are added to the pool, and the oldest keys are evicted from it. I will not go into the details here.

Conclusion

This article explained LRU, a common page replacement algorithm, covering its basic principle and implementation. It then analyzed how LRU is implemented in InnoDB, the hidden dangers a naive LRU brings, and how InnoDB solves them through hot/cold data isolation and further performance optimization. Finally, it briefly analyzed how the LRU algorithm is implemented in Redis.

The key to this article is to grasp the ideas and use them in your own designs.