1. Introduction to caching

1.1 What is caching

A cache is a buffer for data exchange; in essence, it is an in-memory hash table. Caching trades space for time, with the goal of making data faster to access and closer to its consumer:

  • A storage device that reads and writes data faster;

  • Cache data in the location nearest to the application;

  • Cache data in the location nearest to the user.

A cache is the part of hardware or software used to store data so that subsequent accesses to that data are faster. The data in a cache may be pre-computed results, copies of data, and so on. Typical scenarios include the CPU cache and the disk cache. The caches discussed in this article mainly refer to the cache components used in Internet applications.

Cache hit ratio is an important cache metric; the higher the hit ratio, the better.

Cache hit ratio = reads served from cache / total reads

1.2 When caching is needed

Introducing a cache increases system complexity, so it is worth weighing whether it is worthwhile before doing so. The main considerations are:

  • CPU overhead – If a computation consumes a lot of CPU, consider caching its result. Typical scenarios: complex, frequently invoked regular-expression evaluation; intermediate state of distributed computations; and so on.

  • IO overhead – If the database connection pool is busy, consider caching its query results.

Introducing caching at the data layer has several benefits:

  • Improves data read speed.

  • Improves system scalability: capacity can be grown by expanding the cache.

  • Reduces storage cost: Cache + DB can handle a request volume that would otherwise require multiple DB instances, saving machine cost.

1.3 Basic principles of Caching

Depending on the business scenario, the cache can be used in the following ways:

  • Lazy loading (triggered on read): on a cache miss, query the DB and write the result to the cache (see the sketch after this list).

  • Eager loading (triggered on write): write the data to the DB, then write the related data to the cache.

  • Periodic refresh: suitable for data produced by periodic jobs, or list-style data that does not need to be absolutely real-time.
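A minimal sketch of the lazy (read-triggered) pattern in Java. The Cache and Database interfaces are hypothetical placeholders for whatever cache client and data-access layer an application actually uses; the key prefix and TTL are arbitrary.

```java
// Lazy loading (cache-aside read): check the cache first, fall back to the DB on a miss,
// then backfill the cache. Cache and Database are hypothetical placeholder interfaces.
public class UserReader {

    interface Cache {
        String get(String key);
        void set(String key, String value, int ttlSeconds);
    }

    interface Database {
        String findUserJsonById(String userId);
    }

    private final Cache cache;
    private final Database db;

    public UserReader(Cache cache, Database db) {
        this.cache = cache;
        this.db = db;
    }

    public String getUser(String userId) {
        String key = "user:" + userId;
        String cached = cache.get(key);
        if (cached != null) {
            return cached;                           // cache hit
        }
        String fromDb = db.findUserJsonById(userId); // cache miss: read from the DB
        if (fromDb != null) {
            cache.set(key, fromDb, 600);             // backfill with a 10-minute TTL
        }
        return fromDb;
    }
}
```

The eager pattern simply moves the cache write to the write path: update the DB first, then write the cache, so reads rarely miss.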

1.4 Cache eviction strategy

Cache eviction can be triggered in several ways:

1) Space-based: cap the total size of the cache space.

2) Capacity-based: cap the number of records stored in the cache.

3) Time-based (see the sketch after this list):

  • TTL (Time To Live): the time from an entry's creation to its expiration.

  • TTI (Time To Idle): the length of time an entry may go unaccessed before it is evicted.
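As a concrete illustration of TTL versus TTI, here is a small sketch using Caffeine's builder (the library is introduced in section 3.4); the size and durations are arbitrary.

```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import java.util.concurrent.TimeUnit;

public class ExpirationExample {
    public static void main(String[] args) {
        Cache<String, String> cache = Caffeine.newBuilder()
                .maximumSize(10_000)                       // capacity-based limit
                .expireAfterWrite(10, TimeUnit.MINUTES)    // TTL: expires 10 min after being written
                .expireAfterAccess(5, TimeUnit.MINUTES)    // TTI: expires after 5 min without access
                .build();

        cache.put("token:42", "abc");
        System.out.println(cache.getIfPresent("token:42")); // "abc" while still fresh
    }
}
```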

Cache eviction algorithms:

1) FIFO (first in, first out): the entry that entered the cache first is evicted first. This is the simplest policy, but it yields a poor hit ratio: a very frequently accessed entry that happens to have been cached earliest gets pushed out by entries cached later but accessed far less often.

2) LRU (least recently used): avoids the problem above. Every time an entry is accessed it is moved to the tail of the queue, and when eviction is needed, the head of the queue is evicted. There is still a problem, however: if an entry is accessed 10,000 times in the first 59 minutes of an hour (clearly hot data) and then not accessed in the next minute while other entries are, that hot entry gets evicted.

3) LFU (least frequently used): improves on the above by using extra space to record how often each entry is used and evicting the entry with the lowest frequency. This avoids LRU's inability to handle such time windows.

These three algorithms are listed in order of increasing implementation complexity, and their hit ratios improve in the same order. In practice, the middle option is usually chosen: LRU, whose implementation cost is moderate and whose hit ratio is good.

2. Cache classification

From a deployment perspective, caching can be divided into client-side caching and server-side caching.

Client cache

  • HTTP cache

  • Browser cache

  • App cache (Android, iOS)

Server side cache

  • CDN cache: Stores static resources such as HTML, CSS, and JS.

  • Reverse proxy cache: separates static from dynamic content and caches only the static resources requested by users.

  • Database cache: databases (such as MySQL) have their own caches, but they are generally not recommended because of their hit ratio and update frequency.

  • In-process cache: Caches commonly used data such as application dictionaries.

  • Distributed cache: Caches hotspot data in the database.

CDN caches, reverse proxy caches, and database caches are generally maintained by dedicated staff (operations engineers, DBAs). Back-end developers generally focus on in-process caches and distributed caches.

2.1 HTTP cache

2.2 CDN cache

A CDN caches data on the servers physically closest to users, so that users can obtain the requested content from a nearby node. A CDN generally caches static resource files (pages, scripts, images, videos, files, and so on).

Why use a CDN

Domestic networks are extremely complex and cross-carrier access is slow. By deploying CDN nodes in major cities, cross-carrier and cross-region access problems can be addressed: users obtain the content they need nearby, which reduces network congestion and improves both response speed and hit ratio.

2.2.1 CDN principle

The basic principle of a CDN is the same one used by caches everywhere: cache servers are deployed in regions or networks where user access is concentrated, and when a user visits the site, global load balancing directs the request to the nearest working cache server, which responds to the user directly.

1) Network path before CDN application is deployed:

  • Request: Local network (LAN) => Carrier network => application server room

  • Response: Application server room => Carrier Network => Local Network (LAN)

Without considering the complex network, it takes 3 nodes and 6 steps to complete a user access operation from request to response.

2) Network path after CDN application deployment:

  • Request: Local network (LAN) => carrier network

  • Response: Carrier network => Local Network (LAN)

Without considering network complexity, a user access operation now takes two nodes and two steps from request to response. Compared with the deployment without a CDN, one node and four steps are eliminated, which greatly improves the system's response speed.

2.2.2 CDN characteristics

Advantages

  • Local cache acceleration: improves access speed, especially for sites with many images and static pages.

  • Cross-carrier network acceleration: eliminates the bottlenecks of interconnection between carriers, so that users on different networks all get good access quality.

  • Remote acceleration: DNS load balancing automatically directs each remote user to the fastest cache server, speeding up remote access.

  • Bandwidth optimization: remote mirror (cache) servers are generated automatically for the origin server, so remote users read from the cache server. This reduces the bandwidth used for remote access, shares network traffic, and lightens the load on the origin site's web server.

  • Cluster-based attack resistance: widely distributed CDN nodes plus intelligent redundancy between nodes can effectively prevent intrusions and reduce the impact of DDoS attacks on websites, while maintaining good service quality.

Disadvantages

  • Caching dynamic resources is not appropriate.

Solution: cache mainly static resources; for dynamic resources, build a multi-level cache or use near-real-time synchronization.

  • Data consistency issues.

Solutions (essentially a balance between performance and data consistency):

1. Set a cache expiration time (e.g., 1 hour), after which data is synchronized again.

2. Attach a version number to each resource.

2.3 Reverse proxy caching

In reverse proxy mode, a proxy server receives connection requests from the Internet, forwards them to servers on the internal network, and returns the results to the Internet clients. A proxy server working this way is called a reverse proxy server.

2.3.1 Principle of reverse proxy caching

The reverse proxy resides on the same network as the application server and handles all requests to the web server. Reverse proxy caching works as follows:

  • If the page requested by the user is cached on the proxy server, the proxy server directly sends the cached content to the user.

  • If there is no cache, a request is sent to the WEB server to retrieve the data, which is cached locally and then sent to the user.

This approach reduces the load on the WEB server by reducing the number of requests to the WEB server.

Reverse proxy caching typically targets static resources and forwards dynamic requests to the application server. Common caching application servers are Varnish, Nginx, and Squid.

2.3.2 Reverse proxy cache comparison

Common proxy caches include Varnish, Squid, and Nginx. A brief comparison:

  • Varnish and Squid are dedicated cache services; Nginx needs third-party modules for caching.

  • Varnish caches in memory, avoiding frequent swapping of files between memory and disk, and performs better than Squid.

  • Because Varnish is a memory cache, it suits small files such as CSS, JS, and small images; a persistent back-end cache such as Squid or ATS can sit behind it.

  • Squid is full-featured and heavyweight, suitable for caching all kinds of static files; multiple instances are usually run behind an HAProxy or Nginx load balancer.

  • Nginx uses the third-party module Ncache for caching, with performance roughly matching Varnish; it is generally used as a reverse proxy and can provide simple caching.

3. In-process caching

In-process cache (local cache) refers to a cache inside the application itself. In a standard distributed system, caching usually consists of multiple levels, and the local cache is the level closest to the application; it can keep data on disk or in memory.

  • Disk cache: data is cached on disk and read from disk. It works by reading local files directly, which avoids network transfer and is faster than reading the database over the network. It suits scenarios where speed requirements are modest but a large amount of data must be cached.

  • Memory cache: data is stored directly in local memory and cache objects are maintained directly by the program. This is the fastest way to access data.

Common local Cache implementations include HashMap, Guava Cache, Caffeine, and Ehcache.

3.1 ConcurrentHashMap

The simplest in-process cache can be implemented using the JDK’s built-in HashMap or ConcurrentHashMap.

  • Application scenario: cached data that never needs to be evicted (a minimal sketch follows this list).

  • Cons: no eviction policy, so memory grows without bound.
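A minimal sketch of a ConcurrentHashMap-based cache for reflection lookups; the class and method names are illustrative only.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplest in-process cache: a ConcurrentHashMap with no eviction, suitable for small,
// fixed data sets such as reflection metadata.
public class MethodCache {

    private static final Map<String, Method> CACHE = new ConcurrentHashMap<>();

    public static Method getMethod(Class<?> type, String name) {
        // computeIfAbsent loads the value once per key; later calls reuse the cached entry.
        return CACHE.computeIfAbsent(type.getName() + "#" + name, k -> {
            try {
                return type.getMethod(name);
            } catch (NoSuchMethodException e) {
                throw new IllegalArgumentException("no such method: " + k, e);
            }
        });
    }
}
```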

3.2 LRUHashMap

A simple LRU cache can be implemented by extending LinkedHashMap and overriding the removeEldestEntry method, which yields a basic least-recently-used eviction policy, as shown in the sketch below.
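A sketch of that idea, assuming a fixed maximum number of entries:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache built on LinkedHashMap in access-order mode: the eldest (least
// recently accessed) entry is dropped once the size limit is exceeded.
public class LRUHashMap<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    public LRUHashMap(int maxEntries) {
        // initialCapacity 16, loadFactor 0.75, accessOrder=true (iteration order = access order)
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;   // evict the least recently used entry
    }
}
```

LinkedHashMap is not thread-safe, so it is usually wrapped with Collections.synchronizedMap, which is exactly where the heavy lock contention listed below comes from.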

Disadvantages:

  • Lock contention is severe, so performance is low.

  • No expiration time is supported.

  • Automatic refresh is not supported.

3.3 Guava Cache

Guava Cache addresses several of LRUHashMap's shortcomings. Like the segmented design of ConcurrentHashMap, it shards its locks to reduce lock contention.

Guava Cache does not expire entries the moment they become stale (there is no background scanning thread). Instead, expiration is carried out during read and write operations, which avoids the global locking a background scan would require; on access it also checks whether an entry meets the refresh conditions and refreshes it if so. A brief usage sketch follows.
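A typical usage sketch (the loader method is a hypothetical stand-in for a real database lookup; sizes and durations are arbitrary):

```java
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import java.util.concurrent.TimeUnit;

public class GuavaCacheExample {

    // Hypothetical data-access method standing in for the real loader.
    private static String loadFromDb(String key) {
        return "value-for-" + key;
    }

    public static void main(String[] args) throws Exception {
        LoadingCache<String, String> cache = CacheBuilder.newBuilder()
                .maximumSize(10_000)                      // capacity-based eviction
                .expireAfterWrite(10, TimeUnit.MINUTES)   // TTL, checked during reads/writes
                .refreshAfterWrite(1, TimeUnit.MINUTES)   // refresh condition, also checked on access
                .build(new CacheLoader<String, String>() {
                    @Override
                    public String load(String key) {
                        return loadFromDb(key);           // called on a miss (and on refresh)
                    }
                });

        System.out.println(cache.get("user:1"));
    }
}
```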

3.4 Caffeine

Caffeine implements W-TinyLFU (a variant combining LFU and LRU), which gives a far better hit ratio and read/write throughput than Guava Cache. Its implementation is more complicated; see "What you should know about cache evolution" in the references. A usage sketch follows.
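Note how close the builder API is to Guava Cache's (the loader is again a hypothetical stand-in):

```java
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;
import java.util.concurrent.TimeUnit;

public class CaffeineExample {

    // Hypothetical loader standing in for the real data source.
    private static String loadFromDb(String key) {
        return "value-for-" + key;
    }

    public static void main(String[] args) {
        LoadingCache<String, String> cache = Caffeine.newBuilder()
                .maximumSize(10_000)                      // bounded; W-TinyLFU decides what stays
                .expireAfterWrite(10, TimeUnit.MINUTES)
                .refreshAfterWrite(1, TimeUnit.MINUTES)
                .build(CaffeineExample::loadFromDb);      // loader invoked on a miss

        System.out.println(cache.get("user:1"));
    }
}
```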

3.5 Ehcache

EhCache is a pure Java in-process caching framework. It is fast and clean, and is the default CacheProvider in Hibernate.

Advantages

  • Fast and simple;

  • Supports multiple eviction strategies: LRU, LFU, and FIFO;

  • Two levels of cache storage, memory and disk, so capacity is not a concern;

  • Cached data can be written to disk when the VM restarts;

  • Distributed caching can be implemented through RMI, a pluggable API, and so on;

  • Provides listener interfaces for caches and cache managers;

  • Supports multiple cache manager instances, and multiple cache regions per instance;

  • Provides a Hibernate cache implementation.

Disadvantages

  • The disk cache occupies a lot of disk space;

  • Does not guarantee data security;

  • Although distributed caching is supported, it is inefficient (data is synchronized between nodes via multicast).

3.6 In-process cache comparison

Comparison of common in-process caching techniques:

  • ConcurrentHashMap: suitable for caching a relatively small number of fixed elements. It looks spartan next to the dedicated caching libraries above, but as a native JDK class it is still widely used in frameworks, for example to cache reflected methods and fields, or to record links and prevent duplicates. Caffeine itself uses a ConcurrentHashMap to store its elements.

  • LRUMap: a good choice if you want an eviction algorithm but do not want to introduce third-party dependencies.

  • Ehcache: its jar is large, so it is relatively heavyweight. Ehcache is an option when persistence or clustering is needed. Note that although Ehcache also supports distributed caching, it is generally not recommended for that purpose: inter-node communication goes over RMI and performance is not as good as Redis.

  • Guava Cache: the Guava jar is already present in many Java applications, so it is easy to adopt, lightweight, and feature-rich. If Caffeine is not required, Guava Cache is a fine choice.

  • Caffeine: hit ratio and read/write performance are far better than Guava Cache, and its API is almost identical to Guava Cache's (if not richer). It has been used to great effect in production.

To summarize: choose ConcurrentHashMap if you do not need an eviction algorithm; if you need eviction and a richer API, Caffeine is recommended.

4. Distributed cache

Distributed caching solves the biggest problem of in-process caching: in a distributed system, application nodes cannot share each other's in-process caches. Typical application scenarios for distributed caching:

  • Caches data from complex calculations.

  • Cache frequently accessed hotspot data in the system to reduce database pressure.

The implementation principles of different distributed caches are often quite different. This article focuses on Memcached and Redis.

4.1 Memcached

Memcached is a high-performance, distributed in-memory object caching system. By maintaining one giant hash table in memory, it can store data in many formats, including images, videos, files, and database query results.

In simple terms, data is cached in memory and then read from memory, thus greatly improving the reading speed.

4.1.1 Memcached features

  • Uses physical memory as the cache and can run independently on a server. Each process is limited to 2 GB; to cache more data, start more Memcached processes (on different ports) or deploy Memcached in a distributed fashion across physical or virtual machines.

  • Stores data as key-value pairs. This single-index data organization gives O(1) lookup time for data items.

  • Simple, text-line-based protocol. Data can be accessed on a Memcached server directly via Telnet, which is simple and convenient, and other caches can reuse the protocol.

  • High-performance communication based on libevent. libevent is a C library that wraps event mechanisms such as kqueue on BSD and epoll on Linux behind a single interface; compared with traditional select, it improves performance.

  • Distribution depends on the Memcached client: the servers do not communicate with each other and store data independently without sharing any information. The server side has no distributed capability; distributed deployment is implemented entirely by the client.

  • Uses an LRU eviction policy. When storing an item you can specify when it expires in the cache; the default is never. When the server exhausts its allocated memory, expired data is replaced first, followed by the least recently used data. On top of LRU, Memcached uses lazy expiration: it does not actively monitor whether stored key/value pairs have expired, but checks the record's timestamp when the key is fetched, which reduces server load.

  • Has a built-in, efficient memory-management scheme. It is efficient and avoids memory fragmentation, but its biggest drawback is wasted space. When memory is full, unused cache entries are removed automatically by the LRU algorithm.

  • Does not support persistence. Memcached makes no provision for data recovery; restart the service and all data is lost.

4.1.2 How Memcached works

1) Memory management

Memcached uses slab allocation to allocate and manage memory. It divides the allocated memory into blocks of preset sizes and groups blocks of the same size together. When data is stored, the closest-matching slab size is chosen, so some space is inevitably wasted.

This method of memory management is efficient and does not cause memory fragmentation, but its biggest disadvantage is that it causes space waste.

2) Cache eviction strategy

Memcached's eviction policy combines LRU with an expiration policy.

When you store an item in Memcached, you can specify when it will expire in the cache, which defaults to permanent. When the Memcached server runs out of allocated memory, the stale data is replaced first, then the recently unused data.

On top of LRU, Memcached uses lazy expiration: it does not monitor whether stored key/value pairs have expired; instead, it checks the record's timestamp when the key is fetched to decide whether the entry has expired, which reduces the load on the server.

3) Partitioning

Memcached servers do not communicate with each other; distribution relies on the client. Specifically, the client implements an algorithm that decides, based on the key, which server node the data should be read from or written to.

There are three common algorithms for selecting cluster nodes:

  • Hash-mod algorithm: compute hash(key) % N (where N is the number of nodes) to decide which node a key maps to. It is simple, but adding or removing a node remaps most keys.

  • Consistent hashing: solves that stability problem. All storage nodes are arranged end to end on a hash ring; after hashing, each key is assigned to the first storage node found clockwise. When a node joins or leaves, only the keys on the ring segment clockwise-adjacent to that node are affected (see the sketch after this list).

  • Virtual hash slots: a well-distributed hash function maps all data into a fixed range of integers, defined as slots, where the number of slots is typically much larger than the number of nodes. A slot is the basic unit of data management and migration in the cluster; using a wide slot range makes data splitting and cluster expansion easier. Each node is responsible for a certain number of slots.
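A minimal sketch of client-side node selection with consistent hashing, using a TreeMap as the hash ring and MD5 as the hash function; the virtual-node count is arbitrary and the class is illustrative, not any particular client's implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Minimal consistent-hash ring for client-side node selection.
public class ConsistentHashRouter {

    private static final int VIRTUAL_NODES = 160;        // virtual nodes per physical node
    private final TreeMap<Long, String> ring = new TreeMap<>();

    public ConsistentHashRouter(List<String> nodes) {
        for (String node : nodes) {
            for (int i = 0; i < VIRTUAL_NODES; i++) {
                ring.put(hash(node + "#" + i), node);     // place each virtual node on the ring
            }
        }
    }

    /** Returns the node responsible for the key: the first node clockwise on the ring. */
    public String route(String key) {
        long h = hash(key);
        SortedMap<Long, String> tail = ring.tailMap(h);
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xFF);             // first 8 bytes of MD5 as a 64-bit hash
            }
            return h;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```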

4.2 Redis

Redis is an open-source (BSD-licensed), in-memory, multi-data-structure store. It can be used as a database, a cache, and a message broker.

Redis can also use client-side sharding to scale write performance. It has built-in replication, Lua scripting, LRU eviction, transactions, and several levels of on-disk persistence, and it provides high availability through Redis Sentinel and automatic partitioning with Redis Cluster.

4.2.1 Redis features

  • Supports multiple data types – String, Hash, List, Set, and Sorted Set.

  • Supports a variety of eviction policies:

volatile-lru: evict the least recently used keys among those with an expiration time set;

volatile-ttl: evict the keys with the shortest remaining time to live among those with an expiration time set;

volatile-random: evict random keys among those with an expiration time set;

allkeys-lru: evict the least recently used keys from the whole keyspace;

allkeys-random: evict random keys from the whole keyspace;

noeviction: never evict; write commands fail once memory is full.

  • Two types of persistence are provided – RDB and AOF.

  • Redis Cluster provides the clustering mode.

4.2.2 Redis principle

1) Cache eviction

Redis removes expired data in two ways:

  • Passive (lazy): when a key is accessed, it is deleted if it is found to have expired.

  • Active: expired keys are periodically deleted in batches according to the eviction policy.

2) Partitioning

  • Redis Cluster contains 16,384 virtual hash slots and uses CRC16(key) mod 16384 to decide which slot a key belongs to (see the sketch after this list).

  • Redis Cluster supports request redirection: when a node receives a command, it checks whether the slot is its own; if not, it returns a MOVED error to the client, which carries the information needed to redirect the request to the node responsible for that slot.
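For illustration, here is a sketch of the slot calculation. Redis documents the formula as HASH_SLOT = CRC16(key) mod 16384, with the XModem CRC16 variant; hash tags ("{...}") are ignored in this sketch.

```java
import java.nio.charset.StandardCharsets;

// Sketch: mapping a key to one of Redis Cluster's 16,384 hash slots.
public class RedisSlot {

    public static int slot(String key) {
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }

    // CRC16/XModem: polynomial 0x1021, initial value 0, no reflection, no final XOR.
    private static int crc16(byte[] bytes) {
        int crc = 0x0000;
        for (byte b : bytes) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = (crc & 0x8000) != 0 ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    public static void main(String[] args) {
        System.out.println(slot("user:1000"));  // prints the slot this key belongs to
    }
}
```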

3) Master/slave replication

  • Redis supports asynchronous replication (since version 2.8), in two modes:

Full resynchronization – used for initial replication; its execution is basically the same as the SYNC command.

Partial resynchronization – used when replication resumes after a disconnection. If conditions permit, the master sends the slave only the write commands executed during the disconnection, and the slave simply receives and executes them to bring its dataset back in line with the master's.

  • Each node in the cluster periodically sends PING messages to other nodes in the cluster to check whether they are online.

  • If a master node is considered offline, one of its slave nodes is elected according to Raft algorithm and promoted to the master node.

4) Data consistency

  • Redis does not guarantee strong consistency, as this can significantly degrade cluster performance.

  • Redis uses asynchronous replication to achieve final consistency.

4.3 Distributed Cache Comparison

Different distributed caches differ greatly in features and implementation principles, so they suit different scenarios.

Here, three well-known distributed caches (MemCache, Redis and Tair) are selected for comparison:

  • MemCache: a purely in-memory caching framework; data persistence and disaster recovery are not supported.

  • Redis: supports rich data structures with high read/write performance, but all data resides in memory, so resource cost must be considered; persistence is supported.

  • Tair: supports rich data structures with high read/write performance, although some types are slow; in theory capacity can be expanded without limit.

Summary: if the service is latency-sensitive and handles a lot of Map/Set-style data, Redis works best. If the service needs to cache a large amount of data and is not particularly latency-sensitive, Memcached is also an option.

5. Multi-level cache

5.1 Overall Cache framework

Typically, the cache of a large software system adopts a multi-level cache scheme:

Request process:

  • The browser sends a request; if the CDN has a cached copy, it is returned directly.

  • If the CDN has no cache, the request goes to the reverse proxy server.

  • If the reverse proxy server has a cached copy, it is returned directly.

  • If the reverse proxy server has no cache, or the request is dynamic, the request goes to the application server.

  • The application server checks the in-process cache; on a hit, the data is returned to the proxy server, which caches it (dynamic requests are not cached).

  • If the in-process cache has no data, the distributed cache is read and the result returned to the application server, which caches part of it in its local cache.

  • If the distributed cache has no data either, the application reads the database and puts the result into the distributed cache.

5.2 Using in-process Caching

If the application is a single-node service, in-process caching is naturally the first choice. An in-process cache is limited by the size of local memory, and other nodes cannot know when it has been updated. Therefore, in-process caching generally suits:

  • The data volume is not very large and the update frequency is low.

  • If you update data frequently and also want to use in-process caching, you can set its expiration time to a shorter time, or set the auto-refresh time to a shorter time.

This scheme has the following problems:

  • If the application service is a distributed system, the cache cannot be shared between application nodes, causing data inconsistency.

  • Since the in-process cache is limited by memory size, the cache cannot be extended indefinitely.

5.3 Using distributed Cache

If the application is a distributed system, the simplest caching solution is to use a distributed cache directly.

Redis stores the hotspot data; on a cache miss, the database is queried and the cache is updated. This scheme has the following problems:

  • If the cache service fails, the application can only access the database, which may cause a cache avalanche.

  • Accessing the distributed cache service involves some I/O and serialization/deserialization overhead; performance is high, but not as fast as querying local memory.

5.4 Using Multi-level Caching

Both in-process caching and distributed caching have their own disadvantages. If you need higher performance and better availability, you can design the cache as a multi-level structure. The hottest data is stored in the memory using the in-process cache to further improve the access speed.

This design idea also exists in computer systems, for example, the CPU uses L1, L2, and L3 levels of cache to reduce direct access to memory and thus speed up access. Generally speaking, multi-level cache architecture can meet most business requirements by using two-level cache. Excessive hierarchical cache will increase system complexity and maintenance cost. Therefore, multi-level cache is not the more hierarchical the better, and needs to be weighed according to the actual situation.

A typical two-level cache architecture uses an in-process cache (e.g., Caffeine/Guava Cache/Ehcache/HashMap) as the level-1 cache and a distributed cache (e.g., Redis/Memcached) as the level-2 cache.

5.4.1 Multi-level Cache Query

Multi-level cache query flow is as follows:

  • First, the L1 cache is queried; if it hits, the result is returned directly, otherwise continue.

  • Next, the L2 cache is queried; if it hits, the result is returned and the L1 cache is backfilled, otherwise continue.

  • Finally, the database is queried; the result is returned and the L2 and then the L1 cache are backfilled (see the sketch below).
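A sketch of that read path, using Caffeine as L1 and hypothetical RemoteCache/Database interfaces standing in for the Redis client and the DAO; TTLs are arbitrary.

```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import java.util.concurrent.TimeUnit;

// Two-level read path: L1 (in-process, Caffeine) -> L2 (distributed) -> database,
// backfilling each level on the way back.
public class TwoLevelCache {

    interface RemoteCache {
        String get(String key);
        void set(String key, String value, int ttlSeconds);
    }

    interface Database {
        String query(String key);
    }

    private final Cache<String, String> l1 = Caffeine.newBuilder()
            .maximumSize(10_000)
            .expireAfterWrite(1, TimeUnit.MINUTES)   // keep the L1 TTL shorter than the L2 TTL
            .build();
    private final RemoteCache l2;
    private final Database db;

    public TwoLevelCache(RemoteCache l2, Database db) {
        this.l2 = l2;
        this.db = db;
    }

    public String get(String key) {
        String value = l1.getIfPresent(key);
        if (value != null) return value;             // L1 hit

        value = l2.get(key);
        if (value != null) {                         // L2 hit: backfill L1
            l1.put(key, value);
            return value;
        }

        value = db.query(key);                       // miss everywhere: go to the database
        if (value != null) {
            l2.set(key, value, 600);                 // backfill L2, then L1
            l1.put(key, value);
        }
        return value;
    }
}
```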

5.4.2 Multi-level Cache Update

For the L1 cache, when data is updated, only the machine that performs the update can delete or refresh its own copy; other machines can only refresh their copies through a timeout mechanism. The timeout can be set in two ways:

  • Expire a fixed time after the entry is written;

  • Refresh a fixed time after the entry is written.

For the L2 cache, updates are immediately visible to other machines. However, a timeout must still be set, and it should be longer than the L1 cache's timeout. To address the inconsistency of in-process caches, the design can be optimized further:

The publish/subscribe mechanism of a message queue can be used to notify other application nodes to update their in-process caches. With this scheme, even if the message queue service goes down or is unreliable, eventual consistency is still achieved: the database update happens first, and the in-process caches expire on their own and are then refreshed.

6. Cache issues

6.1 Cache Avalanche

A cache avalanche occurs when the cache becomes unavailable, or when a large number of entries expire in the same period because they share the same timeout, so that a flood of requests hits the database directly and the system collapses under the database load.

For example, suppose system A handles 5,000 requests per second at peak, of which the cache absorbs 4,000, and the cache machine unexpectedly goes down. Now all 5,000 requests per second hit the database; it cannot cope, raises alarms, and goes down too. If no special measures are taken, the DBA restarts the database in a hurry, only to see it immediately killed again by the incoming traffic.

The main solutions to cache avalanche are as follows:

  • Increase the availability of the cache system (beforehand). For example, deploy Redis in a highly available configuration (Redis Cluster, or master/slave plus Sentinel) to avoid a total crash.

  • Adopt a multi-level cache scheme (during the incident). For example, a local cache (Ehcache/Caffeine/Guava Cache) plus a distributed cache (Redis/Memcached).

  • Rate limiting, degradation, and circuit breaking (during the incident), to avoid being killed by the traffic. For example, use Hystrix for circuit breaking and degradation.

  • If the cache supports persistence, data can be restored after the cache service recovers (afterwards). For example, Redis supports persistence: once restarted, it automatically loads data from disk and quickly restores the cached data.

Taken together, the measures above amount to a multi-level caching solution: when the system receives a query, it checks the local cache first, then the distributed cache, and finally the database; as soon as one level hits, the result is returned.

Some auxiliary measures against cache avalanche:

  • Monitor cache and expand capacity flexibly.

  • Give cache entries a randomized expiration time, so that they do not all expire at the same moment and cause a surge in database I/O. For example, if the timeout used to be 10 minutes, let each key expire at a random point between 8 and 13 minutes, so that different keys expire at different times (see the sketch below).
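A tiny sketch of the jitter idea, mirroring the 8-13 minute example above:

```java
import java.util.concurrent.ThreadLocalRandom;

// Add random jitter to the expiration time so that keys written together
// do not all expire at the same moment.
public class TtlJitter {
    public static int ttlSeconds() {
        int base = 8 * 60;                                             // 8 minutes
        int jitter = ThreadLocalRandom.current().nextInt(5 * 60 + 1);  // 0-5 minutes
        return base + jitter;                                          // 8-13 minutes
    }
}
```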

6.2 Cache Penetration

Cache penetration occurs when the queried data does not exist in the database and therefore cannot exist in the cache either, so every such request misses the cache and falls through to the database. As the number of these requests grows, so does the pressure on the database.

There are generally two ways to resolve cache penetration:

1) Cache null values

Results that come back as NULL are still cached; results that throw exceptions are not.

This approach increases the maintenance cost of the cache: the cached null entry must be deleted when the real data is later inserted. Alternatively, the problem can be mitigated by giving null entries a short expiration time.

2) Filter data that can’t possibly exist

Define rules to filter out data that cannot possibly exist. A Bloom filter works well here (it is a compact bit-level data structure, so it performs well). For example, if order IDs are known to fall in the range 1-1000, any request outside that range can be rejected immediately.

Some malicious attacks send large numbers of keys that do not exist, in which case the first approach would cache a huge number of useless null entries; filtering those keys out with the second approach is more appropriate. In short, for data with a very large, rarely repeated key space, there is no point caching nulls, so filter instead; for null results with a limited key space and a high repetition rate, caching nulls (the first approach) works well. A sketch of the filter approach follows.
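A sketch of the filtering approach using Guava's BloomFilter; the expected insertion count and false-positive rate are arbitrary, and OrderIdFilter is an illustrative class, not part of any library.

```java
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import java.nio.charset.StandardCharsets;

// Pre-load all valid IDs into a Bloom filter and reject requests for keys that cannot
// possibly exist before they reach the cache or the database. A Bloom filter may report
// false positives but never false negatives, so "false" is a safe reason to reject.
public class OrderIdFilter {

    private final BloomFilter<String> filter =
            BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, 0.01);

    public void addExistingId(String orderId) {
        filter.put(orderId);                 // called for every ID known to exist
    }

    public boolean mightExist(String orderId) {
        return filter.mightContain(orderId); // false -> definitely absent, skip the lookup
    }
}
```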

6.3 Cache Breakdown

Cache breakdown occurs when a hotspot entry expires and the flood of requests for it goes straight to the database. For example, some keys are hot data accessed very frequently; the moment one of them expires, a large number of concurrent requests miss the cache and the number of database accesses spikes.

To avoid this problem, we can take the following two measures:

  • Distributed lock: lock the hot key so that only one request rebuilds it while concurrent requests for the same key wait (a single-JVM sketch follows this list).

  • Scheduled asynchronous refresh: some entries can be refreshed automatically before they expire rather than simply evicted on expiration. Since eviction exists only to keep data fresh, proactive refresh achieves the same goal.
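A single-JVM sketch of the locking idea: one thread rebuilds the missing hot entry while concurrent readers of the same key wait and then re-check the cache. In a distributed deployment the same pattern is usually built on a distributed lock (for example a Redis SET ... NX EX lock); the Cache and Loader interfaces here are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

public class HotKeyLoader {

    interface Cache {
        String get(String key);
        void set(String key, String value, int ttlSeconds);
    }

    interface Loader {
        String loadFromDb(String key);
    }

    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    private final Cache cache;
    private final Loader loader;

    public HotKeyLoader(Cache cache, Loader loader) {
        this.cache = cache;
        this.loader = loader;
    }

    public String get(String key) {
        String value = cache.get(key);
        if (value != null) return value;

        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock();
        try {
            value = cache.get(key);            // double-check: another thread may have rebuilt it
            if (value == null) {
                value = loader.loadFromDb(key);
                cache.set(key, value, 600);
            }
            return value;
        } finally {
            lock.unlock();
            locks.remove(key);                 // best-effort cleanup of the per-key lock
        }
    }
}
```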

6.4 Summary

The common problems of cache usage were described one by one above. Here they are summarized by the phase in which each countermeasure applies.

  • Beforehand: a Redis high-availability scheme (Redis Cluster, master/slave, Sentinel) to avoid a total cache crash.

  • During: (1) a multi-level cache scheme, local cache (Ehcache/Caffeine/Guava Cache) plus distributed cache (Redis/Memcached); (2) rate limiting + circuit breaking + degradation (Hystrix) to keep the database from being killed in extreme cases.

  • Afterwards: Redis persistence (RDB + AOF); once restarted, data is loaded automatically from disk and the cache recovers quickly.

Memcached's data types are not as rich as Redis's, and it supports neither persistence nor disaster recovery, so Redis is generally chosen for distributed caching.

7. Cache strategy

7.1 Cache Preheating

Cache preheating refers to querying and caching hotspot data after the system starts. This avoids the problem of querying the database and then updating the cache when the user requests it.

Solution:

  • Manually refresh the cache: write a cache-refresh page and trigger it manually when going live.

  • Refresh the cache at application startup: when the data volume is small, it can be loaded automatically at project startup.

  • Periodically and asynchronously refresh the cache.

7.2 How To Cache

7.2.1 Cache without expiration

  • Cache update flow:

  • Start a transaction;

  • Write SQL;

  • Commit the transaction;

  • Write the cache.

Do not place the cache write inside the transaction, especially when writing to a distributed cache: network jitter can slow the cache write and block the database transaction. If the consistency requirement on cached data is low and the data volume is small, periodic full synchronization of the cache is also an option.

In this pattern, the transaction may succeed while the cache write fails, but that situation is less harmful than the problem described above.

7.2.2 Expiration cache

Lazy loading is used. For hot data, a short cache time can be set, combined with periodic asynchronous reloading.

7.3 Cache Update

In general, systems that do not require strict consistency between cache and database should avoid serializing read and write requests. Serialization guarantees that no inconsistency occurs, but it significantly reduces system throughput.

In general, there are two types of cache updates:

  • Delete the cache first, then update the database;

  • Update the database first, then delete the cache;

Why delete the cache instead of updating it?

Imagine multiple concurrent requests updating the same data: there is no guarantee that the order of database updates matches the order of cache updates, so the database and the cache would end up inconsistent. That is why the usual approach is to delete the cache rather than update it.

1) Delete the cache first, then update the database

In the simplest form of this update, all cache levels are deleted first and the database is updated afterwards. The big problem: if a read request arrives after the cache has been deleted but before the database update completes, it reads the old value from the database and loads it back into the cache.

Cache operations, whether they succeed or fail, should never block database operations, so cache deletion is often done asynchronously, and deleting the cache first does not fit that well. One benefit of deleting the cache first is that if the database operation then fails, the worst outcome is only a cache miss caused by the earlier deletion.

2) Update the database first, then delete the cache (note: this strategy is preferred)

This problem can be avoided by updating the database first and then deleting the cache, as sketched below.

A new problem appears, though: while an update is in flight (after the database write but before the cache deletion), queries still return the old value from the cache. Worse, if the subsequent cache deletion fails, the cache may stay dirty for a long time.
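A sketch of the preferred write path (Database and Cache are hypothetical interfaces standing in for the real DAO and cache client):

```java
// Recommended write path: update the database first, then delete (not update) the cache.
// The next read will miss and reload the fresh value via lazy loading.
public class UserWriter {

    interface Database {
        void updateUser(String userId, String newName);
    }

    interface Cache {
        void delete(String key);
    }

    private final Database db;
    private final Cache cache;

    public UserWriter(Database db, Cache cache) {
        this.db = db;
        this.cache = cache;
    }

    public void updateUserName(String userId, String newName) {
        db.updateUser(userId, newName);   // 1. commit the database update first
        cache.delete("user:" + userId);   // 2. then invalidate the cached copy
    }
}
```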

3) Which update strategy to choose

As we know from the above, both update strategies have concurrency issues.

However, updating the database first and then deleting the cache is recommended, because the probability of hitting its concurrency problem is very low: it requires the cache to be invalid at the moment of the read, plus a concurrent write; the read must reach the database before the write does, yet write back to the cache only after the write completes. Since database writes are much slower than reads and take locks, the chance of all these conditions coinciding is small.

To guarantee strong consistency between the database and the cache, 2PC or Paxos could be used, but 2PC is too slow and Paxos is too complex, so strong-consistency schemes are not recommended unless the data is truly critical. See "Distributed database and cache dual-write consistency scheme analysis" in the references for a more detailed analysis.

8. Summary

Finally, a mind map summarizes the knowledge points covered in this article, to help you build a systematic understanding of caching.

9. References

1. Technical Architecture of Large Websites: Core Principles and Case Analysis

2. What you should know about cache evolution

3. How to design and use cache elegantly?

4. Understanding the cache architecture in distributed systems (Part 1)

5. Cache those things

6. Distributed database and cache dual-write consistency scheme analysis

Author: Zhang Peng, Vivo Internet Server Team