For high-concurrency systems, it is not advisable to send every request to the database; the common approach is to relieve the strain on the database with a cache. Caching is the silver bullet for performance and stability problems in high-concurrency scenarios, and the two things that deserve the most attention are cache penetration and data consistency.

Let’s start with an example:
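A minimal sketch of what such a naive read-through cache typically looks like (the names `User`, `UserDao`, `UserService`, the key format, and the Redis setup are illustrative assumptions, not code from a real project):

```java
import java.util.concurrent.TimeUnit;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Service
public class UserService {

    @Autowired
    private RedisTemplate<String, User> redisTemplate;

    @Autowired
    private UserDao userDao;

    public User getUser(long userId) {
        String key = "user:" + userId;

        // 1. Try the cache first.
        User user = redisTemplate.opsForValue().get(key);
        if (user != null) {
            return user;
        }

        // 2. Cache miss: load from the database.
        user = userDao.findById(userId);

        // 3. Write the result back to the cache with a TTL.
        if (user != null) {
            redisTemplate.opsForValue().set(key, user, 10, TimeUnit.MINUTES);
        }
        return user;
    }
}
```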

This is an example of a simple application cache with the following problems:

1: Under high concurrency, most requests go straight to the database: on a cache miss, many threads query the database at once, because they all arrive before the first one has had time to write the result back to the cache.

2: The business logic and the caching logic are too tightly coupled.

Let’s solve the first problem first. The fix is locking: only one thread loads the data while the others wait for the cached result. However, think carefully about where to place the lock, because this can cause performance problems; you cannot simply lock the whole method, or every request would be serialized. A per-key lock, as sketched below, keeps the contention local to a single piece of data.
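A sketch of per-key locking with a double check, using the same illustrative names as above (a real implementation would also bound or evict the lock map):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Service
public class UserService {

    @Autowired
    private RedisTemplate<String, User> redisTemplate;

    @Autowired
    private UserDao userDao;

    // One lock object per cache key, so requests for different users
    // never block each other.
    private final ConcurrentHashMap<String, Object> keyLocks = new ConcurrentHashMap<>();

    public User getUser(long userId) {
        String key = "user:" + userId;
        User user = redisTemplate.opsForValue().get(key);
        if (user != null) {
            return user;
        }
        // Locking the whole method would serialize every request;
        // locking per key only serializes requests for the same data.
        Object lock = keyLocks.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {
            // Double-check: another thread may have written the cache
            // while this one was waiting for the lock.
            user = redisTemplate.opsForValue().get(key);
            if (user == null) {
                user = userDao.findById(userId);
                if (user != null) {
                    redisTemplate.opsForValue().set(key, user, 10, TimeUnit.MINUTES);
                }
            }
        }
        return user;
    }
}
```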

Next, solve the second problem: decouple the caching logic from the business logic so that others can easily reuse this caching method. That is, using the template method design pattern, make a cache template, as sketched below.
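One possible shape for such a template; the class `CacheTemplate` and its method names are assumptions for illustration, not a real library API:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;

// Template method pattern: the skeleton (check cache, lock per key,
// double-check, load, write back) lives here once; subclasses only
// say how to build the key and how to load the data.
public abstract class CacheTemplate<K, V> {

    @Autowired
    private RedisTemplate<String, V> redisTemplate;

    // Per-key locks (never evicted here; a real implementation would bound this map).
    private final ConcurrentHashMap<String, Object> keyLocks = new ConcurrentHashMap<>();

    /** How to build the cache key for a given parameter. */
    protected abstract String key(K param);

    /** How to load the value from the data source on a miss. */
    protected abstract V load(K param);

    /** How long an entry may live in the cache, in minutes. */
    protected long ttlMinutes() {
        return 10;
    }

    public V get(K param) {
        String key = key(param);
        V value = redisTemplate.opsForValue().get(key);
        if (value != null) {
            return value;
        }
        Object lock = keyLocks.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {
            // Double-check inside the lock: another thread may have
            // loaded and cached the value while we were waiting.
            value = redisTemplate.opsForValue().get(key);
            if (value == null) {
                value = load(param);
                if (value != null) {
                    redisTemplate.opsForValue().set(key, value, ttlMinutes(), TimeUnit.MINUTES);
                }
            }
        }
        return value;
    }
}
```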

Calling the template then looks like this:
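A usage sketch, with a subclass supplying only the key and the load step (`User` and `UserDao` are the same illustrative names as above):

```java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

// Business code only says WHAT to cache; the template handles
// locking, double-checking, and the write-back.
@Component
public class UserCache extends CacheTemplate<Long, User> {

    @Autowired
    private UserDao userDao;

    @Override
    protected String key(Long userId) {
        return "user:" + userId;
    }

    @Override
    protected User load(Long userId) {
        return userDao.findById(userId);
    }
}

// Elsewhere in the service layer:
// User user = userCache.get(userId);
```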

So far, this is a fairly complete solution.

But as an architect, you need to broaden your horizons and build a more convenient abstraction that more people can use flexibly. Borrowing the idea from Spring Cache, we can use AOP + annotations to decouple caching from business logic entirely.
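As a sketch of the annotation side, here is a hypothetical @Cache annotation modeled loosely on Spring’s @Cacheable; the attribute names key, expire, autoLoad, and alarmTime are assumptions for illustration:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// A hypothetical caching annotation: business code only declares what
// to cache; an AOP aspect (sketched after the process below) does the work.
@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface Cache {

    /** Cache key prefix or expression for the intercepted method. */
    String key();

    /** Time-to-live of the cached value, in seconds. */
    long expire() default 600;

    /** Whether a background worker should keep this entry loaded. */
    boolean autoLoad() default false;

    /** Refresh ahead of expiry when fewer than this many seconds remain. */
    long alarmTime() default 60;
}
```

Business code then shrinks to something like:

```java
@Cache(key = "user", expire = 600, autoLoad = true)
public User getUser(long userId) {
    return userDao.findById(userId);
}
```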

The general process is as follows (a condensed sketch of the aspect follows the list):

1. Obtain the @Cache annotation from the intercepted method and generate the cache key;

2. Use the cache key to retrieve data from the cache.

3. If the cache hits, perform the following operations:

  • If automatic loading is required, the relevant information is saved to the automatic loading queue.
  • Otherwise, check whether the cached entry is about to expire; if it is, initiate an asynchronous refresh.
  • Finally, the data is returned to the user.

4. If the cache misses, perform the following operations:

  • Elect a leader request to load the data from the data source, then notify the other waiting requests to fetch the result from memory.
  • The leader writes the data to the cache; if automatic loading is required, the relevant information is saved to the automatic loading queue.
  • Finally, the data is returned to the user.
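
A condensed, illustrative sketch of an aspect implementing this flow. It uses an in-memory map as a stand-in for the real cache, a naive key builder instead of a SpEL parser, and leaves the auto-load worker out:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.springframework.stereotype.Component;

@Aspect
@Component
public class CacheAspect {

    /** A cached value plus its expiry timestamp (in-memory stand-in for Redis). */
    private static final class Entry {
        final Object value;
        final long expireAtMillis;

        Entry(Object value, long ttlSeconds) {
            this.value = value;
            this.expireAtMillis = System.currentTimeMillis() + ttlSeconds * 1000;
        }

        boolean expired() {
            return System.currentTimeMillis() > expireAtMillis;
        }

        boolean expiresWithin(long seconds) {
            return expireAtMillis - System.currentTimeMillis() < seconds * 1000;
        }
    }

    private final ConcurrentHashMap<String, Entry> store = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, Object> keyLocks = new ConcurrentHashMap<>();
    private final ExecutorService refresher = Executors.newFixedThreadPool(2);
    // Drained by a background worker (not shown) that reloads entries.
    private final BlockingQueue<String> autoLoadQueue = new LinkedBlockingQueue<>();

    @Around("@annotation(cache)")
    public Object around(ProceedingJoinPoint pjp, Cache cache) throws Throwable {
        // 1. Build the cache key from the annotation and the arguments
        //    (a real implementation would evaluate a key expression).
        String key = cache.key() + ":" + java.util.Arrays.toString(pjp.getArgs());

        // 2-3. Cache hit.
        Entry entry = store.get(key);
        if (entry != null && !entry.expired()) {
            if (cache.autoLoad()) {
                autoLoadQueue.offer(key); // hand refreshing over to the auto-load worker
            } else if (entry.expiresWithin(cache.alarmTime())) {
                // About to expire: refresh asynchronously so readers never miss.
                // NOTE: re-invoking proceed() from another thread is a
                // simplification; a real aspect captures the method and args.
                refresher.submit(() -> {
                    try { loadAndPut(pjp, cache, key); } catch (Throwable ignored) { }
                });
            }
            return entry.value;
        }

        // 4. Cache miss: elect a per-key leader; followers block on the
        //    same lock and then read the leader's result from memory.
        Object lock = keyLocks.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {
            entry = store.get(key);
            if (entry != null && !entry.expired()) {
                return entry.value; // follower: the leader already loaded it
            }
            Object value = loadAndPut(pjp, cache, key); // leader loads from the source
            if (cache.autoLoad()) {
                autoLoadQueue.offer(key);
            }
            return value;
        }
    }

    private Object loadAndPut(ProceedingJoinPoint pjp, Cache cache, String key) throws Throwable {
        Object value = pjp.proceed();
        if (value != null) {
            store.put(key, new Entry(value, cache.expire()));
        }
        return value;
    }
}
```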