Moment For Technology

Paging list caching policy with query criteria

Posted on Dec. 2, 2022, 8:07 p.m. by James Gonzalez
Category: Back-end Tag: Back-end

Objective: To quickly retrieve the paging list data based on the query criteria

The application scenarios are as follows:

Search conditions can be divided into the following categories:

  • Fields such as type and status take values from a finite set; query them with "="
  • Fields such as title and name take free-form user input; query them with "like"

Page:

  • pageNumber
  • pageSize

Data is stored in MySQL

First determine if caching is the best option

Three kinds of properties could be used to build the cache key:

  1. Enumerated values, such as type
  2. Fuzzy-search values, such as title
  3. Paging parameters: pageNumber and pageSize

Cached value:

  1. id
  2. The entire Record object

Whichever of these three properties you build the key from, you face frequent cache invalidation and a low cache hit ratio. For example, suppose all query conditions and paging parameters are combined directly into the key:

  • Every insert, delete, or update of a database record must invalidate the cache, so the cache is cleared very frequently;
  • Once a free-text field is part of the key, user input varies widely, which produces a large number of keys in the cache and a low cache hit ratio
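The second point can be illustrated with a minimal sketch. It assumes a hypothetical `build_cache_key` helper that concatenates every query condition with the paging parameters; the helper name, the key layout, and the sample requests are illustrative only.

```python
def build_cache_key(type_, title, page_number, page_size):
    """Hypothetical key scheme: every query condition plus paging goes into the key."""
    return f"list:type={type_}:title={title}:page={page_number}:size={page_size}"


# Simulated requests: the same type, but free-text titles typed by users.
requests = [
    ("book", "redis", 1, 20),
    ("book", "Redis", 1, 20),
    ("book", "redis cache", 1, 20),
    ("book", "redis", 2, 20),
    ("book", "redis", 1, 20),  # only this request repeats an earlier key
]

cache = {}
hits = 0
for req in requests:
    key = build_cache_key(*req)
    if key in cache:
        hits += 1
    else:
        cache[key] = f"rows for {key}"  # placeholder for the database result

hit_ratio = hits / len(requests)
print(len(cache), hit_ratio)
```

Even in this tiny sample, small differences in the typed text or the page number create separate keys, so most requests miss the cache.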

If the paging parameters are left out of the key, every record that matches the conditions has to be cached under it. Then, whenever a database record is inserted, deleted, or modified, each cache entry must be checked to see whether it is affected by the change and updated accordingly, which makes the update logic very complex

Therefore, in this scenario, caching is not a practical way to improve the interface's response time

Mature solution: Search engine (Elasticsearch)

Elasticsearch (ES) is an open source, distributed search and analytics engine based on Lucene. It can index and retrieve data in near real time, is highly reliable and easy to use, and has an active community. It is widely used for full-text search, log analysis, and monitoring.

Data in the MySQL database can be synchronized to Elasticsearch. When data is updated or deleted in MySQL, the corresponding data in Elasticsearch is updated at the same time. List queries can then be served from ES
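As a rough sketch of what such a list query might look like, the function below builds an Elasticsearch search body as a plain Python dict. The function name `build_list_query` is hypothetical, and the mapping is an assumption: `type` is taken to be a keyword field and `title` a text field. Sending the body to a cluster is left out.

```python
def build_list_query(page_number, page_size, type_=None, title=None):
    """Build an Elasticsearch search body for the paged list (field names assumed)."""
    filters = []
    must = []
    if type_ is not None:
        # Finite-set values: exact match in filter context, no scoring needed.
        filters.append({"term": {"type": type_}})
    if title:
        # Free-text input: full-text match instead of SQL "like".
        must.append({"match": {"title": title}})
    return {
        "from": (page_number - 1) * page_size,  # pageNumber is assumed to start at 1
        "size": page_size,
        "query": {"bool": {"filter": filters, "must": must}},
    }


body = build_list_query(page_number=3, page_size=20, type_="book", title="redis")
print(body)
```

The enumerated condition goes into the `filter` clause and the free-text condition into `must`, which mirrors the "=" versus "like" split described earlier.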

Is list caching useless?

It depends on the specific problem

Scenario: there are no query conditions, and the only requirement is to fetch paged list data quickly. One workable design is as follows:

Use Redis for the cache implementation. The cache is divided into two parts:

  1. A cache of all data ids, stored in a zset: the member is the data id and the score is the sort field (id, rank, timestamp, etc.)
  2. A cache of the objects, each serialized and stored as a String

For a paging query, first look up the ids of the requested page in the zset, then fetch the corresponding Records for that batch of ids. If any Record is not in the cache, load it from the database using where id in (...), and then update the cache

The object cache can also serve getById lookups

Note that the zset must contain all ids, and that each score must be kept in sync with changes to the chosen sort field in the database
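The paging flow can be sketched as follows. To keep the example runnable without a Redis or MySQL server, plain dicts stand in for the zset, the String keys, and the table; the `record:<id>` key layout, the `rank` sort field, and the function names are assumptions made for illustration.

```python
import json

# In-memory stand-ins (assumptions): `zset` mimics a Redis sorted set as
# {id: score}, `objects` mimics Redis String keys, `db` mimics the MySQL table.
db = {i: {"id": i, "title": f"record {i}", "rank": i * 10} for i in range(1, 8)}
zset = {rec_id: rec["rank"] for rec_id, rec in db.items()}  # must hold ALL ids
objects = {}     # "record:<id>" -> serialized Record
db_queries = []  # records every simulated "where id in (...)" call


def load_from_db(ids):
    """Simulates: SELECT * FROM record WHERE id IN (...)."""
    db_queries.append(list(ids))
    return [db[i] for i in ids if i in db]


def get_page(page_number, page_size):
    # Step 1: ids of the page, highest score first (like ZREVRANGE).
    ordered = sorted(zset, key=lambda i: zset[i], reverse=True)
    start = (page_number - 1) * page_size
    page_ids = ordered[start:start + page_size]

    # Step 2: fetch the cached Records in one batch (like MGET).
    found = {i: json.loads(objects[f"record:{i}"])
             for i in page_ids if f"record:{i}" in objects}

    # Step 3: load the misses from the database and write them back.
    missing = [i for i in page_ids if i not in found]
    if missing:
        for rec in load_from_db(missing):
            objects[f"record:{rec['id']}"] = json.dumps(rec)
            found[rec["id"]] = rec

    return [found[i] for i in page_ids if i in found]


first = get_page(1, 3)
second = get_page(1, 3)  # served entirely from the object cache
print([r["id"] for r in first], len(db_queries))
```

The second call for the same page does not touch the simulated database, because every Record on that page was written back to the object cache by the first call.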

Three key points of cache design:

  1. Initialization
  2. Update
  3. Removal

Consider the above cache design from these three points:

Initialization:

  • ids: the cache can be warmed up by loading all ids before the service starts, or on the first list query
  • Record: similarly, a batch of Records with the largest scores (the hot data at the front of the list) can be preloaded, or each Record can be loaded the first time it is missing from the cache

Update:

  • ids: when data is added or deleted, or when the field used as the score changes
  • Record: when data is updated

Removal:

  • ids: not removed
  • Record: when data is deleted
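A minimal sketch of how these rules might map onto data-change handlers is shown below. Dicts again stand in for the Redis zset and String keys; the handler names, the `record:<id>` key layout, and the `rank` sort field are assumptions, and refreshing the Record on update (rather than evicting it) is one possible choice.

```python
import json

zset = {}     # id -> score, stand-in for the Redis zset of all ids
objects = {}  # "record:<id>" -> serialized Record, stand-in for Redis Strings


def on_insert(record):
    # New row: the id index must learn about it; the Record is loaded lazily.
    zset[record["id"]] = record["rank"]


def on_update(record):
    # Keep the score in step with the sort field, then refresh the Record.
    zset[record["id"]] = record["rank"]
    objects[f"record:{record['id']}"] = json.dumps(record)


def on_delete(record_id):
    # Deleted row: drop the id from the index and remove the cached Record.
    zset.pop(record_id, None)
    objects.pop(f"record:{record_id}", None)


on_insert({"id": 1, "title": "a", "rank": 5})
on_insert({"id": 2, "title": "b", "rank": 9})
on_update({"id": 1, "title": "a2", "rank": 12})
on_delete(2)
print(zset, sorted(objects))
```

Each database change touches at most one zset member and one String key, which is what keeps the update logic simple compared with invalidating whole cached pages.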

This analysis shows that the design achieves a relatively high hit ratio, and that its cache update strategy is relatively simple and clear

Local versus centralized caching

Case study: a service runs on multiple nodes and uses Spring Cache for local caching. Data is fetched through a method annotated with @Cacheable, and a separate method annotated with @CacheEvict exists solely to clear the cache: its body does nothing else, and it is called whenever data is saved. At first glance, clearing the cache on every data update seems fine

So what's the problem?

The service has multiple nodes, but a request to save data is sent to only one of them. Only the node that receives the request calls the cache-clearing method; the other nodes do not clear their caches, so the caches on the different nodes become inconsistent
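The inconsistency can be reproduced with a small simulation. This is not Spring code: the `Node` class is a hypothetical stand-in for one service instance, with `get` playing the role of the @Cacheable method and `evict` the role of the @CacheEvict method.

```python
database = {"greeting": "v1"}  # shared store, stand-in for MySQL


class Node:
    """One service instance with its own local (in-process) cache."""

    def __init__(self):
        self.local_cache = {}

    def get(self, key):
        # Plays the role of a @Cacheable read.
        if key not in self.local_cache:
            self.local_cache[key] = database[key]
        return self.local_cache[key]

    def save(self, key, value):
        database[key] = value
        self.evict(key)

    def evict(self, key):
        # Plays the role of the @CacheEvict method: clears this node only.
        self.local_cache.pop(key, None)


node_a, node_b = Node(), Node()
node_a.get("greeting")
node_b.get("greeting")  # both nodes now cache "v1"

node_a.save("greeting", "v2")  # the save request reaches node A only

print(node_a.get("greeting"), node_b.get("greeting"))
```

Node A reloads the new value after its eviction, while node B keeps serving the value it cached before the save.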
