
Large websites receive a huge number of visits every day, and traffic spikes at certain times, such as shopping festivals on e-commerce platforms or the back-to-school season on education platforms. When concurrency at such a moment is too high, it can overwhelm the server and bring the website down. Handling high concurrency is therefore critical for a website, and caching plays an important role in it. Data that rarely changes, or that is very hot, can be stored in a cache; requests served from the cache never query the database, which greatly improves the website's throughput.

Use of caching

First, set up a simple test environment, create a SpringBoot application, and write a controller:

@RestController
public class TestController {

    @Autowired
    private UserService userService;

    @GetMapping("/test")
    public List<User> test() {
        return userService.getUsers();
    }
}

Visit http://localhost:8080/test to get all user information. We use JMeter to stress test the app. Go to the website Jmeter.apache.org/download_jm…, download the ZIP package to the local PC, decompress it, and double-click jmeter.bat in the bin directory to start JMeter. Here we simulate 2000 concurrent requests per second to see what the throughput of the application is. The measured throughput is 421. You can imagine that when the table holds a very large amount of data and every request has to query the database, efficiency drops sharply. Therefore, we can add a cache to optimize:

@RestController
public class TestController {

    // Cache (a plain HashMap is not thread-safe; it only simulates a cache here)
    Map<String, Object> cache = new HashMap<>();

    @Autowired
    private UserService userService;

    @GetMapping("/test")
    public List<User> test() {
        // Get the data from the cache
        List<User> users = (List<User>) cache.get("users");
        if (users == null) {
            // Cache miss: query the database
            users = userService.getUsers();
            // Cache the data obtained from the query
            cache.put("users", users);
        }
        // Cache hit (or freshly loaded): return the data
        return users;
    }
}

This simply uses a HashMap to simulate a cache. The interface now executes as follows: when a request arrives, the data is first read from the cache. On a hit, the data is returned directly. On a miss, the database is queried and the result is stored in the cache so that the next request can read it from there. Testing the throughput of the application again shows a significant increase.
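Since a plain HashMap is not safe under concurrent requests, here is a minimal sketch of the same read flow on top of ConcurrentHashMap. LocalCache, LocalCacheDemo and queryDatabase are made-up names for illustration, and the "database" is just a counter, so treat this as a sketch of the idea rather than the code used in the test above.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper, not part of the original project.
class LocalCache {
    private final Map<String, Object> store = new ConcurrentHashMap<>();

    // Return the cached value; on a miss, run the loader and cache its result.
    @SuppressWarnings("unchecked")
    <T> T get(String key, Supplier<T> loader) {
        return (T) store.computeIfAbsent(key, k -> loader.get());
    }
}

class LocalCacheDemo {
    // Counts how often the simulated "database" is queried.
    static final AtomicInteger dbQueries = new AtomicInteger();

    static List<String> queryDatabase() {
        dbQueries.incrementAndGet();
        return List.of("alice", "bob");
    }

    public static void main(String[] args) {
        LocalCache cache = new LocalCache();
        for (int i = 0; i < 3; i++) {
            List<String> users = cache.get("users", LocalCacheDemo::queryDatabase);
            System.out.println(users);
        }
        System.out.println("database queries: " + dbQueries.get());
    }
}
```

computeIfAbsent runs the loader at most once per missing key even when several threads ask for it at the same time, which is the behavior the HashMap version cannot guarantee.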

Local and distributed caches

We just used caching to improve the overall performance of the application, but the cache is defined inside the application itself. Such a cache is called a local cache. A local cache is a good solution for a stand-alone application, but in a distributed system multiple copies of an application are usually deployed to achieve high availability. Each copy then keeps its own cache. When data is modified, the cached data must be modified too, but because there are multiple copies of the cache, the caches of the other instances are not updated, which leads to inconsistent data. Therefore, we need to extract the cache into cache middleware that is independent of every application yet shared by all of them. The most popular cache middleware is Redis.
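The inconsistency between local caches is easy to reproduce in a few lines. The sketch below models two deployed copies of an application as two objects, each with its own map; LocalCacheDivergence, AppInstance and the "price" key are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

class LocalCacheDivergence {
    // Shared "database" that every application instance reads from.
    static final Map<String, String> database = new HashMap<>();

    // One deployed copy of the application, with its own local cache.
    static class AppInstance {
        private final Map<String, String> localCache = new HashMap<>();

        String read(String key) {
            return localCache.computeIfAbsent(key, database::get);
        }

        void update(String key, String value) {
            database.put(key, value);
            localCache.put(key, value); // only this instance's cache is refreshed
        }
    }

    public static void main(String[] args) {
        database.put("price", "100");
        AppInstance a = new AppInstance();
        AppInstance b = new AppInstance();
        a.read("price");
        b.read("price");
        a.update("price", "200");
        System.out.println("instance A sees " + a.read("price"));
        System.out.println("instance B sees " + b.read("price"));
    }
}
```

Instance B keeps serving the value it cached before the update, which is exactly the situation a shared cache such as Redis avoids.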

Integrating Redis with SpringBoot

To change the app to use the Redis cache, first download the Redis image:

docker pull redis

Create directory structure:

mkdir -p /mydata/redis/conf
touch /mydata/redis/conf/redis.conf

Go to the /mydata/redis/conf directory and modify the redis.conf file:

# Persistence configuration
appendonly yes

Create an instance of Redis and start it:

docker run -p 6379:6379 --name redis \
                  -v /mydata/redis/data:/data \
                  -v /mydata/redis/conf/redis.conf:/etc/redis/redis.conf \
                  -d redis redis-server /etc/redis/redis.conf

Configure the Redis container to start automatically whenever Docker starts:

docker update redis --restart=always

At this point, Redis is ready, and then we introduce Redis dependencies into the project:

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>

Configure Redis in application.yml:

spring:
  redis:
    host: 192.168.66.10

Modify the controller code:

@RestController
public class TestController {

    @Autowired
    private UserService userService;

    @Autowired
    private StringRedisTemplate redisTemplate;

    @GetMapping("/test")
    public String test() {
        // Get the data from Redis
        String usersJson = redisTemplate.opsForValue().get("users");
        if (StringUtils.isEmpty(usersJson)) {
            // Cache miss: query the database
            List<User> users = userService.getUsers();
            // Convert the result to JSON
            usersJson = JSON.toJSONString(users);
            // Put it in the cache
            redisTemplate.opsForValue().set("users", usersJson);
        }
        // Return the result
        return usersJson;
    }
}

Some problems with the cache

Using a Redis cache is not trivial; a number of issues still need to be addressed. Here are three common problems faced by cache middleware:

  1. Cache penetration
  2. Cache avalanche
  3. Cache breakdown

Cache penetration

Cache penetration refers to querying data that is certain not to exist. The cache misses, so the request goes to the database; the database finds nothing, so nothing is written to the cache, and every such request reaches the database. The cache is supposed to take pressure off the database, but anyone who knows which data cannot exist in the system can flood it with requests for that data and overwhelm the database. The solution is to store a result whether or not the data exists: when the requested data does not exist, the key is still cached with an empty placeholder so that the next request is answered from the cache. However, if every request uses a different key, Redis fills up with useless keys, so these keys should be given an expiration time and are deleted automatically when it elapses.
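The placeholder idea can be sketched without Redis. The class below is an in-memory stand-in: NullValueCache, NULL_MARKER and both TTL values are assumptions made for the example, and the current time is passed in explicitly so the expiry logic is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

class NullValueCache {
    // Placeholder stored for keys the database does not contain (hypothetical choice).
    static final String NULL_MARKER = "<null>";
    static final long NULL_TTL_MILLIS = 60_000;    // short TTL for placeholders
    static final long VALUE_TTL_MILLIS = 600_000;  // longer TTL for real data

    private static final class Entry {
        final String value;
        final long expiresAt;

        Entry(String value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>();
    private final Map<String, String> database;
    final AtomicInteger dbQueries = new AtomicInteger();

    NullValueCache(Map<String, String> database) {
        this.database = database;
    }

    // 'now' is the current time in milliseconds, supplied by the caller.
    String get(String key, long now) {
        Entry e = cache.get(key);
        if (e != null && e.expiresAt > now) {
            // Hit: a placeholder means "known to be absent"
            return NULL_MARKER.equals(e.value) ? null : e.value;
        }
        dbQueries.incrementAndGet();
        String value = database.get(key);
        if (value == null) {
            cache.put(key, new Entry(NULL_MARKER, now + NULL_TTL_MILLIS));
        } else {
            cache.put(key, new Entry(value, now + VALUE_TTL_MILLIS));
        }
        return value;
    }

    public static void main(String[] args) {
        NullValueCache c = new NullValueCache(new HashMap<>());
        c.get("user:-1", 0);
        c.get("user:-1", 1_000);
        System.out.println("queries before expiry: " + c.dbQueries.get());
        c.get("user:-1", 61_000);
        System.out.println("queries after expiry: " + c.dbQueries.get());
    }
}
```

With Redis, the same effect would come from writing the placeholder with an expiration time instead of tracking expiresAt by hand.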

Cache avalanche

A cache avalanche occurs when a large amount of cached data expires at the same moment while query volume is high, so the requests all fall through to the database, causing excessive pressure or even downtime. The solution is to add a random value to the original expiration time of each entry, so that expiration times differ and large amounts of data do not expire at the same time.
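Adding the random value takes only a line or two. In this sketch, TtlJitter and ttlWithJitter are hypothetical names, and the base of 3600 seconds with up to 300 seconds of jitter is chosen only for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;

class TtlJitter {
    // Base TTL plus a random extra in [0, maxJitterSeconds].
    static long ttlWithJitter(long baseSeconds, long maxJitterSeconds) {
        return baseSeconds + ThreadLocalRandom.current().nextLong(maxJitterSeconds + 1);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            System.out.println("ttl = " + ttlWithJitter(3600, 300) + "s");
        }
    }
}
```

The result can be passed as the timeout when writing to Redis, for example redisTemplate.opsForValue().set(key, value, ttl, TimeUnit.SECONDS).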

Cache breakdown

A cache breakdown occurs when a hot key expires at the very moment a large number of concurrent requests for that key arrive, so all of those requests hit the database. The solution is locking: when the hot key expires, the requests compete for a lock, only the request that acquires it queries the database, and the other requests wait until the data is back in the cache.

Fix the cache breakdown problem

Cache penetration and cache avalanche are both easy to solve, but cache breakdown requires locking, so let's explore how to solve cache breakdown with a lock.

@GetMapping("/test")
public String test() {
    String usersJson = redisTemplate.opsForValue().get("users");
    if (StringUtils.isEmpty(usersJson)) {
        synchronized (this) {
            // Check again whether the data is in the cache
            String json = redisTemplate.opsForValue().get("users");
            if (StringUtils.isEmpty(json)) {
                List<User> users = userService.getUsers();
                System.out.println("query database......");
                usersJson = JSON.toJSONString(users);
            } else {
                usersJson = json;
            }
            redisTemplate.opsForValue().set("users", usersJson);
        }
    }
    return usersJson;
}

The data is still fetched from the cache first. On a miss, the synchronized block is entered, and inside it the cache is checked again. This second check is needed because many requests may pass the outer if statement at the same time. One of them acquires the lock, queries the database, and puts the data into Redis; without the second check, each of the remaining requests would still query the database once it acquired the lock. After simulating 2000 concurrent requests per second with JMeter, the console output is as follows:

query database......

The console prints query database...... only once, indicating that the database was actually queried only once for the 2000 requests, although locking comes at the cost of a sharp drop in performance. This approach is fine for a single-server application: Spring beans are singletons by default, so locking on this works. In a distributed deployment, however, multiple copies of the application run at once, and this cannot lock out requests handled by the other instances. That calls for a distributed lock.
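Before moving on, the single-process double check described above can be reproduced without Redis or JMeter. In this sketch a volatile field stands in for the Redis entry and a counter stands in for the database; DoubleCheckDemo and run are made-up names, and 50 threads replace the 2000 requests to keep the example light.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class DoubleCheckDemo {
    private volatile String cached;                       // stands in for the Redis entry
    final AtomicInteger dbQueries = new AtomicInteger();  // counts simulated database hits

    String getUsers() {
        String value = cached;
        if (value == null) {
            synchronized (this) {
                value = cached;              // second check, inside the lock
                if (value == null) {
                    dbQueries.incrementAndGet();
                    value = "[users]";
                    cached = value;
                }
            }
        }
        return value;
    }

    // Fire 'threads' concurrent calls and report how many reached the "database".
    static int run(int threads) {
        DoubleCheckDemo demo = new DoubleCheckDemo();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    start.await();           // release all callers at once
                    demo.getUsers();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return demo.dbQueries.get();
    }

    public static void main(String[] args) {
        System.out.println("database queries: " + run(50));
    }
}
```

Removing the inner check lets every thread that passed the outer check hit the "database" in turn, which is the behavior the second check prevents.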

Distributed lock

As with the caching middleware, we can move the lock outside the services, so that it is independent of every service yet shared by all of them.

Every service goes to one common place to acquire the lock, so even in a distributed environment all services use the same lock. There are many choices for this common place, and Redis is one of them. Redis has a command that is well suited to implementing distributed locks: setnx. According to the official documentation, SETNX sets the value only if the key does not exist; otherwise it does nothing. So each service can simply execute setnx lock 1. Because this operation is atomic, even with millions of concurrent requests only one can set the key successfully; all the others fail because the key already exists. A successful set means the lock was acquired; a failed set means it was not. The code is as follows:

@RestController
public class TestController {

    @Autowired
    private UserService userService;

    @Autowired
    private StringRedisTemplate redisTemplate;

    @GetMapping("/test")
    public String test() throws InterruptedException {
        String usersJson = redisTemplate.opsForValue().get("users");
        if (StringUtils.isEmpty(usersJson)) {
            usersJson = getUsersJson();
        }
        return usersJson;
    }

    public String getUsersJson() throws InterruptedException {
        String usersJson = "";
        // Try to acquire the distributed lock
        Boolean lock = redisTemplate.opsForValue().setIfAbsent("lock", "1");
        if (Boolean.TRUE.equals(lock)) {
            // Lock acquired: check again whether the data is in the cache
            String json = redisTemplate.opsForValue().get("users");
            if (StringUtils.isEmpty(json)) {
                List<User> users = userService.getUsers();
                System.out.println("query database......");
                usersJson = JSON.toJSONString(users);
            } else {
                usersJson = json;
            }
            redisTemplate.opsForValue().set("users", usersJson);
            // Release the lock
            redisTemplate.delete("lock");
        } else {
            // Failed to acquire the lock: wait, then retry
            Thread.sleep(200);
            // Call itself again and keep the result
            usersJson = getUsersJson();
        }
        return usersJson;
    }
}
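To see why an atomic set-if-absent is enough to elect a single lock holder, here is a minimal sketch that swaps Redis for an in-process ConcurrentHashMap. SetnxDemo, fakeRedis and race are names invented for the example; putIfAbsent plays the role of SETNX because it is atomic in the same sense.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class SetnxDemo {
    // In-process stand-in for Redis.
    static final ConcurrentHashMap<String, String> fakeRedis = new ConcurrentHashMap<>();

    // True only for the caller that actually created the key.
    static boolean setIfAbsent(String key, String value) {
        return fakeRedis.putIfAbsent(key, value) == null;
    }

    // Let 'threads' callers race for the same key and count the winners.
    static int race(int threads) {
        fakeRedis.clear();
        AtomicInteger winners = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (setIfAbsent("lock", "1")) {
                    winners.incrementAndGet();
                }
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return winners.get();
    }

    public static void main(String[] args) {
        System.out.println("lock winners: " + race(100));
    }
}
```

However many threads race, only one of them sees the key as absent, so only one proceeds to the database.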

Of course, there is still a big problem here. If an exception occurs before the lock is released and the code terminates, the lock is never released and a deadlock results. The solution is to set an expiration time on the lock at the moment it is acquired, so that even if the program fails to release it, Redis deletes the lock automatically when it expires.

Even with an expiration time, a new problem appears. When the business logic runs longer than the lock's expiration time, Redis deletes the lock before the work is finished, so another request can acquire the lock and run the business method. When the first request finishes, it then deletes a lock that now belongs to the other request. The solution is to make the lock value unique to each request, so that a request cannot remove another request's lock:

public String getUsersJson() throws InterruptedException {
    String usersJson = "";
    // Unique value that identifies this request's lock
    String uuid = UUID.randomUUID().toString();
    // Acquire the lock and set its expiration time in one step
    Boolean lock = redisTemplate.opsForValue().setIfAbsent("lock", uuid, 300, TimeUnit.SECONDS);
    if (Boolean.TRUE.equals(lock)) {
        // Lock acquired: check again whether the data is in the cache
        String json = redisTemplate.opsForValue().get("users");
        if (StringUtils.isEmpty(json)) {
            List<User> users = userService.getUsers();
            System.out.println("query database......");
            usersJson = JSON.toJSONString(users);
        } else {
            usersJson = json;
        }
        redisTemplate.opsForValue().set("users", usersJson);
        // Check whether the current lock is our own
        String lockVal = redisTemplate.opsForValue().get("lock");
        if (uuid.equals(lockVal)) {
            // Release the lock only if it is our own
            redisTemplate.delete("lock");
        }
    } else {
        // Failed to acquire the lock: wait, then retry and keep the result
        Thread.sleep(200);
        usersJson = getUsersJson();
    }
    return usersJson;
}

If you think about it, this still has a problem. Releasing the lock takes two round trips: the Java program sends a command to Redis, Redis executes it and returns the result, and network transmission takes time. Suppose the Java program asks Redis for the value of lock and Redis returns it, but the lock expires while the reply is in transit. Another request can now acquire the lock. The Java program then receives the value, finds that it matches its own UUID, and deletes the key, but by this point the lock in Redis belongs to the other request. So one request can still delete another request's lock. The Redis website provides a solution for this: perform the comparison and the deletion atomically by executing a Lua script like this:

public String getUsersJson() throws InterruptedException {
    String usersJson = "";
    String uuid = UUID.randomUUID().toString();
    Boolean lock = redisTemplate.opsForValue().setIfAbsent("lock", uuid, 300, TimeUnit.SECONDS);
    if (Boolean.TRUE.equals(lock)) {
        // Lock acquired: check again whether the data is in the cache
        String json = redisTemplate.opsForValue().get("users");
        if (StringUtils.isEmpty(json)) {
            List<User> users = userService.getUsers();
            System.out.println("query database......");
            usersJson = JSON.toJSONString(users);
        } else {
            usersJson = json;
        }
        redisTemplate.opsForValue().set("users", usersJson);
        // Compare the lock value and delete the key in one atomic step
        String luaScript = "if redis.call(\"get\",KEYS[1]) == ARGV[1]\n" +
                "then\n" +
                "    return redis.call(\"del\",KEYS[1])\n" +
                "else\n" +
                "    return 0\n" +
                "end";
        // Execute the script
        DefaultRedisScript<Long> redisScript = new DefaultRedisScript<>(luaScript, Long.class);
        List<String> keyList = Arrays.asList("lock");
        redisTemplate.execute(redisScript, keyList, uuid);
    } else {
        // Failed to acquire the lock: wait, then retry and keep the result
        Thread.sleep(200);
        usersJson = getUsersJson();
    }
    return usersJson;
}
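What the Lua script buys us is a compare-and-delete that cannot be interrupted. The same idea can be mimicked in a single process with ConcurrentHashMap.remove(key, value), which removes the entry only if it still maps to the given value. TokenLockDemo, fakeRedis and the token strings below are invented for the sketch, and lock expiry is simulated by removing the key by hand.

```java
import java.util.concurrent.ConcurrentHashMap;

class TokenLockDemo {
    // In-process stand-in for Redis.
    static final ConcurrentHashMap<String, String> fakeRedis = new ConcurrentHashMap<>();

    static boolean tryLock(String key, String token) {
        return fakeRedis.putIfAbsent(key, token) == null;
    }

    // Atomic "delete only if the value is still mine", like the Lua script.
    static boolean unlock(String key, String token) {
        return fakeRedis.remove(key, token);
    }

    public static void main(String[] args) {
        String tokenA = "token-a";
        String tokenB = "token-b";
        tryLock("lock", tokenA);
        fakeRedis.remove("lock");   // simulate A's lock expiring
        tryLock("lock", tokenB);    // B now owns the lock
        System.out.println("A unlock succeeded: " + unlock("lock", tokenA));
        System.out.println("B still holds lock: " + tokenB.equals(fakeRedis.get("lock")));
    }
}
```

Because the comparison and the removal happen as one operation, A's late unlock cannot delete the lock that B acquired in the meantime.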

Redisson

Redisson is a Java in-memory data grid built on Redis that makes it easy to implement distributed locks. First add the Redisson dependency:

<dependency>
  <groupId>org.redisson</groupId>
  <artifactId>redisson</artifactId>
  <version>3.16.0</version>
</dependency>

Write configuration classes:

@Configuration
public class MyRedissonConfig {

    @Bean
    public RedissonClient redissonClient() {
        Config config = new Config();
        config.useSingleServer().setAddress("redis://192.168.66.10:6379");
        return Redisson.create(config);
    }
}

Write a controller to experience Redisson:

@RestController
public class TestController {

    @Autowired
    private RedissonClient redissonClient;

    @GetMapping("/test")
    public String test() {
        RLock lock = redissonClient.getLock("my_lock");
        // Acquire the lock
        lock.lock();
        try {
            // Simulate business processing
            Thread.sleep(1000 * 10);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Release the lock
            lock.unlock();
        }
        return "test";
    }
}

Redisson sets an expiration time on the lock automatically and provides a lock watchdog: as long as the thread holding the lock has not finished its work and the Redisson instance has not been shut down, the watchdog keeps extending the lock's expiration time (the default watchdog timeout is 30 seconds). You can also specify a lock expiration time:

lock.lock(15, TimeUnit.SECONDS);

The expiration time is set when the lock is acquired.

If the lock expiration time is set to 15 seconds and the business logic takes longer than that, will Redis delete the lock and let other requests acquire it? Yes, this can happen, because the watchdog does not renew a lock that was acquired with an explicit expiration time. Avoid setting the expiration time too low, and make sure it is longer than the business execution time.

Read/write locking can also be easily implemented using Redisson, for example:

@RestController
public class TestController {

    @Autowired
    private StringRedisTemplate redisTemplate;

    @Autowired
    private RedissonClient redissonClient;

    @GetMapping("/write")
    public String write() {
        RReadWriteLock wrLock = redissonClient.getReadWriteLock("wr_lock");
        RLock wLock = wrLock.writeLock();
        // Acquire the write lock
        wLock.lock();
        String uuid = "";
        try {
            uuid = UUID.randomUUID().toString();
            Thread.sleep(20 * 1000);
            // Store the value in Redis
            redisTemplate.opsForValue().set("uuid", uuid);
        } catch (InterruptedException e) {
            e.printStackTrace();
        } finally {
            // Release the lock
            wLock.unlock();
        }
        return uuid;
    }

    @GetMapping("/read")
    public String read() {
        RReadWriteLock wrLock = redissonClient.getReadWriteLock("wr_lock");
        RLock rLock = wrLock.readLock();
        // Acquire the read lock
        rLock.lock();
        String uuid = "";
        try {
            // Read the uuid
            uuid = redisTemplate.opsForValue().get("uuid");
        } finally {
            // Release the lock
            rLock.unlock();
        }
        return uuid;
    }
}

As long as both operations use the same lock, a read must wait for a write in progress. The write lock is a mutex, so while one thread is writing, all other threads must wait. The read lock is a shared lock that all threads can hold at the same time. Together they ensure that the latest data is read each time.
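The same shared/exclusive rules can be observed in a single JVM with the JDK's ReentrantReadWriteLock, used here as a local analogue of Redisson's RReadWriteLock rather than a replacement for it. ReadWriteLockDemo and otherThreadCanLock are hypothetical names.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReadWriteLockDemo {
    // Try to take the lock from a different thread and report whether it succeeded.
    static boolean otherThreadCanLock(Lock lock) {
        boolean[] result = new boolean[1];
        Thread t = new Thread(() -> {
            if (lock.tryLock()) {
                result[0] = true;
                lock.unlock();
            }
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    public static void main(String[] args) {
        ReentrantReadWriteLock rw = new ReentrantReadWriteLock();

        rw.readLock().lock();   // main thread is reading
        System.out.println("second reader admitted: " + otherThreadCanLock(rw.readLock()));
        System.out.println("writer admitted while reading: " + otherThreadCanLock(rw.writeLock()));
        rw.readLock().unlock();

        rw.writeLock().lock();  // main thread is writing
        System.out.println("reader admitted while writing: " + otherThreadCanLock(rw.readLock()));
        rw.writeLock().unlock();
    }
}
```

Readers can share the lock with each other, while a writer excludes both readers and other writers.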

Cache consistency

Using a cache increases the throughput of the system, but it also introduces a problem. Once data is in the cache, it is served directly from the cache; if the data in the database is then modified, users still read the old data from the cache, and the two become inconsistent. There are generally two solutions:

  1. Dual-write mode: Make changes to the cache as well as the database
  2. Failure mode: The cache is deleted directly after the database is modified

Dual-write mode can produce dirty data. Suppose administrators A and B both modify the price of a commodity; A submits first and B submits afterwards. B's cache write should be the one that takes effect last. But for some reason, such as network fluctuation, B's cache write lands first and A's lands after it, so the cache ends up holding A's value (2000 in this example) while the database holds B's. This is dirty data, though only temporarily: the database is correct, so once the cached entry expires, the next query reloads it and the cache becomes correct again. The question becomes how to guarantee consistency in dual-write mode. The solution is to put the database update and the cache update under one lock, so that the two together behave as an atomic operation.
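The interleaving is easier to see when it is written out step by step. In the sketch below the four writes are replayed in the problematic order on two plain maps; DualWriteRace is a made-up class, 2000 is the value from the example, and 1000 for administrator B's price is an assumption. The synchronized method stands in for a distributed lock and only works within one JVM.

```java
import java.util.HashMap;
import java.util.Map;

class DualWriteRace {
    final Map<String, Integer> database = new HashMap<>();
    final Map<String, Integer> cache = new HashMap<>();

    // Replays the unlucky ordering: A's cache write lands after B's.
    void interleavedUpdate(int priceA, int priceB) {
        database.put("price", priceA);  // A writes the database
        database.put("price", priceB);  // B writes the database
        cache.put("price", priceB);     // B writes the cache
        cache.put("price", priceA);     // A's delayed cache write overwrites B's
    }

    // With one lock around both writes, each update completes as a unit.
    synchronized void lockedUpdate(int price) {
        database.put("price", price);
        cache.put("price", price);
    }

    public static void main(String[] args) {
        DualWriteRace race = new DualWriteRace();
        race.interleavedUpdate(2000, 1000);
        System.out.println("unlocked: db=" + race.database.get("price")
                + " cache=" + race.cache.get("price"));

        race.lockedUpdate(2000);  // A
        race.lockedUpdate(1000);  // B
        System.out.println("locked:   db=" + race.database.get("price")
                + " cache=" + race.cache.get("price"));
    }
}
```

Under the lock, whichever update runs last wins in both the database and the cache, so the two can no longer disagree.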

Failure mode can also lead to dirty data. For data that is modified frequently, query the database directly instead of using the cache.
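For reference, failure mode itself is only a few lines: write the database, then delete the cached entry so the next read reloads it. The sketch below uses two in-memory maps in place of the database and Redis; InvalidateOnWrite and the "price" key are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

class InvalidateOnWrite {
    final Map<String, String> database = new HashMap<>();
    final Map<String, String> cache = new HashMap<>();

    // Read through the cache, loading from the database on a miss.
    String read(String key) {
        String value = cache.get(key);
        if (value == null) {
            value = database.get(key);
            if (value != null) {
                cache.put(key, value);
            }
        }
        return value;
    }

    // Update the database, then delete the cached entry instead of rewriting it.
    void write(String key, String value) {
        database.put(key, value);
        cache.remove(key);
    }

    public static void main(String[] args) {
        InvalidateOnWrite store = new InvalidateOnWrite();
        store.write("price", "100");
        System.out.println("first read:  " + store.read("price"));
        store.write("price", "200");
        System.out.println("cached after write: " + store.cache.containsKey("price"));
        System.out.println("second read: " + store.read("price"));
    }
}
```

This single-threaded sketch does not show the dirty-data case, which needs a concurrent read to repopulate the cache with an old value between the two steps of write.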

In summary, the general solution is to set an expiration time on all cached data, so that expiry triggers a database query that refreshes the cache, and to use Redisson read/write locks when reading and writing data so that write operations are atomic.