Scene: The Redis interview

Interviewer: I see on your resume that you are proficient in using Redis, so what is Redis for?

Xiaoming: (secretly pleased, isn’t Redis a cache?) Redis is primarily used as a cache to efficiently store nonpersistent data in memory.

Interviewer: Can Redis be used as persistent storage?

Xiaoming: Hmm… That should be OK…

Interviewer: How does Redis persist data?

Xiaoming: Hmm… I’m not sure.

Interviewer: What memory eviction mechanisms does Redis have?

Xiaoming: Hmm… I don’t know.

Interviewer: What else can we do with Redis? Which Redis commands are used?

Xiaoming: All I know is that Redis can also do distributed locking, message queuing…

Interviewer: Ok, let’s move on to the next topic…

Thinking: Clearly, Xiaoming’s answers about Redis did not go well. Redis is something we use every day at work, so why does it become a weak spot as soon as we get to an interview?

As developers, we are used to relying on what experts have already packaged for us so that we can focus on business logic, but we often don’t know how these common tools are implemented underneath. They work well day to day, yet our knowledge of them still fails to impress an interviewer.

This article summarizes some key points about Redis, covering both principles and applications. I hope it helps.

1. What is Redis

REmote DIctionary Server (Redis) is a key-value storage system written by Salvatore Sanfilippo.

Redis is an open-source, BSD-licensed key-value database written in ANSI C. It supports networking, runs in memory with optional persistence, is log-based, and provides APIs for multiple languages.

Here I quote the description from the Redis tutorial, which is formal but standard. In short, Redis is an in-memory key-value database that supports logging and persistence. I think that’s an apt and comprehensive description.

1.1 Industry status of Redis

Redis is one of the most widely used storage middleware products in the Internet industry. It is widely praised for its very high performance, thorough documentation, versatile applications, and rich client support, and its read and write speed in particular has made it a favorite in the field. A great many software companies use Redis, including large Internet companies such as JD.com, Alibaba, Tencent, and GitHub. As a result, Redis has become an essential skill for back-end developers.

1.2 Knowledge Graph

In my opinion, learning any technology requires a clear context and structure; otherwise you will not know what you have learned and how much is left. A book without a table of contents has lost its soul.

Therefore, I have tried to summarize a knowledge map of Redis, also known as a mind map, as shown in the figure below. It may not be complete and will be updated and supplemented over time.

This article introduces the basics of Redis first; later articles will cover the data structures, applications, persistence, and other aspects of Redis in detail.

2. Advantages of Redis

2.1 Fast

As a caching tool, Redis is best known for being fast. How fast is it? The QPS (queries per second) of a single Redis instance can reach roughly 110,000 for reads and 81,000 for writes. So why is Redis so quick?

  • Most requests are pure in-memory operations, which are very fast;
  • Data is stored in purpose-built data structures that make lookups particularly fast. For example, lookups and inserts in a hash table take O(1) time on average;
  • Commands are executed on a single thread, which avoids unnecessary context switches and race conditions. There is no CPU cost from switching between processes or threads, no locks to acquire or release, and no performance loss from possible deadlocks;
  • Non-blocking I/O multiplexing is used.

2.2 Rich Data Types

Redis has five common data types: String, List, Hash, Set, and Zset (sorted set), each of which has its own uses.

2.3 Atomicity, support for transactions

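Every individual Redis command is atomic. Redis also supports transactions: commands queued between MULTI and EXEC are executed as one uninterrupted unit, with no commands from other clients running in between. Note that Redis does not roll back commands that have already run if a later command in the transaction fails.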

2.4 Rich Features

Redis has rich features: it can serve as a distributed lock, persist data, and act as a message queue, leaderboard, or counter. It also supports publish/subscribe, notifications, key expiration, and more. Redis comes in handy whenever middleware is needed to solve a real problem.

3. Comparison between Redis and Memcache

Both Memcache and Redis are excellent, high-performance in-memory data stores, so any discussion of Redis naturally invites a comparison with Memcache. (Why compare? Because only a comparison shows how good Redis really is.)

3.1 Storage Mode

  • Memcache keeps all data in memory, so the data is lost after a power failure. Data cannot be persisted, and the total data size cannot exceed the available memory.

  • Redis can write data to disk, so the data can be persisted.

3.2 Supported Data types

  • Memcache’s data type support is relatively simple: only string values are supported.

  • Redis has a variety of data types, including String, List, Hash, Set, and Zset.

3.3 The underlying model used

  • The two differ in their underlying implementation and in the application protocol used to communicate with clients.

  • Early versions of Redis built their own virtual memory (VM) mechanism, because calling general-purpose system functions wastes a certain amount of time on moving and requesting data. This VM feature was later deprecated and removed.

3.4 Size of the stored value

  • A single Redis string value can be up to 512 MB, while a Memcache value is limited to 1 MB by default.

Seeing this, you might think Redis is perfect, with nothing but advantages. In fact, Redis still has a number of shortcomings. How can we overcome them?

4. Existing problems and solutions of Redis

4.1 Dual-write Consistency of the Cache and Database

Problem: Consistency issues are very common in distributed systems. There are two kinds of consistency: strong consistency and eventual consistency. When strong consistency is required, a Redis cache cannot fully deliver it, because the database and the cache are written separately (a double write) and can disagree for a while. A cache in front of a database can only guarantee eventual consistency.

Solution: How do we ensure eventual consistency?

  • The first method is to set an expiration time on cached entries. After an entry expires, the next read falls through to the database and reloads the cache, which brings the two back in line.

  • If no expiration time is set, we must choose the right update policy: update the database first, then delete the cache entry. Deleting the cache entry may itself fail, so we put the cache key on a message queue and retry until the deletion succeeds.
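The "update the database, delete the cache, retry on failure" policy can be sketched roughly as follows. This is a minimal in-memory illustration, not production code: the dictionaries stand in for the database and for Redis, a deque stands in for the message queue, and the names FlakyCache, update_then_invalidate, and drain_retry_queue are hypothetical.

```python
from collections import deque


class FlakyCache(dict):
    """Hypothetical in-memory cache whose delete() fails a fixed number of times."""

    def __init__(self, failures=0):
        super().__init__()
        self.failures = failures

    def delete(self, key):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("cache temporarily unavailable")
        self.pop(key, None)


def update_then_invalidate(db, cache, retry_queue, key, value):
    """Write the database first, then delete the cached copy."""
    db[key] = value
    try:
        cache.delete(key)
    except ConnectionError:
        # Deletion failed: hand the key to the retry queue instead of giving up.
        retry_queue.append(key)


def drain_retry_queue(cache, retry_queue, max_attempts=5):
    """Retry queued deletions; return True once the queue is empty."""
    for _ in range(max_attempts):
        if not retry_queue:
            return True
        key = retry_queue.popleft()
        try:
            cache.delete(key)
        except ConnectionError:
            retry_queue.append(key)  # still failing, try again later
    return not retry_queue
```

In a real system the retry loop would run in a separate consumer with a delay between attempts, so a stale entry can only survive until the deletion finally succeeds.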

4.2 Cache Avalanche

Problem: We’ve all seen avalanches in movies. Everything is quiet at first, then the snow suddenly collapses, and the result can be devastating. The same is true here. If we give many cache entries the same expiration time, they all expire at the same moment, and every request then goes back to the database to reload the data. The surge in connections and load can bring the database down.

Solution:

  • Add a random value to the cache expiration time.
  • Use two caches: cache 1 has an expiration time and cache 2 does not. When cache 1 expires, return the value from cache 2 and start a background task to refresh both cache 1 and cache 2.
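The first option, adding a random value to the expiration time, might look like the sketch below. The function name ttl_with_jitter and the example numbers are assumptions for illustration; the returned value would be passed as the expiration when the entry is written to the cache.

```python
import random


def ttl_with_jitter(base_ttl, max_jitter, rng=random):
    """Return base_ttl plus a random offset of 0..max_jitter seconds.

    Spreading expirations over a window keeps entries written at the same
    moment from all expiring at the same moment.
    """
    return base_ttl + rng.randint(0, max_jitter)


# Example: one hour, plus up to five extra minutes per entry.
ttl = ttl_with_jitter(3600, 300)
```

The size of the jitter window is a trade-off: a wider window flattens the reload spike further but makes the effective freshness of the cache less predictable.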

4.3 Cache Penetration

Problem: Cache penetration is when an abnormal user (such as an attacker) deliberately requests data that does not exist in the cache (typically because it does not exist in the database either), so every request falls through to the database and can exhaust its connections.

Solution:

  • Use a mutex. On a cache miss, a request may not go straight to the database; it must first acquire a lock. A request that fails to acquire the lock sleeps for a while and then retries.

  • Use an asynchronous update policy. A value is returned immediately whether or not the key was found. If the cached entry has expired, a thread is started asynchronously to read the database and refresh the cache. This requires a cache warm-up (loading the cache before the service starts).

  • Provide an interception mechanism that quickly decides whether a request is valid. For example, a Bloom filter can maintain the set of valid keys and quickly check whether the key carried by a request is valid. If it is not, the request is rejected directly.
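To make the Bloom filter option concrete, here is a small pure-Python sketch. It is an illustration under stated assumptions rather than a tuned implementation: the class BloomFilter, the helper lookup, and the sizes chosen are all hypothetical, and plain dictionaries stand in for the cache and the database.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, a small chance of false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def lookup(key, bloom, cache, db):
    """Reject keys the filter has never seen before touching cache or database."""
    if not bloom.might_contain(key):
        return None  # definitely not a valid key
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value
    return value
```

Because a Bloom filter can report false positives, a rare invalid key may still reach the database, but a key that was added is never rejected.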

4.4 Cache concurrent contention

Problem:

Concurrent cache contention occurs when multiple threads set the same key at the same time, which can leave the data inconsistent.

For example, suppose Redis holds a key named amount with the value 100. Two threads each add 100 to the value and write it back at the same time. The correct result is 300, but if both threads read 100 before either writes, the final result is 200: one update is lost.

Solution:

  • If the operations do not need to run in a particular order, we can use a distributed lock: the threads compete for the lock, and whichever acquires it first runs first. The lock can be implemented with ZooKeeper or with Redis itself.
  • For simple counters, use Redis’s atomic INCR command (or INCRBY to add an arbitrary amount).
  • If the operations must run in order, we can put them on a message queue and execute them strictly in queue order.
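The amount example can be replayed in plain Python to show why an atomic increment fixes the lost update. This is only a local simulation: a dictionary and a lock-protected counter stand in for Redis, and the names lost_update, AtomicCounter, and concurrent_increments are hypothetical.

```python
import threading


def lost_update(store, key, delta):
    """Replay the bad interleaving: both writers read before either writes."""
    read_a = store[key]            # writer A reads 100
    read_b = store[key]            # writer B also reads 100
    store[key] = read_a + delta    # A writes 200
    store[key] = read_b + delta    # B overwrites with 200, A's update is lost
    return store[key]


class AtomicCounter:
    """Read-modify-write as one indivisible step, as an atomic increment provides."""

    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()

    def incr_by(self, delta):
        with self._lock:
            self.value += delta
            return self.value


def concurrent_increments(counter, delta, workers=2):
    """Run several increments on separate threads and return the final value."""
    threads = [threading.Thread(target=counter.incr_by, args=(delta,))
               for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With separate read and write steps the result is 200; when each increment is a single atomic step, two increments of 100 on a starting value of 100 always yield 300.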

5. Expiration policy of Redis

As the amount of data in Redis grows, memory usage keeps rising. We might expect keys to be deleted as soon as they reach their expiration time, yet memory usage is still very high after that time has passed. Why?

Redis expires keys using a combination of periodic deletion and lazy deletion.

5.1 Periodic Deletion

There is a difference between timed deletion and periodic deletion:

  • Timed deletion means each key is deleted exactly when its expiration time arrives, which requires a timer that keeps checking every key. This consumes a great deal of CPU time that should go to serving requests, so Redis uses periodic deletion instead.

  • With periodic deletion, Redis checks for expired keys at a fixed interval (every 100 ms by default). It still cannot check every key each time, or Redis would stall, so it only checks a random sample of keys that have an expiration time. As a result, some expired keys are not deleted on time. This is where lazy deletion comes in.

5.2 Lazy Deletion

Take a simple example: in middle school I sometimes had more homework than I could finish. The teacher would announce that a certain paper would be discussed in the next class and ask whether everyone had finished it. In fact, many of us had not, so we rushed to do it just before the class started. Lazy deletion works the same way: the work is done only when someone asks for the result.

The key should already be gone, but it is still there. When a client tries to read the key, Redis notices that it has expired, deletes it on the spot, and returns ‘no value, expired!’.
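The two strategies can be sketched together in a toy key-value store. This is a simplified model of the idea, not Redis’s actual implementation: the class ExpiringStore, its method names, and the sample size of 20 are assumptions for illustration, and the clock is injectable so the behavior can be checked without waiting.

```python
import random
import time


class ExpiringStore:
    """Toy store combining lazy deletion (on read) and sampled periodic deletion."""

    def __init__(self, clock=time.monotonic, rng=random):
        self._data = {}      # key -> value
        self._expires = {}   # key -> absolute expiry time
        self._clock = clock
        self._rng = rng

    def set(self, key, value, ttl=None):
        self._data[key] = value
        if ttl is None:
            self._expires.pop(key, None)
        else:
            self._expires[key] = self._clock() + ttl

    def get(self, key):
        """Lazy deletion: an expired key is removed only when it is accessed."""
        expiry = self._expires.get(key)
        if expiry is not None and expiry <= self._clock():
            self._delete(key)
            return None
        return self._data.get(key)

    def sweep(self, sample_size=20):
        """Periodic deletion: check a random sample of keys that have a TTL."""
        candidates = list(self._expires)
        sample = self._rng.sample(candidates, min(sample_size, len(candidates)))
        now = self._clock()
        removed = 0
        for key in sample:
            if self._expires[key] <= now:
                self._delete(key)
                removed += 1
        return removed

    def _delete(self, key):
        self._data.pop(key, None)
        self._expires.pop(key, None)

    def __len__(self):
        return len(self._data)
```

A timer would call sweep() at a fixed interval; any expired key the sample misses stays in memory until a later sweep picks it or a get() touches it.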

Now that we have periodic deletion plus lazy deletion, can we rest easy? Not quite. If an expired key is missed by the periodic sampling and is never accessed again, it stays in memory indefinitely. That is not acceptable, so we also need a memory eviction mechanism.

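5.3 Redis memory eviction mechanism

Redis classically offers six memory eviction policies (noeviction, allkeys-lru, volatile-lru, allkeys-random, volatile-random, and volatile-ttl), as shown in the figure below:

So how do we configure the Redis memory eviction policy?

We can set it in redis.conf by uncommenting the maxmemory-policy line and choosing a policy:

maxmemory-policy allkeys-lru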
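To give a feel for what a policy such as allkeys-lru does, here is a minimal least-recently-used cache in Python. It is only a sketch of the idea: this version tracks exact recency and limits the number of items, whereas Redis limits memory and uses an approximated LRU based on sampling. The class name LRUCache and the max_items parameter are illustrative.

```python
from collections import OrderedDict


class LRUCache:
    """Evict the least recently used key once the item limit is exceeded."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._items = OrderedDict()  # oldest access first, newest last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.max_items:
            self._items.popitem(last=False)  # drop the least recently used key

    def keys(self):
        return list(self._items)
```

For example, with a limit of two items, writing a and b, reading a, and then writing c evicts b, because b is the key that was touched least recently.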

6. Summary

This article gave a rough outline of the Redis knowledge map, which shows how much there is to learn about Redis. We then looked at the strengths and weaknesses of Redis: its fast in-memory reads and writes and its rich data types, and how to deal with data consistency, cache penetration, cache avalanche, and similar problems. Finally, we covered the expiration policy and memory eviction mechanism of Redis.

The next article will analyze the data structures of Redis: how each data type is implemented and which commands correspond to it.

By Yang Heng

Source: CreditEase Institute of Technology