Redis

Comprehensive experience

21 Tips You Must Know to Use Redis

Redis underlying data structures

Introduction to Redis data structures

Redis I/O multiplexing, single thread, multithreading

Single threading and multiplexing

In Redis, only the network request module and the data manipulation module are single threaded. Others, such as the persistence module and the cluster support module, are multithreaded.

Redis does not use a multi-threaded model in the network request and data manipulation modules, mainly for the following four reasons:

  • 1. Redis operations are memory-based, and the performance bottleneck of most operations is not the CPU.
  • 2. A single-threaded model is easier to maintain; development, debugging, and maintenance costs are lower.
  • 3. A single-threaded model avoids the performance overhead of context switching between threads.
  • 4. Multiplexed I/O on a single thread can still achieve high I/O utilization for Redis.

Redis 6.0 introduces multithreading only for handling network requests; data read and write commands are still processed by a single thread.

Redis transactions

MULTI, EXEC, DISCARD, WATCH, UNWATCH

Redis does not roll back when a command in a transaction fails; it continues to execute the remaining commands. The WATCH command monitors any number of keys before a transaction begins; when EXEC is invoked, monitoring of all keys is removed, regardless of whether the transaction executes successfully. Redis transactions guarantee consistency (C) and isolation (I), but not atomicity (A) and durability (D).
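A minimal optimistic-locking sketch with Jedis (the client, connection details, and the balance key are illustrative assumptions, not from the original text):

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.Transaction;

    public class WatchExample {
        public static void main(String[] args) {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                jedis.watch("balance");                               // monitor the key before MULTI
                int balance = Integer.parseInt(jedis.get("balance")); // assumes the key exists
                Transaction tx = jedis.multi();                       // start queueing commands
                tx.set("balance", String.valueOf(balance - 10));
                // exec() returns null if a watched key was modified, aborting the transaction
                if (tx.exec() == null) {
                    System.out.println("balance changed concurrently, transaction aborted");
                }
            }
        }
    }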

Expired cache Key deletion policy

Lazy deletion (a key is checked and removed when it is accessed) plus periodic deletion (the expires dictionary is scanned at regular intervals); a third strategy, timed deletion (start a timer as soon as a key is given an expiration time), is used infrequently.

Memory flushing strategy

In general, volatile-lru or allkeys-lru is reasonable: evict the least recently used keys. If Redis is used purely as a cache, use allkeys-lru; if it also holds data that must not be evicted, set expirations on the cache keys and use volatile-lru (a config sketch follows the list).

  • volatile-lru: evicts the least recently used key from the set of keys with an expiration set
  • volatile-ttl: evicts the key with the shortest remaining time to live from the set of keys with an expiration set
  • volatile-random: evicts a random key from the set of keys with an expiration set
  • allkeys-lru: evicts the least recently used key from the whole keyspace (recommended)
  • allkeys-random: evicts a random key from the whole keyspace
  • no-eviction: evicts nothing; writes that would exceed the memory limit return an error
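A minimal sketch of setting the eviction policy at runtime through CONFIG SET with Jedis (the memory limit and connection details are illustrative assumptions); the same values can be set in redis.conf:

    import redis.clients.jedis.Jedis;

    public class EvictionConfig {
        public static void main(String[] args) {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                jedis.configSet("maxmemory", "100mb");              // cap memory usage
                jedis.configSet("maxmemory-policy", "allkeys-lru"); // evict LRU keys from the whole keyspace
                System.out.println(jedis.configGet("maxmemory-policy"));
            }
        }
    }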
Cache avalanche, breakdown, penetration

Redis cache avalanche, cache penetration, cache breakdown

Cache avalanche

A large number of cache entries fail at the same time, for example when a cache machine goes down.

  • Beforehand: Redis high availability, master-slave + Sentinel or Redis Cluster, to avoid a total collapse. High-availability cluster.
  • During: local ehcache caching + hystrix rate limiting and degradation, to prevent MySQL from being overwhelmed. Level-2 cache, degraded traffic limiting.
  • Afterward: Redis persistence; once restarted, data is automatically loaded from disk and the cached data is quickly recovered. AOF, RDB recovery.
Cache penetration

Frequent requests query data that does not exist in the system at all, so every request falls through to the database.

  • Cache-NULL policy: if a query returns NULL, cache the null result anyway, with an expiration time of no more than 5 minutes.
  • Bloom filter: map all data that can possibly exist to a large enough bitmap, so requests for non-existent keys are rejected before they reach the database.
  • Asynchronous update policy: return whatever the cache holds for the key, whether or not a value is found. Each value carries its own cache expiration time; once it expires, a thread is started asynchronously to read the database and update the cache. This requires cache preheating (loading the cache before the project starts).

Google Bloom filter (Guava BloomFilter): memory-based, its state is lost on restart, it does not support large data volumes, and it cannot be used in distributed scenarios.

Redis Bloom filter (redisson BloomFilter): scalable and survives restarts, but requires network I/O, so its performance is lower than the Google version.
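For the local case, a minimal Guava Bloom filter sketch (the expected insertion count and false-positive rate are illustrative assumptions):

    import com.google.common.hash.BloomFilter;
    import com.google.common.hash.Funnels;
    import java.nio.charset.StandardCharsets;

    public class BloomFilterExample {
        public static void main(String[] args) {
            // 1,000,000 expected insertions with a 1% false-positive probability
            BloomFilter<String> filter = BloomFilter.create(
                    Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, 0.01);
            filter.put("user:1001");
            // mightContain can return false positives, but never false negatives
            System.out.println(filter.mightContain("user:1001")); // true
            System.out.println(filter.mightContain("user:9999")); // almost certainly false
        }
    }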

Cache breakdown, hot data update problem

A key is very hot and accessed very frequently; under concentrated, highly concurrent access, the moment the key expires, a large number of requests break through the cache and hit the database directly, like cutting a hole in a barrier.

  • If the cached data is almost never updated, set the hot data to never expire.
  • If the cached data is updated infrequently and the whole cache refresh is time-consuming, use a distributed mutex (based on Redis or ZooKeeper) or a local mutex to ensure that only a small number of requests hit the database and rebuild the cache, while the remaining threads access the new cache once the lock is released (see the sketch after this list).
  • If the cached data is updated frequently or the cache refresh takes a long time, use a timed thread to actively rebuild the cache before it expires, or postpone the cache's expiration time, so that all requests can always access a cache entry.
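A minimal sketch of the mutex approach with a local lock (the Jedis calls, TTL, and the loadFromDB helper are illustrative assumptions):

    import java.util.concurrent.locks.ReentrantLock;
    import redis.clients.jedis.Jedis;

    public class HotKeyRebuild {
        private static final ReentrantLock LOCK = new ReentrantLock();

        static String get(Jedis jedis, String key) throws InterruptedException {
            for (;;) {
                String value = jedis.get(key);
                if (value != null) return value;
                if (LOCK.tryLock()) {                  // only one thread rebuilds the cache
                    try {
                        value = loadFromDB(key);       // hypothetical database read
                        jedis.setex(key, 300, value);  // rebuild with a 5-minute TTL
                        return value;
                    } finally {
                        LOCK.unlock();
                    }
                }
                Thread.sleep(50);                      // other threads back off and retry the cache
            }
        }

        private static String loadFromDB(String key) { return "value-of-" + key; }
    }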
Bloom filter with delete support

BloomFilter

A Bloom filter determines the existence of an element by checking whether the corresponding bit positions are all 1. However, to delete an element you cannot simply flip a 1 back to 0, because other elements may also map to that position. So how can deletion be supported? The simplest approach is to add a counter: instead of a single bit per position, store the number of elements that map to it, incrementing on insert and decrementing on delete. The trade-off is that a count such as 2 no longer fits in one bit, so a Bloom filter with counters takes up more space.
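A minimal counting Bloom filter sketch (the hash functions and array size are simplified assumptions for illustration):

    import java.util.function.ToIntFunction;

    public class CountingBloomFilter {
        private final int[] counters;
        private final ToIntFunction<String>[] hashes;

        @SafeVarargs
        CountingBloomFilter(int size, ToIntFunction<String>... hashes) {
            this.counters = new int[size];
            this.hashes = hashes;
        }

        void add(String item) {
            for (ToIntFunction<String> h : hashes)
                counters[Math.floorMod(h.applyAsInt(item), counters.length)]++;
        }

        void remove(String item) {           // decrementing counters is what makes deletion possible
            for (ToIntFunction<String> h : hashes)
                counters[Math.floorMod(h.applyAsInt(item), counters.length)]--;
        }

        boolean mightContain(String item) {  // present only if every counter is positive
            for (ToIntFunction<String> h : hashes)
                if (counters[Math.floorMod(h.applyAsInt(item), counters.length)] == 0) return false;
            return true;
        }

        public static void main(String[] args) {
            CountingBloomFilter f = new CountingBloomFilter(1024,
                    String::hashCode, s -> s.hashCode() * 31 + 7);
            f.add("redis");
            System.out.println(f.mightContain("redis")); // true
            f.remove("redis");
            System.out.println(f.mightContain("redis")); // false (barring collisions)
        }
    }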

Double-write consistency between cache and database
Avoid serialization

Read and write requests can be serialized into a single memory queue. Serialization guarantees that no inconsistency occurs, but it also drastically reduces system throughput, requiring several times more machines than normal to support the same load.

Cache Aside Pattern
  • On read: if the cache misses, read the database, put the result into the cache, and return the response.
  • On update: update the database first, then delete the cache.
Why delete the cache instead of update it

The reason is simple: in complex caching scenarios, the cached value is often not fetched directly from one table. For example, when a field of one table is updated, computing the latest cache value may require querying data from two other tables and combining the results.

Also, updating the cache on every write can be costly: if a value is written often but read rarely, eagerly recomputing it wastes work, while deleting it defers the computation to the next read (lazy loading).
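A minimal Cache Aside sketch with Jedis (key names, TTL, and the database helpers are illustrative assumptions):

    import redis.clients.jedis.Jedis;

    public class CacheAside {
        // Read path: cache first, fall back to the database, then fill the cache
        static String read(Jedis jedis, String key) {
            String value = jedis.get(key);
            if (value == null) {
                value = queryFromDB(key);       // hypothetical database read
                jedis.setex(key, 600, value);   // cache for 10 minutes
            }
            return value;
        }

        // Write path: update the database first, then delete (not update) the cache
        static void write(Jedis jedis, String key, String value) {
            updateDB(key, value);               // hypothetical database write
            jedis.del(key);
        }

        private static String queryFromDB(String key) { return "db-value-of-" + key; }
        private static void updateDB(String key, String value) { /* ... */ }
    }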

Redis hot key problems and solutions

Hot Key Issues

The hot key problem is that suddenly there are hundreds of thousands of requests to access a particular key on Redis.

Finding the problem

  • Predict based on experience: for example, if you know in advance that an activity will start, treat its key as a hot key.
  • Client-side collection: gather statistics on keys before each Redis operation.
  • Packet capture and analysis: Redis communicates with clients over TCP using the RESP protocol, so packets can be intercepted and parsed.
  • Proxy-layer collection: collect and report statistics for each Redis request at the proxy layer.
  • Built-in command query: Redis 4.0.4 provides redis-cli --hotkeys to find hot keys.

The solution

Level-2 cache – when a hot key is found, load it into an in-process cache in the JVM.

Backup hot keys – a Redis cluster has 16384 hash slots, and the cluster uses the formula CRC16(key) % 16384 to decide which slot a key belongs to, so the same key always maps to the same node. How, then, can a hot key be distributed to other machines? Just append a random number to it.

Don't let the key always land on the same Redis node: pick a random number from 0 to 2N (N is the number of cluster nodes), then generate a new key as hotKey + random number.

    const M = N * 2                      // N is the number of cluster nodes
    // generate a random suffix
    random = GenRandom(0, M)
    bakHotKey = hotKey + "_" + random
    data = redis.GET(bakHotKey)
    if data == NULL {
        data = GetFromDB()
        redis.SET(bakHotKey, data, expireTime + GenRandom(0, 5))
    }
A distributed lock
  • Based on a DB unique index.
  • Based on ZooKeeper ephemeral sequential nodes.
  • Based on Redis SET with the NX and EX/PX parameters.

SETNX + SETEX (SET PX)
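A minimal sketch of this lock with Jedis: an atomic SET ... NX PX acquires the lock, and a Lua script releases it only if the stored random value still matches, so one client cannot release another's lock (the key name and timeouts are illustrative assumptions):

    import java.util.Collections;
    import java.util.UUID;
    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.params.SetParams;

    public class RedisLock {
        private static final String UNLOCK_LUA =
                "if redis.call('get', KEYS[1]) == ARGV[1] then " +
                "  return redis.call('del', KEYS[1]) " +
                "else return 0 end";

        static String tryLock(Jedis jedis, String lockKey) {
            String token = UUID.randomUUID().toString();      // identifies the lock owner
            String ok = jedis.set(lockKey, token, SetParams.setParams().nx().px(30_000));
            return "OK".equals(ok) ? token : null;
        }

        static boolean unlock(Jedis jedis, String lockKey, String token) {
            // check-and-delete must be atomic, hence the Lua script
            Object r = jedis.eval(UNLOCK_LUA,
                    Collections.singletonList(lockKey), Collections.singletonList(token));
            return Long.valueOf(1L).equals(r);
        }
    }

Note that SETNX followed by a separate EXPIRE is not atomic: if the client crashes in between, the lock never expires. That is why the single SET with NX and PX options is preferred.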

Redisson locks

Redisson's various locks

  • Lock (implemented in Redisson with RLock)
  • Fair lock (implemented in Redisson with RedissonFairLock)
  • Read-write lock (implemented in Redisson with RReadWriteLock)

In addition, Redisson implements the following distributed locking mechanisms:

  • RedLock (implemented in Redisson with RedissonRedLock)
  • Multi-lock (implemented in Redisson with RedissonMultiLock)
RReadWriteLock

A ReadWriteLock maintains a pair of associated locks, one for read-only operations and one for writing. The read lock may be held simultaneously by multiple reader threads as long as there are no writers; the write lock is exclusive. It works in non-fair mode, so the order of read and write locking is unspecified.
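A minimal usage sketch (the RedissonClient instance and lock name are illustrative assumptions):

    import java.util.concurrent.TimeUnit;
    import org.redisson.api.RReadWriteLock;
    import org.redisson.api.RedissonClient;

    public class ReadWriteLockExample {
        static void demo(RedissonClient redisson) {
            RReadWriteLock rwLock = redisson.getReadWriteLock("anyRWLock");
            rwLock.readLock().lock(10, TimeUnit.SECONDS);   // shared among readers
            try {
                // ... read the protected resource
            } finally {
                rwLock.readLock().unlock();
            }
            rwLock.writeLock().lock(10, TimeUnit.SECONDS);  // exclusive
            try {
                // ... modify the protected resource
            } finally {
                rwLock.writeLock().unlock();
            }
        }
    }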

RedissonRedLock (red lock)

With a single-master distributed lock, a lock request can succeed on the master, but if the master goes down before the slave has replicated the lock, the slave is promoted, and another client's lock request will also succeed on it: two clients then hold the same lock.

With the red lock, the lock is acquired only if a majority of the master instances have granted it. The red lock algorithm has the following five steps:

  • Get the current time in milliseconds.
  • Request the lock on all N nodes with the same key and random value. The per-node attempt timeout is much smaller than the lock timeout, so that if a master is down, the client is not blocked for too long trying to acquire it.
  • The lock is considered acquired only when it has been obtained on a majority of nodes and the total acquisition time is less than the lock timeout.
  • If the lock is acquired successfully, its validity time is the initial lock timeout minus the total time spent acquiring it.
  • If acquisition fails, either because fewer than half of the nodes granted the lock or because acquisition took longer than the lock's validity time, the key is removed on all masters.
    Config config1 = new Config();
    config1.useSingleServer().setAddress("redis://172.0.0.1:5378")
           .setPassword("a123456").setDatabase(0);
    RedissonClient redissonClient1 = Redisson.create(config1);

    Config config2 = new Config();
    config2.useSingleServer().setAddress("redis://172.0.0.1:5379")
           .setPassword("a123456").setDatabase(0);
    RedissonClient redissonClient2 = Redisson.create(config2);

    Config config3 = new Config();
    config3.useSingleServer().setAddress("redis://172.0.0.1:5380")
           .setPassword("a123456").setDatabase(0);
    RedissonClient redissonClient3 = Redisson.create(config3);

    // Get an RLock for the same key from each independent instance
    RLock lock1 = redissonClient1.getLock(lockKey);
    RLock lock2 = redissonClient2.getLock(lockKey);
    RLock lock3 = redissonClient3.getLock(lockKey);

    // Combine the three locks into one red lock
    RedissonRedLock redLock = new RedissonRedLock(lock1, lock2, lock3);
    try {
        // waitTimeout: maximum time to wait for the lock; beyond it, acquisition fails.
        // leaseTime: how long the lock is held before it expires automatically.
        boolean res = redLock.tryLock((long) waitTimeout, (long) leaseTime, TimeUnit.SECONDS);
        if (res) {
            // ... business logic while holding the lock
        }
    } catch (Exception e) {
        throw new RuntimeException("acquire lock fail");
    } finally {
        redLock.unlock();
    }
RedissonMultiLock

The Redis-based distributed RedissonMultiLock object groups multiple RLock objects and treats them as a single lock. Each RLock object may belong to a different Redisson instance.

    RLock lock1 = redissonInstance1.getLock("lock1");
    RLock lock2 = redissonInstance2.getLock("lock2");
    RLock lock3 = redissonInstance3.getLock("lock3");

    // treats lock1, lock2 and lock3 as a single lock
    RedissonMultiLock lock = new RedissonMultiLock(lock1, lock2, lock3);
    lock.lock();
    // ...
    lock.unlock();
Persistence: AOF, RDB

bgsave performs full snapshot persistence and AOF performs incremental persistence. Because bgsave takes a long time, is not real-time, and would lose a lot of data if Redis went down between snapshots, AOF is needed to complement it. When a Redis instance restarts, memory is rebuilt from the bgsave snapshot file, and then the recent operations recorded in the AOF are replayed to fully restore the pre-restart state.

RDB

The RDB file format is compact, which makes data recovery convenient. When saving an RDB file, the parent process forks a child process to do the actual persistence work, maximizing Redis performance, and restoring a large data set from RDB is faster than from AOF. Snapshots are triggered by the SAVE or BGSAVE commands, automatically by the save points configured in redis.conf, and on a clean shutdown.

AOF

AOF records every write operation to the server (with the default appendfsync everysec policy, it is fsynced once per second), so the saved data is more complete; on restart, Redis replays these commands to recover the data. Writes are efficient and little data is lost on failure, but the file is larger than an RDB snapshot.

The principle of bgsave

fork and COW. COW means copy-on-write: after the child process is created, parent and child share the data pages, and the parent continues to serve reads and writes; pages the parent modifies are copied first, so the data seen by the child gradually separates from the parent's.

Persistence scheme

AOF and RDB persistence

How do you find the keys with a particular prefix among the thousands of keys in a Redis instance?

SCAN vs KEYS

KEYS lists all keys matching a given glob-style pattern.

Compared with KEYS, SCAN has the following characteristics (a usage sketch follows the list):

  • The complexity is also O(n), but the work is done in steps driven by a cursor and does not block the thread for long;
  • The COUNT argument controls the maximum amount of work per call; it is only a hint to the incremental iteration, which may return more or fewer results each time;
  • Like KEYS, it provides pattern matching (MATCH);
  • The server does not need to store any state for the cursor; the only state is the cursor integer SCAN returns to the client;
  • Importantly, the results returned may contain duplicates, which the client must filter out;
  • If data is modified during traversal, it is undefined whether the modified entries are visited;
  • A single empty return does not mean the traversal is over; it ends only when the returned cursor value is zero.
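A minimal prefix-scan sketch with Jedis (the pattern, count, and connection details are illustrative assumptions; a Jedis 3.x-style API is assumed):

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.ScanParams;
    import redis.clients.jedis.ScanResult;

    public class PrefixScan {
        public static void main(String[] args) {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                ScanParams params = new ScanParams().match("user:*").count(100);
                String cursor = ScanParams.SCAN_POINTER_START;  // "0"
                do {
                    ScanResult<String> page = jedis.scan(cursor, params);
                    page.getResult().forEach(System.out::println);
                    cursor = page.getCursor();                  // cursor for the next step
                } while (!"0".equals(cursor));                  // "0" marks the end of the traversal
            }
        }
    }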
Locate big keys in Redis

redis-cli --bigkeys; the -i 0.1 option adds a 0.1-second sleep per batch of SCAN commands.

    # Scanning the entire keyspace to find biggest keys as well as
    # average sizes per key type. You can use -i 0.1 to sleep 0.1 sec
    # per 100 SCAN commands (not usually needed).

    [00.00%] Biggest string found so far 'key316' with 3 bytes
    [00.00%] Biggest string found so far 'key7806' with 4 bytes
    [12.79%] Biggest zset   found so far 'salary' with 1 members
    [13.19%] Biggest string found so far 'counter:__rand_int__' with 6 bytes
    [13.50%] Biggest hash   found so far 'websit' with 2 fields
             Biggest set    found so far 'bbs' with 3 members
    [14.67%] Biggest hash   found so far 'website' with 3 fields
    [30.41%] Biggest list   found so far 'mylist' with 100000 items
    [95.53%] Biggest zset   found so far 'page_rank' with 3 members

    -------- summary -------

    Sampled 10019 keys in the keyspace!
    Total key length in bytes is 68990 (avg len 6.89)

    Biggest string found 'counter:__rand_int__' has 6 bytes
    Biggest   list found 'mylist' has 100000 items
    Biggest    set found 'bbs' has 3 members
    Biggest   hash found 'website' has 3 fields
    Biggest   zset found 'page_rank' has 3 members

    10011 strings with 38919 bytes (99.92% of keys, avg size 3.89)
    3 lists with 100003 items (00.03% of keys, avg size 33334.33)
    1 sets with 3 members (00.01% of keys, avg size 3.00)
    2 hashs with 5 fields (00.02% of keys, avg size 2.50)
    2 zsets with 4 members (00.02% of keys, avg size 2.00)
Viewing Slow Logs

SLOWLOG subcommand [argument]

Subcommands include:

  • get: slowlog get [argument] returns the slow log entries, at most the number given by argument.
  • len: slowlog len returns the total number of slow log entries.
  • reset: slowlog reset clears the slow log.
Asynchronous message queue

lpush queueabc bytedance tencent netease souho 360

brpop queueabc 10

BLPOP / BRPOP

Both commands are prefixed with the letter b, which stands for blocking.

A blocking read sleeps when there is no data in the queue and wakes up as soon as data arrives, so the message delay is almost zero.
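A minimal blocking consumer sketch with Jedis (the queue name and timeout are illustrative assumptions):

    import java.util.List;
    import redis.clients.jedis.Jedis;

    public class QueueConsumer {
        public static void main(String[] args) {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                while (!Thread.currentThread().isInterrupted()) {
                    // blocks for up to 10 seconds; returns [key, value] or null on timeout
                    List<String> msg = jedis.brpop(10, "queueabc");
                    if (msg != null) {
                        System.out.println("consumed: " + msg.get(1));
                    }
                }
            }
        }
    }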

Delay queue

Use the zset type: sort by treating each task's due timestamp as the score, and keep producing messages with zadd score member. Consumers then use zrangebyscore to query all tasks that are due and process them in a loop; you can also fetch just the earliest task for consumption with zrangebyscore key min max withscores limit 0 1.

    > zrangebyscore runoobkey 0 100 withscores
    redis
    1
    mongodb
    2
    mysql
    4
    fhjaiukosdoj
    6
    juihuinik
    8
    fahseiyufh
    11
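A minimal zset-based delay queue sketch with Jedis (the queue key and a Jedis 3.x-style API are assumptions): producers zadd with the due time as the score; consumers poll for due members and use zrem so that each task is claimed by exactly one consumer.

    import java.util.Set;
    import redis.clients.jedis.Jedis;

    public class DelayQueue {
        private static final String KEY = "delay-queue";

        static void produce(Jedis jedis, String msg, long delayMillis) {
            // score = the timestamp at which the task becomes due
            jedis.zadd(KEY, System.currentTimeMillis() + delayMillis, msg);
        }

        static void consume(Jedis jedis) {
            // fetch at most one task that is already due
            Set<String> due = jedis.zrangeByScore(KEY, 0, System.currentTimeMillis(), 0, 1);
            for (String task : due) {
                // zrem returns 1 only for the consumer that actually removed the task,
                // so concurrent consumers never process the same task twice
                if (jedis.zrem(KEY, task) > 0) {
                    System.out.println("processing " + task);
                }
            }
        }
    }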

Redisson has an implementation: its RDelayedQueue allows items to be inserted into a destination queue with a per-item delay. This feature can be used, for example, to implement a delivery strategy in which the message delivery delay grows or decays geometrically.

    RQueue<String> destinationQueue = ...
    RDelayedQueue<String> delayedQueue = redisson.getDelayedQueue(destinationQueue);
    // "msg1" will appear in destinationQueue in 10 seconds
    delayedQueue.offer("msg1", 10, TimeUnit.SECONDS);
    // "msg2" will appear in destinationQueue in 1 minute
    delayedQueue.offer("msg2", 1, TimeUnit.MINUTES);

When the queue object is no longer needed, it should be destroyed explicitly; that is only unnecessary if the associated Redisson instance is being shut down anyway.

    RDelayedQueue<String> delayedQueue = ...
    delayedQueue.destroy();
The principle of ZSet's underlying SkipList

ZSet uses two data structures to store the same elements, providing ordered-set semantics while keeping insert and delete operations at O(log N). Each element is added to a hash table that maps the member to its score, and at the same time to a skiplist that maps scores to members (ordered by score).

O(log n)

Searching a skiplist is like binary search: it trades space for time.

SkipList

Why Redis ZSet uses a skiplist rather than a balanced tree

A skiplist has the same asymptotic complexity as a red-black tree and is much simpler to implement.

In a concurrent environment, insertions and deletions in a red-black tree may trigger rebalancing that touches many nodes, whereas a skiplist only needs to modify a few adjacent pointers locally.
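To make the structure concrete, here is a minimal skiplist sketch in Java (the level count and promotion probability are simplified assumptions; Redis's actual implementation is in C and also stores spans for rank queries):

    import java.util.Random;

    /** Minimal skiplist holding scores only, for illustration. */
    class SkipList {
        static final int MAX_LEVEL = 16;

        static final class Node {
            final double score;
            final Node[] forward;
            Node(double score, int level) { this.score = score; this.forward = new Node[level]; }
        }

        private final Node head = new Node(Double.NEGATIVE_INFINITY, MAX_LEVEL);
        private final Random rnd = new Random();
        private int level = 1;

        /** Each node is promoted one level with probability 1/2: O(log n) expected height. */
        private int randomLevel() {
            int lvl = 1;
            while (lvl < MAX_LEVEL && rnd.nextBoolean()) lvl++;
            return lvl;
        }

        void insert(double score) {
            Node[] update = new Node[MAX_LEVEL];
            Node x = head;
            for (int i = level - 1; i >= 0; i--) {       // descend from the top level,
                while (x.forward[i] != null && x.forward[i].score < score) x = x.forward[i];
                update[i] = x;                           // remembering where to splice in
            }
            int lvl = randomLevel();
            if (lvl > level) { for (int i = level; i < lvl; i++) update[i] = head; level = lvl; }
            Node n = new Node(score, lvl);
            for (int i = 0; i < lvl; i++) {              // only local pointers change
                n.forward[i] = update[i].forward[i];
                update[i].forward[i] = n;
            }
        }

        boolean contains(double score) {
            Node x = head;
            for (int i = level - 1; i >= 0; i--)
                while (x.forward[i] != null && x.forward[i].score < score) x = x.forward[i];
            x = x.forward[0];
            return x != null && x.score == score;
        }
    }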

Cluster Deployment Experience

(1) It is best for the Master not to do any persistent work, such as RDB memory snapshots and AOF log files

(2) If the data is important, a Slave enables AOF backup, and the policy is set to synchronize data once per second

(3) For the speed of Master/Slave replication and connection stability, it is better for Master and Slave to reside in the same LAN

(4) Try to avoid adding new slaves to a master that is already under heavy load

A chained replication structure, Master <- Slave1 <- Slave2 <- Slave3 ..., reduces the replication load on the Master and makes single points of failure easy to handle: if the Master fails, Slave1 can immediately take over as the new Master.