Preface

Data in a conventional SQL database lives on disk. Although the database keeps its own cache to reduce I/O pressure, that cache is generally keyed on the content of queries and is fine-grained, so it usually only helps while the data in the underlying tables does not change, and it does nothing to relieve the I/O pressure created by the inserts, updates and deletes issued by business logic. Caching technology therefore emerged as the times required: by caching hot data it can greatly relieve the pressure on the back-end database.

Mainstream Application Architecture

Client → cache layer → (queries that miss the cache fall through to) → database

Cache middleware – The difference between Memcache and Redis

  • Memcache: at the code level, it is essentially a hash table

    1. Supports only simple data types. 2. Does not support persistent storage. 3. Does not support sharding.

  • Redis

    1. Rich data types. 2. Persistent storage of data to disk. 3. Supports sharding.

Why is Redis so fast

Redis is very efficient; the official figure is 100,000+ QPS (queries per second). The reasons:

1. Redis is completely memory-based; most requests are pure in-memory operations, which execute very quickly. 2. Redis is a single-process, single-threaded (key, value) database: the data lives in memory, access is not limited by disk I/O, so it runs fast, and a single thread can handle highly concurrent requests while avoiding frequent context switches and lock contention; if you want to use multiple cores you can simply start multiple instances. 3. The data structures are simple and so are the operations on them: Redis has no tables, does not force users to model relationships, and has no complex relational constraints. 4. Redis uses an I/O multiplexing model with non-blocking I/O.


Note: the I/O multiplexing functions Redis can use are epoll/kqueue/evport/select, chosen as follows: 1. A multiplexing function with O(1) time complexity is preferred as the underlying implementation. 2. select has O(n) time complexity because it traverses every I/O descriptor. 3. I/O events are monitored based on the Reactor design pattern.


Redis data type

  • String

    The most basic data type; a value can hold up to 512 MB and is binary safe (a Redis String can contain any binary data, such as a JPG image).

    Note: if the same key is written repeatedly, the last write overwrites the previous ones.

  • Hash

    A dictionary of String elements suitable for storing objects.

  • List

    A list of String elements kept in insertion order. Used last in, first out, it behaves like a stack, which makes it suitable for features such as "latest news ranking".

  • Set

    An unordered collection of String elements, implemented as a hash table (add, remove and lookup are O(1)); duplicate elements are not allowed.

    In addition, when we use SMEMBERS to iterate over the elements of a set, the order is undefined (it is the result of the hashing). Redis also provides intersection, union and difference operations on sets, which can implement features such as mutual following and mutual friends.

  • Sorted Set

    A set whose members are ordered from smallest to largest by an associated score.

  • More advanced Redis types

    HyperLogLog for approximate counting, and Geo for storing geographic location information. (A brief sketch of the basic commands for these types, using a Java client, follows this list.)
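
    For illustration, a brief sketch of these types through a Java client, assuming the Jedis library; the key names are made up, and any Redis client would expose equivalent commands:

    import redis.clients.jedis.Jedis;

    public class DataTypeExample {
        public static void main(String[] args) {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                jedis.set("page:title", "Hello");              // String
                jedis.hset("user:1", "name", "Alice");         // Hash: field -> value
                jedis.lpush("news", "item1", "item2");         // List: newest element first
                jedis.sadd("followers:1", "1001", "1002");     // Set: no duplicates, no order
                jedis.zadd("rank", 95.0, "player1");           // Sorted Set: ordered by score
                System.out.println(jedis.zrangeWithScores("rank", 0, -1));
            }
        }
    }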

Query a Key with a fixed prefix from a mass of keys

  • Suppose there are a billion keys in Redis. How do you find all the keys with a given prefix among so many?

    • Method 1: use KEYS [pattern]: find all keys that match the given pattern

      The KEYS pattern command finds every key matching the pattern, but it returns them all at once, so Redis can appear to hang. If Redis is serving production traffic, this command is a hidden danger; returning all the keys in one go can also consume a large amount of memory. Example:

      KEYS test* // returns all keys prefixed with test

    • Method 2: use SCAN cursor [MATCH pattern] [COUNT count]

      MATCH pattern: the filter applied to key names. COUNT count: a hint for how many keys to return per call. SCAN is a cursor-based iterator: each call continues the iteration from the cursor returned by the previous call. An iteration starts with cursor 0 and is complete when the command returns cursor 0 again. The command does not guarantee that a given number of elements is returned on each call (it may even return 0 elements), but as long as the returned cursor is not 0 the iteration is not finished, and the number of elements returned will usually be close to the COUNT hint. SCAN also supports fuzzy matching. Example (a full traversal through a Java client is sketched after this example):

      SCAN 0 MATCH test* COUNT 10 // roughly 10 keys prefixed with test are returned per call, together with the next cursor
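
      A full traversal from a Java client might look like the following minimal sketch, assuming the Jedis library (version 3 or later) and an illustrative key prefix:

      import redis.clients.jedis.Jedis;
      import redis.clients.jedis.ScanParams;
      import redis.clients.jedis.ScanResult;

      public class ScanExample {
          public static void main(String[] args) {
              try (Jedis jedis = new Jedis("localhost", 6379)) {
                  ScanParams params = new ScanParams().match("test*").count(10);
                  String cursor = ScanParams.SCAN_POINTER_START;      // "0": start a new iteration
                  do {
                      ScanResult<String> page = jedis.scan(cursor, params);
                      page.getResult().forEach(System.out::println);  // the keys returned by this call
                      cursor = page.getCursor();                      // continue from the returned cursor
                  } while (!"0".equals(cursor));                      // cursor 0 again: the walk is complete
              }
          }
      }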

How to implement distributed locks through Redis

  • A distributed lock

    A distributed lock controls access to a shared resource across the processes and hosts of a distributed system. Whenever a resource is shared by different systems, or by different hosts of one system, mutual exclusion is needed to prevent interference and keep the data consistent. A distributed lock must address the following: 1. Mutual exclusion: at any moment only one client can hold the lock. 2. Safety: a lock can only be deleted by the client that holds it, never by another client. 3. Deadlock avoidance: if the client holding the lock crashes before releasing it, other clients must still eventually be able to acquire it instead of deadlocking. 4. Fault tolerance: clients can still acquire and release the lock even if some Redis nodes go down.

  • How to use Redis to implement distributed locks

    • Use SETNX

      SETNX key value: sets the key to the value only if the key does not exist. The time complexity is O(1). It returns 1 if the key was set and 0 otherwise.

      Because SETNX is simple to use and atomic, it was often used as a distributed lock in the early days. Before entering a shared resource area, the client issues SETNX and checks whether it succeeded: if the set succeeds, no other client is accessing the resource; if it fails, some client is already accessing it and the caller has to wait. But there is a problem: the key set by SETNX stays around. If the client that locked the resource finishes (or crashes) without deleting the key, the lock still exists and later clients can never acquire it. How do we solve this?



      Since SETNX does not accept an expiry parameter, we can follow it with the EXPIRE command to set an expiration time on the key.



      Usage: EXPIRE key seconds

      Application:

      // RedisService is the application's own Redis wrapper, fetched from the Spring context
      RedisService redisService = SpringUtils.getBean(RedisService.class);
      long status = redisService.setnx(key, "1");  // try to take the lock
      if (status == 1) {                           // 1: the key did not exist, lock acquired
          redisService.expire(key, expire);        // give the lock a TTL so it cannot be held forever
          doOccupiedWork();                        // enter the critical section
      }

      The problem with this approach is that if the program fails after SETNX succeeds but before EXPIRE runs (for example an exception is thrown on the second line and the program exits), the key has no expiry and will exist forever, which means the lock can never be released. The root cause is that the two steps are not atomic. Solution: starting with Redis 2.6.12, the SET command can combine SETNX and EXPIRE into a single atomic operation, as follows.

      SET KEY value [EX seconds] [PX milliseconds] [NX|XX]

      EX seconds: set the key's expiration time in seconds. PX milliseconds: set the key's expiration time in milliseconds. NX: set the key only if it does not already exist. XX: set the key only if it already exists. Note: SET returns OK if it completes successfully and nil otherwise.

      With SET we can implement a distributed lock in an application using code like the following:

      RedisService redisService = SpringUtils.getBean(RedisService.class);
      // one atomic call: set lockKey only if it is absent (NX) and attach an expiry (EX/PX)
      String result = redisService.set(lockKey, requestId, SET_IF_NOT_EXIST, SET_WITH_EXPIRE_TIME, expireTime);
      if ("OK".equals(result)) {
          doOccupiedWork();  // lock acquired, enter the critical section
      }
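
      For reference, here is a minimal sketch of the same pattern using the Jedis client directly; the key, value and expiry are illustrative, and a production lock should verify the stored requestId (for example via a Lua script) before deleting the key:

      import redis.clients.jedis.Jedis;
      import redis.clients.jedis.params.SetParams;

      public class RedisLockExample {
          public static void main(String[] args) {
              try (Jedis jedis = new Jedis("localhost", 6379)) {
                  // NX: only set if the key does not exist; EX: expire automatically after 30 seconds
                  String result = jedis.set("lock:order", "request-42", SetParams.setParams().nx().ex(30));
                  if ("OK".equals(result)) {
                      try {
                          // critical section: work on the shared resource
                      } finally {
                          jedis.del("lock:order");  // release the lock when done
                      }
                  }
              }
          }
      }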

How to implement asynchronous queues

  • Use the List in Redis as the queue

    Using the Redis List described above as a queue: RPUSH produces messages and LPOP consumes them.

    Here the queue is produced with RPUSH and consumed with LPOP. In this producer/consumer queue, when LPOP returns no message it simply means the queue is empty: the producer has not yet had time to produce new data.

    Disadvantage: LPOP does not wait for a value to be present in the queue; it returns immediately, whether or not there is anything to consume.

    Workaround: introduce a sleep mechanism at the application layer and retry LPOP (see the sketch after this list).

  • Use BLPOP key [key…] timeout

    BLPOP key [key …] Timeout: blocks until the queue has a message or times out.

    Disadvantage: with this approach, each produced message can be delivered to only a single consumer.

    Can a message be produced once and consumed by multiple consumers?

  • Pub/Sub: publish/subscribe (topic) mode

    The publisher (pub) sends a message and the subscribers (sub) receive it; a subscriber can subscribe to any number of channels (see the sketch after this list).

    Disadvantages of Pub/Sub mode:

    Message publication is stateless and delivery is not guaranteed: for the publisher, a message is "send and forget". If a consumer is offline when the producer publishes a message, it will not receive that message when it comes back online. Solving this properly requires a professional message queue such as Kafka, which is beyond the scope of this article.
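
    Minimal sketches of the approaches above using a Java client, assuming the Jedis library; the queue key and channel names are made up. First, the List-as-queue pattern with an application-level sleep and retry:

    import redis.clients.jedis.Jedis;

    public class PollingQueueExample {
        public static void main(String[] args) throws InterruptedException {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                jedis.rpush("queue:msg", "hello");           // producer: RPUSH appends a message
                while (true) {
                    // jedis.blpop(5, "queue:msg") would block up to 5 seconds instead of polling
                    String msg = jedis.lpop("queue:msg");    // consumer: LPOP pops the oldest message
                    if (msg == null) {
                        Thread.sleep(100);                   // queue is empty: sleep briefly, then retry LPOP
                        continue;
                    }
                    System.out.println("consumed: " + msg);
                }
            }
        }
    }

    And a bare-bones Pub/Sub sketch; note that a subscriber only receives messages that are published while it is listening:

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.JedisPubSub;

    public class PubSubExample {
        public static void main(String[] args) throws InterruptedException {
            new Thread(() -> {                                    // subscriber: blocks its thread and
                try (Jedis sub = new Jedis("localhost", 6379)) {  // prints every message on "news"
                    sub.subscribe(new JedisPubSub() {
                        @Override
                        public void onMessage(String channel, String message) {
                            System.out.println(channel + ": " + message);
                        }
                    }, "news");
                }
            }).start();
            Thread.sleep(500);                                    // crude wait so the subscriber is ready (sketch only)
            try (Jedis pub = new Jedis("localhost", 6379)) {
                pub.publish("news", "hello subscribers");         // delivered to every listening subscriber
            }
        }
    }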

Redis persistence

  • What is persistence

    Persistence means that data survives power failures and other external disruptions. Redis keeps its data in memory rather than on disk, so if the memory is lost, everything stored in Redis vanishes immediately, which is usually not what users expect; Redis therefore provides persistence mechanisms to keep the data safe.

  • How does Redis persist

    Redis currently has two persistence modes, RDB and AOF. RDB persists the data by saving a snapshot of the full data set at a point in time; during recovery, the data is restored directly from the snapshot in the RDB file.

  • RDB(Snapshot) Persistence: Saves a snapshot of full data at a point in time

    RDB persistence saves a snapshot of the full data set at specific intervals. The relevant settings in redis.conf:

      save 900 1       # take a snapshot if at least 1 write occurred within 900 seconds
      save 300 10      # take a snapshot if at least 10 writes occurred within 300 seconds
      save 60 10000    # take a snapshot if at least 10000 writes occurred within 60 seconds
      stop-writes-on-bgsave-error yes
      # stop-writes-on-bgsave-error: if yes, the main process stops accepting new writes when the
      # background save fails, which protects the consistency of the persisted data
    • RDB creation and loading

      SAVE: blocks the Redis server process until the RDB file has been created. SAVE is rarely used because it blocks the main thread while the snapshot is written, and since Redis serves all client requests on that single main thread, every client request is blocked as well. BGSAVE: forks a child process to create the RDB file without blocking the server process; the child process writes the RDB snapshot while the parent process keeps serving client requests. When the child finishes creating the file it sends a signal to the parent, which polls for that signal at intervals while it continues to handle client requests. We can also check whether BGSAVE completed successfully with the LASTSAVE command, which returns the time of the last successful BGSAVE (a short client-side sketch appears at the end of this persistence section).

    • Automatic way to trigger RDB persistence

      1. The save rules configured in redis.conf trigger it on a timer (BGSAVE is actually executed). 2. The master node triggers it automatically during master/slave replication. 3. Running DEBUG RELOAD. 4. Running SHUTDOWN while AOF persistence is not enabled.

    • The principle of BGSAVE

      On start: 1. Check whether a child process is already performing an AOF or RDB persistence task; if so, return false. 2. Call the rdbSaveBackground function in the Redis source code, which executes fork() to create the child process that performs the RDB save.

      3. About copy-on-write in fork()

      On Linux, fork() creates child processes using copy-on-write: if multiple callers request the same resource (such as data in memory or on disk), they all receive pointers to the same shared resource. Only when one caller tries to modify the resource does the system actually make a private copy for that caller, while every other caller still sees the original, unchanged resource.

    • Disadvantages of RDB persistence

      1. Each snapshot synchronizes the full in-memory data set, so I/O performance suffers when the data set is large. 2. Data written between the latest snapshot and a Redis crash is lost.

  • AOF(append-only-file) persistence: Saves the write status

    AOF persistence records the database by saving the write operations Redis receives. In contrast to RDB, which backs up the state of the database, AOF backs up the commands the database received. 1. AOF records every command that changes database state (everything except queries). 2. Commands are appended to the AOF file incrementally.

  • Enable AOF persistence

    1. Open redis.conf and set the appendonly property to yes. 2. Set the appendfsync property, which accepts three values: always, everysec and no. always means the buffer is flushed to the AOF file immediately after every write; everysec flushes the buffer to the AOF file once per second; no leaves the flushing to the operating system, which for efficiency usually waits until the buffer is full before writing its contents to the AOF file.

      appendonly yes

      # appendfsync always
      appendfsync everysec
      # appendfsync no
  • Log rewrite solves the problem of growing AOF files

    As writes accumulate, the AOF file keeps growing. Suppose a counter is incremented 100 times: with RDB persistence we only need to keep the final value of 100, whereas AOF persistence has to record all 100 increment commands, even though restoring the final state really only needs one command; so those 100 commands can be reduced to one. Redis can rewrite the AOF file without interrupting the foreground service, again using COW (copy-on-write): 1. Call fork() to create a child process. 2. The child writes the new AOF to a temporary file, independent of the original AOF file. 3. The main process keeps writing new changes both to memory and to the original AOF. 4. When the main process receives the child's signal that the rewrite is finished, it appends the incremental changes to the new AOF. 5. The new AOF file replaces the old one.

  • Advantages and disadvantages of AOF and RDB

    Advantages of RDB: a full data snapshot, a small file, and fast recovery. Disadvantage of RDB: data written after the last snapshot can be lost. Advantages of AOF: readable, good at saving incremental data, data is not easily lost. Disadvantages of AOF: a large file and a long recovery time.

  • Rdb-aof hybrid persistence mode

    This persistence mode was introduced in Redis 4.0 and is used as the default: RDB serves as the full backup and AOF as the incremental backup. Of the two modes above, RDB writes the full data set to the RDB file, which gives a small file and fast recovery but cannot preserve data written after the latest snapshot; AOF saves the Redis commands to a file, which gives a large file and a long recovery time. With RDB-AOF hybrid persistence, the strategy first writes all cached data to the file in RDB format, then appends new data to it in AOF format; at the next RDB persistence, the accumulated AOF data is rewritten into the file in RDB format again. This improves read/write and recovery efficiency, keeps the file small, and preserves data integrity. During this persistence, the child process reads incremental data from the parent process through a pipe, and while the full data is being saved in RDB format it keeps reading from the pipe so the pipe does not block. The resulting persistence file has full data in RDB format in its first half and incremental data in AOF format in its second half. This is currently the recommended persistence mode.
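
    Assuming Redis 4.0 or later, this hybrid behaviour is controlled by the aof-use-rdb-preamble directive in redis.conf. As a small aside to the RDB discussion above, here is a minimal sketch of triggering a snapshot from a Java client, assuming the Jedis library (for illustration only):

    import redis.clients.jedis.Jedis;

    public class SnapshotExample {
        public static void main(String[] args) {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                jedis.bgsave();                        // ask the server to fork and write an RDB snapshot
                System.out.println(jedis.lastsave());  // unix timestamp of the last successful BGSAVE
            }
        }
    }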

Redis data recovery

  • Recovery process when RDB and AOF files coexist

    When Redis starts, it first checks whether an AOF file exists; if it does, the AOF file is loaded directly, otherwise the RDB file is loaded.

Pipeline

Pipeline is similar to a Linux pipe in that it lets Redis execute commands in batches. Redis is based on a request/response model in which requests are processed one by one; if a large number of commands need to be executed, each command has to wait for the previous one to finish, which adds round-trip time (RTT) and uses system I/O many times. Because a pipeline sends commands in batches, it saves many I/O operations and request/response round trips. However, if the commands depend on each other, it is better to split them into separate batches.
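
For illustration, a minimal pipeline sketch assuming the Jedis Java client (the key names are made up):

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;
import redis.clients.jedis.Response;

public class PipelineExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            Pipeline p = jedis.pipelined();
            p.set("page:home:title", "Home");               // queued locally, not yet sent
            Response<Long> hits = p.incr("page:home:hits"); // also queued
            p.sync();                                       // send everything in one round trip
            System.out.println("hits = " + hits.get());     // responses are available after sync()
        }
    }
}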

Redis synchronization mechanism

  • Master slave synchronization principle

    Redis generally uses one Master node for writes and several Slave nodes for reads; Master and Slave are simply different RedisServer instances. In addition, a separate Slave is often dedicated to periodic data backup. This maximizes Redis performance at the price of only weak, eventual consistency: the data on the Master and the Slaves is not synchronized immediately, but after a period of time it converges, which is eventual consistency. (A small client-side sketch of setting up replication appears at the end of this section.)

    • Fully synchronous process

      1. The Slave sends sync to the Master.

      2. The Master starts a background process to save a snapshot of the data in Redis to a file.

      3. The Master caches the write commands received during the data snapshot.

      4. After writing the file, the Master sends the file to the Slave.

      5. After receiving the snapshot file, the Slave discards its old data and loads the snapshot.

      6. The Master sends the collected incremental write commands to the Slave.

    • Incremental synchronization process

      1. The Master receives the operation command from the user and determines whether to transmit the command to the Slave.

      2. Append the operation record to the AOF file.

      3. Propagate the operation to the other Slaves: 1. align the master and slave libraries; 2. write the commands into the response cache.

      4. Send the cached data to the Slave.

  • Redis Sentinel

    Disadvantage of the master-slave mode: when the Master goes down, the Redis cluster cannot serve writes. Redis Sentinel solves this problem of master/slave switchover after the Master crashes: 1. Monitoring: checks whether the master and slave servers are running normally. 2. Notification: sends failure notifications to administrators or other applications through an API. 3. Automatic failover: master/slave switchover (after the Master goes down, one Slave is promoted to Master and the other Slaves synchronize from the new Master).
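
    For reference, a minimal sketch of pointing a replica at a master from a Java client, assuming the Jedis library and illustrative addresses; in practice this is usually configured with the replicaof/slaveof directive in redis.conf or the SLAVEOF command:

    import redis.clients.jedis.Jedis;

    public class ReplicaSetupExample {
        public static void main(String[] args) {
            try (Jedis replica = new Jedis("192.168.0.2", 6379)) {
                replica.slaveof("192.168.0.1", 6379);  // start replicating from the master; triggers a full sync
                // replica.slaveofNoOne();             // would promote this instance back to a standalone master
            }
        }
    }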

Redis cluster

  • Principle: How to quickly find what you need from mass data?

    • shard

      Split the data according to certain rules and store it on multiple nodes. Distributing the data across multiple Redis servers reduces the pressure on any single server.

    • Consistent Hash algorithm

      Since the data has to be sharded, the usual method is to take the hash of the key and compute it modulo the number of nodes. This has an obvious drawback: when the number of Redis nodes has to grow or shrink dynamically, a large number of keys no longer map to the right node. So the consistent Hash algorithm is used with Redis. The algorithm works modulo 2^32: the hash value space is treated as a virtual ring organized clockwise, with positions 0, 1, 2, ..., 2^32-1. Each server is hashed to determine its position on the ring. To locate a piece of data, apply the same hash to it to get a position on the ring and then search clockwise; the first server found is the one that stores the data.

  • Data skew of the Hash ring

    When there are few server nodes, they tend to be distributed unevenly on the Hash ring, which causes data skew: the cached objects become concentrated on one or a few servers of the Redis cluster.

    For example, most of the data computed by the consistent Hash algorithm might land on node A while node B stores only a small amount; as time passes, node A will be overwhelmed.

    This problem can be solved by introducing virtual nodes. In simple terms, several hashes are computed for each server node and the server is placed at each resulting position on the ring; these positions are called virtual nodes, and they can be implemented by appending a number to the server's IP address or host name (see the sketch after this section).

    For example, NodeA and NodeB above can each be split into the virtual nodes NodeA#1 to NodeA#3 and NodeB#1 to NodeB#3.
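
    A toy sketch of a consistent Hash ring with virtual nodes, written in plain Java for illustration only (real clients use many more virtual nodes and carefully chosen hash functions):

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    import java.util.SortedMap;
    import java.util.TreeMap;

    public class ConsistentHashRing {
        private final TreeMap<Long, String> ring = new TreeMap<>();
        private final int virtualNodes;

        public ConsistentHashRing(int virtualNodes) {
            this.virtualNodes = virtualNodes;
        }

        public void addNode(String node) {
            for (int i = 0; i < virtualNodes; i++) {
                ring.put(hash(node + "#" + i), node);   // e.g. "NodeA#0", "NodeA#1", ...
            }
        }

        public String getNode(String key) {
            // walk clockwise: the first virtual node at or after the key's hash owns the key
            SortedMap<Long, String> tail = ring.tailMap(hash(key));
            Long h = tail.isEmpty() ? ring.firstKey() : tail.firstKey();
            return ring.get(h);
        }

        private static long hash(String s) {
            try {
                byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
                // use the first 4 bytes as an unsigned 32-bit value, i.e. a position modulo 2^32
                return ((long) (d[0] & 0xFF) << 24) | ((d[1] & 0xFF) << 16) | ((d[2] & 0xFF) << 8) | (d[3] & 0xFF);
            } catch (NoSuchAlgorithmException e) {
                throw new IllegalStateException(e);
            }
        }

        public static void main(String[] args) {
            ConsistentHashRing ring = new ConsistentHashRing(3);
            ring.addNode("NodeA");
            ring.addNode("NodeB");
            System.out.println("user:1001 -> " + ring.getNode("user:1001"));
        }
    }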

Conclusion

This article took quite a while to prepare (read: I kept procrastinating), because some things always felt too uncertain to write down. I almost shelved it, and even now that it is published I still feel there are many places that need changing. If you think something is wrong, let me know in the comments or message me privately... Well, I don't want you to think; I want me to think.

The pictures in this article are from the Internet.

Welcome to visit my personal Blog: Object’s Blog