Git: github.com/SoWhat1412/…

1. Basic types and underlying implementations

1.1, String

USES:

It suits simple key-value storage, distributed locks built with SETNX key value, counters (increments are atomic), and distributed globally unique IDs.

Bottom layer: C represents a string as a char[] array, and in the Redis source char[] is wrapped in an SDS (Simple Dynamic String), the smallest storage unit in Redis. One SDS can store up to 512MB.

struct sdshdr {
  unsigned int len;  // length of the string stored in buf
  unsigned int free; // number of unused bytes in buf
  char buf[];        // the slots that hold the actual characters
};

Redis wraps the SDS in a RedisObject, which has two core jobs:

  1. Record which of the five types the value is.
  2. Hold a pointer to the SDS.

When you run SET name soWHAT, Redis actually creates two RedisObject instances, one for the key and one for the value, with type = REDIS_STRING; their SDSes store the strings name and soWHAT respectively.

And Redis applies two optimizations to SDS:

  1. Space preallocation: when an SDS grows after a modification, extra space is reserved. If the new length is under 1MB the buffer is doubled; above 1MB only an extra 1MB is added.
  2. Lazy space release: when the string shrinks, the freed bytes are not returned to the allocator; they are recorded in free and reused on the next growth, with no new allocation needed.
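As a rough illustration, the preallocation rule can be sketched in Java (a simplified model; the names are illustrative and not the actual sds.c API):

```java
// Sketch of SDS space preallocation, modeled loosely on sdsMakeRoomFor.
public class SdsAlloc {
    static final int ONE_MB = 1024 * 1024;

    // Returns the capacity reserved for a string that now needs newLen bytes:
    // below 1MB the buffer is doubled; above 1MB only 1MB extra is reserved.
    public static int preallocate(int newLen) {
        if (newLen < ONE_MB) {
            return newLen * 2;      // double the needed length
        }
        return newLen + ONE_MB;     // add a fixed 1MB of free space
    }
}
```

The point of the rule is amortized growth: repeated small appends rarely trigger a new allocation, while huge strings do not over-allocate.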

1.2, List

A look at the underlying source in adlist.h shows that the implementation is a doubly linked list with a maximum length of 2^32 - 1. These are the usual command combinations:

LPUSH + LPOP = stack (LIFO)
LPUSH + RPOP = queue (FIFO)
LPUSH + LTRIM = capped collection
LPUSH + BRPOP = message queue

Lists work as a simple message queue, and when the data volume is small Redis may switch to the compact ziplist encoding to improve performance. For serious messaging, reach for RabbitMQ, ActiveMQ, and the like.
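The command pairs above behave like the two ends of a deque; a quick Java sketch of the same idea, with ArrayDeque standing in for the Redis list (LPUSH = addFirst, LPOP = pollFirst, RPOP = pollLast):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// The Redis list command pairs mapped onto a Java Deque.
public class ListCombos {
    // LPUSH + LPOP = stack: the last element pushed comes out first (LIFO)
    public static String stackPop(String first, String second) {
        Deque<String> list = new ArrayDeque<>();
        list.addFirst(first);    // LPUSH first
        list.addFirst(second);   // LPUSH second
        return list.pollFirst(); // LPOP
    }

    // LPUSH + RPOP = queue: the first element pushed comes out first (FIFO)
    public static String queuePop(String first, String second) {
        Deque<String> list = new ArrayDeque<>();
        list.addFirst(first);    // LPUSH first
        list.addFirst(second);   // LPUSH second
        return list.pollLast();  // RPOP
    }
}
```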

1.3, Hash

Hashes are great for storing related data together, such as a user’s shopping cart; they come up constantly in everyday use.

Just to be clear: Redis has only one K and one V. K is definitely a String object, and V can be a String, List, Hash, Set, or ZSet.

The underlying hash implementation is the dict structure, built up layer by layer. From smallest to largest:

1.3.1, dictEntry

The actual data node, holding the key, the value, and a next pointer.

1.3.2, dictht

A hash table. Its fields:

1. table: the dictEntry array; each slot may head a linked list.
2. size: the length of the array.
3. sizemask: size - 1, used to mask a hash value into an array index.
4. used: the total number of nodes currently stored in the table.

1.3.3, dict

The dictionary itself. Its fields:

1. type: a dictType holding custom functions, so that arbitrary keys and values can be stored.
2. ht[2]: an array of two hash tables.
3. rehashidx: a marker; -1 means no rehash is in progress, otherwise it records how far the migration has advanced.
4. iterators: the number of iterators currently running on this dictionary.

The composition is as follows:

1.3.4 Gradual expansion

Why does dictht ht[2] hold two tables? So that during expansion data can be moved from ht[0] to ht[1] slowly without blocking front-end CRUD. Meanwhile rehashidx records the migration progress, and once every entry has moved, ht[1] takes over as ht[0].

rehashidx = -1 means no expansion is in progress; rehashidx != -1 records the index of the bucket currently being migrated.

The expanded array size is the smallest power of two greater than used * 2, just like HashMap. Then rehashidx advances bucket by bucket: for each dictEntry[i], the linked list is walked and every entry is rehashed into dictht[1]. Meanwhile dictht[0].used and dictht[1].used change dynamically.

The linchpin of the whole process is rehashidx, the index of the bucket in the first array currently being moved. If memory is tight or the operating system is busy, the expansion can pause at any time.

What does operating on the dict look like while the migration is paused?

1. Inserts go straight into the second array, because anything added to the first would just have to be moved later; there is no need to waste the time.
2. For deletes, updates, and queries, the first array is checked first, then the second.
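A minimal Java sketch of progressive rehashing under those rules (simplified: the real dict.c also handles shrinking, hash seeds, and safe iterators; names here are illustrative):

```java
import java.util.LinkedList;

// Toy progressive-rehash dict: two tables, rehashidx, one bucket moved per op.
public class ProgressiveDict {
    LinkedList<String>[] ht0, ht1;   // ht[0] and ht[1]
    int rehashidx = -1;              // -1: no rehash in progress

    @SuppressWarnings("unchecked")
    public ProgressiveDict(int size) {
        ht0 = new LinkedList[size];
    }

    @SuppressWarnings("unchecked")
    public void startRehash() {
        ht1 = new LinkedList[ht0.length * 2]; // grow to the next power of two
        rehashidx = 0;
    }

    // Move one bucket from ht0 to ht1; piggybacked on every operation.
    void rehashStep() {
        if (rehashidx == -1) return;
        LinkedList<String> bucket = ht0[rehashidx];
        if (bucket != null) {
            for (String k : bucket) put(ht1, k); // remap each entry into ht1
            ht0[rehashidx] = null;
        }
        rehashidx++;
        if (rehashidx == ht0.length) { // every bucket moved: ht1 becomes ht0
            ht0 = ht1;
            ht1 = null;
            rehashidx = -1;
        }
    }

    private static void put(LinkedList<String>[] t, String key) {
        int idx = Math.abs(key.hashCode()) % t.length;
        if (t[idx] == null) t[idx] = new LinkedList<>();
        t[idx].add(key);
    }

    // New keys go straight into ht1 while a rehash is running.
    public void add(String key) {
        rehashStep();
        put(rehashidx == -1 ? ht0 : ht1, key);
    }

    // Reads check the first table, then the second.
    public boolean contains(String key) {
        return inTable(ht0, key) || inTable(ht1, key);
    }

    private static boolean inTable(LinkedList<String>[] t, String key) {
        if (t == null) return false;
        int idx = Math.abs(key.hashCode()) % t.length;
        return t[idx] != null && t[idx].contains(key);
    }
}
```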

1.4, Set

If you know that Java’s HashSet is a cut-down HashMap, this Set is the same idea: think of it as a dict with no values. A look at the source in t_set.c shows the essence.

int setTypeAdd(robj *subject, robj *value) {
    long long llval;
    if (subject->encoding == REDIS_ENCODING_HT) {
        /* the member is stored as the dict key; the dict value is NULL */
        if (dictAdd(subject->ptr,value,NULL) == DICT_OK) {
            incrRefCount(value);
            return 1;
        }
    }
    ...
}

1.5, ZSet

Range queries call for an ordered collection. Looking at the underlying redis.h shows that ZSet uses a skip list, comparable to a binary search tree, to stay ordered. A skip list is a stack of linked lists: it is split into levels, and each upper level acts as an index over the data below, existing purely to speed up searches.

Each level is sorted, each level’s data is a subset of the level below it, and the first level (level 1) contains all the data. The higher the level, the bigger the jumps and the fewer the nodes. Whether a newly inserted node also becomes an index node at higher levels is completely random, like rolling dice.

A skip list has a header node, and lookups proceed from top to bottom and left to right. Take a node with value 37 as an example to compare a skip list against an ordinary linked list.

  1. Without the skip-list index, querying 37 means walking the base list node by node; the original figure traced this long route.
  2. With the skip-list index, the search starts at the top level and drops down, skipping most nodes; the figure traced this much shorter route.

Application scenarios:

Leaderboards, time sorted news, delay queues.
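The layered search described above can be sketched in Java (a simplified model: the real zskiplist in the Redis source also tracks scores, spans, and backward pointers; this toy version stores plain ints):

```java
import java.util.Random;

// Toy skip list: top-down, left-right search and dice-roll level assignment.
public class SkipList {
    static final int MAX_LEVEL = 8;
    static final Random RND = new Random(42); // fixed seed for reproducibility

    static class Node {
        int value;
        Node[] forward = new Node[MAX_LEVEL];
        Node(int v) { value = v; }
    }

    Node head = new Node(Integer.MIN_VALUE);
    int level = 1;

    // Like rolling dice: each extra level is kept with probability 1/2.
    static int randomLevel() {
        int lvl = 1;
        while (lvl < MAX_LEVEL && RND.nextBoolean()) lvl++;
        return lvl;
    }

    public void insert(int v) {
        Node[] update = new Node[MAX_LEVEL];
        Node x = head;
        for (int i = level - 1; i >= 0; i--) {          // top to bottom
            while (x.forward[i] != null && x.forward[i].value < v) {
                x = x.forward[i];                        // left to right
            }
            update[i] = x;
        }
        int lvl = randomLevel();
        if (lvl > level) {
            for (int i = level; i < lvl; i++) update[i] = head;
            level = lvl;
        }
        Node n = new Node(v);
        for (int i = 0; i < lvl; i++) {                 // splice into each level
            n.forward[i] = update[i].forward[i];
            update[i].forward[i] = n;
        }
    }

    public boolean contains(int v) {
        Node x = head;
        for (int i = level - 1; i >= 0; i--) {
            while (x.forward[i] != null && x.forward[i].value < v) x = x.forward[i];
        }
        x = x.forward[0];
        return x != null && x.value == v;
    }
}
```

Because every node lives in level 1, correctness never depends on the dice rolls; the upper levels only change how many nodes a search can skip.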

1.6, Redis Geo

I wrote about this earlier in "Redis Geo core principle analysis"; jump there if you want the details. The core idea is to treat the earth approximately as a sphere; GEO then uses GeoHash to fold two-dimensional latitude/longitude into a string, which enables position bucketing and query-by-distance.

1.7, HyperLogLog

HyperLogLog is a probabilistic data structure that uses a probabilistic algorithm to count the approximate cardinality of a set. The algorithm rests on the Bernoulli process plus buckets plus the harmonic mean; see a dedicated HyperLogLog explainer for the full derivation.

Use: excellent for error-tolerant cardinality statistics (cardinality being the number of distinct values in a set). Each HyperLogLog key can count close to 2^64 distinct elements while occupying only 12KB, with a standard error of about 0.81%, which makes it a good fit for UV statistics.

Under the hood, HyperLogLog uses 2^14 = 16,384 buckets. Each bucket is a 6-bit register that keeps the maximum "position of the first set bit" observed among the hashes routed to it, and packing the registers this tightly is what keeps the structure small: 16384 * 6 / 8 / 1024 = 12KB.
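The bucket arithmetic can be sketched in Java (illustrative only: the real hyperloglog.c adds sparse/dense encodings and bias correction):

```java
// Toy model of HyperLogLog's bucket layout.
public class HllBuckets {
    static final int BUCKETS = 1 << 14;     // 2^14 = 16384 buckets
    byte[] registers = new byte[BUCKETS];   // each register only needs 6 bits

    // The low 14 bits of the hash pick the bucket; the remaining bits feed the
    // "coin flip" experiment, and the register keeps the maximum rank seen.
    public void add(long hash) {
        int bucket = (int) (hash & (BUCKETS - 1));
        long rest = hash >>> 14;
        int rank = Long.numberOfTrailingZeros(rest) + 1; // 1-based first set bit
        if (rank > registers[bucket]) registers[bucket] = (byte) rank;
    }

    // 16384 registers * 6 bits / 8 bits-per-byte / 1024 bytes-per-KB = 12KB
    public static int sizeKb() {
        return BUCKETS * 6 / 8 / 1024;
    }
}
```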

1.8, BitMap

The original meaning of a BitMap is to map the state of an element in a single bit. Since a bit can only represent 0 and 1 states, bitmaps can map only a limited number of states, but the advantage of using bits is that they can save a lot of memory space.

In Redis, BitMap is built on top of the string type. Picture it as an array of bits whose cells store only 0 or 1; the array index is called the offset in Bitmaps, and the maximum offset is 2^32 - 1.

  1. User sign in

Key = year:userId, offset = which day of the year today is.

  2. Counting active users

Use the date as the key and the user ID as the offset; set the bit to 1 for active, 0 otherwise.
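Both patterns can be sketched with Java's BitSet standing in for SETBIT, BITCOUNT, and BITOP AND (the key names and user IDs here are made up):

```java
import java.util.BitSet;

// Date-keyed bitmaps with user IDs as bit offsets.
public class ActiveUsers {
    // SETBIT <date> <id> 1 for each active user that day
    public static BitSet day(int... userIds) {
        BitSet bits = new BitSet();
        for (int id : userIds) bits.set(id);
        return bits;
    }

    // BITCOUNT <date>: how many users were active that day
    public static int activeCount(int... userIds) {
        return day(userIds).cardinality();
    }

    // BITOP AND dest day1 day2: users active on both days
    public static int activeOnBothDays(int[] day1Ids, int[] day2Ids) {
        BitSet both = day(day1Ids);
        both.and(day(day2Ids));
        return both.cardinality();
    }
}
```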

PS: Redis uses RESP(Redis Serialization Protocol), an application layer Protocol based on TCP.

1.9, Bloom Filter

A Bloom filter’s verdict: what it reports as absent is definitely absent; what it reports as present may still be absent.

Principle of Bloom filter:

When an element is added to the set, K hash functions map it to K points in a bit array (several independent functions keep the collision probability down) and set them to 1. On lookup, if any of those points is 0, the element is definitely absent; if all of them are 1, the element is most likely present. That is the basic idea of a Bloom filter.
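A toy Bloom filter along those lines (double hashing derives the K positions here, which is one common implementation choice, not the only one):

```java
import java.util.BitSet;

// Minimal Bloom filter: k derived hash positions over a bit array.
public class SimpleBloom {
    final BitSet bits;
    final int size, k;

    public SimpleBloom(int size, int k) {
        this.bits = new BitSet(size);
        this.size = size;
        this.k = k;
    }

    // Double hashing: position i = h1 + i * h2, folded into the array.
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9;
        return Math.floorMod(h1 + i * h2, size);
    }

    public void add(String key) {
        for (int i = 0; i < k; i++) bits.set(position(key, i)); // set k bits to 1
    }

    // false => definitely absent; true => probably present
    public boolean mightContain(String key) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(position(key, i))) return false; // any 0 bit proves absence
        }
        return true;
    }
}
```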

If you want to play with one, Google’s guava package has a ready-made implementation.

1.10 Publishing and Subscription

Redis provides a publish/subscribe messaging mechanism: subscribers never talk to publishers directly; instead, publishers push messages to a named channel, and every client subscribed to that channel receives them. Compared with a professional MQ (RabbitMQ, RocketMQ, ActiveMQ, Kafka), though, this feature is pretty feeble.

2. Persistence

Because Redis keeps its data in memory, a power loss wipes it, so persisting to disk is a must. Redis provides two modes: RDB and AOF.

2.1, RDB

The RDB persistence mechanism snapshots the data in Redis periodically; it is better suited to cold backups. Advantages:

1. A compressed binary file, well suited to backups, full replication, and disaster recovery; loading an RDB restores data much faster than AOF, which suits large-scale data recovery.

2. RDB is a good choice if the business does not require high data integrity and consistency. Data recovery is faster than AOF.

Disadvantages:

1. An RDB is a periodic snapshot file, so data integrity and consistency are weak: a crash loses everything written since the last backup.

2. The backup costs memory: Redis forks a child process during the backup and writes the data to a temporary file (at that point memory usage can approach double the original), then replaces the previous backup file with the temporary one. So budget roughly 2x headroom for data expansion.

Notes on manual triggering and copy-on-write (COW):

1. SAVE calls rdbSave directly, blocking the Redis main process and stalling the service.
2. BGSAVE forks a child process that calls rdbSave and signals the main process when the save completes; client requests keep being served while BGSAVE runs.
3. Copy On Write means the fork does not copy all memory up front: only the memory pages modified during the backup are duplicated.

4. If the parent and child perform a large number of writes during the Copy On Write window, many page faults and page copies occur.

2.2, AOF

The AOF mechanism logs every write command, appending it to a log file in append-only mode. Because it only appends, there is no disk-seek overhead, so it is fast, a bit like MySQL’s binlog. AOF is better suited to hot backups.

Advantages:

With the everysec policy, a background thread fsyncs once per second, so at most about one second of data is at risk.

Disadvantages:

1. For the same dataset, AOF files are usually larger than RDB files, and RDB restores large datasets faster than AOF.

2. According to different synchronization strategies, AOF is usually slower than RDB in running efficiency. In summary, the synchronization per second strategy is relatively efficient.

The AOF flow has two steps. Step one is real-time command logging; depending on the fsync level you may lose up to one second of data. Commands are appended to aof_buf first and then synced to the AOF file on disk: syncing every command in real time would drive disk I/O very high and drag down overall performance.

Step two is the AOF rewrite, which shrinks the AOF file. It can be triggered automatically or manually (BGREWRITEAOF) and runs in a forked child process, so the Redis service stays available throughout.

1. During the rewrite the main process keeps responding to commands; to keep the final backup complete it still appends to the old AOF, so no data is lost if the rewrite fails.

2. An extra rewrite buffer is kept for the child process, so writes that arrive during the rewrite also make it into the new file.
3. The rewrite generates commands directly from the current in-memory data; it does not parse the old AOF file or merge commands.
4. Both RDB and AOF write to a temporary file first, then rename it into place to complete the swap.

Suggestions for Fork:

1. Reduce fork frequency, for example by triggering RDB snapshots and AOF rewrites manually.

2. Cap the maximum memory Redis may use, to keep fork times short.
3. Use beefier hardware and configure the Linux memory-allocation policy sensibly, so forks do not fail for lack of physical memory.
4. While BGSAVE or BGREWRITEAOF is running, Redis only expands a hash table once its load factor reaches 5; otherwise the threshold is 1. The goal is to minimize writes during the fork and avoid unnecessary memory copying.
5. Hash table load factor = number of nodes stored / hash table size; it determines when the table is expanded.

2.3, Recovery

On startup Redis checks whether an AOF file exists (its data is more complete); only if there is none does it try to load the RDB.

2.4, Advice

RDB alone loses too much data, and AOF alone recovers more slowly than RDB. So when something goes wrong, load the RDB first for speed, then let the AOF fill in the missing writes.

3. Why is Redis so fast

3.1 Memory based Implementation:

Data is stored in memory, which is hundreds of times faster than disk IO operations.

3.2 Efficient data structure:

Redis’s underlying data structures are tailored to its data types; HyperLogLog, for example, begrudges wasting even 2 bytes.

3.3 Rich and reasonable coding:

Redis provides rich and reasonable encoding, with five data types adapted to different encoding formats according to length and the number of elements.

String: integers are stored with the int encoding; short strings use embstr and longer ones use raw.

List: ziplist encoding when the strings are short and the elements few, otherwise linkedlist encoding.
Hash: ziplist when every key and value string is short and the number of key-value pairs is small, otherwise hashtable.
Set: intset when all elements are integers and their count is below a threshold, otherwise hashtable.
ZSet: ziplist when the number of elements and the member lengths stay below fixed thresholds, otherwise skiplist.

3.4 Appropriate threading model:

The I/O multiplexing model lets one thread listen on many client connections; multithreading would require context switching, which is costly for an in-memory database.

3.5 Multithreading introduced in Redis 6.0:

The bottleneck of Redis mainly lies in the IO consumption of the network. Optimization mainly has two directions:

1. Improve raw network I/O performance; a typical approach is replacing the kernel network stack with DPDK.

2. Use multiple threads to exploit multiple cores; the typical example is Memcached.

Kernel-stack optimizations like DPDK have little to do with Redis itself, so supporting multithreading is the most effective and convenient route. Redis added multithreading for two main reasons:

It can make full use of server CPU resources. Currently, the main thread can only use one core

Multithreading tasks can share the load of Redis synchronous IO reads and writes

About multithreading:

  1. Redis 6.0 ships with I/O multithreading disabled by default (io-threads-do-reads no).
  2. When enabling multithreading in Redis 6.0, set the thread count carefully.
  3. Multithreading can roughly double throughput, but the threads only handle network reads/writes and protocol parsing; commands still execute one at a time on a single thread.

4. Frequently asked Questions

4.1 cache avalanche

Avalanche definition:

A large number of keys in Redis expire at the same moment, so every request falls through to MySQL, which cannot take the load and collapses.

Avalanche solutions:

1. Add a random value to each key’s expiration time so that masses of keys do not expire at the same moment.
2. If the cache is distributed, spread hotspot data evenly across the cache nodes.
3. Mark hotspot data as never expiring.
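Point 1 is a one-liner; a sketch (the TTL numbers are arbitrary):

```java
import java.util.concurrent.ThreadLocalRandom;

// Spread cache expirations by adding random jitter to a base TTL.
public class TtlJitter {
    public static long withJitter(long baseTtlSeconds, long maxJitterSeconds) {
        // e.g. EXPIRE key withJitter(3600, 300): keys cached together
        // no longer all die in the same second
        return baseTtlSeconds + ThreadLocalRandom.current().nextLong(maxJitterSeconds + 1);
    }
}
```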

4.2. Cache penetration

Penetration definition:

Cache penetration is a request for data that exists in neither the cache nor the database. For example, if real IDs are always > 0 and a hacker keeps requesting data with id = -12, every request reaches the database; under enough pressure the database crashes.

Penetrating solutions:

1. Add authentication and parameter validation at the back-end API layer.
2. If a single IP exceeds the per-second access threshold, block the IP outright and lock it in the little black room for a day (I have been on the receiving end of this while harvesting IP proxy pools).
3. When a key misses both the cache and the database, cache it as key -> null with a short expiry, say 15 seconds, to blunt malicious attacks.
4. The Bloom filter capability around Redis also works well here.

4.3 Cache breakdown

Breakdown definition:

Symptom: a hot key is being hit by heavy concurrency; the moment the key expires, the continuing flood of requests punches through the cache and lands directly on the database.

Breakdown solution:

Set the hotspot data to never expire, or add a mutex so that only one request rebuilds the cache; either does the trick.

4.4 Double-write Consistency

Dual-write: How to ensure data consistency when both the cache and the database update data?

1. Update the database first, then update the cache

Safety: thread A updates the database -> thread B updates the database -> thread B updates the cache -> thread A updates the cache, leaving stale data in the cache (a dirty read). Business fit: in read-heavy, write-light scenarios, updating the cache on every database write is wasted work, and if the cached value must be computed from several sources, every write wastes even more.

2. Delete the cache first, then update the database

Request A deletes the cache and begins its write; request B misses the cache, reads the old value from the database, and writes it back into the cache; once A’s database write lands, the cache and database disagree.

Therefore, Facebook proposed the Cache Aside Pattern:

Miss: the application reads the cache; if nothing is there, it reads the database and puts the result into the cache.

Hit: the application reads the data from the cache and returns it. Update: write the data to the database first, and on success invalidate the cache entry.

4.5, Split brain

Split brain means that, due to network faults, the master, the slaves, and the sentinel cluster end up in different network partitions; the sentinels cannot perceive the master, so they promote a slave to master, leaving two masters, like one brain split in two. Clusters such as Hadoop and Spark meet the same problem, just with different remedies (ZK with a forced kill).

With the cluster split, if clients keep writing to the original master, that data can never reach the new master. Once the network recovers, the sentinel cluster demotes the original master to a slave; it then does a full resync from the new master, and the writes it accepted during the split are lost.

Redis’s mitigation is two parameters in the configuration file:

min-replicas-to-write 3  # the master accepts writes only with at least 3 connected slaves
min-replicas-max-lag 10  # maximum replication lag, in seconds, a slave may have

If the number of slaves connected to the master falls below the first parameter, or their replication lag exceeds the second, the master rejects write requests. With both configured, a split-brained original master refuses client writes, which greatly reduces the data lost when it later resyncs.

4.6, Transactions

Transactions in MySQL have a lot of moving parts; a Redis transaction is just three steps: MULTI, queue the commands, EXEC. The key conclusions:

1. A Redis transaction is a one-shot, ordered, exclusive execution of a series of commands in a queue.

2. Redis transactions have no concept of isolation levels: commands are queued until EXEC is sent and not actually executed, so a query inside the transaction cannot see the transaction’s pending updates, and queries outside cannot see them either.
3. Redis does not guarantee atomicity: individual commands are atomic, but the transaction as a whole is not.
4. A compile error (a malformed command) aborts the whole transaction and nothing runs; a runtime error on a syntactically valid command fails only that command, and the rest execute normally.
5. WATCH acts like an optimistic lock: if any watched key has been changed by another client when EXEC runs, the queued transaction is discarded.

4.7 Common clients

  1. Jedis: Classic tool that provides comprehensive Redis operation instructions.
  2. Redisson: Provides distributed operations and extensible Java data structures with distributed locks and collections.
  3. Lettuce: built on Netty, asynchronous at the bottom; worth a try.

4.8 Cache Preheating and Degradation

Cache preheating:

After the system goes online, load relevant cache data into the cache system to prevent MySQL from being under too much pressure.

Cache degradation:

When the cache fails or becomes unreachable, degrade gracefully: do not crush the database; return a default value or serve a preset fallback.

4.9 correct development steps

Before launch: Redis high availability via master/slave + Sentinel or Redis Cluster, to avoid a total crash.

At launch: local EhCache caching plus Hystrix rate limiting and degradation, to keep MySQL from being overwhelmed. After launch: Redis persistence with RDB + AOF, so that after a restart data loads automatically from disk and the cache recovers quickly.

5. Distributed locks

In daily development we can use synchronized or Lock for concurrent programming, but Java locks are only guaranteed within a single JVM process. What about locks in a distributed cluster? A few options come up day to day.

5.1 Zookeeper implements distributed locks

There are some basic ZooKeeper facts you need to know:

1. Persistent nodes: ZK does not remove persistent nodes when the client disconnects.
2. Sequential nodes: sequential nodes automatically get a numeric suffix such as 0000001.
3. Node change notifications: when a client registers a watch on a node, changes to it trigger the callback.

The general flow is as follows; note that each node only watches the node immediately in front of it, which avoids the herd effect. Template code is a search away. Disadvantages:

Frequent node creation/deletion and watch registration put heavy pressure on the ZooKeeper cluster, so performance trails a Redis-based distributed lock.

5.2 Redis implements distributed locking

The principle is simple: Redis executes commands on a single thread, so they are naturally mutually exclusive. Commands like SETNX and EXISTS are enough for a basic distributed lock, with timeout-release logic layered on top.

SETNX

SETNX is short for SET if Not eXists. Usage: SETNX key value, which returns 1 if the key did not exist (and was set) and 0 if it already exists.

SETEX

SETEX key seconds value sets value on key with an expiry of seconds. If the key already exists, SETEX overwrites the old value, and SETEX is atomic.

Lock:

Locking is usually done with a unique string such as a UUID as the value, combined with SETNX.

Unlock:

Unlocking uses a Lua script to stay atomic. The idea: check whether the key’s value equals the caller’s token; if so, delete the key and return 1 for success, otherwise return 0 for failure.
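The check-then-delete rule the Lua script enforces can be sketched in Java, with a map standing in for Redis (the real atomicity comes from Redis running the Lua script as a single step; key and token names are made up):

```java
import java.util.concurrent.ConcurrentHashMap;

// The unlock rule: delete the key only if it still holds the caller's token.
public class UnlockLogic {
    static final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();

    // Mirrors: if redis.call('get', KEYS[1]) == ARGV[1] then del ... end
    public static int unlock(String key, String token) {
        // atomic compare-and-remove: a stale holder cannot delete a lock
        // that has since been re-acquired by someone else
        return store.remove(key, token) ? 1 : 0;
    }
}
```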

Disadvantages:

This lock is not reentrant, and a hand-rolled version has many corner cases to cover, so understanding the general flow is enough; in real projects, reach for an open-source toolkit.

5.3 Redisson implements distributed locking

Redisson is a Redis-based service built on the Netty NIO framework. It works as a low-level Redis driver, but it also wraps native Redis structures (Hash, List, Set, String, Geo, HyperLogLog, and so on) into the Java types we know best: Map, List, Set, Object Bucket, Geospatial Bucket, a HyperLogLog cardinality estimator, and more.

Here we are just using a few instructions about distributed locking, which is the basic principle:

Redisson’s locking and unlocking follow the general flow chart below:

6. Redis expiration strategy and memory elimination strategy

6.1 Expiration strategy of Redis

There are three expiration policies in Redis:

1. Timed expiration:

A timer is created for every key that has an expiry, and the key is deleted the instant it expires. This clears expired data immediately and is memory friendly; however, the timers consume a lot of CPU, hurting cache response time and throughput.

2. Lazy expiration:

Only when a key is accessed, the system checks whether the key has expired. This strategy maximizes CPU savings but is very memory unfriendly. In extreme cases, a large number of expired keys may not be accessed again and thus will not be cleared, occupying a large amount of memory.

3. Periodic expiration:

At certain intervals, a certain number of keys in the expires dictionary of a certain number of databases are scanned and expired keys are cleared. This strategy is a compromise between the first two. By adjusting the interval of periodic scan and the time limit of each scan, you can achieve the optimal balance of CPU and memory resources in different situations. The Expires dictionary holds expiration data for all keys with an expiration date set, where key is a pointer to a key in the key space and value is the expiration date represented by the millisecond precision UNIX timestamp for that key. The key space refers to all the keys stored in the Redis cluster.

Redis uses lazy deletion + periodic deletion. Memcached uses lazy deletion only.
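A toy model of lazy + periodic deletion (simplified: Redis bounds each periodic pass by time and samples random keys rather than scanning everything):

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Lazy expiration on access plus a periodic sweep.
public class ExpiringStore {
    static class Entry { String value; long expireAt; }
    final Map<String, Entry> dict = new HashMap<>();

    public void set(String key, String value, long expireAtMillis) {
        Entry e = new Entry();
        e.value = value;
        e.expireAt = expireAtMillis;
        dict.put(key, e);
    }

    // Lazy expiration: the key is checked (and dropped) only when accessed.
    public String get(String key, long now) {
        Entry e = dict.get(key);
        if (e == null) return null;
        if (e.expireAt <= now) { dict.remove(key); return null; }
        return e.value;
    }

    // Periodic expiration: a background pass sweeps out expired keys,
    // catching the ones that are never accessed again.
    public void periodicSweep(long now) {
        Iterator<Entry> it = dict.values().iterator();
        while (it.hasNext()) {
            if (it.next().expireAt <= now) it.remove();
        }
    }
}
```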

6.2 six memory elimination strategies

Redis’s memory eviction policy decides what to do when memory is insufficient for the cache and incoming writes need extra space.

1. volatile-lru: evict the least recently used key from the set with an expiry (server.db[i].expires).
2. volatile-ttl: evict the key closest to expiring from the set with an expiry (server.db[i].expires).
3. volatile-random: evict a random key from the set with an expiry (server.db[i].expires).
4. allkeys-lru: evict the least recently used key from the whole keyspace (server.db[i].dict).
5. allkeys-random: evict a random key from the whole keyspace (server.db[i].dict).
6. noeviction: evict nothing; return an error on writes that need memory (the default).

Interviews keep coming back to LRU. The familiar LinkedHashMap already implements the LRU algorithm; a cache built on it looks like this:

import java.util.LinkedHashMap;
import java.util.Map;

class SelfLRUCache<K, V> extends LinkedHashMap<K, V> {
    private final int CACHE_SIZE;

    /**
     * @param cacheSize the maximum number of entries to cache
     */
    public SelfLRUCache(int cacheSize) {
        // true = access order: recently used entries move to the tail,
        // so the eldest entry (the head) is the least recently used one.
        super((int) Math.ceil(cacheSize / 0.75) + 1, 0.75f, true);
        CACHE_SIZE = cacheSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // once the map grows past the cache size, drop the eldest entry
        return size() > CACHE_SIZE;
    }
}

6.3, Summary

The chosen memory eviction policy does not affect expired-key processing: the eviction policy handles writes that need space when memory runs out, while the expiration policy handles cached data whose time has run out.

7. Redis cluster is highly available

Single machine problems include machine failures, capacity bottlenecks, and QPS bottlenecks. In practical applications, Redis multi-machine deployment involves Redis master-slave replication, Sentinel Sentinel mode and Redis Cluster.

Mode | Advantages | Disadvantages
Standalone | simple architecture, easy deployment | machine failure, capacity bottleneck, QPS bottleneck
Master/slave replication | high reliability, read/write splitting | failure recovery is complicated; the master's writes and storage are capped by a single machine
Sentinel | simple cluster deployment, highly available | failover is cumbersome and wastes resources; does not solve read/write splitting by itself
Redis Cluster | dynamic slot-based data sharding, scalable, highly available | clients must track back-end changes dynamically; batch operations are limited

7.1 Redis master/slave Replication

This mode provides high availability and read/write separation. Incremental synchronization and full synchronization are adopted.

7.1.1. Full Synchronization

Redis full replication generally happens during the slave initialization phase, when the slave needs to make a copy of all of the master’s data:

1. The slave connects to the master and sends the psync command.
2. On receiving psync, the master runs bgsave to generate an RDB file, buffering all write commands that arrive in the meantime.
3. The master sends the snapshot file to the slave, still recording the write commands executed during the transfer.
4. After receiving the snapshot, the slave discards all old data and loads it.
5. Once the snapshot is sent, the master forwards the buffered write commands to the slave.
6. The slave finishes loading the snapshot, begins accepting requests, and executes the write commands from the master’s buffer.

7.1.2 Incremental Synchronization

Also called command synchronization: the slave replays the commands the master executes. Redis keeps these commands in a ring buffer; because memory is finite, if a standby never comes back up to synchronize, the commands in the buffer may be overwritten.

Redis incremental replication refers to the process of synchronizing write operations performed by the master to the Slave when the Slave starts to work properly after initialization. The process of incremental replication is that each write command executed by the master sends the same write command to the slave.

7.1.3 Redis Master/slave Synchronization Policy:

1. On first connection the master and slave do a full sync; once it completes, incremental sync takes over, though a slave can request a full sync again at any time. Redis always tries incremental sync first and falls back to requiring a full sync if that fails.
2. Do not worry if the slave loses its connection while syncing; after reconnecting it resumes synchronization.
3. Master/slave generally provides read/write splitting, but if the master fails, how do we keep Redis highly available? That is why Sentinel was introduced for master election.

7.2 Highly available Sentinel mode

Redis-Sentinel itself runs as an independent process. A sentinel cluster generally has at least three nodes, in an odd number, and can monitor multiple master/slave clusters: any number of master servers and the slaves beneath them, automatically failing over when a monitored master goes offline. Note that a single sentinel would itself be a single point of failure, which is why a cluster is used. Sentinel's main duties:

Cluster monitoring: periodically monitors the master and slave nodes. Message notification: when a Redis instance fails, sends a message to notify the administrator. Failover: failure detection is divided into subjective offline (a single sentinel detects a master failure) and objective offline (multiple sentinels decide together; the switchover begins once the quorum is reached). Configuration center: after a failover occurs, tells clients the new master's address by writing it to the configuration center.
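For reference, a minimal sentinel.conf illustrating the duties above (the master name, address, and timeout values here are placeholders for illustration, not from the original article):

```
# Monitor the master named "mymaster"; 2 is the quorum: the number of
# sentinels that must agree before the master is marked objectively offline
sentinel monitor mymaster 127.0.0.1 6379 2
# Mark the master subjectively offline after 30s without a valid reply
sentinel down-after-milliseconds mymaster 30000
# Give up on a failover attempt after 3 minutes
sentinel failover-timeout mymaster 180000
# After failover, re-sync slaves to the new master one at a time
sentinel parallel-syncs mymaster 1
```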

7.3, Redis Cluster

RedisCluster is Redis's distributed solution, introduced in version 3.0, which effectively addresses the need for a distributed Redis.

7.3.1 Partitioning Rules

Common partitioning rules

  1. Modulo by node count: hash(key) % N
  2. Consistent hashing: a consistent hash ring
  3. Virtual slot hashing: slot = CRC16(key) & 16383
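The virtual-slot rule can be sketched in a few lines of Python. Redis Cluster uses the CRC-16/XMODEM variant (polynomial 0x1021, initial value 0) over the key and keeps the low 14 bits, giving 16384 slots. This is a minimal re-implementation for illustration, not the cluster's actual C code:

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC-16/XMODEM: polynomial 0x1021, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: bytes) -> int:
    """Map a key to one of the 16384 virtual slots: CRC16(key) & 16383."""
    return crc16_xmodem(key) & 16383

print(key_slot(b"user:1000"))  # some slot in [0, 16383]
```

The real implementation additionally supports hash tags: if a key contains `{...}`, only the part inside the braces is hashed, so related keys can be forced into the same slot.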

RedisCluster uses virtual slot partitioning. The implementation details are as follows:

1. It adopts a decentralized design: virtual slot partitioning covers all nodes, and reads follow the same routing process. Nodes communicate over a lightweight protocol, reducing bandwidth consumption, so performance is very high.

2. Automatic load balancing and high availability, automatic failover, and dynamic scaling are supported; official tests have scaled to 1,000 nodes with low complexity. 3. Each master should also be configured with slaves; internally a sentinel-like mode is adopted: if more than half of the nodes find a node abnormal, they jointly decide to change that node's state. 4. If a failed master in the cluster has no slave node, the cluster fails because the slot mapping is incomplete; if more than half of the masters in a cluster fail, the cluster enters the fail state. 5. It is recommended that at least three master nodes be deployed in a cluster.

8. Redis current limiting

A situation you often run into taking the subway at Beijing West or at Beijing West Railway Station: when there are too many people, the staff hold up a small sign at the front to make part of the crowd wait for a later entry. This is a real-life measure for dealing with a huge flow of people.

When developing high-concurrency systems, there are three powerful tools to protect the system: caching, degradation, and rate limiting. So what is rate limiting? As the name implies, it limits traffic, just as your broadband plan has 1 GB of data and stops when it is used up. Through rate limiting we can control the system's QPS well, thereby protecting the system.

1. Setnx and Zset based on Redis

1.2, setnx

For example, if we need to limit requests to 20 within 10 seconds, we can create a counter key with setnx and an expiration time of 10 seconds; each request increments the counter, and once it reaches 20 within the window, further requests are rejected, achieving the rate-limiting effect.
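A minimal in-memory sketch of this fixed-window counter (the class name and time handling are my own; in Redis this would be an `INCR` on a key created with `SETNX` and a 10-second `EXPIRE`):

```python
import time

class FixedWindowLimiter:
    """Fixed-window limiter: at most `limit` requests per `window` seconds.
    In-memory stand-in for the Redis pattern SETNX/INCR + EXPIRE."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.count = 0
        self.window_start = None  # None = the "key" has expired

    def allow(self, now: float = None) -> bool:
        now = time.time() if now is None else now
        # Key expired: a new window starts (SETNX would succeed again)
        if self.window_start is None or now - self.window_start >= self.window:
            self.window_start = now
            self.count = 0
        self.count += 1  # INCR
        return self.count <= self.limit

limiter = FixedWindowLimiter(limit=20, window=10)
results = [limiter.allow(now=0.0) for _ in range(25)]  # 20 allowed, then 5 rejected
```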

Disadvantages: the window is fixed, so the count for seconds 1–10 says nothing about seconds 2–11. And if we need to count M requests within any N-second span this way, we would need to keep N keys in Redis, among other problems.

1.3, zset

In fact, the most important concept in rate limiting is the sliding window: as mentioned above, the window 1–10 becomes 2–11 by adding 1 to both the start and the end. We can record requests in a zset, where each incoming request gets a unique member value (a UUID will do) and a score equal to the current timestamp, because the score is what lets us count how many requests fall within a given time span. The zset also provides range commands that make it easy to count the requests between two timestamps.
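The zset approach can be sketched in memory, with a sorted list of timestamps standing in for the zset (names are mine; in Redis the steps would be ZADD, ZREMRANGEBYSCORE, and a cardinality check, ideally wrapped in a Lua script for atomicity):

```python
import bisect

class SlidingWindowLimiter:
    """Sliding-window limiter: at most `limit` requests in the last
    `window` seconds. In-memory stand-in for the Redis zset pattern:
    ZREMRANGEBYSCORE key 0 <now - window>, ZCARD key, ZADD key <now> <uuid>."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.scores = []  # sorted timestamps, like zset scores

    def allow(self, now: float) -> bool:
        # Drop entries older than the window (ZREMRANGEBYSCORE)
        cutoff = now - self.window
        del self.scores[:bisect.bisect_right(self.scores, cutoff)]
        if len(self.scores) >= self.limit:  # ZCARD check
            return False
        bisect.insort(self.scores, now)     # ZADD
        return True
```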

Disadvantage: the zset data structure keeps growing and consumes more and more memory.

2. Leaky bucket algorithm

The idea of the leaky bucket algorithm: water represents requests and the leaky bucket represents the system's processing capacity. Water first enters the bucket, and the water in the bucket flows out at a fixed rate; when it flows in faster than it flows out, the bucket's limited capacity eventually fills and the excess water spills over (requests are rejected), thus achieving rate limiting.
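A minimal leaky-bucket sketch (naming is mine; the drain is computed lazily on each request rather than by a background thread):

```python
class LeakyBucket:
    """Leaky-bucket limiter: requests fill the bucket, which drains at a
    fixed `leak_rate` (requests/second); overflow is rejected."""

    def __init__(self, capacity: float, leak_rate: float):
        self.capacity = capacity
        self.leak_rate = leak_rate
        self.water = 0.0
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Drain the water that leaked out since the last request
        self.water = max(0.0, self.water - (now - self.last) * self.leak_rate)
        self.last = now
        if self.water + 1 > self.capacity:
            return False  # bucket would overflow: reject the request
        self.water += 1   # the request enters the bucket
        return True
```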

3. Token bucket algorithm

The principle of the token bucket algorithm can be understood through hospital registration: only after taking a number (a token) can you see the doctor.

Detailed process:

1. Every request must obtain an available token before it can be processed.

2. Tokens are added to the bucket at a fixed rate determined by the rate limit. 3. The bucket has a maximum capacity; when it is full, newly added tokens are discarded or rejected. 4. When a request arrives, it must first obtain a token from the token bucket before carrying out the rest of the business logic; after the business logic finishes, the token can be deleted. 5. The token bucket can also have a minimum level: once the number of tokens in the bucket drops to that minimum, tokens are no longer deleted after requests are processed, to ensure sufficient rate limiting.
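Steps 1–4 above can be sketched as follows (a minimal version of my own: the refill is computed lazily instead of by a timer, and the optional minimum level of step 5 is omitted):

```python
class TokenBucket:
    """Token-bucket limiter: tokens accrue at `rate` per second up to
    `capacity`; a request is processed only if it can take a token."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity  # start with a full bucket
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill the tokens accrued since the last call, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens < 1:
            return False  # no token available: reject
        self.tokens -= 1  # take a token
        return True
```

Unlike the leaky bucket, a full token bucket lets a burst of up to `capacity` requests through at once, then smooths traffic to the refill rate.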

Engineering:

1. Custom annotations + AOP + Redis + Lua to implement rate limiting. 2. Guava's RateLimiter implementation is recommended.

9, Common knowledge

  1. Fuzzy key matching with `keys` may block the thread; use the `scan` command instead to fetch data in a non-blocking way, then de-duplicate the results.
  2. For multiple operations, remember to use `pipeline` to send all commands at once, avoiding the network overhead of frequent sends and receives and improving performance.
  3. `redis-cli --bigkeys` can scan for large keys in Redis. It uses the `scan` command to iterate over all keys and, according to each key's type, runs `STRLEN`, `LLEN`, `SCARD`, `HLEN`, or `ZCARD` to obtain the key's length or number of elements. The drawback is that, while it can be used online, a large element count does not necessarily mean a large memory footprint.
  4. For online applications, remember to enable the Redis slow query log; the basic idea is similar to MySQL's.
  5. Redis's memory allocation strategy, combined with additions and deletions of data, causes memory fragmentation. You can either restart the service or enable `activedefrag yes` so Redis rearranges memory to resolve the problem.

A mem_fragmentation_ratio > 1 indicates memory fragmentation; the larger the ratio, the more serious the fragmentation. A ratio < 1 indicates that virtual memory (swap) is being used; virtual memory is actually backed by the hard disk and performs far worse than RAM, so the machine's memory should be increased to improve performance. In general, a mem_fragmentation_ratio between 1 and 1.5 is healthy.

10, The End

That's all the bragging about Redis for now (I originally wanted to cover flash sales too, but was afraid the article would run too long). If you feel it's not enough, watch for the follow-up.

Previous selections:

  1. Sf Express: Please sign for MySQL Soul Ten company
  2. Interview HashMap reading this article is enough
  3. What about Spring loop dependencies on rotten streets
  4. Learn about Synchronized step by step
  5. Quick access to common concurrent containers under JUC
  6. 3W word play with SpringCloud