preface

Hello everyone, I'm "a little boy picking up field snails" (the author's pen name). With the "golden September, silver October" hiring season coming up, I've put together 20 classic Redis interview questions. I hope they help you.

  • WeChat official account: a little boy picking up field snails
  • GitHub address

1. What is Redis? What is it mainly used for?

Redis (Remote Dictionary Server) is an open-source, network-capable, in-memory key-value database with optional persistence and logging, written in ANSI C, with APIs for many languages.

Unlike MySQL, Redis stores its data in memory, so it reads and writes very fast, handling more than 100,000 read/write operations per second. Redis is therefore widely used for caching, and it is also often used for distributed locking. In addition, Redis supports transactions, persistence, Lua scripting, LRU eviction, and a variety of clustering schemes.

2. Talk about the basic data structure types of Redis

As most of you know, there are five basic types of Redis:

  • String
  • Hash
  • List
  • Set
  • Zset (sorted set)

It also has three special data structure types:

  • Geospatial
  • Hyperloglog
  • Bitmap

2.1 Five basic data types of Redis

String

  • Summary: String is the most basic Redis data type. It is binary safe and can store images or serialized objects; a value can be up to 512 MB.
  • Simple usage examples: set key value, get key, etc.
  • Application scenarios: shared sessions, distributed locks, counters, and rate limiting.
  • There are three internal encodings: int (8-byte long integer), embstr (strings of 39 bytes or less), and raw (more than 39 bytes).

In C, a string is implemented as a char[]; Redis instead wraps strings in SDS (Simple Dynamic String). The SDS structure looks like this:

```c
struct sdshdr {
    unsigned int len;   // length of the string stored in buf
    unsigned int free;  // number of unused bytes in buf
    char buf[];         // byte array that holds the string
};
```

SDS structure diagram is as follows:

Why does Redis choose the SDS structure instead of C's native char[]?

For example, with SDS the length of a string can be obtained in O(1) time; with a C string you have to traverse the entire string, which takes O(n).

Hash

  • Summary: in Redis, the hash type means that the value is itself a key-value (field-value) structure
  • Simple usage examples: hset key field value, hget key field
  • Internal encodings: ziplist (compressed list), hashtable
  • Application scenario: caching user information.
  • Note: if you use hgetall and the hash has too many elements, it may block Redis; use hscan instead. If you only need some of the fields, hmget is recommended.

A comparison of the string and hash types looks like the following:

List

  • Summary: the list type stores multiple ordered strings. A list can hold up to 2^32 - 1 elements.
  • Simple usage examples: lpush key value [value ...], lrange key start end
  • Internal encodings: ziplist (compressed list), linkedlist (linked list)
  • Application scenarios: message queues, article lists.

Insert and eject of list type

List Application scenarios refer to the following:

  • lpush + lpop = stack
  • lpush + rpop = queue
  • lpush + ltrim = capped collection
  • lpush + brpop = message queue
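To make the four command pairings concrete, here is a minimal in-memory sketch using Python's deque as a stand-in for a Redis list (real code would issue lpush/lpop/rpop against a Redis client; the helper names are purely illustrative):

```python
from collections import deque

def lpush(d, v):          # LPUSH: insert at the head of the list
    d.appendleft(v)

def lpop(d):              # LPOP: remove from the head
    return d.popleft()

def rpop(d):              # RPOP: remove from the tail
    return d.pop()

# lpush + lpop = stack (LIFO): last element pushed comes out first
s = deque()
lpush(s, 1); lpush(s, 2)
assert lpop(s) == 2

# lpush + rpop = queue (FIFO): first element pushed comes out first
q = deque()
lpush(q, 1); lpush(q, 2)
assert rpop(q) == 1
```

brpop behaves like rpop but blocks until an element arrives, which is what turns the queue pattern into a simple message queue.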

Set

  • Summary: the set type also holds multiple string elements, but does not allow duplicates
  • Simple usage examples: sadd key element [element ...], smembers key
  • Internal encodings: intset (integer set), hashtable
  • Note: smembers, lrange, and hgetall are all heavy commands. With too many elements there is a risk of blocking Redis; use sscan instead.
  • Application scenarios: user tags, random draws for lotteries, social features.

Zset (sorted set)

  • Summary: a sorted collection of non-repeating string elements, each associated with a score used for ordering
  • Simple usage examples: zadd key score member [score member ...], zrank key member
  • Underlying internal encodings: ziplist (compressed list), skiplist
  • Application scenarios: leaderboards, social features (such as user likes).

2.2 Three special data types of Redis

  • Geo: introduced in Redis 3.2. Used to store geographic location information and run queries against it.
  • HyperLogLog: a data structure for cardinality estimation, such as counting a website's unique visitors (UV).
  • Bitmaps: map the state of an element to a single bit. In Redis the underlying implementation is based on strings; a bitmap can be treated as an array of bits.

3. Why is Redis so fast?

3.1 Implementation based on memory storage

We all know that reading and writing memory is much faster than disk. Redis stores its data in memory, so compared with a database like MySQL that keeps data on disk, it avoids the cost of disk I/O.

3.2 Efficient data structures

As we know, MySQL chooses the B+ tree as its index data structure for efficiency. A reasonable choice of data structure is exactly what makes an application faster. Let's look at Redis's data structures and internal encodings:

SDS simple dynamic string

  • String length: Redis obtains a string's length in O(1) time, whereas in C you have to iterate from the start, which is O(n);
  • Space pre-allocation: the more often a string is modified, the more often memory is allocated, which costs performance. When SDS modifies or expands a string, it allocates extra unused space, reducing that cost.
  • Lazy space release: when an SDS string is shortened, the excess memory is not reclaimed; instead, free records the spare space. If the string grows again later, the space recorded in free is used directly, avoiding a new allocation.
  • Binary safety: Redis can store binary data. In C a string ends at the first '\0', whereas SDS uses the len attribute to mark the end of the string.
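The pre-allocation and lazy-free behaviour can be sketched with a toy Python model (this mirrors the ideas above, not the actual Redis source):

```python
class SDS:
    """Toy model of SDS: O(1) length, space pre-allocation, lazy free."""
    def __init__(self, s: bytes = b""):
        self.buf = bytearray(s)   # byte array holding the string (like char buf[])
        self.len = len(s)         # used length -> getting strlen is O(1)
        self.free = 0             # unused bytes at the end of buf

    def append(self, s: bytes):
        needed = self.len + len(s)
        if needed > self.len + self.free:
            # Space pre-allocation: grab extra room so repeated appends
            # don't reallocate every time (Redis doubles small strings).
            new_buf = bytearray(2 * needed)
            new_buf[: self.len] = self.buf[: self.len]
            self.buf = new_buf
        self.buf[self.len:needed] = s
        self.len = needed
        self.free = len(self.buf) - self.len

    def shrink(self, n: int):
        # Lazy space release: shorten the string but keep the memory in `free`.
        self.free += self.len - n
        self.len = n

    def value(self) -> bytes:
        return bytes(self.buf[: self.len])

s = SDS(b"hello")
s.append(b" world")              # triggers pre-allocation
assert s.value() == b"hello world"
assert s.free > 0                # spare room kept for future appends
s.shrink(5)                      # truncation is O(1); memory is not returned
assert s.value() == b"hello"
```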

Dictionary

Redis is a key-value in-memory database, and all key-value pairs are stored in a dictionary. A dictionary is a hash table, like a HashMap, in which a value can be obtained directly from its key. A hash table's defining property is that a value can be obtained in O(1) time.

Skip list

  • The skip list is a data structure Redis uses for sorted sets: it adds multiple levels of indexes on top of a linked list to improve search efficiency.
  • Skip lists support node lookup in O(log N) time on average, O(N) in the worst case, and allow batch processing of nodes through sequential operations.

3.3 Reasonable data encoding

Redis supports multiple data types, and each basic type may be backed by more than one data structure. When to use which data structure, and which encoding, is the result of careful tuning by the Redis designers.

  • String: if a number is stored, the encoding is int; a non-numeric string of 39 bytes or less is embstr; anything larger than 39 bytes uses raw encoding. (Since Redis 3.2 the embstr threshold is 44 bytes.)
  • List: if the list has fewer than 512 elements and every element is smaller than 64 bytes (the defaults), ziplist encoding is used; otherwise linkedlist.
  • Hash: if the hash has fewer than 512 elements and all values are smaller than 64 bytes, ziplist is used; otherwise hashtable.
  • Set: intset encoding is used if all elements in the set are integers and there are fewer than 512 of them; otherwise hashtable.
  • Zset: ziplist encoding is used when the sorted set has fewer than 128 elements and every element is smaller than 64 bytes; otherwise skiplist.

3.4 Reasonable threading model

I/O multiplexing

Redis uses epoll as its I/O multiplexing implementation, which allows a single thread to efficiently handle many connection requests. In addition, Redis's own event-processing model turns epoll connections, reads, writes, and closes into events, so it does not waste time on network I/O.

What is I/O multiplexing?

  • I/O: network I/O
  • Multiplexing: Multiple network connections
  • Reuse: Reuse the same thread.
  • I/O multiplexing is actually a synchronous I/O model. It lets one thread monitor multiple file handles; once a handle is ready, the application is notified to read or write it. When no handle is ready, the application blocks and yields the CPU.

Single threaded model

  • Redis uses a single-threaded model, which avoids unnecessary CPU context switches and lock contention. Because it is single-threaded, a command that takes too long to execute (such as hgetall on a huge hash) will block everything else. Redis is a database for fast-execution scenarios, so be careful with commands like smembers, lrange, and hgetall.
  • Redis 6.0 introduced multithreading, but it is used for network I/O; commands that manipulate memory are still executed by a single thread.

3.5 Virtual Memory Mechanism

Redis built its own VM mechanism rather than relying on the operating system's, avoiding the time that would otherwise be spent on system calls and page swapping. (Note that Redis's VM feature was deprecated and removed in later versions, but it still comes up in interviews.)

What is the virtual memory mechanism of Redis?

The virtual memory mechanism is to temporarily swap infrequently accessed data (cold data) from memory to disk, freeing up valuable memory space for other data that needs to be accessed (hot data). The VM function separates hot and cold data so that hot data is stored in the memory and cold data is saved to disks. This avoids the problem of slow access due to insufficient memory.

4. What is cache breakdown, cache penetration, cache avalanche?

4.1 Cache penetration Problem

Let's first look at a common way of using a cache: when a read request arrives, check the cache first; if it hits, return the value directly. On a cache miss, look up the database, write the value into the cache, and return it.

Cache penetration: querying data that does not exist at all. The cache never hits, so the database must be queried every time; and since the database finds nothing, nothing is written back to the cache.

In plain terms, when a read request arrives and neither the cache nor the database has the value, every request for that value goes straight through to the database. This is cache penetration.

Cache penetration typically occurs in one of these situations:

  • Unreasonable business design: for example, most users have no record for a certain feature, yet every request still goes through the cache and then queries the database to check whether a given userID has one.
  • Business/operations/development errors, such as cache and database data being deleted by mistake.
  • An illegal request attack, in which a hacker deliberately fabricates a large number of illegal requests to read non-existent business data.

How do you avoid cache penetration? There are three ways to do this.

  • 1. For illegal requests, validate parameters at the API entry point and filter out invalid values.
  • 2. If the query database is empty, we can set a null value for the cache, or the default value. However, if a write request comes in, the cache needs to be updated to ensure cache consistency, and finally set an appropriate expiration time for the cache. (Commonly used in business, simple and effective)
  • 3. Use bloom filter to quickly check whether data exists. That is, when a query request comes in, check whether the value exists through the Bloom filter, and then continue to search.

Bloom filter principle: it consists of a bit array initialized to all zeros and N hash functions. To add a key, hash it with the N functions to get N positions in the bit array and set those bits to 1. To query, hash the key the same way: if all N positions are 1, the Bloom filter judges that the key (probably) exists; if any bit is 0, the key definitely does not exist.
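A minimal Bloom filter sketch matching that description (the bit-array size, hash count, and use of md5 are illustrative choices, not what Redis modules actually use):

```python
import hashlib

class BloomFilter:
    """Bit array of m bits plus k hash functions; no false negatives,
    small probability of false positives."""
    def __init__(self, m: int = 1024, k: int = 3):
        self.m, self.k = m, k
        self.bits = bytearray(m // 8)

    def _positions(self, key: str):
        # Derive k positions by salting the key with the hash index.
        for i in range(self.k):
            h = hashlib.md5(f"{i}:{key}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, key: str):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key: str) -> bool:
        # Key "exists" only if ALL k bits are set; any 0 bit means definitely absent.
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))

bf = BloomFilter()
bf.add("user:1001")
assert bf.might_contain("user:1001")   # an added key is always reported present
```

For cache penetration, the filter is loaded with all valid keys; a request whose key the filter rejects can be dropped before ever touching Redis or the database.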

4.2 Cache Avalanche Problem

Cache avalanche: a large amount of cached data reaches its expiration time at the same moment while query volume is high, so requests hit the database directly, overloading it or even bringing it down.

  • It is usually caused by a large amount of data expiring at the same time. To address this, spread out the expiration times, i.e. make them relatively discrete: a large fixed value plus a small random one, for example 5 hours + 0 to 1800 seconds.
  • A Redis outage can also cause a cache avalanche. This calls for building a highly available Redis cluster.
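The "fixed base + random jitter" expiry can be sketched in a couple of lines (the 5-hour base and 1800-second jitter are the example values above; how you pass the TTL to Redis depends on your client):

```python
import random

def ttl_with_jitter(base_seconds: int = 5 * 3600, jitter_seconds: int = 1800) -> int:
    """Return a TTL of base + random jitter so keys don't all expire at once."""
    return base_seconds + random.randint(0, jitter_seconds)

# Illustrative redis-py style usage: client.set(key, value, ex=ttl_with_jitter())
ttl = ttl_with_jitter()
assert 5 * 3600 <= ttl <= 5 * 3600 + 1800
```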

4.3 Cache breakdown Problem

Cache breakdown: When a hot key expires at a certain point in time, a large number of concurrent requests are sent to the key, resulting in a large number of requests to the DB.

Cache breakdown looks a bit like cache avalanche, but they differ: a cache avalanche means the database comes under so much pressure that it may even go down, while cache breakdown just means a large number of concurrent requests reach the database. You can think of breakdown as a subset of avalanche. Some articles hold that the difference is that breakdown targets one hot key's cache, while an avalanche involves many keys.

There are two solutions:

  • 1. Use a mutex. When the cache misses, do not load data from the DB immediately; instead, first use an atomic set-if-absent command such as Redis's setnx. Only the caller that succeeds loads the DB data and sets the cache; the others retry reading the cache.
  • 2. "Never expire": do not set an expiration time on the key itself; instead, when the hot data is about to logically expire, have an asynchronous thread rebuild the value and update the logical expiration time.
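The mutex scheme can be sketched as follows, with a plain dict standing in for Redis so the flow is visible (a real implementation would use something like client.set(lock_key, token, nx=True, ex=30) and a proper distributed-lock release; query_db is a hypothetical slow lookup):

```python
import time

store = {}   # fake Redis: key -> value

def setnx(key, value):
    # Atomic "set if not exists"; a dict check suffices in this single-threaded sketch.
    if key not in store:
        store[key] = value
        return True
    return False

def query_db(key):
    return f"db-value-for-{key}"   # hypothetical database lookup

def get_with_mutex(key, retries=3):
    while retries:
        if key in store:                      # cache hit
            return store[key]
        if setnx("lock:" + key, 1):           # only one caller rebuilds the cache
            try:
                store[key] = query_db(key)    # load DB data and set the cache
                return store[key]
            finally:
                store.pop("lock:" + key, None)  # release the lock
        time.sleep(0.01)                      # losers wait briefly, then retry the cache
        retries -= 1
    return query_db(key)                      # fallback after exhausting retries

assert get_with_mutex("hot") == "db-value-for-hot"
assert "hot" in store                         # the winner populated the cache
```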

5. What is the hot Key problem and how to solve it

What is a hot Key? In Redis, we call keys that are accessed frequently hot keys.

If a large number of requests are received from a hotspot key, the host resources may be insufficient or even break down. As a result, normal services may be affected.

How do hot keys come about? There are two main reasons:

  • Users consume far more of some data than they produce, as in flash sales, hot news, and other read-heavy scenarios.
  • Request sharding concentrates load beyond what a single Redis server can bear: for example, a fixed key is always hashed to the same shard, and when the request volume for that key exceeds the machine's limit, a hot key problem results.

So how do you identify hot keys in daily development?

  • Use experience to determine which are hot keys;
  • Client statistics report;
  • Reporting from the service proxy layer.

How to solve the hot key problem?

  • Redis cluster expansion: add shard replicas to balance read traffic;
  • Spread hot keys across different servers;
  • Use a level 2 cache, or JVM local cache, to reduce Redis read requests.

6. Redis expiration policy and memory elimination policy

6.1 Redis Expiration Policy

When we set a key, we can give it an expiration time, for example expire key 60, which tells Redis that this key will expire after 60 seconds. Let's first look at several expiration strategies:

Timed expiration

You need to create a timer for each key that has an expiration time set; when the key expires, the timer deletes it immediately. This strategy clears expired data right away and is memory-friendly, but it consumes a lot of CPU to process expirations, which hurts the cache's response time and throughput.

Lazy expiration

Only when a key is accessed, the system checks whether the key has expired. This strategy maximizes CPU savings but is very memory unfriendly. In extreme cases, a large number of expired keys may not be accessed again and thus will not be cleared, occupying a large amount of memory.

Periodic expiration

At certain intervals, a certain number of keys in the expires dictionary of a certain number of databases are scanned and expired keys are cleared. This strategy is a compromise between the first two. By adjusting the interval of periodic scan and the time limit of each scan, you can achieve the optimal balance of CPU and memory resources in different situations.

The Expires dictionary holds expiration data for all keys with an expiration date set, where key is a pointer to a key in the key space and value is the expiration date represented by the millisecond precision UNIX timestamp for that key. The key space refers to all the keys stored in the Redis cluster.

Redis uses both lazy expiration and periodic expiration strategies.

  • Assuming that Redis currently holds 300,000 keys and all of them are set to expire, if you check all of them every 100ms, the CPU will be extremely overloaded and may eventually fail.
  • Therefore, Redis adopts periodic expiration and randomly selects a certain number of keys every 100ms to check and delete.
  • However, you may end up with a lot of expired keys that are not deleted. In this case, Redis uses lazy deletion. When you get a key, Redis checks it and deletes it if it has expired.
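The combined strategy can be modelled in a few lines (a toy in-memory model of the ideas above, not the actual Redis implementation; the sample size of 20 is illustrative):

```python
import random
import time

data = {}      # key -> value
expires = {}   # key -> absolute expiry timestamp

def set_with_ttl(key, value, ttl):
    data[key] = value
    expires[key] = time.time() + ttl

def get(key):
    # Lazy expiration: check on access; delete the key if it has expired.
    if key in expires and time.time() > expires[key]:
        data.pop(key, None)
        expires.pop(key, None)
        return None
    return data.get(key)

def periodic_expire(sample_size=20):
    # Periodic expiration: sample a few keys with TTLs and delete the expired ones.
    keys = random.sample(list(expires), min(sample_size, len(expires)))
    now = time.time()
    for k in keys:
        if now > expires[k]:
            data.pop(k, None)
            expires.pop(k, None)

set_with_ttl("a", 1, ttl=-1)   # already expired
set_with_ttl("b", 2, ttl=60)
assert get("a") is None        # removed lazily on access
assert get("b") == 2
periodic_expire()              # sweeps remaining expired keys
assert "a" not in data
```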

But here's the catch: if periodic deletion misses many expired keys and lazy deletion never touches them either (because they are never accessed again), large numbers of expired keys pile up in memory, which can exhaust it. And sometimes, as business volume grows, far more Redis keys are used, memory simply runs out, and the ops colleague forgets to add more. Does Redis just die then? No! Redis protects itself with eight memory eviction policies.

6.2 Redis Memory elimination strategy

  • volatile-lru: when memory cannot hold newly written data, use the LRU (least recently used) algorithm to evict among keys that have an expiration set;
  • allkeys-lru: when memory cannot hold newly written data, use the LRU algorithm to evict among all keys;
  • volatile-lfu: new in 4.0. When memory cannot hold newly written data, use the LFU algorithm to evict among keys that have an expiration set;
  • allkeys-lfu: new in 4.0. When memory cannot hold newly written data, use the LFU algorithm to evict among all keys;
  • volatile-random: when memory cannot hold newly written data, evict randomly among keys that have an expiration set;
  • allkeys-random: when memory cannot hold newly written data, evict randomly among all keys;
  • volatile-ttl: when memory cannot hold newly written data, evict among keys that have an expiration set, discarding those with the earliest expiration first;
  • noeviction: the default policy. When memory cannot hold newly written data, new writes report an error.
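The eviction policy is chosen in redis.conf with the maxmemory-policy directive. A minimal illustrative fragment (the 2gb limit is an example value, not a recommendation):

```conf
# redis.conf (illustrative): cap memory and evict the least recently used key
maxmemory 2gb
maxmemory-policy allkeys-lru
```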

7. Talk about the common application scenarios of Redis

  • Caching
  • Leaderboards
  • Counters
  • Shared sessions
  • Distributed locks
  • Social networks
  • Message queues
  • Bit operations

7.1 Caching

When we mention Redis, we naturally think of caching; large and medium websites at home and abroad cannot do without it. Used well, caching, for example of hotspot data, not only speeds up website access but also reduces the load on the database. Compared with memcached, Redis also provides richer data structures, plus persistence mechanisms such as RDB and AOF.

7.2 Leaderboards

Nowadays, there are various leaderboards for Internet applications, such as monthly sales leaderboards for e-commerce websites, gift leaderboards for social apps, voting leaderboards for mini-apps and so on. The Zset data type provided by Redis enables these complex leaderboards.

For example, a leaderboard that gets likes when users upload videos every day could be designed like this:

  • 1. Jay uploads a video and gets 3 likes:

    zadd user:ranking:2021-03-03 3 Jay

  • 2. After a while, Jay gets another like:

    zincrby user:ranking:2021-03-03 1 Jay

  • 3. If user John cheats, remove him:

    zrem user:ranking:2021-03-03 John

  • 4. Display the top 3 users with the most likes:

    zrevrange user:ranking:2021-03-03 0 2

7.3 Counters

Major websites and apps often need counters, such as play counts for short videos or page views on e-commerce sites. These counts generally need to be real-time, incrementing by 1 on every play or view. Under heavy concurrency this is a challenge for a traditional relational database; Redis, with its native support for counting and excellent performance, is a natural choice for counter systems.

7.4 Shared Sessions

If a distributed Web service keeps users’ Session information on their respective servers, users may need to log in again after refreshing once, which is obviously problematic. In fact, users’ sessions can be centrally managed using Redis, and each time a user updates or queries login information, it is centrally retrieved directly from Redis.

7.5 Distributed Lock

Almost every Internet company uses distributed deployment. Under distributed service, it will encounter technical problems of concurrent access to the same resource, such as seckilling, ordering and reducing inventory and other scenarios.

  • Local locks such as synchronized or ReentrantLock cannot span processes.
  • If concurrency is modest, pessimistic or optimistic database locks work fine.
  • Under high concurrency, however, using database locks to control concurrent access to a resource hurts database performance.
  • In practice, a distributed lock can be implemented with Redis's setnx.

7.6 Social Networks

Likes/treads, fans, common friends/likes, push, and pull-down refresh are essential functions of social networking sites. Due to the large volume of traffic on social networking sites, and the traditional relational data is not suitable for storing this type of data, the data structure provided by Redis can realize these functions relatively easily.

7.7 Message Queue

Message queues are required middleware for large websites; popular options include ActiveMQ, RabbitMQ, and Kafka, mainly used for service decoupling, traffic peak shaving, and asynchronous processing of non-real-time work. Redis provides publish/subscribe and blocking-queue features, enough to implement a simple message queue system, though it is no match for dedicated message-oriented middleware.

7.8 Bit Operations

Bit operations are used in scenarios with hundreds of millions of records, such as check-in systems for hundreds of millions of users, counting log-offs, or checking whether a user is online. With a billion users, how do you answer "is this user online?" within a few milliseconds? Don't create a key per user to track them (work out how much memory that would require!). Instead, use the bit-operation commands setbit, getbit, and bitcount: build an array long enough that every element is 0 or 1, let the array index be the user ID (it must be a number), and the index plus the element value (0 or 1) gives you an online-status system.
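A sketch of that idea, using a Python integer as a stand-in for the string-backed Redis bitmap (real code would call SETBIT/GETBIT/BITCOUNT on the client; the Bitmap class here is purely illustrative):

```python
class Bitmap:
    """One bit per (numeric) user ID: 1 = online, 0 = offline."""
    def __init__(self):
        self.bits = 0   # arbitrary-precision int acts as the bit array

    def setbit(self, offset: int, value: int):
        if value:
            self.bits |= 1 << offset        # SETBIT key offset 1
        else:
            self.bits &= ~(1 << offset)     # SETBIT key offset 0

    def getbit(self, offset: int) -> int:
        return (self.bits >> offset) & 1    # GETBIT key offset

    def bitcount(self) -> int:
        return bin(self.bits).count("1")    # BITCOUNT key

online = Bitmap()
online.setbit(10086, 1)      # user 10086 comes online
online.setbit(42, 1)         # user 42 comes online
assert online.getbit(10086) == 1
assert online.getbit(7) == 0
assert online.bitcount() == 2
online.setbit(42, 0)         # user 42 goes offline
assert online.bitcount() == 1
```

At one bit per user, a billion users fit in roughly 125 MB, which is why bitmaps scale where one-key-per-user does not.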

8. What is the persistence mechanism of Redis? Advantages and disadvantages

Redis is a memory-based non-relational K-V database, and since it is memory-based, if the Redis server fails, the data will be lost. To avoid data loss, Redis provides persistence, that is, saving data to disk.

Redis provides two persistence mechanisms, RDB and AOF. The process of persistent file loading is as follows:

8.1 RDB

RDB is to save memory data to disk in the form of snapshot.

What is a snapshot? One way to think about it is, take a picture of the data at the current moment and save it.

RDB persistence means writing a snapshot of the in-memory dataset to disk after a specified time interval and number of write operations. It is Redis's default persistence mode. When it completes, a dump.rdb file is generated in the configured directory; when Redis restarts, it loads dump.rdb to restore the data. The RDB trigger mechanisms are as follows:

The advantages of RDB

  • Suitable for large-scale data recovery scenarios, such as backup and full replication

RDB shortcomings

  • There is no way to do real-time persistence/second persistence.
  • There are compatibility problems between the RDB formats of old and new Redis versions.

8.2 AOF

AOF (Append Only File) persistence records every write operation as a log appended to a file; on restart, the commands in the AOF file are re-executed to recover the data. It mainly addresses the real-time aspect of data persistence. It is disabled by default.

AOF’s workflow is as follows:

The advantages of AOF

  • Higher data consistency and integrity

The disadvantage of AOF

  • The more AOF records, the larger the file, and the slower data recovery.

9. How to achieve high availability of Redis?

When we use Redis in the project, we will certainly not deploy Redis services in a single point. Because once a single point of deployment goes down, it’s not available. In order to achieve high availability, it is common practice to make multiple copies of the database to be deployed on different servers, one of which can continue to provide service even if it fails. There are three deployment modes for Redis to achieve high availability: master-slave, sentinel, and cluster.

9.1 Master/Slave Mode

In master-slave mode, Redis is deployed on several machines, with a master node handling reads and writes and slave nodes handling reads only. Data is synchronized from the master to the slaves through the master-slave replication mechanism.

Primary/secondary replication includes full replication and incremental replication. Generally, when the slave initiates the connection with the master for the first time or considers the connection to be the first time, the slave adopts full replication. The full replication process is as follows:

  • 1. The slave sends the SYNC command to the master.
  • 2. On receiving SYNC, the master runs BGSAVE to generate a full RDB file.
  • 3. While the RDB snapshot is being generated, the master records all incoming write commands in a buffer.
  • 4. When BGSAVE completes, the master sends the RDB snapshot file to all slaves.
  • 5. On receiving the snapshot, each slave loads and parses it.
  • 6. The master keeps recording the write commands produced during RDB transfer in the buffer.
  • 7. After the snapshot has been sent, the master starts sending the buffered write commands to the slaves.
  • 8. The slaves receive and execute the write commands from the master's buffer.

Since Redis 2.8, PSYNC has replaced SYNC, because SYNC consumes a lot of system resources and PSYNC is more efficient.

After full synchronization between the slave and master, incremental replication is triggered if data on the master is updated again.

ReplicationFeedSalves () is triggered when data is changed or changed on the master node, and each subsequent command invoked on the master node is synchronized to the Slave node using replicationFeedSlaves(). Before executing this function, the master node determines whether the command has been updated, and if so, the slave node is not empty. This function sends the command executed by the user to all slave nodes for execution. The process is as follows:

9.2 Sentinel Mode

In the master-slave mode, once the master node fails to provide services due to failure, the slave node needs to be promoted to the master node manually and the application needs to update the address of the master node. Obviously, this approach to troubleshooting is unacceptable for most business scenarios. Redis has officially provided the Redis Sentinel architecture since 2.8 to solve this problem.

Sentinel mode is a sentinel system made up of one or more Sentinel instances. It monitors all the Redis master and slave nodes and, when the monitored master goes offline, automatically promotes one of that master's slaves to be the new master. However, a single sentinel process monitoring the Redis nodes is itself a single point of failure, so multiple sentinels are used to monitor the nodes, and the sentinels also monitor one another.

In a nutshell, the Sentinel mode serves three purposes:

  • Send commands and wait for Redis servers (both master and slave) to return to monitor their running status;
  • When the sentinel detects that the master node is down, it will automatically switch the slave node to the master node, and then inform the other slave nodes through the publish and subscribe mode to modify the configuration file and let them switch the host.
  • Sentinels also monitor each other for high availability.

What is the failover process?

Suppose the primary server goes down and Sentinel 1 detects it first. The system does not fail over immediately; Sentinel 1 merely considers the primary unavailable, which is called subjective down. When enough of the other sentinels also detect that the primary is unavailable (the count reaching a quorum is called objective down), the sentinels hold a vote, one sentinel is elected, and it performs the failover. After the switch succeeds, each sentinel uses publish/subscribe to make the slaves it monitors switch to the new master. The whole process is transparent to the client.

Sentinels work as follows:

  1. Each Sentinel sends a PING command once per second to the Master, Slave, and other Sentinel instances that it knows of.
  2. If an instance has taken longer than the value specified in the down-after-milliseconds option since it last responded to the PING command, it will be flagged as subjective offline by Sentinel.
  3. If a Master is marked as subjectively offline, all sentinels that are monitoring the Master confirm that the Master is indeed subjectively offline at a rate of once per second.
  4. When a sufficient number of sentinels (greater than or equal to the value specified in the configuration file) confirm that the Master has indeed gone subjectively offline within the specified time frame, the Master is marked as objectively offline.
  5. In general, each Sentinel sends INFO commands to all known masters and slaves every 10 seconds.
  6. When a Master is marked objectively offline by Sentinel, Sentinel sends the INFO command to all slaves of the offline Master once every second instead of once every 10 seconds;
  7. If not enough sentinels agree that the Master is offline, the Master’s objective offline status is removed. If the Master returns a valid reply to the PING command of Sentinel, the Master’s subjective offline status will be removed.

9.3 Cluster Mode

Sentinel mode builds on master-slave mode, adding read/write separation and automatic failover for higher availability. But every node stores the full dataset, which wastes memory and makes online scaling hard. Cluster mode was therefore introduced in Redis 3.0 to provide distributed Redis storage: data is sharded, so each node stores different content, solving the online scaling problem. It also provides replication and failover capabilities.

Communication between Cluster nodes

A Redis cluster consists of multiple nodes. How do the nodes communicate with each other? Through the Gossip protocol!

Redis Cluster nodes communicate with each other via the Gossip protocol, constantly exchanging information such as node failures, new nodes, master/slave changes, and slot information. There are four common Gossip messages: meet, ping, pong, and fail.

  • Meet message: notifies a new node to join. The sender tells the receiver to join the current cluster. Once the meet handshake completes normally, the receiving node joins the cluster and starts exchanging ping and pong messages periodically.
  • Ping message: the most frequently exchanged message. Each node in the cluster sends ping messages to several other nodes every second, to check whether they are online and to exchange state information.
  • Pong message: sent in reply to a received ping or meet message, confirming that communication succeeded. A pong message encapsulates the sender's own state data. A node can also broadcast its pong message to the cluster, telling every node to update its view of the sender's state.
  • Fail message: when a node judges that another node in the cluster has gone offline, it broadcasts a fail message to the cluster. On receiving it, the other nodes mark that node's status as offline.

In particular, each node communicates with the others over a cluster bus, using a dedicated port: the external service port plus 10000. For example, if a node serves clients on port 6379, it talks to other nodes on port 16379, using a special binary protocol.

Hash slot algorithm

Since this is distributed storage, does Cluster use consistent hashing as its distribution algorithm? No, it uses the hash slot algorithm.

The slot algorithm divides the whole database into 16384 slots. Each key/value pair entering Redis is hashed on its key and assigned to one of these 16384 slots. The mapping is simple: compute a 16-bit value with the CRC16 algorithm, then take it modulo 16384. Every key in the database thus belongs to one of the 16384 slots, and every node in the cluster handles a portion of them.

Each node in the cluster is responsible for part of the hash slots. For example, if the current cluster has nodes A, B, and C, each node holds roughly 16384/3 slots:

  • Node A is responsible for hash slots 0 to 5460
  • Node B is responsible for hash slots 5461 to 10922
  • Node C is responsible for hash slots 10923 to 16383
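The key-to-slot mapping is easy to reproduce. Below is a minimal sketch in plain Java (no client library) of how a key lands in a slot, including the `{...}` hash-tag rule that Redis Cluster uses to force related keys onto the same slot; the class and method names are ours, not from any client:

```java
import java.nio.charset.StandardCharsets;

public class HashSlot {
    // CRC16/XMODEM (polynomial 0x1021, initial value 0), the variant the
    // Redis Cluster specification defines for key hashing
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    static int slot(String key) {
        // if the key contains a non-empty {...} hash tag, only the tag is
        // hashed, so related keys can be forced onto the same slot
        int open = key.indexOf('{');
        if (open != -1) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }
}
```

Cluster clients use exactly this computation to decide which node a command should be routed to.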

Master/slave replication in a Redis Cluster

In a Redis Cluster, the nodes covering all 16384 slots must be working properly. If a node fails, the slots it handles become unserviceable and the whole cluster becomes unavailable.

Therefore, to ensure high availability, Cluster introduces master/slave replication: each master node has one or more slave nodes. When the other master nodes ping master node A and more than half of them time out, node A is considered down. If a master node goes down, one of its slave nodes is promoted.

Every Redis node holds two components. One is the slot information, with values ranging from 0 to 16383. The other is cluster, which can be understood as a cluster-management plugin. When a key arrives, Redis computes a 16-bit value with the CRC16 algorithm and takes the result modulo 16384, so each key maps to a hash slot numbered 0 to 16383. From this value the node responsible for the slot can be located, and the request jumps directly to that node for the access operation.

Although the data is stored separately on different nodes, the client treats the whole Cluster as a single entity. The client connects to any node and operates on it as if it were a single Redis instance. When the client sends a command for a key that is not assigned to the connected node, Redis returns a redirect (MOVED) instruction that ultimately points to the correct node, a bit like a 302 redirect on a browser page.

Failover

Redis cluster achieves high availability. When a node in the cluster fails, failover is implemented to ensure that the cluster can provide services normally.

Redis Cluster implements fault discovery through ping/pong messages. The mechanism distinguishes subjective offline from objective offline.

Subjective offline: one node considers another node unavailable, i.e. offline. This is not a final fault judgment; it only represents one node's opinion and may be a misjudgment.

Objective offline: the node is genuinely considered offline; multiple nodes in the cluster consider it unavailable and reach a consensus. If the offline node is a master that holds slots, a failover is performed for it.

  • If node A marks node B as subjectively offline, then after a period of time node A sends node B's status to other nodes through its messages. When node C receives the message and parses the message body, finding node B in the PFAIL state, the objective-offline process is triggered.
  • For an offline master node, Redis Cluster collects fail votes from the master nodes that hold slots; when more than half of them have reported the node as down, it is marked as objectively offline.


Fault recovery: After a fault is discovered, if the offline node is the primary node, replace it with a secondary node to ensure high availability of the cluster. The process is as follows:

  • Qualification check: Check whether the secondary node meets the requirements to replace the faulty primary node.
  • Election preparation time: Update the trigger election time after the qualification check is passed.
  • Call an election: When the time is up, call an election.
  • Election voting: only master nodes that hold slots have votes. When the slave collects enough votes (more than half), the operation of replacing the master node is triggered.

10. Have you ever used Redis distributed locks? What are the points to notice?

A distributed lock is an implementation of a lock that controls access by different processes to shared resources in a distributed system. Business scenarios such as flash-sale orders and grabbing red packets need distributed locks, and in our projects Redis is often used to build them.

Let's walk through several implementations of a Redis distributed lock and see what problems each one has.

  • The setnx + expire command is written separately
  • Setnx + value Indicates the expiration time
  • Set extension command (set ex px nx)
  • Set ex px nx + check unique random value, then delete

10.1 The setnx + expire command is written separately

```java
// Acquiring the lock and setting its expiry are two separate commands
Long result = jedis.setnx(key, lockValue);
if (result != null && result == 1) {        // lock acquired
    jedis.expire(key, 100);                 // set the expiration time afterwards
    try {
        // do something: business logic
    } finally {
        jedis.del(key);                     // release the lock
    }
}
```

If the process crashes or is restarted for maintenance after executing setnx but before expire runs, the lock key never expires, so no other thread can ever acquire the lock. A distributed lock therefore cannot be implemented this way.

10.2 setNx + value The value is the expiration time

```java
long expires = System.currentTimeMillis() + expireTime; // system time + lock TTL
String expiresStr = String.valueOf(expires);

// if the key does not exist, the lock is acquired
if (jedis.setnx(key, expiresStr) == 1) {
    return true;
}

// the key exists: check whether the stored expiry has already passed
String currentValueStr = jedis.get(key);
if (currentValueStr != null && Long.parseLong(currentValueStr) < System.currentTimeMillis()) {
    // the lock has expired: swap in the new expiry and see who won the race
    String oldValueStr = jedis.getSet(key, expiresStr);
    if (oldValueStr != null && oldValueStr.equals(currentValueStr)) {
        return true; // this client replaced the old value, so it holds the lock
    }
}
return false;
```

I have seen some developers implement distributed locks in this way, but this solution also has some disadvantages:

  • The expiration time is generated by the client; in a distributed environment the clocks of all clients would have to be synchronized.
  • No unique identifier of the lock holder is stored, so the lock may be released/deleted by another client.
  • When the lock expires and multiple client requests arrive concurrently, they all execute jedis.getSet(); in the end only one client acquires the lock, but that client's expiration time may be overwritten by the others.

10.3: Set extension command (set ex px nx) (note possible problems)

```java
// NX: set only if the key does not exist; EX: expiration time in seconds
String result = jedis.set(key, lockValue, "NX", "EX", 100);
if ("OK".equals(result)) { // lock acquired
    try {
        // do something: business logic
    } finally {
        jedis.del(key); // release the lock
    }
}
```

There may be problems with this scheme:

  • The lock expired and was released. The service was not finished.
  • The lock was deleted by another thread.

10.4 Set ex px Nx + check unique random value, then delete

```java
// the value is a random ID unique to this request/thread
String result = jedis.set(key, uniRequestId, "NX", "EX", 100);
if ("OK".equals(result)) { // lock acquired
    try {
        // do something: business logic
    } finally {
        // only release the lock if the current thread still holds it
        if (uniRequestId.equals(jedis.get(key))) {
            jedis.del(key);
        }
    }
}
```

In this scheme, checking who holds the lock and releasing it are still two separate steps rather than one atomic operation. Between jedis.get(key) and jedis.del(), the lock may expire and be acquired by another client, in which case del() removes a lock that no longer belongs to the current client.

So a Lua script is usually used instead, because Redis executes a script atomically. The Lua script is as follows:

```lua
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
else
    return 0
end
```

This works well. In general, this implementation is already usable. One problem remains, though: the lock may expire before the business finishes (in practice, estimating the business processing time usually makes this a non-issue).

11. Have you ever used Redisson? Talk about how it works

A distributed lock may expire and be released before the business finishes. Some people think it is enough to set a slightly longer expiration time. But think about it: could the thread that acquires the lock start a daemon thread that periodically checks whether the lock still exists and, if it does, extends the lock's expiration time, preventing the lock from being released early?

The current open source framework Redisson solves this distributed lock problem. Let’s take a look at the underlying principles of Redisson:

As soon as a thread successfully acquires the lock, a watch dog is started: a background thread that checks every 10 seconds. If thread 1 still holds the lock, the watch dog keeps extending the lifetime of the lock key. This is how Redisson solves the problem of the lock expiring before the business finishes.

12. What is Redlock algorithm

Redis is usually deployed as a cluster. If the master node hangs while the lock data is still being synchronized from master to slave, what problems can a Redis distributed lock run into?

If thread 1 acquires the lock on the Redis master node but the lock key has not yet been synchronized to the slave node, and the master then fails, a slave node is promoted to master. Thread 2 can now acquire a lock on the same key, yet thread 1 already holds it, so the lock's safety is lost.

To solve this problem, Antirez, the author of Redis, proposed an advanced distributed lock algorithm: Redlock. The core idea of Redlock is this:

Deploy multiple Redis masters so that they will not all go down at the same time. These master nodes are completely independent of each other, with no data synchronization between them. Locks are acquired and released on the multiple master instances in the same way as on a single Redis instance.

We assume that there are currently five Redis master nodes running these Redis instances on five servers.

The RedLock implementation steps are as follows

  • 1. Obtain the current time, in milliseconds.
  • 2. Lock the five master nodes in sequence. The client sets network connection and response timeouts that are less than the lock expiration time. (Assuming automatic lock failure time is 10 seconds, the timeout time is usually between 5-50 ms, let’s assume the timeout time is 50ms). If you time out, skip the master node and try the next master node as soon as possible.
  • 3. The client computes the lock-acquisition time by subtracting the start time (recorded in step 1) from the current time. The lock is acquired successfully if and only if more than half of the Redis master nodes (N/2+1, here 5/2+1 = 3 nodes) were locked and the total time taken is less than the lock's validity time (e.g. 10s > 30ms + 40ms + 50ms + 40ms + 50ms).
  • 4. If the lock is obtained, the key's true validity time equals the original validity time minus the time taken to acquire the lock.
  • 5. If acquiring the lock fails (fewer than N/2+1 master instances locked, or the time taken exceeds the validity time), the client must unlock all the master nodes (even those it seemingly failed to lock, to prevent any stragglers).

The simplified steps are:

  • Lock requests are made to the five master nodes in sequence
  • Determine whether to skip the master node based on the timeout period.
  • If at least three nodes (more than half) are locked successfully and the total locking time is less than the lock's validity period, the lock is acquired.
  • If acquiring the lock fails, unlock all the nodes!
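The quorum-and-validity check above can be sketched as follows. The `Supplier<Boolean>` stand-ins and the method name are hypothetical, replacing real (time-limited) network calls to the independent masters:

```java
import java.util.List;
import java.util.function.Supplier;

public class RedlockSketch {
    // Each Supplier stands in for "try to lock one independent master";
    // true = that master granted the lock.
    // Returns the remaining validity (ms) if the quorum rule passes, or -1.
    static long tryRedlock(List<Supplier<Boolean>> masters, long ttlMillis) {
        long start = System.currentTimeMillis();
        int locked = 0;
        for (Supplier<Boolean> master : masters) {
            // real code would also enforce a short per-node timeout here
            if (master.get()) locked++;
        }
        long elapsed = System.currentTimeMillis() - start;
        int quorum = masters.size() / 2 + 1;   // N/2 + 1
        long validity = ttlMillis - elapsed;   // key TTL minus locking time
        if (locked >= quorum && validity > 0) {
            return validity;
        }
        // on failure the client must unlock every master, even ones that
        // refused the lock (omitted in this sketch)
        return -1;
    }
}
```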

13. Redis skip list

  • Skip lists are one of the underlying implementations of zset, the sorted set
  • Skip lists support node lookup in O(logN) time on average and O(N) in the worst case, and can batch-process nodes through sequential operations
  • The skip list implementation consists of the zskiplist and zskiplistNode structures: zskiplist stores skip-list information (head node, tail node, length), while zskiplistNode represents a skip-list node
  • A skip list builds multi-level indexes on top of a linked list to improve search efficiency
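To make the "multi-level index" idea concrete, here is a minimal skip list sketch (integer scores only, capped at 8 levels); it is a toy version, not Redis's zskiplist code. Like Redis, it promotes a node to the next level with probability 1/4:

```java
import java.util.Random;

public class SkipListSketch {
    static final int MAX_LEVEL = 8;
    final Random rnd = new Random(42);

    static class Node {
        final int score;
        final Node[] forward = new Node[MAX_LEVEL]; // one pointer per level
        Node(int score) { this.score = score; }
    }

    final Node head = new Node(Integer.MIN_VALUE);
    int level = 1;

    // promote with decreasing probability (p = 1/4, as in zskiplist)
    int randomLevel() {
        int lvl = 1;
        while (lvl < MAX_LEVEL && rnd.nextInt(4) == 0) lvl++;
        return lvl;
    }

    void insert(int score) {
        Node[] update = new Node[MAX_LEVEL];
        Node x = head;
        for (int i = level - 1; i >= 0; i--) {       // descend level by level
            while (x.forward[i] != null && x.forward[i].score < score) x = x.forward[i];
            update[i] = x;                           // last node before the insert point
        }
        int lvl = randomLevel();
        if (lvl > level) {
            for (int i = level; i < lvl; i++) update[i] = head;
            level = lvl;
        }
        Node n = new Node(score);
        for (int i = 0; i < lvl; i++) {              // splice in on each level
            n.forward[i] = update[i].forward[i];
            update[i].forward[i] = n;
        }
    }

    boolean contains(int score) {
        Node x = head;
        for (int i = level - 1; i >= 0; i--) {       // start at the top index level
            while (x.forward[i] != null && x.forward[i].score < score) x = x.forward[i];
        }
        x = x.forward[0];
        return x != null && x.score == score;
    }
}
```

The search skips ahead on the high, sparse levels and only walks node by node on the bottom level, which is where the average O(logN) comes from.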

14. How do MySQL and Redis ensure double write consistency

  • Cache delay dual-delete
  • Delete the cache retry mechanism
  • Asynchronously read the binlog and delete the cache

14.1 Delayed double deletion

What is delayed double deletion? The steps are as follows:

  1. Delete the cache first
  2. Update the database again
  3. Sleep for a moment (say 1 second) and delete the cache again.
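The three steps can be sketched with in-memory maps standing in for Redis and MySQL (everything below is illustrative, not a real client):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DelayedDoubleDelete {
    // in-memory stand-ins for the real cache (Redis) and database (MySQL)
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    static final Map<String, String> db = new ConcurrentHashMap<>();

    static void update(String key, String value) {
        cache.remove(key);            // 1. delete the cache first
        db.put(key, value);           // 2. then update the database
        try {                         // 3. sleep: read time + a few hundred ms
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        cache.remove(key);            // ...then delete the cache again, clearing
                                      // any stale value a concurrent read may
                                      // have written back in the meantime
    }
}
```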

The scheme "sleeps for a moment". How long should that be? Is one second always right?

The sleep time = the time the read business logic takes to fetch its data + several hundred milliseconds. This ensures the read request has finished, so the second delete can remove any dirty cache data the read request may have written back.

This solution is acceptable: dirty data can exist only during the short sleep window (for example, that one second), which most businesses will tolerate. But what if the second cache deletion fails? The cache and the database may still end up inconsistent. How about also giving the key a natural expire time so it expires automatically? Then the business must accept inconsistency until the key expires. Or is there something better?

14.2 Deleting the Cache Retry mechanism

Delayed double deletion may fail at the second delete, leaving the data inconsistent. We can optimize with this scheme: if the deletion fails, retry it several times, which means introducing a retry mechanism for cache deletion:

  1. Write requests update the database
  2. The cache failed to be deleted for some reason
  3. The key that failed to delete is placed on the message queue
  4. Consume messages from the message queue to get the key to delete
  5. Retry the cache deletion operation
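A toy simulation of the retry loop, with a Deque standing in for the message queue and a counter faking transient delete failures (all names here are ours, for illustration only):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class DeleteRetry {
    static final Map<String, String> cache = new HashMap<>();
    static final Deque<String> queue = new ArrayDeque<>(); // stand-in for a real MQ
    static int failuresLeft = 2;  // pretend the first two deletes fail (network glitch)

    static boolean tryDelete(String key) {
        if (failuresLeft > 0) { failuresLeft--; return false; }
        cache.remove(key);
        return true;
    }

    static void writeRequest(String key) {
        // step 1: update the database (omitted), then try to delete the cache;
        // steps 2-3: on failure, put the key on the queue instead of giving up
        if (!tryDelete(key)) queue.offer(key);
    }

    static void consumer() {
        // steps 4-5: drain the queue and retry; re-enqueue keys that still fail
        while (!queue.isEmpty()) {
            String key = queue.poll();
            if (!tryDelete(key)) queue.offer(key);
        }
    }
}
```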

14.3 Reading the binlog and asynchronously deleting the cache

The cache-deletion retry mechanism works, but it intrudes heavily into business code. You can optimize it away by using the database's binlog to asynchronously evict stale keys.

Take MySQL as an example:

  • Alibaba's Canal can be used to collect the binlog and send it to an MQ queue
  • An ACK mechanism then confirms each update message is processed and the corresponding cache entry deleted, ensuring consistency between the database and the cache

15. Why was multi-threading changed after Redis 6.0?

  • Prior to Redis 6.0, when Redis processed a client request, everything from reading the socket and parsing the command to executing it and writing the reply back to the socket was handled sequentially by one main thread. That is what "single-threaded" means here.
  • Why wasn't multithreading used before Redis 6.0? Because with Redis there is almost no CPU bottleneck: Redis is limited mainly by memory and network. On an ordinary Linux system, for example, Redis can handle a million requests per second using pipelining, so if the application mainly uses O(N) or O(log(N)) commands, it hardly consumes much CPU.

Redis's use of multithreading does not abandon the single-threaded model: Redis still uses a single-threaded model to handle client requests, using multithreading only for reading/writing network data and parsing the protocol; commands are still executed by a single thread.

This is done because the performance bottleneck of Redis is network IO rather than CPU. Using multi-threading can improve the efficiency of I/O read and write, thus improving the overall performance of Redis.

16. Talk about Redis transaction mechanism

Redis implements its transaction mechanism with a small set of commands: MULTI, EXEC, DISCARD, and WATCH. A transaction lets multiple commands be executed in one go, and all commands in a transaction are serialized: during execution, the queued commands run serially in order, and command requests submitted by other clients are never inserted into the transaction's command sequence.

In short, Redis transactions are sequential, one-time, and exclusive executions of a series of commands in a queue.

Redis executes transactions as follows:

  • Start transaction (MULTI)
  • Commands are queued
  • Execute transaction (EXEC), DISCARD transaction (DISCARD)
| Command | Description |
| --- | --- |
| MULTI | Marks the start of a transaction block |
| EXEC | Executes all commands within the transaction block |
| DISCARD | Cancels the transaction, abandoning all commands in the transaction block |
| WATCH | Monitors the given keys; if a key is changed by another command before the transaction executes, the transaction is aborted |
| UNWATCH | Stops monitoring all keys watched with the WATCH command |
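A minimal redis-cli session showing the queue-then-execute flow:

```
127.0.0.1:6379> MULTI
OK
127.0.0.1:6379> SET balance 100
QUEUED
127.0.0.1:6379> INCRBY balance -30
QUEUED
127.0.0.1:6379> EXEC
1) OK
2) (integer) 70
```

Note that if a queued command fails at EXEC time, the remaining commands still run: Redis transactions have no rollback.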

17. How does Redis handle hash conflicts?

Redis is a K-V in-memory database that uses a global hash table to hold all key-value pairs. The hash table is made up of hash buckets, and each entry element in a bucket stores a *key pointer and a *value pointer, where *key points to the actual key and *value points to the actual value.

Hash table lookups are very fast, somewhat like a HashMap in Java, which allows us to quickly find key-value pairs in O(1) time. First, calculate the hash value through the key, find the corresponding hash bucket location, locate the entry, and find the corresponding data in the entry.

What is a hash conflict?

Hash conflict: different keys compute to the same hash value and therefore land in the same hash bucket.

Redis uses chained hashing to resolve hash conflicts: multiple elements in the same hash bucket are stored in a linked list, connected one after another by pointers.
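The idea can be sketched in a few lines of Java; this is a toy chained table, not Redis's dict implementation:

```java
public class ChainedHash {
    static class Entry {
        final String key;
        Object value;
        Entry next;   // collisions are chained through this pointer
        Entry(String key, Object value) { this.key = key; this.value = value; }
    }

    final Entry[] buckets = new Entry[4]; // deliberately tiny so collisions happen

    int index(String key) { return (key.hashCode() & 0x7FFFFFFF) % buckets.length; }

    void put(String key, Object value) {
        int i = index(key);
        for (Entry e = buckets[i]; e != null; e = e.next) {
            if (e.key.equals(key)) { e.value = value; return; } // overwrite
        }
        Entry e = new Entry(key, value);
        e.next = buckets[i];   // new entries go to the head of the chain
        buckets[i] = e;
    }

    Object get(String key) {
        for (Entry e = buckets[index(key)]; e != null; e = e.next) {
            if (e.key.equals(key)) return e.value; // walk the chain pointer by pointer
        }
        return null;
    }
}
```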

Some readers may wonder: elements on a collision chain can only be found by following pointers one by one. As more data is inserted into the hash table, there are more collisions, the chains grow longer, and queries become less efficient.

To stay efficient, Redis rehashes the hash table, adding hash buckets to reduce collisions. To make rehash efficient, Redis also uses two global hash tables by default: one in current use, called the primary hash table, and one reserved for capacity expansion, called the standby hash table.

18. Can Redis process write requests simultaneously during RDB generation?

Yes, Redis provides two instructions to generate the RDB, namely Save and BGSave.

  • With the save command, Redis blocks, because the snapshot is written by the main thread.
  • With the bgsave command, a child process is forked to write the RDB file; snapshot persistence is left entirely to the child process, while the parent process continues to handle client requests. Thanks to the operating system's copy-on-write, the child sees a consistent snapshot even while the parent keeps accepting writes.

19. What protocol is used for the underlying layer of Redis?

RESP (Redis Serialization Protocol) is a serialization protocol designed specifically for Redis. It has existed since Redis version 1.2, but it was only in Redis 2.0 that it finally became the standard Redis communication protocol.

RESP has the advantages of simple implementation, fast parsing, and good readability.

20. Bloom filter

To deal with cache penetration, we can use bloom filters. What is a Bloom filter?

It consists of a long binary vector and a set of hash mapping functions, and is used to test whether an element is in a set. Its space efficiency and query time are far better than general-purpose algorithms; its drawbacks are a certain false-positive rate and the difficulty of deleting elements.

What is the principle of a Bloom filter? Say we have a set A with n elements. Using k hash functions, each element of A is mapped to positions in a bit array B of length m, and each mapped position is set to 1. To test an element, map it with the same k hash functions: if all k positions hold a 1, it probably belongs to set A; if any position holds a 0, it definitely does not.

Let's take a simple example. Suppose set A has three elements {d1, d2, d3}, and we have one hash function, Hash1. Now map each element of A into a 16-bit array B.

Map d1 first: if Hash1(d1) = 2, set the element of array B at index 2 to 1, as follows:

Then map d2: if Hash1(d2) = 5, set the element of array B at index 5 to 1, as follows:

Then map d3 as well: suppose Hash1(d3) is also 2, which marks the same index-2 slot that is already 1:

So to check whether an element dn is in set A, we just compute its index with Hash1(dn). If the bit there is 0, the element is definitely not in set A. What if the bit is 1? dn might be a member of A, but the 1 could also have been set by another element: d1 and d3 both set the bit at index 2. This is the Bloom filter's drawback: hash collisions cause false positives and misjudgments.

How can this error be reduced?

  • Use more hash functions for the mapping, reducing the probability of hash collisions
  • Increase the bit length of array B, widening the range of values the hash functions produce and reducing the probability of hash collisions

For example, add a second hash function Hash2 with Hash2(d1) = 6 and Hash2(d3) = 8: even though Hash1 maps both d1 and d3 to 2, their full sets of positions now differ.

Even with this error, note that the Bloom filter does not store the complete data at all; it just computes positions with a series of hash functions and fills a binary vector. At large scale, a Bloom filter is a bargain: a tiny error rate in exchange for significant storage savings.
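Putting the pieces together, here is a minimal Bloom filter sketch with a BitSet as array B and the k positions derived from two base hashes (a common double-hashing trick; the code is illustrative, not any library's implementation):

```java
import java.util.BitSet;

public class BloomSketch {
    final BitSet bits;
    final int m; // length of the bit vector (array B in the text)
    final int k; // number of hash functions

    BloomSketch(int m, int k) { this.bits = new BitSet(m); this.m = m; this.k = k; }

    // derive the i-th hash position from two base hashes
    int position(String s, int i) {
        int h1 = s.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9;
        return Math.floorMod(h1 + i * h2, m);
    }

    void add(String s) {
        for (int i = 0; i < k; i++) bits.set(position(s, i));
    }

    boolean mightContain(String s) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(position(s, i))) return false; // a 0 bit proves absence
        }
        return true; // all 1s: probably present, with some false-positive rate
    }
}
```

With, say, m = 1024 bits and k = 3 for a handful of elements, the false-positive rate is tiny while the storage cost stays at a fixed 1024 bits.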

Bloom filters are already implemented in open-source libraries, such as Google's Guava library, Twitter's Algebird library, or Redis's own bitmaps.

Reference and thanks

  • Redis high availability solution summary
  • Redis Series 9: Redis cluster high availability