How is caching used in a project? What happens if the cache is not used properly?

Interviewer psychological analysis

Internet companies are certain to ask this question. If a candidate does not even understand caching, that is genuinely embarrassing.

Whenever caching comes up, the first questions are: where is the cache used in your project? Why use it? Could you do without it? What adverse consequences might using it have?

This is to see whether you have thought about the reasons behind your use of the cache. If you used it without thinking and cannot give the interviewer a reasonable explanation, you will leave a bad impression and show that you do not like to think.

Analysis of interview questions

One by one

How is the cache used in a project?

Answer this based on your own project; there is nothing general to add here.

Why is a cache used in a project?

Caching is mainly used for high performance and high concurrency

Look at the picture first

High performance: suppose a request comes in and triggers a series of complicated MySQL operations that take 600ms to produce a result. That result may not change for the next few hours, or changes to it do not need to be shown to the user immediately. So what do we do?

Use a cache. Put the result that took 600ms to compute into the cache as a key-value pair. The next time someone asks for it, skip MySQL and look the value up by its key in the cache, which takes about 2ms.

This is called high performance.

That is, when a complex operation takes a long time to produce a result, you are sure the result will not change much, and many read requests for it follow right away, store the result of the slow query in the cache and serve later reads from the cache.

High concurrency: Assume a scenario where there are 1 million simultaneous users accessing the system during the midday peak and there are 4000 requests per second to query the database. If the database receives 4000 requests per second, it may go down. But if you put some of your data in Redis, then maybe 3000 requests per second go to the cache and 1000 requests per second go to the database, and the system can handle that.

Why does caching support high concurrency?

Because the cache lives in memory, and memory naturally supports high concurrency: 4,000 requests per second is easy, and even 40,000 per second is fine. For a database, by contrast, the general recommendation is not to exceed about 2,000 requests per second.

Will there be any adverse consequences of using the cache?

Inconsistency between the cache and the database on dual writes
Cache avalanche
Cache breakdown
Cache concurrency contention

What is the difference between Redis and memcached? What is the threading model of Redis? Why is single-threaded Redis more efficient than multi-threaded memcached (that is, why can Redis support high concurrency although it is single-threaded)?

Interviewer psychological analysis

This is the most basic question about Redis. One of its most fundamental internal principles and characteristics is that it uses a single-threaded working model. If you do not know this, how will you know what to do when problems come up later while using Redis?

The interviewer may also ask about the difference between memcached and Redis. To be honest, Redis is the most common caching choice these days, so this question mostly tests the breadth of your skills.

Analysis of interview questions

The difference between Redis and memcached

You could list any number of differences here; the following are some given by the author of Redis.

Server-side data operations: Redis has more data structures and supports richer data operations than Memcached. With Memcached, you usually have to fetch the data to the client, modify it there, and then set it back, which greatly increases the number of network I/O round trips and the volume of data transferred. In Redis, these complex operations are usually as efficient as ordinary GET and SET operations, so if you need a cache that supports more complex structures and operations, Redis is a good choice.
Memory usage efficiency: with simple key-value storage, Memcached has higher memory utilization. If Redis stores key-value data in hash structures, its memory utilization is higher than Memcached's thanks to its combined compression.
Performance: because Redis uses only one core while Memcached can use multiple cores, Redis on average performs better than Memcached per core when storing small data. For data over 100K, Memcached performs better than Redis. Redis has recently been optimized for storing large data, but it still lags behind Memcached there.
Cluster mode: Memcached has no native cluster mode and relies on the client to write data to the cluster in shards, whereas Redis now supports cluster mode natively.

Redis thread model

For how Redis's single-threaded model works, see the article "One article is enough" on programmer Akita's CSDN blog (Redis single-thread principle).

Why is the Redis single-threaded model so efficient?

Pure in-memory operations
The core is a non-blocking I/O multiplexing mechanism
A single thread avoids the frequent context switching of multithreading

What data types do Redis have? In what scenarios is it suitable?

String: simple key-value storage
Hash: stores objects
List: an ordered list, which can be used for pagination
Set: an unordered collection with automatic de-duplication; it supports intersection, union, and difference, so for example intersecting two people's fan lists shows their common friends
Sorted set: a de-duplicated set whose members are kept in sorted order

Can you describe Redis's expiration policies? How about writing an LRU by hand?

Interviewer psychological analysis

Some students have asked me why the Redis in their production environment keeps losing data: they write it in, and a while later it is gone. The answer is that Redis is a cache, and you were treating it as storage, right?

What is a cache? It uses memory. Is memory infinite? Memory is precious and limited, while disks are cheap and plentiful. Redis relies mainly on memory for high-performance, high-concurrency reads and writes.

So this is a basic property of caching: data will expire. Either you set an expiration time yourself, or Redis evicts it.

And if you do set an expiration time, do you know how Redis expires the data? When is it deleted? Why is Redis memory usage still high when so much data should already have expired?

Analysis of interview questions

(1) Set the expiration time

When we set a key, we can specify an expire time, such as one hour or ten minutes. This is useful because it lets us say how long the cached value stays valid.

If you set the expiration time of a key to 1 hour, how does Redis delete the key after the next hour?

The answer is: periodic deletion + lazy deletion.

Periodic deletion means that, by default, every 100ms Redis randomly selects some of the keys that have an expiration time, checks whether they have expired, and deletes the ones that have. Note that it does not iterate over all keys with expiration times every 100ms, which would be a performance disaster. It only samples some keys each time.

The problem is that periodic deletion can leave many expired keys undeleted after their time has come. So what then? That is where lazy deletion comes in: when you get a key, Redis checks whether it has an expiration time and whether it has expired. If it has expired, Redis deletes it at that moment and returns nothing to you.

By combining the above two methods, the expired key is guaranteed to be killed

In other words, periodic deletion does not remove every expired key. Some of them stay in memory and keep occupying it until your system accesses the key, at which point Redis deletes it.
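The interplay of the two mechanisms can be illustrated with a small in-memory model. This is a sketch, not how Redis is implemented: `ExpiringStore`, its method names, and the injected `clock` are all hypothetical, and the sample size of 20 is an arbitrary choice for the demonstration.

```python
import random


class ExpiringStore:
    """Toy key-value store showing periodic (sampled) and lazy expiry."""

    def __init__(self, clock):
        self._clock = clock   # callable returning the current time in seconds
        self._data = {}       # key -> value
        self._expires = {}    # key -> absolute expiry time (only keys with a TTL)

    def set(self, key, value, ttl=None):
        self._data[key] = value
        if ttl is None:
            self._expires.pop(key, None)
        else:
            self._expires[key] = self._clock() + ttl

    def _is_expired(self, key):
        deadline = self._expires.get(key)
        return deadline is not None and self._clock() >= deadline

    def _delete(self, key):
        self._data.pop(key, None)
        self._expires.pop(key, None)

    def get(self, key):
        # Lazy deletion: expiry is checked only when the key is accessed.
        if self._is_expired(key):
            self._delete(key)
            return None
        return self._data.get(key)

    def periodic_sweep(self, sample_size=20):
        # Periodic deletion: inspect a random sample of keys that have a TTL,
        # never the whole key space.
        candidates = list(self._expires)
        sample = random.sample(candidates, min(sample_size, len(candidates)))
        removed = 0
        for key in sample:
            if self._is_expired(key):
                self._delete(key)
                removed += 1
        return removed

    def __len__(self):
        return len(self._data)
```

After a sweep, expired keys that were not sampled are still counted by `len(store)`; they disappear only when a later sweep picks them or a `get` touches them, which is exactly why memory can stay high.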

But there is still a problem. What if periodic deletion misses a lot of expired keys, and you do not access them in time, so lazy deletion never runs either? What if a large number of expired keys pile up in memory until Redis runs out of memory?

The answer is: the memory eviction mechanism.

(2) Memory eviction

If Redis memory usage grows too high, Redis evicts data. There are several policies:

noeviction: when memory is insufficient for a new write, the write operation returns an error
allkeys-lru: when memory is insufficient for a new write, remove the least recently used key from the whole key space (this is the most commonly used)
allkeys-random: when memory is insufficient for a new write, remove a random key from the whole key space
volatile-lru: when memory is insufficient for a new write, remove the least recently used key among the keys that have an expiration time set
volatile-random: when memory is insufficient for a new write, remove a random key among the keys that have an expiration time set (generally inappropriate)
volatile-ttl: when memory is insufficient for a new write, remove the key with the earliest expiration time among the keys that have an expiration time set
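Since the question asks for a hand-written LRU, here is one minimal way to do it in Python, built on `collections.OrderedDict`. The class name `LRUCache` and its methods are illustrative choices; Redis itself approximates LRU by sampling keys rather than keeping an exact ordering like this.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, most recent last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used

    def keys(self):
        return list(self._items)
```

With a capacity of 2, putting `a` and `b`, reading `a`, and then putting `c` evicts `b`, because `a` was touched more recently.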

How to ensure high concurrency, high availability and persistence of Redis? Can you explain the principle of Redis master-slave replication? Can you explain the principle of Redis Sentinel?

Interviewer psychological analysis

This question mainly tests the following: how much concurrency can a single Redis machine handle? If a single machine is not enough, how do you scale out to handle more? Can Redis go down? Since it can, how do you keep Redis highly available?

These are questions you must consider in your project. If you have not, you have not thought enough about the problems of your production system.

Analysis of interview questions

If you use Redis as a cache, you must consider how to run it across multiple machines so that Redis is highly concurrent and highly available.

The relationship between high concurrency in Redis and high concurrency in the whole system

To make the whole system highly concurrent, you inevitably have to get the underlying cache right. MySQL can also reach high concurrency, but only through a complex set of database and table sharding schemes. Redis alone is not enough, but it is a very important part of the larger cache architecture that supports high concurrency.

Where is the bottleneck that keeps Redis from supporting high concurrency?

A single machine.

What should Redis do to support more than 100,000 concurrent requests?

It is almost impossible for a single Redis machine to exceed 100,000 QPS, except in special cases: a physical machine with very good performance and a very high configuration, very good maintenance, and overall operations that are not too complicated. A single machine generally handles tens of thousands of QPS. The answer is read/write separation: a cache is generally used to support highly concurrent reads, and write requests are comparatively few. So: master-slave architecture -> read/write separation -> support for 100,000+ read QPS.

The core mechanism of Redis Replication

Take a look at the master/slave simple architecture diagram first

The master replicates data to its slave nodes asynchronously; starting with Redis 2.8, slave nodes periodically acknowledge the amount of data they have replicated
A master node can be configured with multiple slave nodes
Slave nodes can also connect to other slave nodes
The slave node does not block the work of the master node
The slave node does not block its own query operation when performing replication. Instead, it uses the old data set to provide services. However, when the replication is complete, the old data set needs to be deleted and the new data set needs to be loaded. At this time, the external service is suspended
The slave node is used for horizontal capacity expansion and read/write separation. The expanded slave node improves read throughput

The implications of master persistence for master-slave architecture security

If the master/slave architecture is used, it is recommended that persistence be enabled on the master node
It is not recommended to rely on slave nodes as the only hot backup of the master's data. If persistence is turned off on the master, the master's data may be empty after it goes down and restarts, and once that empty data set is replicated, the slave nodes' data is lost as well
Even with the sentinel mechanism, where a slave node can automatically take over from the master, the master may restart automatically before sentinel detects the failure, which can still cause all slave node data to be cleared

Principle of redis master-slave replication, resumable data transfer, diskless replication, expired key processing

Take a look at the sketch first

Core principles of master-slave architecture

When a slave node is started, it sends a PSYNC command to the master node
If the slave node reconnects to the master node, the master node copies only the missing data to the slave node. Otherwise, if the slave node connects to the master node for the first time, full resynchronization is triggered
When full resynchronization starts, the master forks a background process to produce an RDB snapshot and, in the meantime, buffers in memory all write commands received from clients. After the RDB file is generated, the master sends it to the slave. The slave first writes it to its local disk and then loads it from disk into memory. The master then sends the buffered write commands to the slave, and the slave applies them to catch up
If a slave node is disconnected from the master by a network fault, it reconnects automatically. If the master finds several slave nodes reconnecting at once, it starts only one RDB save operation and serves all of them with that one copy of the data.

Resumable master/slave replication

Since Redis 2.8, master/slave replication can be resumed. If the network connection drops during replication, replication continues from where it left off instead of starting from scratch
The master node keeps a backlog in memory. Both master and slave keep a replication offset and the master run ID, and the offset positions are stored in the backlog. After a disconnect, the slave asks the master to continue copying from its last replication offset
But if the corresponding offset is not found, a full resynchronization is performed

Diskless replication

The master creates the RDB directly in memory and sends it to the slave, without writing it to its own disk

repl-diskless-sync: enables diskless replication

repl-diskless-sync-delay: waits a certain amount of time before starting replication, so that more slaves have a chance to reconnect

Handling expired Keys

The slave does not expire keys itself; it waits for the master to do so. When the master expires a key, or evicts one through LRU, it synthesizes a del command and sends it to the slave

A deeper look at the complete flow and principles of Redis replication

The complete flow chart is as follows:

Core mechanisms related to data synchronization

This refers to the full replication performed the first time the slave connects to the master

(1) Both master and slave maintain an offset

The master keeps incrementing its own offset, and the slave keeps incrementing its own

The slave reports its offset to the master every second, and the master records each slave's offset

Offsets are not used only for full replication. The point is that the master and slave must each know their own data offset before they can tell how far their data differs

(2)backlog

The Master node has a backlog, which is 1MB by default

While the master node replicates data to a slave node, it also writes a copy of that data to the backlog

The backlog is primarily used for incremental replication after a full replication is interrupted

(3) Master run ID

Run info server to see the master run ID

Identifying the master node by host + IP is unreliable. If the master restarts or its data set changes, the slave should tell the difference by the run ID: a different run ID means a full replication is needed

To restart Redis without changing the run ID, run redis-cli debug reload

(4) psync

The slave node uses psync to replicate from the master node, sending psync runid offset

The master node responds according to its own state. The response may be FULLRESYNC runid offset, which triggers full replication, or CONTINUE, which triggers incremental replication
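The decision between the two responses can be modeled in a few lines. This is a simplified sketch based on the description above, not Redis source code: the `Master` class, its fields, and the byte-string "commands" are hypothetical, and the tiny backlog size in the usage is chosen only to make the overflow case easy to see.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    """Toy model of a master deciding between full and incremental sync."""

    run_id: str
    backlog_size: int = 1024 * 1024  # the text mentions a 1MB default
    offset: int = 0                  # total bytes of write stream produced so far
    _backlog: bytearray = field(default_factory=bytearray)

    def write(self, command: bytes) -> None:
        # Every replicated write is also appended to the backlog.
        self._backlog.extend(command)
        self.offset += len(command)
        overflow = len(self._backlog) - self.backlog_size
        if overflow > 0:
            del self._backlog[:overflow]  # oldest bytes fall out of the backlog

    def psync(self, replica_run_id: str, replica_offset: int):
        backlog_start = self.offset - len(self._backlog)
        same_master = replica_run_id == self.run_id
        in_backlog = backlog_start <= replica_offset <= self.offset
        if same_master and in_backlog:
            # Incremental replication: send only the bytes the replica is missing.
            missing = bytes(self._backlog[replica_offset - backlog_start:])
            return ("CONTINUE", missing)
        # Unknown run ID, or the offset is no longer in the backlog.
        return ("FULLRESYNC", self.run_id, self.offset)
```

A replica that connects for the first time sends an unknown run ID and gets `FULLRESYNC`; a replica that reconnects with the right run ID and an offset still inside the backlog gets `CONTINUE` plus the missing bytes.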

Full replication

Incremental replication

heartbeat

Asynchronous replication

How to achieve 99.99% high availability under the Master-slave architecture of Redis?

What does it mean that the system is unavailable?

What does system high availability mean?

Basic knowledge of the Redis Sentinel architecture

Data loss during Redis Sentinel master/standby switchover: asynchronous replication and cluster split-brain

How to reduce data loss caused by asynchronous replication

In-depth analysis of several core underlying principles of Redis Sentinel (including the slave election algorithm)

How to ensure that the data can be recovered after the Redis hangs?

The interview questions

How can Redis be persisted? What are the pros and cons of different persistence mechanisms? How is persistence implemented at the bottom?

Interviewer psychological analysis

If Redis only keeps data in memory, then when Redis goes down and restarts, everything in memory is lost. You must use Redis's persistence mechanism: while data is written to memory, it is also written asynchronously and gradually to disk files. If Redis goes down and restarts, it automatically reloads the previously persisted data from the disk files. A little data may be lost, but at least not all of it.

Analysis of interview questions

See the following figure for the significance of Redis persistence

What Redis persistence means for disaster recovery in production environments

The point of backing up data is that, when a failure occurs, most of the data can be recovered

Redis RDB and AOF persistence mechanism

RDB and AOF are shown below:

AOF rewrite:

The RDB persistence mechanism periodically persists the data in Redis
The AOF mechanism appends every write command as a log entry to a log file in append-only mode. When Redis restarts, the entire data set can be rebuilt by replaying the write commands in the AOF log
If we want to use Redis purely as an in-memory cache, we can disable both RDB and AOF persistence
Data can be persisted to disk through RDB or AOF, and those files can then be backed up elsewhere. For example, if the Redis server on Alibaba Cloud goes down and its memory and disk data are lost, the data can be copied back from the backup location so that the system can continue to provide services
If both RDB and AOF are enabled, then when Redis restarts, AOF is used to rebuild the data, because the data in AOF is more complete
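The difference between the two mechanisms can be made concrete with a toy store. This is a sketch under loud assumptions: `TinyStore` is hypothetical, the files use JSON rather than Redis's real RDB and AOF formats, and it calls fsync on every write for simplicity, whereas the text describes Redis typically doing so once per second in the background.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store with an append-only log and point-in-time snapshots."""

    def __init__(self, aof_path):
        self.data = {}
        self.aof_path = aof_path

    def _apply(self, op, key, value=None):
        if op == "SET":
            self.data[key] = value
        elif op == "DEL":
            self.data.pop(key, None)

    def execute(self, op, key, value=None):
        # AOF idea: apply the write, then append the command itself to the log.
        self._apply(op, key, value)
        with open(self.aof_path, "a", encoding="utf-8") as log:
            log.write(json.dumps([op, key, value]) + "\n")
            log.flush()
            os.fsync(log.fileno())  # simplification: sync on every write

    def save_snapshot(self, rdb_path):
        # RDB idea: dump the whole data set as it is at this moment.
        with open(rdb_path, "w", encoding="utf-8") as snap:
            json.dump(self.data, snap)

    @classmethod
    def recover_from_aof(cls, aof_path):
        # Recovery replays every logged command in order (slower, more complete).
        store = cls(aof_path)
        with open(aof_path, encoding="utf-8") as log:
            for line in log:
                op, key, value = json.loads(line)
                store._apply(op, key, value)
        return store

    @classmethod
    def recover_from_snapshot(cls, rdb_path, aof_path):
        # Recovery loads the data file directly (faster, but only as new as the snapshot).
        store = cls(aof_path)
        with open(rdb_path, encoding="utf-8") as snap:
            store.data = json.load(snap)
        return store
```

If a snapshot is taken and more writes follow, recovering from the snapshot loses those later writes, while replaying the log restores them. That is the trade-off the persistence comparison in this article turns on.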

Comparing the advantages and disadvantages of the RDB and AOF persistence mechanisms

Advantages of RDB persistence mechanism

(1) RDB generates multiple data files, each representing the data in Redis at a certain point in time. This multi-file approach is well suited to cold backup: such complete data files can be sent to remote, safe storage such as a cloud service

RDB works for cold backup because it generates multiple files, each a complete snapshot of the data at one point in time

AOF can also be used for cold backup. It is just one file, but you can copy that file at regular intervals

What is the advantage of RDB for cold backup? Redis itself controls generating snapshot files at fixed intervals, which is more convenient, whereas AOF requires you to write scripts and schedule the copies yourself. RDB also restores data faster than AOF

(2) RDB has very little impact on the read and write services Redis provides, so Redis can maintain high performance: the main Redis process only needs to fork a child process and let the child perform the disk I/O for RDB persistence

AOF writes to a file on every write. Although writing to the OS cache is fast, there is still some time overhead, so AOF is definitely somewhat slower than RDB

(3) Restarting and restoring a Redis process directly from RDB data files is faster than with the AOF persistence mechanism

AOF stores a log of commands, so recovery has to replay and execute all of them to rebuild the data in memory. RDB is a data file that can be loaded directly into memory during recovery

Disadvantages of the RDB persistence mechanism

(1) If you want to lose as little data as possible when Redis fails, RDB is not as good as AOF. Generally, RDB snapshot files are generated every 5 minutes or more, so you have to accept that if Redis goes down, you lose the last 5 minutes of data

This is the biggest disadvantage of RDB. It is not suitable as the first-choice recovery scheme: if you rely on it first, more data will be lost

(2) Each time RDB forks a child process to generate a snapshot file, if the data set is very large, service to clients may pause for milliseconds or even seconds

In general, do not make the RDB interval too long. Otherwise each RDB file generated is too large, which may affect the performance of Redis itself

Advantages of AOF persistence mechanism

AOF protects better against data loss. Generally, AOF runs fsync once per second in a background thread to make sure the data in the OS cache is written to disk, so at most one second of data is lost
AOF log files are written in append-only mode, so there is no disk seek overhead, write performance is very high, and the file is not easily corrupted. Even if the tail of the file is damaged, it is easy to repair
Even when the AOF log file grows large and a background rewrite runs, client reads and writes are not affected. The rewrite compacts the commands into the smallest log needed to restore the data. While the new log file is being created, the old one is still written as usual. When the merged new log file is ready, the old and new log files are swapped
AOF log commands are recorded in a very readable form, which is ideal for emergency recovery from a catastrophic mistake. For example, if someone accidentally clears all data with the flushall command, then as long as the background rewrite has not yet happened, you can copy the AOF file immediately, delete the final flushall command from it, put the file back, and restore all the data automatically through the recovery mechanism

Disadvantages of AOF persistence mechanism

AOF log files are usually larger than RDB data snapshot files for the same data
With AOF enabled, the supported write QPS is lower than with RDB, because AOF is typically configured to fsync the log file once per second, although performance with once-per-second fsync is still high
AOF has had bugs in the past where recovering from the AOF log did not restore exactly the same data. A command-based log/merge/replay mechanism like AOF is more complex, and therefore more fragile and bug-prone, than RDB, which persists a complete snapshot each time. To avoid bugs in the rewrite process, however, AOF does not merge the old log during a rewrite; it rebuilds the log from the data in memory at that moment, which is much more robust
The only major disadvantages are that data recovery is relatively slow and that cold backups and regular backups are less convenient

How to choose between RDB and AOF

Do not use RDB alone, because that will cause you to lose a lot of data
Do not use AOF alone either, for two reasons. First, recovering from an AOF cold backup is not as fast as recovering from an RDB cold backup. Second, RDB simply generates a snapshot each time, which is more robust and avoids the bugs that a complex backup and recovery mechanism like AOF can have
Use RDB and AOF together

Can you explain how redis cluster mode works? How is redis key addressed in clustered mode? What are the algorithms for distributed addressing? Do you know consistent hash algorithms?

Interviewer psychological analysis

In the early years, if you wanted several Redis nodes that each stored part of the data, you had to use middleware such as Codis or Twemproxy. You read and wrote through the middleware, and the middleware was responsible for distributing your data across Redis instances on multiple machines.

Redis has kept evolving, and now everyone uses the native Redis cluster mode. You can deploy multiple Redis instances on multiple machines, each storing part of the data, and each master instance can have slave nodes attached, so that if the master goes down, the cluster switches over to a slave node.

Analysis of interview questions

Redis cluster architecture

The Redis cluster supports N Redis master nodes, and each master node can have multiple slave nodes attached. It supports a read/write separation architecture: for each master, writes go to the master and reads go to the master's slaves. It provides high availability: because each master has slave nodes, if the master fails, the Redis cluster mechanism automatically promotes a slave to master. Redis cluster = multi-master + read/write separation + high availability. We only need to build on Redis cluster; there is no need to set up replication, the master/slave architecture, read/write separation, a sentinel cluster, and high availability by hand

redis cluster vs replication + sentinel

If you have a small amount of data and mainly need high concurrency and high performance, for example a cache of a few gigabytes, a single machine is enough. Use replication with one master and multiple slaves, where the number of slaves depends on the read throughput you need, and then build a Sentinel cluster yourself to keep the master-slave architecture highly available. Redis cluster is mainly for scenarios with massive data + high concurrency + high availability: if your data volume is very large, Redis cluster is recommended

Data distribution algorithm: Hash + consistent hash+ Redis cluster hash slot

The oldest hash algorithm and its drawbacks:

Consistent Hash algorithm (automatic cache migration) + Virtual Node (automatic load balancing)
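As a rough illustration of consistent hashing with virtual nodes, the sketch below places each physical node on a ring many times and maps a key to the first point clockwise from its hash. The names `HashRing` and `stable_hash`, the use of MD5, and the default of 100 virtual nodes per node are all assumptions made for this example, not details taken from the text.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic hash (MD5 here) so results do not vary between runs."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent hash ring where each node owns many virtual points."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted list of (hash, node name)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Virtual nodes spread each physical node around the ring,
        # which evens out the load between nodes.
        for i in range(self.vnodes):
            bisect.insort(self._points, (stable_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._points = [p for p in self._points if p[1] != node]

    def get_node(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # First point at or after the key's hash, wrapping around at the end.
        idx = bisect.bisect_left(self._points, (stable_hash(key), ""))
        if idx == len(self._points):
            idx = 0
        return self._points[idx][1]
```

When a node is removed, only the keys that lived on that node move to other nodes; every other key keeps its placement. With plain `hash(key) % N`, by contrast, changing N remaps most keys.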

Hash Slot algorithm of redis cluster
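For reference, Redis Cluster maps every key to one of 16384 hash slots using CRC16 of the key modulo 16384, and each master owns a range of slots. The sketch below shows that calculation; it ignores hash tags (the `{...}` syntax), and the three-master slot ranges in `EXAMPLE_SLOT_RANGES` are only an illustrative even split, not something stated in the text.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (the XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


TOTAL_SLOTS = 16384


def key_slot(key: str) -> int:
    """Slot for a key; hash tags are deliberately not handled in this sketch."""
    return crc16_xmodem(key.encode("utf-8")) % TOTAL_SLOTS


# Hypothetical even split of the slot space across three masters.
EXAMPLE_SLOT_RANGES = {
    "master-1": range(0, 5461),
    "master-2": range(5461, 10923),
    "master-3": range(10923, 16384),
}


def node_for_key(key: str) -> str:
    slot = key_slot(key)
    for node, slots in EXAMPLE_SLOT_RANGES.items():
        if slot in slots:
            return node
    raise LookupError(f"no node owns slot {slot}")
```

Because keys map to slots and slots map to nodes, adding or removing a master only means moving some slots between nodes; the key-to-slot calculation never changes.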

Analysis of the core principles of Redis cluster: gossip communication, smart Jedis positioning, master/standby switchover

Centralized clustered metadata storage and maintenance:

The Gossip protocol maintains cluster metadata:

I. Internal communication mechanism between nodes

1. Basic communication principles

(1) The Redis Cluster nodes use gossip protocol to communicate

Unlike centralized storage, where cluster metadata (node information, failures, and so on) is kept on a single node, the nodes communicate with each other constantly to keep the metadata up to date

Every node in the cluster holds a complete copy of the metadata

There are two ways to maintain cluster metadata: centralized storage, and the gossip protocol

Centralized: the advantage is that metadata updates and reads are very timely. Once the metadata changes, it is immediately written to the centralized store, and other nodes see the change as soon as they read it

The disadvantage is that all metadata update pressure is concentrated in one place, which can put pressure on the metadata store

Gossip: the advantage is that metadata updates are distributed rather than concentrated in one place. Update requests reach all the nodes one after another, with some delay, which reduces the pressure

The disadvantage is that metadata updates are delayed, which may cause some cluster operations to lag

(2) Port 10000

Each node has a dedicated port for communication between nodes: its service port number + 10000. For example, a node serving on port 7001 uses port 17001 for communication between nodes

Each node will send ping messages to several other nodes at regular intervals, and the other nodes will return pong after receiving the ping

(3) Information exchanged

Fault information, node additions and removals, hash slot information, and so on

2. Gossip protocol

Ping messages in depth

II. Internal implementation principles of Jedis for cluster

1. Redirection-based client

2. Smart Jedis

High availability and principle of active/standby switchover

Can you talk about how we deal with cache avalanches and cache breakdowns in general?

What is redis avalanche and penetration? What happens when Redis crashes? How does the system deal with this? How to handle redis penetration?

Interviewer psychological analysis

This is a must-ask whenever caching comes up, because cache avalanche and cache penetration are the two biggest caching problems. Either they never occur, or they are fatal when they do, so the interviewer will definitely ask about them.

Analysis of interview questions

Symptoms of a cache avalanche:

Cache avalanche solutions: before, during, and after

Before: make Redis highly available to avoid a total crash

During: local EhCache cache + Hystrix rate limiting and degradation, so that the database is not overwhelmed

After: Redis persistence, for fast recovery of the cached data
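The "during" step can be sketched as a lookup chain: local cache first, then Redis, then the database behind a limiter, and a degraded default when the limiter refuses. This is a language-neutral illustration in Python, not EhCache or Hystrix themselves; `SimpleLimiter`, `lookup`, and the callables passed in are hypothetical names for this example.

```python
class SimpleLimiter:
    """Allows at most max_calls per window; reset() starts a new window."""

    def __init__(self, max_calls):
        self.max_calls = max_calls
        self.calls = 0

    def try_acquire(self):
        if self.calls >= self.max_calls:
            return False
        self.calls += 1
        return True

    def reset(self):
        self.calls = 0


def lookup(key, local_cache, redis_get, db_get, limiter, fallback=None):
    """Read a value while protecting the database when Redis is unavailable."""
    if key in local_cache:
        return local_cache[key]          # 1. small local cache
    try:
        value = redis_get(key)           # 2. Redis
    except ConnectionError:
        value = None                     # Redis is down: fall through
    if value is None:
        if not limiter.try_acquire():
            return fallback              # 3a. over the limit: degrade
        value = db_get(key)              # 3b. within the limit: hit the database
    if value is not None:
        local_cache[key] = value
    return value
```

With Redis failing and a limit of two database calls per window, the first two misses reach the database and the third returns the fallback, so the database sees a bounded load instead of the full traffic.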

Cache penetration symptoms and solutions:

How to ensure data consistency between the cache and the database in dual write?

Cache Aside Pattern

The most classic cache + database read/write pattern: cache aside pattern

1. Cache aside pattern

When reading, read the cache first. If the data is not in the cache, read the database, then put the fetched data into the cache

When updating, delete the cache first and then update the database
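The read and update paths just described can be sketched as follows. Plain dictionaries stand in for the database and for Redis, and `CacheAside` is a hypothetical name; the update order (delete the cache, then write the database) follows the wording of the text above.

```python
class CacheAside:
    """Cache-aside reads and updates over dict-like stand-ins for DB and cache."""

    def __init__(self, db, cache):
        self.db = db        # stand-in for the database
        self.cache = cache  # stand-in for Redis

    def read(self, key):
        value = self.cache.get(key)
        if value is None:
            # Cache miss: load from the database and populate the cache.
            value = self.db.get(key)
            if value is not None:
                self.cache[key] = value
        return value

    def update(self, key, value):
        # Delete the cached entry instead of rewriting it; the next read
        # recomputes it lazily, only if someone actually needs it.
        self.cache.pop(key, None)
        self.db[key] = value
```

After `update`, the cache holds nothing for that key, and the first subsequent `read` repopulates it from the database.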

2. Why delete cache instead of update cache

The reason is simple: the cached value is often not simply a value fetched directly from the database

For example, one field of a table may be updated, and computing the corresponding cached value then requires querying two other tables and doing a calculation to produce the latest value

Updating the cache is expensive

If the tables that a cached value depends on are modified frequently, the cache has to be updated frequently too

But the question is, will that cache entry be accessed frequently?

For example, if the fields of a table behind a cached value change 20 times in one minute, the cache is updated 20 times, yet it may be read only once in that minute

If you just delete the cache, it is recalculated only once in that minute, and the overhead drops significantly

In fact, deleting the cache instead of updating it is the idea of lazy computation. Rather than redoing a complex calculation every time whether or not it is needed, recalculate only when the value is actually used

Analysis and solution design of cache + database dual write inconsistency in high concurrency scenario

Take a cache with high real-time requirements as the example: the inventory service

Inventory may be modified, and the cached data should be updated each time. Once the inventory data in the cache expires or is cleared, the front-end Nginx service sends a request to the inventory service to fetch the corresponding data

The naive approach: when writing to the database, update the Redis cache directly

In reality it is not that simple. This raises the dual-write data inconsistency problem between database and cache

1. The most elementary cache inconsistency problem and its solution

Problem: modify the database first, then delete the cache. If deleting the cache fails, the database holds the new data while the cache holds the old data, and the two are inconsistent

Solution: delete the cache first, then modify the database. If the database modification fails, the database still holds the old data and the cache is empty, so the data is not inconsistent. Because the cache is empty, a read fetches the old data from the database and then puts it back into the cache

2. Analysis of a more complex data inconsistency

A data change arrives: the cache is deleted first, and the database has not yet been modified

A read request comes in, reads the cache first, finds it empty, queries the database, gets the old pre-modification data, and puts it into the cache

The data change then completes its database modification

Now the data in the database and in the cache differ

3. Why does this problem occur when hundreds of millions of traffic are concurrent?

This problem can occur only when concurrent reads and writes are performed on a single piece of data

If your concurrency is very low, especially read concurrency, say 10,000 visits per day, then the inconsistency just described will occur only very rarely

However, the problem is that if the daily traffic is hundreds of millions and the concurrent reads per second are tens of thousands, as long as there are data update requests per second, the above database + cache inconsistency may occur

After the high concurrency, there are still many problems

4. Asynchronously serializing database updates and cache reads

5. Problems needing attention in this solution in high concurrency scenarios

Can you talk about how redis concurrency competition should be addressed?

The interview questions

What are the concurrency competition problems in Redis? How do you solve them? Do you know the CAS scheme of Redis transactions?

Interviewer psychological analysis

This is also a very common problem in production. When multiple clients write the same key concurrently, data that should have arrived first may arrive later, leaving the wrong version of the data in place. Likewise, when multiple clients read a key at the same time, modify the value, and write it back, the data ends up wrong as soon as the order is wrong.

Redis has its own CAS-like optimistic locking scheme that naturally solves this problem.
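In Redis itself, optimistic locking is built from the WATCH, MULTI and EXEC commands: a transaction is aborted if a watched key changed before EXEC. The sketch below does not talk to Redis. It is an in-memory illustration of the same check-and-set idea, and `VersionedStore`, `compare_and_set`, and `increment` are hypothetical names.

```python
class VersionedStore:
    """In-memory stand-in for a key-value store with optimistic locking."""

    def __init__(self):
        self._data = {}   # key -> (value, version)

    def get(self, key):
        return self._data.get(key, (None, 0))

    def compare_and_set(self, key, new_value, expected_version):
        _, current_version = self.get(key)
        if current_version != expected_version:
            return False                      # another client wrote first
        self._data[key] = (new_value, current_version + 1)
        return True


def increment(store, key, retries=10):
    # Read, modify, write back; retry if the version moved in the meantime.
    for _ in range(retries):
        value, version = store.get(key)
        if store.compare_and_set(key, (value or 0) + 1, version):
            return True
    return False


store = VersionedStore()
store.compare_and_set("counter", 5, 0)

# Two clients read the same version, then both try to write back.
_, seen_by_a = store.get("counter")
_, seen_by_b = store.get("counter")
a_ok = store.compare_and_set("counter", 6, seen_by_a)
b_ok = store.compare_and_set("counter", 60, seen_by_b)   # stale version
```

The second write is rejected because its version is stale, so the client has to re-read and retry instead of silently overwriting the newer value.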

Analysis of interview questions

What is the deployment architecture of the Redis cluster in your company’s production environment?

The interview questions

How is Redis deployed in production?

Interviewer psychological analysis

If you do not understand the deployment architecture of your company’s production Redis cluster, you have been negligent. Is your Redis a master-slave architecture or a cluster architecture? Which clustering scheme is used? Is high availability guaranteed? Is persistence enabled so that data can be recovered? How many gigabytes of memory are allocated to Redis? What parameters are set? How much QPS does your Redis cluster carry under load testing?

Analysis of interview questions

The Redis cluster has 10 machines: 5 run Redis master instances and the other 5 run Redis slave instances, so each master has one slave. The 5 master nodes serve reads and writes, and the peak QPS of each node may reach 50,000. The maximum for the five machines is therefore 250,000 read/write requests per second.

Each machine has 32G of memory and an 8-core CPU, but only 10G is allocated to the Redis process. In a typical production environment, the memory of a Redis instance should preferably not exceed 10G, because beyond that problems may occur.

Because each master instance has a slave instance, the cluster is highly available. If any master instance goes down, failover happens automatically: the slave instance is promoted to master and continues to provide read and write services.

What data is written to memory, and how large is each item? It is commodity data at 10KB per item, so 100 items take 1MB and 100,000 items take 1G. The resident data set is 2 million commodity items, occupying 20G, which is less than 50% of the total memory.
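The sizing figures can be checked with a few lines of arithmetic. This is a sketch that uses the same round decimal units as the text (1,000KB per MB) and assumes that "total memory" means the 10G allocated to each of the 5 master instances, 50G in all.

```python
MASTERS = 5
PEAK_QPS_PER_NODE = 50_000
MEMORY_PER_INSTANCE_GB = 10          # memory allocated to each Redis process

ITEM_SIZE_KB = 10                    # one commodity record
RESIDENT_ITEMS = 2_000_000

cluster_peak_qps = MASTERS * PEAK_QPS_PER_NODE                 # requests per second
cluster_memory_gb = MASTERS * MEMORY_PER_INSTANCE_GB           # assumed total memory
resident_gb = RESIDENT_ITEMS * ITEM_SIZE_KB / 1_000_000        # KB -> GB, decimal units
memory_utilisation = resident_gb / cluster_memory_gb           # fraction of total in use
```

With these inputs the cluster peaks at 250,000 requests per second and the resident data takes 20G of 50G, that is 40% of the allocated memory.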
