Three paunchy, middle-aged men in plaid shirts walk up to you carrying three scratched-up Macs. One look at their shiny bald heads and you can tell: these must be top fucking architects.

Hello, young man. I'm going to ask you the basics and some common big-picture questions about caching. Can you tell me why Redis is so fast?

Hello, handsome and charming interviewer. Let's start from the essential difference between a relational database and Redis.

Redis is an in-memory key-value database written in C, using a single-process, single-threaded model. According to official figures it can reach 100,000+ QPS (queries per second).

  • Completely memory-based: the vast majority of requests are pure memory operations, which are very fast. The data lives in memory, organized much like a HashMap, so both lookup and update take O(1) time.
  • The data structures are simple and the operations on them are simple; Redis's data structures were specially designed for this.
  • Single-threaded: this avoids unnecessary context switches and race conditions. No CPU is burned on switching between processes or threads, there are no locks to worry about, no lock/unlock operations, and no performance lost to possible deadlocks.
  • It uses an I/O multiplexing model with non-blocking I/O.
  • The underlying model is different: the low-level implementation and the protocol used to talk with clients are not the same as a relational database's. Redis built its own VM mechanism directly, because calling general system functions wastes a certain amount of time moving and requesting data.

May I ask what context switching is?

Let me give an example. I remember a friend asking me on WeChat what a context switch is and why threads can be unsafe. I said: it's just like reading an English book. Ten pages in, you hit a word you don't know, so you put in a bookmark and go off to the dictionary. After a while you come back and resume from the bookmark. So far, no problem at all.

If you're the only one reading the book, there's no problem. But while you're off checking the dictionary, another friend wonders what you're reading, flips through your book, and slips away. Uh oh, when you look at it again, the book is no longer on the page you were reading.

I don't know whether I've explained it clearly so far. As for why threads are unsafe: one person reading is perfectly fine, but several people modifying the same book at once leaves its data in chaos. My analogy may be crude, but the idea is the same.
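The bookmark analogy can be turned into runnable code. Here's a minimal Java sketch (the class and method names are mine, not from the article): two threads hammer a plain int counter, and because `count++` is a read-modify-write that a context switch can interrupt, increments get lost. An AtomicInteger version is shown for contrast.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterDemo {
    // Run two threads executing the same task, then wait for both.
    private static void runBoth(Runnable task) {
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Unsafe: count++ is read-modify-write; a context switch between the
    // read and the write makes one thread overwrite the other's increment.
    static int unsafeCount(int perThread) {
        int[] count = {0};
        runBoth(() -> {
            for (int i = 0; i < perThread; i++) count[0]++;
        });
        return count[0]; // often less than 2 * perThread
    }

    // Safe: incrementAndGet is atomic, so no increment is ever lost.
    static int safeCount(int perThread) {
        AtomicInteger count = new AtomicInteger();
        runBoth(() -> {
            for (int i = 0; i < perThread; i++) count.incrementAndGet();
        });
        return count.get(); // always exactly 2 * perThread
    }
}
```

Single-threaded Redis sidesteps this whole class of problems: with one thread, nobody else can touch the book while you're at the dictionary.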

So it's single-threaded. We have multi-core servers now; isn't that wasteful?

Yes, it is single-threaded, but we can open multiple instances of Redis on a single machine.

Since you've mentioned the bottleneck of a single machine, how do you get past that bottleneck?

We use a cluster deployment, namely Redis Cluster, with master-slave synchronization and read-write splitting, similar to MySQL's master-slave replication. A Redis cluster supports N master nodes, and each master node can have multiple slave nodes attached.

This allows Redis to scale horizontally: if you want to support a larger cache, expand the number of master nodes horizontally so the cluster can hold more data.
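For reference, Redis Cluster decides which master owns a key by hashing the key into one of 16384 slots using CRC16 (the CRC-16/XMODEM variant), and each master owns a range of slots. A sketch of that mapping in Java (the class name is illustrative):

```java
class ClusterSlot {
    // CRC-16/XMODEM: polynomial 0x1021, initial value 0, no reflection.
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    // 16384 slots; 16384 is a power of two, so "% 16384" is just "& 16383".
    static int slotFor(String key) {
        return crc16(key.getBytes()) & 16383;
    }
}
```

(Real Redis additionally applies hash tags: if the key contains a `{...}` section, only that part is hashed, so related keys can be forced onto the same slot.)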

Oh? So the next question: how do the nodes interact with each other? And how does Redis persist data? Redis keeps its data in memory; after a power failure or a restart, wouldn't it all be gone?

Yes. Persistence is an important part of Redis high availability: because Redis keeps its data in memory, persistence is a must. I know of two persistence mechanisms.

  • RDB: the RDB persistence mechanism periodically snapshots the data in Redis.
  • AOF: the AOF mechanism logs every write command to a log file in append-only mode. Since this mode only appends, there is no disk-seek overhead, so it is very fast, a bit like MySQL's binlog.

Both mechanisms persist the in-memory Redis data to disk, and the files can then be backed up elsewhere. RDB is better suited for cold backups, AOF for hot backups. Say I'm an e-commerce company in Hangzhou with both kinds of files: I keep one backup on my Hangzhou node and another one in Shanghai. If some unavoidable natural disaster strikes, it can't take down both places at once; that's off-site disaster recovery, and no disaster short of destroying the earth gets both.

When both mechanisms are enabled, Redis defaults to using the AOF to rebuild the data on restart, because the AOF data is more complete than the RDB's.
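A minimal redis.conf sketch of how the two mechanisms are switched on (the save lines shown are the classic defaults; treat the exact values as illustrative):

```conf
# RDB: snapshot if at least N changes occurred within M seconds
save 900 1
save 300 10
save 60 10000

# AOF: off by default; turn it on for at-most-about-one-second data loss
appendonly yes
```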

What are the pros and cons of each?

Let me start with RDB

Advantages:

RDB generates multiple data files, each one representing the Redis data at a single moment in time. Doesn't that feel perfect for cold backups? With a complete set of data files you can set up a scheduled task to sync them to a remote server, for example Alibaba Cloud's services, so that if production goes down and you want to restore the data from however many minutes ago, you just go to the remote server and copy an earlier file back.

RDB has very little impact on Redis's performance, because while synchronizing data it only forks a child process to do the persistence. It also restores data faster than AOF does.

Disadvantages:

RDB works with snapshot files, generated every five minutes or more by default, which means all the data written between one snapshot and the next can be lost, up to five minutes' worth. AOF, by contrast, loses at most one second of data.

Also, if the snapshot file is very large, clients may pause for a few milliseconds or even a few seconds. Imagine your company is running a flash sale and Redis picks exactly that moment to fork a child process for a huge snapshot. Uh oh, big problem.

Let’s talk about AOF

Advantages:

As mentioned above, RDB takes a snapshot every five minutes or so, while AOF fsyncs once per second through a background thread, so AOF loses at most one second of data.

AOF writes to its log file in append-only mode. Because it only ever appends, it naturally avoids disk-seek overhead, its write performance is amazing, and the file is not easily corrupted.

AOF logs are recorded in a human-readable command format, which makes them suitable for emergency recovery after a catastrophic deletion. For example, if the company intern runs flushall and wipes all the data, you can copy the AOF log file before a background rewrite happens, delete the trailing flushall command, and you're done.


Disadvantages:

For the same data, the AOF file is larger than the RDB file.

AOF also costs performance. With appendfsync everysec, Redis asynchronously flushes the write buffer to disk once per second, which is already the recommended trade-off; if you fsync on every single write instead, the performance is probably too low to use, and you can think about why.
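The trade-off is controlled by the appendfsync setting in redis.conf; a sketch (the comments are my summary, not official wording):

```conf
# appendfsync controls when the AOF buffer is flushed to disk:
#   always   - fsync after every write; safest but slowest
#   everysec - fsync once per second (default trade-off; lose at most ~1s)
#   no       - let the OS decide; fastest but least safe
appendfsync everysec
```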

How do you choose between the two?

Only kids make choices. I want them both.

If you use RDB alone, you risk losing a lot of data. If you use AOF alone, your data recovery won't be as fast as with RDB, and when disaster strikes, RDB is the first thing you restore from. The combination of hot and cold backups is the mark of a robust system in the Internet era.

By the way, I heard you mention high availability. Is there any other way Redis can ensure high availability of clusters?

(You dug this hole for yourself on purpose: you dropped the phrase earlier this morning and waited for him to ask, just in case he didn't.)

Pretend to think about it for a moment (not for too long, lest he thinks you don't know). Oh, I remember, there's Sentinel.

Sentinel needs at least three instances to guarantee its own robustness. Sentinel plus master/slave replication does not guarantee zero data loss; what it guarantees is high availability of the cluster.

Why must there be three instances? Let's look at what happens with only two sentinels.

Say the master is down. If either sentinel S1 or S2 thinks the master is down, they will switch over, and one sentinel is elected to perform the failover. But that election requires a majority of sentinels to be up and running.

So what's wrong with that? If only M1 is down and S1 is still alive, that's actually fine. But what if the whole machine goes down? Then S2 is left as a single bare sentinel, with no majority to authorize a failover. Even though the replica R1 is still alive on the other machine, the failover is never performed.

A classic Sentinel cluster looks like this:

If the M1 machine goes down, there are still two sentinels left. They both see that the master is down, so they elect one of themselves to perform the failover. Problem solved.

Let me briefly summarize the main functions of the Sentinel component:

  • Cluster monitoring: monitors whether the Redis master and slave processes are working properly.
  • Message notification: if a Redis instance fails, the sentinel is responsible for sending a message as an alarm notification to the administrator.
  • Failover: if the master node fails, failover to a slave node is performed automatically.
  • Configuration center: notifies clients of the new master address when a failover occurs.
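A minimal sentinel.conf sketch tying these functions together (the master name, address, and timeouts are illustrative):

```conf
# Monitor a master called "mymaster"; the trailing 2 is the quorum:
# how many sentinels must agree the master is down before failover
sentinel monitor mymaster 127.0.0.1 6379 2
# Consider the master down after 30s without a valid reply
sentinel down-after-milliseconds mymaster 30000
# Abort a failover that takes longer than 3 minutes
sentinel failover-timeout mymaster 180000
```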

I remember you mentioned master-slave synchronization, can you talk about how data is synchronized between master and slave?

Interviewer, your memory is so good; I'd almost forgotten that I mentioned it. Thank you for bringing it up. It's closely related to the RDB and AOF persistence I talked about earlier.

Let me first explain why we use a master-slave architecture. As mentioned above, QPS on a single machine has a ceiling, and high read concurrency is a feature Redis must support. If one machine handled both reads and writes, who could stand that? Not any single one! But if you let the master handle the writes and synchronize the data out to slaves that serve all the reads, a huge number of requests get spread out, and when you need to expand, you can easily scale horizontally.

Anyway, how do they synchronize their data?

When you start a slave, it sends a psync command to the master. If the slave is connecting to the master for the first time, this triggers a full resynchronization. The master forks a background process to generate an RDB snapshot while caching all new write commands in memory. When the RDB file is ready, the master sends it to the slave. The first thing the slave does with the RDB file is write it to its local disk and then load it into memory. After that, the master sends the write commands it cached in memory over to the slave.

What if the network or a server fails during the data transfer?

If there is a network problem during the transfer, it will automatically reconnect, and after reconnecting, the missing data will be filled in.

It is important to remember that while the RDB snapshot is being generated, the master must also keep caching the new write requests. Otherwise, once the old data has been transferred, what would happen to the incremental writes that arrived during the synchronization? Right?

So with that said, can you talk about its memory eviction mechanism, and handwrite the LRU code?

Handwrite LRU? Don't you just want to jump up and say: are you kidding me?

This is a question I was personally asked in my third-round interview at Ant Financial. I don't know whether you've ever been asked it.

Redis's expiration strategy has two parts: periodic deletion plus lazy deletion.

Periodic deletion is easy to understand: by default, every 100ms Redis randomly samples some of the keys that have an expiration time set, checks whether they have expired, and deletes the ones that have.

Why not scan all keys with expiration dates?

If every key in Redis had an expiration time and you scanned all of them, that would be horrible, and online we almost always do set expiration times. A full scan is like querying your database with no WHERE clause and no index: a full table scan, every 100ms. Redis would be worked to death.

But if the random sampling doesn't cover many keys, won't a lot of expired keys be left over?

Good question: lazy deletion. As the name suggests, I don't delete proactively, I'm lazy. I wait until you query a key, then check whether it has expired. If it has expired, I delete it and return you nothing; if it hasn't, everything proceeds as normal.
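Lazy deletion is easy to sketch in Java. This is a toy model, not Redis's actual implementation: each entry carries an absolute expiry timestamp, and the check happens only when the key is read.

```java
import java.util.HashMap;
import java.util.Map;

// Lazy deletion in miniature: nothing is removed proactively;
// an expired entry is deleted the first time someone reads it.
class LazyExpiryCache<K, V> {
    private static class Entry<V> {
        final V value;
        final long expireAt; // absolute deadline in milliseconds
        Entry(V value, long expireAt) { this.value = value; this.expireAt = expireAt; }
    }

    private final Map<K, Entry<V>> map = new HashMap<>();

    void set(K key, V value, long ttlMillis) {
        map.put(key, new Entry<>(value, System.currentTimeMillis() + ttlMillis));
    }

    V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null) return null;
        if (System.currentTimeMillis() >= e.expireAt) {
            map.remove(key); // expired: delete on access and return nothing
            return null;
        }
        return e.value;
    }
}
```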

Finally, the if of ifs: what if the periodic deletion missed a key, and nobody ever queried it? What then?

The memory eviction mechanism!

The eviction policies given on the official website are as follows:

  • Noeviction: returns an error when the memory limit is reached and the client tries to execute a command that would use more memory (most write commands, with DEL and a few other exceptions).
  • Allkeys-lru: attempts to evict the least recently used keys (LRU) to make room for newly added data.
  • Volatile-lru: attempts to evict the least recently used keys (LRU), but only among keys that have an expiration set, so that newly added data has space to be stored.
  • Allkeys-random: evicts random keys to make room for newly added data.
  • Volatile-random: evicts random keys to make room for newly added data, but only among keys that have an expiration set.
  • Volatile-ttl: evicts keys that have an expiration set, preferring keys with a shorter TTL, to make room for newly added data. The policies volatile-lru, volatile-random, and volatile-ttl are designed to behave like noeviction if no key satisfies the eviction preconditions.
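These policies are selected in redis.conf; a sketch (the values are illustrative):

```conf
maxmemory 256mb
maxmemory-policy allkeys-lru
# How many keys to sample per eviction; higher is closer to true LRU but costs more CPU
maxmemory-samples 5
```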

As for LRU, let me just touch on it briefly; the real handwritten version is too long, and you can look it up on the Redis official website. I'll just show you the effect of the approximate LRU.

The reason Redis doesn't use a true LRU implementation is that it would require too much memory. For applications, however, the approximate LRU algorithm is effectively equivalent. The true LRU algorithm can be compared with the approximation in the following image.

You can see three kinds of dots in the picture, forming three bands.

  • The light gray band is objects that have been evicted.
  • The gray band is objects that have not been evicted.
  • The green band is objects that have just been added.
  • In a theoretical LRU implementation, we expect the older half of the keys to expire first. Redis's LRU algorithm instead expires the older keys probabilistically.

As you can see, Redis 3.0 with 5 samples does a better job than Redis 2.8, although in Redis 2.8 most of the recently accessed objects are still retained. With a sample size of 10, Redis 3.0's approximation is already very close to the theoretical performance.

Note that LRU is only a model for predicting how likely a given key is to be accessed in the future. Moreover, if your data access pattern closely follows a power law, most accesses will concentrate on a small set of keys, and the LRU approximation handles that well.

In fact, the LinkedHashMap we're all familiar with can also implement an LRU algorithm.

When the capacity exceeds 100, the LRU policy kicks in: the least recently used entry (a TimeoutInfoHolder object, in that example) is evicted.
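The code snippet the article refers to isn't shown, but a minimal sketch of the LinkedHashMap approach looks like this (the class name and capacity handling are mine; the article's version used a capacity of 100 and TimeoutInfoHolder values):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LRU on top of LinkedHashMap: accessOrder=true moves every entry you get()
// to the tail, and removeEldestEntry evicts the head once capacity is exceeded.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(capacity, 0.75f, true); // true = order entries by access, not insertion
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the least recently used entry
    }
}
```

Overriding `removeEldestEntry` is the hook LinkedHashMap provides exactly for this purpose: it is consulted after every put, so eviction happens automatically.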

In a real interview, you may well be asked to handwrite an LRU algorithm. Don't attempt the original Redis one, that's really a lot of code. Instead, pick a data structure and implement a Java version of LRU, like the LinkedHashMap approach above; it's relatively easy.

Young man, you really do have something. HRBP will contact you; please make sure to keep your cell phone on, okay?

All right, thank you, interviewer. Great interviewer. I want to meet a few more times.

All right, that's enough fun. Don't mistake the jokes for the real thing; I wrote it this way for the sake of the show. Please take your interviews seriously.

This article is sourced from the web and is for learning purposes only; if there is any infringement, contact me for deletion.

I have collected quality technical articles and experience summaries in my public account "Java Circle".

To make studying easier, I have also organized a set of learning materials covering the Java virtual machine, the Spring framework, Java threads, data structures, design patterns, and more, free for students who love Java!