preface

With working life largely back on track, interview season is here, and some of you have already received an offer or two.

During this period, we collected Java interview questions from Alibaba, Tencent, Baidu, JD.com, Meituan, ByteDance, and other companies, and compiled this series of high-frequency Redis interview questions:

1. Redis persistence mechanism

2. Cache avalanche, cache penetration, cache warm-up, cache update, cache degradation and other problems

3. What are hot and cold data

4. What are the differences between Memcache and Redis?

5. Why is single-threaded Redis so fast

6. Data types of Redis and usage scenarios of each data type

7. Redis expiration strategy and memory eviction mechanism

8. Why is Redis single threaded

9. Common Redis performance problems and solutions

10. Why are Redis operations atomic, and how is atomicity guaranteed?

11. Redis transactions

Analysis of the high-frequency Redis interview questions

Redis persistence mechanism

Redis is an in-memory database that supports persistence. Through the persistence mechanism, data in memory is synchronized to disk files to ensure data persistence. When Redis is restarted, data can be recovered by reloading the hard disk files into memory.

RDB implementation: Redis fork()s a separate child process that shares the parent's in-memory dataset. The child writes the data to a temporary file; when the snapshot is complete, the temporary file replaces the previous snapshot file, the child process exits, and its memory is released.

RDB is the default Redis persistence mode: the in-memory data is saved to a snapshot file on disk at configured intervals. The generated data file is dump.rdb, and the snapshot period is controlled by the save directives in the configuration file. (A snapshot is a point-in-time copy of the data it represents.)

AOF: Redis appends every received write command to the end of a file (using the write function), similar to MySQL's binlog. When Redis restarts, it rebuilds the entire database in memory by re-executing the write commands saved in that file.

When both methods are enabled, Redis preferentially selects AOF for data recovery.
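Both mechanisms are controlled in redis.conf. A minimal illustrative fragment (the numeric values are examples, not recommendations):

```conf
# RDB: snapshot if >=1 key changed in 900s, >=10 in 300s, >=10000 in 60s
save 900 1
save 300 10
save 60 10000
dbfilename dump.rdb

# AOF: append every write command; fsync once per second
appendonly yes
appendfsync everysec
```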

Cache avalanche, cache penetration, cache warming, cache update, cache degradation and other issues

1. Cache avalanche

Put simply: the old cache entries expire while the new cache has not yet been populated (for example, when we give many entries the same expiration time, a large batch of cache entries expires at the same moment). All the requests that should have hit the cache go to the database instead, putting enormous pressure on the database's CPU and memory; in serious cases the database goes down, triggering a chain reaction that can bring down the whole system.

Solutions:

Most system designers consider locking (the most common solution) or queuing to ensure that there are not too many threads reading or writing to the database at once, thus avoiding a large number of concurrent requests falling on the underlying storage system in the event of a failure. Another simple solution is to spread out cache expiration times.
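The "spread out cache expiration times" idea amounts to adding a random jitter to each TTL so entries do not all expire at once. A minimal sketch (function name and values are illustrative):

```python
import random

def ttl_with_jitter(base_seconds: int, jitter_seconds: int) -> int:
    """Spread cache expirations by adding a random offset to the base TTL."""
    return base_seconds + random.randint(0, jitter_seconds)

# With a real client this would be used as, e.g.:
#   redis.set(key, value, ex=ttl_with_jitter(3600, 600))
```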

2. Cache penetration

Cache penetration occurs when a user queries data that does not exist in the database, and therefore cannot exist in the cache. The lookup misses the cache, so the database is queried every time, and every time it returns null (two useless queries per request). Such requests bypass the cache and hit the database directly, which is the cache hit-ratio problem that is often mentioned.

Solutions:

The most common solution is a Bloom filter: hash all possibly existing data into a sufficiently large bitmap, so that a query for data that definitely does not exist is intercepted by the bitmap, avoiding query pressure on the underlying storage system.

There is also a simpler, cruder approach: if a query returns null (whether because the data does not exist or because the system failed), we still cache the empty result, but with a short expiration time of no more than five minutes. Storing this default value in the cache means the second lookup is served from the cache, without hitting the database again.
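The "cache the empty result" approach can be sketched with an in-memory dict standing in for Redis (the sentinel object and TTL values are illustrative assumptions):

```python
import time

cache = {}                    # in-memory stand-in for Redis: key -> (value, expires_at)
NULL_SENTINEL = object()      # hypothetical marker meaning "confirmed absent from DB"
NULL_TTL = 60                 # short TTL for empty results (well under five minutes)

def db_lookup(key):
    """Simulate a database lookup for data that does not exist."""
    return None

def get_with_null_caching(key):
    entry = cache.get(key)
    if entry is not None and entry[1] > time.time():
        value = entry[0]
        return None if value is NULL_SENTINEL else value
    value = db_lookup(key)
    if value is None:
        # Cache the miss briefly so repeated queries skip the database
        cache[key] = (NULL_SENTINEL, time.time() + NULL_TTL)
        return None
    cache[key] = (value, time.time() + 3600)
    return value
```

The second request for the same missing key is answered from the cache without touching the database.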

A 5 TB hard disk is full of data; write an algorithm to de-duplicate the data. What if each record is 32-bit? What if it is 64-bit?

Two structures that use space in an extreme way are bitmaps and Bloom filters.

Bitmap: essentially a hash-style mapping from each element to a single bit.

The downside is that a bitmap records only 1 bit of information per element, so any extra functionality requires sacrificing more space.

Bloom filter (recommended)

It introduces k (k > 1) mutually independent hash functions to perform duplicate detection within a given amount of space and with a bounded false-positive rate.

Its advantage is that its space efficiency and query time far exceed those of general-purpose algorithms; its disadvantages are a certain false-positive rate and the difficulty of deletion.

The core idea of the Bloom filter algorithm is to use multiple different hash functions to resolve "conflicts". A single hash has a collision problem: two different URLs may hash to the same value. To reduce conflicts, we introduce several additional hashes. If any one of them tells us an element is not in the set, then the element is definitely not in the set; only when all the hash functions say the element is present do we treat it as (probably) in the set. That is the basic idea behind the Bloom filter.

Bloom filters are generally used to determine whether an element exists in a very large data set.
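A minimal Bloom filter along these lines (the parameters m and k are illustrative; a real deployment would size them from the expected element count and the target false-positive rate):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k independent hashes over an m-bit array."""

    def __init__(self, m_bits: int = 1 << 16, k: int = 4):
        self.m, self.k = m_bits, k
        self.bits = bytearray(m_bits // 8)

    def _positions(self, item: str):
        # Derive k "independent" hashes by salting SHA-256 with the index
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item: str):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True may be a false positive
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))
```

A query answering False is guaranteed correct, which is exactly the property used to intercept cache-penetrating requests.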

3. Cache preheating

Cache preheating should be a familiar concept: after the system goes online, the relevant cache data is loaded into the cache system in advance. This avoids the pattern of querying the database first and then caching the data when a user makes a request; users directly query cache data that was preheated ahead of time.

Solution:

(1) Write a cache refresh page directly and operate it manually when it goes online;

(2) The amount of data is not large and can be loaded automatically when the project is started;

(3) Periodically refresh the cache;

4. Cache update

In addition to the cache invalidation policies that come with the cache server (Redis has six policies to choose from by default), we can also implement customized cache invalidation based on business requirements. There are two common strategies:

(1) Periodically clean expired cache;

(2) When a user request arrives, check whether the cache entry used by the request has expired; if it has, fetch fresh data from the underlying system and update the cache.

Both have advantages and disadvantages: the first is troublesome when maintaining a large number of cache keys; the second checks cache validity on every user request, so the logic is relatively complex. Weigh the two according to your own application scenario.
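Strategy (2), checking expiry on each request, can be sketched as follows (the class and parameter names are illustrative):

```python
import time

class ExpiringCache:
    """Check expiry on each request; refresh from the backend if stale."""

    def __init__(self, loader, ttl_seconds):
        self.loader, self.ttl = loader, ttl_seconds
        self.store = {}          # key -> (value, expires_at)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None or entry[1] <= time.time():
            value = self.loader(key)                     # fetch fresh data
            self.store[key] = (value, time.time() + self.ttl)
            return value
        return entry[0]                                  # still fresh
```

The loader is only invoked on a miss or an expired entry; fresh entries are served from the store.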

5. Cache degradation

When traffic surges, service problems (such as slow or unresponsive response times) occur, or non-core services affect the performance of the core process, you still need to ensure that the service is still available, even at the expense of the service. The system can automatically degrade according to some key data, or manually degrade by configuring switches.

The ultimate goal of a downgrade is to ensure that the core service is available, even if it is lossy. And some services can’t be downgraded (add to cart, checkout).

A reference plan modeled on log levels:

(1) General: for example, a service that occasionally times out because of network jitter or a service coming online can be degraded automatically;

(2) Warning: If the success rate of some services fluctuates within a period of time (such as between 95 and 100%), it can be automatically degraded or manually degraded, and an alarm can be sent;

(3) Error: for example, the availability rate drops below 90%, the database connection pool is exhausted, or traffic suddenly jumps to the maximum threshold the system can withstand; degrade automatically or manually as appropriate;

(4) Serious error: for example, the data is wrong for special reasons, and emergency manual degradation is needed.

The purpose of service degradation is to prevent a Redis failure from causing an avalanche onto the database. Therefore, for unimportant cached data, a degradation strategy can be adopted: for example, a common practice is that when Redis has a problem, instead of querying the database, requests directly return a default value to the user.

What are hot data and cold data

For hot data, caching is valuable.

For cold data, much of it may be squeezed out of memory before it is ever accessed again; it occupies memory while providing little value.

For hot data, take one of our IM products as an example: in the birthday-greeting module, the day's list of birthday users may be read from the cache hundreds of thousands of times.

Another example is a navigation product where we cache the navigation information and then read it millions of times.

For a cache to make sense, the data should be read at least twice before it is updated. That is the basic rule of thumb; if the cache expires before it is ever hit, it provides little value.

What about data that exists and changes frequently, but where you still have to consider caching? Such scenarios exist! For example, when a read interface puts very heavy pressure on the database but the data is hot, we need to consider caching to reduce database pressure. In some of our assistant products, likes, favorites, and shares are typical hot data that changes constantly; in this case, the data needs to be synchronized to the Redis cache to reduce database pressure.

What are the differences between Memcache and Redis?

1. Memcached stores all data in memory; after a power failure the data is gone. Redis can keep part of its data on disk and can persist its data, so it survives restarts.

2. Memcached values are simple strings. Redis supports richer data types and can store structures such as list, set, zset, and hash.

3. They use different underlying models: the underlying implementations and the application protocols used to communicate with clients differ. Redis built its own VM mechanism, because ordinary system calls waste a certain amount of time moving and requesting memory.

4. Different maximum value sizes: a Redis value can be up to 512 MB, while Memcached limits values to 1 MB.

5. Redis is much faster than Memcached.

6. Redis supports data backup, namely master-slave replication.

Why is single-threaded Redis so fast

1. Pure in-memory operations.

2. Single-threaded operation, avoiding frequent context switches.

3. Use of a non-blocking I/O multiplexing mechanism.

Redis data types and usage scenarios for each data type

Answer: Five in all

1. String

The most common set/get operations; the value can be either a string or a number. Commonly used for caching counters and other computed values.

2. Hash

Here the value stores a structured object, and it is convenient to manipulate a single field within it. When the author built single sign-on, this data structure was used to store user information, with the cookieId as the key and 30 minutes as the cache expiration time, which simulates a session-like effect very well.

3. List

Using the list data structure, you can build a simple message queue. You can also use the lrange command to implement Redis-based paging, with excellent performance and a good user experience. Another scenario the author uses, and a very fitting one, is fetching market data: it is a producer/consumer scenario, and a list handles the queueing and the first-in, first-out principle well.
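The lrange-based paging idea can be sketched with a plain Python list standing in for a Redis list. Note that Redis's LRANGE end index is inclusive, unlike Python slicing; the helper below mimics that convention (function names are illustrative; with redis-py the call would be r.lrange(key, start, end)):

```python
def lrange(lst, start, end):
    """Mimic Redis LRANGE: inclusive end index, negatives count from the tail."""
    n = len(lst)
    if start < 0:
        start = max(n + start, 0)
    if end < 0:
        end = n + end
    return lst[start:end + 1]

def page(lst, page_no, page_size):
    """Fetch one page using LRANGE semantics (page numbers start at 0)."""
    start = page_no * page_size
    return lrange(lst, start, start + page_size - 1)
```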

4. Set

A set is a collection of non-repeating values, so it can implement global deduplication. Why not use the JVM's own Set for deduplication? Because our systems are generally deployed as clusters, using the JVM's Set is troublesome: global deduplication would require standing up yet another shared service, which is too much hassle.

In addition, with intersection, union, and difference operations, you can compute common preferences, all preferences, one's own unique preferences, and similar features.
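These set operations correspond to Redis's SINTER, SUNION, and SDIFF commands; illustrated here with Python sets standing in for Redis sets (the example data is made up):

```python
# Two users' preference sets, as they might be stored in Redis sets
my_likes   = {"redis", "kafka", "mysql"}
your_likes = {"redis", "netty", "mysql"}

common = my_likes & your_likes   # SINTER: shared preferences
all_of = my_likes | your_likes   # SUNION: combined preferences
mine   = my_likes - your_likes   # SDIFF: preferences unique to me
```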

5. Sorted set

A sorted set attaches a score to each element and keeps elements ordered by that score, so it is commonly used for ranking scenarios such as leaderboards, or for weighted queues.

Redis expiration strategy and memory eviction mechanism

Redis uses a periodic delete + lazy delete strategy.

Why not use a timed deletion policy?

A timer would monitor each key and delete it automatically on expiry. Although this releases memory promptly, it consumes a lot of CPU. Under heavy concurrent requests, the CPU should spend its time processing requests, not deleting keys, so this strategy is not used.

How does periodic deletion + lazy deletion work?

Periodic deletion: by default, Redis checks every 100 ms for expired keys and deletes any it finds. To be clear, Redis does not check all keys every 100 ms, but randomly samples them (checking every key every 100 ms would stall Redis). Therefore, if you rely on the periodic deletion policy alone, many expired keys will never be deleted.

This is where lazy deletion comes in handy: when you read a key, Redis checks whether the key has an expiration time set and whether it has expired; if it has expired, it is deleted at that moment.

Are there no other problems with periodic deletion + lazy deletion?

There are: if a key escapes periodic deletion and you also never request it again, lazy deletion never triggers either. In this way Redis's memory usage climbs higher and higher, and memory eviction must then kick in.

There is a line of configuration in redis.conf

maxmemory-policy volatile-lru

This line configures the memory eviction policy (what, you haven't configured it? Go take a good look at your setup).

volatile-lru: from the set of keys with an expiration time (server.db[i].expires), evict the least recently used key.

volatile-ttl: from the set of keys with an expiration time (server.db[i].expires), evict the key closest to expiring.

volatile-random: from the set of keys with an expiration time (server.db[i].expires), evict a random key.

allkeys-lru: from the whole dataset (server.db[i].dict), evict the least recently used key.

allkeys-random: from the whole dataset (server.db[i].dict), evict a random key.

noeviction: never evict data; new write operations will report an error.

PS: if no key has an expiration time set, the prerequisite is not met, and the volatile-lru, volatile-random, and volatile-ttl policies behave essentially the same as noeviction.
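The LRU idea behind allkeys-lru can be sketched with an OrderedDict (note that real Redis uses an approximate, sampled LRU rather than an exact one; this is only a model of the concept):

```python
from collections import OrderedDict

class LRUCache:
    """Sketch of LRU eviction: drop the least recently used key when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)            # mark as most recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)     # evict least recently used
```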

Why is Redis single threaded

According to the official FAQ, because Redis is a memory-based operation, CPU is not the bottleneck of Redis. The bottleneck of Redis is most likely the size of machine memory or network bandwidth. Since single-threading is easy to implement and the CPU is not a bottleneck, it makes sense to go with a single-threaded solution (there is a lot of trouble with multi-threading after all!). Redis uses queue technology to turn concurrent access into serial access

1. Most requests are pure memory operations (very fast).

2. Single-threaded, avoiding unnecessary context switches and race conditions.

3. Non-blocking I/O.

Advantages:

(1) Fast, because the data is stored in memory, similar to a HashMap, whose advantage is that lookup and update have O(1) time complexity.

(2) Support rich data types, support string, list, set, sorted set, hash

(3) Supports transactions; operations are atomic, meaning the changes are either all applied or not applied at all.

(4) Rich features: usable for caching and messaging; an expiration time can be set per key, after which the key is deleted automatically.

How to solve the problem of concurrent contention on a Redis key

Multiple subsystems set the same key simultaneously — what should we pay attention to? Redis transactions are not recommended here. Our production environment is basically a Redis cluster with data sharding: if a transaction involves multiple keys, they may not all be stored on the same Redis server, so the Redis transaction mechanism is very weak in this scenario.

(1) If the operations on the key do not require ordering: prepare a distributed lock; everyone competes for the lock, and whoever grabs it performs the set operation. And so on.

(2) Use a queue to turn the set operations into serial access. Also note that under high concurrency, if you need read/write consistency on a key, operations on Redis themselves are atomic and thread-safe, so you do not have to worry about concurrency inside Redis itself; Redis handles that problem for you.
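The distributed-lock approach in (1) is commonly built on SET key token NX EX ttl, plus a token check on release so one client cannot release another's lock. A sketch with an in-memory dict standing in for Redis (names and the TTL are illustrative):

```python
import time
import uuid

locks = {}   # in-memory stand-in for Redis: lock name -> (token, expires_at)

def acquire(name, ttl=10):
    """Like SET name token NX EX ttl: succeed only if the lock is free or expired."""
    now = time.time()
    entry = locks.get(name)
    if entry is not None and entry[1] > now:
        return None                       # someone else holds the lock
    token = str(uuid.uuid4())             # unique token identifies the holder
    locks[name] = (token, now + ttl)
    return token

def release(name, token):
    """Release only if we still hold the lock (compare the token first)."""
    entry = locks.get(name)
    if entry is not None and entry[0] == token:
        del locks[name]
        return True
    return False
```

With a real client, the token check plus delete on release should itself be atomic (typically done with a small Lua script).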

Redis Common performance issues and solutions?

1. It is best for the Master not to do any persistent work, such as RDB memory snapshots and AOF log files

2. If the data is important, a Slave enables AOF backup and synchronizes data once per second

3. To ensure the replication speed and connection stability, it is recommended that the Master and Slave reside in the same LAN

4. Try to avoid adding Slaves to a Master that is already under pressure.

5. Use a chained replication structure: Master <- Slave1 <- Slave2 <- Slave3 …

Why are Redis operations atomic and how are they guaranteed to be atomic?

For Redis, the atomicity of a command means that an operation is not separable and that the operation is either executed or not executed.

Redis operations are atomic because Redis is single-threaded.

All apis provided by Redis are atomic operations, and transactions in Redis are meant to be atomic for batch operations.

Are multiple commands atomic in concurrency?

Not necessarily. Turn a get-then-set sequence into a single-command operation such as INCR, or use Redis transactions, or Redis + Lua scripting.

Redis transactions

Redis transaction functionality is implemented through four primitives: MULTI, EXEC, DISCARD and WATCH

Redis serializes all the commands in a transaction and executes them sequentially.

1. Redis does not support rollback: when a command in a transaction fails, Redis does not roll back, but continues executing the remaining commands, to keep things simple and fast.

2. If a command fails to enqueue (for example, a syntax error detected while it is being queued), none of the commands in the transaction are executed.

3. If a runtime error occurs while the transaction executes, the other, correct commands are still executed.

(1) MULTI is used to start a transaction and always returns OK. After MULTI, the client can send any number of commands to the server; these commands are not executed immediately but are placed in a queue. When the EXEC command is invoked, all commands in the queue are executed.

(2) EXEC: executes all commands in the transaction block, returning their results in execution order. Returns nil when the transaction is aborted.

(3) By calling DISCARD, the client can empty the transaction queue and abandon the transaction, and the client will exit from the transaction state.

(4) WATCH provides check-and-set (CAS) behavior for Redis transactions: it can monitor one or more keys, and if any of them is modified (or deleted) before EXEC, the subsequent transaction is not executed. Monitoring continues until the EXEC command is called.
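The queue-then-execute behavior of MULTI/EXEC/DISCARD described above can be modeled with a toy class (class and function names are illustrative, not a real client API):

```python
class MiniTx:
    """Toy model of MULTI/EXEC/DISCARD: commands queue up, EXEC runs them in order."""

    def __init__(self, store):
        self.store = store
        self.queue = None          # None means "not inside a transaction"

    def multi(self):
        self.queue = []
        return "OK"

    def command(self, fn, *args):
        if self.queue is not None:
            self.queue.append((fn, args))   # queued, not executed yet
            return "QUEUED"
        return fn(self.store, *args)        # outside a transaction: run now

    def exec(self):
        results = [fn(self.store, *args) for fn, args in self.queue]
        self.queue = None
        return results                      # results in execution order

    def discard(self):
        self.queue = None                   # drop all queued commands
        return "OK"

def set_cmd(store, k, v):
    store[k] = v
    return "OK"
```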

Finally

Welcome to follow the WeChat official account "Programmers Chasing the Wind" to get a 300-page PDF summarizing Java core knowledge!

These cover the topics interviewers tend to ask about: basics, Java collections, the JVM, multi-threading and concurrency, Spring principles, microservices, Netty and RPC, Kafka, logging, design patterns, Java algorithms, databases, Zookeeper, distributed caching, data structures, and more.