Today I interviewed for a Java development position at a large company. A middle-aged man walked toward me, a slightly worn MacBook with a bright screen in his hand. He gave me a polite smile and said, “Sorry to keep you waiting.” Then he motioned for me to sit down and said, “Let’s start. After reading your resume, I think you have a good command of Redis.” I thought to myself, “Bring it on.”

What is Redis


Interviewer: Why don’t you tell me what Redis is first?

Redis is an open-source (BSD-licensed), high-performance key-value in-memory database written in C. It can be used as a database, cache, message middleware, and more.

It is a kind of NoSQL (not only SQL, generally refers to non-relational database) database.

I paused, then added, Redis as an in-memory database:

  • Excellent performance: data lives in memory, so reads and writes are very fast; Redis can handle on the order of 100,000 QPS.

  • Single-process, single-threaded model, so command execution is thread-safe; it uses an I/O multiplexing mechanism.

  • Rich data types, supporting strings, hashes, lists, sets, sorted sets, etc.

  • Data persistence is supported.

    You can save the data in memory to disk and load it upon restart.

  • Master-slave replication and Sentinel support for high availability.

  • Can be used as a distributed lock.

  • It can be used as messaging middleware, supporting publish and subscribe.

Five data types


Interviewer: That’s a good summary. It seems that you have prepared for it. I just heard you mentioned that Redis supports five data types. Can you briefly explain these five data types?

Me: Sure, but before I do, I think it’s worth taking a look at how Redis internal memory management describes these five data types.

With that, I drew a picture for the interviewer:

Me: First, Redis internally uses a redisObject to represent all keys and values.

The main information of a redisObject is shown in the figure above: Type specifies the data type of a value object, and Encoding specifies how different data types are stored in Redis.

For example, type=string means the value is an ordinary string, so its encoding can be raw or int.

I paused for a moment, then said, let me briefly describe the five data types:

①String is the most basic Redis type. You can think of it as the direct analogue of Memcached: one key, one value. A value can be a number as well as a string.

The String type is binary safe, meaning that Redis strings can contain any data, such as JPG images or serialized objects. A String value can store a maximum of 512 MB.

②Hash is a collection of key-values. Redis Hash is a mapping of String keys and values, and Hash is especially good for storing objects. Common commands: hget, hset, hgetall, etc.

③List is a simple list of strings, sorted by insertion order. You can add elements to the head (left) or tail (right) of the list. Common commands: lpush, rpush, lpop, rpop, lrange (get a range of the list), etc.

Application scenarios: the List has many uses and is one of the most important Redis data structures. For example, Twitter's following list and follower list can both be implemented with the List structure.

Data structures: Lists are linked lists that can be used as message queues. Redis provides Push and Pop operations on a List, as well as an API for manipulating a segment, allowing you to query or delete elements of a segment directly.

Implementation: the Redis List is a doubly linked list, which supports reverse lookup and traversal and makes operations more convenient, at the cost of extra memory overhead.

④Set is an unordered collection of String values, implemented via a hash table. Elements in a Set are unordered and unique. Common commands: sadd, spop, smembers, sunion, etc.

Application scenario: a Redis Set provides functions similar to a List's, except that a Set de-duplicates automatically and can test whether a member is present.

⑤Zset, like Set, is a Set of elements of type String, and no duplicate elements are allowed. Common commands include zadd, zrange, zrem, and zcard.

Usage scenario: The Sorted Set can be Sorted by the user providing an additional priority (score) parameter, and is inserted in order, that is, automatic sorting.

When you need an ordered and non-repetitive list of collections, choose the Sorted Set structure.

Compared with Set, Sorted Set is associated with a parameter Score with the weight of Double type, so that the elements in the Set can be Sorted in order according to Score. Redis sorts the members of the Set from smallest to largest by Score.

Redis Sorted Set internally uses HashMap and skipList to ensure that data is stored and ordered. HashMap is a mapping of members to Score.

The skip list stores all the members and sorts according to the Score stored in HashMap. The structure of skip list can obtain relatively high search efficiency and is relatively simple to implement.
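The dual structure described above can be sketched in plain Java. This is a toy approximation (class and method names are my own): the HashMap gives O(1) member-to-score lookups, while a TreeMap stands in for the skip list to keep members ordered by score; real Redis implements the skip list in C.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model of a Redis sorted set: hash for lookups, ordered map for ranges.
public class MiniZSet {
    private final Map<String, Double> scores = new HashMap<>();          // member -> score
    private final TreeMap<Double, TreeSet<String>> byScore = new TreeMap<>(); // score -> members

    public void zadd(String member, double score) {
        Double old = scores.put(member, score);
        if (old != null) { // re-adding a member moves it to its new score
            byScore.get(old).remove(member);
            if (byScore.get(old).isEmpty()) byScore.remove(old);
        }
        byScore.computeIfAbsent(score, s -> new TreeSet<>()).add(member);
    }

    public Double zscore(String member) {
        return scores.get(member); // O(1) via the hash side
    }

    public List<String> zrangeAll() { // members in ascending score order
        List<String> out = new ArrayList<>();
        for (TreeSet<String> bucket : byScore.values()) out.addAll(bucket);
        return out;
    }

    public static void main(String[] args) {
        MiniZSet z = new MiniZSet();
        z.zadd("a", 3); z.zadd("b", 1); z.zadd("c", 2);
        System.out.println(z.zrangeAll()); // [b, c, a]
        System.out.println(z.zscore("c")); // 2.0
    }
}
```

The point of keeping two structures is that neither alone is enough: the hash cannot answer range queries, and the ordered structure cannot answer "what is this member's score" in O(1).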

Summary of application scenarios of data types:

Interviewer: I can see you've put real time into this. You must have used the Redis cache in practice, right?

Me: Yes.

Interviewer: Can you tell me how you use it?

Me: I use it together with Spring Boot. There are generally two approaches: either use RedisTemplate directly, or integrate Redis through Spring Cache (i.e., annotations).

Redis cache


Whether you use RedisTemplate directly or go through Spring Cache, first add the following dependencies to pom.xml:

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-redis</artifactId>
    </dependency>
    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-pool2</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.session</groupId>
        <artifactId>spring-session-data-redis</artifactId>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>

  • spring-boot-starter-data-redis: since Spring Boot 2.x, the underlying client is no longer Jedis but Lettuce.
  • commons-pool2: used for the Redis connection pool; if it is missing, an error is reported at startup.
  • spring-session-data-redis: pulls in Spring Session, used here for session sharing.

Config file application.yml:

server:
  port: 8082
  servlet:
    session:
      timeout: 30m
spring:
  cache:
    type: redis
  redis:
    host: 127.0.0.1
    port: 6379
    password:
    # Redis database (shard) index to use; the default is 0
    database: 0
    lettuce:
      pool:
        # Maximum number of connections in the pool (negative means no limit); default 8
        max-active: 100

Create the entity class User.java:

public class User implements Serializable{

    private static final long serialVersionUID = 662692455422902539L;

    private Integer id;

    private String name;

    private Integer age;

    public User() {
    }

    public User(Integer id, String name, Integer age) {
        this.id = id;
        this.name = name;
        this.age = age;
    }

    public Integer getId() {
        return id;
    }

    public void setId(Integer id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public Integer getAge() {
        return age;
    }

    public void setAge(Integer age) {
        this.age = age;
    }

    @Override
    public String toString() {
        return "User{" +
                "id=" + id +
                ", name='" + name + '\'' +
                ", age=" + age +
                '}';
    }
}

The use of RedisTemplate


The template provided out of the box is a RedisTemplate<String, String>, which can only store strings, so a custom template is necessary.

Add the configuration class RedisCacheConfig.java:

@Configuration
@AutoConfigureAfter(RedisAutoConfiguration.class)
public class RedisCacheConfig {

    @Bean
    public RedisTemplate<String, Serializable> redisCacheTemplate(LettuceConnectionFactory connectionFactory) {

        RedisTemplate<String, Serializable> template = new RedisTemplate<>();
        template.setKeySerializer(new StringRedisSerializer());
        template.setValueSerializer(new GenericJackson2JsonRedisSerializer());
        template.setConnectionFactory(connectionFactory);
        return template;
    }
}

The test class:

@RestController
@RequestMapping("/user")
public class UserController {

    public static Logger logger = LogManager.getLogger(UserController.class);

    @Autowired
    private StringRedisTemplate stringRedisTemplate;

    @Autowired
    private RedisTemplate<String, Serializable> redisCacheTemplate;

    @RequestMapping("/test")
    public void test() {
        redisCacheTemplate.opsForValue().set("userkey", new User(1, "Zhang", 25));
        User user = (User) redisCacheTemplate.opsForValue().get("userkey");
        logger.info("Currently acquired object: {}", user.toString());
    }
}

Then access http://localhost:8082/user/test in the browser and observe the application log.

Integrate Redis with Spring Cache


Spring Cache is very flexible. It uses SpEL (Spring Expression Language) to define cache keys and conditions, provides an out-of-the-box in-memory cache, and integrates with mainstream professional caches such as EhCache, Redis, and Guava.

Define the interface UserService.java:

public interface UserService {

    User save(User user);

    void delete(int id);

    User get(Integer id);
}

The implementation class UserServiceImpl.java:

@Service
public class UserServiceImpl implements UserService{

    public static Logger logger = LogManager.getLogger(UserServiceImpl.class);

    private static Map<Integer, User> userMap = new HashMap<>();
    static {
        userMap.put(1, new User(1, "Xiao Zhan", 25));
        userMap.put(2, new User(2, "Wang Yibo", 26));
        userMap.put(3, new User(3, "Andy", 24));
    }


    @CachePut(value ="user", key = "#user.id")
    @Override
    public User save(User user) {
        userMap.put(user.getId(), user);
        logger.info("Enter the save method, currently stored object: {}", user.toString());
        return user;
    }

    @CacheEvict(value="user", key = "#id")
    @Override
    public void delete(int id) {
        userMap.remove(id);
        logger.info("Enter delete method, delete succeeded");
    }

    @Cacheable(value = "user", key = "#id")
    @Override
    public User get(Integer id) {
        logger.info("Enter get method, currently got object: {}", userMap.get(id) == null ? null : userMap.get(id).toString());
        return userMap.get(id);
    }
}

To illustrate the operation of the database, we define a Map<Integer,User> userMap directly.

At the heart of this are three annotations:

  • @Cacheable
  • @CachePut
  • @CacheEvict

Test class: UserController

@RestController
@RequestMapping("/user")
public class UserController {

    public static Logger logger = LogManager.getLogger(UserController.class);

    @Autowired
    private StringRedisTemplate stringRedisTemplate;

    @Autowired
    private RedisTemplate<String, Serializable> redisCacheTemplate;

    @Autowired
    private UserService userService;

    @RequestMapping("/test")
    public void test() {
        redisCacheTemplate.opsForValue().set("userkey", new User(1, "Zhang", 25));
        User user = (User) redisCacheTemplate.opsForValue().get("userkey");
        logger.info("Currently acquired object: {}", user.toString());
    }


    @RequestMapping("/add")
    public void add() {
        User user = userService.save(new User(4, "Li Xian", 30));
        logger.info("Added user information: {}",user.toString());
    }

    @RequestMapping("/delete")
    public void delete() {
        userService.delete(4);
    }

    @RequestMapping("/get/{id}")
    public void get(@PathVariable("id") String idStr) throws Exception{
        if (StringUtils.isBlank(idStr)) {
            throw new Exception("id is empty");
        }
        Integer id = Integer.parseInt(idStr);
        User user = userService.get(id);
        logger.info("Acquired user information: {}", user.toString());
    }
}

Annotate the startup class to enable caching:

@SpringBootApplication(exclude = DataSourceAutoConfiguration.class)
@EnableCaching
public class Application {

    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}

① First call the add interface: http://localhost:8082/user/add

② Then call the get interface to query the user with id=4: http://localhost:8082/user/get/4

As you can see, the data is fetched from the cache without entering the get method, because the add call in the previous step already put the user with id=4 into the Redis cache via @CachePut.

③ Call the delete interface (http://localhost:8082/user/delete), then query the user with id=4 again.

This time there is no cache entry, so the get method executes and the user is read from userMap.

Cache annotations


① @Cacheable caches the result of a method based on its request parameters:

  • key: the cache key; may be empty. If specified, write it as a SpEL expression; if not, the key is built from all the method's parameters.
  • value: the name of the cache; at least one must be specified (e.g. @Cacheable(value = "user") or @Cacheable(value = {"user1", "user2"})).
  • condition: the caching condition; may be empty. Written in SpEL, it evaluates to true or false, and the result is cached only when it is true.

② @CachePut

The result of a method is cached based on its request parameters, and unlike @Cacheable, it triggers a call to the real method every time. For parameter description, see above.

③ @CacheEvict clears the cache based on conditions:

  • key: same as above.
  • value: same as above.
  • condition: same as above.
  • allEntries: whether to clear all cache entries; default false. If true, all entries are cleared immediately after the method call.
  • beforeInvocation: whether to clear the cache before the method executes; default false. If true, the cache is cleared before the method runs. By default, if the method throws an exception, the cache is not cleared.
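Conceptually, the three annotations map onto a plain cache-aside pattern. This hand-written sketch (no Spring involved; all names are my own) shows the equivalent logic each annotation wraps around the real method:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// What Spring Cache does behind the annotations, written by hand.
public class CacheAside {
    private final Map<Integer, String> cache = new HashMap<>();

    // @Cacheable: return the cached value if present; otherwise
    // invoke the real method (the loader) and remember its result.
    public String get(Integer id, Function<Integer, String> loader) {
        return cache.computeIfAbsent(id, loader);
    }

    // @CachePut: always run the real method, then refresh the cache.
    public String put(Integer id, String value) {
        cache.put(id, value);
        return value;
    }

    // @CacheEvict: remove the entry after the real delete runs.
    public void evict(Integer id) {
        cache.remove(id);
    }

    public boolean contains(Integer id) { return cache.containsKey(id); }

    public static void main(String[] args) {
        CacheAside c = new CacheAside();
        c.get(1, id -> "loaded-" + id);     // miss: the loader runs once
        System.out.println(c.contains(1));  // true
        c.evict(1);
        System.out.println(c.contains(1));  // false
    }
}
```

Seeing it spelled out makes the difference clear: @Cacheable may skip the real method entirely, while @CachePut always runs it.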

The cache problem


Interviewer: I’ve looked at your demo; it’s easy to follow. Do you know what problems can arise when using a cache in real projects?

Me: The classic one is cache/database consistency. In a distributed environment it is very easy for the cache and the database to fall out of sync. If the project requires strong consistency from the cache, then don't use a cache.

We can only adopt appropriate policies to reduce the probability of data inconsistency between the cache and the database, but can not guarantee strong consistency between the two.

Appropriate policies include appropriate cache update strategy, timely update of the cache after updating the database, and retry mechanism when the cache fails.

Interviewer: What about Redis Avalanche?

Me: As I understand it, e-commerce home pages and hot data are usually cached. Generally the cache is refreshed by a scheduled task, or updated after a cache miss. Scheduled refreshes have a problem.

For example: if all the keys on the home page expire for 12 hours and refresh at 12:00 PM, I have a big rush of users flooding in at midnight, let’s say 6000 requests per second, the cache can withstand 5000 requests per second, but all the keys in the cache are invalid.

At this point all 6,000 requests per second fall on the database. The database cannot withstand that and, realistically, goes down before anyone can react. If there is no special handling in place, the DBA anxiously restarts the database, only to see it immediately killed again by the incoming traffic. This is what I understand as a cache avalanche.

I thought to myself: if a large portion of keys fail at the same instant, Redis might as well not be there, and requests of that magnitude hitting the database directly are almost catastrophic.

Think about it: if the user service's database goes down, almost every interface in the other services that depend on it starts failing.

Without circuit breakers or similar safeguards, everything collapses in an instant. However often you restart, returning users hammer it back down, and by the time it is finally up again, they have long gone to bed, muttering "what a garbage product" before falling asleep.

The interviewer stroked his hair and said, “Well, not bad. How did you handle that?”

Me: Handling cache avalanche is simple: when writing data into Redis in batches, just add a random value to each key's expiration time, so that keys do not expire en masse at the same moment.

setRedis(key, value, time + Math.random() * 10000);

If Redis is clustered, evenly distributing hotspot data among different Redis libraries can also prevent all failures.

Alternatively, set hot data to never expire and update the cache on changes (for example, when operations updates the home-page products, you refresh the cache rather than relying on a TTL). Home-page data on e-commerce sites can also use this approach; it is the safe option.
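The jittered expiration above can be isolated into a small pure function. This is a sketch (class and method names are my own; plug the result into whatever set-with-TTL call your Redis client offers):

```java
import java.util.concurrent.ThreadLocalRandom;

// Spread key expirations out so a batch of keys written together
// does not expire together and trigger an avalanche.
public class TtlJitter {

    // baseTtlSeconds is the intended TTL; the jitter adds up to
    // maxJitterSeconds of random slack on top of it.
    public static long jitteredTtl(long baseTtlSeconds, long maxJitterSeconds) {
        return baseTtlSeconds + ThreadLocalRandom.current().nextLong(maxJitterSeconds + 1);
    }

    public static void main(String[] args) {
        long ttl = jitteredTtl(3600, 300); // 1 hour plus up to 5 minutes of jitter
        System.out.println(ttl >= 3600 && ttl <= 3900); // true
    }
}
```

With 5 minutes of jitter on an hour-long TTL, a batch of keys written at midnight expires spread across a 5-minute window instead of all at 1:00 sharp.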

Interviewer: Do you know anything about cache penetration and breakdown, and can you tell me the difference between them and avalanche?

Me: Yes, I do. Let's talk about cache penetration first. Cache penetration is when requests target data that exists neither in the cache nor in the database, and the user (often an attacker) keeps issuing them.

For example, if our database IDs increment from 1, requests for id=-1, or for an impossibly large id that does not exist, will miss both layers every time. A sustained attack of this kind puts great pressure on the database and can crash it.

I went on to say: As for cache breakdown, this is a bit like a cache avalanche, but a bit different. A cache avalanche is a massive cache failure that destroys the DB.

Cache breakdown, by contrast, concerns a single very hot key that carries a constant stream of heavy concurrent access. The moment that key expires, the sustained concurrency falls straight onto the database, punching a hole in the cache at that single point.

The interviewer looked gratified: And how do you solve each of them?

Me: For cache penetration, I add validation at the interface layer: user authentication, parameter checks, rejecting obviously invalid requests. For example, a basic id check that intercepts id <= 0 directly.

Interviewer: Do you have any other methods?

Me: I remember Redis has an advanced Bloom Filter which is also good for preventing cache penetration.

Its principle is simple: an efficient data structure and algorithm quickly determine whether your key can exist in the database at all. If it cannot, return immediately; if it might, query the DB, refresh the cached key-value pair, and return.
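The idea can be sketched in a few lines of Java. This is a toy Bloom filter (my own class; real deployments would use the RedisBloom module or Guava's BloomFilter, and better hash functions than this double-hashing trick):

```java
import java.util.BitSet;

// Toy Bloom filter: k hash functions set/check k bits.
// "false" means the key is definitely absent, so the request can be
// rejected before touching Redis or the database; "true" may be a
// false positive, so the caller still verifies against storage.
public class MiniBloom {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    public MiniBloom(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    // Derive k bit indexes from one hashCode (double hashing).
    private int bitFor(String key, int i) {
        int h = key.hashCode();
        int idx = (h + i * (h >>> 16 | 1)) % size;
        return idx < 0 ? idx + size : idx;
    }

    public void add(String key) {
        for (int i = 0; i < hashes; i++) bits.set(bitFor(key, i));
    }

    public boolean mightContain(String key) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(bitFor(key, i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        MiniBloom bloom = new MiniBloom(1 << 16, 3);
        bloom.add("user:1");
        System.out.println(bloom.mightContain("user:1")); // true
    }
}
```

On startup you would load all valid IDs into the filter; a request for id=-1 then fails the mightContain check and never reaches the database.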

For cache breakdown, set the hot key to never expire, or add a mutex. Being a helpful guy, I have the code ready; here you go:

    public static String getData(String key) throws InterruptedException {
        // Query the cache first
        String result = getDataByKV(key);
        if (StringUtils.isBlank(result)) {
            // Cache miss: let only one thread rebuild the entry
            if (reenLock.tryLock()) {
                try {
                    result = getDataByDB(key);
                    if (StringUtils.isNotBlank(result)) {
                        // Write the freshly loaded value back into the cache
                        setDataToKV(key, result);
                    }
                } finally {
                    reenLock.unlock();
                }
            } else {
                // Another thread is rebuilding; wait briefly, then retry
                Thread.sleep(100L);
                result = getData(key);
            }
        }
        return result;
    }

Interviewer: Uh-huh, not bad.

Why is Redis so fast


Interviewer: Redis is used as a cache, so Redis must be fast?

Me: Of course, the official data is 100,000 + QPS (number of queries per second), which is no worse than Memcached!

Interviewer: Redis is so fast, do you understand its “multi-threaded model”? (Evil smile)

Me: What you're really asking is why Redis is so fast despite being single-threaded. Redis is indeed a single-process, single-threaded model. Because Redis operates entirely in memory, the CPU is not its bottleneck; the bottleneck is most likely the machine's memory size or network bandwidth.

Since single-threading is easy to implement and the CPU is not a bottleneck, it makes sense to adopt a single-threaded solution (there are a lot of complications with multi-threading).

Interviewer: Well, yes. Can you explain why Redis is so fast because it is single-threaded?

Me: Well, to sum it up, there are four points:

  • Redis is completely memory-based; most requests are pure memory operations, which are very fast. Data is stored in memory in structures similar to a HashMap, whose advantage is O(1) lookup and update.

  • The data structure is simple and the manipulation of the data is simple.

  • A single thread avoids unnecessary context switches and race conditions: there is no CPU switching between threads, no locks to acquire and release, and no performance cost from deadlocks.

  • It uses a multiplexed, non-blocking I/O model.
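The multiplexing point can be pictured with Java NIO's Selector, where one thread watches many channels at once. This is only an illustrative sketch of the mechanism; Redis itself implements it in C with epoll/kqueue inside its own "ae" event loop, not Java NIO:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// One thread, one selector, many channels: the shape of an
// I/O-multiplexed event loop.
public class MultiplexDemo {

    public static int pollOnce() throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.configureBlocking(false);
        server.bind(new InetSocketAddress(0)); // ephemeral port
        // Register interest in new connections; a real loop would also
        // register OP_READ/OP_WRITE for each accepted client channel.
        server.register(selector, SelectionKey.OP_ACCEPT);
        int ready = selector.selectNow(); // non-blocking poll for ready events
        server.close();
        selector.close();
        return ready; // no client has connected, so 0 ready events
    }

    public static void main(String[] args) throws IOException {
        System.out.println("ready events: " + pollOnce());
    }
}
```

A real event loop would call select() in a loop and dispatch each ready key to a handler; the single thread is never blocked waiting on any one client.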

The difference between Redis and Memcached


Interviewer: Yes, that’s very detailed. So why did you choose Redis over Memcached?

Me: There are four reasons:

  • Storage: Memcached keeps all data in memory; after a power failure the data is gone, and the data set cannot exceed the memory size. Redis can persist part of its data to disk, ensuring durability.

  • Data support types: Memcache supports simple data types, only supporting simple key-value, while Redis supports five data types.

  • Underlying model: the underlying implementations, and the application protocols used to communicate with clients, are different. Redis built its own VM mechanism, because ordinary calls to system functions waste a certain amount of time moving and requesting memory.

  • Value size: a Redis String value can be up to 512 MB, while a Memcached value is limited to 1 MB.

Eviction policies

Interviewer: What Redis eviction policies do you know?

Me: Redis has six eviction policies, as shown below:

Redis 4.0 added the least-frequently-used (LFU) eviction policies, volatile-lfu and allkeys-lfu, which track access frequency and evict the least frequently used key-value pairs.
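For reference, the policy is chosen in redis.conf via maxmemory-policy. A hedged sketch listing the classic six plus the two LFU additions (the annotations are my own summaries):

```
maxmemory 100mb
# maxmemory-policy, one of:
#   noeviction      reject writes when memory is full (the default)
#   allkeys-lru     evict least-recently-used keys, considering all keys
#   volatile-lru    evict LRU keys among those with an expire set
#   allkeys-random  evict random keys, considering all keys
#   volatile-random evict random keys among those with an expire set
#   volatile-ttl    evict the keys with the nearest expire time
#   allkeys-lfu     (4.0+) evict least-frequently-used keys, all keys
#   volatile-lfu    (4.0+) evict LFU keys among those with an expire set
maxmemory-policy allkeys-lru
```

The volatile-* variants only ever consider keys that have a TTL, so if nothing has a TTL they behave like noeviction.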

Persistence


Interviewer: Do you know anything about Redis persistence? Can you talk about it?

Me: Redis caches data in memory to ensure efficiency, but periodically writes updated data to disk or writes modification operations to additional record files to ensure data persistence.

Redis has two persistence strategies:

  • RDB: snapshot mode. Memory data is written directly to a dump file, periodically, according to the configured save policy.

  • AOF: all commands that modify the Redis server's data are appended to a file, forming a log of commands. Redis uses RDB snapshot persistence by default.

When Redis restarts, it will use AOF files to restore datasets in preference, because AOF files usually hold more complete datasets than RDB files. You can even turn off persistence so that data is stored only while the server is running.
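Both strategies are driven from redis.conf. A hedged sketch of a typical configuration (the save thresholds below mirror Redis's historical defaults; tune them for your workload):

```
# RDB: snapshot if at least <changes> writes happened within <seconds>
save 900 1       # after 900s if at least 1 key changed
save 300 10      # after 300s if at least 10 keys changed
save 60 10000    # after 60s if at least 10000 keys changed
dbfilename dump.rdb

# AOF: log every write command
appendonly yes
appendfsync everysec
```

With both enabled, the RDB file gives you compact periodic backups while the AOF limits how much recent data a crash can lose.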

Interviewer: How does an RDB work?

Me: By default, Redis persists data to disk as a snapshot in a binary file named dump.rdb.

Here’s how it works: When Redis needs to persist, it forks a child process that writes data to a temporary RDB file on disk.

When the child process finishes writing the temporary file, it replaces the original RDB, which has the advantage of copy-on-write.

Me: The good thing about RDB is that it’s great for backups: for example, you can back up an RDB file every hour for the last 24 hours and every day of the month.

This way, if you run into problems, you can always revert the dataset to a different version. RDB is perfect for disaster recovery.

The downside of RDB is that if you need to minimize data loss in the event of a server failure, RDB is not for you.

Interviewer: Why don’t you say AOF again?

Me: With AOF persistence, each write command is appended to appendonly.aof via write. The fsync policy is configured as follows:

appendfsync always     # fsync after every write command; safest, slowest
appendfsync everysec   # fsync once per second; the AOF default
appendfsync no         # let the operating system decide when to flush

AOF persistence is enabled with appendonly yes in the configuration. Every time Redis executes a command that modifies the dataset, the command is appended to the AOF file. When Redis restarts, it reads and replays the AOF file, restoring the state at the moment it was shut down.

I paused, then continued: The advantage of using AOF is that it makes Redis very durable. Different Fsync policies can be set. The default policy for AOF is Fsync once per second, and with this configuration, data loss is limited to one second in the event of an outage.

The disadvantage is that the AOF file size is usually larger than the RDB file size for the same data set. Depending on the Fsync strategy used, AOF may be slower than RDB.

The interviewer then asked, With all that, which one should I use?

Me: If you care about your data but can still afford to lose a few minutes of it, use RDB persistence alone. AOF appends every command Redis executes to disk, and handling huge write volumes will slow Redis down.

Database backup and disaster recovery: Periodically generating RDB snapshots is very convenient for database backup, and RDB can recover data sets faster than AOF.

Of course, Redis can enable RDB and AOF at the same time. After the system restarts, Redis will use AOF to recover data first to minimize data loss.

A master-slave replication


Interviewer: To solve the single point of failure problem, it is usually necessary to configure slave nodes for Redis, then use Sentinel to monitor the liveness of the master node. If the master fails, a slave can continue to provide the caching function. Can you talk about the process and principle of Redis slave-node replication?

I was caught a little off guard; it's a long story. But it's good to be prepared: master-slave configuration combined with Sentinel mode solves the single point of failure and improves Redis availability.

The secondary node provides only read operations, while the primary node provides write operations. In the case of too many reads and too few writes, you can configure multiple secondary nodes for the primary node to improve response efficiency.

I paused, then said, Here’s the thing about the copying process:

  • Run slaveof [masterIP] [masterPort] on the slave node to record the master node's information.
  • A scheduled task on the slave node discovers the master node information and establishes a socket connection with the master.
  • The slave node sends a Ping signal, the master node returns Pong, and the two sides can communicate with each other.
  • After the connection is established, the master sends all data to the slave (data synchronization).
  • After the master node synchronizes the current data to the slave node, the replication process is completed. The master node then continuously sends write commands to the slave node to ensure data consistency between the master and slave nodes.

Interviewer: Can you elaborate on the data synchronization process?

(I thought: that's really getting into the details.) Me: Yes. Before Redis 2.8, sync [runId] [offset] was used; from Redis 2.8 on, psync [runId] [offset] is used.

The difference is that the Sync command supports only full replication. Psync supports full and partial replication.

Before introducing synchronization, introduce some concepts:

  • runId: each Redis node generates a unique run ID at startup; the runId changes after every restart.

  • Offset: The master node and the slave node maintain their own master/slave replication offset. If the master node has a write command, offset=offset+ the length of the command in bytes.

    After receiving the command from the master node, the slave node also adds its offset and sends its offset to the master node.

    In this way, the master node saves its own offset and the offset of the slave node at the same time, and determines whether the data of the master node and slave node are consistent by comparing the offset.

  • Repl_backlog_size: A fixed-length FIFO queue, 1MB by default, saved on the primary node.

When the master node sends data to the slave node, the master node also performs some write operations, and the data is stored in the replication buffer.

After synchronizing data from the secondary node to the primary node, the primary node sends the data in the buffer to the secondary node for partial replication.

When the master node processes a write command, it not only sends the command to the slave nodes but also writes it into the replication backlog buffer, which is used to recover commands a slave has missed.

The slave node sends the psync [runId] [offset] command. The master responds in one of three ways:

  • FULLRESYNC: first connection, full replication
  • CONTINUE: Partial replication is performed
  • ERR: the master does not support psync; fall back to full replication via sync

Interviewer: Good. Can you tell us more about the process of full copy and partial copy?

Me: Yes!

This is the full copy process. There are mainly the following steps:

  • The slave sends psync ? -1 (since this is the first replication, the master's runId is unknown, hence the ?, and the offset is -1).

  • FULLRESYNC {runId} {offset} is returned when the primary node discovers that the secondary node is the first replication. RunId is the runId of the primary node, and offset is the current offset of the primary node.

  • After receiving the primary node information from the node, save the information to info.

  • After sending FULLRESYNC, the primary node starts the BGsave command to generate the RDB file (data persistence).

  • The primary node sends the RDB file to the secondary node. Write commands from the master node are put into the buffer between the time the data is loaded from the slave node.

  • Cleans up its own database data from the node.

  • Load the RDB file from the node and save the data to your own database. If AOF is enabled on the slave node, the slave node asynchronously overwrites the AOF file.

Here are some notes on partial replication:

① Partial replication is an optimization measure made by Redis for the high cost of full replication, using psync[runId][offset] command to achieve.

When the secondary node is replicating the primary node, the secondary node requests the primary node to send the lost command data to the secondary node if network disconnection or command loss occurs. The replication backlog buffer of the primary node directly sends the lost command data to the secondary node.

This keeps the replication consistency between the master and slave nodes. This part of the data is generally much smaller than the full amount of data.

② The primary node still responds to the command when the primary node is disconnected, but the command cannot be sent to the secondary node because the replication connection is interrupted. However, the replication backlog buffer in the primary node can still store the write command data in the recent period.

③ After the master/slave connection is restored, the slave node has saved the offset it replicated to and the running ID of the master. It therefore sends them to the master as psync parameters, requesting partial replication.

④ After receiving the psync command, the master checks whether the runId parameter matches its own. If it does, the slave previously replicated from this master.

The primary node then searches its replication backlog buffer by offset. If the data after that offset is still present, it sends +CONTINUE to the slave node, indicating that partial replication can proceed. Because the buffer has a fixed size, a full resync is performed instead if the buffer has overflowed.

⑤ The master node sends the data in the replication backlog buffer from the given offset onwards to the slave node, returning master/slave replication to its normal state.
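The choice between +CONTINUE and FULLRESYNC described above can be condensed into a simplified model of how a master answers psync (an illustration, not actual Redis source):

```python
def handle_psync(master_run_id, backlog_start, backlog_end,
                 slave_run_id, slave_offset):
    """Decide how a master answers psync {runId} {offset}."""
    # A different runId means the slave last replicated some other master,
    # so only a full resync is safe.
    if slave_run_id != master_run_id:
        return "FULLRESYNC"
    # If the slave's offset is still covered by the replication backlog
    # buffer, the master can resend just the missing commands.
    if backlog_start <= slave_offset <= backlog_end:
        return "CONTINUE"
    # The fixed-size buffer has overflowed past the slave's offset.
    return "FULLRESYNC"

print(handle_psync("abc", 100, 500, "abc", 300))  # CONTINUE
print(handle_psync("abc", 100, 500, "xyz", 300))  # FULLRESYNC
```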

Sentinel


Interviewer: What are the problems with master-slave replication?

Me: Master slave replication has the following problems:

  • Once the master node goes down, a slave node must be promoted to master, the application must be updated with the new master's address, and all the other slave nodes must be told to replicate the new master. The whole process requires manual intervention.

  • The write capacity of the primary node is limited to a single machine.

  • The storage capacity of the primary node is limited to a single machine.

  • Earlier versions also highlight the drawbacks of native replication: after a replication break, the slave node initiates psync.

    If that synchronization fails, a full sync is performed against the primary database, and while the primary performs the full backup, delays of milliseconds or even seconds may occur.

Interviewer: What’s the mainstream solution?

Me: Sentinel, of course.

Interviewer: So here we go again. What can you tell me about Sentinel?

Me: This is the architecture of Redis Sentinel. Redis Sentinel features include master node survival detection, master/slave health detection, automatic failover, and master/slave switchover.

Redis Sentinel minimum configuration is one master, one slave. Redis’ Sentinel system can be used to manage multiple Redis servers.

The system can perform the following four tasks:

  • Monitoring: Continuously check whether the primary and secondary servers are running properly.

  • Notification: Sentinel notifies administrators or other applications through API scripts when a monitored Redis server has a problem.

  • Automatic failover: When the primary node does not function properly, Sentinel starts an automatic failover operation. It upgrades one of the secondary nodes that has a master-slave relationship with the failed primary node to the new primary node and points the other secondary nodes to the new primary node so that manual intervention is not required.

  • Configuration provider: In Redis Sentinel mode, the client application initializes with a collection of Sentinel nodes, from which it obtains the current master node's information.
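For reference, a minimal sentinel.conf implementing the monitoring and failover behavior above might look like this (the master name, address, and timeout values here are illustrative):

```
# Monitor the master named "mymaster" at 127.0.0.1:6379; the quorum of 2
# means two Sentinels must agree before it is marked objectively offline.
sentinel monitor mymaster 127.0.0.1 6379 2

# A master that fails to reply to PING for 30 seconds is flagged
# subjectively offline by this Sentinel.
sentinel down-after-milliseconds mymaster 30000

# During failover, resynchronize slaves to the new master one at a time.
sentinel parallel-syncs mymaster 1

# Abort a failover attempt that takes longer than 3 minutes.
sentinel failover-timeout mymaster 180000
```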

Interviewer: Can you tell me how sentries work?

Me: Without further ado, let's go straight to the picture above:

① Each Sentinel node performs the following task periodically: once a second, it sends a PING command to the master server, the slave servers, and the other Sentinel instances it knows about. (As shown above)

② If an instance takes longer than the configured down-after-milliseconds value to reply, it is flagged as subjectively offline by that Sentinel. (As shown above)

③ If a primary server is marked as subjectively offline, all Sentinel nodes monitoring that server confirm, at a rate of once per second, that it really is in the subjectively offline state.

④ If a master server is marked as subjectively offline and a sufficient number of Sentinels (at least the quorum specified in the configuration file) agree with this judgment within the specified time frame, the master server is marked as objectively offline.

⑤ In general, each Sentinel will send INFO commands to all its known master and slave servers every 10 seconds.

When a primary server is marked as objective offline, Sentinel sends the INFO command once per second to all secondary servers of the offline primary, instead of once every 10 seconds.

⑥ Sentinel negotiates the status of the offline primary node with the other Sentinels. If the primary node is in the SDOWN state, a new primary node is chosen automatically by vote, and the remaining secondary nodes are pointed at the new primary node for data replication.

⑦ The objectively-offline status of the primary server is removed when not enough Sentinels agree that it is offline. The subjectively-offline status is removed once the primary server returns a valid reply to a Sentinel's PING command.
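The detection and election logic in steps ② to ⑥ can be condensed into a rough sketch. The function names and the slave-selection rule here are simplifications; real Sentinel also weighs slave priority and run IDs:

```python
def subjectively_down(now_ms, last_valid_reply_ms, down_after_ms):
    # Step 2: one Sentinel's private view -- no valid PING reply
    # within down-after-milliseconds.
    return now_ms - last_valid_reply_ms > down_after_ms

def objectively_down(sdown_votes, quorum):
    # Step 4: the master is objectively offline only when at least
    # `quorum` Sentinels (the number from the config file) agree.
    return sdown_votes >= quorum

def pick_new_master(slaves):
    # Step 6, greatly simplified: prefer the slave with the largest
    # replication offset, i.e. the most complete copy of the data.
    return max(slaves, key=lambda s: s["offset"])

print(subjectively_down(40000, 0, 30000))         # True: no reply for 40s
print(objectively_down(sdown_votes=2, quorum=2))  # True: quorum reached
print(pick_new_master([{"name": "s1", "offset": 100},
                       {"name": "s2", "offset": 250}])["name"])  # s2
```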

Interviewer: Yes, you clearly did a lot of work before the interview. That's enough about Redis for today; we can talk about other things tomorrow. (Smiling)

Me: No problem.

Conclusion


Through the course of an interview, this article has covered what Redis is, its features and functions, using Redis as a cache, why Redis is so fast, Redis cache eviction strategies, the two persistence mechanisms, and the basics of master-slave replication and Sentinel in Redis high availability.

As the old saying goes, with enough effort even an iron pestle can be ground into a needle: prepare well in ordinary times, and there is no need to panic in an interview. The questions asked may not be exactly these, but the preparation still pays off.

Reprinted from a WeChat official account.

The original link: mp.weixin.qq.com/s/eI2yXPKOn…


Make progress together, learn and share

Welcome to follow my WeChat official account [calm as code], where plenty of Java-related articles and learning materials are updated and organized.

If you found this well written, give it a like and a follow so you don't miss future updates!