Distributed cache

Redis is commonly used to implement a distributed cache.

Problems and solutions of single-node Redis

  • Data loss problem
    • Solution: Implement Redis persistence
  • Concurrency problem
    • Solution: Build a master-slave cluster to separate reads from writes
  • Fault recovery problem
    • Solution: Health checks and automatic recovery with Redis Sentinel
  • Storage capacity problem
    • Solution: Build a sharded cluster and use the hash slot mechanism for dynamic capacity expansion

Redis persistence

RDB persistence

RDB stands for Redis Database Backup File, also known as Redis data snapshot. Simply put, all the data in memory is recorded to disk. When the Redis instance is restarted, the snapshot file is read from the disk to restore data.

Snapshot files are called RDB files and are saved in the current working directory by default.

save # Executed by the main Redis process; blocks all other commands while the RDB is written
bgsave # Executed by a child process; the main process is not blocked

bgsave starts by forking the main process to create a child process, which shares the main process's memory data. After the fork, the child process reads the in-memory data and writes it to the RDB file.

Fork uses copy-on-write technology:

  • When the main process performs a read operation, it accesses the shared memory.

  • When the main process performs a write operation, it first copies the affected data and then performs the write on the copy, so the child process still sees the data as it was at fork time.
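The copy-on-write behaviour described above can be pictured with a toy model. This is only a sketch in plain Java, not how the operating system or Redis implements fork: the class `CowDemo`, the idea of "pages" as `int[]` arrays, and all method names are assumptions made for illustration.

```java
import java.util.Arrays;

/** Toy model of copy-on-write: memory is a set of pages shared by parent and child. */
class CowDemo {
    // Pages visible to the main (parent) process.
    private final int[][] parentPages;
    // Pages visible to the bgsave child: the snapshot taken at fork time.
    private int[][] childPages;
    private int copiedPages = 0;

    CowDemo(int pageCount, int pageSize) {
        parentPages = new int[pageCount][pageSize];
    }

    /** "fork": the child receives references to the same pages; nothing is copied yet. */
    void fork() {
        childPages = parentPages.clone(); // shallow copy, the page arrays are shared
        copiedPages = 0;
    }

    /** Reads by the main process go straight to the (possibly shared) page. */
    int read(int page, int index) {
        return parentPages[page][index];
    }

    /** Writes by the main process copy the page first if the child still shares it. */
    void write(int page, int index, int value) {
        if (childPages != null && parentPages[page] == childPages[page]) {
            parentPages[page] = Arrays.copyOf(parentPages[page], parentPages[page].length);
            copiedPages++;
        }
        parentPages[page][index] = value;
    }

    /** The value the child would write to the RDB file for this cell. */
    int snapshotRead(int page, int index) {
        return childPages[page][index];
    }

    int copiedPages() {
        return copiedPages;
    }

    public static void main(String[] args) {
        CowDemo memory = new CowDemo(4, 8);
        memory.write(0, 0, 1); // before the fork: a plain write
        memory.fork();         // bgsave starts
        memory.write(0, 0, 2); // forces a copy of page 0 only
        System.out.println("main process sees: " + memory.read(0, 0));
        System.out.println("snapshot sees:     " + memory.snapshotRead(0, 0));
        System.out.println("pages copied:      " + memory.copiedPages());
    }
}
```

The point of the model is that only pages written after the fork are duplicated, which is why a write-heavy workload during bgsave increases memory usage.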

Note: Redis automatically performs an RDB save when it shuts down.

Is an RDB save performed only at shutdown? Of course not!

Redis has an RDB trigger mechanism, which is configured in the redis.conf file in the following format:

# save <time in seconds> <number of key changes>
# bgsave is executed if at least 1 key is changed within 900 seconds
save 900 1

Note: Setting save "" turns RDB off.
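To make the meaning of a rule such as save 900 1 concrete, here is a minimal sketch of the check it implies. It is a simplified model, not Redis source code: the class `SaveRule` and its method names are hypothetical, and the real server's bookkeeping may differ in detail.

```java
/** Simplified model of one "save <seconds> <changes>" rule from redis.conf. */
class SaveRule {
    final long seconds; // length of the time window
    final long changes; // minimum number of key changes within that window

    SaveRule(long seconds, long changes) {
        this.seconds = seconds;
        this.changes = changes;
    }

    /** Parses a configuration line such as "save 900 1". */
    static SaveRule parse(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 3 || !parts[0].equals("save")) {
            throw new IllegalArgumentException("expected: save <seconds> <changes>");
        }
        return new SaveRule(Long.parseLong(parts[1]), Long.parseLong(parts[2]));
    }

    /** True when the window has elapsed and enough keys changed since the last save. */
    boolean shouldSave(long secondsSinceLastSave, long changesSinceLastSave) {
        return secondsSinceLastSave >= seconds && changesSinceLastSave >= changes;
    }

    public static void main(String[] args) {
        SaveRule rule = SaveRule.parse("save 900 1");
        System.out.println(rule.shouldSave(900, 1)); // window elapsed, one change
        System.out.println(rule.shouldSave(900, 0)); // window elapsed, no changes
    }
}
```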

Other RDB settings can also be configured in the redis.conf file:

# Whether to compress the RDB file. Disabling compression is recommended,
# because compression consumes CPU while disk space is relatively cheap
rdbcompression yes

# RDB file name
dbfilename dump.rdb

# Directory where files are saved
dir ./

Suggestion: Do not modify these Redis settings in a production environment, to avoid data loss.

AOF persistence

AOF stands for Append Only File. Every write command processed by Redis is recorded in an AOF file, which can be viewed as a command log file.

AOF is off by default. To enable it, modify the redis.conf configuration file:

# Whether to enable AOF. Default is no
appendonly yes

# AOF file name
appendfilename "appendonly.aof"

How often AOF commands are flushed to disk can also be configured in the redis.conf file. Choose one of the following:

# Each write command is recorded in the AOF file immediately
appendfsync always

# The default: write commands go into a buffer, and the buffer is written to the AOF file every second
appendfsync everysec

# The operating system decides when to write the buffer contents back to disk
appendfsync no
Option | Flush timing | Advantages | Disadvantages
always | Synchronous flush on every write | High reliability, almost no data loss | Large performance impact
everysec | Flush once per second | Moderate performance | Up to 1 second of data may be lost
no | Controlled by the operating system | Best performance | Poor reliability; a large amount of data may be lost

Because AOF logs commands, AOF files are much larger than RDB files. AOF also records every write to the same key, although only the last write is meaningful. Running the bgrewriteaof command rewrites the AOF file so that the smallest possible set of commands achieves the same effect.
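The idea behind the rewrite can be sketched as follows. This is an illustration only, under the assumption that the log contains nothing but simple "set key value" commands; the class `AofRewriteSketch` is hypothetical, and the real bgrewriteaof works from the current in-memory data rather than by parsing the old file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Compacts a log of "set <key> <value>" commands to one command per key. */
class AofRewriteSketch {
    static List<String> rewrite(List<String> log) {
        // Later writes to the same key replace earlier ones.
        Map<String, String> latest = new LinkedHashMap<>();
        for (String command : log) {
            String[] parts = command.split(" ", 3);
            if (parts.length != 3 || !parts[0].equalsIgnoreCase("set")) {
                throw new IllegalArgumentException("sketch only handles: set <key> <value>");
            }
            latest.put(parts[1], parts[2]);
        }
        List<String> compacted = new ArrayList<>();
        for (Map.Entry<String, String> entry : latest.entrySet()) {
            compacted.add("set " + entry.getKey() + " " + entry.getValue());
        }
        return compacted;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("set num 123", "set name jack", "set num 666");
        // Three logged commands shrink to two, one per key.
        System.out.println(rewrite(log));
    }
}
```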

Redis also rewrites the AOF file automatically when a threshold is reached. The thresholds can also be configured in redis.conf:

# Trigger a rewrite when the AOF file has grown by more than this percentage since the last rewrite
auto-aof-rewrite-percentage 100
# Minimum size the AOF file must reach before a rewrite is triggered
auto-aof-rewrite-min-size 64mb
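How the two thresholds combine can be sketched like this. It is a simplified model rather than Redis source code; the class `AofRewriteTrigger` and its parameter names are assumptions for illustration.

```java
/** Simplified model of the automatic AOF rewrite check. */
class AofRewriteTrigger {
    /**
     * @param currentSize size of the AOF file now, in bytes
     * @param baseSize    size of the AOF file after the last rewrite, in bytes
     * @param percentage  value of auto-aof-rewrite-percentage (0 disables the feature)
     * @param minSize     value of auto-aof-rewrite-min-size, in bytes
     */
    static boolean shouldRewrite(long currentSize, long baseSize, int percentage, long minSize) {
        if (percentage <= 0 || currentSize < minSize) {
            return false; // disabled, or the file is still too small to bother
        }
        long base = Math.max(baseSize, 1); // avoid dividing by zero before the first rewrite
        long growthPercent = (currentSize - base) * 100 / base;
        return growthPercent >= percentage;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // The file doubled from 64 MB to 128 MB: growth is 100 percent.
        System.out.println(shouldRewrite(128 * mb, 64 * mb, 100, 64 * mb));
    }
}
```

Both conditions must hold: the minimum size keeps small files from being rewritten repeatedly, and the percentage ties the trigger to growth since the last rewrite.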

RDB versus AOF

RDB and AOF each have their own advantages and disadvantages. If the data security requirements are high, AOF is used; otherwise, RDB is used. In practical development, the two are often combined.

Aspect | RDB | AOF
Persistence method | Periodically snapshots the entire memory | Records every write command
Data integrity | Incomplete; data written between backups is lost | Relatively complete, depending on the flush policy
File size | Compressed, so the file is small | Records commands, so the file is large
Recovery speed after downtime | Fast | Slow
Data recovery priority | Low, because data integrity is not as good as AOF | High, because data integrity is higher
System resource usage | High; heavy CPU and memory consumption | Low; mainly disk I/O, but AOF rewriting takes a lot of CPU and memory
Usage scenarios | Can tolerate several minutes of data loss and wants faster startup | High requirements on data security

Redis master-slave

The concurrency of single-node Redis has an upper limit. To improve it further, build a master-slave cluster and separate reads from writes.

Data synchronization

How does the master determine whether the slave is synchronizing data for the first time? Two concepts are used here:

  • Replication ID: replid for short. It identifies a data set: if two nodes have the same replid, they hold the same data set. Each master has a unique replid, and slaves inherit the replid of their master
  • Offset: A value that increases as more data is recorded in repl_backlog. When a slave finishes synchronizing, it records the offset it has reached. If the slave's offset is smaller than the master's, the slave's data lags behind the master and needs to be updated

Therefore, to synchronize data, the slave must declare its replication ID and offset to the master.

Full synchronization steps

  • The slave node requests incremental synchronization
  • The master node checks the replid and rejects incremental synchronization if the replid does not match
  • The master generates an RDB of its complete in-memory data and sends the RDB to the slave
  • The slave clears its local data and loads the master's RDB
  • The master records the commands it receives during the RDB process in repl_backlog and continuously sends the logged commands to the slave
  • The slave executes the received commands to stay in sync with the master

Incremental synchronization steps

  • The slave node requests incremental synchronization, sending its replid and offset
  • The master node checks the replid and offset. If the replid matches and the slave's offset is behind the master's, it replies continue
  • The master fetches the data after the slave's offset from repl_backlog and sends it to the slave
  • The slave executes the received commands

Note: repl_backlog has an upper limit on its size and overwrites the earliest data when it is full. If a slave is disconnected for so long that data it has not yet synchronized is overwritten, incremental synchronization from the log is no longer possible and a full synchronization must be performed again.
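The master's choice between the two modes can be summarised in a few lines. This is a minimal sketch of the decision described above, not the Redis replication protocol itself; the class `SyncDecision`, the `Mode` enum, and the assumption that the backlog keeps exactly the last `backlogSize` bytes are all illustrative.

```java
/** Sketch of how a master could choose between full and incremental synchronization. */
class SyncDecision {
    enum Mode { FULL, INCREMENTAL }

    static Mode decide(String masterReplId, long masterOffset, long backlogSize,
                       String slaveReplId, long slaveOffset) {
        // A different replid means a different data set: only a full sync can help.
        if (!masterReplId.equals(slaveReplId)) {
            return Mode.FULL;
        }
        // Oldest offset still present in the backlog (assumes it holds the last backlogSize bytes).
        long oldestKept = Math.max(0, masterOffset - backlogSize);
        // The data the slave is missing has been overwritten, or the offset makes no sense.
        if (slaveOffset < oldestKept || slaveOffset > masterOffset) {
            return Mode.FULL;
        }
        return Mode.INCREMENTAL;
    }

    public static void main(String[] args) {
        System.out.println(decide("abc", 1000, 500, "abc", 900)); // slave slightly behind
        System.out.println(decide("abc", 1000, 500, "abc", 400)); // slave too far behind
        System.out.println(decide("abc", 1000, 500, "xyz", 900)); // first-time sync
    }
}
```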

Optimizing master-slave data synchronization

  • Set repl-diskless-sync yes on the master to enable diskless replication and avoid disk I/O during full synchronization (useful when disk performance is the bottleneck and network bandwidth is high)
  • Keep the memory footprint of a single Redis node from growing too large, to reduce the excessive memory consumption caused by RDB
  • Increase the size of repl_backlog appropriately, and recover a failed slave as soon as possible, so that full synchronization is avoided where possible
  • Limit the number of slave nodes on a master. If there are too many slaves, a master-slave-slave chain structure can be adopted to reduce the pressure on the master

Redis Sentinel

In the master-slave structure, if the slave goes down, it can recover data from the master. What about if the master goes down?

This is where Redis Sentinel comes in. It provides failure recovery, node election, and service monitoring:

  • Monitoring: Sentinel constantly checks whether your master and slaves are working as expected
  • Automatic fault recovery: If the master fails, Sentinel promotes a slave to master. When the faulty instance recovers, it follows the new master, which remains the primary
  • Notification: Sentinel acts as a source of service discovery for Redis clients, pushing the latest information to them when a cluster failover occurs

Sentinel detects service status with a heartbeat mechanism, sending a ping command to each instance in the cluster every second:

  • Subjective offline: If a Sentinel node finds that an instance does not respond within the specified time, it considers the instance subjectively offline
  • Objective offline: If the number of Sentinels that consider the instance subjectively offline reaches a specified number (quorum), the instance is objectively offline

The value of quorum should preferably be more than half the number of Sentinel instances.
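The step from subjective to objective offline is essentially a vote count. Here is a minimal sketch, assuming each Sentinel's opinion is available as a boolean; the class `SentinelVote` and its method names are hypothetical, and the real Sentinel protocol involves more state than this.

```java
/** Sketch of the subjective/objective offline decision. */
class SentinelVote {
    /** Objectively offline once at least quorum Sentinels report the instance as subjectively offline. */
    static boolean isObjectivelyDown(boolean[] subjectivelyDownReports, int quorum) {
        int votes = 0;
        for (boolean down : subjectivelyDownReports) {
            if (down) {
                votes++;
            }
        }
        return votes >= quorum;
    }

    /** Smallest quorum that is more than half of the Sentinel instances. */
    static int majorityQuorum(int sentinelCount) {
        return sentinelCount / 2 + 1;
    }

    public static void main(String[] args) {
        int quorum = majorityQuorum(3);
        boolean[] reports = {true, true, false}; // two of three Sentinels got no reply
        System.out.println("quorum = " + quorum);
        System.out.println("objectively down = " + isObjectivelyDown(reports, quorum));
    }
}
```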

Voting principle

Once a master failure is detected, Sentinel needs to select one of the slaves as the new master. The selection rules are:

  • First, check how long each slave node has been disconnected from the master. If the time exceeds the specified value (down-after-milliseconds * 10), the slave node is excluded
  • Then compare the slave-priority values. The smaller the value, the higher the priority. If the value is 0, the slave node never participates in the election
  • If the slave-priority values are the same, compare the offset values of the slave nodes. The larger the offset, the newer the data and the higher the priority
  • Finally, compare the running IDs of the slave nodes. The smaller the ID, the higher the priority
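The selection rules above amount to a filter followed by a multi-level sort. The following is a sketch under stated assumptions, not Sentinel's actual code: the classes `MasterElection` and `Candidate` and their fields are hypothetical names, and running IDs are compared as plain strings.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Sketch of choosing the new master among the slaves. */
class MasterElection {
    static class Candidate {
        final String runId;
        final int priority;            // slave-priority; 0 means never eligible
        final long offset;             // replication offset; larger means newer data
        final long disconnectedMillis; // time since the link to the master was lost

        Candidate(String runId, int priority, long offset, long disconnectedMillis) {
            this.runId = runId;
            this.priority = priority;
            this.offset = offset;
            this.disconnectedMillis = disconnectedMillis;
        }
    }

    static Optional<Candidate> pick(List<Candidate> slaves, long downAfterMillis) {
        Comparator<Candidate> order = Comparator
                .comparingInt((Candidate c) -> c.priority)                                   // smaller priority first
                .thenComparing(Comparator.comparingLong((Candidate c) -> c.offset).reversed()) // larger offset first
                .thenComparing(Comparator.comparing((Candidate c) -> c.runId));              // smaller run ID first
        return slaves.stream()
                .filter(c -> c.disconnectedMillis <= downAfterMillis * 10) // drop slaves disconnected too long
                .filter(c -> c.priority != 0)                              // drop slaves that opted out
                .min(order);
    }

    public static void main(String[] args) {
        List<Candidate> slaves = Arrays.asList(
                new Candidate("a", 100, 50, 0),
                new Candidate("b", 100, 80, 0),
                new Candidate("c", 0, 999, 0));
        System.out.println(pick(slaves, 30000).map(c -> c.runId).orElse("none"));
    }
}
```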

Failover

Once a slave has been selected as the new master (for example, Slave1), the failover procedure is as follows:

  • Sentinel sends the slaveof no one command to the candidate Slave1 node to make it the master
  • Sentinel sends the slaveof <new master IP> <port> command to all other slaves so that they become slaves of the new master and start synchronizing data from it
  • Finally, Sentinel marks the failed node as a slave, so that it automatically becomes a slave of the new master when it recovers

Integrating Sentinel mode with RedisTemplate

Dependency

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>

Configuration

spring:
  redis:
    sentinel:
      master: mymaster
      nodes:
        - 127.0.0.1:27001
        - 127.0.0.1:27002
        - 127.0.0.1:27003


Configure master-slave read/write separation

@Bean
public LettuceClientConfigurationBuilderCustomizer configurationBuilderCustomizer() {
    // Prefer reading from replicas; fall back to the master when no replica is available
    return configBuilder -> configBuilder.readFrom(ReadFrom.REPLICA_PREFERRED);
}

ReadFrom read policies

  • MASTER: reads data from the master node
  • MASTER_PREFERRED: reads data from the master node first, and from a replica only when the master is unavailable
  • REPLICA: reads data from a slave (replica) node
  • REPLICA_PREFERRED: reads data from a slave (replica) node first, and from the master only when all slaves are unavailable

Sharded cluster

Master-slave replication and Sentinel solve the problems of high availability and high-concurrency reads, but two problems remain:

  • Massive data storage problems
  • High concurrency write problem

A sharded cluster solves both problems. It has the following characteristics:

  • There are multiple masters in a cluster, and each master holds different data
  • Each master can have multiple slave nodes
  • The masters ping each other to monitor each other's health status
  • Client requests can access any node in the cluster and are eventually forwarded to the correct node

Hash slot

Redis maps the master nodes onto 16,384 hash slots, numbered 0 to 16383. Each master is responsible for a subset of the slots.

Data keys are bound to slots. Redis calculates the slot value from the valid part of the key, which is determined in one of two ways:

  • The key contains "{}" and there is at least one character inside the braces. The part inside "{}" is the valid part
  • The key does not contain "{}". The entire key is the valid part

For example, if the key is {typeId}typeName, then typeId is the valid part used to calculate the slot value. If the key is typeIdtypeName, with no braces, the entire key is used.
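The slot calculation can be sketched as follows. The text above does not name the hash function; per the Redis Cluster specification, it is CRC16 (the XMODEM variant) of the valid part, modulo 16384. The class `HashSlot` and its method names are hypothetical, and the code is an illustration rather than the Redis implementation.

```java
import java.nio.charset.StandardCharsets;

/** Sketch of mapping a key to one of the 16,384 hash slots. */
class HashSlot {
    static final int SLOT_COUNT = 16384;

    /** CRC16, XMODEM variant: polynomial 0x1021, initial value 0. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                if ((crc & 0x8000) != 0) {
                    crc = (crc << 1) ^ 0x1021;
                } else {
                    crc <<= 1;
                }
                crc &= 0xFFFF; // keep the value within 16 bits
            }
        }
        return crc;
    }

    /** Returns the part inside the first "{...}" if it is non-empty, otherwise the whole key. */
    static String validPart(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                return key.substring(open + 1, close);
            }
        }
        return key;
    }

    static int slot(String key) {
        return crc16(validPart(key).getBytes(StandardCharsets.UTF_8)) % SLOT_COUNT;
    }

    public static void main(String[] args) {
        // Both keys share the valid part "typeId", so they land in the same slot.
        System.out.println(slot("{typeId}typeName"));
        System.out.println(slot("typeId"));
    }
}
```

Because only the valid part is hashed, keys that share the same text inside the braces always land in the same slot, which is how related keys can be kept on one node.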