The introduction

Hello everyone, I'm South Orange. It's been almost two years since I first came into contact with Java. In those two years I've gone from a total beginner who didn't even understand basic data structures to a slightly more advanced beginner, and I've learned a lot along the way. Knowledge becomes more valuable the more it's shared, so over this period I've summarized (including things learned and quoted from other experts) some of the points that I think matter most in everyday study and in interviews. I hope it brings you some help.

Before this article begins, I'd first like to thank one person: the third prince Aobin, whose articles made me realize just how rich Redis knowledge is. I read his articles every week.

Here is the mind map for this article. Because it was made with the free version of the software, there are a lot of watermarks; if you need the original version, you can add me on WeChat:

Students in need can also follow my public account, where future articles will appear first, and you can ask me for the mind map there too. Because of time and length constraints, I couldn't finish Redis in one go, so it's split into two parts. Friends who haven't read the first half can go read it first ~

  • Some overlooked points about indexes
  • Redis basics to get you covered (1)
  • Redis basics to get you covered (2)
  • Everything — Locks in JAVA
  • Conquering the JVM — JVM objects and object access (Part 1)
  • Conquering the JVM — Garbage collection in the JVM (Part 2)
  • Conquering the JVM — The JVM's garbage collectors (Part 3)

One: Cache avalanche, breakdown, and penetration

This time, let's start with one of the most talked-about (and most frequently asked-about) Redis topics: cache failures.

Cache avalanche

As the name suggests — you've probably seen an avalanche. I, South Orange, have seen one from far away, and the scene really felt like heaven and earth turning over; for a database, a cache avalanche is much the same. A cache avalanche is when a large portion of the Redis cache becomes invalid at the same moment. In that instant, it's as if Redis doesn't exist, and requests go straight to the database. Think about it: the whole point of caching is to reduce load on the DB, and if the cache is suddenly gone, won't the flood of requests simply blow up the database?

How does a cache avalanche occur?

  • A large number of cache entries expire at the same time, perhaps because they were written at the same time with the same TTL
  • A large batch of cached data is deleted at once
  • The cache layer itself has a failure and stops working properly

The solution:

  • 1. When writing data to Redis in batches, add a random offset to each key's expiration time, so that keys are guaranteed not to expire at the same moment
  • 2. Set hotspot data to never expire, and update the cache whenever there is an update operation (this method is not great, though: never-expiring keys lead to a large volume of cached data, much of which may never be used)
  • 3. Deploy Redis as a cluster: distribute hotspot data evenly across different Redis instances, so that a problem with one Redis node cannot invalidate everything and trigger an avalanche
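Point 1 above is easy to sketch. Below is a minimal illustration of TTL jitter (the class and method names are my own inventions for illustration; the commented-out `jedis.setex` call just shows where the jittered TTL would be used):

```java
import java.util.concurrent.ThreadLocalRandom;

public class TtlJitter {
    // Base TTL plus a random offset, so keys written in the same batch
    // do not all expire at the same moment.
    public static int jitteredTtlSeconds(int baseSeconds, int maxJitterSeconds) {
        return baseSeconds + ThreadLocalRandom.current().nextInt(maxJitterSeconds + 1);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            // In real code: jedis.setex("hot:key:" + i, jitteredTtlSeconds(3600, 300), value);
            System.out.println(TtlJitter.jitteredTtlSeconds(3600, 300));
        }
    }
}
```

With a base of 3600 seconds and up to 300 seconds of jitter, a batch of keys expires spread over a five-minute window instead of all at once.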

Cache breakdown

There is a key that is very hot, constantly carrying heavy concurrent traffic, with many requests converging on this single point. The moment that key expires, all of that concurrency misses the cache and hits the database directly.

In fact, cache breakdown really isn't a particularly big problem; after all, not every company has such a huge hot spot on a single key. With a sensible expiration policy and a stable Redis cluster, cache breakdown is not hard to avoid.

The solution:

  • 1. Set the hotspot data to never expire
  • 2. Use a mutex key: when the cache misses, instead of loading the DB immediately, first use one of the cache's atomic set-if-absent operations (such as Redis's SETNX or Memcache's ADD) to set a mutex key. The request that succeeds in setting the mutex loads the DB and rebuilds the cache; the others wait briefly and retry the cache.

In my experience, a mutex lock is clearly not always necessary. If the hot key never expires, why add a lock? Isn't that just extra complexity? Maybe it helps in special situations, but for me, this is just a personal view you'll probably only see on this blog.
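For those who do want the mutex approach, here is a minimal in-process sketch of the idea. Note the assumptions: a `ConcurrentHashMap.putIfAbsent` stands in for Redis's SETNX, the class and method names are made up, and in real code both the cache and the lock would live in Redis with a TTL on the lock key:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class MutexRebuild {
    // In-process stand-ins for Redis; real code would use GET/SETEX for the
    // cache and SETNX (with an expiry) for the mutex key.
    static final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    static final ConcurrentHashMap<String, Boolean> locks = new ConcurrentHashMap<>();

    public static String get(String key, Supplier<String> loadFromDb) {
        String val = cache.get(key);
        while (val == null) {
            // putIfAbsent plays the role of SETNX: only one caller wins the lock
            if (locks.putIfAbsent("mutex:" + key, Boolean.TRUE) == null) {
                try {
                    val = loadFromDb.get();   // only the winner hits the DB
                    cache.put(key, val);
                } finally {
                    locks.remove("mutex:" + key);
                }
            } else {
                try { Thread.sleep(20); }     // losers wait, then re-check the cache
                catch (InterruptedException ie) { Thread.currentThread().interrupt(); return null; }
                val = cache.get(key);
            }
        }
        return val;
    }
}
```

The point of the pattern: under a burst of concurrent misses, only one request rebuilds the cache; everyone else gets the rebuilt value.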

Cache penetration

By name, cache penetration sounds similar to cache breakdown, and in fact the two scenarios look alike; but since they are distinguished, there must be some differences.

The concept of cache penetration is simple: a user queries some data that the Redis in-memory database doesn't have, so the cache misses and the query goes on to the persistence-layer database, which doesn't have it either, so the query fails. When there are many such users, the cache never hits and every request lands on the persistence-layer database, putting enormous pressure on it. That is cache penetration.

But what cache penetration protection really guards against is hackers.

If a hacker deliberately queries data that does not exist in the cache every time, every request has to be answered by the storage layer and the cache becomes meaningless. Under heavy enough traffic, the database can collapse. The solution:

  • 1. Add parameter validation
  • 2. At the gateway layer, add an Nginx configuration item to block any single IP whose requests per second exceed a threshold
  • 3. Use a Bloom filter, which can effectively prevent cache penetration. Its principle is simple: it uses an efficient data structure and algorithm to quickly determine whether your key exists in the database

Using a Bloom filter to prevent cache penetration mainly means putting all existing keys into the Bloom filter; when a hacker asks for a key that doesn't exist, the filter returns quickly, preventing both the cache and the DB from being dragged down.

So what is a Bloom filter?

A Bloom filter is a clever probabilistic data structure, characterized by efficient inserts and queries, that can tell you that "something definitely does not exist, or may exist."

The Bloom filter consists of a long array of bits and a series of hash functions.

Each element of the array occupies only 1 bit of space and can only be 0 or 1.

A Bloom filter has k hash functions. When an element is added, it is hashed by each of the k functions to obtain k hash values, and the bits at the corresponding positions in the array are set to 1.

To check whether a value is in the Bloom filter, the element is hashed k times and we check whether the bit at each resulting position is 1. If every one of those bits is 1, the value may be in the Bloom filter.

As more and more elements are inserted, an element that is not actually in the Bloom filter can hash to positions that are all 1, simply because other elements set those bits to 1 first.

Therefore a Bloom filter can give false positives; but if the Bloom filter says an element is not present, then that value is definitely absent.
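To make the description above concrete, here is a minimal Bloom filter sketch (my own toy class, using double hashing to derive the k positions; a production system would use a tuned library implementation instead):

```java
import java.util.BitSet;

public class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int k;   // number of hash functions

    public SimpleBloomFilter(int size, int k) {
        this.bits = new BitSet(size);
        this.size = size;
        this.k = k;
    }

    // Derive the i-th of k hash positions from two base hashes (double hashing)
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9;
        return Math.floorMod(h1 + i * h2, size);
    }

    public void add(String key) {
        for (int i = 0; i < k; i++) bits.set(position(key, i));
    }

    // true  -> key MAY exist (false positives are possible)
    // false -> key DEFINITELY does not exist
    public boolean mightContain(String key) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(position(key, i))) return false;
        }
        return true;
    }
}
```

The asymmetry in `mightContain` is exactly the property the text describes: a "no" is certain, a "yes" is only probable.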

In this way, the cache can be effectively protected from being penetrated by hackers.

Two: Redis cluster implementation

1. Traditional master-slave mode

"Traditional" may not be quite the right word, but I feel like every clustering scheme has a master-slave mode orZ

One purpose of master-slave mode is data backup: if a node is damaged (that is, the hardware fails irrecoverably), the data can easily be restored thanks to the replica. The other purpose is load balancing: having all clients access a single node would definitely hurt Redis's efficiency, while with master and slave nodes, queries can be served by the slaves.

In master-slave mode, one Master can have multiple Slaves. By default, the master node can perform both reads and writes, while a slave node can only perform reads, not writes.

If you change the default configuration, you can write to a slave, but this makes no sense: the data will not be synchronized to the other slaves, and as soon as the master synchronizes again, the data on the slave is overwritten.

If a slave node fails, reads and writes on the other slaves and the master are unaffected, and after the failed slave restarts, it re-synchronizes its data from the master. If the master node fails, Redis stops providing write service (the slaves continue to serve reads); write service resumes only once the master is back up.

So we can see that Redis master-slave is completely different from ZooKeeper's master-slave: there is no election!

This drawback is very significant, especially for production environments, which can't afford to stop serving for even a moment; that's why plain master-slave mode is rarely used alone in production. And so we arrive at Sentinel mode.

2. Sentinel Mode

Sentinel mode is used together with master-slave mode. Since the master and slaves cannot hold an election on their own, we add a sentinel: when the sentinel finds that the master node is down, it elects a new master from among the slaves.

The sentinel's job is to monitor the health of the Redis system. Its functions include the following two:

(1) Monitor whether the primary and secondary servers are running properly. (2) When the primary server fails, automatically promote a secondary server to primary.

Wouldn’t that make everyone happy?

How sentinels work:

  • 1. Each Sentinel process sends a PING command once per second to the Master, the Slaves, and the other Sentinel processes in the cluster.
  • 2. If an instance takes longer than the configured down-after-milliseconds value to respond to a PING, it is marked subjectively down (SDOWN) by that Sentinel process.
  • 3. If a Master server is marked SDOWN, all Sentinel processes monitoring it must confirm, at a rate of once per second, that the Master really is down

When a sufficient number of Sentinel processes (at least the quorum specified in the configuration file) confirm within the specified time range that the Master is subjectively down (SDOWN), the Master is marked objectively down (ODOWN).

  • 4. In general, each Sentinel process will send the INFO command to all the Master and Slave servers in the cluster every 10 seconds.

Once the Master is marked objectively down (ODOWN), the frequency at which each Sentinel sends the INFO command to all Slaves of the offline Master changes from once every 10 seconds to once per second.

  • 5. The Master's objective offline (ODOWN) status is removed when there is no longer a sufficient number of Sentinel processes agreeing that it is offline; its subjective offline (SDOWN) status is removed once the Master sends a valid PING response back to a Sentinel process.
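The SDOWN-to-ODOWN promotion in steps 2 through 4 boils down to a quorum count over the sentinels' votes. A tiny sketch of that idea (the class and method names are my own, not Sentinel internals):

```java
import java.util.Map;

public class OdownCheck {
    // Each monitoring sentinel reports whether it currently considers
    // the master subjectively down (SDOWN). The master is objectively
    // down (ODOWN) once at least `quorum` sentinels agree.
    public static boolean isObjectivelyDown(Map<String, Boolean> sdownVotes, int quorum) {
        long agreeing = sdownVotes.values().stream().filter(v -> v).count();
        return agreeing >= quorum;
    }
}
```

With three sentinels and a quorum of 2, two SDOWN votes are enough to mark the master ODOWN and trigger failover.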

Sentinel mode can basically meet ordinary production needs and provides high availability. However, when the amount of data is too large to fit on one server, neither master-slave nor Sentinel mode is sufficient; the data must be sharded and stored across multiple Redis instances, which is cluster mode.

3. Cluster mode

Clusters appeared to solve the limited capacity of a single Redis instance: Redis data is distributed across multiple machines according to certain rules.

Redis-cluster adopts a decentralized structure, with the following characteristics:

  • All Redis nodes are connected to each other (the PING-PONG mechanism), internally using a binary protocol to optimize transmission speed and bandwidth.

  • A node is considered failed only when more than half of the nodes in the cluster detect the failure.

  • Clients connect directly to Redis nodes without an intermediate proxy layer, and a client does not need to connect to all nodes in the cluster; connecting to any available node is enough.

Cluster mode can be said to be a combination of Sentinel and master-slave mode: replication and master re-election can both be realized through the cluster. So, for example, three shards each with its own replica would need six Redis instances. Because Redis data is distributed across the cluster's machines according to certain rules, new machines can be added for capacity expansion when the data volume grows too large. This mode suits caching requirements with huge data volumes; when the data volume is not that large, Sentinel is enough.

What is a Redis-cluster

Each request to the Redis-cluster is routed. Routing could shard keys with a plain hash (or other methods), but with plain hashing, a node dying means its whole shard of cache misses at once. Hence the proposed approach: **consistent hashing (automatic cache migration) + virtual nodes (automatic load balancing)**.

For the details, you can read the article below; it's written in great detail and is very enjoyable to read:

www.jianshu.com/p/49c9e03ee… A few things about Redis (10) Redis Cluster mode

Consistent hashing: place all master nodes on a ring, each node at the point corresponding to its hash value. When a key arrives, compute its hash, find where that hash falls on the circle, then move clockwise to the nearest node point; that node is where the data is stored and read.

The advantage of consistent hashing is that if a master goes down, only the data that was mapped to that master is affected: those keys are taken over by the next node clockwise, so only a fraction of the data is lost.
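A minimal sketch of a consistent-hash ring with virtual nodes, as described above (my own toy class built on a TreeMap; real systems use stronger hash functions and also handle replication and migration):

```java
import java.util.SortedMap;
import java.util.TreeMap;

public class ConsistentHashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private final int virtualNodes;

    public ConsistentHashRing(int virtualNodes) { this.virtualNodes = virtualNodes; }

    // FNV-1a style hash; any well-spread hash works for a sketch
    private int hash(String s) {
        int h = 0x811C9DC5;
        for (int i = 0; i < s.length(); i++) { h ^= s.charAt(i); h *= 16777619; }
        return h;
    }

    // Each physical node is placed on the ring many times (virtual nodes),
    // which smooths out the load distribution.
    public void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) ring.put(hash(node + "#" + i), node);
    }

    public void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) ring.remove(hash(node + "#" + i));
    }

    // Walk clockwise: the first ring position >= the key's hash owns the key
    public String nodeFor(String key) {
        if (ring.isEmpty()) return null;
        SortedMap<Integer, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }
}
```

The key property matches the text: removing one node only reassigns the keys that node owned; keys owned by surviving nodes stay put.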

Three: Memory eviction mechanisms

Since Redis stores data, it naturally has to delete stale data too; otherwise, once the space is full, where would new content go? Clever programmers have come up with three approaches to Redis's memory problem.

  • 1. Periodic deletion
  • 2. Lazy deletion — an expired key is only handled when it is queried: nothing is returned and the key is deleted at that point. This leaves a lot of stale data taking up space
  • 3. An eviction mechanism for when memory runs out

1. Periodic deletion

Create a timer so that when a key's expiration time arrives, a timer task deletes it immediately. By default, every 100ms some keys are randomly sampled and checked for expiry, and the expired ones are deleted. This trades processor performance for storage space (time for space).

  • Advantage: saves memory; keys are deleted as soon as they expire, quickly freeing memory that is no longer needed

  • Disadvantage: high CPU pressure; the deletion work occupies the CPU no matter how loaded it already is, which can affect the response time and instruction throughput of the Redis server

2. Lazy deletion

When data in Redis reaches its expiration time, nothing happens immediately. On the next access, if the data has not expired, it is returned normally; if it has expired, it is deleted at that moment, as if it no longer existed. This trades storage space for processor performance (space for time).

  • Advantage: saves CPU; a key is deleted only when it absolutely must be

  • Disadvantage: high memory pressure; expired data may occupy memory for a long time
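Lazy deletion is simple to sketch: store an expiry timestamp with each value and check it only on read. A minimal in-process illustration (my own toy class, not how Redis is implemented internally):

```java
import java.util.HashMap;
import java.util.Map;

public class LazyExpireCache {
    private static class Entry {
        final String value;
        final long expiresAtMillis;
        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> store = new HashMap<>();

    public void setex(String key, String value, long ttlMillis) {
        store.put(key, new Entry(value, System.currentTimeMillis() + ttlMillis));
    }

    // Lazy deletion: expiry is only checked (and the key removed) on read,
    // so expired entries can sit in memory until someone touches them.
    public String get(String key) {
        Entry e = store.get(key);
        if (e == null) return null;
        if (System.currentTimeMillis() >= e.expiresAtMillis) {
            store.remove(key);   // delete on access, not at expiry time
            return null;
        }
        return e.value;
    }
}
```

This also makes the disadvantage visible: a key written and never read again occupies memory forever, which is exactly why an eviction mechanism is still needed.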

I don't know if you've noticed, but with most algorithms it's either trading time for space or space for time. This is one of the ultimate mysteries of humanity; if only we understood this, we could solve most of our problems.

3. Eviction mechanism

You can also set the eviction policy by searching for maxmemory-policy in Redis's redis.conf file

  • noeviction: when memory usage reaches the limit, all commands that would allocate memory return errors
  • allkeys-lru: evict the least recently used keys from the whole key space first
  • volatile-lru: evict the least recently used keys from among the keys that have an expiration set
  • allkeys-random: evict random keys from the whole key space
  • volatile-random: evict random keys from among the keys that have an expiration set
  • volatile-ttl: among keys with an expiration set, evict those closest to expiring first
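The two *-lru policies are built on the least-recently-used idea. Redis actually uses an approximate, sampled LRU rather than a strict one, but the core idea can be sketched with Java's LinkedHashMap in access order (a toy class of my own):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Not Redis's real implementation (Redis uses an approximate, sampled LRU);
// just a sketch of the idea behind the *-lru policies.
public class LruCache extends LinkedHashMap<String, String> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true);   // accessOrder=true: gets refresh recency
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        return size() > capacity;  // evict the least recently used entry when over capacity
    }
}
```

A quick walk-through: with capacity 2, inserting "a" and "b", then reading "a", then inserting "c" evicts "b", because "b" is now the least recently used entry.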

Conclusion

And that's the end of the Redis articles. My feeling is that Redis really does contain a lot: before writing, everything seemed mastered, but looking back I find there is still plenty I don't understand ~~~~~ I'm very happy to be able to share my gains here. I know my technology stack still has a gap compared with you big names, but if a person doesn't work hard, how will they know what they can't do? I hope you enjoy my article, and I hope it can bring you some help.

Also, if you need the mind map, feel free to contact me; after all, the more knowledge is shared, the more fragrant it gets!