Redis knowledge collation

Study the book “Redis In-depth Adventure: Core Principles and Application Practice”, systematically organize the knowledge related to Redis that you are interested in, just as reading notes.

1. Data structure

1.string

A dynamic string with an internal structure similar to ArrayList is actually a bitmap. It adopts a pre-allocated redundant space strategy to reduce the frequent allocation of memory. The maximum length is 512 MB.

Instructions:

Set, GET, expire,setex,setnx

Capacity Expansion policy:

If the string length is less than 1 MB, the capacity expansion doubles the existing space. If the capacity exceeds 1 MB, add 1 MB for each capacity expansion.

USES:

2.list

As a LinkedList, it is actually a quicklist with very fast insertions and deletions. The time complexity is O(1), but the index location is slow and the time complexity is O(n). When the last element of the list pops up, the data structure is automatically deleted and the memory is reclaimed.

Instructions:

Rpush, rpop

Data structure:

When a list has fewer elements, it uses contiguous memory storage. This structure is zipList, or compressed list. It stores all the elements right next to each other, allocating a contiguous chunk of memory. Quicklist is used when there is a large amount of data. Because ordinary linked lists require too much additional pointer space, it will be a waste of space, and will increase the fragmentation of memory.

For example, the list contains only int data, and the structure requires two additional Pointers, prev and next

So Redis combines linked lists with Ziplist to make quickList. That is, multiple Ziplists using a bidirectional pointer string to use. This not only meets the fast insert and delete performance, but also does not appear too large space redundancy.

USES:

It is often used for asynchronous queues. The task structure that needs to be deferred is serialized as a string into the Redis list, from which another thread polls the data for processing, similar to a stack.

3.hash

The dictionary

Equivalent to Hashmap, unordered dictionary. The data structure is array + linked list.

Data structure:

Data + linked list, Redis dictionary values can only be strings, and when the dictionary value is too large, its rehash process is not consistent with hashMap. Redis uses a progressive Rehash strategy because it can’t block services for high performance.

Progressive rehash preserves the old and new hash structures while rehashing, queries both hash structures at the same time, and then migrates the contents of the old hash to the new hash structure gradually in subsequent scheduled tasks and subdirectives of the hash. When the hash removes the last element, the data structure is automatically deleted and the memory is reclaimed.

USES:

The hash structure can also be used to store user information. Unlike strings, which serialize the entire object at once, hash can be stored separately for each field in the user structure. This allows us to partially retrieve user information when we need it. However, if the user information is stored in the form of a whole string, it can only be read at a time, which will waste network traffic.

Disadvantages:

The storage cost of a hash structure is higher than that of a single string.

4.set

Similar to a HashSet, its internal key-value pairs are unordered and unique. Its internal implementation is equivalent to a special dictionary in which all values are a single value, NULL. When the last element in the collection is removed, the data structure is automatically deleted and the memory is reclaimed.

USES:

The set structure can be used to store user ids that win prizes in an activity because of the de-duplication feature, which ensures that the same user will not win twice.

5.zset

An ordered list

Similar to sortedSet + Hashmap, on the one hand, because it is a set, it ensures the uniqueness of the internal value. On the other hand, it can assign a score to each value, representing the ranking weight of the value. After the last value in the zset is removed, the data structure is automatically deleted and the memory is reclaimed.

Data structure:

Jump list.

use

  • Zset can be used to store a list of fans. The value is the user ID of the fans and the score is the following time. We can sort the list of followers by time of following.
  • Zset can also be used to store a student’s scores, with value being the student’s ID and score being his test scores. We can sort the results by points to get his ranking.

Second, the purpose

1. Distributed lock

The goal of distributed locking is essentially to occupy a “pit” in the Redis, and when other processes try to occupy it, they either give up or try again later. Redis distributed locks should not be used for long duration tasks

* * * * instructions

Setnx, del, expire

2. Delay queue

Redis’s message queue is not a professional message queue, it does not have many advanced features, it does not have ACK guarantees, and it is not suitable for use if there is an extreme pursuit of message reliability.

To solve the problem of empty pop reads, you can first add sleep time. It is best to use blPOP/BRPOP, which blocks reads. Blocking reads go to sleep when there is no data in the queue and wake up as soon as the data arrives. Message latency is almost zero. Of course, if the thread is blocked all the time, the Redis client connection will become idle connection, idle for a long time, the server will generally take the initiative to disconnect, reduce idle resources. At this point blPOP/BRPOP will throw an exception.

Delayed queues can be implemented through Redis’ Zset (ordered list). We serialize the message into a string as the value of zset, and the expiration processing time of the message as score. Then, multiple threads poll Zset to obtain the expired task for processing.

3. The bitmap

Bitwise access value to save space.

4.HyperLogLog

HyperLogLog provides two instructions, pfAdd and pfcount, which are understood literally: one is to increase the count and the other is to get the count. Pfadd is used the same way as sadd for set. When you get a user ID, you insert it. Pfcount is used the same way as scard, directly fetching the count value.

5. Bloom filter

An imprecise set structure that may misjudge the existence of an object when you use its contains method. When a Bloom filter says a value exists, it may not exist; When it says it doesn’t exist, it definitely doesn’t exist.

Third, the principle of

Why is Redis so good as a single-threaded model? The answer is in the details, based on full memory level operations, using a multiplexed threading model, parsing extremely high performance text communication protocols, pipes that greatly reduce the number of communication interactions, snapshots and AOF double persistence for different scenarios, five excellent basic data structures, and so on. All in all, it’s so damn good.

1. Thread I/O model

Redis is a single-threaded program. All the data is in memory, all the operations are memory level operations.

None-blocking + event polling (multiplexing). Non-blocking reads solve the problem of traditional I/O read blocking, event polling solve the problem of when to read. The multiplexing apis of modern operating systems no longer use SELECT system calls in favor of epoll(Linux) and KQueue (FreeBSD & MacOSX) because the performance of select system calls can be very poor when there are too many descriptors.

2. Communication protocol

RESP is short for Redis Serialization Protocol. It is an intuitive text protocol with the advantage of exceptionally simple implementation and excellent parsing performance.

3. The pipeline

Read -> Write -> Read -> Write converts to read -> read -> write -> write, saving network overhead. In fact, it is implemented by the Redis client. Clients can save a lot of IO time by changing the read and write order of the list of instructions in the pipe. The more instructions in the pipe, the better. Essentially, the client overwrites the read and write sequence.

4. The transaction

To ensure the atomicity of consecutive operations, a mature database usually has transaction support, and Redis is no exception. Redis transactions are very simple to use, and unlike relational databases, we don’t need to understand so many complex transaction models to use them directly. However, because of this simplicity, its transaction model is very lax, which requires that we do not use Redis in the same way as transactions with relational databases.

5. Gossip Pubsub

Pubsub is mainly used to support message multicast, in which a producer produces a message once and the middleware is responsible for copying the message to multiple message queues, each message queue is consumed by the corresponding consumer group. The five basic types of redis do not support multicast, but use a separate module to support message multicast, which is called PubSub, PublisherSubscriber model.

6. Synchronize backup

Incremental synchronization and snapshot synchronization

Incremental synchronization (AOF)

Incremental synchronization is an instruction stream that has a modifiable effect on state. It is synchronized to the local Buffer first, and then asynchronously transmitted to the slave node. The slave node generally synchronizes the state of the master node, and generally wants to feedback the offset of synchronization to the master node.

Redis’s copied buffer is a fixed-length, circular array that overwrites everything from the beginning if the array is full. Therefore, in the case of poor network communication, data that has not been synchronized may be overwritten. Therefore, snapshot synchronization is required.

Snapshot Synchronization (RDB)

Snapshot synchronization is a bgSave snapshot of the current memory data into a disk file on the master, and then the contents of the snapshot file are transferred to the slave node. After receiving data from a node, load the data in full mode immediately. Before loading the data, clear the current memory. Notify the master node to continue incremental synchronization after loading.

Snapshot synchronization is prone to infinite loops. Therefore, you must set an appropriate buffer size to avoid infinite loops.