If you are a backend engineer, you have almost certainly worked with Redis, and "why is Redis fast?" is a classic interview question you have probably prepared for. Many people know only that Redis is fast because it runs in memory; the other reasons remain vague.

So let's take a look together today:

(Figure: a mind map of the reasons covered below)

Memory-based implementation

I mentioned this at the beginning; here is a brief elaboration.

Redis is a memory-based database, which inevitably invites comparison with disk-based databases. A disk-based database has to read data from disk into memory first, and that step is limited by disk I/O.

For an in-memory database, the data already lives in memory, so there is no such overhead.

Efficient data structures

Redis offers multiple data types, and each is backed by one or more underlying data structures. These data structures are what let Redis store and read data at full speed. So what is special about them?

1. Simple dynamic strings

The full name may be unfamiliar, but you have certainly seen its abbreviation: SDS (Simple Dynamic String), which is how Redis handles strings. As anyone who knows C knows, the language already has functions for handling strings, and Redis is implemented in C, so why reinvent the wheel? Let's look at a few reasons:

(1) String length processing

The figure above shows how a string is stored in C. To get the length of "Redis", you must scan from the beginning until you hit the terminating '\0'.

How does Redis do it? An SDS records the current string length in a len field, so getting the length is just a matter of reading len. The difference speaks for itself: the C traversal is O(n), while Redis does it in O(1), a significant speedup.
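The idea can be sketched in a few lines of C. This is a minimal toy header, not the real sds.h (the actual Redis headers are more compact and come in several sizes); the names sds_hdr, sds_new and sds_len are illustrative only:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified sketch of an SDS-like header (the real sds.h differs). */
struct sds_hdr {
    size_t len;   /* bytes actually used */
    size_t free;  /* bytes allocated but unused */
    char   buf[]; /* the string data, still '\0'-terminated for C interop */
};

struct sds_hdr *sds_new(const char *init) {
    size_t len = strlen(init);
    struct sds_hdr *s = malloc(sizeof(*s) + len + 1);
    s->len = len;
    s->free = 0;
    memcpy(s->buf, init, len + 1);
    return s;
}

/* O(1): just read the header field, no scan for '\0'. */
size_t sds_len(const struct sds_hdr *s) { return s->len; }
```

Because buf is still terminated with '\0', such a string can also be passed to ordinary C library functions when it contains no embedded zeros.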

(2) Memory reallocation

C reallocates memory whenever a string is modified: the more frequent the modifications, the more frequent the allocations. Memory allocation costs performance, so degradation is inevitable.

Redis, however, performs string modifications frequently, so this allocation pattern is clearly unsuitable. SDS therefore implements two optimization strategies:

  • Space preallocation

When an SDS is modified and needs to grow, Redis allocates extra unused space on top of what is strictly necessary.

The allocation rule is as follows: after the modification, if the SDS's len is less than 1 MB, an equal amount of unused space is allocated (so free == len); if len is 1 MB or more, 1 MB of unused space is allocated.
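The rule above can be written as a small pure function. This is a sketch of the growth policy only; sds_alloc_size is an illustrative name, though the 1 MB cap matches the SDS_MAX_PREALLOC constant in Redis's sds.c:

```c
#include <assert.h>
#include <stddef.h>

#define SDS_MAX_PREALLOC (1024 * 1024) /* 1 MB, as in Redis's sds.c */

/* How many buffer bytes to allocate for a string whose new length
   is newlen: double it below 1 MB, otherwise add a flat 1 MB. */
size_t sds_alloc_size(size_t newlen) {
    if (newlen < SDS_MAX_PREALLOC)
        return newlen * 2;            /* free == len        */
    return newlen + SDS_MAX_PREALLOC; /* extra capped at 1 MB */
}
```

Doubling small strings amortizes the cost of repeated appends, while the 1 MB cap keeps very large strings from wasting half their allocation.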

  • Lazy space release

And of course, where there is allocation, there is also freeing.

When an SDS is shortened, it does not immediately reclaim the excess memory; instead it records the extra space in the free field. A subsequent modification can use the space recorded in free directly, reducing memory allocations.
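A sketch of that trim operation, using a toy fixed-size header (sds_hdr and sds_trim_to are illustrative names, not the real Redis API): shortening just shifts bytes from len to free and never calls realloc or free.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy SDS-style header with inline storage for the example. */
struct sds_hdr {
    size_t len;   /* bytes in use */
    size_t free;  /* bytes kept around for later growth */
    char   buf[32];
};

/* Lazy release: no memory is returned to the allocator. */
void sds_trim_to(struct sds_hdr *s, size_t newlen) {
    if (newlen >= s->len) return;
    s->free += s->len - newlen; /* excess is recorded, not freed */
    s->len = newlen;
    s->buf[newlen] = '\0';
}
```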

(3) Binary security

You already know that Redis can store a variety of data types, and binary data is no exception. Binary data, however, is not a regular string: it may contain special bytes such as '\0'.

As mentioned earlier, a C string ends at the first '\0', so any data after it cannot be read. In SDS, however, the end of the string is determined by len.

See, the binary security problem is solved.
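The difference is easy to demonstrate. Here bin_str and c_len are illustrative names for a toy length-prefixed buffer; the point is that strlen() stops at the first '\0' while a len field sees every byte:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy SDS-style buffer: len says how many bytes are valid,
   so embedded '\0' bytes do not truncate the data. */
struct bin_str {
    size_t len;
    char   data[16];
};

/* Plain C view of the same bytes: scanning stops at '\0'. */
size_t c_len(const char *s) { return strlen(s); }
```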

2. Doubly linked lists

Lists are most often used as queues or stacks: a queue is first-in-first-out, while a stack is first-in-last-out. Doubly linked lists support both access patterns well.

(Figure: a doubly linked list)

(1) Front and rear nodes

Each node in the list has two pointers: prev points to the previous node and next points to the next one. This way, either neighbor can be reached in O(1) time.

(2) head and tail nodes

You may have noticed that the list header holds two pointers, head and tail, pointing to the first and last nodes respectively. This design reduces access to either end to O(1), which is ideal for queues and stacks, and it also lets list iteration start from either end.

(3) The length of the list

The list header also contains a len field that tracks the length of the list, similar to SDS. So instead of traversing the whole list, you just read len: O(1).
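These three properties can be sketched together. This is a toy version of the idea behind Redis's adlist (the names listNode, list and list_push_tail are illustrative, and real list nodes hold void pointers rather than ints):

```c
#include <assert.h>
#include <stdlib.h>

/* Node with prev/next pointers. */
typedef struct listNode {
    struct listNode *prev, *next;
    int value;
} listNode;

/* Header with head, tail and len, so pushing or popping at either
   end and reading the length are all O(1). */
typedef struct list {
    listNode *head, *tail;
    size_t len;
} list;

void list_push_tail(list *l, int v) {
    listNode *n = malloc(sizeof(*n));
    n->value = v;
    n->next = NULL;
    n->prev = l->tail;
    if (l->tail) l->tail->next = n; else l->head = n;
    l->tail = n;
    l->len++;
}
```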

As you can see, all of these features reduce the time overhead of using lists.

3. Compressed lists

Doubly linked lists should be familiar by now, but have you noticed a problem? If you store a tiny piece of data, say a single byte, in a linked-list node, you still pay for the extra bookkeeping: the node structure and its prev and next pointers.

This wastes space, and the repeated allocations and frees easily lead to memory fragmentation, so memory is used inefficiently.

So, compressed lists come into play!

A ziplist is a specially encoded, contiguous block of memory designed for memory efficiency; all operations are performed via pointers and decoded offsets.

And because the compressed list's memory is allocated contiguously, traversal is very fast.
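A drastically simplified illustration of the idea, assuming a made-up layout (the real ziplist format has richer headers, variable-width length encodings and a tail offset): each entry is one length byte followed by its payload, all in a single contiguous buffer, so walking the list is pure pointer arithmetic with no per-node allocations.

```c
#include <assert.h>
#include <stddef.h>

/* Toy layout: [len0][bytes0][len1][bytes1]...[0xFF end marker].
   Entries live side by side in one allocation, unlike list nodes. */
size_t zl_count(const unsigned char *buf) {
    size_t n = 0;
    while (*buf != 0xFF) {   /* 0xFF marks the end of the buffer   */
        buf += 1 + *buf;     /* skip the length byte plus payload  */
        n++;
    }
    return n;
}
```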

4. Dictionaries

Redis is a K-V database, and all key values are stored in dictionaries.

If you want to find a word in a real dictionary, you can locate it directly by the word itself, which is very fast. The dictionary here works on the same principle: a key directly yields the corresponding value.

A dictionary is also called a hash table, and hash tables are well known for looking up and inserting values in O(1) time.
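A minimal chained hash table shows the O(1) path from key to value. This is a sketch only: the bucket count is fixed and there is no resizing, whereas Redis's dict rehashes incrementally across two tables; all names here (dict, entry, dict_set, dict_get) are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NBUCKETS 16

typedef struct entry {
    const char *key;
    int val;
    struct entry *next; /* chain for bucket collisions */
} entry;

typedef struct dict {
    entry *buckets[NBUCKETS];
} dict;

/* djb2-style string hash. */
static unsigned hash_str(const char *s) {
    unsigned h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Caller owns the entry's storage; it is linked into its bucket. */
void dict_set(dict *d, entry *e) {
    unsigned i = hash_str(e->key) % NBUCKETS;
    e->next = d->buckets[i];
    d->buckets[i] = e;
}

/* Hash the key, then scan only that one short chain. */
int *dict_get(dict *d, const char *key) {
    for (entry *e = d->buckets[hash_str(key) % NBUCKETS]; e; e = e->next)
        if (strcmp(e->key, key) == 0) return &e->val;
    return NULL;
}
```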

5. Skip lists

The skip list is a special data structure in Redis: it adds multiple levels of indexes on top of a linked list to improve search efficiency.

This is a simple schematic of a skip list: each level is a sorted linked list, and the bottom level contains all the elements. This structure lets a skip list find a node in O(log N) time.

The figure below shows the actual storage structure of the skip list. As with the other data structures, bookkeeping information is recorded in the header node, avoiding some unnecessary overhead.
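The search itself can be sketched with a hand-built two-level list (real Redis skip lists use randomized levels, spans and backward pointers; zskipnode and skiplist_contains are illustrative names). Start at the top level, move right while the next score is still smaller, then drop a level:

```c
#include <assert.h>
#include <stddef.h>

#define LEVELS 2

typedef struct zskipnode {
    int score;
    struct zskipnode *forward[LEVELS]; /* forward[k]: next node on level k */
} zskipnode;

/* Descend from the top index level to the base list; each level
   skips over runs of smaller nodes, giving O(log N) on average. */
int skiplist_contains(zskipnode *header, int score) {
    zskipnode *x = header;
    for (int k = LEVELS - 1; k >= 0; k--)
        while (x->forward[k] && x->forward[k]->score < score)
            x = x->forward[k];
    x = x->forward[0];
    return x != NULL && x->score == score;
}
```

The test below builds levels by hand: level 1 links 1 -> 7 -> 19, level 0 links 1 -> 4 -> 7 -> 11 -> 19.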

Reasonable data encoding

Each data type may be backed by more than one underlying data structure; which structure is used at any given time is a matter of encoding conversion.

Let’s take a look at how different data types are encoded:

String: int encoding is used for integer values, raw encoding otherwise.

List: ziplist encoding is used when every element's string length and the total number of elements are below certain thresholds; if either condition fails, the list is converted to linkedlist encoding.

Hash: ziplist encoding is used when every key and value string is shorter than a certain length and the number of key-value pairs is below a certain threshold; otherwise hashtable encoding is used.

Set: intset encoding is used when all elements are integers and the element count is below a certain threshold; if either condition fails, hashtable encoding is used.

Zset: ziplist encoding is used when the number of stored elements and each member's length are below certain thresholds; if either condition fails, skiplist encoding is used.
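Taking the Set rule as an example, the decision can be sketched as a tiny function. The 512 threshold mirrors the redis.conf default set-max-intset-entries (configurable per deployment); set_encoding is an illustrative name, not a Redis function:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sketch of the Set encoding decision. In real Redis the check on
   integer-ness happens per element as values are added. */
const char *set_encoding(size_t n_elems, int all_integers) {
    if (all_integers && n_elems <= 512) /* set-max-intset-entries */
        return "intset";
    return "hashtable";
}
```

The same pattern, compact encoding for small homogeneous data with an automatic upgrade, applies to the List, Hash and Zset rules above.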

Appropriate threading model

Another reason why Redis is fast is that it uses an appropriate threading model:

1. I/O multiplexing model

  • I/O: network I/O

  • Multiplexing: multiple TCP connections

  • Reuse: sharing a single thread or process

In a production environment, it is common for multiple clients to connect to Redis and send commands; the Redis server processes the requests and returns the results.

Redis uses an I/O multiplexer to monitor multiple sockets at the same time, pushes the ready events into a queue, and then executes them one by one, eventually returning the results to the clients.
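The core mechanism can be shown with POSIX select() on a pipe standing in for a client socket (Redis itself wraps epoll, kqueue or select behind its own event library, ae.c; fd_is_readable is an illustrative helper, and this sketch assumes a POSIX system):

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* One caller can watch many fds at once with select(); here we ask
   about a single fd and return 1 if it is readable within the timeout. */
int fd_is_readable(int fd, int timeout_ms) {
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };
    return select(fd + 1, &rfds, NULL, NULL, &tv) > 0;
}
```

In a real server, the readable fds returned by one such call become the event queue that the single worker thread drains.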

2. Avoid context switches

You’ve probably heard that Redis is single-threaded. So why is single-threaded Redis fast?

Because multithreading requires CPU context switches during execution, and switching is time-consuming. Redis is also memory-based, and for in-memory data, avoiding context switches is the most efficient approach: performing all reads and writes on a single CPU is optimal.

3. Single-threaded model

By the way, why is Redis single-threaded?

Redis uses the single-threaded Reactor model. You may not be familiar with it; that's fine, a general idea is enough here.

In this diagram, all incoming client requests are pushed into a queue and then handed one by one to the file event dispatcher, which runs in a single thread. Since Redis's command processing happens in this single thread, Redis is single-threaded.

Conclusion

Memory-based implementation

  • Data is stored in memory, eliminating unnecessary disk I/O, so operations are very fast.

Efficient data structures

  • The underlying data structures back the different data types, enabling Redis to store different kinds of data.

  • Different data structures are designed to minimize the time complexity of data storage.

Reasonable data encoding

  • Different encoding formats are chosen according to string length and element count.

Appropriate threading model

  • The I/O multiplexing model lets a single thread listen for many client connections;

  • Single thread does not need to perform context switch during execution, reducing time consumption.

About the author

Author: Hello, I'm Lewu, a working programmer who has been through BAT. From a small company to big tech, I gained a lot along the way, and I want to share those experiences with people who need them, so I created the public account "IT Migrant Workers". It updates regularly; I hope it helps you.