Almost every Java interview includes questions about caching. The basic ones ask what the principle behind it is and what “hot and cold data” means; the harder ones cover cache avalanche, cache penetration, cache preheating, cache updates, cache degradation, and so on. These seemingly unusual concepts all relate to our cache servers. Redis and Memcached are the commonly used ones, and Redis is the one I use most.

If an interviewer has never asked you “Why is Redis single-threaded, and why is Redis so fast?”, you should feel lucky to be reading this article! And if you happen to be a demanding interviewer, you can use this question to test how well the candidate in front of you has mastered the subject.

All right, down to business! Let’s first look at what Redis is, then why Redis is so fast, and finally why Redis is single-threaded.

One. Introduction to Redis

Redis is an open-source, in-memory data structure store that can be used as a database, a cache, and a message broker.

It supports many data structures: strings, hashes, lists, sets, sorted sets (ZSet) with range queries, bitmaps, HyperLogLogs, and geospatial indexes with radius queries. The most commonly used types are String, List, Set, Hash, and ZSet.

Redis has built-in replication, Lua scripting, LRU eviction, transactions, and different levels of on-disk persistence, and it provides high availability through Redis Sentinel and automatic partitioning through Redis Cluster.

Redis also offers persistence options that let users save their data to disk. Depending on your needs, you can dump the dataset to disk at regular intervals (snapshots), or append each write command to a log as it is executed (AOF, the append-only file). You can also turn persistence off entirely and use Redis as an efficient networked cache.
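To make the two persistence styles concrete, here is a minimal Python sketch of a toy key-value store. It is not how Redis is implemented: the class name ToyStore, the JSON file formats, and the file names are all assumptions made up for illustration. The snapshot method writes the whole dataset at once (the snapshot idea), while set appends every write command to a log that can be replayed later (the AOF idea).

```python
import json
import os
import tempfile


class ToyStore:
    """Toy in-memory key-value store with two persistence styles (illustrative only)."""

    def __init__(self, aof_path=None):
        self.data = {}
        self.aof_path = aof_path  # hypothetical append-only log file

    def set(self, key, value):
        self.data[key] = value
        if self.aof_path is not None:
            # AOF idea: append the write command itself to a log.
            with open(self.aof_path, "a", encoding="utf-8") as log:
                log.write(json.dumps(["SET", key, value]) + "\n")

    def snapshot(self, path):
        # Snapshot idea: dump the whole dataset in one go.
        with open(path, "w", encoding="utf-8") as out:
            json.dump(self.data, out)

    @classmethod
    def from_snapshot(cls, path):
        store = cls()
        with open(path, encoding="utf-8") as src:
            store.data = json.load(src)
        return store

    @classmethod
    def from_aof(cls, path):
        # Rebuild the state by replaying every logged command in order.
        store = cls()
        with open(path, encoding="utf-8") as log:
            for line in log:
                command, key, value = json.loads(line)
                if command == "SET":
                    store.data[key] = value
        return store


def demo():
    """Write two values, then restore the data from each persistence file."""
    with tempfile.TemporaryDirectory() as tmp:
        aof = os.path.join(tmp, "toy.aof")
        rdb = os.path.join(tmp, "toy.snapshot")
        store = ToyStore(aof_path=aof)
        store.set("a", "1")
        store.set("a", "2")
        store.snapshot(rdb)
        return ToyStore.from_snapshot(rdb).data, ToyStore.from_aof(aof).data
```

Both restore paths end up with the same data. The difference is in what gets written and when: the snapshot only holds the final state, while the log holds every write, so it grows with the number of commands.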

Redis does not use tables, and its database does not predefine or enforce any relationships between the different pieces of data stored in it.

By storage mode, databases can be divided into hard disk databases and in-memory databases. Redis keeps its data in memory, so reads and writes are not limited by hard disk I/O speed, which is why it is extremely fast.

[Figure (1): working mode of a hard disk database]

[Figure (2): working mode of an in-memory database]

Interview questions from this section: What is Redis? What are the common Redis data structure types? How does Redis persist data?

Two. How fast is Redis?

Redis is a memory-based key-value database that uses a single-process, single-threaded model and is written in C. According to the official figures, it can reach 100,000+ QPS (queries per second).

That is no worse than Memcached, another memory-based key-value database, which uses a single process with multiple threads!

In the official benchmark chart, the horizontal axis is the number of connections and the vertical axis is QPS. The chart gives you the order of magnitude; I hope you can state it correctly in an interview, and not give an answer that is orders of magnitude off!

Three. Why is Redis so fast?

1. It is entirely memory based. Most requests are pure memory operations, which are very fast. Data is kept in memory in a structure similar to a HashMap, whose advantage is O(1) time complexity for lookups and updates.

2. The data structures are simple and so are the operations on them; Redis’s data structures are purpose-built.

3. It uses a single thread, which avoids unnecessary context switches and race conditions. There is no CPU cost from switching between processes or threads, no need to think about locks, no lock acquire and release operations, and no performance lost to possible deadlocks.

4. It uses an I/O multiplexing model with non-blocking I/O.

5. Its underlying model is different: the low-level implementation and the application protocol used to talk to clients are its own. Redis (in its early versions) built its own VM mechanism, because going through general-purpose system calls would waste time on moving data and making requests.

All of the above points are easy to understand. Let’s briefly discuss the I/O multiplexing model:

(1) The I/O multiplexing model

The I/O multiplexing model relies on select, poll, and epoll, which can monitor I/O events on many streams at the same time. While everything is idle, the current thread blocks; when one or more streams have an I/O event, the thread wakes up. The program then goes through the streams (with epoll, only the streams that actually raised an event) and handles the ready ones one after another, which avoids a great deal of wasted work.

Here “multi-channel” refers to multiple network connections, and “multiplexing” refers to reusing the same thread.

I/O multiplexing lets a single thread handle many connection requests efficiently (minimizing the time spent on network I/O), and Redis manipulates data in memory very quickly, so in-memory operations do not become the bottleneck for Redis performance. Together, the points above give Redis its high throughput.
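The idea of one thread watching many connections can be sketched in Python with the standard selectors module, which picks epoll, kqueue, or select depending on the platform. This is only an illustration of the technique, not Redis’s event loop: the function name serve_once is made up, the connections are simulated with socket.socketpair(), and the “command” is simply upper-casing the request. Each message is assumed to be non-empty and small enough to arrive in a single read.

```python
import selectors
import socket


def serve_once(messages):
    """Handle several connections in one thread with I/O multiplexing.

    `messages` is a list of non-empty byte strings, one per simulated client.
    Returns the replies in client order.
    """
    selector = selectors.DefaultSelector()  # epoll/kqueue/select, whichever is available
    clients = []
    for index, message in enumerate(messages):
        client_end, server_end = socket.socketpair()
        server_end.setblocking(False)
        # Watch the server side of every connection for readability.
        selector.register(server_end, selectors.EVENT_READ, data=index)
        client_end.sendall(message)
        clients.append(client_end)

    handled = 0
    while handled < len(messages):
        # Block until at least one connection has data; only ready ones are returned.
        for key, _events in selector.select():
            connection = key.fileobj
            request = connection.recv(1024)
            connection.sendall(request.upper())  # requests are processed one at a time
            selector.unregister(connection)
            connection.close()
            handled += 1

    replies = [client.recv(1024) for client in clients]
    for client in clients:
        client.close()
    selector.close()
    return replies


print(serve_once([b"ping", b"get k", b"set k v"]))
```

Note that there is no lock anywhere in the loop: only one request is being processed at any moment, which is exactly the trade-off described in the list above.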

Four. So why is Redis single-threaded?

First of all, we must understand that all of the analysis above was there to set the scene for why Redis is fast! According to the official FAQ, because Redis operates in memory, the CPU is not its bottleneck; the bottleneck is most likely the size of the machine’s memory or the network bandwidth. Since single-threading is easy to implement and the CPU is not the bottleneck, it makes sense to go with a single-threaded design (multi-threading brings a lot of trouble, after all!).

You might want to cry when you see this! We expected some major technical trick that makes Redis so fast on a single thread, and the official answer feels like a letdown! Still, we can now clearly explain why Redis is so fast, and because it is already fast in single-threaded mode, there is no need for multithreading!

The single-threaded approach cannot take advantage of multi-core CPUs, but we can make up for this by running multiple Redis instances on a single machine!

Note 1: The single thread we keep emphasizing only means that one thread handles our network requests. A real Redis server certainly runs more than one thread, and this needs to be made clear! For example, Redis forks a child process to do persistence in the background. Looking at the Redis process on a test server, we can list the threads under that process:

The “-T” option of the ps command shows threads (“Show threads, possibly with SPID column”). The “SPID” column is the thread ID, and the “CMD” column is the thread name.

Note 2: The last paragraph of the FAQ mentioned above states that, starting with Redis 4.0, Redis supports multi-threading, but only for certain operations! So whether the single-threaded description in this article still holds for later versions is left for readers to verify!

Five. Caveats

1. We know that Redis uses a “single thread plus I/O multiplexing” model to provide a high-performance in-memory data service. This design avoids locks, but it also means that time-consuming commands such as SUNION reduce Redis’s concurrency.

Because there is a single thread, only one operation runs at a time, so a time-consuming command lowers concurrency for reads and writes alike. A single thread can also use only one CPU core, so you can start several instances on the same multi-core server, arranged as master-master or master-slave, and run the time-consuming read commands entirely on the slave.
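A small Python calculation shows why one slow command hurts everyone behind it. This is a simplified model, not a measurement: the function name completion_times and the costs in milliseconds are invented for illustration, and all commands are assumed to arrive at the same moment.

```python
def completion_times(commands):
    """Return (name, finish_time_ms) for commands run one at a time, in arrival order.

    `commands` is a list of (name, cost_ms) pairs that all arrive at time 0.
    """
    clock = 0
    finished = []
    for name, cost_ms in commands:
        clock += cost_ms  # a single thread runs exactly one command at a time
        finished.append((name, clock))
    return finished


# Hypothetical costs: one slow set union queued in front of two cheap reads.
queue = [("SUNION big1 big2", 200), ("GET a", 1), ("GET b", 1)]
print(completion_times(queue))
# [('SUNION big1 big2', 200), ('GET a', 201), ('GET b', 202)]
```

The two GET commands need 1 ms of work each, yet they finish after 201 ms and 202 ms because they wait behind the slow command. Moving the slow read to a slave instance keeps it out of this queue.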

Items in redis.conf that need to be changed for each instance:

pidfile /var/run/redis/redis_6377.pid   # pidfile name includes the port number

port 6377   # this must be changed

logfile /var/log/redis/redis_6377.log   # logfile name includes the port number

dbfilename dump_6377.rdb   # RDB file name also includes the port number
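When you run several instances, the same four settings have to be repeated for every port. The following Python sketch generates them; the function name instance_config is made up, and the directory layout is simply the one used in the example above, so adjust it to your own server.

```python
def instance_config(port):
    """Return the redis.conf lines that must differ for an instance on `port`."""
    return [
        f"pidfile /var/run/redis/redis_{port}.pid",
        f"port {port}",
        f"logfile /var/log/redis/redis_{port}.log",
        f"dbfilename dump_{port}.rdb",
    ]


# Print the per-instance settings for two example ports.
for port in (6377, 6378):
    print("\n".join(instance_config(port)))
    print()
```

Putting the port number into every file name keeps the instances from overwriting each other’s pid, log, and dump files.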

2. “We can’t leave load balancing to the operating system, because we know our own programs better. We can assign CPU cores to them by hand, so that they don’t take up too much CPU and our key processes don’t get crowded together with a bunch of other processes.”

The CPU is an important factor: because of its single-threaded model, Redis favors fast CPUs with large caches over CPUs with many cores.

On a multi-core server, Redis performance also depends on the NUMA configuration and on which processor the process is bound to. The most visible effect is that redis-benchmark results look random, because the client and server processes land on whichever cores the scheduler picks. To get stable results, pin the processes with a placement tool (taskset is available on Linux). The most efficient arrangement is to put the client and the server on two different cores of the same CPU, so that they share the level 3 cache.
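As a rough sketch of that pinning step, the Python snippet below only builds the command lines; it does not start anything. The helper name pinned_command is made up, the port 6377 is reused from the example above, and the core numbers 0 and 1 are an assumption: whether those two cores sit on the same CPU and share an L3 cache depends on the machine, so check the topology first (for example with lscpu).

```python
def pinned_command(core, argv):
    """Prefix `argv` with taskset so the process is bound to one CPU core."""
    return ["taskset", "-c", str(core)] + list(argv)


# Assumed layout: cores 0 and 1 belong to the same physical CPU.
server = pinned_command(0, ["redis-server", "--port", "6377"])
client = pinned_command(1, ["redis-benchmark", "-p", "6377"])

print(" ".join(server))
print(" ".join(client))
```

The resulting lists can be passed to subprocess.run on a Linux machine where taskset and Redis are installed.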

Six. Extension

Here are a few other process models worth knowing for interviews!

1. Single-process multi-threaded model: MySQL, Memcached, Oracle (Windows version);

2. Multi-process model: Oracle (Linux version);

3. Nginx has two kinds of processes: the Master process (the management process) and the Worker process (the process that does the actual work). There are two startup modes:

(1) Single-process startup: the system has only one process, which plays the roles of both Master and Worker.

(2) Multi-process startup: the system has exactly one Master process and at least one Worker process.

The Master process mainly handles global initialization and Worker management; event processing happens in the Workers.