Preface

As for the Redis series, I have already given a detailed exposition of the basics of Redis across five chapters. I believe you can learn something from them whether or not you have touched Redis before. The basics, as the name implies, are just a foundation: they mainly cover the history of Redis and its basic data types, content that maps closely to everyday usage and is not too difficult. I hope you have digested it. Here are the portals to the basics:

[Basics] Overview of Redis – take you through the past and present of Redis

[Basics] Redis basics – basic data structures String and Hash

[Basics] Redis basics – basic data structures List and Set

[Basics] Redis basics – basic data structures ZSet, Bitmap…

[Basics] Redis basics – a summary of the basic data structures

In the part that follows, I will focus on the principles of Redis and talk about some relatively advanced features, such as Pub/Sub (publish/subscribe), persistence, high-performance features, transactions, memory reclamation, and so on. In it I will lay the groundwork for you and string the content of the chapters together, so I won't take up more of your time in this preface.

Knowing that Redis is single-threaded, uses I/O multiplexing, operates purely in memory, and implements its own simple data structures (and even its own VM) is not enough; you also need a deeper understanding of how these work to fully show your strength in front of an interviewer.

Main text

Why is Redis so fast?

How fast is Redis?

redis.io/topics/benc…

cd /usr/local/soft/redis-5.0.5/src
./redis-benchmark -t set,lpush -n 100000 -q

Result (local VM):

SET: 51813.47 requests per second
LPUSH: 51706.31 requests per second

./redis-benchmark -n 100000 -q script load "redis.call('set','foo','bar')"

Result (local VM):

script load "redis.call('set','foo','bar')": 46816.48 requests per second -- about 46,000 Lua script calls per second

(The chart here plots the number of connections on the horizontal axis against QPS on the vertical axis.)

According to official data, Redis can reach a QPS of around 100,000 (requests per second).

Why is Redis so fast?

In short: 1) pure in-memory structure; 2) single thread; 3) I/O multiplexing.

Memory

Redis is an in-memory KV database, and a key lookup has O(1) time complexity.
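To make this concrete, here is a minimal C sketch using the hiredis client library (my choice for illustration; any client works). It assumes a local Redis listening on 127.0.0.1:6379. On the server side, each SET/GET locates the key with a single hash-table lookup, which is where the O(1) comes from.

#include <stdio.h>
#include <hiredis/hiredis.h>

int main(void) {
    // Assumes a local Redis on the default port; build with: gcc kv.c -lhiredis
    redisContext *c = redisConnect("127.0.0.1", 6379);
    if (c == NULL || c->err) return 1;
    // SET and GET each resolve the key via one hash-table lookup: O(1).
    redisReply *r = redisCommand(c, "SET %s %s", "foo", "bar");
    freeReplyObject(r);
    r = redisCommand(c, "GET %s", "foo");
    printf("foo = %s\n", r->str);
    freeReplyObject(r);
    redisFree(c);
    return 0;
}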

Next question: to achieve such high concurrency, does it have to create a lot of threads?

On the contrary, Redis is single-threaded.

Single thread

What are the benefits of single threading?

  1. No thread creation and destruction costs;

  2. Avoids the CPU consumption caused by thread context switching;

  3. Avoids contention between threads, such as locks and deadlocks.

Why is Redis single threaded?

Isn’t that a waste of CPU resources?

redis.io/topics/faq#…

The CPU is not the bottleneck for Redis, because a single thread is sufficient. Redis's bottleneck is most likely machine memory or network bandwidth. Since single-threading is easy to implement and the CPU is not a bottleneck, it makes sense to adopt the single-threaded solution.

Why is a single thread so fast?

Since Redis is a memory-based operation, let’s start with memory.

Virtual Memory

Main memory: RAM; secondary storage: disk (hard drive).

A computer's main memory (RAM) can be thought of as an array of M contiguous byte-sized cells, where each byte has a unique address called a physical address (PA). In early computers, when the CPU needed data from memory, it accessed main memory directly through physical addressing.

This approach has several drawbacks:

  1. In a multi-user, multi-tasking operating system, all processes share main memory. If each process monopolized its own piece of physical address space, main memory would be used up quickly. We want different processes to be able to use the same physical address space at different times.

  2. If all processes access physical memory directly, one process can modify the memory data of other processes, corrupting the physical address space and making programs run abnormally.

To solve these problems, an intermediate layer was added between the CPU and main memory. The CPU no longer uses physical addresses; it accesses a virtual address, and the middle layer translates that address into a physical address before finally fetching the data. This middle layer is called Virtual Memory.

The specific operations are as follows:

When a process is created, it is allocated a virtual address space, and it obtains real data through the mapping from virtual addresses to physical addresses. In this way, the process never touches physical addresses directly and does not even know which physical addresses it is using.

At present, most operating systems use virtual memory: Windows has its virtual memory (pagefile.sys, a slice of disk space), Linux has its swap space, and so on.

On a 32-bit system, the virtual address space is 2^32 bytes = 4 GB. How big can it be on a 64-bit system? In theory 2^64 bytes = 16 EB, but no program needs that much space and managing it all would be expensive, so the full 64 bits are not actually used. Linux generally uses the lower 48 bits to represent the virtual address space, i.e. 2^48 bytes = 256 TB.

cat /proc/cpuinfo
address sizes : 40 bits physical, 48 bits virtual

The actual physical memory may be much smaller than the virtual address space.

Conclusion: the introduction of virtual memory provides a larger address space, and the address space is contiguous, which makes writing and linking programs easier. It also isolates physical memory, so the operations of different processes do not affect each other. And by mapping the same physical memory into different virtual address spaces, memory sharing can be achieved.
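As a small illustration of that last point, here is a minimal POSIX C sketch (my own example, using Linux-style flags): a MAP_SHARED anonymous mapping created before fork() is backed by the same physical pages in parent and child, even though each process addresses it through its own virtual address space.

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    // One set of physical pages, mapped into two virtual address spaces.
    char *buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) return 1;
    if (fork() == 0) {
        strcpy(buf, "hello from child"); // child writes through its mapping
        _exit(0);
    }
    wait(NULL);
    printf("parent reads: %s\n", buf);   // parent sees the same pages
    munmap(buf, 4096);
    return 0;
}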

User space and kernel space

To prevent user processes from manipulating the kernel directly and to keep the kernel safe, the operating system divides virtual memory into two parts: kernel space and user space.

The kernel is the core of the operating system. It is independent of ordinary applications and can access both the protected memory space and the underlying hardware devices.

Kernel space holds kernel code and data, while process user space holds user program code and data. Both kernel space and user space reside in virtual space and are mappings to physical addresses.

In 32-bit Linux, virtual memory is divided between the kernel and user processes in a 1:3 ratio (1 GB of kernel space, 3 GB of user space).

Processes are kernel-mode when running in kernel space and user-mode when running in user-space.

A process running in kernel mode can execute arbitrary instructions and use all of the system's resources; in user mode it can only perform restricted operations and cannot use system resources directly, but must issue requests to the kernel through the system interface (also known as system calls).

In the output of the top command:

us is the percentage of CPU time consumed in user space;

sy is the percentage of CPU time consumed in kernel space.

Process switching (context switching)

How does a multitasking operating system run more tasks than it has CPUs?

Of course, these tasks are not actually running at the same time: the system allocates the CPU to them in turn over very short time slices, creating the illusion of multitasking.

To control process execution, the kernel must have the ability to suspend a process running on the CPU and resume execution of a previously suspended process. This behavior is called process switching.

What is context?

Before each task executes, the CPU needs to know where the task was loaded from and where it should start running; that is, the system must set up the CPU registers and the Program Counter beforehand. This state is called the CPU context. When a task is switched out, its context is saved in the system kernel and reloaded when the task is scheduled again, so that the task's original state is intact and the task appears to run continuously.

There is a lot of work to be done when switching contexts, which is a very resource-intensive operation.

Process blocking

When a running process requests a system service (such as an I/O operation) but for some reason does not get an immediate response from the operating system, it can only put itself into the blocked state and wait for the corresponding event to occur before being woken up. A process in the blocked state does not consume CPU resources.

File descriptor (FD)

Linux systems treat all devices as files, and Linux uses file descriptors to identify each file object.

A file descriptor is an index that the kernel creates in order to manage opened files efficiently: it points to an opened file, and all system calls that perform I/O go through file descriptors. A file descriptor is simply a non-negative integer that identifies each file opened by a process.

There are three standard file descriptors in Linux.

0: standard input (keyboard);
1: standard output (display);
2: standard error output (display).
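A minimal C sketch of these three descriptors (my own example, nothing Redis-specific): every I/O system call names its target by FD, and each call below crosses from user mode into kernel mode.

#include <unistd.h>

int main(void) {
    char buf[64];
    write(1, "type something: ", 16);      // fd 1: standard output
    ssize_t n = read(0, buf, sizeof(buf)); // fd 0: standard input
    if (n > 0)
        write(2, buf, n);                  // fd 2: standard error
    return 0;
}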

Traditional I/O data copy

Take the read operation as an example:

When an application executes a read system call on a file descriptor (FD), if the data is already present in the process's page memory, it is read directly from memory. If not, the data is first loaded from disk into the kernel buffer and then copied from the kernel buffer into the process's user-space memory (two copies, and two context switches between user space and kernel space).
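A minimal sketch of that path in C (the file name is just a placeholder I chose): a plain blocking read(2) is all it takes to trigger the copies described above.

#include <fcntl.h>
#include <unistd.h>

int main(void) {
    int fd = open("/etc/hostname", O_RDONLY); // placeholder file
    if (fd < 0) return 1;
    char buf[256];
    // If the data is not cached, the kernel loads it from disk into the
    // kernel buffer first, then copies it into buf in user space.
    ssize_t n = read(fd, buf, sizeof(buf));
    if (n > 0) write(1, buf, n);
    close(fd);
    return 0;
}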

So where does the I/O block?

Blocking I/O

When reading or writing a file descriptor that is not ready, the calling process can do nothing else: copying the data from the device into the kernel buffer blocks, and copying the data from the kernel buffer into user space blocks as well, until the kernel returns the result and the user process is unblocked.

To solve the blocking problem, we have several ideas:

  1. Create multiple threads on the server, or use a thread pool. But under high concurrency the number of threads becomes too large for the system to handle, and creating and destroying threads wastes resources;

  2. Have the requester poll periodically and copy the data from the kernel buffer to user space once it is ready (non-blocking I/O, sketched below). This introduces some latency.
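Here is a minimal C sketch of idea 2 (sockfd is assumed to be an already-connected socket; the helper name is my own): fcntl() switches the FD to non-blocking mode, and read() then returns immediately with EAGAIN/EWOULDBLOCK instead of blocking, so the caller polls.

#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

// 'sockfd' is assumed to be an already-connected socket descriptor.
ssize_t read_nonblocking(int sockfd, char *buf, size_t len) {
    int flags = fcntl(sockfd, F_GETFL, 0);
    fcntl(sockfd, F_SETFL, flags | O_NONBLOCK);
    for (;;) {
        ssize_t n = read(sockfd, buf, len);
        if (n >= 0) return n;                   // data arrived (or EOF)
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                          // a real error
        usleep(1000); // not ready yet: back off briefly, then poll again
    }
}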

Can you handle multiple client requests with a single thread?

I/O Multiplexing

The I/O here refers to network I/O.

"Multi" refers to multiple TCP connections (sockets or channels);

"plexing" refers to reusing one or a few threads.

The basic principle is that instead of the application monitoring the connection itself, the kernel monitors the file descriptor for the application.

As the client runs, it generates socket events of different types. On the server side, the I/O multiplexing module puts the messages into a queue, and the file event dispatcher then forwards them to the different event handlers.

There are many implementations of multiplexing. Take select as an example: when a user process calls the multiplexer, the process blocks. The kernel monitors all the sockets the multiplexer is responsible for, and as soon as the data for any socket is ready, the multiplexer returns. The user process then calls read to copy the data from the kernel buffer into user space.

Therefore, the defining feature of I/O multiplexing is that a single process can wait on multiple file descriptors at the same time, and select() returns as soon as any one of these (socket) descriptors becomes readable.
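A minimal select() sketch in C (the fds array of connected client sockets and the function name are my own assumptions): a single thread blocks once on all descriptors, then services only the ones the kernel marked readable.

#include <sys/select.h>
#include <unistd.h>

// 'fds' is assumed to hold connected client sockets; 'nfds' is their count.
void serve_once(int *fds, int nfds) {
    fd_set readset;
    FD_ZERO(&readset);
    int maxfd = -1;
    for (int i = 0; i < nfds; i++) {
        FD_SET(fds[i], &readset);
        if (fds[i] > maxfd) maxfd = fds[i];
    }
    // One thread blocks here until at least one fd becomes readable.
    if (select(maxfd + 1, &readset, NULL, NULL, NULL) <= 0) return;
    for (int i = 0; i < nfds; i++) {
        if (FD_ISSET(fds[i], &readset)) {     // ready: read won't block now
            char buf[512];
            ssize_t n = read(fds[i], buf, sizeof(buf));
            if (n > 0) write(fds[i], buf, n); // echo back
        }
    }
}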

For multiplexing, Redis provides several implementations: select, epoll, evport, and kqueue. One of them is chosen at compile time.

From the source file ae.c:

#ifdef HAVE_EVPORT
#include "ae_evport.c"
#else
#ifdef HAVE_EPOLL
#include "ae_epoll.c"
#else
#ifdef HAVE_KQUEUE
#include "ae_kqueue.c"
#else
#include "ae_select.c"
#endif
#endif
#endif

evport is supported by the Solaris kernel;

epoll is supported by the Linux kernel;

kqueue is supported by macOS and BSD;

select is defined by POSIX and is supported by most operating systems as the fallback.

The per-implementation source files are ae_epoll.c, ae_select.c, ae_kqueue.c, and ae_evport.c.

By the way

Got a question? Leave a comment or message me privately. And do give this a thumbs up!

Of course, you can also go to my official account “6 Xi Xuan”,

Reply "learn" to receive a video tutorial on advancing from Java engineer to architect.

Reply "interview" to get:

MySQL mind map

There are also purchase discounts for [Alibaba Cloud] and [Tencent Cloud]; contact me for details ~

I'm a programmer by training; I've done PHP, Android, and even hardware, but in the end I chose to focus on Java. If you have any questions, ask them on the official account (venting feelings is fine too, ha ha ha). I'll reply as soon as I see your message, and I hope we can learn and improve together. Articles on server architecture, Java core knowledge, career paths, interview summaries, and so on will be pushed from time to time. Welcome to follow~~~