Why does zero copy improve Kafka performance?

Start with the operating system

A computer system consists of two parts: hardware and software. The hardware includes one or more processors (CPUs), memory, a keyboard, a display, disks, I/O interfaces, and peripherals such as printers and plotters. In short, the hardware is a system built from a variety of electronic and mechanical devices.

To use these devices easily and correctly, we need programs that manage them, and these programs make up the computer's software system. Software falls into two broad categories: system software and application software. The first layer of software loaded directly on top of the hardware manages all of the hardware and the software resources, and gives users an environment in which to develop and run applications; this layer is the operating system together with its utility software. Application software comprises the programs written to meet user requirements, and it runs with the support of the operating system.

The core unit made up of the CPU, memory, and I/O interfaces is usually called the host, and a host without an operating system is called a bare machine. The interface between the bare machine and the operating system consists of the CPU instruction set and the system BIOS provided by the manufacturer.

Because the operating system hides the hardware from the user, it exposes a set of commands and system call interfaces for the programs running on top of it. For example, to use a disk we go through system commands or system calls instead of writing a disk device driver by hand. Once the operating system is loaded, therefore, the user no longer deals with the hardware directly, but uses the computer through the commands and functions that the operating system provides.
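As a small illustration of the point above, here is a minimal sketch in Python in which a program stores and retrieves data on disk purely through thin wrappers around system calls. The function name `write_and_read_back` is made up for this example; the `os` functions it calls are the standard library's wrappers for `write`, `lseek`, `read`, `close`, and `unlink`.

```python
import os
import tempfile


def write_and_read_back(payload: bytes) -> bytes:
    """Store payload in a temporary file and read it back via system calls.

    The program never touches the disk hardware; each os.* call below asks
    the kernel to do the work on its behalf.
    """
    fd, path = tempfile.mkstemp()         # the kernel creates and opens the file
    try:
        os.write(fd, payload)             # write(): the kernel and its drivers handle the disk
        os.lseek(fd, 0, os.SEEK_SET)      # lseek(): move back to the start of the file
        return os.read(fd, len(payload))  # read(): the kernel copies the data back to us
    finally:
        os.close(fd)
        os.unlink(path)
```

Each of these calls crosses from user mode into kernel mode and back, which is why the number of system calls matters for performance.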

Because the operating system sits between the hardware and the rest of the software, it was regarded very early on as the core of the computer's system software, and is referred to as the core, or kernel.

Kernel mode and user mode

For the sake of system security and protection, computer architectures give the processor two execution modes: kernel mode and user mode. When the processor runs in kernel mode, it can execute privileged instructions in addition to general instructions; that is, it can execute I/O instructions and instructions that access the control registers and the program status word.

When the processor is in user mode, it can execute only general instructions; privileged instructions are not allowed. This protects the kernel code from both deliberate attacks and accidental damage by user programs. Consequently, the processor has to switch between kernel mode and user mode as programs run.

Zero copy

Kafka improves its performance with zero-copy technology, which moves data from disk files to the NIC (network interface card) without passing it through the application, and so reduces the number of context switches between kernel mode and user mode. Zero copy relies on DMA.

With Direct Memory Access (DMA), control is centered on memory: a direct path is established between main memory and the I/O devices, and data is exchanged between them under the control of a DMA controller. The CPU needs to intervene only at the beginning and end of a transfer. This makes DMA well suited to bulk data transfers between high-speed devices and main memory.

Let’s take a look at a scenario like this:

What happens between the moment a client requests some content in the browser and the moment it sees that content?

First, after the request is parsed, a system call switches execution from user mode to kernel mode. In kernel mode, the TCP/IP protocol code and the NIC driver in the operating system direct the NIC to send the request onto the network, and the client waits for the Web server to respond. When the server's response comes back, the network adapter receives it and the kernel passes it up to the client program.

On the server side, the kernel receives Web requests from the network through the network card and hands them to the Web server through system calls. The Web server runs the service routine that matches the request, and the kernel sends the result out over the network, back to the user.

As the figure above shows, the server goes through four steps from preparing the data to sending it:

1. When read() is called, the contents of the file are copied into the read buffer in kernel mode.
2. The CPU copies the data from the kernel-mode read buffer into a user-mode buffer.
3. When send() is called, the user-mode data is copied into the kernel-mode socket buffer.
4. The data in the kernel-mode socket buffer is copied to the NIC and sent.

As the procedure above shows, the data travels from kernel mode to user mode and back to kernel mode, so two of the copies are wasted: the first from kernel mode to user mode, the second from user mode back to kernel mode. The process also switches context between kernel mode and user mode four times.
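The traditional path can be sketched in Python as follows. This is only an illustration under stated assumptions: `send_file_with_copies` and `roundtrip_with_copies` are hypothetical names, and the demo helper uses a local socket pair in place of a real network connection, so no NIC is involved. The point is the shape of the loop, where every chunk is pulled into user space by a read and pushed back into the kernel by a send.

```python
import os
import socket
import tempfile


def send_file_with_copies(path: str, sock: socket.socket,
                          chunk_size: int = 64 * 1024) -> int:
    """Send a file with the classic read()/send() loop; return the bytes sent."""
    total = 0
    # buffering=0 makes each f.read() correspond to one read() system call.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)  # kernel read buffer -> user-space bytes
            if not chunk:               # an empty read means end of file
                break
            sock.sendall(chunk)         # user space -> kernel socket buffer
            total += len(chunk)
    return total


def roundtrip_with_copies(payload: bytes) -> bytes:
    """Hypothetical demo helper: push a small payload through a socket pair.

    Keep the payload small: everything is sent before anything is received,
    so it has to fit in the socket buffer.
    """
    fd, path = tempfile.mkstemp()
    sender, receiver = socket.socketpair()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
        sent = send_file_with_copies(path, sender)
        sender.close()                  # signal end of stream to the receiver
        received = bytearray()
        while len(received) < sent:
            data = receiver.recv(64 * 1024)
            if not data:
                break
            received.extend(data)
        return bytes(received)
    finally:
        sender.close()
        receiver.close()
        os.unlink(path)
```

Every pass through the loop costs one read and at least one send system call, which means two round trips between user mode and kernel mode and two CPU copies of the same bytes.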

With zero-copy technology, the application can ask the kernel to transfer the data from disk to the socket directly.

Zero-copy technology uses DMA to copy the file contents into a read buffer in kernel mode. No data is copied into the socket buffer, though; only descriptors that record the location and length of the data are appended to it. The DMA engine then passes the data straight from the kernel buffer to the NIC. The number of context switches drops to two, and getting the data from disk onto the network takes only two copies.
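For comparison with the read()/send() approach, here is a minimal zero-copy sketch in Python. It assumes a platform where Python's `socket.sendfile()` can delegate to the operating system's `sendfile` call (on platforms without it, Python quietly falls back to ordinary `send()`). The names `send_file_zero_copy` and `roundtrip_zero_copy` are hypothetical, and the demo again uses a local socket pair instead of a real network connection. On the JVM, where Kafka runs, the analogous call is `FileChannel.transferTo`.

```python
import os
import socket
import tempfile


def send_file_zero_copy(path: str, sock: socket.socket) -> int:
    """Ask the kernel to move the file to the socket; return the bytes sent.

    The file contents are never read into a user-space buffer here.
    """
    with open(path, "rb") as f:
        return sock.sendfile(f)


def roundtrip_zero_copy(payload: bytes) -> bytes:
    """Hypothetical demo helper: push a small payload through a socket pair.

    Keep the payload small: everything is sent before anything is received,
    so it has to fit in the socket buffer.
    """
    fd, path = tempfile.mkstemp()
    sender, receiver = socket.socketpair()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
        sent = send_file_zero_copy(path, sender)
        sender.close()                  # signal end of stream to the receiver
        received = bytearray()
        while len(received) < sent:
            data = receiver.recv(64 * 1024)
            if not data:
                break
            received.extend(data)
        return bytes(received)
    finally:
        sender.close()
        receiver.close()
        os.unlink(path)
```

The application code shrinks to a single call: it names the file and the socket, and the kernel handles the transfer without handing the bytes to the program.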
