
Preface

Zero-copy is a technique in which the CPU does not have to copy data from one memory area to another while the data is being processed. As an optimization for data transfer, it saves CPU cycles and memory bandwidth by eliminating these redundant in-memory copies.

1. What is the zero-copy mechanism?

Scenario: a file server handles a download. The server sends data from its hard disk to the client over the network.

As shown in the figure, the whole process can be divided into four steps.

1. The operating system copies the data from the hard disk into the kernel buffer via a DMA transfer.

2. The application calls read, and the CPU copies the data from the kernel buffer into a user-space buffer.

3. The application calls write, and the CPU copies the user-space data into the kernel socket buffer.

4. The operating system copies the data from the kernel socket buffer to the NIC via a DMA transfer, and the NIC sends it.
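The four steps above correspond to the classic read-then-write loop. The following is a minimal Python sketch of that loop, not a production server; the function name `send_with_read_write` and the `CHUNK_SIZE` value are assumptions made for illustration.

```python
import socket

CHUNK_SIZE = 64 * 1024  # hypothetical buffer size, chosen only for illustration


def send_with_read_write(path: str, sock: socket.socket) -> int:
    """Send a file with the traditional read/write loop; returns bytes sent."""
    total = 0
    # buffering=0 disables Python's own buffer so each read() maps to one system call.
    with open(path, "rb", buffering=0) as f:
        while True:
            # Steps 1-2: disk -> kernel buffer (DMA), kernel buffer -> user space (CPU).
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            # Steps 3-4: user space -> socket buffer (CPU), socket buffer -> NIC (DMA).
            sock.sendall(chunk)
            total += len(chunk)
    return total
```

Every chunk crosses the user/kernel boundary twice, which is the cost the later optimizations try to remove.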

2. Zero copy and the operating system

As we can see, the data is copied from disk to kernel space, from kernel space to user space, then from user space back into the kernel's socket buffer, and finally to the network card for transmission. That is a lot of copying, so why is the operating system designed this way?

From the operating system's point of view, the same data may be used and modified by several applications at once, and conflicts would arise if they all worked on the single copy in kernel space. The operating system is therefore designed so that each application that wants to use the data copies it into its own user space, where applications cannot interfere with one another. This is wasteful, however, when the data does not need to be modified: it could simply have stayed in the kernel buffer instead of being copied into user space and back.

mmap optimization

To avoid this waste, the mmap call was the first optimization used. mmap maps the file's kernel buffer into the process's address space, so user space shares the data with kernel space instead of receiving its own copy.

Now the data only has to be copied from the kernel buffer to the socket buffer, which saves one memory copy (3 instead of 4). The number of context switches stays the same, because the application still makes two system calls (mmap and write).
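As a rough illustration of the mmap approach, the sketch below maps a file read-only and hands the mapping straight to the socket. It uses Python's standard `mmap` module; the function name `send_with_mmap` is an assumption made for this example.

```python
import mmap
import socket


def send_with_mmap(path: str, sock: socket.socket) -> int:
    """Send a file through a read-only memory mapping; returns bytes sent."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file. Mapping an empty file raises ValueError,
        # so this sketch assumes the file is not empty.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # The mapping exposes the kernel's page cache directly, so no
            # separate user-space copy of the file contents is made here.
            sock.sendall(mm)
            return len(mm)
```

The kernel still has to copy the mapped pages into the socket buffer, so this saves one copy but not the system calls.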

sendfile optimization

The sendfile system call was introduced during the Linux 2.1 development series (first stable release: 2.2). The basic mechanism is as follows: data goes directly from the kernel buffer to the socket buffer without passing through user space at all. Because a single system call replaces the separate read and write calls, the number of context switches also drops from four to two.

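As shown in the figure above, when we make the sendfile system call, the DMA engine copies the data from the file into the kernel buffer, and the CPU then copies it from the kernel buffer to the socket buffer. No separate write call is needed, and this second copy causes no context switch because it happens entirely in kernel space. A final DMA transfer moves the data from the socket buffer to the NIC, so three copies remain.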
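The sendfile path can be sketched in Python with `os.sendfile`, which wraps the system call on Unix-like platforms. This is a minimal sketch assuming a Linux host and a blocking socket; the function name `send_with_sendfile` is hypothetical.

```python
import os
import socket


def send_with_sendfile(path: str, sock: socket.socket) -> int:
    """Send a whole file with os.sendfile; returns bytes sent."""
    sent_total = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        while sent_total < size:
            # The kernel moves the data from the file to the socket itself;
            # the file contents never enter a user-space buffer.
            sent = os.sendfile(sock.fileno(), f.fileno(), sent_total, size - sent_total)
            if sent == 0:  # unexpected end of file, e.g. the file was truncated
                break
            sent_total += sent
    return sent_total
```

The loop is there because a single call may send fewer bytes than requested. Python's `socket.sendfile` method offers a higher-level wrapper around the same call.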

sendfile optimized further

In Linux 2.4, sendfile was changed so that, when the NIC supports gather operations, the data is no longer copied from the kernel buffer to the socket buffer but is handed directly to the protocol engine, again reducing data copying. The details are as follows: instead of three copies, only two are required. The first copy goes from the file to the kernel buffer using the DMA engine, and the second goes from the kernel buffer to the network protocol engine, also using DMA. The kernel only writes descriptors holding the offset and length of the data into the socket buffer.

Zero copy definition: “Zero-copy” describes computer operations in which the CPU does not perform the task of copying data from one memory area to another.

This removes the two CPU copies: the one into user space and the one from there into the socket buffer. But why call it "zero copy" when the data is still copied twice? Because both remaining copies are DMA transfers. From the operating system's point of view, no data is copied from memory to memory, so the CPU is not involved, and for the operating system the copy count is zero.

mmap vs. sendfile

The differences between mmap and sendfile:

  1. mmap is suitable for reading and writing small amounts of data, while sendfile is suitable for transferring large files.
  2. mmap requires 4 context switches and 3 data copies; sendfile requires 2 context switches and at least 2 data copies.
  3. With sendfile, every remaining copy can be done by DMA, so the CPU copies nothing; with mmap, the CPU must still copy the data from the kernel buffer to the socket buffer.

In real systems, RocketMQ uses mmap when consuming messages, while Kafka uses sendfile.