As technology advances, computers keep getting faster, yet disk I/O speed remains painfully slow, trailing memory read speed by orders of magnitude. Meanwhile, as the Internet grows and the number of users keeps climbing, system performance faces ever greater challenges, and improving it is a goal countless engineers pursue relentlessly.
The speed gap between CPU, memory, and I/O is enormous. For high-concurrency, low-latency systems, disk I/O is often the first bottleneck. To reduce its impact, a cache is usually introduced to improve performance. But memory is limited and can hold only part of the data, and data must ultimately be persisted, so disk I/O remains unavoidable.
Whether through hardware upgrades from HDDs (mechanical hard drives) to SSDs (solid-state drives), or software improvements from BIO (blocking I/O) to NIO (non-blocking I/O), disk I/O efficiency has improved greatly, yet it still lags far behind memory read speed. Today I’m going to introduce a more efficient I/O technique called Mmap (memory-mapped files).
1. User mode and kernel mode
For security, the operating system divides virtual address space into two parts: user space and kernel space. They are isolated from each other, so even if a user program crashes, the system itself is not affected.
User mode and kernel mode involve many complex concepts that I won’t cover here. In simple terms, user space is where user program code runs, while kernel space is shared by all processes. As a result, reading or writing data usually involves an interaction between user space and kernel space.
In the traditional I/O model, reading or writing disk data generally takes two steps. For a write, for example: 1. the data is copied from user space into kernel space; 2. the kernel writes the data from kernel space to disk.
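The two steps above can be sketched in Java with an ordinary stream write. This is a minimal illustration, not the benchmark code from the article; the class name and file path are made up for the demo. The `write()` call copies the user-space buffer into the kernel’s page cache, and the kernel flushes it to disk afterwards.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class TraditionalWrite {
    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("bio-demo", ".dat");
        byte[] userBuffer = "hello, disk".getBytes();

        // Step 1: write() copies userBuffer from user space into the
        // kernel's page cache (a user -> kernel copy).
        try (OutputStream out = Files.newOutputStream(file)) {
            out.write(userBuffer);
        }
        // Step 2: the kernel flushes the dirty page to disk, asynchronously
        // unless the program explicitly forces a sync.

        // Reading it back incurs the reverse copies: disk -> page cache -> user buffer.
        byte[] readBack = Files.readAllBytes(file);
        System.out.println(new String(readBack));
        Files.delete(file);
    }
}
```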
2. What is Mmap
Mmap is a method of memory-mapping files. A file (or other object) is mapped into the address space of a process, establishing a correspondence between the file’s disk address and a segment of virtual addresses in the process’s virtual address space.
After Mmap is performed on a file, address space is allocated in the process’s virtual memory and a mapping to the disk file is established. Once the mapping exists, the process can read and write the mapped virtual memory through ordinary pointers, and the system automatically writes changes back to disk. Conversely, changes the kernel makes to this region are directly reflected in user space, which also enables data sharing between processes. Compared with traditional I/O, this eliminates a copy between user space and kernel space.
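In Java, Mmap is exposed through `FileChannel.map()`, which returns a `MappedByteBuffer`. The sketch below (class name and temp file are illustrative) writes through the mapping without any explicit `write()` call, then verifies the bytes reached the file:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapDemo {
    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mmap-demo", ".dat");
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Map the first 1 KB of the file into this process's address space.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 1024);

            // Writing to the buffer writes the mapped pages directly;
            // no write() system call, no user -> kernel buffer copy.
            buf.put("hello, mmap".getBytes(StandardCharsets.UTF_8));

            // force() asks the OS to flush the dirty pages back to disk.
            buf.force();
        }
        // The bytes are now visible through an ordinary file read.
        byte[] onDisk = Files.readAllBytes(file);
        System.out.println(new String(onDisk, 0, 11, StandardCharsets.UTF_8));
        Files.delete(file);
    }
}
```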
3. Performance test
Given how it works, we can boldly predict that Mmap should outperform traditional IO. To keep the comparison as fair as possible, we benchmarked traditional IO and Mmap reads and writes using the JMH tool. The test code can be obtained from GitHub.
It should be noted that the author’s results are not rigorous, and the real gap is even larger than the numbers below suggest, because each benchmark method’s runtime includes file creation, content initialization, and deletion. The tests were run on the author’s machine: “System: macOS; Processor: 2.6 GHz six-core i7; Memory: 16 GB; Disk type: SSD”
Random read performance test:
Random write performance test:
It is not difficult to see from the read and write reports that both results confirm our conjecture and its theoretical basis: Mmap performs far better than traditional IO, and within traditional Java IO, NIO beats BIO.
4. Application of Mmap in RocketMQ
RocketMQ is a distributed messaging and streaming platform with low latency, high performance and reliability, trillion-scale capacity, and flexible scalability. So the question is: how does it ensure high performance and reliability while processing massive numbers of messages?
- The rough execution flow of RocketMQ
The message production, storage, and consumption processes in RocketMQ can be broadly divided into the following:
- The producer sends messages to the Broker.
- The Broker stores the message in the CommitLog and writes the message’s commitLogOffset, msgSize, and tagCode — that is, its location, size, and tag information in the CommitLog — to the corresponding ConsumeQueue.
- The consumer reads the message metadata from the corresponding ConsumeQueue, fetches the message body from the CommitLog based on its location, and then consumes it.
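The metadata the Broker writes per message is a fixed-size ConsumeQueue entry: 8 bytes of commitLogOffset, 4 bytes of msgSize, and 8 bytes of tagsCode, 20 bytes in total. The sketch below (class and method names are made up for illustration) encodes one such entry into a `ByteBuffer`:

```java
import java.nio.ByteBuffer;

public class ConsumeQueueEntry {
    // Each ConsumeQueue entry is a fixed 20 bytes:
    // 8-byte commitLogOffset + 4-byte msgSize + 8-byte tagsCode.
    static final int ENTRY_SIZE = 20;

    static ByteBuffer encode(long commitLogOffset, int msgSize, long tagsCode) {
        ByteBuffer buf = ByteBuffer.allocate(ENTRY_SIZE);
        buf.putLong(commitLogOffset); // where the message sits in the CommitLog
        buf.putInt(msgSize);          // how many bytes to read from there
        buf.putLong(tagsCode);        // tag hashcode, used for server-side filtering
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer entry = encode(1073741824L, 256, "TagA".hashCode());
        System.out.println(entry.remaining()); // fixed entry size: 20
        System.out.println(entry.getLong());   // 1073741824
        System.out.println(entry.getInt());    // 256
    }
}
```

Because every entry is the same size, a consumer can locate the N-th entry at offset `N * 20` without scanning the file.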
- Mmap in RocketMQ
The CommitLog is the main storage body for messages and metadata. It stores the message bodies written by producers, and message content is variable-length. Each CommitLog file has a fixed size, 1 GB by default, and is named after its start offset. For example, 00000000000000000000 is the first file, with start offset 0 and file size 1 GB = 1073741824 bytes. When it is full, the second file is 00000000001073741824, with start offset 1073741824, and so on. Messages are written sequentially to the current log file; when it fills up, writing moves to the next file.
Messages are stored in CommitLog files, and each consumer reads a message based on its offset and size within the file, so reading messages involves random access, which can severely hurt performance. RocketMQ uses Mmap to read and write CommitLog files, turning file operations into memory accesses and greatly improving read and write efficiency.
Because of the memory-mapping mechanism, RocketMQ uses a fixed-length structure for file storage, which allows the entire file to be mapped to memory at once.
5. Q&A
- Why is Mmap so fast?
With Mmap, user code reads and writes the mapped pages directly, eliminating one data copy between user space and kernel space and thus improving file I/O efficiency.
- Does Mmap take up a lot of memory compared to disk space?
Note that Mmap does not immediately allocate memory equal to the size of the disk file. It only maps a range of the process’s address space to the file; physical memory is allocated lazily, when the process actually reads or writes the file.
During I/O, if the accessed page is not present in the process’s memory (a page fault), the system first checks the page/swap cache; if the page is not found there, it is loaded from disk (demand paging).
- What are the application scenarios of Mmap?
Interprocess communication: Mmap makes it possible to share memory between processes. Each process maps part of its user space to the same region of the same file; by modifying and observing the mapped region, processes can communicate and share data.
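The sharing mechanism can be sketched within a single JVM: two independent mappings of the same file region are backed by the same page-cache pages (on Linux, a shared mapping), so a write through one is visible through the other without any copy. Class name and temp file are made up for the demo; real IPC would do the same from two processes.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SharedMapping {
    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("ipc-demo", ".dat");
        try (FileChannel a = FileChannel.open(file,
                     StandardOpenOption.READ, StandardOpenOption.WRITE);
             FileChannel b = FileChannel.open(file,
                     StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Two independent mappings of the same 8-byte file region.
            MappedByteBuffer writer = a.map(FileChannel.MapMode.READ_WRITE, 0, 8);
            MappedByteBuffer reader = b.map(FileChannel.MapMode.READ_ONLY, 0, 8);

            // A write through one mapping is observed through the other,
            // because both are backed by the same physical pages.
            writer.putLong(0, 42L);
            System.out.println(reader.getLong(0)); // 42
        }
        Files.delete(file);
    }
}
```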
Efficient access to big data: when a large amount of data must be managed or transferred, memory is often insufficient, and Mmap provides efficient disk I/O to compensate for the shortage. Mainstream middleware such as RocketMQ and MongoDB use Mmap for this reason. In short, consider Mmap whenever you need disk space to stand in for scarce memory.
- What are the disadvantages of Mmap?
A memory-mapped file occupies a large contiguous region of the process’s logical address space. On Intel’s IA-32, with its 4 GB logical address space, the usable contiguous address space is much smaller, only 2–3 GiB.
Once a file is memory-mapped, program execution can be disrupted by errors in the backing file while the program runs. An I/O error on that file (for example, a removable drive or CD/DVD-ROM being ejected, or the disk filling up during a write) is reported to the application as a fault on an ordinary memory access, which normal memory-handling code does not anticipate.
Memory-mapped files require hardware support in the form of a memory management unit (MMU).