What happens under the hood in traditional IO? Why is it so expensive at the OS level?

  1. The JVM issues a read() system call.
  2. The OS switches context to kernel mode and reads the data into the kernel buffer.
  3. The OS kernel copies the data into the user buffer, switches context back to user mode, and the read() call returns.
  4. The JVM process continues executing its code and then issues a write() system call.
  5. The OS switches context to kernel mode and copies the user buffer's data into the socket buffer.
  6. The OS switches context back to user mode, write() returns, and the JVM continues executing its code.

As you can see, a simple traditional IO operation involves four context switches and two data copies, which is quite inefficient. In middleware systems with high concurrency requirements, such inefficiency is intolerable.
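As a concrete illustration, here is a minimal sketch of the traditional copy path described above; the file name, host, and port are placeholders. Every read()/write() pair in the loop triggers the context switches and buffer copies just listed.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;

public class TraditionalCopy {
    public static void main(String[] args) throws IOException {
        // Hypothetical file and destination; adjust to your environment.
        try (FileInputStream in = new FileInputStream("data.bin");
             Socket socket = new Socket("localhost", 9000);
             OutputStream out = socket.getOutputStream()) {
            byte[] userBuffer = new byte[8192];
            int n;
            // Each read() copies kernel buffer -> user buffer (plus two context switches);
            // each write() copies user buffer -> socket buffer (plus two more).
            while ((n = in.read(userBuffer)) != -1) {
                out.write(userBuffer, 0, n);
            }
        }
    }
}
```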

How does the JDK optimize traditional IO operations to achieve zero-copy?

Zero-copy means that the data is not copied from kernel space to user space, not that no copying happens at all.

Copying still occurs in kernel mode, because DMA (Direct Memory Access) requires access to a contiguous memory space.
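In the JDK, this path is exposed through FileChannel.transferTo(), which can delegate to sendfile() on operating systems that support it. A minimal sketch, with the file name, host, and port as placeholders:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopySend {
    public static void main(String[] args) throws IOException {
        try (FileChannel file = FileChannel.open(Path.of("data.bin"), StandardOpenOption.READ);
             SocketChannel socket = SocketChannel.open(new InetSocketAddress("localhost", 9000))) {
            long position = 0;
            long remaining = file.size();
            // transferTo() may move fewer bytes than requested, so loop until done.
            // On Linux this typically becomes a sendfile() call: the data moves from
            // the kernel buffer to the socket buffer without ever entering user space.
            while (remaining > 0) {
                long sent = file.transferTo(position, remaining, socket);
                position += sent;
                remaining -= sent;
            }
        }
    }
}
```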

3, MMAP

With zero-copy, our code cannot touch the data at all; it can only stream it from one place to another, since the data never enters user space. But there is a friendlier, though more expensive, alternative to zero-copy: memory-mapping with mmap.

mmap allows a file to be mapped directly into memory (the file's location is mapped, not the file contents copied) and accessed directly from user mode, avoiding the unnecessary data copy, although it still does not avoid context switching.

In addition, because the OS maps the file directly into memory, mmap gets all the benefits of the OS's virtual memory management (depending on the underlying operating system):

  • Intelligent caching of hot data and prefetching of adjacent data pages.
  • Data is stored in a contiguous memory space and does not need to be copied back and forth between buffers.

In Java NIO, the MappedByteBuffer class, a variant of DirectByteBuffer, implements mmap.
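A minimal sketch of the mapping workflow (the file name is a placeholder); the buffer reads the file's contents directly through the mapped pages:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapRead {
    public static void main(String[] args) throws IOException {
        try (FileChannel channel = FileChannel.open(Path.of("data.bin"), StandardOpenOption.READ)) {
            // Map the whole file; pages are loaded lazily by the OS on first access.
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            long checksum = 0;
            // Reads go straight to the mapped pages; no read() system call per access.
            while (buffer.hasRemaining()) {
                checksum += buffer.get() & 0xFF;
            }
            System.out.println("checksum = " + checksum);
        }
    }
}
```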

4, ByteBuffer

Java NIO introduced ByteBuffer as a buffer for channels. There are three main implementations of ByteBuffer:

  • HeapByteBuffer
  • DirectByteBuffer
  • MappedByteBuffer

HeapByteBuffer

A HeapByteBuffer is created with ByteBuffer.allocate(). It lives in the heap space, so it is managed by the GC (it can be garbage collected) and benefits from cache optimization. However, it does not live at a fixed native address (the GC may move it), which means that if you access it from native code through JNI, the JVM will first copy the data into a separate buffer.
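A minimal sketch of the heap-backed variant (the sizes and strings are arbitrary):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class HeapBufferDemo {
    public static void main(String[] args) {
        // Allocated on the Java heap: backed by a byte[] and garbage collected.
        ByteBuffer buffer = ByteBuffer.allocate(64);
        buffer.put("hello".getBytes(StandardCharsets.UTF_8));
        buffer.flip(); // switch from writing to reading
        // hasArray() is true for heap buffers: the backing byte[] is accessible.
        System.out.println("backed by array: " + buffer.hasArray());
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```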

DirectByteBuffer

A DirectByteBuffer is created with ByteBuffer.allocateDirect(); the JVM allocates the memory outside the heap, natively, much like malloc(). The upside is that the allocated memory is contiguous; the downside is that it is not managed by the JVM, which means you need to watch out for memory leaks.
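A minimal sketch; note that hasArray() is false because there is no backing byte[] on the heap:

```java
import java.nio.ByteBuffer;

public class DirectBufferDemo {
    public static void main(String[] args) {
        // Allocated outside the Java heap in native memory;
        // the memory is contiguous and is not moved by the GC.
        ByteBuffer buffer = ByteBuffer.allocateDirect(64);
        System.out.println("direct: " + buffer.isDirect());          // true
        System.out.println("backed by array: " + buffer.hasArray()); // false
        buffer.putInt(42);
        buffer.flip(); // switch from writing to reading
        System.out.println(buffer.getInt()); // 42
    }
}
```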

MappedByteBuffer

A MappedByteBuffer is created with FileChannel.map(), which allocates memory space outside the heap. It is essentially a wrapper around the mmap() system call that lets our Java code manipulate the mapped file data directly.
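A minimal sketch of writing through a mapping (the file name is a placeholder); force() flushes the modified pages back to the underlying file:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapWrite {
    public static void main(String[] args) throws IOException {
        try (FileChannel channel = FileChannel.open(Path.of("mapped.bin"),
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, 128);
            // Writes go directly into the mapped pages; no write() call per put().
            buffer.put("written via mmap".getBytes(StandardCharsets.UTF_8));
            buffer.force(); // flush dirty pages to the underlying file
        }
    }
}
```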

5, Summary

sendfile() and mmap() do provide high efficiency and low latency. However, there is no silver bullet in software engineering: these techniques bring problems along with the efficiency gains. In complex real-world situations, you need to consider carefully whether introducing them is worth the tradeoff. Without the JVM's management and protection, code complexity increases, and it becomes easier for the software to crash (and I mean crash, not throw an exception).

Reference

It’s all about buffers: zero-copy, mmap and Java NIO