Traditional IO analysis

To transfer a file from a local disk to another host over the network, traditional IO code reads the file, stores its contents in a byte array, and sends that byte array over a socket.

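import java.io.File;
import java.io.RandomAccessFile;
import java.net.Socket;

File f = new File("helloword/data.txt");
RandomAccessFile file = new RandomAccessFile(f, "r");

// Read the whole file into a byte array on the Java heap
byte[] buf = new byte[(int) f.length()];
file.read(buf);

// Send the byte array over the socket
Socket socket = ... ;
socket.getOutputStream().write(buf);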

The actual workflow of this code can be seen in the following figure:

  • Step 1: Java itself cannot read from or write to devices, so calling read switches the program from user mode to kernel mode and invokes the operating system's (kernel's) read capability. The data is first read into the kernel buffer, because disk data cannot be read directly into the user buffer.

  • Step 2: Execution switches from kernel mode back to user mode, and the data is copied from the kernel buffer into the user buffer (byte[] buf).

  • Step 3: The write method is called, which copies the data from the user buffer (byte[] buf) into the socket buffer.

  • Step 4: The data must then be written to the network adapter. Java cannot do this itself, so execution switches from user mode to kernel mode again, and the operating system's write capability copies the socket buffer data to the network adapter.

Java IO does not read or write at the physical device level; it only copies data between buffers. The actual low-level reads and writes are performed by the operating system.

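  • Execution switched between user mode and kernel mode three times (into the kernel for the read, back to user mode, then into the kernel again for the write)
  • The data was copied four times (disk to kernel buffer, kernel buffer to user buffer, user buffer to socket buffer, socket buffer to network adapter)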
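As a rough, runnable sketch of the traditional path, the version below streams the file through a fixed-size heap array rather than one array as large as the file. The class name TraditionalSend, the 8192-byte buffer size, and the command-line arguments are assumptions made for illustration; the receiving host and port are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.net.Socket;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TraditionalSend {

    // Streams the file through a small heap array instead of one file-sized array.
    static long send(Path path, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];              // user buffer on the Java heap (size is an assumption)
        long total = 0;
        try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "r")) {
            int n;
            while ((n = file.read(buf)) != -1) {  // kernel buffer -> user buffer
                out.write(buf, 0, n);             // user buffer -> socket buffer
                total += n;
            }
        }
        out.flush();
        return total;
    }

    // Usage (hypothetical receiver): java TraditionalSend <file> <host> <port>
    public static void main(String[] args) throws IOException {
        try (Socket socket = new Socket(args[1], Integer.parseInt(args[2]))) {
            long sent = send(Paths.get(args[0]), socket.getOutputStream());
            System.out.println("sent " + sent + " bytes");
        }
    }
}
```

Each loop iteration repeats the read and write steps, so the mode switches and copies counted above are paid once per chunk.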

NIO optimization

NIO buffers can improve on this. Note that the buffer must be a DirectByteBuffer: ByteBuffer.allocate() returns a HeapByteBuffer, which lives in Java heap memory, whereas ByteBuffer.allocateDirect() returns a DirectByteBuffer, which uses operating system memory directly. This memory has one special property: both the operating system and Java can access it.
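The difference between the two allocation calls can be checked with the public isDirect() method. This is only a small sketch; the class name BufferKinds and the describe helper are made up for illustration.

```java
import java.nio.ByteBuffer;

public class BufferKinds {

    // Summarizes where a buffer lives, using only public ByteBuffer methods.
    static String describe(ByteBuffer buf) {
        return "direct=" + buf.isDirect() + ", capacity=" + buf.capacity();
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(16);         // allocated on the Java heap
        ByteBuffer direct = ByteBuffer.allocateDirect(16); // allocated outside the Java heap
        System.out.println(describe(heap));   // direct=false, capacity=16
        System.out.println(describe(direct)); // direct=true, capacity=16
    }
}
```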

With this improvement, the workflow looks like the following:

Most of the steps are the same as in the previous version. The difference is that a DirectByteBuffer maps off-heap memory so that the JVM can access it directly, which lets the kernel buffer and the user buffer be treated as the same piece of memory and removes one copy of the data.

  • The number of switches between user mode and kernel mode did not change: it is still three
  • The number of data copies dropped by one, to three
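A minimal sketch of the direct-buffer variant might look like the following. It assumes blocking channels; the class name DirectBufferSend, the 8192-byte buffer size, and the command-line arguments are illustrative choices, and the receiving host and port are hypothetical.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class DirectBufferSend {

    // Copies the file to the target channel through one reusable direct buffer.
    static long send(Path path, WritableByteChannel target) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(8192); // off-heap memory (size is an assumption)
        long total = 0;
        try (FileChannel file = FileChannel.open(path, StandardOpenOption.READ)) {
            while (file.read(buf) != -1) {
                buf.flip();                      // switch from filling to draining
                while (buf.hasRemaining()) {
                    total += target.write(buf);  // a single write may be partial
                }
                buf.clear();                     // ready for the next read
            }
        }
        return total;
    }

    // Usage (hypothetical receiver): java DirectBufferSend <file> <host> <port>
    public static void main(String[] args) throws IOException {
        try (SocketChannel socket = SocketChannel.open(
                new InetSocketAddress(args[1], Integer.parseInt(args[2])))) {
            System.out.println("sent " + send(Paths.get(args[0]), socket) + " bytes");
        }
    }
}
```

Allocating a direct buffer is relatively expensive, so the sketch allocates it once and reuses it for every chunk.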

Zero copy technology

This process can be optimized further with zero-copy techniques. Note that "zero copy" means the data is never copied into JVM memory, not that it is never copied at all.

Zero copy technology 1

The first zero-copy technique uses the sendfile system call provided after Linux 2.1. In Java, the corresponding calls are the transferTo/transferFrom methods, which copy data between two channels. Note that these two methods exist on FileChannel, not on SocketChannel.

  • Instead of passing through a direct buffer, the data goes straight from the kernel buffer to the socket buffer without touching Java, which saves two user/kernel switches
  • Java calls the transferTo method, the program switches from user mode to kernel mode, and the data is read into the kernel buffer
  • The data is then transferred from the kernel buffer to the socket buffer
  • Finally, the socket buffer data is written to the NIC

Analysis of this method:

  • Only one switch between user mode and kernel mode occurred (into the kernel for transferTo)
  • The data was copied three times
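The transferTo path can be sketched as below, assuming a blocking target channel. The class name ZeroCopySend and the command-line arguments are illustrative, and the receiving host and port are hypothetical. Whether the JVM actually uses sendfile underneath depends on the platform and on the type of the target channel, so treat this as a sketch of the API usage rather than a guarantee about the kernel path.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ZeroCopySend {

    // Hands the file to the target channel without reading it into a Java buffer.
    static long send(Path path, WritableByteChannel target) throws IOException {
        try (FileChannel file = FileChannel.open(path, StandardOpenOption.READ)) {
            long size = file.size();
            long position = 0;
            while (position < size) {
                // transferTo may move fewer bytes than requested, so loop on the remainder
                position += file.transferTo(position, size - position, target);
            }
            return position;
        }
    }

    // Usage (hypothetical receiver): java ZeroCopySend <file> <host> <port>
    public static void main(String[] args) throws IOException {
        try (SocketChannel socket = SocketChannel.open(
                new InetSocketAddress(args[1], Integer.parseInt(args[2])))) {
            System.out.println("sent " + send(Paths.get(args[0]), socket) + " bytes");
        }
    }
}
```

Note that no byte[] or ByteBuffer appears in the send method: the file contents never enter JVM memory, which is the sense of "zero copy" used here.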

Zero copy technology 2

Linux 2.4 optimizes the above approach further:

  • Only offset and length information is copied into the socket buffer, at almost no cost, and the data itself is sent directly from the kernel buffer to the network adapter
  • Java calls the transferTo method, the program switches from user mode to kernel mode, and the data is read into the kernel buffer
  • The kernel buffer data is written directly to the NIC

Characteristics of zero copy technology

  • Less switching between user mode and kernel mode
  • Copying does not use CPU computation, which reduces CPU cache false sharing
  • Note that zero copy is suitable for small file transfers