IO

What is IO?

In the Unix (and Unix-like) world, everything is a file, and a file is just a stream of bytes: sockets, FIFOs, pipes, and terminals are all files, all streams. Exchanging information means sending data to and receiving data from these streams, in short, input and output operations (I/O). Reading from a stream uses the read system call; writing to it uses the write system call.
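
As a minimal sketch (assuming a POSIX system; Python's `os.read`/`os.write` are thin wrappers over the read and write system calls), a pipe is enough to show that a stream is read and written the same way regardless of what backs it:

```python
import os

# Create a pipe: two file descriptors, one readable, one writable.
# To the kernel, both ends are just files (streams of bytes).
r, w = os.pipe()

# write(2): push bytes into the stream.
written = os.write(w, b"hello, stream")
os.close(w)

# read(2): pull bytes back out of the stream.
data = os.read(r, 1024)
os.close(r)

print(written, data)
```

The same two calls work unchanged on a socket, a FIFO, or a regular file descriptor.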

What are the I/O sources?

Disk I/O

A disk I/O instruction usually tells the disk the starting sector, the number of consecutive sectors to read or write from that sector, and whether the operation is a read or a write. When the disk receives the instruction, it reads or writes data accordingly. One such instruction plus its data is one I/O, either a read or a write.

Concurrent disk I/O

A disk can execute only one instruction at a time, so disk I/O is effectively serialized: a single disk offers no real concurrency.

Memory I/O

Reading and writing memory is very fast, usually not a performance bottleneck, and is generally not considered here.

Device IO

Device I/O reads or writes data on an external device. It must consider whether the device is a mutually exclusive resource: only one thread at a time can perform I/O on such a resource.

Network IO

Network IO is a type of device IO but is usually discussed separately. It consists of reads and writes on the network interface card (NIC): sending and receiving requests and data. Sockets are the main interface for producing and consuming this data.

Data copies

Different I/O sources follow the same data-copy path.

The DMA controller

DMA (Direct Memory Access): a feature of all modern computers that lets hardware devices of different speeds communicate without imposing a heavy interrupt load on the CPU.

Traditional IO operations

A read operation

Two copies, two context switches

  • (1) The user process initiates a read via the read() system call; the context switches from user mode to kernel mode
  • (2) The DMA controller copies the data from disk or memory into the kernel's read buffer (one DMA copy)
  • (3) The CPU copies the data from the kernel buffer into the user process's buffer (one CPU copy)
  • (4) read() returns; the context switches from kernel mode back to user mode

The write operation

Two copies, two context switches

  • (1) The user process calls write() to make a system call; the context switches from user mode to kernel mode
  • (2) The CPU copies the data from the user buffer into the kernel (socket) buffer (one CPU copy)
  • (3) The DMA controller copies the data from the socket buffer out to memory or disk (one DMA copy)
  • (4) write() returns; the context switches from kernel mode back to user mode
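
The read and write steps above are exactly what a traditional copy loop incurs for every chunk: a CPU copy kernel→user on read, then a CPU copy user→kernel on write, plus the DMA copies at either end. A hedged sketch (file names and sizes are arbitrary):

```python
import os, tempfile

# Prepare a source file to copy.
src = tempfile.NamedTemporaryFile(delete=False)
src.write(b"x" * 8192)
src.close()

dst_path = src.name + ".copy"

in_fd = os.open(src.name, os.O_RDONLY)
out_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT, 0o600)

total = 0
while True:
    chunk = os.read(in_fd, 4096)      # kernel buffer -> user buffer
    if not chunk:
        break
    total += os.write(out_fd, chunk)  # user buffer -> kernel buffer

os.close(in_fd)
os.close(out_fd)
print(total)
```

Every byte passes through this process's address space twice, which is the overhead zero-copy techniques are designed to remove.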

Zero copy

In this mode of data transfer, the application can access the hardware buffers directly, and the operating system kernel only assists in the transfer; user processes effectively read and write the disk or memory directly, without intermediate copies. Of course, some of the benefits of the traditional approach must be given up.

The main zero-copy-style system calls on Linux are mmap(), sendfile(), and splice().

mmap() (one copy on read, write unchanged)

One copy occurs when the DMA controller reads data from disk or memory into the shared buffer; the user process can then use the data in that buffer directly.

The difference is that the read() call is replaced by mmap(): user space and the kernel share the same kernel buffer, and the data that is read stays there. The write side is the same as before.
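
A minimal sketch of reading through mmap() rather than read(), using Python's mmap wrapper (assuming a Unix system; the file name is arbitrary):

```python
import mmap, os, tempfile

# Write a file, then map it instead of read()ing it.
f = tempfile.NamedTemporaryFile(delete=False)
f.write(b"mapped contents")
f.flush()

fd = os.open(f.name, os.O_RDONLY)

# mmap(2): the file's pages are shared between the kernel page cache
# and this process, so no CPU copy into a private user buffer is
# needed just to look at the bytes.
m = mmap.mmap(fd, 0, prot=mmap.PROT_READ)
view = bytes(m[:6])   # this slice copies, but only because we ask it to

print(view)
m.close()
os.close(fd)
f.close()
```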

sendfile() (one copy on read, two copies on write)

This method suits cases where the data read can be written directly to another I/O source: the copying is completed entirely in kernel space, the user process does not participate, and context switches are reduced.

One read copy occurs when the DMA controller reads data from disk or memory into the kernel buffer.

Two copies on write:

  • (1) The CPU copies the kernel buffer into the socket buffer
  • (2) The DMA controller copies the contents of the socket buffer out to the NIC
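
A sketch of sendfile() via os.sendfile (assuming Linux, where the destination may be a regular file as well as a socket; file names and sizes are arbitrary):

```python
import os, tempfile

src = tempfile.NamedTemporaryFile(delete=False)
src.write(b"payload " * 512)   # 4096 bytes
src.flush()

dst_path = src.name + ".out"
in_fd = os.open(src.name, os.O_RDONLY)
out_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT, 0o600)

# sendfile(2): the kernel moves the data from in_fd to out_fd itself;
# the bytes never enter this process's address space.
sent = os.sendfile(out_fd, in_fd, 0, 4096)

os.close(in_fd)
os.close(out_fd)
print(sent)
```

Compare this with the traditional read/write loop: there is no user-space buffer at all, hence no CPU copies through the process and fewer context switches.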

splice() (one copy on read, one copy on write)

This method also suits data that can be written directly to another I/O source. A channel (in practice, a pipe) is established between the kernel buffer and the socket buffer, so data moves between them without an extra CPU copy.

Read/write ready state

(1) Read ready state

If the number of bytes in the kernel buffer is greater than or equal to the number of bytes requested by the user process, the system can move the data from the kernel buffer to the user buffer.

(2) Write ready state

If the number of bytes left in the kernel buffer (free space) is greater than or equal to the number of bytes requested by the user process, the system can move data from the user buffer to the kernel buffer.

That is to say, a read or write must first pass through the read-ready or write-ready state before the “real” I/O happens. Once read-ready, the system can move data from the kernel buffer to the user buffer;

once write-ready, the system can move data from the user buffer to the kernel buffer.

Network IO

For a Java service, network IO is the focus: both request/response handling and database operations go through network IO.

Network I/O features

With other I/O sources, when data is read it either exists or it does not. Network IO, however, means the operating system reading and writing the contents of the network adapter, and whether there is anything to read depends on whether new data has arrived over the network. The user process therefore cannot know when data will be available from network IO, which is why the IO models are needed.

Network IO: how network packets are received

www.easemob.com/news/5544

www.yuque.com/henyoumo/ik…

  • (1) When the NIC receives data, the network driver has arranged for it to be written into memory via DMA, and a soft interrupt is raised to notify the CPU that data has arrived
  • (2) The kernel has a thread, ksoftirqd, dedicated to processing soft-interrupt requests; it loops continuously, checking whether there are pending soft interrupts, and repeatedly polls the driver via its poll() function
  • (3) When ksoftirqd sees the NIC's interrupt request, it hands the data up through each layer of the protocol stack
  • (4) The protocol stack processes the data of each protocol (for example, assembling complete TCP packets), turns it into usable data, and puts it on the socket's queue; at this point the socket's data is ready
  • (5) The user process (or a network framework it uses) maintains its own loop, repeatedly asking kernel space whether any socket data is ready

IO model

The IO models apply to interaction with any I/O source, but they matter most for network IO, where the wait for an I/O interaction can be unbounded.

  • (1) The IO models mainly address the case where data is not yet ready; if the data is ready, any model reads it directly
  • (2) The IO models only describe what happens after the application triggers a data-fetch operation; they say nothing about when the application triggers it

Blocking IO (BIO)

  • (1) An application thread issues an operation to fetch data; the kernel data is not ready
  • (2) The thread blocks and waits
  • (3) The kernel data becomes ready; the thread is woken and the kernel data is read into the user process's space
  • (4) The thread's read completes

With blocking IO, one thread can fetch data from only one socket.

Non-blocking IO (NIO)

  • (1) An application thread issues an operation to fetch data; the kernel data is not ready, and the call returns an error immediately
  • (2) After the call returns, the thread can do other work, then try to fetch the data again
  • (3) The thread keeps polling until the kernel data is ready, at which point the kernel data is copied to user space
  • (4) The thread's read completes

With non-blocking IO, a single thread that maintains multiple socket connections can achieve an effect similar to select().
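
A minimal sketch of non-blocking reads (assuming a POSIX system; a pipe stands in for a socket): the fd is switched to O_NONBLOCK, and a read with no data returns EAGAIN instead of blocking the thread:

```python
import os, errno, fcntl

r, w = os.pipe()

# Switch the read end to non-blocking mode.
flags = fcntl.fcntl(r, fcntl.F_GETFL)
fcntl.fcntl(r, fcntl.F_SETFL, flags | os.O_NONBLOCK)

# Nothing has been written yet: read() fails immediately with EAGAIN
# (surfaced in Python as BlockingIOError) instead of blocking.
try:
    os.read(r, 10)
    got_eagain = False
except BlockingIOError as e:
    got_eagain = (e.errno == errno.EAGAIN)

# Once data arrives, the same read() succeeds.
os.write(w, b"ready")
data = os.read(r, 10)
print(got_eagain, data)
```

A thread can loop over many such fds, retrying each one, which is the polling behaviour the text describes.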

IO multiplexing

IO multiplexing is a synchronous IO model: one thread listens for multiple I/O events, and when an event becomes ready the thread is notified to perform the corresponding read or write.

In “multiplexing”, the “multi” refers to the multiple network connections, and the “plexing” (reuse) refers to reusing the same thread.

Because IO multiplexing applies not only to sockets but to any file descriptor (fd), the discussion below is in terms of fds.

select()

The user thread maintains an array of all fds of interest (for sockets, all established connections). The size of this array is limited: 1024 on 32-bit systems and 2048 on 64-bit systems.

(1) The thread repeatedly calls select(), which copies the array from user space into kernel space. The kernel scans the array, checking each fd for I/O events; any fd with an event is marked ready.

(2) select() returns, and the process iterates over the array to see which fds are ready. For each ready fd it performs the corresponding operation, copying data from kernel space into the process's space.
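
The two steps above can be sketched with Python's select wrapper (a pipe stands in for a socket; the timeout values are arbitrary):

```python
import os, select

r, w = os.pipe()

# Nothing written yet: select() with a zero timeout reports no ready fds.
readable, _, _ = select.select([r], [], [], 0)
before = list(readable)

# Write to the pipe, making the read end "read ready".
os.write(w, b"event")
readable, _, _ = select.select([r], [], [], 1.0)
after = list(readable)

# The "real" IO happens only after select() reports readiness.
data = os.read(r, 10) if after else b""
print(before, after, data)
```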

poll()

The process maintains a linked list instead of an array, so there is no length limit.

Everything else works the same way as select(); there is no performance improvement.
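
A sketch of the same flow with poll(), which takes a registered set of fds instead of fixed-size arrays (a pipe again stands in for a socket):

```python
import os, select

r, w = os.pipe()

p = select.poll()
p.register(r, select.POLLIN)   # watch the read end for input

# Timeout of 0 ms: nothing ready yet.
before = p.poll(0)

os.write(w, b"hi")
# Now the fd is reported with POLLIN set.
events = p.poll(1000)
data = os.read(r, 10)
print(before, events, data)
```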

epoll()

epoll improves on select and poll. Its core idea is event-driven: the bookkeeping data structures are built in advance and callback functions are used, so no scan over all fds is needed; only the ready fds are returned.

The epoll API consists of three functions: epoll_create, epoll_ctl, and epoll_wait.

epoll_create

epoll_create creates a data structure in the kernel to hold fds. With select and poll, the kernel keeps no structure for the fds; the array or list is simply copied in on each call. epoll_create instead creates a red-black tree in the kernel for storing fd nodes; newly added fds are registered into this tree.

epoll_ctl

Unlike select and poll, which copy all monitored fds into the kernel on every call, epoll calls epoll_ctl when a new fd needs to be watched: it registers a callback function for the fd and inserts the fd node into the kernel's red-black tree. When the device behind the fd becomes active, the callback runs and moves the node onto a ready list. This also eliminates the repeated copying back and forth between kernel space and user space.

epoll_wait

A process calls epoll_wait to retrieve ready fds; in fact it simply fetches nodes from the ready list.

epoll workflow

When epoll_wait() is called, if the ready list has data the call returns it to the thread immediately; if not, the thread blocks and is woken when data arrives. The woken thread then continues on from the epoll_wait() call.
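
The workflow can be sketched with Python's epoll wrapper (Linux only; a pipe stands in for a socket): create the instance, register the fd, then wait for ready events:

```python
import os, select

r, w = os.pipe()

ep = select.epoll()              # epoll_create: kernel structure for fds
ep.register(r, select.EPOLLIN)   # epoll_ctl(EPOLL_CTL_ADD): watch the fd

before = ep.poll(0)              # epoll_wait: ready list is empty so far

os.write(w, b"wake")
events = ep.poll(1)              # only the ready fds come back
data = os.read(r, 10)

ep.close()
print(before, events, data)
```

Note that unlike select(), the registered fds are not re-passed on every wait; the kernel already holds them.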

When to choose select(), poll(), or epoll()

epoll is not the best choice in every situation. For example, when the number of fds is small, epoll is not necessarily better than select or poll.

AIO (asynchronous I/O)

Asynchronous IO never blocks. At first glance it looks like epoll's callbacks, but epoll actually wakes up the thread that was waiting for the data; that thread was blocked until it was woken.

  • (1) A user thread issues a call into kernel space to read data.
  • (2) If the data is ready, it is read immediately and copied to user space.
  • (3) If it is not ready, the call returns immediately and the thread carries on; it does not wait.
  • (4) The kernel already knows what data the user process wants. When the data is ready, the kernel actively copies it into user space and actively invokes the callback function the user provided to process it.

IO evolution of the JDK

  • (1) JDK 1.4 introduced NIO; before this release there was only traditional BIO;
  • (2) JDK 1.7 added AIO.
  • (3) I/O operations at the programming-language level ultimately call the operating system kernel's read/write interfaces (which drive the underlying hardware), so they depend on the kernel. If the operating system does not support AIO, an AIO API at the language level is of little use. In practice, NIO is still what most applications use: most are deployed on Linux servers, and the Linux kernel has not fully implemented AIO (Windows has).
  • (3) The IO operation at the programming language level actually calls the read/write interface of the operating system kernel (reading and writing to the underlying hardware device), so it essentially depends on the operating system kernel. If the operating system does not support AIO, even if there is aiO interface at the programming language level, it is useless, which is why there is AIO. However, niO is still the actual use of most applications, because most applications are deployed on Linux servers, and the Linux operating system kernel has not implemented AIO (Windows implements AIO).