
This article introduces in detail the four IO models relevant to Java, namely BIO, NIO, IO multiplexing, and AIO, as well as the select, poll, and epoll system functions.

1 The Process of a Network I/O Operation

The request and response process of network I/O can be divided into the following steps, taking a server receiving and sending messages as an example:

  1. The Linux kernel reads the request data from the client through the NIC (network interface card) into the kernel's Page Cache. Data moves from the NIC to kernel space;
  2. The data is read from the kernel buffer into the Java process buffer. Data moves from kernel space to user space;
  3. The Java server process handles the client request in its own user space. Data is processed in user space;
  4. After processing the data and building the response, the data is written from the user buffer back to the kernel buffer. Data moves from user space to kernel space;
  5. The Linux kernel writes the response from the kernel buffer to the network adapter, which sends the data to the target client through the underlying communication protocol.

The network IO process can be divided into two stages:

  1. The packet arrives at the NIC and is copied into the kernel-space buffer. This is called the preparation phase. If this phase blocks the calling thread, the model is called blocking IO; otherwise it is non-blocking IO.
  2. Data is copied from kernel space to application space so that applications can actually use it. At this stage, if it blocks the calling thread, it is called synchronous IO, otherwise it is called asynchronous IO.

Based on the above steps, there are four common network IO models:

  1. Blocking IO: the traditional BIO model;
  2. Non-blocking IO (synchronous non-blocking IO): all sockets are created as blocking by default; non-blocking IO requires that the socket be set to non-blocking (O_NONBLOCK). Note that the NIO here is not the NIO (New IO) library in Java.
  3. IO multiplexing: the classic Reactor design pattern, sometimes referred to as asynchronous blocking IO; this is NIO's Selector in Java and epoll in Linux.
  4. Asynchronous IO: the classic Proactor design pattern, asynchronous non-blocking IO, also known as AIO.

1.1 Synchronous and asynchronous

Synchronous I/O: Each request must be processed individually; the processing of one request results in a temporary wait for the entire process, and these events cannot be executed concurrently. After the user thread initiates an I/O request, it must wait for or poll the kernel I/O operation to complete before continuing to execute the request.

Asynchronous I/O: Multiple requests can be executed concurrently, and the execution of a single request or task does not result in a temporary wait for the entire process. The user thread initiates an I/O request and continues to execute it, either notifying the user thread when the kernel I/O operation is complete or invoking the callback function registered by the user thread.

1.2 Blocking and non-blocking

Blocking and non-blocking describe how an operation behaves when the data it requests is not yet ready at the time of access.

Blocking: after a request is issued, the operation blocks until its condition is met.

Non-blocking: after a request is sent, if the condition required by the request is not met, a status flag indicating this is returned immediately rather than waiting indefinitely. The caller usually obtains the final result by checking in a loop whether the condition has been met.

Synchronous and asynchronous focus on whether a task initiated later must wait for an earlier task to complete: whether the first request blocks (or loops) until it succeeds before the next step can run, or whether the caller moves on and collects the result later.

Blocking and non-blocking focus on whether the requesting method blocks when its condition is not met, or returns immediately.

2 Synchronous Blocking IO

The blocking IO model, also known as BIO, is the simplest and most common. In Linux, all sockets are blocking IO by default.

In this model, a user-space application performs a read call (the underlying recvfrom system call) to read data from the socket. After the application issues the read call, it blocks until the packet reaches the network card, is copied into kernel space, and is then copied from kernel space into user space, at which point the call returns.

Most of the time, when the program issues a read call, the packet has not yet arrived or has not yet been copied into kernel space, so the read call (the underlying recvfrom system call) blocks the calling thread until the data has been copied into kernel space. The thread then remains blocked while the data is copied from kernel space into the application's space.

The synchronous blocking IO model is characterized by the fact that both phases of the IO operation block the calling thread.

The BIO model is simple to program. While blocked waiting for data, the user thread is suspended and consumes no CPU, and the data is returned promptly once it is ready.

However, due to the nature of synchronous blocking, the server usually allocates a separate thread for each client connection to process, namely multi-threaded schemes.

This is fine if the number of client connections is not high. However, when faced with hundreds of thousands or even millions of connections, a huge number of threads would be needed to maintain them, and the memory and thread-switching overhead becomes enormous; the traditional BIO model cannot cope. Therefore, we need a more efficient I/O processing model to handle higher concurrency.
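The thread-per-connection scheme above can be sketched as follows; this is a minimal illustration, with the class name, the echo protocol, and the use of an ephemeral port all chosen just for the demo:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class BioEchoServer {

    // Accept loop: one dedicated thread per client connection (the classic BIO scheme).
    static void serve(ServerSocket server) throws IOException {
        while (true) {
            Socket client = server.accept();          // blocks until a connection arrives
            new Thread(() -> handle(client)).start(); // dedicated thread per connection
        }
    }

    static void handle(Socket client) {
        try (Socket c = client;
             BufferedReader in = new BufferedReader(new InputStreamReader(c.getInputStream()));
             PrintWriter out = new PrintWriter(c.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) { // the read blocks in both IO phases
                out.println("echo: " + line);
            }
        } catch (IOException ignored) {
        }
    }

    public static void main(String[] args) throws Exception {
        ServerSocket server = new ServerSocket(0); // ephemeral port for the demo
        Thread t = new Thread(() -> { try { serve(server); } catch (IOException ignored) {} });
        t.setDaemon(true);
        t.start();

        // A client connects and exchanges one message.
        try (Socket s = new Socket("127.0.0.1", server.getLocalPort());
             PrintWriter out = new PrintWriter(s.getOutputStream(), true);
             BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()))) {
            out.println("hello");
            System.out.println(in.readLine()); // prints "echo: hello"
        }
    }
}
```

Every blocked client here pins one thread, which is exactly why this scheme stops scaling at very high connection counts.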

3 Synchronous Non-blocking IO

In Linux, you can set a socket to non-blocking, which gives the synchronous non-blocking IO model. In Java, this is done by calling the channel.configureBlocking(false) method.

A read call under this model (the underlying recvfrom system call) has two possible outcomes:

  1. If there is no data in the kernel buffer, the recvfrom system call returns immediately with a failure code (EAGAIN or EWOULDBLOCK).
  2. If there is data in the kernel buffer, the recvfrom system call copies the data from the kernel buffer to the user process buffer; this copy is still blocking. After the copy completes, the system call returns success and the application process obtains the data.

When the call returns because the data is not ready, the process (or program) can do something else and then issue the recvfrom system call again, repeating the above procedure. This is commonly called polling, and it lets the thread keep using the CPU instead of being suspended.

After each IO system call, if it is in the IO preparation phase (data is not ready), it will immediately return. Through polling operation, the calling thread is not blocked.

However, plain NIO requires repeated IO system calls. This constant polling, repeatedly asking the kernel, consumes a lot of CPU time, causes frequent context switches, and leads to low system resource utilization.
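The two outcomes and the polling loop described above can be demonstrated in a few lines; this sketch uses a java.nio Pipe in place of a network socket, and the class and method names are illustrative:

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

public class NonBlockingReadDemo {

    // Returns {result of the first (empty) read, bytes read after polling}.
    public static int[] demo() throws Exception {
        Pipe pipe = Pipe.open();
        // Same idea as channel.configureBlocking(false) on a SocketChannel.
        pipe.source().configureBlocking(false);

        ByteBuffer buf = ByteBuffer.allocate(64);

        // Outcome 1: no data ready. The non-blocking read returns immediately with 0
        // (the channel-level analogue of recvfrom failing with EAGAIN/EWOULDBLOCK).
        int first = pipe.source().read(buf);

        // Another party makes data available.
        pipe.sink().write(ByteBuffer.wrap("ready".getBytes(StandardCharsets.UTF_8)));

        // Outcome 2: poll until the data shows up, then the copy succeeds.
        int n;
        while ((n = pipe.source().read(buf)) == 0) {
            // polling: the thread could do other useful work here and ask again
        }
        pipe.source().close();
        pipe.sink().close();
        return new int[] { first, n };
    }

    public static void main(String[] args) throws Exception {
        int[] r = demo();
        System.out.println("first read: " + r[0] + " bytes, after polling: " + r[1] + " bytes");
    }
}
```

The busy loop is exactly the CPU-burning polling the text describes; IO multiplexing moves that waiting into the kernel.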

This model is rarely used directly; web servers and frameworks instead use the non-blocking IO capability inside other, more advanced IO models. Day-to-day Java development does not involve this IO model either: Java NIO is not the NIO described here, but rather an IO multiplexing model.

4 IO multiplexing

The IO multiplexing model solves the problem that polling in the synchronous non-blocking NIO model consumes a lot of CPU.

The IO multiplexing model uses a listener thread that makes a different kind of system call: one thread listens on multiple file descriptors (FDs). Once any FD is ready (usually meaning its kernel buffer is readable/writable), the system call returns, and the listener thread can tell the program to issue the corresponding IO system call on the ready FD, such as reading data through recvfrom. This IO model is also sometimes called event-driven IO, because different I/O behaviors return different events to distinguish them, such as read-ready events and write-ready events.

The system functions that currently support IO multiplexing are select, poll, and epoll. We can register multiple socket connections with the same select call, thereby handling multiple socket connections at the same time: when any socket is ready, the call returns readable (or writable), and the process then issues the recvfrom system call. Copying the data from the kernel to the user process is, of course, still blocking.

select, poll, epoll, and so on still block the calling thread during the system call. However, unlike blocking IO, a single call to select, poll, or epoll can block on and listen to the IO operations of multiple sockets. This achieves listening to multiple IO requests in the same thread at the same time, which under the synchronous blocking model can only be done with multiple threads. This is one advantage of IO multiplexing: the system does not need to create a large number of threads, greatly reducing overhead.

In the IO multiplexing model, the registered sockets are also set to non-blocking mode, but this is transparent to the user program. Likewise, the polling that the plain non-blocking model had to perform is now done in kernel space by the select, poll, and epoll system functions, reducing invalid system calls and context switches and lowering CPU consumption, which is another benefit.

The Selector introduced in Java 1.4's new NIO package uses the IO multiplexing model to manage multiple client connections with a single thread; on Linux systems, it uses the epoll system calls underneath. Selector works together with classes such as Channel and Buffer to build multiplexed, synchronous, non-blocking IO programs that provide high-performance data manipulation closer to the underlying operating system.
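A minimal single-threaded multiplexing sketch with Selector follows; two Pipes stand in for client sockets, and the class name and attachments are illustrative:

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

public class SelectorDemo {

    // Watches two channels with one Selector and returns the attachment
    // of the channel that became readable.
    public static String demo() throws Exception {
        Selector selector = Selector.open(); // backed by epoll on Linux

        // Two pipes stand in for two client sockets.
        Pipe a = Pipe.open();
        Pipe b = Pipe.open();
        a.source().configureBlocking(false); // registered channels must be non-blocking
        b.source().configureBlocking(false);
        a.source().register(selector, SelectionKey.OP_READ, "conn-a");
        b.source().register(selector, SelectionKey.OP_READ, "conn-b");

        // Only conn-b receives data.
        b.sink().write(ByteBuffer.wrap("hi".getBytes(StandardCharsets.UTF_8)));

        selector.select(); // blocks until at least one registered channel is ready

        String ready = null;
        for (SelectionKey key : selector.selectedKeys()) {
            if (key.isReadable()) {
                ByteBuffer buf = ByteBuffer.allocate(16);
                // The application still performs the read itself (the synchronous copy).
                ((Pipe.SourceChannel) key.channel()).read(buf);
                ready = (String) key.attachment();
            }
        }
        selector.close();
        return ready;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo() + " was ready"); // prints "conn-b was ready"
    }
}
```

One select() call watches both channels, which is the "one thread, many FDs" property the text describes.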

Further encapsulating the IO multiplexing model produces the more easily understood Reactor pattern (reacting to events). Netty, Redis, Nginx, Tomcat, and many other software packages and frameworks all use it. Java NIO's Selector is a simple implementation of the Reactor pattern.

4.1 The select/poll System Functions

There is no essential difference between a select call and a poll call; both internally use a “linear structure” to store the set of sockets the process cares about.

First, all the sockets of interest are bound into a file descriptor set. A select/poll system call first copies this set from user space into kernel space and then blocks while the kernel checks for events. When a network event occurs, the thread returns from the block; the kernel traverses the socket set, finds the corresponding sockets, marks them readable/writable, and copies the entire set back from kernel space to user space. The user program then traverses the entire socket set again to find the readable/writable sockets and process them.

The select call uses a fixed-length bitmap to represent the file descriptor set, so the number of supported file descriptors is limited: by default, at most 1024 sockets can be listened on. The poll call no longer uses a bitmap to store the file descriptors of interest, but a dynamic array organized as a linked list, breaking select's limit on the number of file descriptors (though still subject to the system's file descriptor limit).

So for select and poll, the file descriptor set is traversed from beginning to end twice, once in kernel space and once in user space, and the set is copied twice: first from user space into kernel space, where the kernel modifies it, and then back out into user space.

The more clients that are connected, the larger the set becomes, and traversing and copying the socket set brings great overhead and poor efficiency. Hence the following epoll model was created.

4.2 The epoll System Functions

Completing an epoll operation takes three steps, that is, three functions working together:

// Create an epoll object (the epoll file system allocates resources for this handle)
int epoll_create(int size);
// Add a connected socket to (or modify/remove it from) the epoll object
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
// Wait for events and collect the connections on which events occurred, similar to the select() call
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);

Use epoll_create to create an epoll object epfd, use epoll_ctl to add the sockets to be monitored to epfd, and call epoll_wait to wait for the data.

When epoll_create is executed, a red-black tree and a ready list are created in the kernel.

When epoll_ctl is executed to add a socket, epoll checks whether the socket is already present in the red-black tree; if it is, epoll returns immediately, otherwise it adds the socket to the tree. It then registers a callback with the kernel interrupt handler, telling the kernel to put the socket handle into the ready list when an interrupt arrives for it. So when the NIC receives data, it sends an interrupt signal to the CPU; the CPU responds, and the interrupt routine executes the callback registered earlier. The red-black tree is an efficient data structure, with insertion and deletion generally taking O(log n) time.

epoll_wait checks only the ready list: it returns the ready sockets if the list is not empty, and otherwise waits. Only the set of sockets on which events occurred is passed to the application, which therefore does not need to poll and scan the whole set (sockets with and without events) the way select/poll does, greatly improving detection efficiency.

4.3 Trigger Mode

Epoll has two working modes: LT (level-triggered) and ET (edge-triggered).

Level trigger (level-triggered): keeps firing as long as the state holds.

  1. As long as the read kernel buffer associated with the file descriptor is not empty and there is data to read, it always wakes up from epoll_wait and sends a readable signal to notify.
  2. As long as the write kernel buffer associated with the file descriptor is not full, it always wakes up from epoll_wait and issues a writable signal for notification.

Edge-triggered: Triggered once on the edge of a state transition.

  1. When the read kernel buffer associated with the file descriptor is converted from empty to non-empty, it wakes up from epoll_wait and sends a readable signal to notify.

  2. When the write kernel buffer associated with the file descriptor changes from full to not full, it wakes up from epoll_wait and sends a writable signal to notify.

In simple terms, ET mode notifies only once when a descriptor becomes readable or writable, while LT mode keeps notifying for as long as it remains readable or writable. For example, when a socket's kernel buffer goes from empty to containing 2 KB of data, both ET and LT notify immediately, and the application can then read the data. Suppose it reads only 1 KB, leaving 1 KB in the buffer: the buffer is still readable, so if checked again, LT mode will notify once more, while ET mode will not.

ET performs better than LT, because if the system has a large number of ready file descriptors that you do not need to read or write, in LT mode they are returned by every epoll_wait call, greatly reducing the efficiency with which the processor can retrieve the ready file descriptors it actually cares about. With ET mode there is no second notification, so the system is not flooded with ready file descriptors you don't care about.

Therefore, in ET mode, the data in the buffer must be read until EAGAIN is returned (EAGAIN indicates that the buffer has been drained); otherwise, the data may be read incompletely.

Similarly, LT mode can work with both blocking and non-blocking sockets, while ET mode supports only non-blocking sockets: with a blocking socket, the process would block in the read/write function when there is no data left to read or no room to write, and the program could not proceed.

select and poll support only LT mode; epoll works in LT mode by default and can also be set to ET mode.

5 Asynchronous Non-blocking IO

In essence, the select/epoll system calls are synchronous IO and are also blocking IO: the application is still responsible for the reading and writing after the read/write event is ready, and that read/write process blocks. IO multiplexing can therefore still be summarized as a synchronous blocking model.

IO multiplexing requires a select request to ask about the state of the data and then an actual request to read the data through the application, whereas asynchronous non-blocking IO (AIO) takes it one step further and does not require the read operation to be called in the application.

The basic flow of AIO is that the user thread tells the kernel to start an IO operation through a system call, and the user thread returns. After the entire I/O operation (including data preparation and data replication) is complete, the kernel notifies the user program and the user performs subsequent service operations.

AIO is implemented on an event and callback mechanism: the application call returns directly without blocking, while the kernel completes the two phases of data preparation and copying. When the kernel has finished, the operating system notifies the corresponding application thread to proceed with the subsequent operations.

In AIO mode, the user thread is not blocked in both the waiting and copying phases of the kernel, achieving true asynchronous non-blocking. The user thread only needs to register the corresponding event callback function to receive the corresponding notification, so asynchronous IO is sometimes called signal-driven IO.

At present, AIO is not widely used. Netty tried AIO and then abandoned it, because Netty's performance on Linux did not improve much after adopting AIO.

In Java 7, asynchronous channel-based IO was added to the java.nio.channels package, with several AsynchronousChannel interfaces and classes for AIO communication; Java 7 called this NIO.2.
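A minimal NIO.2 sketch follows, here using AsynchronousFileChannel with a Future for simplicity (the same pattern applies to AsynchronousSocketChannel, optionally with a CompletionHandler callback). The file name and method names are illustrative, and note that on some platforms the JDK may service such reads from an internal thread pool rather than true kernel AIO:

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Future;

public class AioDemo {

    public static String demo() throws Exception {
        Path tmp = Files.createTempFile("aio", ".txt");
        Files.write(tmp, "async hello".getBytes(StandardCharsets.UTF_8));
        try (AsynchronousFileChannel ch =
                     AsynchronousFileChannel.open(tmp, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(32);

            // The call returns at once; data preparation and the copy into `buf`
            // both complete in the background.
            Future<Integer> pending = ch.read(buf, 0);

            // ... the user thread is free to do other work here ...

            int n = pending.get(); // collect the result once the whole IO is done
            buf.flip();
            return new String(buf.array(), 0, n, StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // prints "async hello"
    }
}
```

Unlike the multiplexing example, the application never issues the read's copy phase itself; it only collects the completed result, which is the Proactor style.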

Further encapsulating the asynchronous IO model produces the more easily understood Proactor pattern.


If you would like to discuss, or the article contains mistakes, please leave a comment directly. I will keep publishing all kinds of Java learning posts!