Why is it necessary to understand NIO

First of all, you need to know that Netty is implemented mainly on top of NIO. Many people never really understand Netty, and apart from the fact that it is hard to practice in day-to-day development, the root cause is usually unfamiliarity with NIO. Once you are truly familiar with NIO and the principles behind it, reading the Netty source code becomes far easier! Today we will start from the most basic IO and work our way up to NIO! Welcome to follow the public account [source apprentice].

How does the operating system define I/O

I/O operations are nothing unfamiliar to anyone working in Java. As the name suggests, I/O stands for Input/Output, corresponding to the two verbs Read and Write. However, when an upper-layer application performs a Read or a Write, the operating system does not operate on the physical disk directly; instead, the system kernel loads the disk data on its behalf. Take Read as an example: when a program issues a Read request, the operating system copies the data from the kernel buffer into the user buffer. If there is no data in the kernel buffer yet, the kernel appends the Read request to its request queue, and only after the data has been loaded into the kernel buffer does it copy the data to the user buffer and return to the upper application!

Write requests work similarly: the user process writes data into the user buffer, the data is copied into the kernel buffer, and once enough data has accumulated the kernel writes it out to the network card or to a disk file!

Taking a Socket server as an example, a complete read/write flow looks like this:

  1. The client sends data to the network card, and the operating system kernel copies it into the kernel buffer!
  2. When the user process issues a Read request, the data is copied from the kernel buffer into the user buffer!
  3. Once the user buffer holds the data, the program starts its business processing! When processing is complete, it issues a Write request, which copies the data from the user buffer into the kernel buffer!
  4. The system kernel then writes the data from the kernel buffer to the network card and sends it to the client through the underlying communication protocol! (A minimal sketch of this flow follows the list.)
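
To make these four steps concrete, here is a minimal sketch of a classic blocking Socket server; the port number and buffer size are arbitrary choices for illustration. The read() call covers steps 1-2, and the business processing plus write() covers steps 3-4.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class BlockingEchoServer {
    public static void main(String[] args) throws Exception {
        ServerSocket serverSocket = new ServerSocket(8080);
        Socket socket = serverSocket.accept();             // wait for a client connection
        InputStream in = socket.getInputStream();
        OutputStream out = socket.getOutputStream();
        byte[] bytes = new byte[1024];
        int len = in.read(bytes);                          // steps 1-2: kernel buffer -> user buffer
        if (len > 0) {
            String request = new String(bytes, 0, len);    // step 3: business processing on the user buffer
            out.write(("echo: " + request).getBytes());    // steps 3-4: user buffer -> kernel buffer -> network card
        }
        socket.close();
        serverSocket.close();
    }
}
```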

I/O models in network programming

This article aims to give beginners a rough understanding of the basic principles, so it will not involve much code or concrete implementation logic; those will be covered in later source-code analysis articles. The goal here is simply to lay a solid foundation for further study!

1. Synchronous blocking I/O

I. The traditional blocking I/O model

This model is a single-threaded application: the server listens for client connections, and as soon as one arrives it immediately handles the business logic for that request. Until that request finishes, any other connection the server receives is blocked and cannot be served! Of course, nobody writes code like this in real development; it only exists in demos! So what is the problem with it?

Looking at the picture, we find that while a new connection is being handled, all other client connections stay blocked. If one client takes too long to process, more and more connections pile up in the blocked state, which can eventually bring the system down. The obvious fix is to separate the business processing from the Accept that receives new connections, so that slow processing no longer affects new connections!

II. Pseudo-asynchronous blocking I/O model

This model is an optimization of the single-threaded model above: when a new connection arrives, the accepted Socket is handed to a new thread for processing, while the main thread keeps accepting new connections (as sketched below). This solves the problem of only being able to handle one connection at a time, but as everyone can see, it has a fatal flaw. It can handle small amounts of concurrency for short periods without trouble, but suppose I have 100,000 connections: I would need to open 100,000 threads, which would crash the system! We need to limit the number of threads, so thread pools must come into play. Let's optimize the model!
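
For reference, a minimal sketch of the thread-per-connection idea described above; handleRequest is a hypothetical stand-in for the business logic, and the port is arbitrary.

```java
import java.net.ServerSocket;
import java.net.Socket;

public class ThreadPerConnectionServer {
    public static void main(String[] args) throws Exception {
        ServerSocket serverSocket = new ServerSocket(8080);
        while (true) {
            Socket socket = serverSocket.accept();           // the main thread only accepts connections
            // each accepted connection is handled on its own thread,
            // so a slow request no longer blocks new connections
            new Thread(() -> handleRequest(socket)).start();
        }
    }

    // hypothetical placeholder for the business logic (read, process, write, close)
    static void handleRequest(Socket socket) {
    }
}
```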

III. Optimized pseudo-asynchronous blocking I/O model

After receiving a new connection from a client, the server wraps the Socket and the business logic into a task and submits it to a thread pool; the pool executes the task while the server keeps accepting new connections! This solves the thread explosion of the previous step, but think back to how a thread pool handles submissions: once the core threads are busy, tasks go into the queue; when the queue is full, extra threads are created up to the maximum pool size; and once the maximum is reached, the rejection policy kicks in! Suppose the maximum pool size is 1500 and the queue holds 1024. If there are currently 10,000 connections and the discard policy is used, roughly 7,500 connection tasks will be rejected, and with 1500 threads running, switching between them is also an enormous overhead! This is a fatal problem!
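
For completeness, a sketch of the same server with the task submitted to a bounded thread pool; the pool parameters are hypothetical, chosen only to echo the 1500-thread / 1024-queue figures discussed above, and handleRequest again stands in for the business logic.

```java
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ThreadPoolServer {
    public static void main(String[] args) throws Exception {
        // bounded pool: at most 1500 worker threads, a queue of 1024 waiting tasks,
        // and a rejection policy once threads and queue are both exhausted
        ExecutorService pool = new ThreadPoolExecutor(
                100, 1500, 60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(1024),
                new ThreadPoolExecutor.AbortPolicy());

        ServerSocket serverSocket = new ServerSocket(8080);
        while (true) {
            Socket socket = serverSocket.accept();        // the main thread still only accepts
            pool.submit(() -> handleRequest(socket));     // the pool caps the number of worker threads
        }
    }

    // hypothetical placeholder for the business logic
    static void handleRequest(Socket socket) {
    }
}
```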

In addition to the problems mentioned above, all three models share one particularly deadly problem: blocking!

Where is it blocked?

  • Accept blocks when no client is connected! While there is no client connection, the thread is stuck waiting for a new connection to arrive.
  • Read blocks while waiting for the client to send data. When a new connection is established but no data is written, the thread waits until data actually arrives! With the optimized pseudo-asynchronous thread model, if only 100 out of 1000 connections write frequently and the remaining 900 write rarely, then 900 threads sit idle waiting for clients to write. That is also a significant performance overhead!

Now let's summarize the problems of the models above:

  1. Serious waste of thread resources!
  2. Frequent thread context switching is inefficient!
  3. Read/write calls block!
  4. Accept blocks waiting for new connections!

So, can we have a scheme in which a very small number of threads manages tens of thousands of connections, without read/write blocking the thread? That question leads to the following model.

2. Synchronous non-blocking I/O

The synchronous non-blocking I/O model is implemented with Java NIO. Look at a simple piece of code:

```java
public static void main(String[] args) throws IOException {
    // Pool of accepted connections
    List<SocketChannel> socketChannelList = new ArrayList<>(8);
    ServerSocketChannel serverSocketChannel = ServerSocketChannel.open();
    serverSocketChannel.bind(new InetSocketAddress(8098));
    // Set to non-blocking
    serverSocketChannel.configureBlocking(false);
    while (true) {
        // Probe for a new connection; because the channel is non-blocking,
        // accept() does not block when there is no connection and simply returns null
        SocketChannel socketChannel = serverSocketChannel.accept();
        if (socketChannel != null) {
            System.out.println("new connection accepted");
            // Set the client channel to non-blocking so read/write will not block
            socketChannel.configureBlocking(false);
            // Add the new connection to the connection list
            socketChannelList.add(socketChannel);
        }
        // Iterate over the connection list
        Iterator<SocketChannel> iterator = socketChannelList.iterator();
        while (iterator.hasNext()) {
            ByteBuffer byteBuffer = ByteBuffer.allocate(128);
            SocketChannel channel = iterator.next();
            int read = channel.read(byteBuffer);
            if (read > 0) {
                System.out.println(new String(byteBuffer.array()));
            } else if (read == -1) {
                // Remove the connection when the client disconnects
                iterator.remove();
                System.out.println("connection closed");
            }
        }
    }
}
```

The key line in the code above is serverSocketChannel.configureBlocking(false). Once the channel is set to non-blocking, neither Accept nor read/write will block! The specific reason why it does not block is covered later in this article; for now, let's look at what is wrong with this implementation!

At first glance we really are using a single thread to handle all connections and all read/write operations. But suppose we have 100,000 connections and only 1000 of them are active (reading/writing frequently): our thread still has to poll all 100,000 channels on every pass, which is a huge waste of CPU!

What do we actually want? We want each polling pass to visit only the channels that have data and skip the channels that do not. In the example above, with only 1000 active connections, each pass would then poll only those 1000; the rest would be polled only when they actually have data to read or write!

3. Multiplexing model

The multiplexing model is the classic model recommended by Java NIO. Event selection is handled internally by a Selector, and the Selector's event selection is implemented by the operating system. See the code below for the concrete flow:

```java
public static void main(String[] args) throws IOException {
    // Open the server socket channel
    ServerSocketChannel serverSocketChannel = ServerSocketChannel.open();
    serverSocketChannel.bind(new InetSocketAddress(8098));
    // Set to non-blocking
    serverSocketChannel.configureBlocking(false);
    Selector selector = Selector.open();
    serverSocketChannel.register(selector, SelectionKey.OP_ACCEPT);
    while (true) {
        // Block until at least one registered event occurs
        selector.select();
        Set<SelectionKey> selectionKeys = selector.selectedKeys();
        Iterator<SelectionKey> iterator = selectionKeys.iterator();
        while (iterator.hasNext()) {
            SelectionKey next = iterator.next();
            if (next.isAcceptable()) {
                // A connection event: accept the client connection
                SocketChannel socketChannel = serverSocketChannel.accept();
                // Set non-blocking
                socketChannel.configureBlocking(false);
                // Register the client connection with the selector and watch for read events
                socketChannel.register(selector, SelectionKey.OP_READ);
            } else if (next.isReadable()) {
                ByteBuffer allocate = ByteBuffer.allocate(128);
                SocketChannel channel = (SocketChannel) next.channel();
                int read = channel.read(allocate);
                if (read > 0) {
                    System.out.println(new String(allocate.array()));
                } else if (read == -1) {
                    System.out.println("connection closed");
                    channel.close();
                }
            }
            // Remove the handled event
            iterator.remove();
        }
    }
}
```

Unlike the synchronous non-blocking I/O above, there is now a Selector: we register each Socket with the events we are interested in, hand the Socket to the Selector, and wait; the Selector then hands back only the Sockets whose registered events have actually occurred!

NIO in JDK 1.4 uses the Linux kernel functions select() or poll(). Much like the NioServer code above, the selector polls all SocketChannels to see which ones have read/write events and processes those that do. JDK 1.5 introduced epoll to optimize NIO with an event-driven mechanism: we register our SocketChannel with the Selector and select the events we are interested in, and the operating system then places each SocketChannel whose event has completed back into the selector's ready set, waiting for the user to process it! Can this solve the problems above?

Of course it can: one of the pain points of synchronous non-blocking I/O is that the CPU spends most of its time on useless polling, and this model solves exactly that! Every channel this model gets back from the selector is already ready, so the thread never has to do any useless polling!

In-depth analysis of underlying concepts
Select model

If we want to get to the bottom of NIO, we need to look at it step by step. First, let's look at the model based on the select() function. What is it? It is one of the multiplexing models used by NIO: the model used in JDK 1.4 and the one commonly used before epoll. Its efficiency is not high, but it was widely used at the time and was later superseded by epoll!

How does it work? As shown in the figure:

  1. First, understand that the operating system has a concept called the work queue: the CPU takes turns executing the processes in the work queue. The Socket server program we usually write is also a process in the work queue, and as long as it stays in the work queue it will be scheduled by the CPU! We will refer to this network program as process A.
  2. select maintains a list of Sockets internally. When the system function select(socket[]) is called, the operating system adds process A to the waiting queue of every Socket in the list and removes process A from the work queue. At this point process A is blocked!
  3. When the network card receives data, it triggers the operating system's interrupt handler, which uses the port of the incoming packet to find the corresponding Socket, finds process A through that Socket, removes process A from the waiting queues of all the Sockets in the list, and puts it back into the operating system's work queue!
  4. At this point process A is woken up and knows only that at least one Socket has data. Process A then traverses all the Sockets to find the ones that actually have data and performs the subsequent business operations.

The core idea of this design is: first let every Socket hold a reference to process A; then, when the operating system raises an interrupt for a Socket, it can find the Socket from the port, find the process from the Socket, and from the process find all the monitored Sockets! Note that when process A wakes up, it only knows that some Socket interrupt occurred and at least one Socket is ready; it still has to traverse all the Sockets to find the data the client sent and process it!

However, you can see that this is very tedious: there is a lot of traversal involved. Adding process A to every Socket's waiting queue requires traversing the Socket list once; after the interrupt, the list has to be traversed again to remove all the references to process A and put process A back into the work queue; and because process A does not know which Socket has data, it has to traverse the Socket list a third time to actually process the data. The whole operation traverses the Sockets three times in total. To keep this cost bounded, select can monitor at most 1024 file descriptors; subtract standard input, standard output, and standard error, and only 1021 Sockets are left. With too many Sockets, the cost of each traversal becomes very high.

Epoll model

Epoll has three important functions, each of which maps onto a JDK NIO call (the three calls are shown together in a short snippet after this list):

  1. epoll_create, corresponding to Selector.open() in JDK NIO
  2. epoll_ctl, corresponding to socketChannel.register(selector, xxx)
  3. epoll_wait, corresponding to selector.select() in JDK NIO
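
Putting those three calls side by side (they are the same calls already used in the multiplexing example above, so serverSocketChannel refers to that example) makes the mapping easier to see; the comments name the Linux function each call corresponds to, as described in this section.

```java
Selector selector = Selector.open();                              // epoll_create: create the epoll instance (epfd)
serverSocketChannel.register(selector, SelectionKey.OP_ACCEPT);   // epoll_ctl: add the channel to the epfd's watch list
selector.select();                                                // epoll_wait: block until at least one event is ready
```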

If you are interested, you can download the OpenJDK 8u source code yourself, or reply to the public account to get the OpenJDK source package!

How does it improve on select?

  1. epoll_create: this system call returns a non-negative file descriptor (the epoll instance, epfd). Like a Socket, it has a waiting queue, but it also has a ready list!

  2. epoll_ctl: adds a Socket to be monitored, corresponding to registering a SocketChannel with the Selector in Java. It places a reference to the epfd into the Socket's waiting queue! This is the part that is hard to grasp; note that it is the epfd (the epoll file descriptor), not the process, that is put on the Socket's waiting queue!

  3. When the operating system's interrupt handler runs, it finds the corresponding Socket by the port number (the client port number is unique), follows that Socket's reference to the epfd, and adds a reference to the Socket to the epfd's ready list!

  4. epoll_wait: checks whether the epfd's ready list contains any Socket references. If so, it returns them directly. If not, it adds process A to the epfd's waiting queue and removes process A from the work queue!

  5. When the network card receives data again, an interrupt occurs and the steps above repeat: the Socket reference is added to the ready list, process A is woken up, process A is removed from the epfd's waiting queue and added back to the work queue, and the program continues executing!

4. Asynchronous non-blocking I/O

In the asynchronous non-blocking model, the user application only needs to issue the corresponding request and register a callback function; when the operating system has completed the operation, it invokes the callback, which finishes the concrete handling! Let's start with a piece of code.

```java
public static void main(String[] args) throws Exception {
    final AsynchronousServerSocketChannel serverChannel =
            AsynchronousServerSocketChannel.open().bind(new InetSocketAddress(9000));
    // Listen for connection events and register the callback
    serverChannel.accept(null, new CompletionHandler<AsynchronousSocketChannel, Object>() {
        @Override
        public void completed(AsynchronousSocketChannel socketChannel, Object attachment) {
            try {
                System.out.println("2--" + Thread.currentThread().getName());
                // Accept the next connection, reusing this handler
                serverChannel.accept(attachment, this);
                System.out.println(socketChannel.getRemoteAddress());
                ByteBuffer buffer = ByteBuffer.allocate(1024);
                // Register a read callback on the client connection
                socketChannel.read(buffer, buffer, new CompletionHandler<Integer, ByteBuffer>() {
                    @Override
                    public void completed(Integer result, ByteBuffer buffer) {
                        System.out.println("3--" + Thread.currentThread().getName());
                        buffer.flip();
                        System.out.println(new String(buffer.array(), 0, result));
                        socketChannel.write(ByteBuffer.wrap("HelloClient".getBytes()));
                    }

                    @Override
                    public void failed(Throwable exc, ByteBuffer buffer) {
                        exc.printStackTrace();
                    }
                });
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        @Override
        public void failed(Throwable exc, Object attachment) {
            exc.printStackTrace();
        }
    });
    System.out.println("1--" + Thread.currentThread().getName());
    Thread.sleep(Integer.MAX_VALUE);
}
```

AIO client

```java
public static void main(String... args) throws Exception {
    AsynchronousSocketChannel socketChannel = AsynchronousSocketChannel.open();
    socketChannel.connect(new InetSocketAddress("127.0.0.1", 9000)).get();
    socketChannel.write(ByteBuffer.wrap("HelloServer".getBytes()));
    ByteBuffer buffer = ByteBuffer.allocate(512);
    Integer len = socketChannel.read(buffer).get();
    if (len != -1) {
        System.out.println("client received message: " + new String(buffer.array(), 0, len));
    }
}
```

The whole flow is: tell the system I am interested in connection events and register a callback; when a connection event occurs, the callback gives me the client connection; from that callback I register a read request and tell the system to invoke another callback when readable data arrives; when the data is there, the read callback runs and writes the response back out!

Why does Netty use NIO instead of AIO?

On Linux, the underlying implementation of AIO still uses epoll, so AIO is not truly implemented at the operating-system level and offers no significant performance advantage; it is also wrapped by a JDK layer that is hard to optimize deeply, so AIO on Linux is not mature enough. Netty is an asynchronous, non-blocking framework that does a lot of asynchronous wrapping on top of NIO. Simply put, the current AIO implementation is relatively weak!