1. The IO models you can never quite tell apart

1.1 Foreword

The main purpose of this article is to prepare you for learning Netty. Before studying Netty, we should first understand network IO. Many developers are unfamiliar with network IO; you can go a long way in day-to-day development without ever touching raw IO or Netty. Yet everyone online talks about how good Netty is and how impressive it seems, and it looks like everyone is using it. You want to get started but feel lost: what exactly is Netty, and what would you even do with it? If you are a beginner, you probably have similar questions. This article walks you through the differences between these IO models, explains what a Socket is, and shows how Netty relates to them. I hope it helps.

1.2 The foundation of everything: the Socket

1.2.1 An introduction to sockets

Socket literally translates to "slot" or "outlet"; in networking we call it a network socket. A simple way to understand it: when a client and a server communicate, each side creates a Socket, and they communicate through this pair of Sockets.

A Socket is read and written like a file, and in fact most operating systems implement a Socket as a file underneath; Linux, which we use most often, is one example.

fd stands for file descriptor. In Unix/Linux systems, sockets are typically BSD sockets (Berkeley sockets), and a socket handle can be regarded as a file: sending and receiving data on a socket is equivalent to writing and reading that file. A socket handle is therefore usually represented by a file descriptor (fd).
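To see this file-like read/write behavior from the application side, here is a minimal Java client sketch (illustrative code, not from the original article; port 19001 is chosen to match the BIO server that appears later in this article, so you can also use this client in place of telnet there):

import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class SocketClient {
    public static void main(String[] args) throws Exception {
        // Connecting creates a socket on our side; the OS hands back
        // a handle that we read and write much like a file
        try (Socket socket = new Socket("localhost", 19001)) {
            OutputStream out = socket.getOutputStream();
            out.write("Hello server".getBytes(StandardCharsets.UTF_8));
            out.flush();

            InputStream in = socket.getInputStream();
            byte[] buf = new byte[1024];
            int read = in.read(buf); // blocks until the server replies
            if (read != -1) {
                System.out.println(new String(buf, 0, read, StandardCharsets.UTF_8));
            }
        }
    }
}

Note how the socket is consumed purely through its input and output streams, exactly like a file.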

1.2.2 Sockets, further explained: the threading model

Let's take a simple look at data transmission from the hardware's perspective: data travels through the network cable to the machine's network card and is finally written into the machine's memory. Once the network card has written the data into memory, it sends an interrupt signal to the CPU, so the operating system knows new data has arrived and processes it via the network card's interrupt handler.

A complete request-handling thread model is shown below; assume a thread runs the following C-style code:

// create a socket
int s = socket(AF_INET, SOCK_STREAM, 0);
// bind
bind(s, ...)
// listen
listen(s, ...)
// accept a client connection
int c = accept(s, ...)
// receive client data
recv(c, ...);
// print the data
printf(...);

When a client network request (such as a TCP connection) arrives at the server, it enters a pending queue until some thread initiates an accept operation; in the code above, that happens at accept. The accept call takes one request off the queue and creates the corresponding Socket file, which contains a send buffer, a receive buffer, a wait queue, and so on. The thread then blocks waiting for data: when recv is executed, the thread is suspended and added to the socket's wait queue. Once the socket receives data, the thread is put back on the work queue and removed from the wait queue. It is worth mentioning that the Socket sits between the application layer and the transport layer; concretely, you can think of it as sitting between HTTP and TCP.

1.3 What are BIO and NIO

With the explanation above, you should now have a concrete understanding of sockets, so let's talk about IO models. An IO model describes the pattern and channels used to transmit data over the network. Java provides three: BIO, NIO, and AIO.

1.3.1 BIO (Blocking IO)

A synchronous blocking model: one processing thread per client connection.

Below is a simple server-side BIO implementation in Java. To test it, run telnet localhost 19001 directly from the console, type the data, and press Enter to send (on some telnet clients you can also press Ctrl + ] and use the send command). Alternatively, use the SocketClient sketch from section 1.2.1.

import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SocketServer {

    private static ExecutorService bossPoolExecutor = Executors.newFixedThreadPool(1200);

    public static void main(String[] args) throws IOException {
        ServerSocket serverSocket = new ServerSocket(19001);
        while (true) {
            System.out.println("Waiting for connection");
            Socket clientSocket = serverSocket.accept();
            System.out.println("Client connected");
            bossPoolExecutor.submit(new Runnable() {
                @Override
                public void run() {
                    try {
                        handler(clientSocket);
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            });
        }
    }

    private static void handler(Socket clientSocket) throws IOException {
        byte[] bytes = new byte[1024];
        System.out.println("Ready to read");

        // Blocking read
        int read = clientSocket.getInputStream().read(bytes);
        System.out.println("Read complete");
        if (read != -1) {
            System.out.println("Client data received: " + new String(bytes, 0, read));
        }
        clientSocket.getOutputStream().write("Hello client".getBytes(StandardCharsets.UTF_8));
        clientSocket.getOutputStream().flush();
    }
}

The drawbacks of BIO are obvious:

1. One request corresponds to one thread, and the read call in the IO code is a blocking operation. If a connection does not read or write any data, its thread still blocks, wasting thread resources.

2. The C10K problem: when too many processes or threads are created, frequent data copies (buffered IO: the kernel copies data into user process space), blocking, and expensive process/thread context switches can bring the operating system down. A rough estimate of the memory cost alone is sketched below.
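As a back-of-the-envelope illustration of why one-thread-per-connection fails at this scale (a sketch with assumed numbers: ~1 MB is a common default JVM thread stack size, tunable with -Xss):

public class ThreadCostEstimate {
    public static void main(String[] args) {
        long connections = 10_000;   // the "C10K" scenario
        long stackBytes = 1L << 20;  // ~1 MB of stack reserved per thread
        long totalGb = connections * stackBytes / (1L << 30);
        System.out.println("~" + totalGb + " GB of stack space just to keep "
                + connections + " blocked threads alive");
    }
}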

Application scenario:

The BIO model suits architectures with a small, fixed number of connections. It demands more server resources, but the code is simple and easy to understand.

1.3.2 NIO (Non-Blocking IO)

Synchronous non-blocking IO. On the server, one thread can handle multiple requests (connections): every connection the client opens is registered with a multiplexer, and the multiplexer polls the connections and dispatches any IO requests for processing.

NIO itself went through an evolution. On Linux, for example, the multiplexer is now implemented on top of epoll; early NIO used select, which essentially loops over every connection to check whether data has arrived, so its performance is low.

Application scenario: NIO suits architectures with many connections and short, light operations per connection, such as chat servers, bullet-screen (danmaku) systems, and server-to-server communication; the programming model is more complicated.

The underlying implementation: from select to epoll

This is where socket theory comes in handy. We all know that the server generates a socket file for each request.

With BIO, each socket file needs its own thread blocking on it to receive and process data. When a large number of requests comes in, the server simply cannot carry that many threads, and so NIO was born.

For NIO, we need to think about how to listen on and handle multiple socket requests. Could we put all the sockets into a single collection and simply iterate over that collection, instead of creating a thread per socket? That is exactly where select comes in. Let's go through actual code to understand what select is.

The following is C-style pseudocode. The basic idea: put the socket connections to be monitored into a collection, i.e., maintain a list of sockets. If none of the sockets in the list has data, suspend the process until one of them receives data, then wake the process to handle it. When the process is woken, it does not know which socket in the list received the data, so it must traverse the list it maintains to find the ready sockets and process them; in C this check is done with FD_ISSET.

int s = socket(AF_INET, SOCK_STREAM, 0);  
bind(s, ...)
listen(s, ...)

int fds[] = ...;  // the sockets to monitor
while (1) {
    int n = select(..., fds, ...);
    for (int i = 0; i < fds.count; i++) {
        if (FD_ISSET(fds[i], ...)) {
            // process the data on fds[i]
        }
    }
}

At the socket level, after we call select, the calling thread is added to the wait queue of every socket being monitored and is blocked. Once any monitored socket receives data, the thread is woken, removed from all the wait queues, and put back on the work queue. At that point the thread can iterate over the monitored socket list, find out which sockets got data, and process them accordingly.

The performance problems are obvious:

If we maintain a list of 10,000 sockets, then even if only one socket received data we still have to traverse the whole list; the average time complexity is O(n). Is there a better, O(1) way, where we know exactly which sockets received data without searching? That is why epoll was born.

What does epoll do?

Compared to select, epoll makes two optimizations:

  1. Function splitting, decoupling the API.
  2. Maintaining a ready list, trading space for time.

1. Function splitting

Let's look at what epoll does through C-style pseudocode:

// select adds the fds to the wait queues and waits for data in one call
select(..., fds, ...);

// ====== epoll decouples these steps ======
// create an epoll object, epfd
epoll_create(...);
// add every socket that needs to be monitored to epfd
epoll_ctl(epfd, ...);
// call epoll_wait to wait for data
epoll_wait(...);

From the code above it is clear that select performs "add to wait queues" and "block waiting" in a single step, while epoll splits them in two: epoll_ctl does the adding and epoll_wait does the waiting. This decoupling at the method level lays the foundation for the optimization below.
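As an aside, Java NIO's Selector maps quite naturally onto these decoupled calls, and on Linux the JDK selector is epoll-based (we verify this at the end of the article). A minimal sketch follows; the class name and port 19003 are arbitrary, and the mappings in the comments are approximate:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

public class EpollMappingSketch {
    public static void main(String[] args) throws IOException {
        // roughly epoll_create on Linux: build the multiplexer object
        Selector selector = Selector.open();

        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(19003));
        server.configureBlocking(false);

        // roughly epoll_ctl: register the socket with the multiplexer
        server.register(selector, SelectionKey.OP_ACCEPT);

        // roughly epoll_wait: block until at least one registered
        // channel is ready (here: until a client connects)
        int ready = selector.select();
        System.out.println("Ready channels: " + ready);
    }
}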

2. Maintain the ready list rdlist

As discussed earlier, with select we do not know which socket received data. So epoll maintains a "ready list" in the kernel that references the sockets that have received data, avoiding the traversal. This list is called rdlist.

Epoll principle and process

(In the original article, the select and epoll schematic diagrams are shown side by side here for comparison.) When a process calls epoll_create, the kernel creates an eventpoll object (the object epfd refers to in the code above). An eventpoll object is also a member of the file system and, like a socket, has its own wait queue. The eventpoll object is needed because the kernel must maintain data such as the ready list and the monitor list; eventpoll acts as a proxy, a middleman, managing the interaction between sockets and threads.

After creating the eventpoll object, you can use epoll_ctl to add the sockets to be monitored; the kernel then adds the eventpoll object to each such socket's wait queue.

When a socket receives data, the kernel adds a reference to that socket into eventpoll's rdlist; receiving data therefore does not interfere with the running threads. When a thread executes epoll_wait, it simply checks whether rdlist is empty; if not, the thread reads the ready sockets directly from rdlist, with no need for a full traversal. Because rdlist exists, threads know exactly which sockets have changed.

The last two data structures to focus on are rdlist, the ready list, and rbr, the monitor list (the structure that stores the sockets added via epoll_ctl).

The rbr monitor list needs efficient insertion, deletion, and lookup. A balanced binary search tree is a good choice, and epoll uses a red-black tree (hence the name rbr). The rdlist ready list only needs fast insertion and deletion, so it is a doubly linked list. A toy model of this bookkeeping follows.
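To make the "space for time" idea concrete, here is a toy Java model of this bookkeeping. It is illustrative only, not the kernel implementation, and every name is made up: a TreeMap (backed by a red-black tree) stands in for rbr, and a queue stands in for the linked-list rdlist.

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.TreeMap;

public class EventPollModel {

    // rbr: the monitor set; TreeMap is backed by a red-black tree
    private final TreeMap<Integer, String> monitored = new TreeMap<>();

    // rdlist: references to sockets that already have data
    private final Deque<Integer> ready = new ArrayDeque<>();

    // like epoll_ctl(ADD): O(log n) insertion into the tree
    public void ctlAdd(int fd, String socketName) {
        monitored.put(fd, socketName);
    }

    // called when data arrives on fd (in the kernel, by the interrupt path)
    public void onDataReceived(int fd) {
        if (monitored.containsKey(fd)) {
            ready.add(fd);
        }
    }

    // like epoll_wait: drain rdlist, never scan the whole monitor set
    public void await() {
        while (!ready.isEmpty()) {
            int fd = ready.poll();
            System.out.println("Data ready on " + monitored.get(fd));
        }
    }

    public static void main(String[] args) {
        EventPollModel ep = new EventPollModel();
        ep.ctlAdd(4, "socket-A");
        ep.ctlAdd(5, "socket-B");
        ep.onDataReceived(5); // only socket-B got data
        ep.await();           // prints socket-B without scanning socket-A
    }
}

Note that await() never touches the monitor set as a whole; it only drains the ready queue, which is exactly how rdlist avoids the O(n) scan.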

Conclusion

Finally, a word about poll. poll works in a similar way to select and is not fundamentally different: it also polls a set of descriptors and handles whichever are ready. However, because its underlying data structure is a linked list, poll has no limit on the maximum number of file descriptors. Comparing the three system calls: select scans a fixed-size fd set (capped at 1024 by default) in O(n); poll scans a linked list, which removes the cap but is still O(n); epoll keeps a red-black tree of monitored sockets plus a ready list, so finding the ready sockets is O(1).

NIO Java code implementation

Finally, here are the Java NIO implementations.

The select-style implementation (one thread iterating over every connection):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class NioServer {

    // Save the client connection
    static List<SocketChannel> channelList = new ArrayList<>();

    public static void main(String[] args) throws IOException, InterruptedException {

        // Create NIO ServerSocketChannel, similar to BIO serverSocket
        ServerSocketChannel serverSocket = ServerSocketChannel.open();
        serverSocket.socket().bind(new InetSocketAddress(9000));
        // Set ServerSocketChannel to non-blocking
        serverSocket.configureBlocking(false);
        System.out.println("Service started successfully");

        while (true) {
            // In non-blocking mode accept() returns immediately instead of blocking
            // NIO's non-blocking behavior is implemented by the operating system, via the Linux kernel's accept function
            SocketChannel socketChannel = serverSocket.accept();
            if (socketChannel != null) { // If a client is connected
                System.out.println("Connection successful");
                // Set SocketChannel to non-blocking
                socketChannel.configureBlocking(false);
                // Save the client connection in the List
                channelList.add(socketChannel);
            }
            // Iterate over the connection to read the data
            Iterator<SocketChannel> iterator = channelList.iterator();
            while (iterator.hasNext()) {
                SocketChannel sc = iterator.next();
                ByteBuffer byteBuffer = ByteBuffer.allocate(128);
                // In non-blocking mode read() returns immediately instead of blocking
                int len = sc.read(byteBuffer);
                // Print the data, if any
                if (len > 0) {
                    System.out.println("Received message:" + new String(byteBuffer.array()));
                } else if (len == -1) { // If the client disconnects, remove the socket from the collection
                    iterator.remove();
                    System.out.println("Client disconnected");
                }
            }
        }
    }
}

The epoll-style multiplexer implementation:


import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

public class NioServer {


    public static void main(String[] args) throws IOException {
        // Create NIO serverSocketChannel
        ServerSocketChannel serverSocketChannel = ServerSocketChannel.open();
        // Bind the port
        serverSocketChannel.bind(new InetSocketAddress(19002));
        // Set non-blocking
        serverSocketChannel.configureBlocking(false);
        // Open a selector to handle channels; on Linux this creates an epoll instance
        Selector selector = Selector.open();
        // Register the serverSocketChannel with the selector
        serverSocketChannel.register(selector, SelectionKey.OP_ACCEPT);
        while (true) {
            // Block until at least one registered channel has an event
            selector.select();
            Set<SelectionKey> selectionKeys = selector.selectedKeys();
            Iterator<SelectionKey> selectionKeyIterator = selectionKeys.iterator();
            while (selectionKeyIterator.hasNext()) {
                SelectionKey selectionKey = selectionKeyIterator.next();
                // If it is a connection registration event
                if (selectionKey.isAcceptable()) {
                    ServerSocketChannel serverSocket = (ServerSocketChannel) selectionKey.channel();
                    SocketChannel socketChannel = serverSocket.accept();
                    socketChannel.configureBlocking(false);
                    socketChannel.register(selector, SelectionKey.OP_READ);
                    System.out.println("Client connection successful");
                } else if (selectionKey.isReadable()) {
                    // If the event is a read
                    SocketChannel socketChannel = (SocketChannel) selectionKey.channel();
                    ByteBuffer byteBuffer = ByteBuffer.allocate(128);
                    int len = socketChannel.read(byteBuffer);
                    if (len > 0) {
                        System.out.println("Client message received: " + new String(byteBuffer.array()));
                    } else if (len == -1) {
                        System.out.println("Client disconnected: " + socketChannel.isConnected());
                        socketChannel.close();
                    }
                }
                // Remove the key from the iterator to avoid handling it twice
                selectionKeyIterator.remove();
            }
        }
    }
}

Take a look at the line Selector selector = Selector.open(); if you trace its underlying create implementation on Linux, you will find EPollSelectorProvider. If you are interested, download the OpenJDK source code and look at the selector implementation: it ends up calling the native epoll_create, epoll_ctl, epoll_wait, and so on.
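A quick way to check which provider your JVM actually picked (a tiny sketch; the printed class name varies by platform, but on Linux it is typically sun.nio.ch.EPollSelectorProvider):

import java.nio.channels.spi.SelectorProvider;

public class ProviderCheck {
    public static void main(String[] args) {
        // Prints the default SelectorProvider implementation for this platform
        System.out.println(SelectorProvider.provider().getClass().getName());
    }
}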

Written at the end:

This article took you from the bottom up: it introduced what IO is and its essence, the Socket, analyzed the underlying implementations of BIO and NIO, and compared the evolution from select to poll to epoll. All technologies go through such an evolution, and optimization is the core force driving it.

I ground through the equivalent of a 94-hour movie writing this and still haven't earned your like, follow, or bookmark. I'd rather believe it's not that you don't like me enough, but that I haven't ground long enough yet...