
We should learn not only to solve the problem at hand but also to understand the ideas behind the tools we use. Based on Netty 4.1, this article introduces the relevant theoretical models, usage scenarios, basic components, and overall architecture of Netty, in the hope of providing a useful reference for practical development and for studying open-source projects.

Netty is an asynchronous event-driven network application framework for rapid development of maintainable high-performance protocol servers and clients.

Problems with the JDK's native NIO

The JDK itself provides a set of network programming APIs, but they suffer from a number of problems, mainly the following:

NIO’s libraries and APIs are cumbersome to use. You need to be familiar with Selector, ServerSocketChannel, SocketChannel, ByteBuffer, and so on.

Additional prerequisite skills are needed, such as Java multithreaded programming. Because NIO programming involves the Reactor model, you must be familiar with both multithreading and network programming to write high-quality NIO programs.

Making the code reliable takes a large amount of difficult work. For example, the client must handle reconnection after disconnects, intermittent network failures, half-packet reads and writes, failure caching, network congestion, and abnormal byte streams. NIO programming makes basic functionality relatively easy to develop, but filling in these reliability capabilities requires a disproportionate amount of effort.

JDK NIO has its own bugs. A notorious example is the epoll bug, which causes the Selector to spin on empty polls and eventually drives CPU usage to 100%. JDK 1.6 update 18 officially claimed to fix it, but the problem still occurs in JDK 1.7; the fix only made it less likely rather than eliminating it.
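To see why the raw API is considered cumbersome, here is a minimal sketch of a single-threaded JDK NIO echo server. This is an illustrative sketch only: error handling, half-packet handling, and partial writes are ignored, and the port number is arbitrary.

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class RawNioEchoServer {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.configureBlocking(false);
        server.bind(new InetSocketAddress(8080));
        server.register(selector, SelectionKey.OP_ACCEPT);
        while (true) {
            selector.select();                       // block until some channel is ready
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {            // new connection: register it for reads
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {       // data ready: read and echo it back
                    SocketChannel client = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(1024);
                    int n = client.read(buf);
                    if (n < 0) { client.close(); continue; }
                    buf.flip();
                    client.write(buf);               // naive: assumes the write completes
                }
            }
        }
    }
}

Even this toy version needs a Selector, a ServerSocketChannel, SelectionKeys, and manual ByteBuffer management, and every reliability concern listed above is left entirely to the programmer.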

Features of Netty

Netty encapsulates the JDK's NIO API and solves the problems above. Its main features are:

Elegant design: a unified API for all transport types, both blocking and non-blocking sockets; a clear separation of concerns based on a flexible and extensible event model; a highly customizable threading model (a single thread, or one or more thread pools); true connectionless datagram socket support (since 3.1).

Ease of use: well-documented Javadoc, user guides, and examples; no additional dependencies, JDK 5 (Netty 3.x) or JDK 6 (Netty 4.x) is sufficient.

High performance: higher throughput and lower latency; reduced resource consumption; minimized unnecessary memory copying.

Security: complete SSL/TLS and StartTLS support.

Active community: short release cycles, bugs are fixed promptly, and new features are added continuously.

Common usage scenarios of Netty

Common Netty scenarios are as follows:

Internet industry. In distributed systems, nodes need to invoke services on one another remotely, so high-performance RPC frameworks are essential, and Netty, as an asynchronous high-performance communication framework, often serves as their underlying communication component. A typical example is Alibaba's distributed service framework Dubbo: its default Dubbo protocol uses Netty as the basic communication component for communication between process nodes.

Game industry. Whether for mobile game servers or large online games, Java is used more and more widely. Netty, as a high-performance basic communication component, provides TCP/UDP and HTTP protocol stacks out of the box, which makes it very convenient to build custom private protocol stacks; account login servers and map servers can communicate with each other efficiently through Netty.

Big data. Avro, Hadoop's classic high-performance communication and serialization component, uses Netty for cross-node communication in its RPC framework by default; its NettyService is implemented as a wrapper around the Netty framework.

Interested readers can check which open-source projects currently use Netty on the Related Projects page.

Netty's high-performance design

As an asynchronous event-driven network framework, Netty derives its high performance mainly from two aspects: its I/O model, which determines how data is sent and received, and its threading model, which determines how data is processed.

I/O model

The I/O model largely determines the performance of a framework: it defines what kind of channel is used to exchange data, whether BIO, NIO, or AIO.

Blocking I/O

Traditional blocking I/O (BIO) can be represented as follows:

Figure: Blocking I/O

Features are as follows:

Each request requires a separate thread to complete the full cycle of data read, business processing, and data write.

When the number of concurrent connections is large, a large number of threads must be created to handle them, consuming significant system resources.

After a connection is established, if there is temporarily no data to read, the thread blocks on the read operation, wasting thread resources.

I/O multiplexing model

The I/O multiplexing model uses the select function (or its relatives poll and epoll). These calls also block the thread, but unlike blocking I/O, a single call can wait on multiple I/O operations at the same time.

Moreover, readiness of multiple read and write operations can be detected simultaneously; the actual I/O call is not made until data is readable or writable.

Netty’s non-blocking I/O implementation is based on this I/O multiplexing model, represented by a Selector object:

Figure: Non-blocking I/O

Because Netty’s I/O thread, NioEventLoop, aggregates a multiplexer (Selector), it can process hundreds or thousands of client connections concurrently.

When a thread reads or writes data from a client Socket channel, it can perform other tasks if no data is available.

The thread spends the idle time of non-blocking I/O performing I/O operations on other channels, so a single thread can manage multiple input and output channels.

Since both read and write operations are non-blocking, this greatly improves the efficiency of the I/O thread and avoids thread suspension caused by frequent I/O blocking.

A single I/O thread can concurrently handle N client connections and their read/write operations, which fundamentally solves the one-connection-one-thread problem of the traditional synchronous blocking I/O model and greatly improves the performance, elasticity, and reliability of the architecture.

Buffer-based I/O

Traditional I/O is oriented around byte streams or character streams and reads one or more bytes sequentially from a stream, so the position of the read pointer cannot be changed arbitrarily.

NIO abandons the traditional I/O streams and introduces the concepts of Channel and Buffer: data can only be read from a Channel into a Buffer, or written from a Buffer to a Channel.

Unlike the sequential operations of traditional I/O, buffer-based NIO operations can read data at arbitrary positions.
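A small sketch with the JDK's ByteBuffer shows this random access; the values and positions are arbitrary, chosen only for illustration.

import java.nio.ByteBuffer;

public class BufferRandomAccess {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.put(new byte[] {10, 20, 30, 40});  // write 4 bytes; position is now 4
        buf.flip();                            // switch to reading: position 0, limit 4
        byte first = buf.get();                // sequential read, moves position to 1
        byte third = buf.get(2);               // random access by index, position unchanged
        buf.position(1);                       // move the read pointer arbitrarily
        byte second = buf.get();               // reads the byte at index 1
        System.out.println(first + " " + second + " " + third);
    }
}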

Threading model

How datagrams are read, in which thread decoding is performed after the read, and how the decoded messages are dispatched: the threading model behind these choices has a great influence on performance.

Event-driven model

In general, there are two approaches to designing an event handling model:

In polling mode, a thread repeatedly checks the relevant event sources to see whether an event has occurred, and calls the event-handling logic when one has.

In event-driven mode, when an event occurs, the main thread puts it into an event queue; other threads continuously consume events from the queue in a loop and invoke the handling logic corresponding to each event. The event-driven approach, also known as message notification, is the idea behind the Observer design pattern.

Take GUI event handling as an example to illustrate the difference between the two approaches:

In polling mode, the thread continually checks whether a button-click event has occurred and, if so, invokes the handling logic.

In event-driven mode, a click puts an event into the event queue; another thread consumes events from the queue and calls the relevant handling logic according to the event type.

Here’s O’Reilly’s illustration of the event-driven model:

Figure: Event-driven model

It consists of four basic components:

Event Queue: the entry point that receives events and stores those awaiting processing.

Event Mediators: Distribute different events to different business logic units.

Event Channels: the communication channels between mediators and processors.

Event Processor: implements the business logic. After processing completes, it emits an event to trigger the next operation.

It can be seen that, compared with traditional polling, the event-driven approach has the following advantages:

Good scalability: in this distributed asynchronous architecture, event processors are highly decoupled, so event-handling logic can be extended easily.

High performance: because events are staged in a queue, they can conveniently be processed asynchronously and in parallel.
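A minimal sketch of the queue-plus-consumer idea in Java follows; the Event type, queue capacity, and dispatch logic are all made up for the example.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class TinyEventLoop {
    static class Event {                              // hypothetical event type for the sketch
        final String type;
        final Object payload;
        Event(String type, Object payload) { this.type = type; this.payload = payload; }
    }

    private final BlockingQueue<Event> queue = new LinkedBlockingQueue<Event>(1024);

    // Producer side: callers enqueue an event and return immediately.
    public void publish(Event e) { queue.offer(e); }

    // Consumer side: a dedicated thread loops, taking events and dispatching them.
    public void start() {
        Thread consumer = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        Event e = queue.take();       // blocks until an event is available
                        dispatch(e);                  // route to the handling logic
                    }
                } catch (InterruptedException ignored) { }
            }
        });
        consumer.setDaemon(true);
        consumer.start();
    }

    private void dispatch(Event e) {                  // the "mediator": route by event type
        if ("click".equals(e.type)) {
            System.out.println("handle click: " + e.payload);
        }
    }
}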

Reactor thread model

The Reactor model is an event-driven handling pattern for service requests that are delivered concurrently to a service handler through one or more inputs.

The server program handles multiple incoming requests and synchronously dispatches them to the corresponding handler threads. The Reactor model is therefore also called the Dispatcher model: it multiplexes I/O to monitor events in one place and dispatches each event to the appropriate handler after receiving it. It is one of the essential techniques for writing high-performance network servers.

There are two key components in the Reactor model:

The Reactor, which runs in a separate thread, listens for events and distributes them to the appropriate handlers so they can react to I/O events. It acts like a corporate telephone operator, answering calls from customers and redirecting each line to the appropriate contact.

Handlers, which actually carry out the work that an I/O event requires, similar to the actual company employee the customer wants to talk to. The Reactor responds to I/O events by dispatching the appropriate handler; handlers perform non-blocking operations.

Figure: Reactor model

Depending on the number of Reactors and the number of handler threads, the Reactor model has three variants:

Single Reactor, single thread.

Single Reactor, multiple threads.

Master/slave Reactors, multiple threads.

Simply put, a Reactor is a thread running a loop of the form while (true) { selector.select(); ... }; it produces an endless stream of new events, hence the apt name "reactor".

A detailed comparison of the characteristics, advantages, and disadvantages of these variants is beyond the scope of this article; interested readers can refer to my earlier article, Understanding High-Performance Network Models.

Netty threading model

Netty's threading model is a modified master/slave Reactor multithreading model (shown in the figure below), in which there are multiple Reactors:

The MainReactor handles client connection requests and hands the established connections over to a SubReactor.

The SubReactor is responsible for the I/O requests of the channels assigned to it.

Tasks that are not I/O requests (that is, specific business logic) are written directly to a queue, where worker threads pick them up.

(This structure follows the scalable master/slave Reactor I/O model described in Scalable IO in Java.)

Figure: Master/slave Reactor multithreading model

Note that although Netty's threading model borrows the MainReactor/SubReactor structure of the master/slave Reactor multithreading model, in Netty the SubReactor and the worker threads actually live in the same thread pool:

EventLoopGroup bossGroup = new NioEventLoopGroup();
EventLoopGroup workerGroup = new NioEventLoopGroup();
ServerBootstrap server = new ServerBootstrap();
server.group(bossGroup, workerGroup)
      .channel(NioServerSocketChannel.class);

The bossGroup and workerGroup in the code above are the two objects passed to ServerBootstrap's group method. Both are thread pools:

From the bossGroup thread pool, one thread per bound port acts as the MainReactor and processes the Accept events for that port: one boss thread per port.

The threads of the workerGroup pool serve as both SubReactors and worker threads.

Asynchronous processing

Asynchrony is the opposite of synchrony: when an asynchronous procedure call is issued, the caller does not get the result immediately. The component that actually handles the call notifies the caller through status, notifications, or callbacks after it completes.

Netty's I/O operations are asynchronous: bind, write, and connect simply return a ChannelFuture.

The caller does not get the result immediately; through the Future-Listener mechanism, the user can obtain the I/O result either actively or via notification.

When a Future object is first created, it is in an uncompleted state. The caller can query the state of the operation through the returned ChannelFuture, or register a listener callback to run when the operation completes.

Common operations are as follows:

The isDone method checks whether the current operation is complete.

The isSuccess method checks whether the completed operation succeeded.

The getCause method retrieves the cause of failure of the completed operation.

The isCancelled method checks whether the completed operation was cancelled.

The addListener method registers a listener that is notified when the operation completes (isDone returns true); if the Future is already complete, the listener is notified immediately.

For example, in the following code, binding a port is an asynchronous operation; when the bind finishes, the corresponding listener logic is invoked.

serverBootstrap.bind(port).addListener(future -> {
    if (future.isSuccess()) {
        System.out.println(new Date() + ": port [" + port + "] bound successfully!");
    } else {
        System.err.println("port [" + port + "] binding failed!");
    }
});

In traditional blocking I/O, the thread blocks after issuing an I/O operation until it completes. The benefit of asynchronous processing is that threads do not block and can do other work while I/O is in flight, which yields better stability and higher throughput under high concurrency.

Netty architecture design

After introducing some of the theory behind Netty, the following describes Netty's architecture in terms of features, modules, components, and the overall working process.

Features

Netty's features are as follows:

Transport services: BIO and NIO are supported.

Container integration: OSGi, JBossMC, Spring, and Guice containers are supported.

Protocol support: HTTP, Protobuf, binary, text, WebSocket, and other common protocols are supported; custom protocols can also be supported by implementing the corresponding encoding and decoding logic.

Core: an extensible event model, a universal communication API, and the zero-copy ByteBuf buffer object.

Module components

Bootstrap and ServerBootstrap

Bootstrap means bootstrapping or guiding. A Netty application usually starts from a Bootstrap, whose role is to configure the entire Netty program and wire up its components. The Bootstrap class bootstraps Netty clients; ServerBootstrap is the server-side bootstrap class.

Future and ChannelFuture

As mentioned earlier, all I/O operations in Netty are asynchronous, so it is not immediately known whether a message was handled correctly.

Instead, you either wait for the operation to complete or register a listener. This is done with Future and ChannelFuture: you register a listener, and it is triggered automatically when the operation succeeds or fails.

Channel

Channel is Netty's network-communication component, used to perform network I/O operations. It provides users with:

The current state of the channel (for example: is it open? is it connected?).

The configuration parameters of the network connection (for example, the receive buffer size).

Asynchronous network I/O operations (such as establishing connections, reading and writing, and binding ports). An asynchronous call returns immediately and gives no guarantee that the requested I/O operation has completed by the time it returns; instead it returns a ChannelFuture instance, and by registering listeners on the ChannelFuture, the caller is notified through a callback when the I/O operation succeeds, fails, or is cancelled (see the sketch after this list).

Association of I/O operations with their corresponding handlers.
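For instance, here is a sketch of the asynchronous style on a client connect; the host, port, and message are made up, and the bootstrap is assumed to be fully configured elsewhere (group, channel type, and a handler pipeline that can encode String messages).

import io.netty.bootstrap.Bootstrap;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelFutureListener;

public class ConnectExample {
    // Assume 'bootstrap' is a fully configured client Bootstrap.
    static void connectAsync(Bootstrap bootstrap) {
        ChannelFuture f = bootstrap.connect("example.com", 8080); // returns immediately
        f.addListener((ChannelFutureListener) future -> {
            if (future.isSuccess()) {
                future.channel().writeAndFlush("hello");          // connected: send data
            } else {
                future.cause().printStackTrace();                 // connect failed
            }
        });
    }
}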

Connections with different protocols and different blocking types correspond to different Channel types. Common Channel types include:

NioSocketChannel, an asynchronous client TCP socket connection.

NioServerSocketChannel, an asynchronous server TCP socket connection.

NioDatagramChannel, an asynchronous UDP connection.

NioSctpChannel, an asynchronous client SCTP connection.

NioSctpServerChannel, an asynchronous server-side SCTP connection. Together, these channel types cover UDP and TCP network I/O as well as file I/O.

Selector

Netty implements I/O multiplexing on top of a Selector object: with one Selector, a single thread can listen for Channel events on multiple connections.

When a Channel is registered with a Selector, the Selector's internal mechanism automatically and continuously checks whether any of the registered Channels has ready I/O events (readable, writable, connection completed, and so on), so that a single thread can efficiently manage many Channels.

NioEventLoop

NioEventLoop maintains one thread and a task queue that supports asynchronous task submission. When the thread starts, NioEventLoop's run method is invoked, executing both I/O tasks and non-I/O tasks:

I/O tasks: the ready events of SelectionKeys, such as accept, connect, read, and write, triggered through the processSelectedKeys method.

Non-I/O tasks: tasks added to the taskQueue, such as register0 and bind0, triggered through the runAllTasks method.

The ratio of execution time between the two kinds of tasks is controlled by the ioRatio parameter; the default is 50, meaning non-I/O tasks are allowed as much time as I/O tasks.
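As a small illustration, the ratio can be tuned on the group. This sketch assumes Netty 4.1's NioEventLoopGroup.setIoRatio, which applies the value to each event loop in the group; whether tuning helps depends entirely on the workload.

import io.netty.channel.nio.NioEventLoopGroup;

public class IoRatioExample {
    public static void main(String[] args) {
        NioEventLoopGroup workerGroup = new NioEventLoopGroup();
        workerGroup.setIoRatio(70);  // I/O tasks get roughly 70% of each loop iteration
        workerGroup.shutdownGracefully();
    }
}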

NioEventLoopGroup

NioEventLoopGroup manages the life cycle of NioEventLoops and can be thought of as a thread pool: it maintains a group of threads internally, each thread (NioEventLoop) handles the events of multiple Channels, and each Channel is bound to exactly one thread.

ChannelHandler

ChannelHandler is an interface that processes I/O events or intercepts I/O operations and forwards them to the next handler in its ChannelPipeline (the business-processing chain).

ChannelHandler itself declares few methods, but its subinterfaces contain many methods that would need to be implemented, so in practice you usually work with the subinterfaces:

ChannelInboundHandler, used to handle inbound I/O events.

ChannelOutboundHandler, used to handle outbound I/O operations.

Or use the following adapter classes:

ChannelInboundHandlerAdapter, used to handle inbound I/O events.

ChannelOutboundHandlerAdapter, used to handle outbound I/O operations.

ChannelDuplexHandler, used to handle both inbound and outbound events.
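For example, here is a minimal inbound-handler sketch; the handler name is made up, and real handlers would add resource management and business logic.

import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;

// A hypothetical echo handler: writes every received message back to the client.
public class EchoServerHandler extends ChannelInboundHandlerAdapter {

    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) {
        // msg arrives as a ByteBuf unless an earlier decoder transformed it.
        ctx.write(msg);                 // asynchronous: queues the write
    }

    @Override
    public void channelReadComplete(ChannelHandlerContext ctx) {
        ctx.flush();                    // flush all queued writes to the socket
    }

    @Override
    public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) {
        cause.printStackTrace();
        ctx.close();                    // close the connection on error
    }
}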

ChannelHandlerContext

Saves all the context information associated with the Channel, and is bound to one ChannelHandler object.

ChannelPipeline

Saves a list of ChannelHandlers, used to handle or intercept the inbound events and outbound operations of a Channel.

ChannelPipeline implements an advanced form of the intercepting filter pattern, giving the user complete control over how events are handled and how the ChannelHandlers in a Channel interact with one another.

The following description of how the ChannelHandlers in a ChannelPipeline typically handle I/O events is taken from the ChannelPipeline documentation in the Netty 4.1 Javadoc.

I/O events are handled by a ChannelInboundHandler or a ChannelOutboundHandler and are propagated by calling the event-propagation methods defined on ChannelHandlerContext.

For example, ChannelHandlerContext.fireChannelRead(Object) and ChannelOutboundInvoker.write(Object) forward an event to the nearest next handler.

Figure: ChannelPipeline inbound and outbound handler flow
Inbound events are handled by the inbound handlers bottom-up, as shown on the left of the figure. An inbound handler usually processes inbound data generated by the I/O thread at the bottom of the diagram.

Inbound data is usually obtained from the remote peer through an actual input operation, such as SocketChannel.read(ByteBuffer).

Outbound events are handled by the outbound handlers top-down, as shown on the right of the figure. An outbound handler usually generates or transforms outbound traffic, such as write requests.

The I/O thread typically performs the actual output operation, such as SocketChannel.write(ByteBuffer).

In Netty, each Channel corresponds to exactly one ChannelPipeline. Their composition is as follows:

Figure: Channel, ChannelPipeline, ChannelHandlerContext, and ChannelHandler composition
A Channel contains one ChannelPipeline, which maintains a doubly linked list of ChannelHandlerContexts, and each ChannelHandlerContext is associated with one ChannelHandler.

In this doubly linked list, inbound events are passed from the head toward the tail through the inbound handlers, while outbound events are passed from the tail toward the head through the outbound handlers; the two kinds of handlers do not interfere with each other, as the sketch below shows.
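A small sketch of how this ordering plays out when assembling a pipeline; the handler names and classes are hypothetical, defined here only to show the traversal direction.

import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelOutboundHandlerAdapter;
import io.netty.channel.ChannelPipeline;
import io.netty.channel.socket.SocketChannel;

public class OrderingInitializer extends ChannelInitializer<SocketChannel> {
    // Minimal hypothetical handlers, just to illustrate ordering.
    static class InboundA extends ChannelInboundHandlerAdapter { }
    static class InboundB extends ChannelInboundHandlerAdapter { }
    static class OutboundA extends ChannelOutboundHandlerAdapter { }

    @Override
    protected void initChannel(SocketChannel ch) {
        ChannelPipeline p = ch.pipeline();
        p.addLast("inA",  new InboundA());   // nearest the head
        p.addLast("outA", new OutboundA());
        p.addLast("inB",  new InboundB());   // nearest the tail
        // Inbound events (e.g. channelRead) flow head -> tail: InboundA, then InboundB.
        // Outbound operations (e.g. write) flow tail -> head: only OutboundA sees them.
    }
}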

Netty's working architecture

The process for initializing and starting the Netty server is as follows:

public static void main(String[] args) {
    // Create the mainReactor
    NioEventLoopGroup bossGroup = new NioEventLoopGroup();
    // Create the worker thread group
    NioEventLoopGroup workerGroup = new NioEventLoopGroup();

    final ServerBootstrap serverBootstrap = new ServerBootstrap();
    serverBootstrap
            // Assemble the NioEventLoopGroups
            .group(bossGroup, workerGroup)
            // Set the channel type to NIO
            .channel(NioServerSocketChannel.class)
            // Set connection configuration parameters
            .option(ChannelOption.SO_BACKLOG, 1024)
            .childOption(ChannelOption.SO_KEEPALIVE, true)
            .childOption(ChannelOption.TCP_NODELAY, true)
            // Configure inbound and outbound event handlers
            .childHandler(new ChannelInitializer<NioSocketChannel>() {
                @Override
                protected void initChannel(NioSocketChannel ch) {
                    // Configure the inbound and outbound handler pipeline
                    ch.pipeline().addLast(...);
                    ch.pipeline().addLast(...);
                }
            });

    // Bind the port
    int port = 8080;
    serverBootstrap.bind(port).addListener(future -> {
        if (future.isSuccess()) {
            System.out.println(new Date() + ": port [" + port + "] bound successfully!");
        } else {
            System.err.println("port [" + port + "] binding failed!");
        }
    });
}

The basic process is as follows:

Initialize two NioEventLoopGroups: bossGroup handles Accept events (connection establishment) and dispatches requests, while workerGroup handles I/O read/write events and business logic.

Use ServerBootstrap (the server-side bootstrap class) to configure the EventLoopGroups, the Channel type, connection parameters, and the inbound/outbound event handlers.

Bind the port and start working.

Based on the Reactor model introduced above, the working diagram of a Netty server looks like this:

Figure: Netty server Reactor working diagram

The server contains one Boss NioEventLoopGroup and one Worker NioEventLoopGroup.

A NioEventLoopGroup is a group of event loops containing multiple NioEventLoops, and each NioEventLoop contains one Selector and one event-loop thread.

Each Boss NioEventLoop repeatedly executes three steps:

Poll for Accept events.

Handle the Accept I/O event: establish the connection with the client, generate a NioSocketChannel, and register it with the Selector of one of the Worker NioEventLoops.

Process the tasks in the task queue (runAllTasks). These include tasks submitted by users through eventLoop.execute or eventLoop.schedule, as well as tasks submitted to this event loop by other threads.

Each Worker NioEventLoop likewise repeatedly executes three steps:

Poll Read and Write events.

Handle I/O events, that is, the Read and Write events that occur on its NioSocketChannels.

Process tasks in the task queue, runAllTasks.

Tasks in the task queue come from three typical scenarios:

① Common tasks defined by user programs

ctx.channel().eventLoop().execute(new Runnable() {
    @Override
    public void run() {
        // ...
    }
});

② Channel methods invoked by a thread that is not the Channel's own Reactor thread

For example, in a push system, a business thread looks up the Channel corresponding to a user's identity and calls a write method on it to push a message to that user; the write is ultimately submitted to the task queue of the Channel's event loop and consumed there asynchronously.
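A minimal sketch of this scenario; the PushService class, the channelsByUser registry, and the message type are all hypothetical, and the pipeline is assumed to be able to encode the message.

import io.netty.channel.Channel;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PushService {
    // Hypothetical registry maintained elsewhere: user id -> connected Channel.
    private final Map<String, Channel> channelsByUser = new ConcurrentHashMap<>();

    // Called from a business thread, not from the channel's EventLoop thread.
    public void push(String userId, Object msg) {
        Channel ch = channelsByUser.get(userId);
        if (ch != null && ch.isActive()) {
            // Safe from any thread: when the caller is not the EventLoop thread,
            // Netty enqueues the write as a task on that channel's event loop.
            ch.writeAndFlush(msg);
        }
    }
}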

③ User-defined scheduled tasks

ctx.channel().eventLoop().schedule(new Runnable() {
    @Override
    public void run() {
        // ...
    }
}, 60, TimeUnit.SECONDS);

Conclusion

The mainstream version currently recommended for stable use is Netty 4.x. Netty 5's adoption of ForkJoinPool increased code complexity without delivering a clear performance improvement, so that version is not recommended and is no longer available for download.

Netty's barrier to entry is relatively high mainly because there is little good material on the subject, not because it is inherently difficult; you can work through it thoroughly, just as people do with Spring.

Before studying it in depth, it is recommended to first understand the framework's overall principles, structure, and working process; this helps you avoid many detours.