This article introduces the principles of the Netty memory model. Little has been written about off-heap memory leaks caused by improper use of Netty, so this article covers the background knowledge, diagnostic tools, and troubleshooting approach for Netty off-heap memory leaks.

Symptoms

The main symptom of an off-heap memory leak is that the process occupies a large amount of memory (visible with the top command on Linux) while the Java heap does not (visible with the jmap command). This article describes how to troubleshoot Netty off-heap memory leaks.

Low-level implementation of off-heap memory release

1 java.nio off-heap memory release

Netty's off-heap memory is built on the native java.nio DirectByteBuffer object, so it is necessary to understand how that object is freed.

A DirectByteBuffer created through java.nio holds a sun.misc.Cleaner object, whose clean() method makes the system call that frees the off-heap memory. There are two ways to trigger clean():

  • (1) The application calls it explicitly
ByteBuffer buf = ByteBuffer.allocateDirect(1);
// DirectBuffer is the JDK-internal sun.nio.ch.DirectBuffer interface
((DirectBuffer) buf).cleaner().clean();
  • (2) Triggered by GC collection

The Cleaner class extends java.lang.ref.Reference. The GC thread organizes unreachable Reference objects that can be reclaimed into a linked list by setting two internal fields of Reference: pending, which is the head node of the list, and discovered, which points to the next node.

The internal daemon thread of Reference consumes entries from the head of the list. If the Reference it takes is a Cleaner, the thread calls its clean() method (see Reference#tryHandlePending()).
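The two trigger paths can be tried out with the public java.lang.ref.Cleaner API (Java 9 and later), used here only as a stand-in for the JDK-internal Cleaner attached to a DirectByteBuffer. This is a minimal sketch: the class name CleanerTriggerSketch and the boolean flag that stands for "the off-heap block has been freed" are illustrative assumptions, not JDK or Netty code.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

final class CleanerTriggerSketch {
    // One shared Cleaner; it owns a daemon thread that runs actions for unreachable objects.
    private static final Cleaner CLEANER = Cleaner.create();

    /** Registers a cleanup action for owner and returns the handle used for explicit cleaning. */
    static Cleaner.Cleanable track(Object owner, AtomicBoolean freed) {
        // The action must not reference owner, otherwise owner never becomes unreachable.
        return CLEANER.register(owner, () -> freed.set(true));
    }

    public static void main(String[] args) {
        AtomicBoolean freed = new AtomicBoolean(false); // hypothetical "memory freed" flag
        Object owner = new Object();
        Cleaner.Cleanable cleanable = track(owner, freed);

        // Path (1): the application triggers the cleanup itself.
        cleanable.clean();
        System.out.println("freed after explicit clean(): " + freed.get());

        // Path (2) would be: drop every reference to owner and wait for a GC cycle;
        // the Cleaner's daemon thread then runs the same action at some later time.
    }
}
```

The cleanup action runs at most once, so calling clean() explicitly and later letting the object be collected does not free the resource twice.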

2 Netty noCleaner strategy

Before introducing the noCleaner strategy, you need to understand what a DirectByteBuffer with a Cleaner does when it is initialized:

The Cleaner object is initialized only in the DirectByteBuffer(int cap) constructor, which first checks whether the allocation would exceed the maximum allowed off-heap memory (configurable with -XX:MaxDirectMemorySize).

If it would, the JVM first tries to get unreachable Reference objects onto the Reference linked list, relying on the internal Reference daemon thread to run the clean() method of the Cleaners whose DirectByteBuffer objects can be reclaimed.

If memory is still insufficient, System.gc() is called to trigger a full GC, so that DirectByteBuffer objects in the heap are reclaimed and their off-heap memory is freed. If the limit is still exceeded, java.lang.OutOfMemoryError is thrown (the code is in java.nio.Bits#reserveMemory()).

Netty introduced the noCleaner strategy in 4.1: it creates DirectByteBuffer objects without a Cleaner. This bypasses the extra overhead of the Cleaner-related work in the DirectByteBuffer constructor and of the Cleaner's clean() method. It also means that System.gc() is not triggered when off-heap memory runs short, which improves performance.

The main differences between hasCleaner DirectByteBuffer and noCleaner DirectByteBuffer are as follows:

  • Different constructors:

noCleaner objects: created by calling the private constructor DirectByteBuffer(long addr, int cap) through reflection. hasCleaner objects: created by the DirectByteBuffer(int cap) constructor.

  • Different ways of freeing memory:

noCleaner objects: freed with UNSAFE.freeMemory(address). hasCleaner objects: freed with the clean() method of the DirectByteBuffer's Cleaner.

Note: Unsafe, a class in the sun.misc package, provides native methods for memory operations, object operations, and thread scheduling. These methods matter for the efficiency of Java programs, but improper use of Unsafe makes errors more likely and the program is no longer "safe". It is therefore not officially recommended and may be removed in a future JDK release.

During startup, Netty checks whether the current environment and its configuration parameters allow the noCleaner strategy (the logic is in the static initializer of PlatformDependent). For example, there is no Unsafe class on Android, so the noCleaner strategy is not allowed there. If it is not allowed, the hasCleaner strategy is used.

Note: You can call PlatformDependent.useDirectBufferNoCleaner() to check whether the current Netty application uses the noCleaner strategy.

ByteBuf.release() trigger mechanism

A common misconception is that a ByteBuf allocated by the Netty framework is released automatically by the framework, so business code does not need to release it, while a ByteBuf created by business code must be released by that code because the framework will not do it.

In fact, the Netty framework calls ByteBuf.release() in a few scenarios:

1 Inbound message processing

When an inbound message is processed, Netty creates a ByteBuf to read the message on a channel and trigger a call to the ChannelHandler on the pipeline. The application-defined ChannelHandler that uses ByteBuf is responsible for release().

public void channelRead(ChannelHandlerContext ctx, Object msg) {
    ByteBuf buf = (ByteBuf) msg;
    try {
        ...
    } finally {
        buf.release();
    }
}

If the ByteBuf is not handled by the current ChannelHandler, it is passed to the next pipeline handler:

public void channelRead(ChannelHandlerContext ctx, Object msg) {
    ByteBuf buf = (ByteBuf) msg;
    ...
    ctx.fireChannelRead(buf);
}

Inbound handlers are commonly defined by extending ChannelInboundHandlerAdapter. In that case, if a handler consumes the message without passing it on and no handler calls release(), Netty will not release the inbound message, which causes a memory leak. A message that is passed all the way to the tail of the pipeline is released there by Netty.

When an exception is thrown in a pipeline handler, the Netty framework ultimately catches the exception and calls ByteBuf.release(). The complete flow is in AbstractNioByteChannel.NioByteUnsafe#read(); the key fragment is extracted below:

try {
    do {
        byteBuf = allocHandle.allocate(allocator);
        allocHandle.lastBytesRead(doReadBytes(byteBuf));
        // The inbound message has been read completely
        if (allocHandle.lastBytesRead() <= 0) {
            // ...
            break;
        }
        // Trigger the handlers on the pipeline to process the message
        pipeline.fireChannelRead(byteBuf);
        byteBuf = null;
    } while (allocHandle.continueReading());
    // ...
} catch (Throwable t) {
    // byteBuf.release() is called in here
    handleReadException(pipeline, byteBuf, t, close, allocHandle);
}

Inbound handlers are also commonly defined by extending SimpleChannelInboundHandler, which ensures that the message is eventually released:

@Override
public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception {
    boolean release = true;
    try {
        if (acceptInboundMessage(msg)) {
            I imsg = (I) msg;
            channelRead0(ctx, imsg);
        } else {
            // Not handled by the current handler; passed to the next handler on the pipeline
            release = false;
            ctx.fireChannelRead(msg);
        }
    } finally {
        // Trigger the release
        if (autoRelease && release) {
            ReferenceCountUtil.release(msg);
        }
    }
}

2 Outbound message processing

Unlike inbound messages, which are created automatically by the Netty framework, outbound messages are typically created by the application, which then calls the channel's write() or writeAndFlush() method. These methods take care of calling release() on the ByteBuf passed in.

Note: Before netty-4.0.0.CR2, the write() method has a bug and does not call ByteBuf.release().

3 Notes on release()

  • (1) Reference count

A common misconception is that calling ByteBuf.release() or ReferenceCountUtil.release() guarantees that the object's memory is freed. It does not.

Netty manages the life cycle of ByteBuf objects through reference counting. ByteBuf implements the ReferenceCounted interface, which provides retain() and release() to increase and decrease the reference count. The reclaim action is triggered only when a release() call brings the internal count down to 0.

  • (2) Derived ByteBuf

A derived ByteBuf is one created by ByteBuf.duplicate(), ByteBuf.slice(), or ByteBuf.order(ByteOrder). It shares the reference count with the original ByteBuf, so calling release() on the original ByteBuf also causes the derived objects to be reclaimed.

In contrast, ByteBuf.copy() and ByteBuf.readBytes(int) create objects that are not derived ByteBufs and do not share the reference count with the original ByteBuf. Calling release() on the original ByteBuf does not cause these objects to be reclaimed.
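The counting rules in (1) and (2) can be illustrated without Netty on the classpath. The sketch below is a simplified model, not Netty's implementation: RefCountedBufferSketch and all of its methods are hypothetical, a byte array stands in for the off-heap block, and guards such as rejecting retain() after the count reached 0 are left out.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

final class RefCountedBufferSketch {
    private final byte[] data;
    private final AtomicInteger refCnt; // shared between a buffer and its derived views

    RefCountedBufferSketch(byte[] data) {
        this(data, new AtomicInteger(1)); // a new buffer starts with a count of 1
    }

    private RefCountedBufferSketch(byte[] data, AtomicInteger refCnt) {
        this.data = data;
        this.refCnt = refCnt;
    }

    RefCountedBufferSketch retain() {
        refCnt.incrementAndGet();
        return this;
    }

    /** Decrements the count; returns true only when it reaches 0 and the memory would be freed. */
    boolean release() {
        return refCnt.decrementAndGet() == 0;
    }

    /** Derived view: same content, same counter. */
    RefCountedBufferSketch duplicate() {
        return new RefCountedBufferSketch(data, refCnt);
    }

    /** Independent copy: own content, own counter. */
    RefCountedBufferSketch copy() {
        return new RefCountedBufferSketch(Arrays.copyOf(data, data.length));
    }

    boolean isAccessible() {
        return refCnt.get() > 0;
    }

    public static void main(String[] args) {
        RefCountedBufferSketch original = new RefCountedBufferSketch(new byte[16]);
        RefCountedBufferSketch view = original.duplicate();
        RefCountedBufferSketch copy = original.copy();

        original.retain();                                            // count goes from 1 to 2
        System.out.println("first release frees: " + original.release());
        System.out.println("second release frees: " + original.release());
        System.out.println("view accessible: " + view.isAccessible());
        System.out.println("copy accessible: " + copy.isAccessible());
    }
}
```

The point of the model is that a single release() after a retain() frees nothing, and that a derived view becomes unusable together with its original while a copy has to be released on its own.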

Off-heap memory size control parameters

Two parameters configure the off-heap memory size: -XX:MaxDirectMemorySize and -Dio.netty.maxDirectMemory. What is the difference between them?

  • -XX:MaxDirectMemorySize

Limits the off-heap memory available to hasCleaner DirectByteBuffer objects. The default value is the maximum amount of memory that the JVM can request from the operating system; if memory itself is not limited, the value is Long.MAX_VALUE bytes (the default is returned by Runtime.getRuntime().maxMemory()). The code is in java.nio.Bits#reserveMemory().

Note: -XX:MaxDirectMemorySize cannot limit the off-heap memory of DirectByteBuffer objects created under Netty's noCleaner strategy.

  • -Dio.netty.maxDirectMemory

Limits the maximum off-heap memory that Netty can allocate for DirectByteBuffer objects under the noCleaner strategy. If the value is 0, the hasCleaner strategy is used. The code is in PlatformDependent#incrementMemoryCounter().
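The general idea of a counter that enforces such a limit can be sketched as follows. This is an illustration under stated assumptions, not a copy of PlatformDependent#incrementMemoryCounter(): the class DirectMemoryCounterSketch, its method names, and the use of IllegalStateException as the failure type are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

final class DirectMemoryCounterSketch {
    private final long limit;                        // hypothetical limit in bytes
    private final AtomicLong used = new AtomicLong();

    DirectMemoryCounterSketch(long limit) {
        this.limit = limit;
    }

    /** Reserves capacity bytes or fails without changing the counter. */
    void reserve(int capacity) {
        long newUsed = used.addAndGet(capacity);
        if (newUsed > limit) {
            used.addAndGet(-capacity); // roll back the optimistic increment
            throw new IllegalStateException("failed to reserve " + capacity
                    + " byte(s): used=" + (newUsed - capacity) + ", limit=" + limit);
        }
    }

    /** Gives capacity bytes back when the buffer is freed. */
    void unreserve(int capacity) {
        used.addAndGet(-capacity);
    }

    long used() {
        return used.get();
    }

    public static void main(String[] args) {
        DirectMemoryCounterSketch counter = new DirectMemoryCounterSketch(1024);
        counter.reserve(1000);
        try {
            counter.reserve(100); // would exceed the limit
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        counter.unreserve(1000);
        System.out.println("used = " + counter.used());
    }
}
```

Because the counter is only decremented when a buffer is actually freed, a buffer that is never released keeps its share of the limit reserved, which is why such leaks eventually surface as allocation failures.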

Off-heap memory monitoring

How do I get the usage of off-heap memory?

1 Code Tools

  • (1) hasCleaner DirectByteBuffer monitoring

For DirectByteBuffer objects under the hasCleaner strategy, the java.nio.Bits class keeps track of off-heap memory usage. This class has package-level access and cannot be read directly, but its figures are available through MXBeans.

Note: MXBeans are a family of special beans that Java provides for monitoring and statistics. Different types of MXBeans expose metrics such as JVM process memory, threads, and class loading.

List<BufferPoolMXBean> bufferPoolMXBeans = ManagementFactoryHelper.getBufferPoolMXBeans();
BufferPoolMXBean directBufferMXBean = bufferPoolMXBeans.get(0);
// Number of hasCleaner DirectBuffers
long count = directBufferMXBean.getCount();
// Off-heap memory occupied by hasCleaner DirectBuffers, in bytes
long memoryUsed = directBufferMXBean.getMemoryUsed();

Note: MappedByteBuffer is another type of off-heap ByteBuffer, created by FileChannelImpl.map through mmap memory mapping (an implementation of zero copy). Its off-heap memory metrics can be obtained through ManagementFactoryHelper.getBufferPoolMXBeans().get(1).

  • (2) noCleaner DirectByteBuffer monitoring

Monitoring Netty's noCleaner DirectByteBuffer is simpler: read the value directly from PlatformDependent.usedDirectMemory().
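For the hasCleaner metrics in (1), the same buffer pools are also reachable through the public java.lang.management API, which avoids depending on the JDK-internal ManagementFactoryHelper class. The sketch below looks pools up by name ("direct" or "mapped") instead of by list index; the class name BufferPoolMonitorSketch and the helper findPool are illustrative.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

final class BufferPoolMonitorSketch {
    /** Returns the buffer pool with the given name, or null if the JVM does not expose it. */
    static BufferPoolMXBean findPool(String name) {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if (name.equals(pool.getName())) {
                return pool;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Allocate something so that the direct pool is not empty.
        ByteBuffer buf = ByteBuffer.allocateDirect(1024 * 1024);

        BufferPoolMXBean direct = findPool("direct");
        if (direct != null) {
            System.out.println("direct buffer count  = " + direct.getCount());
            System.out.println("direct memory used   = " + direct.getMemoryUsed() + " bytes");
            System.out.println("direct capacity      = " + direct.getTotalCapacity() + " bytes");
        }
        // Reading the capacity here keeps buf reachable while the pool is inspected.
        System.out.println("allocated buffer size = " + buf.capacity() + " bytes");
    }
}
```

Looking a pool up by name is a little more robust than relying on its position in the list, since the order of the returned beans is not something the caller controls.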

2 Netty's memory leak detection tool

Netty also provides its own memory leak detection tool. It detects leaks in which a ByteBuf object has been reclaimed by GC but the memory it manages was never released. It does not cover leaks in which the ByteBuf object itself is never reclaimed by GC, such as a backlog of tasks.

To help users detect memory leaks, Netty provides four detection levels:

  • disabled: memory leak detection is turned off
  • simple: detects leaks at a sampling rate of about 1%; this is the default level
  • advanced: same sampling rate as simple, but shows a detailed leak report
  • paranoid: sampling rate of 100%, with the same report detail as advanced

Set the level with a command-line parameter:

-Dio.netty.leakDetectionLevel=[level]

The following example program runs with the detection level set to paranoid:

// -Dio.netty.leakDetectionLevel=paranoid
public static void main(String[] args) {
    for (int i = 0; i < 500000; ++i) {
        ByteBuf byteBuf = UnpooledByteBufAllocator.DEFAULT.buffer(1024);
        byteBuf = null;    
    }
    System.gc();
}

You can see the console output leak report:

December 27, 2019 8:37:04 AM io.netty.util.ResourceLeakDetector reportTracedLeak
SEVERE: LEAK: ByteBuf.release() was not called before it's garbage-collected. See https://netty.io/wiki/reference-counted-objects.html for more information.
Recent access records:
Created at:
    io.netty.buffer.UnpooledByteBufAllocator.newDirectBuffer(UnpooledByteBufAllocator.java:96)
    io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187)
    io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178)
    io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:115)
    org.caison.netty.demo.memory.BufferLeaksDemo.main(BufferLeaksDemo.java:15)

Leak detection is built on weak references. Each ByteBuf is wrapped in a WeakReference (the entry point is AbstractByteBufAllocator#toLeakAwareBuffer()), and a reference queue (refQueue) is specified when the weak reference is created.

When GC occurs, if the GC thread finds that a ByteBuf object is reachable only through its WeakReference, it adds that WeakReference to the refQueue. When the ByteBuf's memory is released normally, the WeakReference's clear() method is called to drop the reference to the ByteBuf, and GC will not add that WeakReference to the refQueue afterwards.

Each time Netty creates a ByteBuf, it decides based on the sampling rate whether to sample. On a sampling hit, it polls the WeakReference objects in the refQueue. Every non-null WeakReference returned by the poll corresponds to a ByteBuf whose off-heap memory has leaked (the entry point is ResourceLeakDetector#track()).
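The mechanism described above can be modelled with plain JDK classes. The following is a simplified sketch and not Netty's ResourceLeakDetector: the names LeakDetectorSketch, Tracker, track, and pollLeaks are hypothetical, sampling is left out, and an arbitrary Object stands in for the ByteBuf.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class LeakDetectorSketch {
    /** Weak reference to a tracked resource; close() marks a normal release. */
    static final class Tracker extends WeakReference<Object> {
        private final Set<Tracker> active;
        final String hint;

        Tracker(Object resource, ReferenceQueue<Object> queue, Set<Tracker> active, String hint) {
            super(resource, queue);
            this.active = active;
            this.hint = hint;
            active.add(this); // keeps the tracker itself alive until it is closed or reported
        }

        /** Called when the resource is released normally. */
        void close() {
            active.remove(this);
            clear(); // a reference cleared here is not enqueued by the GC later
        }
    }

    private final ReferenceQueue<Object> refQueue = new ReferenceQueue<>();
    private final Set<Tracker> active = ConcurrentHashMap.newKeySet();

    Tracker track(Object resource, String hint) {
        return new Tracker(resource, refQueue, active, hint);
    }

    /** Drains the queue and returns how many collected resources were never closed. */
    int pollLeaks() {
        int leaks = 0;
        Tracker tracker;
        while ((tracker = (Tracker) refQueue.poll()) != null) {
            if (active.remove(tracker)) { // still active means close() was never called
                leaks++;
                System.out.println("LEAK: collected without release: " + tracker.hint);
            }
        }
        return leaks;
    }

    public static void main(String[] args) throws InterruptedException {
        LeakDetectorSketch detector = new LeakDetectorSketch();
        detector.track(new Object(), "never released"); // no strong reference is kept
        System.gc();       // only a hint; the JVM may or may not collect right away
        Thread.sleep(100);
        System.out.println("leaks found: " + detector.pollLeaks());
    }
}
```

As in the description, a leak is only noticed at the next poll after the garbage collector has enqueued the reference, so the report can lag well behind the moment the object became unreachable.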

3 Graphical tools

Building on the code-based collection of off-heap memory metrics, you can feed the values into a monitoring tool that samples them periodically and plots graphs, for example the popular Prometheus or Zabbix.

Tools that work by reading the monitoring metrics in the MXBeans can only obtain the usage of hasCleaner DirectByteBuffer.

In addition, off-heap memory allocated through JNI calls can be monitored with google-perftools.

Off-heap memory leak diagnosis

Off-heap memory leaks have many specific causes. This article first covers monitoring of task queue accumulation and then presents a general approach to diagnosing off-heap memory leaks.

1 Task queue accumulation

The task queue here is the Queue<Runnable> taskQueue in NioEventLoop. Tasks are submitted to it in the following scenarios:

  • (1) User-defined ordinary tasks
ctx.channel().eventLoop().execute(runnable);
  • (2) Writes to the channel
channel.write(...)
channel.writeAndFlush(...)
  • (3) User-defined scheduled tasks
ctx.channel().eventLoop().schedule(runnable, 60, TimeUnit.SECONDS);

When tasks pile up in the queue, messages cannot be written to the channel and then released, which leads to a memory leak.

To diagnose this, monitor the number of tasks in the task queue, the size of the backlogged ByteBufs, and information about the tasks themselves. The monitoring program is as follows (code at github.com/caison/cai…):

public void channelActive(ChannelHandlerContext ctx) throws NoSuchFieldException, IllegalAccessException {
    monitorPendingTaskCount(ctx);
    monitorQueueFirstTask(ctx);
    monitorOutboundBufSize(ctx);
}

/**
 * Monitor the number of tasks piled up in the task queue.
 * Tasks in the task queue include I/O read and write tasks.
 */
public void monitorPendingTaskCount(ChannelHandlerContext ctx) {
    int totalPendingSize = 0;
    for (EventExecutor eventExecutor : ctx.executor().parent()) {
        SingleThreadEventExecutor executor = (SingleThreadEventExecutor) eventExecutor;
        // Note: pendingTasks() has a bug in Netty versions before 4.1.29 that can block the thread
        // See https://github.com/netty/netty/issues/8196
        totalPendingSize += executor.pendingTasks();
    }
    System.out.println("totalPendingSize = " + totalPendingSize);
}

/** Monitor the first task in each task queue. */
public void monitorQueueFirstTask(ChannelHandlerContext ctx) throws NoSuchFieldException, IllegalAccessException {
    Field singleThreadField = SingleThreadEventExecutor.class.getDeclaredField("taskQueue");
    singleThreadField.setAccessible(true);
    for (EventExecutor eventExecutor : ctx.executor().parent()) {
        SingleThreadEventExecutor executor = (SingleThreadEventExecutor) eventExecutor;
        Runnable task = ((Queue<Runnable>) singleThreadField.get(executor)).peek();
        if (null != task) {
            System.out.println("First task in the queue: " + task.getClass().getName());
        }
    }
}

/** Monitor the size of the ByteBufs backlogged in the outbound buffer. */
public void monitorOutboundBufSize(ChannelHandlerContext ctx) {
    long outBoundBufSize = ((NioSocketChannel) ctx.channel()).unsafe().outboundBuffer().totalPendingWriteBytes();
    System.out.println("Size of backlogged buf in outbound message queue: " + outBoundBufSize);
}
  • Note: The program above requires Netty 4.1.29 or later; on earlier versions it causes performance problems.

How should time-consuming business logic be handled in real Netty-based development?

The short answer: define a separate business thread pool and submit time-consuming work to that pool.

A Netty worker thread (NioEventLoop) is the NIO thread that reads connection data and executes the ChannelHandler logic on the pipeline. It also consumes the tasks submitted to the taskQueue, including channel write operations.

Submitting time-consuming tasks to the taskQueue therefore delays both the NIO processing and the other tasks in the taskQueue, so it is recommended to isolate such work in a separate business thread pool.
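The isolation can be sketched with JDK executors alone. In the sketch below a single-thread executor named "event-loop" merely stands in for a NioEventLoop, and the class BusinessPoolSketch, the method handle, and the thread names are assumptions made for illustration; in a real handler the second stage would typically be a write to the channel.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class BusinessPoolSketch {
    /** Runs the slow part on businessPool and delivers the result back on eventLoop. */
    static CompletableFuture<String> handle(ExecutorService eventLoop, ExecutorService businessPool) {
        return CompletableFuture
                // Time-consuming work: executed by a business thread, not by the I/O thread.
                .supplyAsync(() -> "work on " + Thread.currentThread().getName(), businessPool)
                // Short follow-up step: handed back to the (simulated) event loop.
                .thenApplyAsync(r -> r + ", reply on " + Thread.currentThread().getName(), eventLoop);
    }

    public static void main(String[] args) {
        ExecutorService eventLoop =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "event-loop"));
        ExecutorService businessPool =
                Executors.newFixedThreadPool(4, r -> new Thread(r, "business"));
        try {
            System.out.println(handle(eventLoop, businessPool).join());
        } finally {
            eventLoop.shutdown();
            businessPool.shutdown();
        }
    }
}
```

The event-loop thread only runs the short follow-up step, so reads, writes, and queued tasks on that thread are not held up by the slow part.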

2 General diagnostic approach

Netty off-heap memory leaks have many causes. For example: the code forgets to call release(); the reference count of a ByteBuf is increased with retain() but release() is not called enough times to bring it back to zero; release() is skipped because an exception was thrown; the object referencing the ByteBuf is garbage-collected early while the associated off-heap memory is not reclaimed. The causes cannot all be listed here, so this section offers a general diagnostic approach for reference.

First, you need to be able to reproduce the problem. To avoid affecting online services, try to simulate it in a test or local environment. These environments usually do not have as much concurrency as production, so requests can be simulated with a load-testing tool.

For scenarios that cannot be simulated, Linux traffic replication tools such as Gor, tcpreplay, and tcpcopy can copy online traffic to the test environment without affecting online services.

Once the problem can be reproduced, the next step is to locate it. First, try to find the problem directly with the monitoring methods and log information introduced earlier. If that fails, you need to find the condition that triggers the off-heap memory leak. Sometimes the application is large and exposes many traffic entry points, so checking them one by one is not practical.

In an offline environment, you can comment out half of the traffic entry points at a time and run again to check whether the problem still exists. If it does, the cause lies in the entry points that remain, so comment out half of those and repeat; if it does not, the cause lies in the half you commented out. With this bisection strategy, a few attempts are enough to locate the trigger condition.

After locating the trigger condition, check the processing logic behind it in the program. If the flow is too complex to spot the problem directly, you can keep commenting out parts of the code and bisecting until the specific problematic code block is found.
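The bisection procedure can be written down as a small search routine. This is a sketch under two assumptions that the text implies but does not state: exactly one entry point is responsible, and the reproduction check gives the same answer every time. The class BisectSketch, the method bisect, and the example paths are hypothetical; in practice the predicate is a manual step (deploy with only these entries enabled and watch the memory).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class BisectSketch {
    /**
     * Narrows a non-empty list of entry points down to the one that triggers the problem.
     * reproducesWith answers: does the problem occur when only these entries are enabled?
     */
    static <T> T bisect(List<T> entries, Predicate<List<T>> reproducesWith) {
        List<T> candidates = new ArrayList<>(entries);
        while (candidates.size() > 1) {
            int mid = candidates.size() / 2;
            List<T> firstHalf = new ArrayList<>(candidates.subList(0, mid));
            List<T> secondHalf = new ArrayList<>(candidates.subList(mid, candidates.size()));
            // Enable only the first half; if the problem still shows up, the culprit is in it.
            candidates = reproducesWith.test(firstHalf) ? firstHalf : secondHalf;
        }
        return candidates.get(0);
    }

    public static void main(String[] args) {
        // Hypothetical entry points; "/report" plays the role of the leaking one.
        List<String> entries = Arrays.asList("/login", "/upload", "/search", "/report", "/export");
        String culprit = bisect(entries, enabled -> enabled.contains("/report"));
        System.out.println("trigger found at: " + culprit);
    }
}
```

Each round halves the candidate set, so even a few dozen entry points need only five or six reproduction runs.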

This approach of reproducing, monitoring, and eliminating can also be used to troubleshoot other problems, such as heap memory leaks, 100% CPU usage, or a crashed service process.

Conclusion

This article focuses on background knowledge and theory and lacks a hands-on part. Here are some excellent blog posts:

"Netty Memory Leak Troubleshooting Party", on how to debug a memory leak: www.jianshu.com/p/4e9…

"Netty Memory Leak Prevention Measures", memory leak prevention knowledge shared by Li Linfeng of Huawei, author of "Netty Authoritative Guide": mp.weixin.qq.com/s/Iu…

"Suspect Tracking: Spring Boot Memory Leak Troubleshooting", a case study shared by Ji Bing of the Meituan technical team: mp.weixin.qq.com/s/aY…

"Netty Getting Started and in Practice: Writing a WeChat-like IM Instant Messaging System", a paid Juejin booklet by "The Flash"; I personally got started with Netty through this booklet: juejin.cn/book/1…