Word has it that searching "Java fish" on WeChat will make you stronger!

This article is also posted on GitHub and Gitee, where you can find my entire series of Java articles for study and interview preparation.

(1) Preface

Recently I have been busy getting a system ready for launch and have run into plenty of problems. The freshest in my memory is a memory overflow error. A tester told me that a certain feature had recently stopped working, so I went to check the error log and was surprised to see a screen full of OutOfMemoryError. For privacy reasons, I only show a few of the exceptions here:

(2) Thinking about the scenario

The usual way to track down a problem is to look at the logs first, and then think about the scenarios in which a memory overflow exception could occur. So I went over the logic of this code. It is the data-extraction function of an asynchronous thread: through a Dubbo interface it pulls 1000 records per call, does some processing on the data, and then writes it to the database; the total volume ranges from tens of thousands to hundreds of thousands of records.

In this setup, if memory overflows, the only possibility is that the 1000 records from each call are not cleared after processing, so hundreds of thousands of records pile up in memory until it runs out. So I checked this part of the code:

List<EmrTreatment> resultList = new ArrayList<>();
// Keep looping as long as data is still being returned
while (CollectionUtils.isNotEmpty(data.getRecords())) {
    resultList.addAll(data.getRecords());
    // Business processing
    // ...
    // Clear the collection to prevent memory overflow
    resultList.clear();
    // Request the next 1000 records through the Dubbo interface
    // ...
    data = searchDataByScroll(dataRequest);
}

I had deliberately cleared the list after each batch of 1000 records was processed, so this part of the code should not cause a memory overflow.

(3) Checking the GC logs

Since the heap was reported to be out of memory, my first instinct was to look at the GC log, but there was no hint of an out-of-memory condition there. While we are at it, let's review what the fields of a GC log entry mean; a small code sketch for inspecting GC activity from inside the JVM follows the field list.

Take one of the entries as an example:

GC (Allocation Failure) 2021-10-29T16:37:45.177+0900: 2686.339: [ParNew: 283195K->3579K(314560K), 0.0256691 secs] 396015K->116915K(1013632K), 0.0258253 secs] [Times: user=0.03 sys=0.02, real=0.03 secs]

GC: indicates a garbage collection; this entry is a Minor GC

Allocation Failure: the GC was triggered because there was not enough space left in the young generation

ParNew: the ParNew collector was used for the young generation in this GC

283195K->3579K(314560K) : Usage of young generation before GC -> Usage of young generation after GC (total capacity of young generation)

396015K->116915K(1013632K) : Heap usage before garbage collection -> Heap usage after garbage collection (heap size)

[Times: user=0.03 sys=0.02, real=0.03 secs]:

User: total CPU time consumed by the garbage collection threads

Sys: CPU time spent in kernel (system) calls during the collection

Real: actual elapsed wall-clock time of the collection, i.e. the application pause time (STW)
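As a side note, when the GC log file itself is hard to reach, the JVM's standard management API offers a quick way to see how often each collector has run and how much time it has spent. Below is a minimal sketch using java.lang.management; the collector names it prints (for example ParNew and ConcurrentMarkSweep) depend on the collectors the JVM was started with, and a log like the one above is typically produced on JDK 8 with flags such as -XX:+PrintGCDetails, -XX:+PrintGCDateStamps and -Xloggc:<file>:

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcQuickCheck {
    public static void main(String[] args) {
        // Print the cumulative collection count and total time of every collector in this JVM,
        // e.g. "ParNew" for the young generation and "ConcurrentMarkSweep" for the old generation
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", totalTimeMs=" + gc.getCollectionTime());
        }
    }
}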

Since there was no sign of a heap overflow in the GC log, the overflow was not coming from our application's own heap. Reading the error message more carefully, there was a clear error pointing to Dubbo, which meant my earlier line of investigation had gone down the wrong path.

(4) Checking the Dubbo interface configuration

I vaguely remembered that a payload size limit had been set when the Dubbo interface was configured, so I checked the configuration on Nacos. Sure enough, the Dubbo payload limit was set to 16M, which pinpointed the problem: whenever the 1000 records fetched in a single Dubbo call exceed 16MB, an OutOfMemoryError is reported.
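For reference, the limit involved here is Dubbo's payload setting, which caps the size of a single request or response in bytes (Dubbo's default is 8MB). A minimal sketch of raising it through Dubbo's API-style configuration is shown below; it assumes Apache Dubbo 2.7+ (older versions use the com.alibaba.dubbo package), while in our project the value is actually maintained in Nacos:

import org.apache.dubbo.config.ProtocolConfig;

// XML equivalent: <dubbo:protocol name="dubbo" payload="16777216"/>
public class DubboPayloadConfig {
    public static ProtocolConfig buildProtocol() {
        ProtocolConfig protocol = new ProtocolConfig();
        protocol.setName("dubbo");
        // payload is the maximum size in bytes of a single request/response;
        // Dubbo's default is 8MB, raised here to the 16MB we found on Nacos
        protocol.setPayload(16 * 1024 * 1024);
        return protocol;
    }
}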

(5) Solutions

Now that the problem had been identified, the solution was simple: first adjust the payload limits on both the consumer and the provider side of the Dubbo interface to fit the actual situation, and then reduce the batch size slightly from 1000 records per call. Besides, according to the original design, 1000 records should never exceed 16M, so I examined the data and found that some individual records exceeded 2M. Such records are of little use, so whether to filter them out is being discussed with the product side. The final online verification no longer reported the same problem, so it is solved.
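Purely as an illustration, the caller-side change amounted to something like the sketch below. The setPageSize method, the dataRequest object and estimateSizeInBytes are hypothetical stand-ins for our internal request class and record-size estimate, and the filtering part only shows what it might look like if the product side decides to drop oversized records:

// Hypothetical sketch: fetch fewer records per Dubbo call than the original 1000
dataRequest.setPageSize(500);
data = searchDataByScroll(dataRequest);

// Possible follow-up, not yet decided: skip records whose serialized size exceeds 2MB
List<EmrTreatment> usable = new ArrayList<>();
for (EmrTreatment record : data.getRecords()) {
    // estimateSizeInBytes is a hypothetical helper returning the serialized size of one record
    if (estimateSizeInBytes(record) <= 2 * 1024 * 1024) {
        usable.add(record);
    }
}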

(6) Summary

Although the problem is solved, I still took a few detours along the way. When facing an urgent problem your mind may not be as clear as it is in hindsight, but the more pits you step into, the more you learn.