One: Background

1. Tell a story

A few days ago, a friend contacted me on WeChat (wx) and said his program was suffering from memory bloat, and asked how to analyze it.

After chatting with him, I learned that this dump also came from a HIS (hospital information system). As my friend put it, I really do seem to be tied to hospitals 🤣. That is fine with me: it builds up some resources for myself.

Two: WinDbg analysis

1. Managed or unmanaged?

First, how much committed memory does the current process have?

0:000> !address -summary

--- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_FREE                                         7ffe`baac0000 ( 127.995 TB)          100.00%
MEM_COMMIT                             1153         1`33bd3000 (   4.808 GB)  94.59%    0.00%
MEM_RESERVE                             221         0`1195d000 ( 281.363 MB)   5.41%    0.00%

As you can see, committed memory is about 4.8 GB. Next, look at the size of the managed heap.

0:000> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x00000207a4fc48c8
generation 1 starts at 0x00000207a3dc3138
generation 2 starts at 0x0000020697fc1000
ephemeral segment allocation context: none
------------------------------
GC Heap Size:    Size: 0x1241b3858 (4900730968) bytes.

From the last line, the managed heap occupies 4900730968/1024/1024/1024 ≈ 4.56 GB. Comparing the two figures, almost all of the committed memory is managed memory, so the problem is on the managed side, which narrows things down.

2. Look at the managed heap

Since memory is eaten by the managed heap, let’s see what’s on the managed heap.

0:000> !dumpheap -stat
Statistics:
              MT    Count    TotalSize Class Name
...
00007ffd00397b98  1065873    102323808 System.Data.DataRow
00000206978b8250  1507805    223310768      Free
00007ffd20d216b8  4668930    364025578 System.String
00007ffd20d22aa8      797    403971664 System.String[]
00007ffd20d193d0   406282   3399800382 System.Byte[]
Total 9442152 objects

System.Byte[] takes up about 3.17 GB (3399800382 bytes), which is most of the GC heap. This suggests that something large is being held on the GC heap. Apart from writing a script to brute-force group the byte[] instances, is there a trick for narrowing this down by hand? Yes: !heapstat shows which generations the objects on the managed heap live in.

0:000> !heapstat
Heap             Gen0         Gen1         Gen2          LOH
Heap0         2252000     18880400   3968704192    910894376

Free space:                                                 Percentage
Heap0           43128       770160    185203264     39849984SOH:  4% LOH:  4%
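To put the !heapstat figures in proportion, here is a minimal sketch that turns the four per-generation sizes into percentage shares. The class and method names (HeapStatShares, Shares) are made up for illustration and are not part of SOS.

```csharp
using System;

public static class HeapStatShares
{
    // Returns each input size as a percentage of the sum of all sizes.
    // Hypothetical helper: feed it the Gen0, Gen1, Gen2 and LOH columns from !heapstat.
    public static double[] Shares(params long[] sizes)
    {
        long total = 0;
        foreach (long size in sizes)
        {
            total += size;
        }

        var result = new double[sizes.Length];
        for (int i = 0; i < sizes.Length; i++)
        {
            // Guard against an empty or all-zero input.
            result[i] = total == 0 ? 0.0 : 100.0 * sizes[i] / total;
        }
        return result;
    }
}
```

Calling Shares(2252000, 18880400, 3968704192, 910894376) with the values above puts Gen2 at roughly 81% of the heap and the LOH at roughly 19%, with Gen0 and Gen1 well under 1% combined.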

As the !heapstat output shows, the bulk of the memory sits in Gen2. Next, use !eeheap -gc to find the address ranges of the Gen2 segments, so that the amount of heap content to inspect can be narrowed down.

0:000> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x00000207a4fc48c8
generation 1 starts at 0x00000207a3dc3138
generation 2 starts at 0x0000020697fc1000
ephemeral segment allocation context: none
         segment             begin         allocated              size
0000020697fc0000  0000020697fc1000  00000206a7fbec48  0xfffdc48(268426312)
00000206bbeb0000  00000206bbeb1000  00000206cbeaef50  0xfffdf50(268427088)
00000206ccc40000  00000206ccc41000  00000206dcc3f668  0xfffe668(268428904)
00000206dcc40000  00000206dcc41000  00000206ecc3f098  0xfffe098(268427416)
0000020680000000  0000020680001000  000002068ffff8c0  0xfffe8c0(268429504)
00000206ff4d0000  00000206ff4d1000  000002070f4cf588  0xfffe588(268428680)
000002070f4d0000  000002070f4d1000  000002071f4cf9f0  0xfffe9f0(268429808)
000002071f4d0000  000002071f4d1000  000002072f4cfef0  0xfffeef0(268431088)
000002072f4d0000  000002072f4d1000  000002073f4cf748  0xfffe748(268429128)
000002073f4d0000  000002073f4d1000  000002074f4ce900  0xfffd900(268425472)
00000207574d0000  00000207574d1000  00000207674cfe70  0xfffee70(268430960)
00000207674d0000  00000207674d1000  00000207774ceaf8  0xfffdaf8(268425976)
00000207774d0000  00000207774d1000  00000207874cf270  0xfffe270(268427888)
00000207874d0000  00000207874d1000  00000207974cf7a8  0xfffe7a8(268429224)
00000207974d0000  00000207974d1000  00000207a51ea5a8  0xdd195a8(231839144)

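In general, one segment (the ephemeral segment) holds Gen0 + Gen1, and the remaining segments belong to Gen2. In this listing the ephemeral segment is the last one, because the generation 0 and generation 1 start addresses fall inside it. I picked one of the Gen2 segments, 00000206dcc41000-00000206ecc3f098, and used !dumpheap to list all objects in that range.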

0:000> !dumpheap -stat 00000206dcc41000 00000206ecc3f098
Statistics:
              MT    Count    TotalSize Class Name
00007ffd00397b98   191803     18413088 System.Data.DataRow
00007ffd20d216b8   662179     37834152 System.String
00007ffd20d193d0    23115    187896401 System.Byte[]

System.Byte[] dominates this segment as well (187896401 bytes out of roughly 268 MB), so the next step is to list the byte[] instances in this range by their method table.

0:000> !dumpheap -mt 00007ffd20d193d0 00000206dcc41000 00000206ecc3f098
         Address               MT     Size
00000206dcc410e8 00007ffd20d193d0     8232
00000206dcc43588 00007ffd20d193d0     8232
00000206dcc45a48 00007ffd20d193d0     8232
00000206dcc47d78 00007ffd20d193d0     8232
00000206dcc4a028 00007ffd20d193d0     8232
00000206dcc4c4b0 00007ffd20d193d0     8232
00000206dcc4eb08 00007ffd20d193d0     8232
00000206dcc50e88 00007ffd20d193d0     8232
00000206dcc535b0 00007ffd20d193d0     8232
00000206dcc575d8 00007ffd20d193d0     8232
00000206dcc5a5a8 00007ffd20d193d0     8232
00000206dcc5cbf8 00007ffd20d193d0     8232
00000206dcc5eef8 00007ffd20d193d0     8232
00000206dcc611f8 00007ffd20d193d0     8232
00000206dcc634e8 00007ffd20d193d0     8232
00000206dcc657f0 00007ffd20d193d0     8232
00000206dcc67af8 00007ffd20d193d0     8232
00000206dcc69e00 00007ffd20d193d0     8232
...
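Instead of eyeballing a listing like this, the sizes can be grouped with a small script. The following is a minimal sketch, assuming the !dumpheap -mt output has been saved as text lines with the three columns Address, MT and Size; the class and method names (DumpHeapGrouping, GroupBySize) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class DumpHeapGrouping
{
    // Counts objects per size from "Address MT Size" rows and returns the
    // (size, count) pairs ordered by count, most frequent first.
    public static List<KeyValuePair<long, int>> GroupBySize(IEnumerable<string> lines)
    {
        var counts = new Dictionary<long, int>();
        foreach (string line in lines)
        {
            // Split on any whitespace; the columns are padded with spaces.
            string[] parts = line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

            // Skip the prompt, the header row and "..." lines: a data row has exactly
            // three columns, a hex address first and a decimal size last.
            if (parts.Length != 3) continue;
            if (!ulong.TryParse(parts[0], NumberStyles.HexNumber, CultureInfo.InvariantCulture, out _)) continue;
            if (!long.TryParse(parts[2], NumberStyles.None, CultureInfo.InvariantCulture, out long size)) continue;

            counts[size] = counts.TryGetValue(size, out int current) ? current + 1 : 1;
        }
        return counts.OrderByDescending(pair => pair.Value).ToList();
    }
}
```

The first entry of the returned list is the most common size, so a single dominant size stands out immediately.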

About 99% of them are 8232 bytes, so they are essentially all 8 KB byte arrays. Who is holding them? Use !gcroot to find the reference root.

0:000> !gcroot 00000206dcc410e8
Thread 8c1c:
    rsi:
        ->  00000206983d5730 System.ServiceProcess.ServiceBase[]
        ...
        ->  000002069dcb6d38 OracleInternal.ConnectionPool.OraclePool
        ...
        ->  000002069dc949c0 OracleInternal.TTC.OraBufReader
        ->  000002069dc94a70 System.Collections.Generic.List`1[[OracleInternal.Network.OraBuf, Oracle.ManagedDataAccess]]
        ->  00000206ab8c2200 OracleInternal.Network.OraBuf[]
        ->  00000206dcc41018 OracleInternal.Network.OraBuf
        ->  00000206dcc410e8 System.Byte[]

From the reference chain, the byte[] appears to be held by an OracleInternal.Network.OraBuf[]. That is puzzling: does the Oracle SDK have a bug that blows up memory? Out of curiosity, how many elements does this array have, and what is its total size?

0:000> !do 00000206ab8c2200
Name:        OracleInternal.Network.OraBuf[]
MethodTable: 00007ffcc7833c68
EEClass:     00007ffd20757728
Size:        4194328(0x400018) bytes
Array:       Rank 1, Number of elements 524288, Type CLASS (Print Array)
Fields:
None

0:000> !objsize 00000206ab8c2200
sizeof(00000206ab8c2200) = -1086824024 (0xbf3861a8) bytes (OracleInternal.Network.OraBuf[])

The array has about 520,000 slots (524288 elements), and the total size reported by !objsize is outright negative 😓.
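The negative figure looks like a 32-bit signed overflow in how the size is printed: the hex value 0xbf3861a8 has its top bit set, so it reads as negative when treated as a signed 32-bit integer. A minimal sketch of that reinterpretation follows; the class and method names are hypothetical, and whether higher bits were lost before printing is not known from the dump output.

```csharp
using System;

public static class ObjSizeOverflow
{
    // Reads the same 32-bit pattern both as a signed and as an unsigned integer.
    public static (int Signed, uint Unsigned) Reinterpret(uint bits)
    {
        // unchecked keeps the bit pattern and only changes how it is interpreted.
        return (unchecked((int)bits), bits);
    }
}
```

Reinterpret(0xbf3861a8) gives -1086824024 as a signed value and 3208143272 as an unsigned one. Assuming nothing above 32 bits was truncated, the object graph would therefore be about 2.99 GB, which is in line with the byte[] total seen earlier.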

3. Look for problem code

With the symptom known, I decompiled the Oracle SDK with ILSpy and matched it against the reference chain. The result is shown in the figure below:

So m_tempOBList turns out to be the main culprit behind the memory bloat. That is awkward: why does it keep growing without being released? Since I am not familiar with Oracle, I turned to StackOverflow for help and found this question: "Huge managed memory allocation when reading (iterating) data with DbDataReader".

According to that post, the Oracle SDK has a bug when reading CLOB-type fields. The fix is simple: release each CLOB value once you have finished using it.

4. Seek the truth

If the problem comes from reading a CLOB field, dump the stacks of all threads and see whether any of them is reading a CLOB.

On the thread stack, the code converts an IDataReader to a DataTable through a ToDataTable method, and while doing so it reads a large field via GetCompleteClobData, which matches what the StackOverflow post describes. To make the conclusion more solid, I dug out how many rows the current data reader had already read.

0:028> !clrstack -a
OS Thread Id: 0xbab0 (28)
000000e78ef7d520 00007ffd00724458 System.Data.DataTable.Load(System.Data.IDataReader, System.Data.LoadOption, System.Data.FillErrorEventHandler)
    PARAMETERS:
        this = <no data>
        reader (<CLR reg>) = 0x00000206a530ac20
        loadOption = <no data>
        errorHandler = <no data>

0:028> !do 0x00000206a530ac20
Name:        Oracle.ManagedDataAccess.Client.OracleDataReader
MethodTable: 00007ffcc7933b10
EEClass:     00007ffcc78efd30
Size:        256(0x100) bytes
File:        D:\xxx.dll
Fields:
00007ffd20d23e98  4000337       d0         System.Int32  1 instance          1061652 m_RowNumber

m_RowNumber shows that about 1.06 million rows (1061652) had already been read. Reading more than a million records in one go is unusual, and with large fields on top of that it is really something 🐂.

Three: Summary

The incident was caused by loading more than a million rows containing a large field into a DataTable in one go. The fix is simple: iterate over the data reader yourself in a loop, and release each Oracle LOB value (OracleClob here) immediately after processing it. See the code from the post:


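// Read the column as an Oracle-specific value so that a CLOB comes back as OracleClob.
var item = oracleDataReader.GetOracleValue(columnIndex);

// The pattern match only succeeds for a non-null OracleClob,
// so no separate null check is needed.
if (item is OracleClob clob)
{
    try
    {
        // use clob.Value ...
    }
    finally
    {
        // Release the CLOB as soon as its value has been consumed,
        // even if the processing above throws.
        clob.Close();
    }
}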
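The same idea can be sketched without an Oracle connection: stream rows one at a time instead of calling DataTable.Load, and dispose each column value right after it has been handled. This is only an illustration under stated assumptions. StreamingReadSketch, StreamRows and the getValue hook are hypothetical names; with ODP.NET the hook could wrap GetOracleValue, and the sketch relies on LOB values such as OracleClob implementing IDisposable.

```csharp
using System;
using System.Data;

public static class StreamingReadSketch
{
    // Walks the reader row by row, passes every column value to 'handle',
    // and disposes the value immediately afterwards if it is IDisposable.
    // Returns the number of rows read.
    public static long StreamRows(
        IDataReader reader,
        Func<IDataReader, int, object> getValue,
        Action<int, object> handle)
    {
        long rows = 0;
        while (reader.Read())
        {
            for (int column = 0; column < reader.FieldCount; column++)
            {
                object value = getValue(reader, column);
                try
                {
                    handle(column, value);
                }
                finally
                {
                    // Release LOB-like values right away so their buffers
                    // do not accumulate across a million rows.
                    (value as IDisposable)?.Dispose();
                }
            }
            rows++;
        }
        return rows;
    }
}
```

Only one row's values are alive at a time here, whereas DataTable.Load keeps every row in memory until the whole result set has been read.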

For more quality content, see my GitHub: dotnetfly