This series introduces memory monitoring schemes around the following aspects:

  • Number of FDs
  • Number of threads
  • Virtual memory
  • The Java heap
  • Native memory

The previous section covered FD monitoring and thread-count monitoring, and, from the perspective of the KOOM source code, explained in detail how to monitor Java and native thread stack memory leaks.

This section introduces how to monitor virtual memory and Java heap memory, and discusses two mainstream Java memory leak monitoring schemes based on the matrix and KOOM sources respectively.

Virtual memory

When an application allocates memory, what it gets is virtual memory; only when a page is actually written does a page fault occur and physical memory get allocated. The size of the virtual address space is limited by the CPU architecture and the kernel.

A 32-bit CPU architecture has a maximum address space of 4 GB, while the arm64 architecture has a maximum user address space of 512 GB. Virtual memory usage can be read from the VmSize field of /proc/<pid>/status, in the same way the native thread count was read in the previous section:

File(String.format("/proc/%s/status", Process.myPid())).forEachLine { line ->
    when {
        line.startsWith("VmSize") -> {
            Log.d("mjzuo", line)
        }
    }
}

For further analysis, read /proc/<pid>/smaps, which records every virtual memory mapping in the process. We can also pull the current smaps file with a command:

#[com.blog.a]: Package name, [22082]: process ID
adb shell "run-as com.blog.a cat /proc/22082/smaps" > smaps.txt

Because the raw content is hard to read, it is usually aggregated with a Python script. Here is part of the original content:

7d4a434000-7d4a435000 rw-p 00010000 fe:00 4380  /system/vendor/lib64/libqdMetaData.so
Size:                  4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
Referenced:            4 kB
Anonymous:             4 kB
AnonHugePages:         0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB
VmFlags: rd wr mr mw me ac 

This is the mapping of libqdMetaData.so. Size is the size of the allocated linear address space (virtual memory). Rss is the physical memory actually occupied, including shared pages. Pss also measures occupied physical memory, but shared pages are split proportionally among the processes sharing them. Rss breaks down as:

Rss = Shared_Clean + Shared_Dirty + Private_Clean + Private_Dirty
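
The aggregation script itself is not shown in this post; as a minimal sketch of the idea (in Kotlin rather than Python, run on a desktop JVM against the smaps.txt pulled above, with the function and variable names invented for illustration), the Pss of each mapping can be summed per path like this:

import java.io.File

// Aggregate Pss per mapped path from a locally pulled smaps dump (e.g. smaps.txt above).
fun summarizePss(smapsFile: File): Map<String, Long> {
    // Mapping headers look like:
    // 7d4a434000-7d4a435000 rw-p 00010000 fe:00 4380  /system/vendor/lib64/libqdMetaData.so
    val headerRegex = Regex("""^[0-9a-f]+-[0-9a-f]+\s+\S{4}\s.*""")
    val pssByPath = mutableMapOf<String, Long>()
    var currentPath = "[anonymous]"
    smapsFile.forEachLine { line ->
        when {
            headerRegex.matches(line) -> {
                // The last column (the mapped path) is absent for anonymous mappings
                val fields = line.trim().split(Regex("\\s+"))
                currentPath = if (fields.size >= 6) fields.last() else "[anonymous]"
            }
            line.startsWith("Pss:") -> {
                // "Pss:                   4 kB" -> 4
                val kb = line.removePrefix("Pss:").trim().split(" ").first().toLong()
                pssByPath[currentPath] = (pssByPath[currentPath] ?: 0L) + kb
            }
        }
    }
    return pssByPath
}

fun main() {
    summarizePss(File("smaps.txt"))
        .entries.sortedByDescending { it.value }
        .take(20)
        .forEach { (path, pssKb) -> println("$pssKb kB\t$path") }
}

Sorting the result by Pss quickly shows which so libraries, dex files, or anonymous regions hold the most physical memory.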

Monitoring

Take matrix as an example: it starts a background thread that periodically checks whether VmSize exceeds a threshold; the default check interval is 3 minutes.

if (vmSize > 4L * 1024 * 1024 * 1024 * mCriticalVmSizeRatio) // alarm when VmSize exceeds this ratio of the 4 GB 32-bit address space
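
As a rough, self-contained sketch of this kind of periodic check (the VmSizeMonitor object, the 0.9 ratio and the use of a ScheduledExecutorService are assumptions for illustration, not matrix's actual implementation):

import android.os.Process
import java.io.File
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Poll VmSize on a background thread and report when it crosses a ratio of the
// 4 GB 32-bit address space. The 3-minute interval mirrors the default described above.
object VmSizeMonitor {
    private const val CRITICAL_VM_SIZE_RATIO = 0.9f // assumed threshold

    fun start(onExceeded: (vmSizeKb: Long) -> Unit) {
        Executors.newSingleThreadScheduledExecutor().scheduleWithFixedDelay({
            val vmSizeKb = readVmSizeKb()
            if (vmSizeKb * 1024L > 4L * 1024 * 1024 * 1024 * CRITICAL_VM_SIZE_RATIO) {
                onExceeded(vmSizeKb)
            }
        }, 0L, 3L, TimeUnit.MINUTES)
    }

    // Parse "VmSize:  1234567 kB" from /proc/<pid>/status
    private fun readVmSizeKb(): Long =
        File("/proc/${Process.myPid()}/status").useLines { lines ->
            lines.firstOrNull { it.startsWith("VmSize") }
                ?.split(Regex("\\s+"))
                ?.getOrNull(1)
                ?.toLongOrNull() ?: 0L
        }
}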

Heap memory

The Java heap size is assigned to the application by the system; a larger limit can be requested by setting android:largeHeap="true" on the <application> element of the AndroidManifest. In addition, some heap statistics can be read directly through the Runtime API, which helps troubleshoot problems alongside memory snapshots:

javaHeap.max = Runtime.getRuntime().maxMemory()
javaHeap.total = Runtime.getRuntime().totalMemory()
javaHeap.free = Runtime.getRuntime().freeMemory()
javaHeap.used = javaHeap.total - javaHeap.free
javaHeap.rate = 1.0f * javaHeap.used / javaHeap.max

The virtual machine side has also been covered in earlier posts: how classes are loaded, how objects are allocated and reclaimed, and how methods are invoked in the JVM.

  • The Class loading process and the Class loader

  • The life and death of an object in the JVM

  • How are methods invoked within the JVM

Monitoring

Java heap memory is allocated and reclaimed by the virtual machine; our concern is avoiding memory leaks. Two monitoring schemes are introduced below:

Plan 1

Principle: in Activity onDestroy, wrap the Activity in a weak reference, add it to a queue, and create a sentinel object; then manually trigger a GC. Once the sentinel object has been reclaimed, check whether the Activities in the queue have been reclaimed as well; any that survive are leak candidates, and finally a memory snapshot is dumped.

The sentinel object is needed because manually requesting a GC does not guarantee that the JVM collects immediately; the timing is controlled by the virtual machine, which performs garbage collection when it sees fit.
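
Before walking through the matrix code, here is a condensed, self-contained sketch of the sentinel idea (the class and method names are invented for illustration; matrix's real implementation, shown below, runs this on a dedicated HandlerThread with retries):

import android.app.Activity
import java.lang.ref.WeakReference

// Destroyed Activities are held in weak references together with a sentinel object;
// once the sentinel has been collected we know a GC really happened, so anything
// still reachable at that point is a leak candidate.
class SimpleActivityLeakChecker {
    private val destroyed = mutableListOf<WeakReference<Activity>>()

    fun onActivityDestroyed(activity: Activity) {
        destroyed.add(WeakReference(activity))
    }

    // Call from a background thread a little while after onDestroy.
    fun detectLeaks(): List<Activity> {
        val sentinel = WeakReference(Any()) // only weakly reachable
        Runtime.getRuntime().gc()           // request a GC; the VM decides when to run it
        if (sentinel.get() != null) {
            return emptyList()              // no GC observed yet, retry later
        }
        // A GC definitely ran: drop reclaimed entries, report the survivors.
        destroyed.removeAll { it.get() == null }
        return destroyed.mapNotNull { it.get() }
    }
}

If detectLeaks() keeps reporting the same Activities across several GC-confirmed checks, a heap snapshot can then be dumped for analysis.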

Take the matrix code as an example:

@Override
public void init(Application app, PluginListener listener) {
    super.init(app, listener);
    ... // Initialize the listener and create the HandlerThread("matrix_res")
    // Set the dump mode to DumpMode.MANUAL_DUMP
    mWatcher = new ActivityRefWatcher(app, this);
}

@Override
public void start() {
    super.start();
    ...
    mWatcher.start(); // Start monitoring
}

// ActivityRefWatcher#start
@Override
public void start() {
    stopDetect();
    final Application app = mResourcePlugin.getApplication();
    if (app != null) {
        // Listen for the Activity lifecycle
        app.registerActivityLifecycleCallbacks(mRemovedActivityMonitor);
        // scheduleDetectProcedure() runs RetryableTask#execute() on the matrix_res thread
        // and keeps polling while the returned status == RetryableTask.Status.RETRY
        scheduleDetectProcedure();
    }
}

The polling interval is updated when the app switches between foreground and background:

public void onForeground(boolean isForeground) {
    if (isForeground) {
        // Foreground polling interval: 1 minute
        mDetectExecutor.setDelayMillis(mFgScanTimes);
        ... // Stop the current task, reset the timed check and clear failedAttempts
    } else {
        // Background polling interval: 20 minutes
        mDetectExecutor.setDelayMillis(mBgScanTimes);
    }
}

RetryableTask#execute():

// Block the current thread if no Activity has been destroyed yet
if (mDestroyedActivityInfos.isEmpty()) {
    synchronized (mDestroyedActivityInfos) {
        try {
            while (mDestroyedActivityInfos.isEmpty()) {
                // Block and release the lock
                mDestroyedActivityInfos.wait();
            }
        } ...
    }
    // Return RETRY so the mHandlerThread polls again periodically
    return Status.RETRY;
}

onActivityDestroyed:

@Override
public void onActivityDestroyed(Activity activity) {
    // Record the destroyed Activity
    pushDestroyedActivityInfo(activity);
    mHandler.postDelayed(new Runnable() {
        @Override
        public void run() {
            triggerGc(); // Manually trigger a GC two seconds later
        }
    }, 2000);
}

pushDestroyedActivityInfo(Activity):

private void pushDestroyedActivityInfo(Activity activity) {
    ...
    // Wrap the destroyed Activity's info and add it to the ConcurrentLinkedQueue
    mDestroyedActivityInfos.add(destroyedActivityInfo);
    // Wake up the matrix_res thread
    synchronized (mDestroyedActivityInfos) {
        mDestroyedActivityInfos.notifyAll();
    }
}

RetryableTask#execute(): the old version uses a sentinel object to determine whether a GC has actually run:

// This is the sentinel object that checks whether the JVM has performed GC
final WeakReference<Object[]> sentinelRef = new WeakReference<>(new Object[1024 * 1024]); // alloc big object
triggerGc(); // Manually trigger gc
if (sentinelRef.get() != null) {
    return Status.RETRY;
}

RetryableTask#execute(): The new version calls the GC method three times directly:

triggerGc(); // Manually trigger a GC, then sleep 1s
triggerGc();
triggerGc();

RetryableTask#execute():

final Iterator<DestroyedActivityInfo> infoIt = mDestroyedActivityInfos.iterator();
// Iterate through the onDestroy collection
while (infoIt.hasNext()) {
    final DestroyedActivityInfo destroyedActivityInfo = infoIt.next();
    // Manually trigger gc
    triggerGc();
    // If a weak reference to the activity is missing, then it has been reclaimed and no leakage has occurred
    if (destroyedActivityInfo.mActivityRef.get() == null) {
        infoIt.remove();
        continue;
    }

    ++destroyedActivityInfo.mDetectedCount;
    // Re-check up to mMaxRedetectTimes; only after that many failed checks is the Activity treated as leaked
    if (destroyedActivityInfo.mDetectedCount < mMaxRedetectTimes
            && !mResourcePlugin.getConfig().getDetectDebugger()) {
        // Log destroyedActivityInfo.mKey and destroyedActivityInfo.mDetectedCount
        triggerGc();
        continue;
    }
    ...
    triggerGc();
    // Execute the leak processor: in MANUAL_DUMP mode ManualDumpProcessor#process shows a foreground Notification;
    // other modes dump the memory snapshot and parse the hprof information
    if (mLeakProcessor.process(destroyedActivityInfo)) {
        infoIt.remove();
    }
}

Advantage: leak detection is relatively accurate.

Disadvantage: manually triggering GC has a performance cost.

Plan 2

Principle: periodically check whether current heap usage has reached a threshold; if it stays at or above the threshold for several consecutive checks, trigger a memory dump.
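
A condensed sketch of this idea (the SimpleHeapMonitor name, the 0.8 threshold and the 15-second interval are assumptions for illustration, not KOOM's code) might look like this:

import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Poll the Java heap usage ratio; trigger a callback once it stays above the
// threshold for several consecutive checks (where a dump/report would happen).
class SimpleHeapMonitor(
    private val threshold: Float = 0.8f,   // assumed threshold
    private val maxOverCount: Int = 3,     // consecutive hits required
    private val intervalSeconds: Long = 15L,
    private val onOverThreshold: (ratio: Float) -> Unit
) {
    private var overCount = 0

    fun start() {
        Executors.newSingleThreadScheduledExecutor().scheduleWithFixedDelay({
            val runtime = Runtime.getRuntime()
            val used = runtime.totalMemory() - runtime.freeMemory()
            val ratio = 1.0f * used / runtime.maxMemory()
            overCount = if (ratio > threshold) overCount + 1 else 0
            if (overCount >= maxOverCount) {
                overCount = 0
                onOverThreshold(ratio) // e.g. dump the hprof here
            }
        }, 0L, intervalSeconds, TimeUnit.SECONDS)
    }
}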

The following takes KOOM as an example. The principle is the same, though the exact criteria differ slightly; the code is as follows:

override fun startLoop(clearQueue: Boolean, postAtFront: Boolean, delayMillis: Long) {
  ...
  // After initialization, mLoopRunnable starts polling on the loop thread once every 15 seconds
  super.startLoop(clearQueue, postAtFront, delayMillis)
  getLoopHandler().postDelayed({ async { processOldHprofFile() } }, delayMillis)
}

LoopMonitor#startLoop:

open fun startLoop(
    clearQueue: Boolean = true,
    postAtFront: Boolean = false,
    delayMillis: Long = 0L
) {
    ...
    getLoopHandler().postDelayed(mLoopRunnable, delayMillis)
    ...
}

private val mLoopRunnable = object : Runnable {
  override fun run() {
    // Execute call() and use its return value to decide whether to stop polling
    if (call() == LoopState.Terminate) {
      return
    }
    ...
    getLoopHandler().removeCallbacks(this)
    // Poll again after the loop interval
    // OOMMonitor overrides getLoopInterval() to return OOMMonitorConfig.mLoopInterval, 15s by default
    getLoopHandler().postDelayed(this, getLoopInterval())
  }
}

OOMMonitor#call:

override fun call(): LoopState {
  // Only Android 5.0 (LOLLIPOP) through Android 12 (S) is supported
  if (Build.VERSION.SDK_INT < Build.VERSION_CODES.LOLLIPOP
    || Build.VERSION.SDK_INT > Build.VERSION_CODES.S
  ) {
    return LoopState.Terminate
  }
  ...
  return trackOOM() // Check heap memory
}

OOMMonitor#trackOOM:

private fun trackOOM(): LoopState {
  SystemInfo.refresh() // Get the current heap memory

  mTrackReasons.clear() // Clear the track collection
  for (oomTracker in mOOMTrackers) {
    if (oomTracker.track()) { // For HeapOOMTracker, the reason is recorded when the heap-leak condition is met
      mTrackReasons.add(oomTracker.reason())
    }
  }

  if (mTrackReasons.isNotEmpty() && monitorConfig.enableHprofDumpAnalysis) {
    ... // Dump the hprof information
    return LoopState.Terminate // Stop polling
  }
  return LoopState.Continue // Continue polling
}

HeapOOMTracker#track:

override fun track(): Boolean {
  val heapRatio = SystemInfo.javaHeap.rate // Current heap memory usage ratio

  // Leak condition: the threshold is exceeded on consecutive checks, and each reading
  // is no lower than (previous reading - 5%), i.e. usage stays high instead of being reclaimed
  if (heapRatio > monitorConfig.heapThreshold
      && heapRatio >= mLastHeapRatio - HEAP_RATIO_THRESHOLD_GAP) {
    mOverThresholdCount++
  } else {
    // If the conditions are not met, clear the Count and Ratio and re-count the statistics
    reset() 
  }
  mLastHeapRatio = heapRatio
  // maxOverThresholdCount = 3 times by default
  return mOverThresholdCount >= monitorConfig.maxOverThresholdCount
}

Advantage: little performance impact, so the user experience is barely affected.

Disadvantage: leak detection is less accurate and false positives are possible.

The next section introduces native memory monitoring from three perspectives:

  1. Monitoring large memory allocations from so libraries.

  2. Monitoring large bitmap allocations.

  3. Native memory leak monitoring.

That's all for this section.