• How to define the occurrence of lag:
If an app's average FPS is below 30 and its minimum FPS is below 24, the app is considered to be lagging.
  • Lag is hard to reproduce offline and is strongly tied to the scenario, so we need lag monitoring that collects on-site information.
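
The rule above can be written down as a small check. This is only a sketch in plain Java: the class name `LagRule`, the method `isLagging`, and the idea of feeding it a window of FPS samples are assumptions made for illustration. Only the thresholds 30 and 24 come from the text.

```java
import java.util.Arrays;

class LagRule {
    // Thresholds taken from the definition above.
    static final double MIN_AVERAGE_FPS = 30.0;
    static final double MIN_LOWEST_FPS = 24.0;

    /** Returns true when both the average and the minimum FPS fall below their thresholds. */
    static boolean isLagging(double[] fpsSamples) {
        if (fpsSamples.length == 0) {
            return false; // no data, nothing to report
        }
        double average = Arrays.stream(fpsSamples).average().orElse(0.0);
        double lowest = Arrays.stream(fpsSamples).min().orElse(0.0);
        return average < MIN_AVERAGE_FPS && lowest < MIN_LOWEST_FPS;
    }

    public static void main(String[] args) {
        // Hypothetical FPS samples collected over one monitoring window.
        System.out.println(isLagging(new double[] {28, 22, 31})); // true: average 27, minimum 22
        System.out.println(isLagging(new double[] {60, 58, 20})); // false: average 46
    }
}
```

Both conditions must hold, so a single slow sample inside an otherwise smooth window is not reported as lag.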

CPU Knowledge

  • The latest mainstream devices use multi-tier, energy-efficient CPU architectures (multi-core layered architectures).
  • From CPU to GPU to the AI-oriented NPU, the overall performance of mobile chips has leapt forward, so we can take full advantage of on-device computing power to reduce high server costs.
  • To evaluate a CPU's performance, look at parameters such as clock frequency, core count, and cache. Concretely, these show up as computing capability and instruction execution capability, that is, the number of floating-point operations and instructions executed per second.
  • Lag has many causes (code, memory, drawing, I/O, CPU, and so on), but they are all eventually reflected in CPU time, which can be divided into user time and system time.
    • User time: time spent executing user-mode application code;
    • System time: time spent executing system calls in kernel mode, including I/O, locks, interrupts, and other system calls;
  • Common commands:
    adb shell
    // Get the number of CPU cores
    cat /sys/devices/system/cpu/possible
    // Get the maximum frequency of the first CPU
    cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq
    // Get the minimum frequency of the second CPU
    cat /sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_min_freq
    // CPU usage of the whole system
    cat /proc/stat

    The top command shows which processes are the largest CPU consumers.
    The vmstat command dynamically monitors the operating system's virtual memory and CPU activity in real time.
    The strace command traces all system calls made by a process.
    The vmstat command or the /proc/[pid]/schedstat file shows the number of CPU context switches.

    /proc/[pid]/stat             // CPU usage of the process
    /proc/[pid]/task/[tid]/stat  // CPU usage of each thread in the process
    /proc/[pid]/sched            // CPU scheduling information for the process
    /proc/loadavg                // system load average; the file behind the uptime command
  • Check whether any high-priority thread is waiting on a low-priority thread, for example the main thread waiting for a lock held by a background thread.
  • Three types of CPU-related problems:
    1. Wasted CPU resources:
      • Inefficient algorithms
      • No caching
      • Calculations that use the wrong primitive type (e.g. using long when int is sufficient, which makes the computational load 4 times higher)
    2. CPU resource contention:
      • Competing for the main thread's CPU resources
      • Competing for the CPU resources of audio and video:
        • Audio and video codecs consume a lot of CPU themselves, and decoding speed is a hard requirement; if it is not met, playback may stutter;
        • There are two ways to optimize:
          1. Eliminate CPU consumption by non-core business as far as possible.
          2. Optimize the codec's own cost and shift CPU load to the GPU, for example by using RenderScript to process image data in video.
      • Everyone competes as equals and nobody gets enough, as in the proverb "three monks have no water to drink".
    3. Low CPU utilization:
      • Causes include disk and network I/O, lock operations, sleep, and so on. Lock optimization usually means narrowing the scope of the lock as much as possible.
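
The user time and system time mentioned above can be read from a line in the /proc/[pid]/stat format, where utime is field 14 and stime is field 15, both in clock ticks. The following is a minimal plain-Java sketch: the class `ProcStatSample`, its methods, the sample line, and the tick rate of 100 per second are assumptions for illustration. On a device the line would come from reading the file itself.

```java
class ProcStatSample {
    final long utimeTicks; // time spent in user mode, in clock ticks
    final long stimeTicks; // time spent in kernel mode, in clock ticks

    ProcStatSample(long utimeTicks, long stimeTicks) {
        this.utimeTicks = utimeTicks;
        this.stimeTicks = stimeTicks;
    }

    /** Parses one line in the /proc/[pid]/stat format. */
    static ProcStatSample parse(String statLine) {
        // Field 2 (the process name) is wrapped in parentheses and may contain
        // spaces, so only the text after the last ')' is split on whitespace.
        int nameEnd = statLine.lastIndexOf(')');
        if (nameEnd < 0) {
            throw new IllegalArgumentException("not a stat line: " + statLine);
        }
        String[] rest = statLine.substring(nameEnd + 1).trim().split("\\s+");
        // rest[0] is field 3 (state), so field 14 is rest[11] and field 15 is rest[12].
        if (rest.length < 13) {
            throw new IllegalArgumentException("stat line too short: " + statLine);
        }
        return new ProcStatSample(Long.parseLong(rest[11]), Long.parseLong(rest[12]));
    }

    /** Converts clock ticks to milliseconds for the given tick rate. */
    static long ticksToMillis(long ticks, int ticksPerSecond) {
        return ticks * 1000 / ticksPerSecond;
    }

    /** Share of the CPU time that was spent in kernel mode, between 0 and 1. */
    double systemShare() {
        long total = utimeTicks + stimeTicks;
        return total == 0 ? 0.0 : (double) stimeTicks / total;
    }

    public static void main(String[] args) {
        // Hypothetical sample line; the process name deliberately contains a space.
        String line = "1234 (my app) S 1 1234 1234 0 -1 4194304 500 0 0 0 218 95 0 0 20 0 10 0 100 0 0";
        ProcStatSample sample = parse(line);
        System.out.println("utime ms: " + ticksToMillis(sample.utimeTicks, 100));
        System.out.println("stime ms: " + ticksToMillis(sample.stimeTicks, 100));
        System.out.println("system share: " + sample.systemShare());
    }
}
```

Sampling the same file twice and subtracting the two readings gives the CPU time consumed in between. A high system share points toward I/O, locks, or other system calls rather than application code.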

Lag troubleshooting tools

  • Traceview and Systrace are both familiar tools for troubleshooting lag. In terms of implementation they fall into two schools:
    • Instrument: records every function call within a period of time; by analyzing those calls you can identify what to optimize.
    • Sample: observes some function calls selectively or by sampling; from this limited information you infer the suspicious points and then refine the analysis;
  1. Traceview

    • Type: instrument;
    • Principle: Uses the Android Runtime's function-call events to write each function's elapsed time and call relationships to a trace file.
    • Features:
      • It shows which functions are called over the whole process, but the tool's own performance overhead is so large that the results sometimes do not reflect the real situation;
      • Since Android 5.0, the startMethodTracingSampling method makes sample-based analysis possible, which reduces the impact of analysis on runtime performance.

      With the sample type added, we need to trade off overhead against information richness.

      • Neither mode of Traceview supports release builds well; for example, it cannot deobfuscate.
  2. Nanoscope

    • Type: instrument
    • Principle: Modifies the Android virtual machine source code directly, adding instrumentation at the entry and exit of ArtMethod execution, writing all information to memory first, and generating the result file in one pass after the trace ends;
    • Features: Low performance overhead, suitable for automated analysis of startup time, but generating the result file after the trace ends takes a long time. It can also analyze any application, so it can be used for competitor analysis. It has some limitations, however:
      • You need to flash your own ROM, and currently only the Nexus 6P or its x86 emulator is supported;
      • By default only the main thread is collected; other threads must be enabled manually.

      Given memory size limits, the in-memory array can only hold about 20 seconds of data per thread.

      • We can run automated startup tests regularly every day to see whether any new time-consuming points appear.
  3. systrace

    • Type: sample
    • A performance analysis tool added in Android 4.1. I usually use Systrace to track I/O operations, CPU load, Surface rendering, GC, and so on.
    • Features: It only monitors the time of specific system calls and has low overhead, but it does not support timing analysis of application code. However, the system exposes the Trace.beginSection interface for tracing application code, so we can add application timing on top of systrace by instrumenting every function at compile time.
  4. Simpleperf

    • Type: sample
    • If we want to analyze Native function calls, none of the tools above will do. Android 5.0 added Simpleperf for this kind of performance analysis.
    • It uses the hardware perf events provided by the CPU's performance monitoring unit (PMU), so it can show the time of all Native code, and it also wraps systrace's monitoring functionality.
    • Android Studio 3.2 also supports Simpleperf directly in its Profiler.
  • Summary
    • To analyze the time spent in Native code, choose Simpleperf;
    • To analyze system calls, choose Systrace;
    • To analyze the execution time of the entire program, choose Traceview or the instrumented version of Systrace.
  1. Visualization method
  • Android Studio 3.2 integrates several performance profiling tools directly into its Profiler:
    • Sample Java Methods: similar to Traceview's sample type
    • Trace Java Methods: similar to Traceview's instrument type
    • Trace System Calls: similar to Systrace
    • Sample Native (API level 26+): similar to Simpleperf
  • Although not as comprehensive or powerful as the standalone tools, it significantly lowers the barrier to entry for developers.
  • Presentation of results: these tools support Call Chart and Flame Chart views
    • Call Chart is the default view in Traceview and Systrace. It shows the application's function calls in execution order, which suits analyzing the calls of the entire process.
    • Flame Chart, also known as a flame graph, gives a global view of how calls are distributed over a period of time, combining the time and space dimensions in a single chart.
  1. StrictMode
    • StrictMode is a tool class introduced in Android 2.3. It is a runtime detection mechanism provided by Android that helps developers detect non-compliant code.
    • It mainly detects two kinds of problems:
      1. Thread policy: detects custom slow calls, disk read and write operations, network requests, etc.
      2. VM policy: detects Activity leaks, SQLite object leaks, and class instance counts.
    • Usage: configure StrictMode in the Application's onCreate method, then filter Logcat by StrictMode to see the corresponding logs.
    private void initStrictMode() {
        // 1. Enable StrictMode only in the offline (debug) environment
        if (BuildConfig.DEBUG) {
            // 2. Set the thread policy
            StrictMode.setThreadPolicy(new StrictMode.ThreadPolicy.Builder()
                    .detectCustomSlowCalls() // API level 11; mark custom slow code with StrictMode.noteSlowCall
                    .detectDiskReads()
                    .detectDiskWrites()
                    .detectNetwork()
                    // or .detectAll() to detect every kind of problem
                    .penaltyLog() // print violation information to Logcat
                    // .penaltyDialog() // or show a warning dialog
                    // .penaltyDeath() // or crash directly
                    .build());
            // 3. Set the VM policy
            StrictMode.setVmPolicy(new StrictMode.VmPolicy.Builder()
                    .detectLeakedSqlLiteObjects()
                    // Limit the Person class to 1 instance
                    .setClassInstanceLimit(Person.class, 1)
                    .detectLeakedClosableObjects() // API level 11
                    .penaltyLog()
                    .build());
        }
    }

Lag monitoring

  1. Message queue
    • Method 1: Replace the Looper's Printer;
      1. First, call Looper.getMainLooper().setMessageLogging() to install our own Printer implementation. Looper then calls our Printer before and after each message is executed.
      2. When the Printer matches ">>>>> Dispatching to", we schedule a task on a child thread to run after the specified time threshold. The task captures the main thread's current stack and some scene information, for example memory size, device information, and network status.
      3. If "<<<<< Finished to" is matched within the threshold, the message has finished executing and no lag occurred, so we cancel the child-thread task.
    • Method 2: A monitoring thread inserts an empty message at the head of the main thread's message queue every second. To monitor a 3-second lag, we conclude that the main thread has been stuck for more than 3 seconds if the head message has still not been consumed at the fourth poll;
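
Method 1 can be sketched in plain Java so that it runs off-device. The class `BlockDetector`, its `dispatch` helper (which stands in for the Looper dispatching one message), and the 100 ms threshold are all assumptions for illustration. On Android, an `android.util.Printer` with the same `println(String)` shape would be passed to `Looper.getMainLooper().setMessageLogging()`.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class BlockDetector {
    private final Thread monitored;   // the thread whose messages we watch
    private final long thresholdMs;   // a message slower than this counts as a block
    private final List<StackTraceElement[]> capturedStacks = new CopyOnWriteArrayList<>();
    private final ScheduledExecutorService sampler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "block-sampler");
                t.setDaemon(true);
                return t;
            });
    private ScheduledFuture<?> pendingDump; // only touched from the monitored thread

    BlockDetector(Thread monitored, long thresholdMs) {
        this.monitored = monitored;
        this.thresholdMs = thresholdMs;
    }

    /** Same shape as android.util.Printer#println; called before and after each message. */
    void println(String line) {
        if (line.startsWith(">>>>> Dispatching")) {
            // Schedule a stack capture that fires only if the message is still running.
            pendingDump = sampler.schedule(
                    () -> { capturedStacks.add(monitored.getStackTrace()); },
                    thresholdMs, TimeUnit.MILLISECONDS);
        } else if (line.startsWith("<<<<< Finished") && pendingDump != null) {
            // The message finished; cancel the capture if it has not fired yet.
            pendingDump.cancel(false);
            pendingDump = null;
        }
    }

    int blockCount() {
        return capturedStacks.size();
    }

    void shutdown() {
        sampler.shutdownNow();
    }

    /** Stand-in for the Looper dispatching one message on the monitored thread. */
    static void dispatch(BlockDetector detector, Runnable message) {
        detector.println(">>>>> Dispatching to " + message);
        message.run();
        detector.println("<<<<< Finished to " + message);
    }

    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        BlockDetector detector = new BlockDetector(Thread.currentThread(), 100);
        dispatch(detector, () -> { });               // fast message: capture is cancelled
        dispatch(detector, () -> sleepQuietly(400)); // slow message: stack is captured
        System.out.println("blocks detected: " + detector.blockCount());
        detector.shutdown();
    }
}
```

The stack is captured while the slow message is still running, which is what makes this approach cheap. The cost is that only the function executing at the threshold moment is seen.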
  2. instrumentation
    • Message-queue-based lag monitoring is not precise: the function running when the stack is captured may not be the one that actually took the time.
    Suppose functions A, B, and C run in sequence within one message. When the message as a whole exceeds 3 seconds, A and B have already finished, so we can only capture the stack of function C, which is executing at that moment and may not actually be slow. With a large volume of online data, however, A and B are more likely to be captured because they take longer, so backend aggregation will still surface more logs for A and B. If, like Traceview, we could get the elapsed time of every function in the whole process, we could see clearly that A and B are the main causes of the lag. Could we build a custom Traceview++ on top of the Android Runtime's function-call callback events?
    • The Inline Hook technique would be required for that. We can also implement a write-to-memory scheme like Nanoscope. Two points need attention:
      • Avoid an explosion in the method count
      • Filter out simple functions
    • Reference implementation: WeChat Matrix
    • Although the performance impact of the instrumentation scheme is generally acceptable, it is only used in gray-release builds;
    • Weakness: It can only monitor the time of the application's own functions, not calls into system functions, so the stack looks as if parts are "missing".
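
The write-to-memory idea can be illustrated with a small buffer that packs each method entry or exit into one long. This sketch is not the code of Matrix or Nanoscope: the class `MethodTraceBuffer`, the bit layout (1 flag bit, 20 bits of method id, 43 bits of time), and the explicit timestamps are assumptions chosen for illustration. In a real build the `enter` and `exit` calls would be inserted by a compile-time instrumentation tool.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class MethodTraceBuffer {
    private static final int TIME_BITS = 43;
    private static final long TIME_MASK = (1L << TIME_BITS) - 1;
    private static final long ID_MASK = (1L << 20) - 1;
    private static final long ENTER_FLAG = 1L << 63;

    private final long[] buffer; // preallocated, so recording never allocates
    private int size;

    MethodTraceBuffer(int capacity) {
        buffer = new long[capacity];
    }

    void enter(int methodId, long timeMs) {
        write(ENTER_FLAG, methodId, timeMs);
    }

    void exit(int methodId, long timeMs) {
        write(0L, methodId, timeMs);
    }

    private void write(long flag, int methodId, long timeMs) {
        if (size == buffer.length) {
            return; // buffer full: drop the event rather than grow
        }
        buffer[size++] = flag | ((methodId & ID_MASK) << TIME_BITS) | (timeMs & TIME_MASK);
    }

    /** Sums the inclusive time of each method, assuming enter/exit events are properly nested. */
    Map<Integer, Long> inclusiveCosts() {
        Map<Integer, Long> costs = new HashMap<>();
        Deque<Long> open = new ArrayDeque<>();
        for (int i = 0; i < size; i++) {
            long entry = buffer[i];
            if ((entry & ENTER_FLAG) != 0) {
                open.push(entry);
                continue;
            }
            if (open.isEmpty()) {
                continue; // exit without a recorded enter
            }
            long start = open.pop();
            int methodId = (int) ((start >>> TIME_BITS) & ID_MASK);
            long cost = (entry & TIME_MASK) - (start & TIME_MASK);
            costs.merge(methodId, cost, Long::sum);
        }
        return costs;
    }

    public static void main(String[] args) {
        MethodTraceBuffer trace = new MethodTraceBuffer(1024);
        // Hypothetical call sequence: method 1 calls method 2.
        trace.enter(1, 0);
        trace.enter(2, 10);
        trace.exit(2, 250);
        trace.exit(1, 300);
        System.out.println(trace.inclusiveCosts());
    }
}
```

Assigning each method a small integer id at compile time keeps every event to eight bytes. Skipping trivial functions keeps both the id space and the buffer from filling up too quickly.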
  3. Profilo
    • Its stack capture draws on the idea of the JVM's AsyncGetCallTrace, adapted to the Android Runtime implementation
    • A Facebook open-source library that combines the strengths of each of the approaches above
      1. Integrates atrace functionality
        ftrace writes all performance instrumentation data to a kernel buffer through the trace_marker file. Profilo intercepts these writes with a PLT hook and selects the events of interest for analysis. This makes every systrace probe available, such as the lifecycle of the four major components, lock wait time, class verification, and GC time.
      2. Gets the Java stack quickly
        • Getting the stack is normally very expensive because it suspends the main thread. Profilo's implementation is very clever: it gets the Java stack quickly in a way similar to Native crash capture, by sending the SIGPROF signal at intervals.

  4. AndroidPerformanceMonitor
    • A non-intrusive performance monitoring component that reports lag through notifications.
    • Advantages: non-intrusive, convenient, and accurate; it can pinpoint the offending line of code.
    • Usage:
    // 1. Add the dependency in build.gradle
    api 'com.github.markzhai:blockcanary-android:1.5.0'
    // To enable BlockCanary lag monitoring and prompts only in debug builds, use:
    debugApi 'com.github.markzhai:blockcanary-android:1.5.0'

    // 2. Enable lag monitoring in the Application's onCreate method
    BlockCanary.install(this, new AppBlockCanaryContext()).start();

    /**
     * @author: LiuJinYang
     * @createdate: 2020/12/9
     */
    public class AppBlockCanaryContext extends BlockCanaryContext {
        // Implements the various context values: application qualifier, user uid,
        // network type, block threshold, and so on.

        /**
         * Provides the application qualifier.
         *
         * @return a qualifier that identifies this installation
         */
        @Override
        public String provideQualifier() {
            return "unknown";
        }

        /** Provides the user uid. */
        @Override
        public String provideUid() {
            return "uid";
        }

        /**
         * Provides the current network type.
         *
         * @return {@link String} like 2G, 3G, 4G, wifi, etc.
         */
        @Override
        public String provideNetworkType() {
            return "unknown";
        }

        /**
         * Configures the monitoring duration; after it elapses BlockCanary stops.
         * Use with {@code BlockCanary}'s isMonitorDurationEnd.
         *
         * @return monitor last duration (in hour)
         */
        @Override
        public int provideMonitorDuration() {
            return -1;
        }

        /**
         * Specifies the block threshold.
         *
         * @return threshold in millis
         */
        @Override
        public int provideBlockThreshold() {
            return 1000;
        }

        /**
         * Sets the thread stack dump interval used when a block occurs. BlockCanary
         * dumps the main thread's stack according to the current cycle.
         *
         * @return dump interval (in millis)
         */
        @Override
        public int provideDumpInterval() {
            return provideBlockThreshold();
        }

        /** Path for saving logs, such as "/blockcanary/", if permission allows. */
        @Override
        public String providePath() {
            return "/blockcanary/";
        }

        /**
         * Whether to notify the user when a block occurs.
         *
         * @return true if need, else if not need.
         */
        @Override
        public boolean displayNotification() {
            return true;
        }

        /**
         * Compresses files into a zip file.
         *
         * @param src  files before compress
         * @param dest files compressed
         * @return true if compression is successful
         */
        @Override
        public boolean zip(File[] src, File dest) {
            return false;
        }

        /** Uploads the zipped log file. */
        @Override
        public void upload(File zippedFile) {
            throw new UnsupportedOperationException();
        }

        /**
         * Sets the packages of concern; by default the process name is used.
         *
         * @return null if simply concern only package with process name.
         */
        @Override
        public List<String> concernPackages() {
            return null;
        }

        /**
         * Filters stack information by the packages given in {@code concernPackages}.
         *
         * @return true if filter, false it not.
         */
        @Override
        public boolean filterNonConcernStack() {
            return false;
        }

        /**
         * Specifies a white list.
         *
         * @return return null if you don't need white-list filter.
         */
        @Override
        public List<String> provideWhiteList() {
            LinkedList<String> whiteList = new LinkedList<>();
            whiteList.add("org.chromium");
            return whiteList;
        }

        /**
         * Whether to delete files whose stack is in the white list.
         *
         * @return true if delete, false it not.
         */
        @Override
        public boolean deleteFilesInWhiteList() {
            return true;
        }

        /** Block interceptor callback. */
        @Override
        public void onBlock(Context context, BlockInfo blockInfo) {
        }
    }
Other monitoring
  • Besides the main thread taking too long, what other lag problems should we watch for?
  • Android Vitals is Google Play's official performance monitoring service. It has three lag-related metrics: ANR, startup, and frame rate.
  1. Frame rate
    • The industry uses Choreographer to monitor an application's frame rate;
    • Collect statistics only while the interface is being drawn:
      getWindow().getDecorView().getViewTreeObserver().addOnDrawListener
    • Average frame rate: measures how smooth the interface is;
    • Frozen frame rate: the proportion of total time spent in frozen frames;
      • Frozen frame: Android Vitals defines a frozen frame as one that takes more than 700 milliseconds to draw, which means 42 or more frames dropped in a row.
      • When frames are dropped, we can capture the current page, View information, and operation path and report them to the backend, which makes later investigation easier.
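
The frame-rate metrics above reduce to simple arithmetic over frame durations. The sketch below is plain Java under stated assumptions: the class `FrameStats`, its method names, and the 60 Hz refresh rate are illustrative. Only the 700 ms frozen-frame threshold comes from the text. On a device the durations would be derived from successive Choreographer frame callbacks.

```java
class FrameStats {
    static final int REFRESH_RATE_HZ = 60;       // assumed display refresh rate
    static final long FROZEN_THRESHOLD_MS = 700; // frozen-frame threshold from the text

    /** Number of whole refresh intervals that elapse while one frame is being drawn. */
    static long droppedFrames(long frameDurationMs) {
        return frameDurationMs * REFRESH_RATE_HZ / 1000;
    }

    /** Average frames per second over the given frame durations. */
    static double averageFps(long[] frameDurationsMs) {
        long total = 0;
        for (long duration : frameDurationsMs) {
            total += duration;
        }
        return total == 0 ? 0.0 : frameDurationsMs.length * 1000.0 / total;
    }

    /** Proportion of the total time that was spent inside frozen frames. */
    static double frozenTimeRatio(long[] frameDurationsMs) {
        long total = 0;
        long frozen = 0;
        for (long duration : frameDurationsMs) {
            total += duration;
            if (duration > FROZEN_THRESHOLD_MS) {
                frozen += duration;
            }
        }
        return total == 0 ? 0.0 : (double) frozen / total;
    }

    public static void main(String[] args) {
        // Hypothetical frame durations in milliseconds, with one frozen frame.
        long[] durations = {16, 16, 16, 800, 16, 16};
        System.out.println("dropped at 700 ms: " + droppedFrames(700)); // 42
        System.out.println("average fps: " + averageFps(durations));
        System.out.println("frozen time ratio: " + frozenTimeRatio(durations));
    }
}
```

At 60 Hz, 700 ms spans 700 × 60 / 1000 = 42 refresh intervals, which is where the figure of 42 frames comes from.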
  2. Life cycle monitoring
    • The time spent in, and number of calls to, the lifecycle of Activity, Service, and Receiver components are also performance issues to watch;
      • For example, an Activity's onCreate() should not take more than 1 second, otherwise it delays the moment the user sees the page
      • Component lifecycles deserve stricter monitoring with full reporting, so that the startup time and launch count of each component's lifecycle can be checked in the backend.
    • Besides the lifecycles of the four major components, we also need to monitor the launch count and time spent for each process;
    • For lifecycle monitoring, compile-time instrumentation techniques such as AspectJ, ASM, and ReDex are recommended.
  3. Thread monitoring
    • Java thread management is a headache for many applications: dozens or even hundreds of threads are created during startup, and most of them are not managed by a thread pool and run unchecked;
    • On the other hand, some threads have high priority or are very active and occupy too much CPU. This makes the main thread less responsive, so we need to focus on optimizing these threads.
    • Two things need to be monitored for threads:
      1. The number of threads and how they are created: use a GOT hook on the thread's nativeCreate() function. This is mainly used for thread convergence, that is, reducing the number of threads.
      2. Thread user time (utime), system time (stime), and priority.
  • Lag has many causes, such as time-consuming functions, slow I/O, contention between threads, or locks. Often the lag itself is not hard to fix; what is harder is quickly locating these lag points and gathering enough auxiliary information to find the real cause.

Lag scene

  • Take the time-consuming AssetManager.openNonAsset function as an example for analysis
  1. Solution 1: Java implementation:
    • As you can see from the source code, there are a number of synchronized locks within AssetManager;
    • Step 1: Get the Java thread state through Thread.getState, which confirms that the main thread was BLOCKED at the time.
      // WAITING, TIMED_WAITING, and BLOCKED are the states that need special attention.
      // BLOCKED means the thread is waiting to acquire a lock, which corresponds to case 1 in the code below;
      // WAITING means the thread is waiting for another thread to "wake it up", which corresponds to case 2.
      synchronized (object) {   // case 1: stuck here -> BLOCKED
          doSomething();
          object.wait();        // case 2: stuck here -> WAITING
      }
      // When a thread enters the WAITING state, it releases not only CPU resources but also the object lock it holds.
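
The two cases in the snippet above can be reproduced off-device with plain Java threads and observed through Thread.getState. This is a sketch only: the class `ThreadStateDemo`, the thread names, and the polling helper `awaitState` are hypothetical.

```java
class ThreadStateDemo {
    /** Polls until the thread reaches the wanted state or the timeout elapses. */
    static Thread.State awaitState(Thread thread, Thread.State wanted, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (thread.getState() != wanted && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return thread.getState();
    }

    /** Returns {state of the thread stuck on the lock, state of the thread inside wait()}. */
    static Thread.State[] observeStates() {
        final Object lock = new Object();

        // Case 2: acquires the lock, then wait() releases it and parks the thread.
        Thread waiting = new Thread(() -> {
            synchronized (lock) {
                try {
                    lock.wait();
                } catch (InterruptedException ignored) {
                    // nothing to clean up in this demo
                }
            }
        }, "demo-waiting");
        waiting.setDaemon(true);
        waiting.start();
        Thread.State waitingState = awaitState(waiting, Thread.State.WAITING, 5000);

        // Case 1: tries to enter a monitor that the current thread is holding.
        Thread blocked = new Thread(() -> {
            synchronized (lock) {
                // entering the monitor is the whole point
            }
        }, "demo-blocked");
        blocked.setDaemon(true);

        Thread.State blockedState;
        synchronized (lock) { // possible only because wait() released the lock
            blocked.start();
            blockedState = awaitState(blocked, Thread.State.BLOCKED, 5000);
            lock.notifyAll(); // let the waiting thread finish once we leave the block
        }
        return new Thread.State[] {blockedState, waitingState};
    }

    public static void main(String[] args) {
        Thread.State[] states = observeStates();
        System.out.println("lock waiter: " + states[0]);
        System.out.println("wait() caller: " + states[1]);
    }
}
```

The demo can take the lock while the first thread sits in wait(), which shows that a WAITING thread has given up the object lock it held.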
    • Step 2: Get all thread stacks through Thread.getAllStackTraces()
      • Note: on Android 7.0, getAllStackTraces does not return the main thread's stack
      • Analysis of the collected logs shows that the thread involved with AssetManager is "BackgroundHandler"
        "BackgroundHandler" RUNNABLE
            at android.content.res.AssetManager.list
            at com.sample.business.init.listZipFiles

        // AssetManager.list uses the same synchronized lock, and list has to traverse the entire directory.
        public String[] list(String path) throws IOException {
            synchronized (this) {
                ensureValidLocked();
                return nativeList(mObject, path);
            }
        }
        // In addition, "BackgroundHandler" is a low-priority background thread. This is the bad pattern mentioned in the previous article: the main thread waiting on a low-priority background thread.
  2. Solution 2: ANR log implementation (SIGQUIT signal)
    • The Java implementation above works, but the ANR log seems to contain much richer information. What if we use the ANR log directly?
    // Thread name; priority; thread id; thread state
    "main" prio=5 tid=1 Suspended
      // Thread group; thread suspend count; thread debug suspend count
      | group="main" sCount=1 dsCount=0 obj=0x74746000 self=0xf4827400
      // Native thread id; process priority; scheduler priority
      | sysTid=28661 nice=-4 cgrp=default sched=0/0 handle=0xf72cbbec
      // Native thread state; scheduler state; user time utime; system time stime; CPU last scheduled on
      | state=D schedstat=( 3137222937 94427228 5819 ) utm=218 stm=95 core=2 HZ=100
      // Stack-related information
      | stack=0xff717000-0xff719000 stackSize=8MB
    • Native thread state
      In the ANR log above, the "main" thread is in the Suspended state, yet Java threads have no Suspended state. Suspended is in fact a Native thread state. How should we understand this? In Android, a Java thread is backed by a standard Linux thread, a pthread. Threads in Android fall into two kinds: those attached to the virtual machine and those not attached. Threads managed by the virtual machine are managed threads, so a Java thread's state is essentially a mapping of the Native thread's state. Different Android versions define different Native thread states; Android 9.0, for example, defines 27 thread states, which distinguish a thread's current situation more precisely.
    • How do we get an ANR log for a lag?
      • Step 1: When a lag on the main thread is detected, proactively send the SIGQUIT signal to the system.
      • Step 2: Wait for the /data/anr/traces.txt file to be generated.
      • Step 3: Report the generated file.
      • From the ANR log we can see directly that the lock the main thread is waiting for is held by the "BackgroundHandler" thread. With getAllStackTraces, by contrast, we can only guess by going through the threads one by one.
        // Stack-related information
        at android.content.res.AssetManager.open(AssetManager.java:311)
        - waiting to lock <0x41ddc798> (android.content.res.AssetManager) held by tid=66 (BackgroundHandler)
        at android.content.res.AssetManager.open(AssetManager.java:289)
    • Existing problems:
      • Availability: many higher versions of the system do not have permission to read /data/anr/traces.
      • Performance: Capturing every thread's stack and details is time-consuming and not necessarily suitable for lag scenarios, since it may make the user's lag worse;
  3. Solution 3: Hook implementation
    • We implement a "lossless" way to get all Java thread stacks and details using hooks:
      1. Fork a child process, so that even if the child crashes it does not affect the main process, and capturing all thread stacks does not interrupt the main process at all;
      2. Use libart.so and dlsym to call ThreadList::ForEach and get all Native thread objects;
      3. Iterate over the list of Thread objects, calling Thread::DumpState on each;
  4. Online ANR monitoring:
    • Common types of ANR:
      1. KeyDispatchTimeout: a key (input) event is not handled within 5 seconds.
      2. BroadcastTimeout: a broadcast is not handled within 10 seconds in the foreground or 60 seconds in the background.
      3. ServiceTimeout: a service is not handled within 20 seconds in the foreground or 200 seconds in the background.
    • The earlier crash-optimization article discussed "how to detect ANRs in an application". Is there a better way to do this?
    • ANR-WatchDog: a non-intrusive ANR monitoring component that can be used to monitor ANRs online
      // 1. Add the dependency in build.gradle
      implementation 'com.github.anrwatchdog:anrwatchdog:1.4.0'

      // 2. Initialize ANR-WatchDog
      new ANRWatchDog().start();

      // 3. How ANRWatchDog works internally
      private static final int DEFAULT_ANR_TIMEOUT = 5000;
      private volatile long _tick = 0;
      private volatile boolean _reported = false;
      private final Runnable _ticker = new Runnable() {
          @Override
          public void run() {
              _tick = 0;
              _reported = false;
          }
      };

      @Override
      public void run() {
          // 1. First, name the thread |ANR-WatchDog|.
          setName("|ANR-WatchDog|");
          // 2. Next, take the timeout interval, which defaults to 5000 ms.
          long interval = _timeoutInterval;
          // 3. Then, inside the while loop, post the _ticker Runnable through _uiHandler.
          while (!isInterrupted()) {
              // 3.1 _tick is 0 by default, so needPost is true.
              boolean needPost = _tick == 0;
              _tick += interval;
              if (needPost) {
                  _uiHandler.post(_ticker);
              }
              // Next, the thread sleeps for a while; the default is 5000 ms.
              try {
                  Thread.sleep(interval);
              } catch (InterruptedException e) {
                  _interruptionListener.onInterrupted(e);
                  return;
              }
              // 4. If the main thread has not run the Runnable, i.e. _tick has not been reset to 0,
              //    an ANR has occurred. The _reported flag avoids reporting an ANR that was already handled.
              if (_tick != 0 && !_reported) {
                  //noinspection ConstantConditions
                  if (!_ignoreDebugger && (Debug.isDebuggerConnected() || Debug.waitingForDebugger())) {
                      Log.w("ANRWatchdog", "An ANR was detected but ignored because the debugger is connected (you can prevent this with setIgnoreDebugger(true))");
                      _reported = true;
                      continue;
                  }
                  interval = _anrInterceptor.intercept(_tick);
                  if (interval > 0) {
                      continue;
                  }
                  final ANRError error;
                  if (_namePrefix != null) {
                      error = ANRError.New(_tick, _namePrefix, _logThreadsWithoutStackTrace);
                  } else {
                      // 5. If no thread name prefix is set for ANR-WatchDog, ANRError.NewMainOnly is used by default.
                      error = ANRError.NewMainOnly(_tick);
                  }
                  // 6. The default handler throws the current ANRError, which crashes the program.
                  _anrListener.onAppNotResponding(error);
                  interval = _timeoutInterval;
                  _reported = true;
              }
          }
      }
      // However, getting every thread's stack and details in the Java layer is time-consuming and may not suit lag scenarios, since it can make the user's lag worse. For applications with high performance requirements, the stacks of all threads can be obtained by hooking the Native layer; see "Solution 3: Hook implementation" above.
  • On-site information:
    • Can we enrich the lag "on-site information" beyond what the system ANR log provides? We can add the following:
      • CPU usage and scheduling information: see Homework 1 below;
      • Memory information: total system memory, available memory, and the memory of each application process. If Debug.startAllocCounting or atrace is enabled, GC information can be added as well;
      • I/O and network information: details of all I/O and network operations during the lag can also be collected.
  • Starting with Android 8.0, the Android virtual machine finally supports the JVM's JVMTI mechanism. Many Profiler modules, such as memory collection, have switched to this mechanism.

Detecting single-point lag problems

  • Common single-point problems include main-thread IPC (inter-process communication), DB operations, and so on
  • Detection schemes for IPC single-point problems:
    1. Add tracking points before and after each IPC call, but this approach is not elegant
    2. Offline monitoring with adb commands
      // 1. Start monitoring IPC operations
      adb shell am trace-ipc start
      // 2. Stop monitoring and save the collected information to the specified file
      adb shell am trace-ipc stop --dump-file /data/local/tmp/ipc-trace.txt
      // 3. Pull the collected ipc-trace file to the computer for inspection
      adb pull /data/local/tmp/ipc-trace.txt
    3. ARTHook
      • AspectJ only works on non-system methods, that is, our app's own source code or the jar and AAR packages we reference;
      • ARTHook can hook system methods: we cannot change the system's code, but we can hook one of its methods and add our own code to the method body.
      // Getting application information through PackageManager, or getting the DeviceId, AMS-related
      // information, and so on, all eventually go through android.os.BinderProxy.
      // In the Application's onCreate method, use ARTHook to hook the transact method of android.os.BinderProxy.
      try {
          DexposedBridge.findAndHookMethod(Class.forName("android.os.BinderProxy"), "transact",
                  int.class, Parcel.class, Parcel.class, int.class, new XC_MethodHook() {
                      @Override
                      protected void beforeHookedMethod(MethodHookParam param) throws Throwable {
                          LogHelper.i("BinderProxy beforeHookedMethod "
                                  + param.thisObject.getClass().getSimpleName()
                                  + "\n" + Log.getStackTraceString(new Throwable()));
                          super.beforeHookedMethod(param);
                      }
                  });
      } catch (ClassNotFoundException e) {
          e.printStackTrace();
      }
      • Besides IPC calls, there is a series of other single-point problems such as I/O, DB, and View drawing, each of which needs its own detection scheme
      • When building detection schemes for lag problems, ARTHook is mainly used to improve offline detection tools, hooking the relevant operations wherever possible so that problems are exposed and can be analyzed.

Measuring page time with Lancet

  • Lancet is a lightweight Android AOP framework that compiles quickly and supports incremental compilation.
  • A usage demo follows
// 1. Add the classpath dependency in the root build.gradle
dependencies {
    classpath 'me.ele:lancet-plugin:1.0.6'
}
// 2. Apply the plugin and add the dependency in the module's build.gradle
apply plugin: 'me.ele.lancet'
dependencies {
    provided 'me.…'
}

/**
 * @author: LiuJinYang
 * @createdate: 2020/12/10
 */
public class LancetUtil {
    // @Proxy specifies the target method "i" that code will be woven into
    @Proxy("i")
    // @TargetClass specifies the target class android.util.Log that code will be woven into
    @TargetClass("android.util.Log")
    public static int anyName(String tag, String msg) {
        msg = "LJY_LOG: " + msg;
        // Origin.call() stands for the target method Log.i()
        return (int) Origin.call();
    }
}

/**
 * @author: LiuJinYang
 * @createdate: 2020/12/10
 */
public class LancetUtil {
    public static ActivityRecord sActivityRecord;

    static {
        sActivityRecord = new ActivityRecord();
    }

    @Insert(value = "onCreate", mayCreateSuper = true)
    @TargetClass(value = "android.support.v7.app.AppCompatActivity", scope = Scope.ALL)
    protected void onCreate(Bundle savedInstanceState) {
        sActivityRecord.mOnCreateTime = System.currentTimeMillis();
        // Call the original logic of the hooked method
        Origin.callVoid();
    }

    @Insert(value = "onWindowFocusChanged", mayCreateSuper = true)
    @TargetClass(value = "android.support.v7.app.AppCompatActivity", scope = Scope.ALL)
    public void onWindowFocusChanged(boolean hasFocus) {
        sActivityRecord.mOnWindowsFocusChangedTime = System.currentTimeMillis();
        LjyLogUtil.i(getClass().getCanonicalName() + " onWindowFocusChanged cost "
                + (sActivityRecord.mOnWindowsFocusChangedTime - sActivityRecord.mOnCreateTime));
        Origin.callVoid();
    }

    public static class ActivityRecord {
        public boolean isNewCreate;
        public long mOnCreateTime;
        public long mOnWindowsFocusChangedTime;
    }
}

Lag analysis

  • After the client captures lag events, the data must be uploaded to the backend for unified analysis
  • Lag rate
    • Assess the reach of lag: UV lag rate = UV that experienced lag / UV with lag collection enabled. Once a user is selected for collection, data is collected for that user for the whole day
    • Assess the severity of lag: PV lag rate = PV that experienced lag / PV of launches with collection enabled. Users selected for PV lag-rate collection must report every launch, which serves as the denominator
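
Both rates are plain ratios, shown here as a small plain-Java sketch. The class `LagRate`, the method names, and the sample counts are assumptions for illustration, and the backend would supply the real counts.

```java
class LagRate {
    /** UV lag rate: users who experienced lag divided by users with lag collection enabled. */
    static double uvLagRate(long usersWithLag, long usersWithCollection) {
        return usersWithCollection == 0 ? 0.0 : (double) usersWithLag / usersWithCollection;
    }

    /** PV lag rate: launches that experienced lag divided by launches with collection enabled. */
    static double pvLagRate(long launchesWithLag, long launchesWithCollection) {
        return launchesWithCollection == 0 ? 0.0 : (double) launchesWithLag / launchesWithCollection;
    }

    public static void main(String[] args) {
        // Hypothetical daily counts.
        System.out.println("UV lag rate: " + uvLagRate(50, 1000));
        System.out.println("PV lag rate: " + pvLagRate(120, 6000));
    }
}
```

The PV denominator only works if sampled users report every launch, including the ones without lag, which is why the text requires each startup to be reported.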

Homework after class

  1. Imitate ProcessCpuTracker.java to get the CPU time share of each thread over a period of time
  2. Use PLTHook technology to obtain Atrace logs
  3. Use the PLTHook technique to get the stack created by the thread

I'm Jinyang. If you want to keep improving and read more practical content, follow the public account "jinyang said" to receive my latest articles.