
Revisiting the pthread OOM problem | Performance optimization

Posted on Sept. 23, 2022, 8:47 p.m. by 鞠龍
Category: android Tag: android

Preface

Will Thread also get OOM?

We have talked about pthread OOM once before. That analysis was based on an RxJava scenario and solved only a small part of the problem. In fact, whenever threads are abused, and especially on Huawei devices, the same kind of problem can appear.

So I'm going to expand it again this time and share some of my recent work with you.

This time we approach it from two directions, to see whether we can effectively solve this part of the problem.

  1. Use a debug tool to hook every DefaultThreadFactory and find the unnamed threads that get created
  2. Use a plugin + ASM to replace thread pools and catch the rule-breakers

Main text

If this problem has already started showing up in production, figuring out where to begin can be a real headache. The stack reported online is hard to analyze on its own, and because the crash is sporadic, there is no way to reproduce it reliably.

By the way, this article won't help you solve thread overflow problems in native code.

This is a very tricky problem for developers.

My idea is to find every unnamed thread pool in the app and give each one a name, so that the next time we run into a similar problem we can count threads by name and see who started the most. Then we can check whether those owners can optimize their code.
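Counting by thread name can be sketched roughly as follows. This is only a minimal sketch: `normalizeThreadName` and `countThreadsByName` are names I made up for illustration, and replacing digits with `x` is just one possible way to bucket names.

```kotlin
/** Collapses digit runs so "pool-3-thread-7" and "pool-1-thread-2" land in one bucket. */
fun normalizeThreadName(name: String): String = name.replace(Regex("\\d+"), "x")

/**
 * Returns normalized thread name -> number of threads, largest group first.
 * By default it inspects every live thread known to the VM.
 */
fun countThreadsByName(
    threads: Collection<Thread> = Thread.getAllStackTraces().keys
): Map<String, Int> =
    threads.groupingBy { normalizeThreadName(it.name) }
        .eachCount()
        .entries
        .sortedByDescending { it.value }
        .associate { it.key to it.value } // associate keeps the sorted order
```

Calling `countThreadsByName()` when the thread count looks suspicious gives a quick ranking of who owns the most threads, provided the pools have distinguishable names in the first place.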

Epic Hook

I looked at the thread OOM reports from production in Bugly, and these issues can't be judged in isolation. The last stack is just the straw that broke the camel's back. I listed the other threads that were alive at the same time and found many named pool-x-thread-x, which are created by the default ThreadFactory used in default thread pool construction.
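To see where the pool-x-thread-x names come from, here is a small sketch that asks the default factory for a few threads and reads their names. `defaultFactoryThreadNames` is a helper name of my own; the threads are created but never started.

```kotlin
import java.util.concurrent.Executors

/** Returns the names of [count] threads created (but not started) by one default factory. */
fun defaultFactoryThreadNames(count: Int): List<String> {
    val factory = Executors.defaultThreadFactory()
    return (1..count).map { factory.newThread(Runnable { }).name }
}
```

Every factory gets its own pool number and every thread its own sequence number, so the names tell you nothing about which business module created the pool.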

In the IOCanary article we introduced some dynamic hook capabilities. Our debug tool here is based on Epic, which is a little different from XHook.

Android IO monitor | performance monitor series

Epic can hook both constructors and methods. Here we mainly hook constructors, through the DexposedBridge.hookAllConstructors method.

DexposedBridge.hookAllConstructors(Executors.defaultThreadFactory().javaClass, object : XC_MethodHook() {
    @Throws(Throwable::class)
    override fun beforeHookedMethod(param: MethodHookParam) {
        super.beforeHookedMethod(param)
    }
})

Our goal is to hook the constructor of Executors.defaultThreadFactory().javaClass. Through the DexposedBridge.hookAllConstructors method, we get a callback for every constructor call on the hooked class.

Because DefaultThreadFactory is a non-public nested class of Executors, we have to go through an instance to get its class, which is a bit cumbersome.

And then what do we need to do? It would be perfect if we could capture the stack at the moment the constructor is called. But how do we get the stack? My first thought was to throw an exception and print it. So where is the stack actually stored?

// newArrayList is a statically imported list helper (for example Guava's Lists.newArrayList)
private static List<StackTraceElement> stackTraceInCurrentThread() {
  return newArrayList(Thread.currentThread().getStackTrace());
}

We can figure this out simply by following the code that prints an exception. The snippet above comes from Throwables. From it we can see that the stack information is stored on the thread.

This also explains why threads are GC roots: the virtual machine holds every live thread instance and its stack.

DexposedBridge.hookAllConstructors(Executors.defaultThreadFactory().javaClass, object : XC_MethodHook() {
    @Throws(Throwable::class)
    override fun beforeHookedMethod(param: MethodHookParam) {
        super.beforeHookedMethod(param)
        // The hook runs on the thread that is constructing the factory,
        // so its stack shows who is creating the unnamed thread pool.
        val thread = Thread.currentThread()
        val stackTraceElements = thread.stackTrace
        if (checkLegalStack(stackTraceElements)) {
            instance.addStack(stackTraceElements)
            // contentToString() prints the frames; toString() on an array only prints its identity
            Log.i(TAG, "stack: ${stackTraceElements.contentToString()}")
        }
    }
})

Here we hook the default ThreadFactory, record the call stack for each unnamed thread pool, and write the stack information to a file.
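The stack filtering and bookkeeping are not shown in the hook itself, so here is one way they could look. Everything in this sketch is hypothetical: `firstAppFrame`, `CallSiteCounter`, and the list of ignored package prefixes are my own assumptions, not the real `checkLegalStack` or `addStack`, and in a real build you would also add the hook framework's own packages to the ignore list.

```kotlin
import java.util.concurrent.ConcurrentHashMap

/** Package prefixes treated as framework noise; illustrative defaults only. */
val DEFAULT_IGNORED_PREFIXES = listOf("java.", "kotlin.", "dalvik.", "android.")

/** Returns the first frame whose class is outside the ignored packages, or null if none. */
fun firstAppFrame(
    stack: Array<StackTraceElement>,
    ignoredPrefixes: List<String> = DEFAULT_IGNORED_PREFIXES
): StackTraceElement? =
    stack.firstOrNull { frame -> ignoredPrefixes.none { frame.className.startsWith(it) } }

/** Counts how often each call site created an unnamed thread factory. */
class CallSiteCounter {
    private val counts = ConcurrentHashMap<String, Int>()

    fun record(stack: Array<StackTraceElement>) {
        val frame = firstAppFrame(stack) ?: return
        counts.merge("${frame.className}.${frame.methodName}", 1) { a, b -> a + b }
    }

    fun snapshot(): Map<String, Int> = counts.toMap()
}
```

Aggregating by call site instead of dumping raw stacks keeps the exported file short: one line per offender with a count next to it.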

We then asked our QA colleagues to run monkey tests for us, and simply exported the file to list the thread pools in the project that break the naming rule.

After that, we only need to replace these unnamed thread pools with ones that follow a naming rule, which removes this source of noise from the online reports.
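A thread pool that follows a naming rule could be as simple as the sketch below. `NamedThreadFactory` and `newNamedFixedPool` are illustrative names and the `prefix-N` scheme is an assumption; use whatever convention your project agrees on.

```kotlin
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import java.util.concurrent.ThreadFactory
import java.util.concurrent.atomic.AtomicInteger

/** Thread factory that prefixes every thread with an owner-specific name. */
class NamedThreadFactory(private val prefix: String) : ThreadFactory {
    private val counter = AtomicInteger(1)

    override fun newThread(r: Runnable): Thread =
        Thread(r, "$prefix-${counter.getAndIncrement()}")
}

/** Fixed-size pool whose threads are named "<prefix>-1", "<prefix>-2", ... */
fun newNamedFixedPool(prefix: String, size: Int): ExecutorService =
    Executors.newFixedThreadPool(size, NamedThreadFactory(prefix))
```

For example, `newNamedFixedPool("image-loader", 4)` produces threads named image-loader-1 through image-loader-4, which can be attributed at a glance in a crash report.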

Hacking the third-party SDK

GitHub demo link. I know most of you are only here to freeload, but I'll still say it: won't you at least leave a like, brothers?

Once this is done, we can be almost sure that every ThreadFactory under our control has been fixed. But what if the thread pool is constructed inside a third-party SDK? How do we, not exactly playing fair, change the thread pool construction inside that SDK?

Let's use ASM again. Most of our earlier ASM use cases inserted a new function call. This time we use class substitution instead.

Simply put, we scan every class, and when we find an instruction that calls a thread pool construction function, we replace it with our own safe, compliant thread pool construction. That way we can fix third-party SDK code as well.

I couldn't manage to write this part with ClassVisitor at all, so I rely on ClassNode.

import java.io.IOException
import org.objectweb.asm.ClassReader
import org.objectweb.asm.ClassWriter
import org.objectweb.asm.Opcodes
import org.objectweb.asm.tree.ClassNode
import org.objectweb.asm.tree.MethodInsnNode
import org.objectweb.asm.tree.MethodNode

class ThreadAsmHelper : AsmHelper {
    @Throws(IOException::class)
    override fun modifyClass(srcClass: ByteArray?): ByteArray {
        val classNode = ClassNode(Opcodes.ASM5)
        val classReader = ClassReader(srcClass)
        // 1. Read the bytes into the ClassNode
        classReader.accept(classNode, 0)
        // 2. Process the ClassNode: visit every instruction of every method
        val iterator: Iterator<MethodNode> = classNode.methods.iterator()
        while (iterator.hasNext()) {
            val method = iterator.next()
            method.instructions?.iterator()?.forEach {
                if (it.opcode == Opcodes.INVOKESTATIC) {
                    if (it is MethodInsnNode) {
                        it.hookExecutors()
                    }
                }
            }
        }
        val classWriter = ClassWriter(0)
        // 3. Convert the ClassNode back into a byte array
        classNode.accept(classWriter)
        return classWriter.toByteArray()
    }

    // Redirects a matching Executors call to the replacement owner, name, and descriptor
    private fun MethodInsnNode.hookExecutors() {
        when (this.owner) {
            EXECUTORS_OWNER -> {
                info("owner:${this.owner}  name:${this.name} ")
                ThreadPoolCreator.poolList.forEach {
                    if (it.name == this.name && this.owner == it.owner) {
                        this.owner = Owner
                        this.name = it.methodName
                        this.desc = it.replaceDesc()
                        info("owner:${this.owner}  name:${this.name} desc:${this.desc} ")
                    }
                }
            }
        }
    }
}

Default thread pools are created through the static methods of Executors. So in bytecode, the opcode of the call we are looking for must be INVOKESTATIC.

That is all of the code for the ASM scan and replacement. The logic is simple: get the ClassNode, walk through all of its methods, read each method instruction by instruction, and check whether the opcode is INVOKESTATIC. If it is, check whether the owner, name, and desc match the rules we want to change, and replace them if they do. That completes the function.
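The class that the rewritten INVOKESTATIC ends up calling is not shown here, so below is a rough sketch of what such a replacement owner might look like. `TrackedExecutors` and its naming scheme are hypothetical, not the actual class in AndroidAutoTrack. The sketch assumes the replacement keeps the same parameters and return type as the Executors method it stands in for, so the operand stack at the call site stays valid.

```kotlin
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import java.util.concurrent.ThreadFactory
import java.util.concurrent.atomic.AtomicInteger

/** Hypothetical replacement owner for Executors call sites. */
object TrackedExecutors {
    private val poolIds = AtomicInteger(1)

    /** Same parameter and return type as Executors.newFixedThreadPool(int). */
    @JvmStatic
    fun newFixedThreadPool(nThreads: Int): ExecutorService =
        Executors.newFixedThreadPool(nThreads, factoryFor(callerName()))

    /** Same parameter list and return type as Executors.newCachedThreadPool(). */
    @JvmStatic
    fun newCachedThreadPool(): ExecutorService =
        Executors.newCachedThreadPool(factoryFor(callerName()))

    /** Builds a factory whose thread names start with the calling class. */
    private fun factoryFor(caller: String): ThreadFactory {
        val poolId = poolIds.getAndIncrement()
        val threadIds = AtomicInteger(1)
        return ThreadFactory { r ->
            Thread(r, "$caller-pool-$poolId-thread-${threadIds.getAndIncrement()}")
        }
    }

    /** Simple name of the first class on the stack outside the runtime and this object. */
    private fun callerName(): String {
        val self = TrackedExecutors::class.java.name
        return Thread.currentThread().stackTrace
            .firstOrNull { f ->
                !f.className.startsWith("java.") &&
                    !f.className.startsWith("dalvik.") &&
                    f.className != self &&
                    !f.className.startsWith(self + '$')
            }
            ?.className
            ?.substringAfterLast('.')
            ?: "unknown"
    }
}
```

With something like this in place, a third-party call to `Executors.newFixedThreadPool(4)` that has been redirected would produce threads whose names begin with the SDK class that created the pool, instead of the anonymous pool-x-thread-x.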

For the concrete code, see my AndroidAutoTrack project, which contains this part.

Connecting to the APM system

This part is still only an idea and has not gone into actual development. I will try it out when I have time.

As an app performance analysis tool, an APM's main purpose is to help developers locate problems quickly and to help them optimize their code.

I personally think this could be a small part of the APM's capabilities. We can collect the current total thread count per page, using the Activity as the dimension.

In addition, we need to set several thresholds. When the thread count reaches the low-risk, medium-risk, or high-risk level, the thread names are reported.

Because the thread data is collected per Activity, we can also evaluate the health of each page and see whether some operation on a particular page causes the thread count to spike.
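Since none of this is built yet, the following is only a sketch of how the thresholds and per-page sampling might fit together. All names (`ThreadRisk`, `ThreadThresholds`, `PageThreadMonitor`) and the numeric defaults are placeholders I picked for illustration. The page is identified by a plain string so the sketch runs without Android.

```kotlin
enum class ThreadRisk { NORMAL, LOW, MEDIUM, HIGH }

/** Threshold values are placeholders, not tuned recommendations. */
data class ThreadThresholds(val low: Int = 100, val medium: Int = 200, val high: Int = 300)

/** Maps a thread count to a risk level. */
fun classify(threadCount: Int, t: ThreadThresholds = ThreadThresholds()): ThreadRisk = when {
    threadCount >= t.high -> ThreadRisk.HIGH
    threadCount >= t.medium -> ThreadRisk.MEDIUM
    threadCount >= t.low -> ThreadRisk.LOW
    else -> ThreadRisk.NORMAL
}

/**
 * Remembers the worst risk level seen per page and reports thread names
 * only when a page climbs to a higher level. Not thread-safe; sample from one thread.
 */
class PageThreadMonitor(
    private val thresholds: ThreadThresholds = ThreadThresholds(),
    private val report: (page: String, risk: ThreadRisk, names: List<String>) -> Unit
) {
    private val worst = mutableMapOf<String, ThreadRisk>()

    fun sample(page: String, threads: Collection<Thread> = Thread.getAllStackTraces().keys) {
        val risk = classify(threads.size, thresholds)
        val previous = worst[page] ?: ThreadRisk.NORMAL
        if (risk > previous) {
            worst[page] = risk
            report(page, risk, threads.map { it.name })
        }
    }
}
```

Reporting only when a page moves up a level avoids flooding the backend with identical reports while a page sits above a threshold.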

As an aside, Alibaba once asked me about exactly this kind of APM threshold design, and at the time I hadn't considered the alerting side at all. I'm still a rookie.

Conclusion

I think developers need to take stock of their past work every once in a while: whether what we did before was good enough, whether there is room for optimization, and whether there are technical means to monitor and prevent future problems.

I once heard a senior engineer say: look back at the code you wrote three months ago, and if you think it's perfect, you've been slacking off for three months.
