WeChat official account: byte array

Hope this helps you 🤣🤣

Recently, I read an article published by JD Retail Technology, "AOP technology in the multi-scenario practice of APP development", which introduced one usage scenario for AOP: optimizing thread usage. It felt very practical, but the article did not give any concrete implementation code, so I did a hands-on implementation myself and basically achieved the effect it describes. This post walks through my implementation in detail and provides the full code. Part of the text below is also quoted directly from that article; thanks to the original authors

In the process of Android app stability governance, thread-related OOM problems are hard to avoid. Threads are an expensive system resource, "expensive" not only in the cost of creating them but also in the cost of using them. The total number of threads a system can run at the same time is limited by hardware conditions such as the number of processors and the amount of memory: running threads take up processor time slices, and the more threads the system runs, the smaller the time slice each thread is allotted per unit of time and the more context switches thread scheduling brings, so the processor ends up spending less time on actual computation. In addition, in real scenarios the number of tasks a program needs to execute on threads over its whole lifetime is usually far greater than the maximum number of simultaneously running threads the system can support. If the number of threads keeps growing, an OOM will inevitably be thrown; when the thread count is overloaded, the Android virtual machine layer throws java.lang.OutOfMemoryError: pthread_create (1040KB stack) failed

The causes of OOM due to thread overload can be classified into three categories:

  • Too many thread pools. Every thread pool has to maintain its own threads and brings its own space overhead, so using too many thread pools is too expensive for memory resources
  • Too many resident threads. Resident threads are threads in the waiting, blocked, or runnable state, which easily happens when thread pools are used. Each thread pool contains a certain number of core threads and non-core threads; by default, core threads are not reclaimed even when idle, that is, they are not constrained by keepAliveTime, so core threads may stay idle forever without being released, which greatly wastes system resources
  • Too many anonymous threads. Anonymous threads are threads started directly anywhere in the code; while this achieves quick, highest-priority asynchronous execution, too many anonymous threads make troubleshooting harder and hurt stability

For a project that has been iterated on for a long time, these problems appear not only in first-party business code but also in multiple third-party SDKs and open source libraries, and it is unrealistic to push every third-party SDK and open source library to optimize their thread usage at the same time. At this point bytecode instrumentation, which is not tied to any specific business form and is non-invasive, becomes the more reliable choice

There are two things I want to achieve with this thread remediation:

  • Name every anonymous thread as: the name of the class in which the thread is created + an increasing thread number + the thread name set in the original code (if any), so that problems can be located quickly when an exception occurs
  • Unify the timeout mechanism of thread pools. Hook every newXXXThreadPool method of Executors, name every thread in the pool according to the same rule as anonymous threads, set a unified keep-alive timeout for the threads, and allow core threads to be recycled

Anonymous threads

Start by declaring an ordinary Thread object to stand in for the various anonymous threads that exist in a project
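A minimal sketch of such a declaration (the Runnable body and the name literal "threadName" here are placeholders of mine):

val thread = Thread(Runnable {
    // some background work
}, "threadName")
thread.start()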

Decompile to see the corresponding bytecode instructions
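Reconstructed from the observations listed below (the line numbers and the instruction that pushes the Runnable are my approximation, not the original listing), the relevant part has roughly this shape:

23: NEW java/lang/Thread
24: DUP
25: ...                                    // push the Runnable argument
26: LDC "threadName"
27: INVOKESPECIAL java/lang/Thread.<init> (Ljava/lang/Runnable;Ljava/lang/String;)V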

From this we can see several things:

  • Lines 23 through 27 correspond to the new Thread(Runnable, String) call
  • The new instruction on line 23 explicitly points to java/lang/Thread
  • The ldc instruction on line 26 loads a constant from the constant pool, namely the thread name string
  • The invokespecial instruction on line 27 corresponds to the call to the Thread constructor and takes two parameters

This particular anonymous thread sets a thread name when it is declared, while other anonymous threads in a project may not set a name and pass in only a Runnable. So at the bytecode level there are two different instruction structures to consider: constructor calls that take one parameter and calls that take two
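In descriptor terms, that means the invokespecial being matched targets one of the two standard Thread constructors (JDK signatures listed here for reference):

java/lang/Thread.<init> (Ljava/lang/Runnable;)V
java/lang/Thread.<init> (Ljava/lang/Runnable;Ljava/lang/String;)V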

Manually generating or concatenating thread names at the bytecode level would be a hassle, so I took a slightly clever approach: replace the declared Thread with a custom Thread subclass and give that subclass one extra constructor parameter for the class name. This way we don't need to care how many constructor arguments were passed when the Thread was declared; we only need to append one more String parameter to the invokespecial instruction, carrying the name of the current class
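Expressed at the source level, the rewrite is roughly equivalent to the following (MainActivity is just a placeholder for whichever class declares the thread; the real change happens on the bytecode, as shown next):

// before instrumentation
val thread = Thread(runnable, "threadName")

// after instrumentation
val thread = OptimizedThread(runnable, "threadName", "MainActivity")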

Let’s look at the actual coding implementation

Define a Thread subclass, declare one extra parameter className for it, and then concatenate the thread name set in the original code (if any) with className according to the rule above to form the final thread name

import java.util.concurrent.atomic.AtomicInteger

/**
 * @Author: leavesC
 * @Date: 2021/12/18 17:57
 * @Desc:
 * @Public account: byte array
 */
class OptimizedThread(runnable: Runnable?, name: String?, className: String) :
    Thread(runnable, generateThreadName(name, className)) {

    companion object {

        private val threadId = AtomicInteger(0)

        private fun generateThreadName(name: String?, className: String): String {
            return className + "-" + threadId.getAndIncrement() + if (name.isNullOrBlank()) {
                ""
            } else {
                "-$name"
            }
        }

    }

    constructor(runnable: Runnable, className: String) : this(runnable, null, className)

    constructor(name: String, className: String) : this(null, name, className)

}

After that, the Transform phase only needs to find every TypeInsnNode that points to java/lang/Thread and is not inside a ThreadFactory; those are the anonymous threads that need to be processed

/**
 * @Author: leavesC
 * @Date: 2021/12/16
 * @Desc:
 * @Github: https://github.com/leavesC
 */
class OptimizedThreadTransform(private val config: OptimizedThreadConfig) : BaseTransform() {

    companion object {

        private const val threadClass = "java/lang/Thread"

        private const val threadFactoryClass = "java/util/concurrent/ThreadFactory"

        private const val threadFactoryNewThreadMethodDesc =
            "newThread(Ljava/lang/Runnable;)Ljava/lang/Thread;"

    }

    override fun modifyClass(byteArray: ByteArray): ByteArray {
        val classNode = ClassNode()
        val classReader = ClassReader(byteArray)
        classReader.accept(classNode, ClassReader.EXPAND_FRAMES)
        val methods = classNode.methods
        val taskList = mutableListOf<() -> Unit>()
        if (!methods.isNullOrEmpty()) {
            for (methodNode in methods) {
                val instructionIterator = methodNode.instructions?.iterator()
                if (instructionIterator != null) {
                    while (instructionIterator.hasNext()) {
                        val instruction = instructionIterator.next()
                        when (instruction.opcode) {
                            Opcodes.NEW -> {
                                val typeInsnNode = instruction as? TypeInsnNode
                                if (typeInsnNode != null && typeInsnNode.desc == threadClass) {
                                    // If the thread is initialized inside a ThreadFactory, it is not processed
                                    if (!classNode.isThreadFactoryMethod(methodNode)) {
                                        taskList.add {
                                            transformNew(
                                                classNode,
                                                methodNode,
                                                typeInsnNode
                                            )
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
        taskList.forEach { it.invoke() }
        val classWriter = ClassWriter(ClassWriter.COMPUTE_MAXS)
        classNode.accept(classWriter)
        return classWriter.toByteArray()
    }

    private fun ClassNode.isThreadFactoryMethod(methodNode: MethodNode): Boolean {
        return this.interfaces?.contains(threadFactoryClass) == true
                && methodNode.nameWithDesc == threadFactoryNewThreadMethodDesc
    }

}

After finding the target instruction, replace java/lang/Thread with OptimizedThread, then keep traversing forward to find the instruction that invokes the Thread constructor, insert an additional String parameter declaration into that instruction's descriptor, and pass className to OptimizedThread as the extra constructor argument. That completes the replacement

private fun transformNew(
    classNode: ClassNode,
    methodNode: MethodNode,
    typeInsnNode: TypeInsnNode
) {
    val instructions = methodNode.instructions
    val typeInsnNodeIndex = instructions.indexOf(typeInsnNode)
    // Starting after the typeInsnNode, find the instruction that invokes the Thread constructor and replace it
    for (index in typeInsnNodeIndex + 1 until instructions.size()) {
        val node = instructions[index]
        if (node is MethodInsnNode && node.isThreadInitMethodInsn()) {
            // Replace Thread with OptimizedThread
            typeInsnNode.desc = config.formatOptimizedThreadClass
            node.owner = config.formatOptimizedThreadClass
            // Insert an additional String parameter into the descriptor of the constructor call
            node.insertArgument(String::class.java)
            // Pass the class name to OptimizedThread as the extra constructor argument
            instructions.insertBefore(node, LdcInsnNode(classNode.simpleClassName))
            break
        }
    }
}
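transformNew relies on two extension helpers, isThreadInitMethodInsn and insertArgument, that come from the accompanying project rather than being shown here. As a rough sketch of what insertArgument needs to do (my own reconstruction under that assumption, not necessarily the project's exact implementation), it appends one more parameter type to the instruction's method descriptor:

import org.objectweb.asm.Type
import org.objectweb.asm.tree.MethodInsnNode

// e.g. "(Ljava/lang/Runnable;)V" becomes "(Ljava/lang/Runnable;Ljava/lang/String;)V"
fun MethodInsnNode.insertArgument(parameterType: Class<*>) {
    val methodType = Type.getMethodType(desc)
    val argumentTypes = methodType.argumentTypes.toMutableList()
    argumentTypes.add(Type.getType(parameterType))
    desc = Type.getMethodDescriptor(methodType.returnType, *argumentTypes.toTypedArray())
}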

Thread pools

The java.util.concurrent.Executors class exposes more than a dozen static methods for obtaining thread pools

public class Executors {

    public static ExecutorService newFixedThreadPool(int nThreads)

    public static ExecutorService newWorkStealingPool(int parallelism)

    public static ExecutorService newWorkStealingPool()

    public static ExecutorService newFixedThreadPool(int nThreads, ThreadFactory threadFactory)

    public static ExecutorService newSingleThreadExecutor()

    public static ExecutorService newSingleThreadExecutor(ThreadFactory threadFactory)

    public static ExecutorService newCachedThreadPool()

    public static ExecutorService newCachedThreadPool(ThreadFactory threadFactory)

    public static ScheduledExecutorService newSingleThreadScheduledExecutor()

    public static ScheduledExecutorService newSingleThreadScheduledExecutor(ThreadFactory threadFactory)

    public static ScheduledExecutorService newScheduledThreadPool(int corePoolSize)

    public static ScheduledExecutorService newScheduledThreadPool(int corePoolSize, ThreadFactory threadFactory)

}

These static methods fall into three categories according to their parameters:

  • No parameter at all
  • One parameter: either an int for the number of threads or a ThreadFactory
  • Two parameters: an int for the number of threads plus a ThreadFactory

And into two categories according to the return type:

  • ExecutorService
  • ScheduledExecutorService

When hooking, we therefore need to take into account both how many arguments the call site passes and which return type it expects. To unify the naming rule for threads inside the pool, the ThreadFactory passed in has to be replaced, and a String argument carrying the name of the current class has to be passed along. And to set a unified timeout for the threads and allow core threads to be recycled, we need to be able to get hold of the ThreadPoolExecutor object

There is no way to hook the JDK source code itself, and making all these changes purely at the bytecode level would be fairly cumbersome, so I take the same approach as with anonymous threads: declare an OptimizedExecutors class that mirrors every static method of Executors, add an extra String parameter to each method, and then redirect the bytecode that points to Executors to OptimizedExecutors instead. That way, the thread pool parameters can be configured freely inside OptimizedExecutors
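At the call site, the effect is roughly the following (MainActivity again being a placeholder for the calling class):

// before instrumentation
val executor = Executors.newFixedThreadPool(4)

// after instrumentation
val executor = OptimizedExecutors.newFixedThreadPool(4, "MainActivity")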

Let’s look at the actual coding implementation

Start by defining the OptimizedExecutors that will replace Executors: give every ThreadPoolExecutor it creates a five-second keep-alive timeout, allow its core threads to be recycled, and apply the thread naming rule through NamedThreadFactory

import java.util.concurrent.BlockingQueue
import java.util.concurrent.ExecutorService
import java.util.concurrent.LinkedBlockingQueue
import java.util.concurrent.ThreadFactory
import java.util.concurrent.ThreadPoolExecutor
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicInteger

/**
 * @Author: leavesC
 * @Date: 2021/12/16
 * @Desc:
 */
object OptimizedExecutors {

    private const val defaultThreadKeepAliveTime = 5000L

    @JvmStatic
    fun newFixedThreadPool(nThreads: Int, className: String): ExecutorService {
        return newFixedThreadPool(nThreads, null, className)
    }

    @JvmStatic
    fun newFixedThreadPool(
        nThreads: Int,
        threadFactory: ThreadFactory?,
        className: String
    ): ExecutorService {
        return getOptimizedExecutorService(
            nThreads, nThreads,
            0L, TimeUnit.MILLISECONDS,
            LinkedBlockingQueue(),
            threadFactory, className
        )
    }

    ···

    private fun getOptimizedExecutorService(
        corePoolSize: Int,
        maximumPoolSize: Int,
        keepAliveTime: Long,
        unit: TimeUnit,
        workQueue: BlockingQueue<Runnable>,
        threadFactory: ThreadFactory? = null,
        className: String
    ): ExecutorService {
        val executor = ThreadPoolExecutor(
            corePoolSize, maximumPoolSize,
            keepAliveTime, unit,
            workQueue,
            NamedThreadFactory(threadFactory, className)
        )
        executor.setKeepAliveTime(defaultThreadKeepAliveTime, TimeUnit.MILLISECONDS)
        executor.allowCoreThreadTimeOut(true)
        return executor
    }

    private class NamedThreadFactory(
        private val threadFactory: ThreadFactory?,
        private val className: String
    ) : ThreadFactory {

        private val threadId = AtomicInteger(0)

        override fun newThread(runnable: Runnable): Thread {
            val originThread = threadFactory?.newThread(runnable)
            val threadName =
                className + "-" + threadId.getAndIncrement() + if (originThread != null) {
                    "-" + originThread.name
                } else {
                    ""
                }
            val thread = originThread ?: Thread(runnable)
            thread.name = threadName
            if (thread.isDaemon) {
                thread.isDaemon = false
            }
            if (thread.priority != Thread.NORM_PRIORITY) {
                thread.priority = Thread.NORM_PRIORITY
            }
            return thread
        }

    }

}

After that, during the Transform phase, whenever an invokestatic instruction calls one of the static methods of Executors, replace Executors with OptimizedExecutors and insert the extra String argument

/**
 * @Author: leavesC
 * @Date: 2021/12/16
 * @Desc:
 * @Github: https://github.com/leavesC
 */
class OptimizedThreadTransform(private val config: OptimizedThreadConfig) : BaseTransform() {

    companion object {

        private const val executorsClass = "java/util/concurrent/Executors"

    }

    override fun modifyClass(byteArray: ByteArray): ByteArray {
        val classNode = ClassNode()
        val classReader = ClassReader(byteArray)
        classReader.accept(classNode, ClassReader.EXPAND_FRAMES)
        val methods = classNode.methods
        val taskList = mutableListOf<() -> Unit>()
        if (!methods.isNullOrEmpty()) {
            for (methodNode in methods) {
                val instructionIterator = methodNode.instructions?.iterator()
                if (instructionIterator != null) {
                    while (instructionIterator.hasNext()) {
                        val instruction = instructionIterator.next()
                        when (instruction.opcode) {
                            Opcodes.INVOKESTATIC -> {
                                val methodInsnNode = instruction as? MethodInsnNode
                                if (methodInsnNode != null && methodInsnNode.owner == executorsClass) {
                                    taskList.add {
                                        transformInvokeStatic(
                                            classNode,
                                            methodNode,
                                            methodInsnNode
                                        )
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
        taskList.forEach { it.invoke() }
        val classWriter = ClassWriter(ClassWriter.COMPUTE_MAXS)
        classNode.accept(classWriter)
        return classWriter.toByteArray()
    }

    private fun transformInvokeStatic(
        classNode: ClassNode,
        methodNode: MethodNode,
        methodInsnNode: MethodInsnNode
    ) {
        val pointMethod = config.threadHookPointList.find { it.methodName == methodInsnNode.name }
        if (pointMethod != null) {
            // Replace Executors with OptimizedExecutors
            methodInsnNode.owner = config.formatOptimizedThreadPoolClass
            // Insert an additional String parameter into the descriptor of methods such as newFixedThreadPool
            methodInsnNode.insertArgument(String::class.java)
            // Pass the class name as the extra argument to methods such as newFixedThreadPool
            methodNode.instructions.insertBefore(
                methodInsnNode,
                LdcInsnNode(classNode.simpleClassName)
            )
        }
    }

}

Points to note

It should be noted that not every project can apply all of the approaches above as-is

Giving threads uniform, easily recognizable names to help locate problems quickly when an exception occurs is always meaningful. However, setting a unified keep-alive timeout for thread pools and allowing core threads to be recycled has to take the usage scenario into account, otherwise it may even increase the probability of OOM

In most cases, if a thread pool has few tasks to handle and its core threads are not allowed to be recycled, the core threads will spend most of their time blocked without ever being released, wasting system resources; in that situation it is meaningful to allow core thread recycling through the hook. But if the tasks the pool handles arrive periodically, say a new task every six seconds, then hooking the timeout to five seconds means a thread that has just been recycled soon has to be created again; this repeated creation and recycling only adds overhead, and it would be better to let the core threads stay alive. In addition, it takes some time from when a thread is created until it gets scheduled by the system, which is not acceptable in scenarios that require a fast response, so allowing core threads to be recycled is not appropriate there either

Therefore, when remediating thread pools, you need to consider which business scenarios each pool serves and configure it according to how it is actually used
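As a plain illustration of that trade-off (hand-written configurations for comparison, not something produced by the hook):

import java.util.concurrent.LinkedBlockingQueue
import java.util.concurrent.ThreadPoolExecutor
import java.util.concurrent.TimeUnit

// a bursty, low-traffic pool: recycling idle core threads saves resources
val burstyPool = ThreadPoolExecutor(
    2, 4,
    5L, TimeUnit.SECONDS,
    LinkedBlockingQueue()
).apply {
    allowCoreThreadTimeOut(true)
}

// a periodic or latency-sensitive pool: keep core threads resident to avoid
// repeatedly destroying and recreating them
val residentPool = ThreadPoolExecutor(
    2, 2,
    60L, TimeUnit.SECONDS,
    LinkedBlockingQueue()
)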

For more details on how thread pools are implemented, see my earlier in-depth article on ThreadPoolExecutor, which explains its design philosophy and implementation in detail

Source code

Finally, here is the complete source code: ASM_Transform

It contains the full implementation source code for this article, and also the code from my previous article, which used ASM bytecode instrumentation to implement double-click debouncing. Hope it helps you 🤣🤣