Related articles: the ClassLoader parsing series

Preface

In Android application development, hotfix technology is being adopted by more and more developers, and many hotfix frameworks have appeared, such as AndFix, Tinker, Dexposed, and Nuwa. Knowing only how to use these frameworks is of limited value. We also need to understand the principles behind them, so that however the frameworks change, we can pick them up quickly as long as the underlying principles stay the same. This series does not walk through the source code of any particular hotfix framework; instead, it explains the general principles that hotfix frameworks share.

1. Overview of why hotfixing emerged

During development we encounter the following situations:

  1. A newly released version contains a serious bug. Fixing it means patching the code, testing, and re-publishing to every app market and channel, which is expensive.
  2. A bug from the previous version has already been fixed, but the next release is a major one and is still far off. The fix has to wait for that release, so users are affected for a long time.
  3. The upgrade rate is low and full version coverage takes a long time, so bugs in the previous version keep affecting users who have not upgraded.
  4. A small but important feature, such as a holiday event, must reach all users within a short period.

Hotfix frameworks were created to solve these problems. Even so, developers should not lean on a hotfix framework too heavily for bug handling. During development they should still self-test according to the standard process and work with testers to complete the test cycle.

2. Comparison of hotfix frameworks

There are many hotfix frameworks. Grouped by the company or team behind them, they fall into the following categories:

Category             | Frameworks
Alibaba              | AndFix, Dexposed, Ali Baichuan, Sophix
Tencent              | WeChat Tinker, QZone super patch, Mobile QQ QFix
Well-known companies | Meituan Robust, Ele.me Amigo, Mogujie Aceso
Other                | RocooFix, Nuwa, AnoleFix

Although there are many hotfix frameworks, their core technologies fall into three main groups: code repair, resource repair, and dynamic link library repair. Each of these has several technical schemes, and each scheme has different implementations. In addition, the frameworks themselves are still being updated and iterated, so the technical implementation of a hotfix framework is varied and changeable. As developers, we need to understand the basic principles behind these schemes so that we can cope with whatever changes come.

A comparison of some thermal repair frames is shown in the table below.

Feature                      | AndFix | Tinker/Amigo | QZone | Robust/Aceso
Takes effect immediately     | Yes    | No           | No    | Yes
Method replacement           | Yes    | Yes          | Yes   | Yes
Class replacement            | No     | Yes          | Yes   | No
Class structure modification | No     | Yes          | No    | No
Resource replacement         | No     | Yes          | Yes   | No
so replacement               | No     | Yes          | No    | No
Gradle support               | No     | Yes          | No    | No
ART support                  | Yes    | Yes          | Yes   | Yes
Android 7.0 support          | Yes    | Yes          | Yes   | Yes

We can choose a suitable hotfix framework based on the table above and the needs of our business. The table cannot be guaranteed to be fully accurate, because some of these frameworks are still being updated. The table also shows that Tinker and Amigo have the most features. Should we pick them? Not necessarily. More features mean more framework code, and we should choose whatever fits our business best. If all we need is method replacement, Tinker and Amigo are overkill. Likewise, if a fix has to take effect immediately, Tinker and Amigo cannot meet the requirement, while AndFix, Robust, and Aceso can. AndFix achieves this because its code repair uses low-level replacement, whereas Robust and Aceso borrow the principle of Instant Run. Next, we look at code repair.

3. Code fixes

There are three main schemes for code repair: the low-level replacement scheme, the class loading scheme, and the Instant Run scheme.

3.1 Class loading scheme

The class loading scheme is based on the Dex splitting scheme. What is the Dex splitting scheme? To answer that, we start with the 65536 limit and the LinearAlloc limit. As an application grows more complex, its code base grows and more libraries are pulled in, and the following exception may appear at build time:

com.android.dex.DexIndexOverflowException: method ID not in [0, 0xffff]: 65536

This indicates that the number of methods referenced by the application exceeds the maximum of 65536. The cause is the 65536 limit, which comes from the DVM bytecode format: the method index in the invoke-kind method invocation instructions of the DVM instruction set is 16 bits wide, so indices range from 0 to 0xffff and at most 65536 methods can be referenced. The LinearAlloc limit may show up as INSTALL_FAILED_DEXOPT at installation time. LinearAlloc in the DVM is a fixed-size buffer, and an error is reported when the number of methods exceeds what the buffer can hold.
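The arithmetic behind the 65536 limit can be checked with a few lines of Java. This is only an illustration of the index range; the class name and the helper `fitsInSingleDex` are hypothetical and are not part of any build tool.

```java
public class DexMethodLimit {
    // Width in bits of the method index in DVM invoke-kind instructions.
    static final int METHOD_INDEX_BITS = 16;

    // Number of distinct indices in the range 0x0000..0xffff.
    static final int MAX_METHOD_REFS = 1 << METHOD_INDEX_BITS;

    // Hypothetical helper: true if this many method references fit in one dex file.
    static boolean fitsInSingleDex(int methodRefCount) {
        return methodRefCount <= MAX_METHOD_REFS;
    }

    public static void main(String[] args) {
        System.out.println(MAX_METHOD_REFS);          // 65536
        System.out.println(fitsInSingleDex(65_536));  // true
        System.out.println(fitsInSingleDex(65_537));  // false
    }
}
```

The exception shown earlier reports the ID 65536 because that is the first index outside [0, 0xffff].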

The Dex splitting scheme was created to work around the 65536 limit and the LinearAlloc limit. At packaging time it splits the application code into multiple Dex files. The classes that must be available at startup, together with the classes they reference directly, go into the main Dex, and the remaining code goes into secondary Dex files. When the application starts, the main Dex is loaded first and the secondary Dex files are loaded dynamically afterwards, which relieves the 65536 limit and the LinearAlloc limit on the main Dex.

There are two main Dex splitting schemes: the official Google scheme, and the automatic Dex splitting with dynamic loading scheme. Dex splitting is not the focus of this chapter, so we will not go into more detail here and will move on to the class loading scheme. The loading process of ClassLoader is described in the Android ClassLoader article. One step in that process is a call to the findClass method of DexPathList, shown below.
libcore/dalvik/src/main/java/dalvik/system/DexPathList.java

    public Class<?> findClass(String name, List<Throwable> suppressed) {
        for (Element element : dexElements) { // 1
            Class<?> clazz = element.findClass(name, definingContext, suppressed); // 2
            if (clazz != null) {
                return clazz;
            }
        }
        if (dexElementsSuppressedExceptions != null) {
            suppressed.addAll(Arrays.asList(dexElementsSuppressedExceptions));
        }
        return null;
    }

Element wraps a DexFile, which is used to load a dex file, so each dex file corresponds to one Element. Multiple Elements form an ordered Element array, dexElements. To find a class, the code iterates over dexElements at comment 1 (which amounts to iterating over the dex files) and calls Element's findClass method at comment 2, which internally calls DexFile's loadClassBinaryName method to look up the class. If the class is found in an Element (dex file) it is returned; otherwise the search continues in the next Element. Based on this lookup process, we fix the buggy class Key.class, package the fixed Key.class into a patch containing a dex file (Patch.dex), and place it in the first position of the dexElements array. The Key.class in Patch.dex is then found first and takes the place of the buggy Key.class. The buggy Key.class in a dex file later in the array is never loaded, because under ClassLoader's parent delegation model a class that has already been loaded is not loaded again. This is the class loading scheme, as shown in the figure below.
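The effect of placing the patch at the front of dexElements can be simulated on a plain JVM. The sketch below is a toy model, not Android code: `Element` here is a hypothetical stand-in that maps a class name to a label, and `findClass` only mirrors the first-match loop shown above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DexLookupSketch {
    // Toy stand-in for DexPathList.Element: maps a class name to a label
    // describing which dex file the implementation came from.
    static final class Element {
        private final Map<String, String> classes = new HashMap<>();

        Element with(String className, String implementation) {
            classes.put(className, implementation);
            return this;
        }

        String findClass(String name) {
            return classes.get(name);
        }
    }

    // Mirrors the loop in findClass: the first element containing the class wins.
    static String findClass(List<Element> dexElements, String name) {
        for (Element element : dexElements) {
            String clazz = element.findClass(name);
            if (clazz != null) {
                return clazz;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Element> dexElements = new ArrayList<>();
        dexElements.add(new Element().with("Key", "buggy Key from classes.dex"));
        System.out.println(findClass(dexElements, "Key"));

        // Insert the patch at index 0 so it is searched before the original dex.
        dexElements.add(0, new Element().with("Key", "fixed Key from Patch.dex"));
        System.out.println(findClass(dexElements, "Key"));
    }
}
```

The buggy entry is still present in the list after patching; it is simply never reached because the search stops at the first match.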

The class loading scheme requires the app to restart so that the ClassLoader can load the new class. Why is a restart needed? Because a loaded class cannot be unloaded, loading the new version of the class requires restarting the app, so hotfix frameworks based on class loading do not take effect immediately. Although many hotfix frameworks use the class loading scheme, they differ in implementation details and steps. QZone's super patch and Nuwa follow the approach described above and put the patch at the first position of the Element array so that it is loaded first. WeChat Tinker diffs the old and new APKs to obtain patch.dex, then merges patch.dex with the classes.dex of the APK on the phone to produce a new classes.dex, which is placed at the first position of the Element array by reflection at run time. Ele.me Amigo takes the Element for each dex file in the patch package, builds a new Element array from them, and replaces the existing Element array with the new one by reflection at run time.
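The "patch first" arrangement comes down to building a new array whose leading entries are the patch elements. The following is a minimal sketch of that merge step using `java.lang.reflect.Array`, under the assumption that both inputs are arrays of the same component type. The method name `combineArray` is illustrative, and plain String arrays stand in for the real Element arrays, which on a device would be read from and written back to the class loader by reflection.

```java
import java.lang.reflect.Array;

public class ElementArrayMerge {
    // Builds a new array of the original component type with the patch
    // entries first, followed by the original entries.
    static Object combineArray(Object patchElements, Object originalElements) {
        Class<?> componentType = originalElements.getClass().getComponentType();
        int patchLength = Array.getLength(patchElements);
        int originalLength = Array.getLength(originalElements);

        Object merged = Array.newInstance(componentType, patchLength + originalLength);
        System.arraycopy(patchElements, 0, merged, 0, patchLength);
        System.arraycopy(originalElements, 0, merged, patchLength, originalLength);
        return merged;
    }

    public static void main(String[] args) {
        // Strings stand in for Element objects in this sketch.
        String[] original = {"classes.dex", "classes2.dex"};
        String[] patch = {"patch.dex"};

        String[] merged = (String[]) combineArray(patch, original);
        System.out.println(String.join(", ", merged));
    }
}
```

Working on `Object` rather than a typed array is deliberate: the element type of dexElements is not a public API, so reflection-based code generally cannot name it at compile time.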

The class loading scheme is used mainly by the Tencent family of frameworks, including WeChat Tinker, QZone super patch, and Mobile QQ QFix, as well as by Ele.me Amigo and Nuwa.

3.2 Low-level replacement scheme

Unlike the class loading scheme, the low-level replacement scheme does not load a new class. Instead, it modifies the existing class directly in the native layer. Because it modifies the original class in place, it is subject to more restrictions: methods and fields cannot be added to or removed from the original class. If the number of methods grows, the method indexes grow with it, and looking up a method by index will no longer find the correct method. The same applies to fields. The low-level replacement scheme is related to how reflection works. Take method replacement as an example. To invoke a method by reflection we can call java.lang.Class.getDeclaredMethod. Suppose we want to reflectively call the show method of Key; the call looks like this:

   Key.class.getDeclaredMethod("show").invoke(Key.class.newInstance());
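For readers who want to run the reflective call, here is a self-contained version. Only the names `Key` and `show` come from the text; the body of `Key` and the wrapper method `invokeShow` are assumptions made for illustration. It uses `getDeclaredConstructor().newInstance()` because `Class.newInstance()` is deprecated on recent JDKs.

```java
import java.lang.reflect.Method;

public class ReflectShowSketch {
    // Hypothetical Key class: only its name and the method name "show"
    // appear in the text, the body is made up for this example.
    public static class Key {
        public String show() {
            return "Key.show() called";
        }
    }

    // Looks up show() by name and invokes it on a fresh Key instance.
    static Object invokeShow() {
        try {
            Method show = Key.class.getDeclaredMethod("show");
            Object receiver = Key.class.getDeclaredConstructor().newInstance();
            return show.invoke(receiver);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflective call to Key.show() failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(invokeShow());
    }
}
```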

The invoke method in Android 8.0 is shown below.
libcore/ojluni/src/main/java/java/lang/reflect/Method.java

    @FastNative
    public native Object invoke(Object obj, Object... args)
            throws IllegalAccessException, IllegalArgumentException, InvocationTargetException;


The invoke method is a native method. The corresponding JNI layer code is:
art/runtime/native/java_lang_reflect_method.cc

static jobject Method_invoke(JNIEnv* env, jobject javaMethod, jobject javaReceiver, jobject javaArgs) {
  ScopedFastNativeObjectAccess soa(env);
  return InvokeMethod(soa, javaMethod, javaReceiver, javaArgs);
}

The Method_invoke function in turn calls the InvokeMethod function:
art/runtime/reflection.cc

jobject InvokeMethod(const ScopedObjectAccessAlreadyRunnable& soa, jobject javaMethod,
                     jobject javaReceiver, jobject javaArgs, size_t num_frames) {
  ...
  ObjPtr<mirror::Executable> executable = soa.Decode<mirror::Executable>(javaMethod);
  const bool accessible = executable->IsAccessible();
  ArtMethod* m = executable->GetArtMethod(); // 1
  ...
}

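At comment 1, the ArtMethod pointer for the incoming javaMethod (the show method of Key) is obtained from the ART virtual machine. The ArtMethod structure holds all the information about a Java method, including its execution entry, access flags, declaring class, and code execution address. The structure of ArtMethod is shown below.
art/runtime/art_method.h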

class ArtMethod FINAL {
  ...
 protected:
  GcRoot<mirror::Class> declaring_class_;
  std::atomic<std::uint32_t> access_flags_;
  uint32_t dex_code_item_offset_;
  uint32_t dex_method_index_;
  uint16_t method_index_;
  uint16_t hotness_count_;
  struct PtrSizedFields {
    ArtMethod** dex_cache_resolved_methods_; // 1
    void* data_;
    void* entry_point_from_quick_compiled_code_; // 2
  } ptr_sized_fields_;
};

The more important fields in the ArtMethod structure are dex_cache_resolved_methods_ at comment 1 and entry_point_from_quick_compiled_code_ at comment 2, which serve as the execution entry of the method. When a method is called (such as the show method of Key), the runtime obtains the entry of show and jumps to it to execute the method body. Replacing fields in the ArtMethod structure, or replacing the whole ArtMethod structure, is what the low-level replacement scheme does. AndFix replaces individual fields of the ArtMethod structure. This can cause compatibility problems, because device vendors may modify the ArtMethod structure, in which case the method replacement fails. Sophix replaces the entire ArtMethod structure and therefore avoids this compatibility problem. Because low-level replacement swaps the method directly, it takes effect immediately without a restart. The low-level replacement scheme is used mainly by the Alibaba family of frameworks, including AndFix, Dexposed, Ali Baichuan, and Sophix.
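Why in-place replacement takes effect without a restart can be shown with a toy model on a plain JVM. Everything in the sketch is hypothetical: `MethodRecord` is a drastically reduced stand-in for ArtMethod with two made-up fields, and `copyFrom` imitates overwriting the whole structure. Real frameworks do this in native code against ART's memory layout, which this example does not attempt to reproduce.

```java
import java.util.function.Supplier;

public class LowLevelReplaceSketch {
    // Toy stand-in for ArtMethod: flags plus an "entry point" to run.
    static final class MethodRecord {
        int accessFlags;
        Supplier<String> entryPoint;

        MethodRecord(int accessFlags, Supplier<String> entryPoint) {
            this.accessFlags = accessFlags;
            this.entryPoint = entryPoint;
        }

        // Overwrites every field of this record in place with the fields of
        // src, imitating replacement of the whole structure.
        void copyFrom(MethodRecord src) {
            this.accessFlags = src.accessFlags;
            this.entryPoint = src.entryPoint;
        }
    }

    // A call site: looks up the entry point in the record and jumps to it.
    static String call(MethodRecord method) {
        return method.entryPoint.get();
    }

    public static void main(String[] args) {
        MethodRecord show = new MethodRecord(1, () -> "buggy show");
        MethodRecord fixedShow = new MethodRecord(1, () -> "fixed show");

        // Existing callers keep pointing at the original record.
        MethodRecord callerReference = show;
        System.out.println(call(callerReference));

        // Patch the original record in place; no reload, no new reference.
        show.copyFrom(fixedShow);
        System.out.println(call(callerReference));
    }
}
```

The caller's reference never changes, yet the second call runs the fixed code, which is the property that lets this family of schemes skip the restart.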

3.3 Instant Run scheme

Besides resource repair, code repair can also borrow from the principles of Instant Run. It is fair to say that the arrival of Instant Run pushed the development of hotfix frameworks forward. When Instant Run builds the APK for the first time, it uses ASM to inject code like the following into every method:

    IncrementalChange localIncrementalChange = $change; // 1
    if (localIncrementalChange != null) { // 2
        localIncrementalChange.access$dispatch(
                "onCreate.(Landroid/os/Bundle;)V", new Object[] { this, paramBundle });
        return;
    }

At comment 1 there is a variable localIncrementalChange whose value is $change; $change implements the IncrementalChange interface. When we click Instant Run, if the method has not changed then $change is null, the branch at comment 2 is skipped, and the original method body runs as usual. If the method has changed, a replacement class is generated. Suppose the onCreate method of MainActivity has been modified. A replacement class MainActivity$override is generated, and it implements the IncrementalChange interface. An AppPatchesLoaderImpl class is also generated, whose getPatchedClasses method returns the list of modified classes (including MainActivity). Based on this list, the $change field of MainActivity is set to MainActivity$override. The condition at comment 2 then holds and the access$dispatch method is executed. Using the argument "onCreate.(Landroid/os/Bundle;)V", access$dispatch runs the onCreate method of MainActivity$override, which is how the change to onCreate takes effect. Robust and Aceso are the hotfix frameworks that borrow the principle of Instant Run.
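The dispatch mechanism can be imitated in plain Java. The sketch below is a simplified model, not the code Instant Run actually generates: the interface and class names follow the text, but the method bodies are assumptions, onCreate takes a String and returns a String so the result can be observed, and the $change field is assigned by hand instead of by a patch loader.

```java
public class InstantRunSketch {
    // Simplified stand-in for the IncrementalChange interface.
    interface IncrementalChange {
        Object access$dispatch(String methodId, Object... args);
    }

    static class MainActivity {
        // Stays null until a patch is installed.
        static volatile IncrementalChange $change;

        // Returns a String (unlike the real onCreate) so the path taken is visible.
        String onCreate(String paramBundle) {
            IncrementalChange localIncrementalChange = $change;
            if (localIncrementalChange != null) {
                // A patch exists: hand the call over to the replacement class.
                return (String) localIncrementalChange.access$dispatch(
                        "onCreate.(Landroid/os/Bundle;)V", new Object[] { this, paramBundle });
            }
            // No patch: run the original method body.
            return "original onCreate(" + paramBundle + ")";
        }
    }

    // Replacement class: selects the patched method by its id string.
    static class MainActivity$override implements IncrementalChange {
        @Override
        public Object access$dispatch(String methodId, Object... args) {
            if ("onCreate.(Landroid/os/Bundle;)V".equals(methodId)) {
                // args[0] is the receiver, args[1] is the first real argument.
                return "patched onCreate(" + args[1] + ")";
            }
            throw new IllegalArgumentException("Unknown method id: " + methodId);
        }
    }

    public static void main(String[] args) {
        MainActivity activity = new MainActivity();
        System.out.println(activity.onCreate("state"));

        // Installing the patch amounts to assigning the $change field.
        MainActivity.$change = new MainActivity$override();
        System.out.println(activity.onCreate("state"));
    }
}
```

Because every method carries this check, a patch can redirect any method without replacing the class itself, at the cost of an extra field read and null check on each call.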