Preface

Hotfix technology is currently a relatively advanced and popular topic in Android development, and a skill that intermediate developers need to master on the way to becoming senior developers. The Android industry now offers a wide range of hotfix technologies: the major vendors have each released their own hotfix solution, the underlying techniques differ, and every solution has its own limitations. I hope this article helps you understand how these hotfix solutions compare and how they are implemented, grasp the essence of hotfix technology, and apply it in real projects (study notes are shared at the end of the article).

What is hot repair

In simple terms, a hotfix repairs a problem in an app that is already online by delivering a patch, without publishing a new release of the app.

Normal release development vs. hot fix development process

Why learn hotfix

In the normal software development process, the cycle is: develop offline -> release -> find bugs -> release an emergency fix. However, this approach is costly and can never avoid the following questions:

  1. Is the release bug-free?

  2. Will every user update to the fixed version in time?

  3. How do you minimize the impact of online bugs on the business?

In comparison, the hotfix development process is more flexible: there is no need to publish a new release, fixes are applied quickly and efficiently, users do not need to download a new version of the application, the cost is low, and, most importantly, bugs are fixed in time. As hotfix technology has developed, it has become possible to repair not only code but also resource files and SO libraries.

How to choose an appropriate hotfix solution?

As mentioned at the beginning of the article, the major vendors have each released their own hotfix solution, so how should we choose a suitable one to learn? The comparison of the current mainstream hotfix solutions below gives the answer.

Mainstream hotfix solutions in China

1. Alibaba

  • Sophix: not open source, commercial (paid), real-time and cold-start fix

HotFix is an optimized version of AndFix, and Sophix is an optimized version of HotFix. Alibaba’s current flagship is Sophix.

2. Tencent

  • Qzone super patch: Qzone team, not open source, cold-start fix

  • QFix: Mobile QQ team, open source, cold-start fix

  • Tinker: WeChat team, open source, cold-start fix; provides distribution management, basic version free

3. Others

  • Robust: Meituan, open source, real-time fix

  • Nuwa: Dianping, open source, cold-start fix

  • Amigo: Ele.me, open source, cold-start fix

Comparison of various thermal repair schemes

How do you choose the right hotfix solution? Everything depends on your requirements. If the company has strong overall capabilities, developing its own solution is an option, but the cost and maintenance must be weighed carefully. Some suggestions follow:

  1. Project requirements
  • Do you only need simple method-level bug fixes?

  • Do you need to fix resources and SO libraries?

  • What are the requirements for platform compatibility and success rate?

  • Do you need distribution control, monitoring statistics, and patch package management?

  • Does the company’s budget allow for a commercial (paid) solution?

  2. Cost of learning and use
  • Integration difficulty

  • Code intrusiveness

  • Debugging and maintenance

  3. Prefer solutions from large vendors
  • Technical performance is guaranteed

  • Dedicated maintenance

  • High popularity, active open source community

  4. If you are willing to pay, I recommend Alibaba’s Sophix. Sophix is the result of comprehensive optimization: it has complete features, simple and transparent development, and distribution and monitoring management. If you do not want to pay and only need method-level bug fixes, without resource or SO support, Robust is recommended. If you need to support both resources and SO, Tinker is recommended. Finally, if the company has strong overall capabilities, it can consider developing its own solution, which is the most flexible and controllable option.

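Principles of the hotfix technical solutions

Technical classification

The sections below cover four approaches to code repair (NativeHook, JavaHook, Java multidex, and dex replacement), followed by resource repair and SO library repair.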

NativeHook principle

Principle and Implementation

The principle of NativeHook is to swap the method structure information directly in the native layer, so that the old method is completely replaced by the new one and the fix takes effect. Here is a piece of JNI code from AndFix:

    void replace_6_0(JNIEnv* env, jobject src, jobject dest) {
        art::mirror::ArtMethod* smeth =
                (art::mirror::ArtMethod*) env->FromReflectedMethod(src);
        art::mirror::ArtMethod* dmeth =
                (art::mirror::ArtMethod*) env->FromReflectedMethod(dest);

        reinterpret_cast<art::mirror::Class*>(dmeth->declaring_class_)->class_loader_ =
                reinterpret_cast<art::mirror::Class*>(smeth->declaring_class_)->class_loader_; // for plugin classloader
        reinterpret_cast<art::mirror::Class*>(dmeth->declaring_class_)->clinit_thread_id_ =
                reinterpret_cast<art::mirror::Class*>(smeth->declaring_class_)->clinit_thread_id_;
        reinterpret_cast<art::mirror::Class*>(dmeth->declaring_class_)->status_ =
                reinterpret_cast<art::mirror::Class*>(smeth->declaring_class_)->status_ - 1;
        // for reflection invoke
        reinterpret_cast<art::mirror::Class*>(dmeth->declaring_class_)->super_class_ = 0;

        // Replace all member variables of the old method with those of the new method
        smeth->declaring_class_ = dmeth->declaring_class_;
        smeth->dex_cache_resolved_methods_ = dmeth->dex_cache_resolved_methods_;
        smeth->dex_cache_resolved_types_ = dmeth->dex_cache_resolved_types_;
        smeth->access_flags_ = dmeth->access_flags_ | 0x0001;
        smeth->dex_code_item_offset_ = dmeth->dex_code_item_offset_;
        smeth->dex_method_index_ = dmeth->dex_method_index_;
        smeth->method_index_ = dmeth->method_index_;

        smeth->ptr_sized_fields_.entry_point_from_interpreter_ =
                dmeth->ptr_sized_fields_.entry_point_from_interpreter_;
        smeth->ptr_sized_fields_.entry_point_from_jni_ =
                dmeth->ptr_sized_fields_.entry_point_from_jni_;
        smeth->ptr_sized_fields_.entry_point_from_quick_compiled_code_ =
                dmeth->ptr_sized_fields_.entry_point_from_quick_compiled_code_;

        LOGD("replace_6_0: %d , %d",
                smeth->ptr_sized_fields_.entry_point_from_quick_compiled_code_,
                dmeth->ptr_sized_fields_.entry_point_from_quick_compiled_code_);
    }

    void setFieldFlag_6_0(JNIEnv* env, jobject field) {
        art::mirror::ArtField* artField =
                (art::mirror::ArtField*) env->FromReflectedField(field);
        artField->access_flags_ = artField->access_flags_ & (~0x0002) | 0x0001;
        LOGD("setFieldFlag_6_0: %d ", artField->access_flags_);
    }

Each Java method corresponds to an ArtMethod in ART, which records all information about the Java method, including its access flags and the addresses of its executable code. Calling env->FromReflectedMethod returns the real start address of the ArtMethod that corresponds to the method; the code then casts that address to an ArtMethod pointer and overwrites all of its members.

After that, any call to the old method goes directly to the implementation of the new method, which achieves the hotfix.
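The access-flag arithmetic in the AndFix snippet above can be tried out in plain Java. The sketch below is only an illustration: it uses the standard `java.lang.reflect.Modifier` constants (`PUBLIC` is 0x0001, `PRIVATE` is 0x0002), while the class and method names (`AccessFlagsSketch`, `makePublic`, `clearPrivateAndMakePublic`) are made up for this example. It mimics the bit operations only and does not touch any real ART structure.

```java
import java.lang.reflect.Modifier;

class AccessFlagsSketch {
    // Mirrors `dmeth->access_flags_ | 0x0001`: force the public bit on.
    static int makePublic(int accessFlags) {
        return accessFlags | Modifier.PUBLIC;
    }

    // Mirrors `access_flags_ & (~0x0002) | 0x0001`: drop the private bit, add the public bit.
    static int clearPrivateAndMakePublic(int accessFlags) {
        return (accessFlags & ~Modifier.PRIVATE) | Modifier.PUBLIC;
    }

    public static void main(String[] args) {
        int privateStatic = Modifier.PRIVATE | Modifier.STATIC;
        int patched = clearPrivateAndMakePublic(privateStatic);
        // Print the flags before and after in their textual form.
        System.out.println(Modifier.toString(privateStatic) + " -> " + Modifier.toString(patched));
    }
}
```

Widening the access flags in this way is presumably what lets the replaced method and the patched fields be reached from the patch code regardless of their original visibility.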

Advantages

  • Takes effect immediately

  • No performance overhead; no compile-time instrumentation or code rewriting is required

Disadvantages

  • Stability and compatibility problems. The ArtMethod structure is taken from Google’s open source code, but vendors may modify it in their ROMs, which can make the structures inconsistent and cause the fix to fail.

  • Variables and classes cannot be added; only method-level bugs can be fixed, and new features cannot be released

JavaHook principle

Principle and Implementation

Taking Meituan’s Robust as an example, its principle can be described simply as follows:

1. When the base package is built, a static variable of type ChangeQuickRedirect is inserted into each class, and a check against it is inserted at the start of each method. The insertion is completely transparent to business development.

2. When a patch is loaded, the classes and methods to be replaced are read from the patch package, and a new ClassLoader is created to load the patch dex. When changeQuickRedirect is not null, accessDispatch may be executed in place of the old logic, which achieves the fix.

The following analysis walks through the Robust source code. First, the logic inserted into the base package looks like this:

    public static ChangeQuickRedirect u;

    protected void onCreate(Bundle bundle) {
        // Repair logic inserted automatically into every method.
        // If the ChangeQuickRedirect field is null, it is not executed.
        if (u != null) {
            if (PatchProxy.isSupport(new Object[]{bundle}, this, u, false, 78)) {
                PatchProxy.accessDispatchVoid(new Object[]{bundle}, this, u, false, 78);
                return;
            }
        }
        super.onCreate(bundle);
        ...
    }

The core source code of Robust repair is as follows:

    public class PatchExecutor extends Thread {
        @Override
        public void run() {
            ...
            applyPatchList(patches);
            ...
        }

        /**
         * Applies each patch in the list.
         */
        protected void applyPatchList(List<Patch> patches) {
            for (Patch p : patches) {
                ...
                currentPatchResult = patch(context, p);
                ...
            }
        }

        /**
         * Applies a single patch.
         */
        protected boolean patch(Context context, Patch patch) {
            ...
            DexClassLoader classLoader = new DexClassLoader(patch.getTempPath(),
                    context.getCacheDir().getAbsolutePath(), null,
                    PatchExecutor.class.getClassLoader());
            patch.delete(patch.getTempPath());
            ...
            try {
                patchsInfoClass = classLoader.loadClass(patch.getPatchesInfoImplClassFullName());
                patchesInfo = (PatchesInfo) patchsInfoClass.newInstance();
            } catch (Throwable t) {
                ...
            }
            ...
            // Iterate over the patched class information and use reflection
            // to change the value of each class's ChangeQuickRedirect field.
            for (PatchedClassInfo patchedClassInfo : patchedClasses) {
                try {
                    oldClass = classLoader.loadClass(patchedClassName.trim());
                    Field[] fields = oldClass.getDeclaredFields();
                    for (Field field : fields) {
                        if (TextUtils.equals(field.getType().getCanonicalName(),
                                ChangeQuickRedirect.class.getCanonicalName())
                                && TextUtils.equals(field.getDeclaringClass().getCanonicalName(),
                                oldClass.getCanonicalName())) {
                            changeQuickRedirectField = field;
                            break;
                        }
                    }
                    ...
                    try {
                        patchClass = classLoader.loadClass(patchClassName);
                        Object patchObject = patchClass.newInstance();
                        changeQuickRedirectField.setAccessible(true);
                        changeQuickRedirectField.set(null, patchObject);
                    } catch (Throwable t) {
                        ...
                    }
                } catch (Throwable t) {
                    ...
                }
            }
            return true;
        }
    }
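To make the dispatch mechanism easier to follow, here is a minimal plain-Java sketch of the same idea. It runs without Android or Robust. Everything in it is an assumption made for illustration: the nested `ChangeQuickRedirect` interface is a simplified stand-in with made-up method signatures (the real Robust interface differs), and `PriceCalculator`, `PriceCalculatorPatch`, and `applyPatch` are hypothetical names.

```java
import java.lang.reflect.Field;

class RobustSketch {
    // Simplified stand-in for Robust's ChangeQuickRedirect (signatures are assumptions).
    interface ChangeQuickRedirect {
        boolean isSupport(String methodName, Object[] args);
        Object accessDispatch(String methodName, Object[] args);
    }

    // "Base package" class: the static field and the check at the top of the method
    // play the role of the code inserted at build time.
    static class PriceCalculator {
        public static ChangeQuickRedirect changeQuickRedirect;

        public int average(int total, int count) {
            if (changeQuickRedirect != null
                    && changeQuickRedirect.isSupport("average", new Object[]{total, count})) {
                return (Integer) changeQuickRedirect.accessDispatch("average", new Object[]{total, count});
            }
            return total / count; // bug: throws ArithmeticException when count == 0
        }
    }

    // "Patch" class carrying the fixed logic.
    static class PriceCalculatorPatch implements ChangeQuickRedirect {
        @Override
        public boolean isSupport(String methodName, Object[] args) {
            return "average".equals(methodName);
        }

        @Override
        public Object accessDispatch(String methodName, Object[] args) {
            int total = (Integer) args[0];
            int count = (Integer) args[1];
            return count == 0 ? 0 : total / count;
        }
    }

    // Like the loop in PatchExecutor.patch: find the static field whose type is
    // ChangeQuickRedirect and assign the patch object to it via reflection.
    static void applyPatch(Class<?> oldClass, ChangeQuickRedirect patchObject) {
        try {
            for (Field field : oldClass.getDeclaredFields()) {
                if (field.getType() == ChangeQuickRedirect.class) {
                    field.setAccessible(true);
                    field.set(null, patchObject);
                    return;
                }
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        throw new IllegalStateException("no ChangeQuickRedirect field in " + oldClass.getName());
    }

    public static void main(String[] args) {
        PriceCalculator calculator = new PriceCalculator();
        applyPatch(PriceCalculator.class, new PriceCalculatorPatch());
        // The call is now served by the patch instead of the buggy division.
        System.out.println(calculator.average(10, 0));
    }
}
```

Because the patch is just an object assigned to a static field, the very next call to the method picks it up, which is why this style of fix can take effect without a restart.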

Advantages

  • Robust is highly compatible (it only uses DexClassLoader in the normal way) and stable, with a repair success rate as high as 99.9%

  • The patch takes effect in real time, and no restart is required

  • Supports method-level fixes, including static methods

  • Supports adding methods and classes

  • Supports ProGuard obfuscation, inlining, and optimization

Disadvantages

  • The code is intrusive: extra code is added to existing classes

  • Replacement of SO libraries and resources is currently not supported

  • It increases the APK size by an average of 17.47 bytes per method, or about 1.67 MB per 100,000 methods

Java multidex principle

Principle and Implementation

Android uses BaseDexClassLoader, PathClassLoader, and DexClassLoader to read class data from dex files; PathClassLoader and DexClassLoader both inherit from BaseDexClassLoader. Each dex file is converted into a DexFile object and stored in an Element[] array. findClass iterates over the Element array, obtains each DexFile, and asks that DexFile to load the class. The source code is as follows:

    public Class findClass(String name, List<Throwable> suppressed) {
        // Iterate over the dex and resource Elements found on dexPath
        for (Element element : dexElements) {
            DexFile dex = element.dexFile;
            // If the current Element is a dex file Element
            if (dex != null) {
                // Load the class using DexFile.loadClassBinaryName
                Class clazz = dex.loadClassBinaryName(name, definingContext, suppressed);
                if (clazz != null) {
                    return clazz;
                }
            }
        }
        if (dexElementsSuppressedExceptions != null) {
            suppressed.addAll(Arrays.asList(dexElementsSuppressedExceptions));
        }
        return null;
    }

The principle of this solution is therefore to hook pathList.dexElements[] and insert the patch dex at the front of the array. Because the ClassLoader’s findClass finds classes by iterating over the dex files in dexElements[] in order, the fixed classes are found first, which achieves the repair.

The key part of Nuwa’s implementation is as follows:

    public static void injectDexAtFirst(String dexPath, String defaultDexOptPath)
            throws NoSuchFieldException, IllegalAccessException, ClassNotFoundException {
        // Create a ClassLoader to load the patch dex
        DexClassLoader dexClassLoader = new DexClassLoader(dexPath, defaultDexOptPath, dexPath, getPathClassLoader());
        // dexElements array of the original ClassLoader
        Object baseDexElements = getDexElements(getPathList(getPathClassLoader()));
        // dexElements array of the patch ClassLoader
        Object newDexElements = getDexElements(getPathList(dexClassLoader));
        // Merge the two arrays, with the patch elements first
        Object allDexElements = combineArray(newDexElements, baseDexElements);
        Object pathList = getPathList(getPathClassLoader());
        // Update the Element array of the original ClassLoader
        ReflectionUtils.setField(pathList, pathList.getClass(), "dexElements", allDexElements);
    }

    private static PathClassLoader getPathClassLoader() {
        PathClassLoader pathClassLoader = (PathClassLoader) DexUtils.class.getClassLoader();
        return pathClassLoader;
    }

    private static Object getDexElements(Object paramObject)
            throws IllegalArgumentException, NoSuchFieldException, IllegalAccessException {
        return ReflectionUtils.getField(paramObject, paramObject.getClass(), "dexElements");
    }

    private static Object getPathList(Object baseDexClassLoader)
            throws IllegalArgumentException, NoSuchFieldException, IllegalAccessException, ClassNotFoundException {
        return ReflectionUtils.getField(baseDexClassLoader,
                Class.forName("dalvik.system.BaseDexClassLoader"), "pathList");
    }

    private static Object combineArray(Object firstArray, Object secondArray) {
        Class<?> localClass = firstArray.getClass().getComponentType();
        int firstArrayLength = Array.getLength(firstArray);
        int allLength = firstArrayLength + Array.getLength(secondArray);
        Object result = Array.newInstance(localClass, allLength);
        for (int k = 0; k < allLength; ++k) {
            if (k < firstArrayLength) {
                Array.set(result, k, Array.get(firstArray, k));
            } else {
                Array.set(result, k, Array.get(secondArray, k - firstArrayLength));
            }
        }
        return result;
    }
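The effect of placing the patch elements first can be reproduced outside Android. The following plain-Java sketch is a simulation under stated assumptions: `Element` here is a made-up stand-in that maps class names to a label instead of wrapping a real DexFile, and the class names are invented. It only shows that a first-match traversal over the merged array returns the patched entry.

```java
import java.lang.reflect.Array;
import java.util.Map;

class DexElementsSketch {
    // Hypothetical stand-in for DexPathList.Element: class name -> implementation label.
    static final class Element {
        final Map<String, String> classes;

        Element(Map<String, String> classes) {
            this.classes = classes;
        }
    }

    // Same idea as Nuwa's combineArray: entries of the first array come first.
    static Object combineArray(Object firstArray, Object secondArray) {
        Class<?> componentType = firstArray.getClass().getComponentType();
        int firstLength = Array.getLength(firstArray);
        int secondLength = Array.getLength(secondArray);
        Object result = Array.newInstance(componentType, firstLength + secondLength);
        System.arraycopy(firstArray, 0, result, 0, firstLength);
        System.arraycopy(secondArray, 0, result, firstLength, secondLength);
        return result;
    }

    // Same first-match-wins traversal as findClass.
    static String findClass(Element[] dexElements, String name) {
        for (Element element : dexElements) {
            String found = element.classes.get(name);
            if (found != null) {
                return found;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Element[] base = {new Element(Map.of("com.demo.Cart", "buggy Cart", "com.demo.User", "User"))};
        Element[] patch = {new Element(Map.of("com.demo.Cart", "fixed Cart"))};
        Element[] all = (Element[]) combineArray(patch, base);
        System.out.println(findClass(all, "com.demo.Cart")); // prints "fixed Cart"
        System.out.println(findClass(all, "com.demo.User")); // prints "User"
    }
}
```

Classes that are not in the patch still resolve from the original elements, so the patch dex only needs to contain the classes that changed.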

Advantages

  • No need to adapt separately to the Dalvik and ART virtual machines

  • The code is non-invasive and has little impact on APK size

Disadvantages

  • The fix only takes effect on the next startup

  • High performance loss. To keep classes from being marked CLASS_ISPREVERIFIED, instrumentation is used: a separate helper class is placed in an independent dex and referenced by the other classes.

Dex replacement

Principle and Implementation

To avoid the performance loss caused by dex instrumentation, dex replacement takes a different approach. The principle is to ship a dex diff package and replace the dex as a whole. The patch.dex is merged with the application’s classes.dex into a complete dex; the complete dex is loaded into a DexFile object, which is used to construct an Element object, and the old dexElements array is replaced as a whole.

This is the scheme adopted by WeChat’s Tinker, for which the Tinker team developed its own DexDiff/DexMerge algorithm. Tinker also supports resource and SO updates. SO patches are generated using BsDiff. Resource patches are generated by comparing file MD5 values directly, and BsDiff is used to generate diff patches for large resource files (by default, files larger than 100 KB count as large).

The full algorithm is very complex, so we only look at the key part. The tryPatch method in UpgradePatch, which performs the repair, is as follows:

    @Override
    public boolean tryPatch(Context context, String tempPatchPath, PatchResult patchResult) {
        ...
        // The key diff and merge steps follow; their implementation is relatively complex.
        // We use destPatchFile instead of patchFile, because patchFile may be deleted during the patch process
        if (!DexDiffPatchInternal.tryRecoverDexFiles(manager, signatureCheck, context, patchVersionDirectory, destPatchFile)) {
            TinkerLog.e(TAG, "UpgradePatch tryPatch:new patch recover, try patch dex failed");
            return false;
        }

        if (!BsDiffPatchInternal.tryRecoverLibraryFiles(manager, signatureCheck, context, patchVersionDirectory, destPatchFile)) {
            TinkerLog.e(TAG, "UpgradePatch tryPatch:new patch recover, try patch library failed");
            return false;
        }

        if (!ResDiffPatchInternal.tryRecoverResourceFiles(manager, signatureCheck, context, patchVersionDirectory, destPatchFile)) {
            TinkerLog.e(TAG, "UpgradePatch tryPatch:new patch recover, try patch resource failed");
            return false;
        }

        // check dex opt file at last, some phone such as VIVO/OPPO like to change dex2oat to interpreted
        if (!DexDiffPatchInternal.waitAndCheckDexOptFile(patchFile, manager)) {
            TinkerLog.e(TAG, "UpgradePatch tryPatch:new patch recover, check dex opt file failed");
            return false;
        }

        if (!SharePatchInfo.rewritePatchInfoFileWithLock(patchInfoFile, newInfo, patchInfoLockFile)) {
            TinkerLog.e(TAG, "UpgradePatch tryPatch:new patch recover, rewrite patch info failed");
            manager.getPatchReporter().onPatchInfoCorrupted(patchFile, newInfo.oldVersion, newInfo.newVersion);
            return false;
        }

        TinkerLog.w(TAG, "UpgradePatch tryPatch: done, it is ok");
        return true;
    }
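The resource patch rule described earlier (MD5 comparison, with BsDiff reserved for files over 100 KB) can be sketched as a small decision function. This is not Tinker code: the names `ResourcePatchStrategySketch`, `Strategy`, and `choose` are hypothetical, and applying the threshold to the size of the new file is an assumption made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class ResourcePatchStrategySketch {
    enum Strategy { UNCHANGED, WHOLE_FILE, BSDIFF }

    // Default large-file threshold mentioned in the text: 100 KB.
    static final int LARGE_FILE_THRESHOLD = 100 * 1024;

    static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the digests every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    // Decide how a changed resource would be shipped in the patch.
    static Strategy choose(byte[] oldFile, byte[] newFile) {
        if (Arrays.equals(md5(oldFile), md5(newFile))) {
            return Strategy.UNCHANGED;   // identical content: nothing to ship
        }
        if (newFile.length > LARGE_FILE_THRESHOLD) {
            return Strategy.BSDIFF;      // large file: ship a binary diff
        }
        return Strategy.WHOLE_FILE;      // small file: ship the new file as-is
    }

    public static void main(String[] args) {
        byte[] oldIcon = "old icon bytes".getBytes(StandardCharsets.UTF_8);
        byte[] newIcon = "new icon bytes".getBytes(StandardCharsets.UTF_8);
        byte[] bigFile = new byte[LARGE_FILE_THRESHOLD + 1];
        System.out.println(choose(oldIcon, oldIcon.clone()));
        System.out.println(choose(oldIcon, newIcon));
        System.out.println(choose(oldIcon, bigFile));
    }
}
```

The trade-off behind such a rule is that a binary diff costs CPU time and memory to apply on the device, so it is only worth it when the file is large enough for the size saving to matter.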

Advantages

  • High compatibility

  • Small patches

  • Transparent to development; the code is non-invasive

Disadvantages

  • Cold-start fix: the patch only takes effect on the next startup

  • Dex merging consumes VM heap memory and can easily trigger an OOM, which causes the merge to fail

Resource repair principle

Instant Run

1. Build a new AssetManager and add the complete new resource bundle to it by calling addAssetPath via reflection. This produces an AssetManager that contains all the new resources.

2. Find every place that holds a reference to the original AssetManager and, via reflection, replace the reference with the new AssetManager.

    public static void monkeyPatchExistingResources(Context context, String externalResourceFile,
                                                    Collection<Activity> activities) {
        if (externalResourceFile == null) {
            return;
        }
        try {
            // Create a new AssetManager via reflection
            AssetManager newAssetManager = (AssetManager) AssetManager.class
                    .getConstructor(new Class[0]).newInstance(new Object[0]);
            // Call addAssetPath via reflection to add the new resource bundle
            Method mAddAssetPath = AssetManager.class.getDeclaredMethod("addAssetPath", new Class[]{String.class});
            mAddAssetPath.setAccessible(true);
            if (((Integer) mAddAssetPath.invoke(newAssetManager, new Object[]{externalResourceFile})).intValue() == 0) {
                throw new IllegalStateException("Could not create new AssetManager");
            }
            Method mEnsureStringBlocks = AssetManager.class.getDeclaredMethod("ensureStringBlocks", new Class[0]);
            mEnsureStringBlocks.setAccessible(true);
            mEnsureStringBlocks.invoke(newAssetManager, new Object[0]);

            // Replace the AssetManager referenced by each Activity's Resources and Theme
            if (activities != null) {
                for (Activity activity : activities) {
                    Resources resources = activity.getResources();
                    try {
                        Field mAssets = Resources.class.getDeclaredField("mAssets");
                        mAssets.setAccessible(true);
                        mAssets.set(resources, newAssetManager);
                    } catch (Throwable ignore) {
                        Field mResourcesImpl = Resources.class.getDeclaredField("mResourcesImpl");
                        mResourcesImpl.setAccessible(true);
                        Object resourceImpl = mResourcesImpl.get(resources);
                        Field implAssets = resourceImpl.getClass().getDeclaredField("mAssets");
                        implAssets.setAccessible(true);
                        implAssets.set(resourceImpl, newAssetManager);
                    }
                    Resources.Theme theme = activity.getTheme();
                    try {
                        try {
                            Field ma = Resources.Theme.class.getDeclaredField("mAssets");
                            ma.setAccessible(true);
                            ma.set(theme, newAssetManager);
                        } catch (NoSuchFieldException ignore) {
                            Field themeField = Resources.Theme.class.getDeclaredField("mThemeImpl");
                            themeField.setAccessible(true);
                            Object impl = themeField.get(theme);
                            Field ma = impl.getClass().getDeclaredField("mAssets");
                            ma.setAccessible(true);
                            ma.set(impl, newAssetManager);
                        }
                        Field mt = ContextThemeWrapper.class.getDeclaredField("mTheme");
                        mt.setAccessible(true);
                        mt.set(activity, null);
                        Method mtm = ContextThemeWrapper.class.getDeclaredMethod("initializeTheme", new Class[0]);
                        mtm.setAccessible(true);
                        mtm.invoke(activity, new Object[0]);
                        Method mCreateTheme = AssetManager.class.getDeclaredMethod("createTheme", new Class[0]);
                        mCreateTheme.setAccessible(true);
                        Object internalTheme = mCreateTheme.invoke(newAssetManager, new Object[0]);
                        Field mTheme = Resources.Theme.class.getDeclaredField("mTheme");
                        mTheme.setAccessible(true);
                        mTheme.set(theme, internalTheme);
                    } catch (Throwable e) {
                        Log.e("InstantRun", "Failed to update existing theme for activity " + activity, e);
                    }
                    pruneResourceCaches(resources);
                }
            }

            // Collect every Resources object still referenced by the system
            Collection<WeakReference> references;
            if (Build.VERSION.SDK_INT >= 19) {
                Class resourcesManagerClass = Class.forName("android.app.ResourcesManager");
                Method mGetInstance = resourcesManagerClass.getDeclaredMethod("getInstance", new Class[0]);
                mGetInstance.setAccessible(true);
                Object resourcesManager = mGetInstance.invoke(null, new Object[0]);
                try {
                    Field fMActiveResources = resourcesManagerClass.getDeclaredField("mActiveResources");
                    fMActiveResources.setAccessible(true);
                    ArrayMap arrayMap = (ArrayMap) fMActiveResources.get(resourcesManager);
                    references = arrayMap.values();
                } catch (NoSuchFieldException ignore) {
                    Field mResourceReferences = resourcesManagerClass.getDeclaredField("mResourceReferences");
                    mResourceReferences.setAccessible(true);
                    references = (Collection) mResourceReferences.get(resourcesManager);
                }
            } else {
                Class activityThread = Class.forName("android.app.ActivityThread");
                Field fMActiveResources = activityThread.getDeclaredField("mActiveResources");
                fMActiveResources.setAccessible(true);
                Object thread = getActivityThread(context, activityThread);
                HashMap map = (HashMap) fMActiveResources.get(thread);
                references = map.values();
            }

            // Point each of those Resources objects at the new AssetManager
            for (WeakReference wr : references) {
                Resources resources = (Resources) wr.get();
                if (resources != null) {
                    try {
                        Field mAssets = Resources.class.getDeclaredField("mAssets");
                        mAssets.setAccessible(true);
                        mAssets.set(resources, newAssetManager);
                    } catch (Throwable ignore) {
                        Field mResourcesImpl = Resources.class.getDeclaredField("mResourcesImpl");
                        mResourcesImpl.setAccessible(true);
                        Object resourceImpl = mResourcesImpl.get(resources);
                        Field implAssets = resourceImpl.getClass().getDeclaredField("mAssets");
                        implAssets.setAccessible(true);
                        implAssets.set(resourceImpl, newAssetManager);
                    }
                    resources.updateConfiguration(resources.getConfiguration(), resources.getDisplayMetrics());
                }
            }
        } catch (Throwable e) {
            throw new IllegalStateException(e);
        }
    }
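The recurring pattern in the method above is "try the field where older versions keep it, and fall back to the nested impl object if it is not there". The plain-Java sketch below isolates that pattern. The classes in it (`FakeAssetManager`, `OldStyleResources`, `NewStyleResources`, `ResourcesImpl`) are made-up stand-ins rather than Android classes, and `replaceAssets` is a hypothetical helper name.

```java
import java.lang.reflect.Field;

class AssetSwapSketch {
    static class FakeAssetManager {
        final String name;

        FakeAssetManager(String name) {
            this.name = name;
        }
    }

    // Older layout: the holder keeps the asset manager directly in mAssets.
    static class OldStyleResources {
        private FakeAssetManager mAssets = new FakeAssetManager("original");

        public String assetName() {
            return mAssets.name;
        }
    }

    // Newer layout: the field lives inside a nested "impl" object.
    static class ResourcesImpl {
        private FakeAssetManager mAssets = new FakeAssetManager("original");
    }

    static class NewStyleResources {
        private final ResourcesImpl mResourcesImpl = new ResourcesImpl();

        public String assetName() {
            return mResourcesImpl.mAssets.name;
        }
    }

    // Try the direct field first; if it does not exist, go through the impl object.
    static void replaceAssets(Object resources, FakeAssetManager newAssets) {
        try {
            try {
                Field mAssets = resources.getClass().getDeclaredField("mAssets");
                mAssets.setAccessible(true);
                mAssets.set(resources, newAssets);
            } catch (NoSuchFieldException ignore) {
                Field mResourcesImpl = resources.getClass().getDeclaredField("mResourcesImpl");
                mResourcesImpl.setAccessible(true);
                Object impl = mResourcesImpl.get(resources);
                Field implAssets = impl.getClass().getDeclaredField("mAssets");
                implAssets.setAccessible(true);
                implAssets.set(impl, newAssets);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FakeAssetManager patched = new FakeAssetManager("patched");
        OldStyleResources oldStyle = new OldStyleResources();
        NewStyleResources newStyle = new NewStyleResources();
        replaceAssets(oldStyle, patched);
        replaceAssets(newStyle, patched);
        System.out.println(oldStyle.assetName() + ", " + newStyle.assetName());
    }
}
```

This dependence on private field names is also why reflection-based resource repair needs checking against each new platform version.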

SO library repair principle

Interface call replacement

The SDK provides an interface that replaces the System default interface for loading SO libraries:

    // Call this instead of System.loadLibrary(String libName)
    SOPatchManger.loadLibrary(String libName)

SOPatchManger.loadLibrary first tries to load the patched SO library from the patch directory specified by the SDK. If it does not exist there, it loads the SO library from the APK’s directory.

Advantages: No per-version compatibility work is needed, because every SDK version goes through the System.loadLibrary interface.

Disadvantages: The business code must be modified to replace the System default interface for loading SO libraries.
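The "patch directory first, bundled directory second" lookup can be sketched in plain Java as follows. This is an illustrative sketch, not the SDK’s implementation: `SoPatchLoaderSketch` and `resolveLibrary` are hypothetical names, and the two directories are passed in as parameters because the real patch location is not given in the text.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class SoPatchLoaderSketch {
    // Returns the file that should be loaded: the patched copy if one exists,
    // otherwise the copy bundled with the application.
    static Path resolveLibrary(String libName, Path patchDir, Path bundledDir) {
        // Platform-specific file name, for example "libdemo.so" on Linux and Android.
        String fileName = System.mapLibraryName(libName);
        Path patched = patchDir.resolve(fileName);
        if (Files.isRegularFile(patched)) {
            return patched;
        }
        return bundledDir.resolve(fileName);
    }

    public static void main(String[] args) throws IOException {
        Path patchDir = Files.createTempDirectory("patch");
        Path bundledDir = Files.createTempDirectory("bundled");
        String fileName = System.mapLibraryName("demo");

        Files.createFile(bundledDir.resolve(fileName));
        System.out.println(resolveLibrary("demo", patchDir, bundledDir)); // bundled copy

        Files.createFile(patchDir.resolve(fileName));
        System.out.println(resolveLibrary("demo", patchDir, bundledDir)); // patched copy

        // A real loader would now pass the chosen path to System.load(String).
    }
}
```

Because the choice is made inside a wrapper method, every call site has to use the wrapper instead of System.loadLibrary, which is exactly the intrusiveness listed as the disadvantage above.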

Reflection injection

This approach uses reflection injection in the same way as class repair: as long as the path of the patched SO library is inserted at the front of the nativeLibraryDirectories array, the library that gets loaded is the patched one rather than the one in the original SO directory, which achieves the repair.

    public String findLibrary(String libraryName) {
        String fileName = System.mapLibraryName(libraryName);
        for (NativeLibraryElement element : nativeLibraryPathElements) {
            String path = element.findNativeLibrary(fileName);
            if (path != null) {
                return path;
            }
        }
        return null;
    }

Advantages: The business code does not need to be modified; the existing loading calls stay as they are.

Disadvantages: Version compatibility must be handled, and compatibility is poor.

What are the issues to be aware of when using hot repair technology?

Version management

Because adopting hotfix technology changes the release process, branch management has to be adjusted accordingly.

The usual branch management for mobile development is based on feature branches, as follows:

  • master: main branch (merge only, no direct commits; protected by permissions). Used to manage the online versions; tag each release promptly.

  • dev: development branch. The development branch for each new version is created from the main branch and named after the version number; after testing and verification, it is released and merged into the master branch.

  • function X: feature branch, created as required from the development branch; merged into the dev branch when feature development is complete.

After hotfix is adopted, the following branch policy is recommended:

  • master: main branch (merge only, no direct commits; protected by permissions). Used to manage the online versions; tag each release (usually a three-part version number).

  • hot_fix: hotfix branch. After the patch has been tested and pushed, hot_fix is merged back into the master branch, and the master branch is tagged again.

  • dev: development branch. The development branch for each new version is created from the main branch and named after the version number; after testing and verification, it is released and merged into the master branch.

  • function X: feature branch, created as required from the development branch; merged into the dev branch when feature development is complete.

Note: the test and release process of the hotfix branch should be the same as that of a normal version, to ensure quality.

Distribution and monitoring

Current mainstream hotfix solutions such as Tinker and Sophix provide patch distribution and monitoring. This is one of the key factors to consider when choosing a hotfix solution. After all, distribution control and real-time monitoring are essential for ensuring the quality of the online version.

Conclusion

An in-depth understanding of hotfix requires knowledge of the class loading mechanism, Instant Run, multidex, low-level Java implementation details, JNI, AAPT, and the virtual machine, which takes a large store of knowledge. The implementation details of the Android Framework are also very important. Being familiar with the principles of hotfix helps us raise our programming level and improve our ability to solve problems. Finally, hotfix is not just a simple client SDK; it also includes security mechanisms and server-side control logic, and the whole chain cannot be built in a short time.

Therefore, to help readers learn Android hotfix technology more intuitively and quickly, I have collected and organized a set of video and e-book learning materials on hotfix. The video course is taught by Lance, a senior engineer at iQiyi, and uses a hands-on Qzone hotfix project as an example to explain hotfix technology comprehensively. The e-book is Alibaba’s “In-depth Exploration of the Principles of Android Hotfix Technology”, which gives a very in-depth interpretation of hotfix technology.
