Preface

At first, I wanted nothing to do with SDK hot fixes

One day, after fixing a bug in production, I had an idea: let’s make our SDK hot-fixable.

However, there is little information online, and most hot-fix solutions only target apps…

But I had already thumped my chest and boasted to my boss, so how could I back out now?!

Besides, if my hand slips and a serious bug or compatibility issue ships, my career is over!?

For the sake of survival, let’s build a safety net.

I. Background and purpose

The effect we want to achieve is very simple, as shown in scenario 3:

II. Technical solutions

First of all, there is no best solution, only the most suitable one. I ended up choosing Solution 4, but if your team has the resources, has experience with one of the other solutions, or places heavier demands on its SDK, feel free to choose differently.

Solution 1: JAR replacement

Steps

Download the JAR from the server -> Load the JAR through reflection -> Create the associated object and manipulate it.
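
The load-and-invoke part of these steps can be sketched roughly as follows. This is a minimal sketch in plain Java using URLClassLoader, not the referenced article’s implementation; JarPatchLoader and its invoke method are hypothetical names, the download step is assumed to have already produced a local file, and on Android a dex-containing JAR would be loaded with dalvik.system.DexClassLoader instead.

```java
import java.io.File;
import java.lang.reflect.Method;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical helper: loads a class from a downloaded JAR and calls it reflectively. */
final class JarPatchLoader {
    private JarPatchLoader() {}

    /**
     * Creates an instance of className and invokes a public no-argument method on it.
     * Class lookup uses normal parent delegation: the host's class loader is asked
     * first, and the JAR is only searched for classes the host does not have.
     */
    static Object invoke(File downloadedJar, String className, String methodName) {
        try {
            URL[] urls = { downloadedJar.toURI().toURL() };
            // The loader is deliberately not closed: objects created from the JAR
            // may still need it to load further classes lazily.
            ClassLoader loader =
                    new URLClassLoader(urls, JarPatchLoader.class.getClassLoader());
            Class<?> clazz = Class.forName(className, true, loader);
            Object instance = clazz.getDeclaredConstructor().newInstance();
            Method method = clazz.getMethod(methodName);
            return method.invoke(instance);
        } catch (MalformedURLException | ReflectiveOperationException e) {
            throw new IllegalStateException("Could not load " + className, e);
        }
    }
}
```

A call might look like JarPatchLoader.invoke(new File(cacheDir, "sdk-core.jar"), "com.example.sdk.CoreImpl", "start"), where the file and class names are made up for illustration. Because of parent delegation, a class that already exists in the host is never replaced by the JAR’s copy, so the replaceable code has to live only in the downloaded JAR.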

Reference: “Android SDK hot-fix mechanism: analysis and implementation”

Advantages and disadvantages

Advantages:

  1. No compatibility problems

Disadvantages:

  1. Reflection costs performance;
  2. If the JAR is large, downloading the whole thing is unfriendly to users;
  3. Working out the scope of the changed code is tedious, and the setup is cumbersome to maintain.

Improvement on Solution 1: sub-JAR replacement

Steps

When the JAR is large, we can split the SDK project into a main package and several small sub-JARs.

The main package is responsible for loading the sub-JARs through reflection. When a hot fix is needed, only the affected sub-JAR is delivered, which is relatively lightweight.

Advantages and disadvantages

Advantages:

  1. Only sub-JARs are delivered, so the download is lightweight

Disadvantages:

  1. Only suitable when the main package rarely changes;
  2. The main package and the sub-JARs are tightly coupled;
  3. Again, you have to use reflection.

Solution 2: plug-ins

Steps

Split the SDK into two packages: the host package only exposes the API and loads the plug-in package, while the plug-in package holds the core implementation and can itself be hot-updated.

Advantages and disadvantages

Advantages:

  1. Flexible

Disadvantages:

  1. It depends too heavily on the main project; some basic configuration often has to rely on the main project’s source code;
  2. The integration cost is high and the configuration is troublesome, whereas the business teams integrating the SDK need to get up and running quickly;
  3. Plug-in frameworks can have unpredictable effects on how the system’s native code runs;
  4. You have to pull in a lot of plug-in framework functionality that you don’t need.

Solution 3: hot update on the business side

In desperation I thought: wait! Don’t many app hot-update solutions claim to support hot-updating libs? Let’s use that as the safety net.

Steps

Use the business app’s own hot-update mechanism to update the lib package.

Advantages and disadvantages

Advantages:

  1. Control over hot updates stays in the hands of the business side, and the whole process is transparent to it

Disadvantages:

  1. When the lib package is large, downloading it still consumes a lot of traffic;
  2. The diff algorithm cannot compute the difference between the old lib and the new lib, so the whole lib has to be replaced;
  3. The steps are quite tedious, as shown below:

Solution 4: adapt an existing app hot-fix solution

1. What should you consider when choosing a hot-fix solution?

  1. What the project needs from hot updates
  • Only simple method-level bug fixes?
  • Do resources and .so libraries need fixing?
  • Are native fixes needed?
  • What are the platform compatibility and success-rate requirements?
  • Do patch packages need to be managed?
  • Can the company afford a paid commercial service?
  2. Cost of learning and use
  • Integration difficulty and complexity
  • Code intrusiveness
  • Debugging and maintenance
  3. What to look for in a framework
  • Backed by a large company where possible
  • Acceptable performance
  • Maintained by dedicated staff
  • Popular, with an active open-source community

2. Summarize what the SDK needs from hot updates

  1. Mainly code hot updates; there is no need to update .so libraries or resources;
  2. Real-time requirements are high, because once a problem occurs the impact on the business side is large;
  3. Compatibility requirements are high, because you never know what devices the business side’s active users are on.

3. Let’s take a quick look at the existing app hot-fix solutions.

3.1 The all-round optimized product: Sophix (ruled out)

Sophix is feature-complete, simple, and transparent to develop with, but unfortunately it is not open source and cannot be modified.

3.2 Underlying replacement solutions (ruled out)

Underlying replacement solutions inevitably run into compatibility problems, so they are ruled out.

3.3 Class-loading solution: Tinker

Advantages:

  1. Large user base
  2. By comparison, every other open-source framework on GitHub has fewer than 7,000 stars and was last updated one or even two years ago.

Disadvantages:

  1. Dex merging takes up a lot of ROM;
  2. Patches do not take effect in real time;
  3. The business side notices it, because the Application class has to be reworked. (You could follow InstantRun’s approach and replace the Application without modifying it, but that approach hooks a large number of system APIs and is not stable enough: the replacement fails with a probability of roughly 1 in 10,000. That is why Tinker did not adopt InstantRun’s approach in the end.)

I’ll leave you with two more questions to ponder:

  1. Will it be affected when the business side hardens its app?
  2. Will it conflict with anything on the business side?

  • Reference: “A Tinker-based global hot-update solution for SDKs (the only one on the web)”

  • Extension: how InstantRun dynamically replaces the Application, summarized in two steps:

  1. At packaging time, replace the Application tag and insert BootstrapApplication;
  2. At runtime, hook the system API and swap BootstrapApplication back to MyApplication.

3.4 Instrumentation solution: Meituan Robust

The principle of Robust can be summarized as follows:

  1. At compile time, insert an if (changeQuickRedirect == null) ... else ... branch at the start of each method;
  2. When a patch is loaded, read the classes to be replaced and the concrete method implementations from the patch package, and create a new ClassLoader to load the patch dex. When the target method is then executed, changeQuickRedirect != null, so the control flow changes and the patched logic runs in place of the old logic, which is how the fix takes effect.
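
The redirect idea described above can be illustrated with a hand-written sketch. It shows what an instrumented method might look like after compilation; it is not Robust’s generated code. Only the field name changeQuickRedirect comes from the description above, while PatchRedirect, PriceCalculator, and the lambda standing in for a patch loaded from a dex are assumptions made for illustration.

```java
/** Simplified stand-in for the redirect hook; not Robust's real interface. */
interface PatchRedirect {
    Object dispatch(String methodName, Object[] args);
}

class PriceCalculator {
    // Added by instrumentation in the real scheme; stays null until a patch is loaded.
    static volatile PatchRedirect changeQuickRedirect;

    int total(int unitPrice, int quantity) {
        PatchRedirect redirect = changeQuickRedirect;
        if (redirect != null) {
            // Patched path: hand the call over to the patch implementation.
            return (Integer) redirect.dispatch("total", new Object[] {unitPrice, quantity});
        }
        // Original path, containing the shipped bug (adds instead of multiplying).
        return unitPrice + quantity;
    }
}

class RobustSketchDemo {
    public static void main(String[] args) {
        PriceCalculator calculator = new PriceCalculator();
        System.out.println(calculator.total(3, 4)); // prints 7 (buggy original logic)

        // "Loading the patch": in the real scheme this object would come from the patch dex.
        PriceCalculator.changeQuickRedirect = (name, a) -> (Integer) a[0] * (Integer) a[1];
        System.out.println(calculator.total(3, 4)); // prints 12 (patched logic)
    }
}
```

Because the check runs on every call, the fix takes effect as soon as the field is set, without a restart. The cost is the extra branch and field in every instrumented class.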

Advantages:

  1. Best compatibility, including with hardened apps
  2. Takes effect immediately
  3. Fine-grained: supports method-level fixes
  4. High stability, with a fix success rate of up to 99.9%

Disadvantages:

  1. The plug-in instruments production code at compile time, which has some side effects on performance, method count, and package size (specific classes can be excluded from instrumentation);
  2. Replacing .so libraries and resources is not supported yet;
  3. New variables cannot be added;
  4. There is no patch management or security verification; developers have to implement these themselves.

Something to think about:

  1. Does it conflict with other instrumentation plug-ins?

III. Implementation

Just as I was happily wiring in Robust, a problem came up!

Robust can only instrument and patch an application module, so it still needs a round of adaptation before it can be used in an SDK.

How to adapt it? I’ll explain that in my next blog post, and I’ll also release a packaged library that lets SDK developers make their SDKs hot-fixable in just 5 minutes. Stay tuned.

IV. What else should we care about besides the hot-update technique itself?

Of course, our focus is not limited to the technical implementation; there is a lot more to consider:

How do we control distribution? How do we collect monitoring statistics? If a patch causes crashes, how do we fix that as quickly as possible?

1. Precise distribution

Combined with an external targeting system, patches are delivered differently according to user dimensions such as channel and OS version.
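
As a rough illustration of dimension-based delivery, a rule could be matched against each requesting user before a patch is returned. This is a minimal sketch, not a description of any particular distribution system; PatchRule, its fields, and the channel names are all hypothetical.

```java
import java.util.Set;

/** Hypothetical targeting rule for one patch. */
final class PatchRule {
    private final Set<String> channels; // an empty set means "every channel"
    private final int minSdk;           // lowest OS API level that should get the patch
    private final int maxSdk;           // highest OS API level that should get the patch

    PatchRule(Set<String> channels, int minSdk, int maxSdk) {
        this.channels = channels;
        this.minSdk = minSdk;
        this.maxSdk = maxSdk;
    }

    /** Returns true when a user on this channel and OS API level should receive the patch. */
    boolean matches(String channel, int sdkInt) {
        boolean channelOk = channels.isEmpty() || channels.contains(channel);
        boolean versionOk = sdkInt >= minSdk && sdkInt <= maxSdk;
        return channelOk && versionOk;
    }
}
```

For example, new PatchRule(Collections.singleton("store-a"), 21, 28) would deliver the patch only to users from the made-up channel "store-a" on API levels 21 through 28. Evaluating rules on the server keeps the client simple and lets a rollout be widened or stopped without shipping anything new.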

2. Data analysis

After launch, our main concerns are patch compatibility and success rates.

  1. Patch pull success rate = users whose patch request succeeds / users who initiate a patch request
  2. Patch download success rate = users who download the patch successfully / users who attempt to download the patch
  3. Patch application success rate = users whose patch is applied successfully / users who download the patch successfully
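
Each of these rates is a ratio of users who completed a step to users who attempted it, so they can all share one small helper. The sketch below assumes the counts come from your own reporting pipeline; PatchFunnel and the numbers in main are made up for illustration.

```java
/** Hypothetical helper for the patch funnel metrics. */
final class PatchFunnel {
    private PatchFunnel() {}

    /** Returns succeeded / attempted, or 0.0 when nobody attempted the step. */
    static double rate(long succeeded, long attempted) {
        return attempted == 0 ? 0.0 : (double) succeeded / attempted;
    }

    public static void main(String[] args) {
        // Made-up counts: each stage can only be attempted by users who passed the previous one.
        long requested = 1000, pulled = 980, downloaded = 950, applied = 931;
        System.out.printf("pull=%.3f download=%.3f apply=%.3f%n",
                rate(pulled, requested), rate(downloaded, pulled), rate(applied, downloaded));
    }
}
```

Keeping the three stages separate shows where users drop out: a low pull rate points at the request path, a low download rate at the network or package size, and a low application rate at compatibility.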

3. Patch rollback mechanism

We need to monitor crashes automatically. If a crash is caused by a delivered patch, the patch is automatically disabled at the next startup so that the impact does not spread.
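
One simple way to approximate this is to count crashes that happen while a patch is active and stop applying the patch once a threshold is reached. The sketch below is only an illustration under several assumptions: PatchGuard and the threshold of 2 are hypothetical, attributing a crash to the patch is reduced to "the patch was active when the crash happened", and the in-memory map stands in for storage that would have to survive a process restart.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical crash guard that disables a patch after repeated crashes. */
final class PatchGuard {
    private static final int MAX_CRASHES = 2; // assumed threshold

    // Stand-in for persistent storage; a real guard must keep this across restarts.
    private final Map<String, Integer> crashCounts = new HashMap<>();

    /** Call from a crash handler while the given patch is active. */
    void recordCrash(String patchId) {
        crashCounts.merge(patchId, 1, Integer::sum);
    }

    /** Call at startup to decide whether the patch should be loaded at all. */
    boolean shouldApply(String patchId) {
        return crashCounts.getOrDefault(patchId, 0) < MAX_CRASHES;
    }

    /** Call after the app has run with the patch for a while without crashing. */
    void markHealthy(String patchId) {
        crashCounts.remove(patchId);
    }
}
```

recordCrash could be wired into a handler installed with Thread.setDefaultUncaughtExceptionHandler, and shouldApply checked before the patch is loaded at startup.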

I’ll put these ideas into practice in my next blog post, so stay tuned!

This task took 6 minutes (180 minutes)


I am FeelsChaotic, a female programmer who can write code, retouch photos, edit videos, and draw. I am committed to the pursuit of elegant code, good architecture design, and T-shaped growth.

You’re welcome to follow FeelsChaotic on Jianshu and Juejin, and if my post helped you even a little, please give it a ❤️! Your encouragement is my biggest motivation to keep writing!

Most importantly, please share your suggestions or comments, and please point out any mistakes!