Defining the problem

The first problem that every Android plug-in framework has to address is the Activity lifecycle. Activity here stands in for Service and the other components that must be registered in the manifest. What differs between frameworks is the set of premises under which they solve this problem. Our business requirements are demanding, and Android 9.0 restricts non-public APIs, so our premises are as follows:

  1. The plug-in code must also compile, install, and run normally as a standalone app.
  2. The plug-in code is existing business code and must not need modification to work with the plug-in framework (that is, the framework must be non-intrusive to the code).
  3. Only a limited number (about 10) of components can be registered in the host’s AndroidManifest.xml. An oversized host AndroidManifest can make host installation slow and cause cross-process communication to fail.
  4. No private API may be used.

Choice of general direction

In fact, we already use a plug-in framework that is also based on proxy components forwarding calls to plug-in components. However, that framework relies heavily on reflection into private APIs, which is unlikely to keep working on Android 9.0. We also looked at RePlugin, one of the best frameworks available. That leaves roughly two directions. The first is to register a proxy Activity in the host as a shell and actually run it, let it hold the plug-in Activity, and find a way to call the corresponding lifecycle method of the plug-in Activity whenever the shell receives a lifecycle call from the system. The second is to hack the host’s PathClassLoader so that it returns the plug-in’s Activity class when asked to load the class of an Activity registered in the AndroidManifest.

The second direction is RePlugin’s key technique. It takes advantage of a quirk of the JVM. I’m not sure whether it counts as a bug, but the loadClass method of a ClassLoader can return a class whose actual name differs from the name it was asked to load. For example, the host’s AndroidManifest.xml registers an Activity named A, and the plug-in has an Activity named B. No class A exists in the host code or APK; A is just a name registered in the AndroidManifest. To load the plug-in Activity B, you issue an Intent that starts Activity A. On receiving the Intent, the system checks the AndroidManifest information of installed apps to determine which APK registered A, and so finds the host’s PathClassLoader. The system then tries to load class A from that PathClassLoader and use it as an Activity object, which is the normal flow. So if we hack the host’s PathClassLoader and take control of its loading logic, it can return the class of Activity B when asked to load A. Since B really is a subclass of Activity, the system has no problem using it as an Activity. The only giveaway is that anyone who inspects the returned class will find that its name is B, not A.

When we studied RePlugin’s implementation of this technique, we found a problem. RePlugin chooses to copy the PathClassLoader and then replace the PathClassLoader held by the system. Copying a PathClassLoader requires reflection on its private API to extract its data, and replacing the PathClassLoader held by the system also requires reflection on private APIs. We had already faced the same problem when building our “fully dynamic plug-in framework”, where the proxy shell Activity itself is loaded dynamically. Our choice there was to add a parent ClassLoader to the host’s PathClassLoader. Because PathClassLoader is an ordinary ClassLoader that follows the “parent delegation” logic, it asks its parent ClassLoader to load every class first. So the parent ClassLoader we add can also do what RePlugin needs. We use it, however, only because we do not want the shell Activity packaged in the host to take up too much of the host’s method count or to be impossible to update. This may be covered separately later. We recently submitted a PR to RePlugin with this alternative implementation: github.com/Qihoo360/Re… If you are interested, take a look.
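To make the parent-delegation idea concrete, here is a minimal sketch that runs on a desktop JVM rather than on Android. All names in it (RedirectingParentLoader, PluginActivityB, host.PitActivityA) are hypothetical, the child loader only stands in for the host’s PathClassLoader, and it does not show how a parent would be attached to an already existing loader. It only illustrates that a parent loader asked for one name can hand back a class with another name, as long as loadClass is called directly.

```java
import java.util.Collections;
import java.util.Map;

class ParentDelegationDemo {
    /** Builds a child loader on top of the redirecting parent and loads through the child. */
    static Class<?> loadThroughChild(String requestedName) {
        Map<String, Class<?>> redirects =
                Collections.singletonMap("host.PitActivityA", PluginActivityB.class);
        ClassLoader parent = new RedirectingParentLoader(
                ParentDelegationDemo.class.getClassLoader(), redirects);
        // Stand-in for the host's PathClassLoader: a plain loader that delegates to its parent first.
        ClassLoader child = new ClassLoader(parent) {};
        try {
            return child.loadClass(requestedName);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> loaded = loadThroughChild("host.PitActivityA");
        System.out.println("requested host.PitActivityA, got " + loaded.getName());
    }
}

/** Stand-in for a plug-in Activity; a plain class keeps the sketch runnable off-device. */
class PluginActivityB {}

/** Hypothetical parent loader that answers a registered name with a different class. */
class RedirectingParentLoader extends ClassLoader {
    private final Map<String, Class<?>> redirects;

    RedirectingParentLoader(ClassLoader parent, Map<String, Class<?>> redirects) {
        super(parent);
        this.redirects = redirects;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        Class<?> redirected = redirects.get(name);
        if (redirected != null) {
            return redirected; // the returned class keeps its own name, not the requested one
        }
        return super.loadClass(name, resolve); // everything else follows normal delegation
    }
}
```

Names that are not in the redirect map, such as java.lang.String, still resolve through the usual delegation chain.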

Another aspect of the RePlugin solution that is particularly unsuitable for our business is that a “pit” Activity registered in the host AndroidManifest, such as Activity A, cannot be used by multiple plug-in Activities at the same time. I can’t register a single Activity A in the host AndroidManifest and have it serve both Activity B and Activity C. This is because the ClassLoader receives only the class name A when loadClass is called. There is no way to pass it any extra information that would tell it whether this particular loadClass call should return B or C. So this scheme requires registering a large number of Activities in the host, which is unacceptable for our host. The first direction, in which a proxy Activity holds the plug-in Activity and forwards callbacks to it, does not have this limitation. The proxy Activity is started with an Intent, and the Intent can carry many parameters. The proxy Activity can use those parameters to decide whether to construct B or C. This makes the shell in this scheme reusable.
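The reuse of one shell can be sketched in plain Java as follows. This is only an illustration under assumptions: a Map stands in for the Intent extras, the extra key plugin_activity_class and all class names are hypothetical, and a real framework would load the plug-in class through the plug-in’s ClassLoader instead of a fixed factory table.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class ShellReuseDemo {
    // Hypothetical extra key; the article does not name the parameter that is actually used.
    static final String EXTRA_PLUGIN_CLASS = "plugin_activity_class";

    /** Minimal stand-in for the type the shell holds. */
    interface PluginActivity {
        String name();
    }

    static class ActivityB implements PluginActivity {
        public String name() { return "B"; }
    }

    static class ActivityC implements PluginActivity {
        public String name() { return "C"; }
    }

    /** Stand-in for the single shell Activity registered in the host manifest. */
    static class ShellActivity {
        private final Map<String, Supplier<PluginActivity>> factories = new HashMap<>();

        ShellActivity() {
            factories.put("B", ActivityB::new);
            factories.put("C", ActivityC::new);
        }

        /** The extras map plays the role of the Intent that started the shell. */
        PluginActivity createPlugin(Map<String, String> extras) {
            String requested = extras.get(EXTRA_PLUGIN_CLASS);
            Supplier<PluginActivity> factory = factories.get(requested);
            if (factory == null) {
                throw new IllegalArgumentException("unknown plug-in Activity: " + requested);
            }
            return factory.get();
        }
    }

    /** Builds the stand-in extras for a given plug-in Activity name. */
    static Map<String, String> extrasFor(String pluginClass) {
        Map<String, String> extras = new HashMap<>();
        extras.put(EXTRA_PLUGIN_CLASS, pluginClass);
        return extras;
    }

    public static void main(String[] args) {
        ShellActivity shell = new ShellActivity();
        System.out.println(shell.createPlugin(extrasFor("B")).name());
        System.out.println(shell.createPlugin(extrasFor("C")).name());
    }
}
```

The same registered shell serves both B and C because the decision is made from the parameters of each start request, not from the class name alone.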

The other point is that we had already designed the “fully dynamic plug-in framework” on top of the old framework. By developing the new plug-in framework in the first direction, we can update the plug-in framework without changing any host code and without releasing a new host version. This will be discussed in a follow-up article.

So the direction we chose to explore is the first one.

Why did the old framework use reflection and a private API?

Our old framework used a proxy Activity to forward calls to the plug-in Activity, and many plug-in frameworks on the market do the same. They differ only in how they do the forwarding. I won’t go into their implementation details here, but the purpose is always to let the plug-in Activity receive lifecycle callbacks. Our old framework, like some others, had the shell Activity call the plug-in Activity’s lifecycle methods directly through reflection. Doing so creates an additional problem that has to be solved.

After the system instantiates an Activity, it does not call onCreate directly. It calls the attach method first. The attach method is essentially the Activity’s initializer: it injects private fields into the Activity, such as the Window and the ActivityThread. Since the plug-in Activity is created by the framework itself with new, the system never calls its attach method to initialize it. What happens if onCreate is called on an Activity that has not been initialized? The plug-in Activity’s onCreate method must call super.onCreate(). We also want to be non-intrusive to the plug-in code, so there is no “if (not in plug-in mode)” check around this call, and super.onCreate() is executed in the plug-in environment as well. The onCreate method of the Activity base class then uses private fields that should have been initialized but were not. Frameworks of this kind therefore have to solve this problem. Either they call the attach method of the plug-in Activity via reflection, passing in private objects obtained by reflection from the shell Activity, such as its ActivityThread. Or they simply use reflection to enumerate the private fields of the shell Activity and the plug-in Activity and copy the values across so that both hold the same state. That is why the older frameworks need reflection and private APIs.
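The failure mode can be imitated off-device with a small simulation. FakeActivity below is a hypothetical stand-in and not the real android.app.Activity: it has one private field that only attach sets, and its onCreate refuses to run without it. The real base class fails in its own ways; the sketch only shows why calling onCreate on an object created with new, without the system’s attach step, cannot work.

```java
class AttachDemo {
    /** Simplified stand-in for the system Activity base class. */
    static class FakeActivity {
        private Object window; // set only by attach(), like the private state of a real Activity

        void attach(Object window) {
            this.window = window;
        }

        protected void onCreate() {
            // The base class relies on state that attach() was supposed to inject.
            if (window == null) {
                throw new IllegalStateException("onCreate called before attach");
            }
        }
    }

    /** Plug-in code written as a normal Activity subclass: it must call super.onCreate(). */
    static class PluginActivity extends FakeActivity {
        @Override
        protected void onCreate() {
            super.onCreate();
        }
    }

    /** Returns true if onCreate completes, false if the base class rejects the call. */
    static boolean onCreateSucceeds(boolean attachFirst) {
        PluginActivity plugin = new PluginActivity(); // created with new, not by the system
        if (attachFirst) {
            plugin.attach(new Object());
        }
        try {
            plugin.onCreate();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("with attach: " + onCreateSucceeds(true));
        System.out.println("without attach: " + onCreateSucceeds(false));
    }
}
```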

How does Shadow solve the plug-in Activity lifecycle problem?

In fact, the previous analysis shows that the plug-in Activity does not need to execute super.onCreate() at all. The original purpose of this scheme is to register and start a shell Activity in the host. The shell Activity does nothing by itself. We try to make the code of the plug-in Activity’s lifecycle methods become the code of the shell Activity’s lifecycle methods. So the plug-in Activity does not need to be a subclass of the system Activity at all. It is a real subclass of the system Activity only because we need it to install and run normally on its own.

We also know that if we do not require the framework to be non-intrusive, and do not require the plug-in to install and run independently, the plug-in Activity does not have to inherit from the system Activity. It can simply inherit from an ordinary class. This ordinary class defines the same lifecycle methods as the system Activity class, with empty implementations. These lifecycle methods can be made public, so that the shell Activity, which holds the plug-in Activity through this ordinary class type, can call the plug-in Activity’s lifecycle methods directly. This implementation requires neither reflection nor private APIs.

In fact, we do not really need the plug-in APK to be installable and runnable on its own. The real purpose behind that wish is to save effort by not maintaining two sets of code. So it looks like we only need to introduce an AOP tool and use it to change the parent class of the plug-in Activity from the system Activity to the ordinary class we want.

Let’s set aside how this AOP approach is implemented for now, because the problem itself is not yet fully understood. Do we really want A to hold B and simply forward to B whatever call A receives? Is that what we really want? It is not. Simply turning the super.onCreate() call in the plug-in Activity into a no-op is not good enough, because the result differs slightly from what happens when the plug-in Activity is installed and run normally. The code of the plug-in Activity’s onCreate method now becomes part of the shell Activity’s onCreate method. For example:

class ShadowActivity {
    public void onCreate(Bundle savedInstanceState) {
    }
}

class PluginActivity extends ShadowActivity {
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        System.out.println("Hello World!");
    }
}

class ContainerActivity extends Activity {
    ShadowActivity pluginActivity;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        pluginActivity.onCreate(savedInstanceState);
    }
}

The ShadowActivity above is the ordinary class we talked about earlier. Notice that ContainerActivity now behaves as if it had been implemented like this:

class ContainerActivity extends Activity {
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        System.out.println("Hello World!");
    }
}

As long as PluginActivity is loaded dynamically, the implementation of ContainerActivity is dynamic. But what if the original PluginActivity code looks like this?

class PluginActivity extends ShadowActivity {
    @Override
    public void onCreate(Bundle savedInstanceState) {
        savedInstanceState.clear();
        super.onCreate(savedInstanceState);
    }
}

Obviously this code behaves differently when installed normally and when running in the plug-in environment, because in the plug-in environment it effectively becomes:

class ContainerActivity extends Activity {
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        savedInstanceState.clear();
    }
}

This means that we do not want the plug-in Activity’s super.onCreate() call to be skipped. We want that call to tell the shell Activity exactly when to call its own super.onCreate(). On second thought, isn’t this very much like inheritance? If PluginActivity inherited from ContainerActivity and the system called the PluginActivity instance at runtime, then PluginActivity’s super.onCreate() call would directly determine when ContainerActivity’s super.onCreate() runs. So what we really need is a way to turn the inheritance relationship into a holding relationship. Shadow implements it like this:

class ShadowActivity {
    ContainerActivity containerActivity;

    public void onCreate(Bundle savedInstanceState) {
        containerActivity.superOnCreate(savedInstanceState);
    }
}

class PluginActivity extends ShadowActivity {
    @Override
    public void onCreate(Bundle savedInstanceState) {
        savedInstanceState.clear();
        super.onCreate(savedInstanceState);
    }
}

class ContainerActivity extends Activity {
    ShadowActivity pluginActivity;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        pluginActivity.onCreate(savedInstanceState);
    }

    public void superOnCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
    }
}

Take a moment to think through the order of these calls. Does PluginActivity behave the same when it is installed and run normally and when it runs in the plug-in environment?
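One way to check the call order is to run an off-device simulation and record a trace. In the sketch below everything is a stand-in: Activity is a fake base class rather than android.app.Activity, a List plays the role of the Bundle, and the two plug-in classes represent the same source compiled against two different parents. It is a sketch of the structure described above, not Shadow’s actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CallOrderDemo {
    static final List<String> log = new ArrayList<>();

    /** Fake system Activity: records how much saved state it sees in onCreate. */
    static class Activity {
        protected void onCreate(List<String> state) {
            log.add("Activity.onCreate sees " + state.size() + " entries");
        }
    }

    /** The plug-in as it runs when installed normally: parent is the (fake) system Activity. */
    static class InstalledPluginActivity extends Activity {
        @Override
        protected void onCreate(List<String> state) {
            log.add("plugin clears state");
            state.clear();
            super.onCreate(state);
        }
    }

    /** Ordinary class that forwards super calls to the container. */
    static class ShadowActivity {
        ContainerActivity containerActivity;

        public void onCreate(List<String> state) {
            containerActivity.superOnCreate(state);
        }
    }

    /** The same plug-in code, with its parent swapped to ShadowActivity. */
    static class ShadowedPluginActivity extends ShadowActivity {
        @Override
        public void onCreate(List<String> state) {
            log.add("plugin clears state");
            state.clear();
            super.onCreate(state);
        }
    }

    /** Shell registered in the host: the plug-in decides when super.onCreate runs. */
    static class ContainerActivity extends Activity {
        ShadowActivity pluginActivity;

        @Override
        protected void onCreate(List<String> state) {
            pluginActivity.onCreate(state);
        }

        public void superOnCreate(List<String> state) {
            super.onCreate(state);
        }
    }

    static List<String> newState() {
        return new ArrayList<>(Arrays.asList("saved"));
    }

    static List<String> runInstalled() {
        log.clear();
        new InstalledPluginActivity().onCreate(newState());
        return new ArrayList<>(log);
    }

    static List<String> runAsPlugin() {
        log.clear();
        ContainerActivity container = new ContainerActivity();
        ShadowedPluginActivity plugin = new ShadowedPluginActivity();
        plugin.containerActivity = container;
        container.pluginActivity = plugin;
        container.onCreate(newState()); // the system only ever calls the container
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        System.out.println("installed: " + runInstalled());
        System.out.println("as plugin: " + runAsPlugin());
    }
}
```

In both runs the plug-in’s own code executes first and the base class onCreate then sees the cleared state, so the two traces are identical.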

Now we can consider how to implement the AOP step that changes the parent class of PluginActivity from the system Activity to ShadowActivity. Shadow ultimately uses bytecode editing. Bytecode editing can be done with the Transform API as part of Android’s official build process, so it too relies only on a public API. The details of bytecode editing will be covered in a future article. In short, bytecode editing lets us change the parent class of the plug-in Activity from the system Activity to ShadowActivity without modifying the source code. The same source code can then be built with different build options into different APKs: one that installs and runs normally, and one that runs in the plug-in environment.

Bytecode editing is not the only way to do this. I did not start out with bytecode editing, because it involves more tricks in the build setup, which will be shared later. The simplest way to achieve this AOP goal is to take advantage of a basic feature of the Java language. Java classes are not linked at compile time. The bytecode of a Java class records only the names of the classes it depends on, and the ClassLoader is asked to find the actual implementations of those classes at runtime. Since we create the plug-in’s ClassLoader ourselves, we can implement it as a ClassLoader that does not follow the “parent delegation” mechanism. When the plug-in Activity looks up its parent class, the system Activity, we return a fake system Activity class instead, that is, we return ShadowActivity as if it were the system Activity. This implementation is much simpler than bytecode editing. Unfortunately, the virtual machine on Android is not a standard JVM. For built-in classes such as the system Activity, it uses a precompiled native implementation directly. On some phones, Debug mode does not use this feature, and the scheme works fine there. In Release mode the precompiled native implementation is used, and the virtual machine crashes. I think whoever wrote Android’s virtual machine did not expect anyone to return a fake system class, so it does not even throw an exception and simply ends up with a wild pointer. For some details, you can refer to this article: mshockwave.blogspot.com/2016/03/int…

Later development also took advantage of bytecode editing to do more things that cannot be done with ClassLoader techniques. We will share those in the future.

Conclusion

So, what this article describes is essentially the key to how Shadow achieves zero reflection and uses nothing but public API calls. How’s that? Doesn’t it feel like a sheet of window paper that tears as soon as you poke it? The key to solving a problem is not the final code but the original idea. “Any software engineering problem can be solved by adding an intermediate layer.” ShadowActivity is the intermediate layer Shadow added to solve this problem. This layer makes the plug-in invisible to the Android system, and the Android system invisible to the plug-in, so none of the Android system’s restrictions are broken. RePlugin’s scheme, on the other hand, chooses a general direction in which the system is handed an Activity that was never installed, so the framework inevitably has to keep hacking the system to deal with everything that follows from the system seeing an uninstalled Activity. For example, when the system initializes the Context of the plug-in Activity with the host APK, the plug-in framework has to hack again. So the choice of general direction is very important.