Why code obfuscation?

When an application is finished and ready for release, we usually optimize it to give users a better experience: package compression, code optimization, and security hardening. The simplest of these is to enable code shrinking. By adding a few lines to the project's build configuration, the release build automatically performs code shrinking, bytecode optimization, and code obfuscation, giving users a smaller and more secure application. In a test for this article, a newly created Android project produced a 2.7 MB package without shrinking and obfuscation, and only 1.4 MB with them, a reduction of nearly 50%. To deliver smaller and safer applications, it is therefore worth understanding code obfuscation.

The history of code obfuscation

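Prior to 2016, Android code was compiled with the Sun/Oracle compiler (javac): .java sources were compiled by javac to .class files, optionally processed by ProGuard, and then converted to .dex by the dx tool.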

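In 2016, alongside Android N, Google released its own compiler toolchain, Jack & Jill. Jill converted existing .jar libraries into Jack's intermediate library format, and Jack compiled the Java sources together with those libraries directly to .dex.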

However, Jack & Jill's overly simplified compilation process made it hard to plug in further optimizations. In 2017, Google abandoned Jack & Jill and returned to the previous compilation process, replacing only the dex compiler with a rewritten and optimized one called D8, which works alongside ProGuard to optimize the code.

In January 2019, Google released R8 with AGP 3.3. R8 combines desugaring, shrinking, obfuscation, optimization, and dexing into a single step. At that time, however, R8 still had many problems that could cause unpredictable failures in applications. In April 2019, with the release of AGP 3.4, R8 became the default code optimization tool. The compilation process is now: .java → javac → .class → R8 → .dex.

R8 is a considerable improvement over the ProGuard-plus-D8 pipeline that preceded it.

Code obfuscation principle

Android provides the R8 tool, which optimizes code for Android projects only. R8 treats the four component types registered in AndroidManifest.xml (Activity, Service, BroadcastReceiver, and ContentProvider) as entry points. Starting from them, it analyzes which code is reachable, determines what is unused, and removes it. Unused components should therefore be deleted from AndroidManifest.xml promptly; otherwise they remain entry points, and the code and resources they reference cannot be removed.

How to use code obfuscation?

1. Configure and use

To use the package optimization tools provided by Google, simply add the following configuration to your project's build.gradle file:

    // Goes inside the android { } block of the module's build.gradle
    signingConfigs {
        debug {
            keyAlias 'alias'
            keyPassword 'keyPassword'
            storeFile file("${rootDir}/debug.keystore")
            storePassword 'storePassword'
        }
        release {
            keyAlias 'alias'
            keyPassword 'keyPassword'
            storeFile file("${rootDir}/release.keystore")
            storePassword 'storePassword'
        }
    }

    buildTypes {
        debug {
            // Enable code shrinking, optimization, and obfuscation
            minifyEnabled false
            // Enable resource shrinking (only takes effect when minifyEnabled is true)
            shrinkResources false
            // Specify the obfuscation keep-rule files
            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'
            // Signing configuration for the package
            signingConfig signingConfigs.debug
        }
        release {
            // Enable code shrinking, optimization, and obfuscation
            minifyEnabled true
            // Enable resource shrinking (only takes effect when minifyEnabled is true)
            shrinkResources true
            // Specify the obfuscation keep-rule files
            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'
            // Signing configuration for the package
            signingConfig signingConfigs.release
        }
    }

After adding this configuration, build a release package and see how it changes. A newly created demo shrank by nearly half, from 2.7 MB to 1.4 MB; a project that has been maintained for a long time will not shrink by such a large proportion. Then run the obfuscated APK. A large project may crash within a few clicks of startup, reporting that a class or method cannot be found. The reason is that ProGuard (or R8) removes or renames classes and methods during optimization based on static analysis, and that analysis cannot see code that looks up classes and methods by string name at run time. On Android, the following need particular attention:

* Native methods
* Classes accessed through Java reflection
* Custom controls
* Enumeration classes
* JavaBeans
* Parcelable and Serializable classes
* Classes and methods used in WebView and JS interaction

All of these reach classes or methods through string matching. Classes and methods that must not be optimized or obfuscated need keep rules in the obfuscation configuration files (proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'). Let's look at how to write these files so that the application runs correctly while the package stays as small as possible.
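Before going through the rule syntax in detail, here is a rough sketch of keep rules for the cases listed above. They follow widely used ProGuard conventions, and several of them are already covered by the default proguard-android-optimize.txt, so treat this as an illustration rather than a required configuration. The package com.rush.model is a hypothetical location for reflection-accessed classes and JavaBeans.

```proguard
# Native methods: JNI looks them up by name, so do not rename them.
-keepclasseswithmembernames class * {
    native <methods>;
}

# Custom controls: constructors and setters are invoked from layout XML.
-keep public class * extends android.view.View {
    public <init>(android.content.Context);
    public <init>(android.content.Context, android.util.AttributeSet);
    public <init>(android.content.Context, android.util.AttributeSet, int);
    public void set*(...);
}

# Enumerations: values() and valueOf() are called reflectively.
-keepclassmembers enum * {
    public static **[] values();
    public static ** valueOf(java.lang.String);
}

# Parcelable: the framework reads the static CREATOR field by name.
-keepclassmembers class * implements android.os.Parcelable {
    public static final ** CREATOR;
}

# Serializable: keep the members the serialization mechanism looks for.
-keepclassmembers class * implements java.io.Serializable {
    static final long serialVersionUID;
    private void writeObject(java.io.ObjectOutputStream);
    private void readObject(java.io.ObjectInputStream);
    java.lang.Object writeReplace();
    java.lang.Object readResolve();
}

# WebView and JS interaction: keep methods exposed to JavaScript.
-keepclassmembers class * {
    @android.webkit.JavascriptInterface <methods>;
}

# Reflection targets and JavaBeans (hypothetical package, adjust to your project).
-keep class com.rush.model.** { *; }
```

The last rule is deliberately broad; narrowing it to the classes that are actually accessed by name keeps more of the size savings.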

2. Obfuscation rules

Obfuscation rule files are cumulative: several files can be listed after proguardFiles (currently two, proguard-android-optimize.txt and proguard-rules.pro), and ProGuard (or R8) merges all of their rules when building the package. As a result, rules contributed by a component also affect the obfuscation of the main application.

Let’s take a look at the obfuscation configuration provided by Proguard. The specific configuration parameters and their meanings are as follows:

    # Number of optimization passes (ProGuard performs a single pass by default; R8 ignores this option)
    -optimizationpasses 5
    # Log details during obfuscation
    -verbose
    # Turn off optimization
    -dontoptimize
    # Turn off shrinking
    -dontshrink
    # Turn off preverification (meant for the Java platform, not required on Android)
    -dontpreverify
    # Turn off obfuscation
    -dontobfuscate
    # Ignore warnings
    -ignorewarnings
    # Do not print warnings for the specified classes
    -dontwarn com.squareup.okhttp.**
    # Do not generate mixed-case class names
    -dontusemixedcaseclassnames
    # Do not skip non-public library classes
    -dontskipnonpubliclibraryclasses
    # Write the mapping between original and obfuscated names to mapping.txt
    -printmapping mapping.txt
    # Give class members with the same original name the same obfuscated name
    -useuniqueclassmembernames
    # Allow access modifiers of classes and class members to be changed during optimization
    -allowaccessmodification
    # Replace the source file name recorded in each class with the string SourceFile
    -renamesourcefileattribute SourceFile
    # Keep line numbers so that stack traces can be mapped back to the source
    -keepattributes SourceFile,LineNumberTable
    # Keep annotations, inner classes, generic signatures, and anonymous-class information
    -keepattributes *Annotation*,InnerClasses,Signature,EnclosingMethod
    # Select the optimizations to apply (a leading ! disables the named optimization)
    -optimizations !code/simplification/cast,!field/*,!class/merging/*

The options above let us choose which ProGuard optimizations to apply. The next step is to configure which classes must be left unmodified so that the application still runs. ProGuard provides several keep options for specifying the classes, interfaces, methods, and fields that must stay as they are:

Scope                                                         Prevents removal and renaming   Prevents renaming only (unused ones are still removed)
Class and class members                                       -keep                           -keepnames
Class members only                                            -keepclassmembers               -keepclassmembernames
Class and members, if the class contains the listed members   -keepclasseswithmembers         -keepclasseswithmembernames
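To make the difference between the variants concrete, here is a small sketch. The class com.rush.model.User is hypothetical and used only for illustration; the main-method rule is a common ProGuard idiom.

```proguard
# Keep the class and its getters: neither removed nor renamed.
-keep class com.rush.model.User {
    public *** get*();
}

# Keep the setters, but only if the class itself survives shrinking.
-keepclassmembers class com.rush.model.User {
    public void set*(***);
}

# Do not rename the class, but still allow it to be removed if it is unused.
-keepnames class com.rush.model.User

# Keep every class that declares a main method, together with that method.
-keepclasseswithmembers class * {
    public static void main(java.lang.String[]);
}
```

In practice only one of the first three rules would be applied to a given class; they are listed together here to compare their effects.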

To make it easier to cover many classes at once, ProGuard provides the following wildcards.

Class name wildcards are as follows:

Wildcard   Meaning
?          Matches any single character except the package separator (.)
*          Matches any sequence of characters except the package separator (.)
**         Matches any sequence of characters, including the package separator (.). For example, com.rush.** matches all classes in the com.rush package and all its subpackages.
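A short sketch of how the three class name wildcards differ, using the com.rush package from this article; the class names mentioned in the comments are hypothetical.

```proguard
# ? matches exactly one character: com.rush.Test1 and com.rush.TestA match,
# com.rush.Test and com.rush.Test10 do not.
-keep class com.rush.Test?

# * stays within one package: com.rush.Foo matches, com.rush.ui.Bar does not.
-keep class com.rush.*

# ** crosses package boundaries: com.rush.Foo and com.rush.ui.Bar both match.
-keep class com.rush.**
```

Note that a rule without a { } body applies to the class itself; to keep its fields and methods as well, add a member specification such as { *; }.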

Field and method wildcards are as follows:

Wildcard    Meaning
<init>      Matches all constructors
<fields>    Matches all fields
<methods>   Matches all methods
?           Matches any single character in a field or method name
*           Matches any part of a field or method name

Type wildcards are as follows:

Wildcard   Meaning
%          Matches any primitive type, such as int, boolean, and so on
?          Matches any single character in a class name
*          Matches any part of a class name, excluding the package separator (.)
**         Matches any part of a class name, including the package separator (.)
***        Matches any type (primitive, non-primitive, array, or non-array type)
...        Matches any number of parameters of any type
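The type wildcards are easiest to read inside member specifications. The following is a minimal sketch; the package com.rush.model is hypothetical and used only for illustration.

```proguard
# %   : public fields of any primitive type (int, boolean, long, ...).
# *** : getters returning any type, including primitives and arrays.
# **  : setters taking a single argument of any class type
#       (** never matches primitive or array types).
# ... : public constructors with any number of arguments of any type.
-keepclassmembers class com.rush.model.** {
    public % *;
    public *** get*();
    public void set*(**);
    public <init>(...);
}
```

Choosing ** instead of *** in the setter line is what excludes setters with primitive or array parameters; use *** when those should be kept too.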

In addition to the wildcards above, ProGuard also supports the implements and extends keywords, which restrict a rule to classes that implement or extend a given class or interface. The class specification syntax accepted by ProGuard (or R8) is as follows:

    [@annotationtype] [[!]public|final|abstract|@ ...] [!]interface|class|enum classname
        [extends|implements [@annotationtype] classname]
    [{
        [@annotationtype]
        [[!]public|private|protected|static|volatile|transient ...]
        <fields> | (fieldtype fieldname);

        [@annotationtype]
        [[!]public|private|protected|static|synchronized|native|abstract|strictfp ...]
        <methods> | <init>(argumenttype,...) | classname(argumenttype,...) | (returntype methodname(argumenttype,...));

        [@annotationtype] [[!]public|private|protected|static ... ] *;
        ...
    }]

That is the general form of a keep rule. Let's look at some examples to illustrate it.

Example:

    # Keep com.rush.Test (can be an interface or a class)
    -keep class com.rush.Test

    # Keep the interface com.rush.InterfaceTest
    -keep interface com.rush.InterfaceTest

    # Keep constructors, public fields, public methods, getters, and setters
    # of all classes in com.rush and its subpackages
    -keep class com.rush.** {
        <init>(...);
        public <fields>;
        public <methods>;
        public *** get*();
        void set*(***);
    }

    # Keep subclasses of Activity from being obfuscated
    -keep public class * extends android.app.Activity

With this knowledge, we can take a project and put it into practice.

The Android Gradle plugin provides some obfuscation configurations by default. After building a release package, you can see the configuration files that were used in the build/intermediates/proguard-files/ directory.

At the same time, the obfuscation logs and the mapping file are generated in the build/outputs/mapping/release/ directory:

* mapping.txt → the mapping between original and obfuscated class, method, and field names; after a crash, the stack trace can be restored with this file
* seeds.txt → the classes and members that were not obfuscated
* usage.txt → the code removed from the APK
* resources.txt → the resource optimization record: which resources reference other resources, which resources are in use, and which resources were removed

Not all four of these files are generated by default. You can request them with the following options:

    # Output the mapping.txt file
    -printmapping ./build/outputs/mapping/release/mapping.txt
    # Output the seeds.txt file
    -printseeds ./build/outputs/mapping/release/seeds.txt
    # Output the usage.txt file
    -printusage ./build/outputs/mapping/release/usage.txt

Conclusion

This research started from slimming down an Android package and led to code obfuscation. The deeper it went, the clearer it became how shallow our understanding was: what we use day to day is only the tip of the iceberg, and what we knew of the underlying implementation and related concepts was fragmentary. This study gives a general overview, but many of the details behind it remain to be explored.

Author: Xue Jianqiang, Research and development Center of Freely Big front end