ProGuard is a free tool for shrinking, optimizing, and obfuscating Java bytecode. It is one of the most important protections for applications on the Android platform.

ProGuard function

The main functions and execution process of ProGuard are shown in the following figure:

  1. Input a set of Java bytecode files in JAR or AAR format, together with the custom configuration
  2. Shrink removes unused classes, methods, and fields
  3. Optimize further optimizes the bytecode, mainly by simplifying its logic
  4. Obfuscate renames the classes, methods, and fields in the optimized bytecode to short, meaningless names, which reduces package size and hinders reverse engineering
  5. Preverify pre-verifies the classes and adds the verification information to the class files

The working mechanism of each step is described below.

  • Shrink Shrinking reduces bytecode size, mainly by removing unused class, method, and field definitions from the set of Java bytecode. It resembles the reachability analysis of Java's GC: it starts from the roots and follows the call chain to determine which classes, methods, and fields are never used, and then deletes them. So how exactly are these roots defined? Take a look at this configuration snippet from Android's ProGuard file:
-keepattributes *Annotation*
-keepclasseswithmembernames class * {
    native <methods>;
}
-keepclassmembers public class * extends android.view.View {
   void set*(***);
   *** get*();
}

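All classes, methods, and fields matched by a -keep rule are roots and are treated as entry points. Note that the *names variants, such as -keepclasseswithmembernames, only prevent renaming and do not by themselves stop shrinking.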
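
The root-based search described above can be sketched as a plain graph traversal. This is only a conceptual illustration, not ProGuard's implementation: the class ShrinkSketch, the reachable method, and the member names in the graph are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Conceptual sketch of the shrink step; not ProGuard's actual implementation.
class ShrinkSketch {

    // Returns every member reachable from the roots by following references.
    static Set<String> reachable(Map<String, List<String>> references, Set<String> roots) {
        Set<String> used = new HashSet<>(roots);
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            for (String target : references.getOrDefault(current, List.of())) {
                // add() returns true only the first time a member is seen.
                if (used.add(target)) {
                    pending.push(target);
                }
            }
        }
        return used;
    }

    public static void main(String[] args) {
        // Hypothetical reference graph: member -> members it uses.
        Map<String, List<String>> references = Map.of(
                "MainActivity.onCreate", List.of("Helper.load"),
                "Helper.load", List.of("Parser.parse"),
                "Legacy.unused", List.of("Parser.parse"));
        // Assume a keep rule marks this member as a root.
        Set<String> roots = Set.of("MainActivity.onCreate");
        Set<String> used = reachable(references, roots);
        System.out.println(used.contains("Parser.parse"));  // true
        System.out.println(used.contains("Legacy.unused")); // false, so it would be removed
    }
}
```

Anything outside the returned set is never reached from a root, which is why it can be deleted safely.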

  • Optimize Optimizing rewrites the code logic. For example, a branch whose condition is always true can be simplified, and when a class has only one method that is called only once, the method body can be moved to the call site and the original class and method deleted. This step also needs the call chain, which is used to verify that such rewrites are safe.
  • Obfuscate Renaming classes, methods, and fields to short, meaningless names is a common defense. This step further reduces bytecode size without changing the program logic.
  • Preverify Preverification verifies the classes in advance and adds the verification information to the class files. This step is disabled by default on Android.
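
To make the Optimize step more concrete, here is a source-level before/after illustration. ProGuard works on bytecode, so this is only an analogy, and the class OptimizeSketch and its methods are hypothetical names chosen for the example.

```java
// Source-level illustration of what an optimizer may do; ProGuard itself works on bytecode.
class OptimizeSketch {

    static final boolean ENABLED = true;

    // Before: a condition that is always true, and a helper called from exactly one place.
    static int areaBefore(int width, int height) {
        if (ENABLED) {
            return multiply(width, height);
        }
        return 0;
    }

    private static int multiply(int a, int b) {
        return a * b;
    }

    // After: the constant branch is simplified and the helper is inlined at its only call site.
    static int areaAfter(int width, int height) {
        return width * height;
    }

    public static void main(String[] args) {
        System.out.println(areaBefore(3, 4)); // 12
        System.out.println(areaAfter(3, 4));  // 12
    }
}
```

Both versions return the same result; the second simply has less code to ship.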

Obfuscation analysis

At present, ProGuard's obfuscation is used on Android mainly to reduce bytecode size and to protect the bytecode. A sample analyzed earlier, opened in JADX, looks as follows:

ProGuard obfuscation logic

Use JADX to open ProGuard.jar and find the entry class. The default names come from the newName method of SimpleNameFactory, decompiled below:

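private String newName(int i) {
        // Mixed-case mode uses 52 characters (a-z plus A-Z); otherwise CHARACTER_COUNT (26).
        int i2 = this.generateMixedCaseNames ? 52 : CHARACTER_COUNT;
        int i3 = i / i2;
        char charAt = charAt(i % i2);
        if (i3 != 0) {
            // Prefix with the name for the quotient, then append the new character.
            return new StringBuffer().append(name(i3 - 1)).append(charAt).toString();
        }
        return new String(new char[]{charAt});
    }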
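
To see which names this logic produces, here is a self-contained re-creation that can be run directly. The class NameSketch is hypothetical, its charAt mapping (a-z first, then A-Z) is an assumption about the helper not shown above, and any caching of generated names is left out.

```java
// Self-contained re-creation of the decompiled logic; the helpers are simplified assumptions.
class NameSketch {
    static final int CHARACTER_COUNT = 26;

    // Maps 0..25 to 'a'..'z' and 26..51 to 'A'..'Z'.
    static char charAt(int index) {
        return (char) (index < CHARACTER_COUNT ? 'a' + index : 'A' + index - CHARACTER_COUNT);
    }

    // Builds the name for an index: one character while the index fits in the
    // character set, otherwise the name of the quotient followed by one character.
    static String name(int index, boolean mixedCase) {
        int total = mixedCase ? 2 * CHARACTER_COUNT : CHARACTER_COUNT;
        int prefixIndex = index / total;
        char last = charAt(index % total);
        return prefixIndex == 0
                ? String.valueOf(last)
                : name(prefixIndex - 1, mixedCase) + last;
    }

    public static void main(String[] args) {
        System.out.println(name(0, false));  // a
        System.out.println(name(25, false)); // z
        System.out.println(name(26, false)); // aa
        System.out.println(name(27, false)); // ab
        System.out.println(name(51, true));  // Z
        System.out.println(name(52, true));  // aa
    }
}
```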

For example, with the 26-character set, the name generated after z is aa, then ab, and so on: once every character has been used, one more character is appended. DictionaryNameFactory is the other branch that generates names, which raises two questions:

  1. How is the list of dictionary names generated?
  2. How does the NameFactory generate its names?

Take a look at the DictionaryNameFactory constructor:



public static final String OBFUSCATION_DICTIONARY_OPTION = "-obfuscationdictionary";
public static final String CLASS_OBFUSCATION_DICTIONARY_OPTION = "-classobfuscationdictionary";
public static final String PACKAGE_OBFUSCATION_DICTIONARY_OPTION = "-packageobfuscationdictionary";

The names used for packages, classes, and methods and fields can each be configured by setting an obfuscation dictionary. Here's a summary:

  • If no obfuscation dictionary is set, package, class, method, and field names are obfuscated with the default a-z names
  • A dictionary can be configured for package names, for class names, and for method and field names. When the configured dictionary runs out, the default a-z names are used
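
The fallback behaviour summarized above can be sketched as follows. This is a simplified model rather than ProGuard's DictionaryNameFactory: the class DictionaryFallbackSketch and its methods are hypothetical, and details such as skipping duplicate names are ignored.

```java
import java.util.Iterator;
import java.util.List;

// Sketch of "dictionary first, default names afterwards"; names here are hypothetical.
class DictionaryFallbackSketch {
    private final Iterator<String> dictionary;
    private int fallbackIndex = 0;

    DictionaryFallbackSketch(List<String> words) {
        this.dictionary = words.iterator();
    }

    // Returns dictionary words until they run out, then a, b, ..., z, aa, ...
    String nextName() {
        if (dictionary.hasNext()) {
            return dictionary.next();
        }
        return defaultName(fallbackIndex++);
    }

    // Lower-case default naming: 0 -> a, 25 -> z, 26 -> aa.
    static String defaultName(int index) {
        char last = (char) ('a' + index % 26);
        return index < 26 ? String.valueOf(last) : defaultName(index / 26 - 1) + last;
    }

    public static void main(String[] args) {
        DictionaryFallbackSketch factory = new DictionaryFallbackSketch(List.of("_", "__"));
        System.out.println(factory.nextName()); // _
        System.out.println(factory.nextName()); // __
        System.out.println(factory.nextName()); // a
        System.out.println(factory.nextName()); // b
    }
}
```

A dictionary that is too short therefore gives a mix of custom and default names, which is worth keeping in mind when sizing it.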

Upgrading the obfuscation

The obfuscation mechanism and the way to set a custom obfuscation dictionary were introduced above. Here are two methods of customizing the character set, with a comparison of their advantages and disadvantages.

  • Set the character set directly in the configuration file

Add the following to proguard-rules.pro in the demo project:

-obfuscationdictionary ./dictionary
-classobfuscationdictionary ./dictionary
-packageobfuscationdictionary ./dictionary

Create a file named dictionary in the same directory to serve as the test character set:

_
__
___
____
_____
______

The dictionary here consists only of underscores. The final effect looks like this:

  • Modify ProGuard.jar Directly modify the name generation method in SimpleNameFactory and build a new ProGuard.jar

Modifying ProGuard.jar should be relatively easy, although I have not tried it myself. Here's a comparison of the two schemes:

Plan | Advantages | Disadvantages
Set the dictionary in the configuration | Convenient to configure | Falls back to the default a-z names when the dictionary is used up
Modify ProGuard.jar | Full control over how names are generated | Requires modifying the library

Conclusion

This article introduced the functions of ProGuard and how to use its obfuscation feature to strengthen application protection. One reminder: when choosing a character set, also pay attention to package size. Each a-z character takes 1 byte, so if you use Chinese or other characters, measure the impact first.
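
A quick way to estimate this cost is to compare encoded name lengths. The sketch below uses standard UTF-8 as an approximation: class files store names in a modified UTF-8 that gives the same byte counts for the characters shown here. The class NameSizeSketch is a hypothetical helper.

```java
import java.nio.charset.StandardCharsets;

// Rough size check for candidate obfuscated names, using UTF-8 byte length.
class NameSizeSketch {

    static int utf8Length(String name) {
        return name.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("a"));      // 1
        System.out.println(utf8Length("______")); // 6
        System.out.println(utf8Length("\u4e2d")); // 3, a single Chinese character
    }
}
```

A one-character Chinese name costs as much as a three-letter ASCII name, and long underscore names grow by one byte per character.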