The birth of an APK

The diagram above is common in other articles as well. It shows that the birth of an APK can be divided into the following steps:

  1. The aapt tool compiles the resource files and generates the corresponding R file. Files in the assets directory are not part of this step.
  2. The aidl tool converts AIDL files into Java code.
  3. The Java compiler compiles the output of the two steps above together with our own source code into the familiar class files.
  4. The dex tool converts the third-party libraries and the class files into binary dex files.
  5. The apkbuilder tool packages the compiled resources, the dex files, and the resource files under assets into the APK file we finally see.
  6. A signing tool signs the APK (signing tools are not limited to jarsigner).
  7. The zipalign tool helps optimize resource indexing.

aapt / resource compilation phase

The aapt tool is located in the build-tools directory of the Android SDK.

For detailed usage, see the AAPT2 documentation.

AAPT2 compiles the resources in the res directory. Each call to AAPT2 for compilation should pass one resource file as input. AAPT2 parses that file and generates an intermediate binary file with the .flat extension. This binary corresponds to Compiled Resources in the figure.

aapt2 compile project_root/module_root/src/main/res/values-en/strings.xml -o compiled/

The input must be the full path to a resource file, and -o names the directory that receives the output. The result, as described above, is a binary file with the .flat suffix.

Here’s what you get for different file types:

Input | Output
XML resource files in the res/values/ directory (such as strings and styles) | A resource table with the *.arsc.flat extension
All other resource files | Files outside the res/values/ directory are converted to binary XML files with the *.flat extension. In addition, all PNG files are compressed by default and get the *.png.flat extension. To skip PNG compression, pass the --no-crunch option during compilation
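To make the table concrete, here is a minimal sketch that predicts the name of the .flat file for a given input. The rule it encodes (parent directory name, an underscore, then the file name, with values files losing their .xml suffix) is my reading of the AAPT2 documentation, so treat it as an assumption; `flat_name` is a hypothetical helper, not part of any tool.

```python
from pathlib import PurePosixPath


def flat_name(resource_path: str) -> str:
    """Guess the output file name that `aapt2 compile` produces for one input.

    Assumed convention: <parent-dir>_<file-name>.flat, except that files
    under values*/ become <parent-dir>_<stem>.arsc.flat.
    """
    path = PurePosixPath(resource_path)
    res_dir = path.parent.name  # e.g. "values-en" or "drawable"
    if res_dir.split("-")[0] == "values":
        return f"{res_dir}_{path.stem}.arsc.flat"
    return f"{res_dir}_{path.name}.flat"


print(flat_name("project_root/module_root/src/main/res/values-en/strings.xml"))
print(flat_name("project_root/module_root/src/main/res/drawable/myImage.png"))
```

The first call corresponds to the compile command shown earlier. The drawable path is made up for illustration.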

In addition, the link step of AAPT2 generates an R file that uniquely identifies each resource.

aapt2 link path-to-input-files [options] -o outputdirectory/outputfilename.apk 
--manifest AndroidManifest.xml

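Because compilation and linking are separate steps, only the resource files that changed need to be recompiled before linking again, which makes incremental builds possible.

Android Studio comes with a built-in tool for inspecting the result: click an APK and its contents are parsed and shown directly.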

We can split the whole int value (0x7f010000 in this example) into its four bytes:

  1. The first byte, 0x7f, is the packageID. It identifies where the resource comes from: application resources use 0x7f, system resources use 0x01.
  2. The second byte, 01, is the typeID. It indicates the resource type, such as drawable, layout, or menu. The next resource type gets typeID 02.
  3. The last two bytes, 0000, give the order in which each resource appears within its typeID. (The space reserved for this is fairly generous.)
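The split described above is plain bit shifting and masking. A minimal sketch follows. `ResourceId` and `split_resource_id` are names I made up for illustration, and the second ID in the demo is an invented value.

```python
from typing import NamedTuple


class ResourceId(NamedTuple):
    package_id: int  # first byte: 0x7f for the app, 0x01 for the system
    type_id: int     # second byte: resource type (drawable, layout, menu, ...)
    entry_id: int    # last two bytes: position of the resource within its type


def split_resource_id(res_id: int) -> ResourceId:
    """Split a 32-bit resource ID into its package, type, and entry parts."""
    return ResourceId(
        package_id=(res_id >> 24) & 0xFF,
        type_id=(res_id >> 16) & 0xFF,
        entry_id=res_id & 0xFFFF,
    )


for value in (0x7F010000, 0x7F020003):
    parts = split_resource_id(value)
    print(f"0x{value:08x} -> package=0x{parts.package_id:02x} "
          f"type=0x{parts.type_id:02x} entry=0x{parts.entry_id:04x}")
```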

In the parsed APK we also find a file named resources.arsc. This file is generated by aapt as well.

Setting aside the plain values entries and focusing on the layout entries, we find qualifiers such as v17 and watch-v20. Layouts are in fact matched to different versions, and the mipmap or drawable entries are likewise matched to different screen sizes.

Q1: What do the R.java and resources.arsc files do?

A1: resources.arsc is what lets an application support screens of different sizes and densities, and different languages, at run time. The R file assigns each resource a unique identifier, which lets the application quickly look up the resource that matches the device's current configuration.

Java Compile + Dex / code compilation

In practice, most of this work is done for us by the Build feature of Android Studio, a capability that Gradle provides.

What is Gradle for?

In "A little bit about Python" I shared a picture like this. In fact, one of Gradle's capabilities is resolving the dependencies between different third-party libraries for us. Underneath it builds on Java tooling, and in the Build output we often see a task like this one.

It’s important to understand what Gradle is before we get into the Gradle packaging process. Let’s take a look at the following XML file.

<dependencies>
        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-context</artifactId>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
        </dependency>
</dependencies>

If you have done back-end development, you have probably come across XML files like this all the time. The tool that understands this file format is Maven, a peer of Gradle with similar capabilities.

dependencies {
    compile('org.springframework:spring-context:2.5.6')
    testCompile('junit:junit:4.7')
}

Gradle, by contrast, resolves dependencies using a syntax of its own, and the result is clearer at a glance. But I still have not said what Gradle is for. Put simply, it is an automated project build tool. How indispensable is such a tool in our development process? Let's look at what happens before an APK comes into being. What do we focus on when we write code? Usually the following:

  1. Source code files: Kotlin, Java, C, AIDL, and so on.
  2. Resource files: images, videos, layouts, and so on.
  3. The R file: the unique identifiers of the various resources.

With those in place, we have probably finished coding and use the Run feature that Android Studio offers. If nothing goes wrong, the project will be running happily on your test device.

You may not feel it yet, but what if you look at this picture? Now you can really see what Gradle does. For us developers it is just a click of the Run button, but what Gradle does for us behind that button is enormous.

So we know Gradle is useful, but why mention its sibling Maven? Mainly to persuade you to switch build tools, based on the build speed comparison on the official website.

For details, see Gradle vs Maven: Performance Comparison

Company projects are usually componentized and involve many, many teams, so the build times of a large project make the most telling comparison. According to that comparison, Gradle is 2-3 times faster for clean builds, about 7 times faster for incremental changes, and about 3 times faster when task output comes from Gradle's cache. For us developers such efficiency has a "downside" as well as an upside. Suppose I were a Douyin developer and Douyin's build tool were Maven, with each incremental build taking 20 minutes. That gives me 20 minutes of slacking-off time per build, but what if I build 10 to 20 times a day? Over a whole day, productivity collapses. I might miss deadlines because builds are too slow, and lose my year-end bonus. With Gradle the efficiency doubles and each incremental build takes only 10 minutes. The slacking-off time is shorter, but productivity goes up, and the boss says you did a good job and gives you a bonus of three months' pay.

Back to the topic: how does Gradle provide these capabilities?

Proguard + Dex

Dex is the tool that converts class files into the binary dex format. I will not go into it here.

When it comes to ProGuard, the part that probably 80% of developers know best is obfuscation.

Q1: What are the benefits of obfuscation?

A1: Why do we obfuscate? Quite simply, we do not want third parties to get at our app's source code easily. That gives the first benefit: the code loses its intuitive semantics, which deters the amateur attackers who would like to steal company secrets. The tool also brings a second benefit: the code gets shorter, which contributes significantly to reducing the overall package size.

Is that all ProGuard does? Obviously not. As the picture shows, ProGuard leaves the system libraries out of its scope, which is why the android.support library is still clearly readable in the obfuscated output. My own guess is that obfuscating the system libraries as well could lead to strange bugs.

We divide the whole process into four parts:

  1. shrink: code shrinking
  2. optimize: instruction optimization
  3. obfuscate: code obfuscation
  4. preverify: code preverification

Shrink

Shrinking needs entry points. ProGuard starts marking from the roots derived from the configuration and spreads outward from those roots. When marking is complete, it deletes the unmarked classes and members. The end result is a leaner ClassPool.

Q1: Where do these roots come from?

A1: Roots include classes, methods, fields, and method instructions. There are two main sources:

  1. A keep rule whose allowshrinking is not true: the classes and members that its class_specification matches become roots.
  2. A keepclasseswithmembers rule whose allowshrinking is not true: if both the class condition and the member condition are satisfied, the classes and members that its class_specification matches become roots.

Q2: What code gets deleted?

A2: Any method or class that has no call site anywhere in the program and is not retained by a keep rule.
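The mark-then-delete idea can be sketched as a reachability walk over a call graph. This is a toy model under my own assumptions, not ProGuard's implementation: the graph is a plain dictionary, the member names are invented, and real shrinking also tracks fields, class hierarchies, and reflection-related keep rules.

```python
from collections import deque


def shrink(call_graph: dict[str, set[str]], roots: set[str]) -> set[str]:
    """Return the members that survive: everything reachable from the roots."""
    marked = set(roots)
    queue = deque(roots)
    while queue:
        member = queue.popleft()
        for callee in call_graph.get(member, set()):
            if callee not in marked:
                marked.add(callee)
                queue.append(callee)
    return marked


# Hypothetical program: each key calls the members in its value set.
call_graph = {
    "MainActivity.onCreate": {"Repo.load", "Logger.log"},
    "Repo.load": {"Logger.log"},
    "Logger.log": set(),
    "LegacyHelper.format": {"Logger.log"},  # never reached from a root
}

kept = shrink(call_graph, roots={"MainActivity.onCreate"})
removed = set(call_graph) - kept
print("kept:", sorted(kept))
print("removed:", sorted(removed))
```

Here MainActivity.onCreate plays the role of a member matched by a keep rule. LegacyHelper.format has no call site reachable from it, so it is the one that gets dropped.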

Optimize

At this stage ProGuard analyzes the code's instructions, stack, local variables, and data flow, and it optimizes and simplifies the code by simulating what can happen when the program runs. Because data flow analysis has to traverse all the bytecode many times, ProGuard uses multiple threads to speed this stage up.

The optimization strategies are described in detail in the Optimize section of "ProGuard: an initial discussion".

Obfuscate

Obfuscation is probably the part we see most often. It brings two benefits:

  1. The code loses its intuitive semantics (normally we follow naming conventions that make methods and functions self-explanatory).
  2. The code gets shorter, which reduces the overall package size.
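Both benefits show up even in a toy renaming pass. The sketch below is not ProGuard's algorithm; it only hands out the shortest unused names (a, b, ..., z, aa, ...) and leaves alone anything listed in `keep`. The identifiers in the demo are invented.

```python
from itertools import count


def short_names():
    """Yield a, b, ..., z, aa, ab, ... in order."""
    for index in count(1):
        n, name = index, ""
        while n > 0:
            n, rem = divmod(n - 1, 26)
            name = chr(ord("a") + rem) + name
        yield name


def build_mapping(names, keep=frozenset()):
    """Map each original name to a short name, leaving kept names untouched."""
    generator = short_names()
    return {name: (name if name in keep else next(generator)) for name in names}


original = ["UserRepository", "loadUserProfile", "MainActivity"]
mapping = build_mapping(original, keep={"MainActivity"})
for old, new in mapping.items():
    print(f"{old} -> {new}")

saved = sum(len(old) - len(new) for old, new in mapping.items())
print("characters saved:", saved)
```

The mapping plays the role of the mapping file that an obfuscator writes out so that stack traces can be translated back later.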

Preverify

Preverifies the code by checking the StackMap/StackMapTable attributes. Bytecode verification on the Android VM is not based on StackMap or StackMapTable.

See ProGuard for details

D8

D8 is the replacement for Dex.

This compiler was introduced mainly to support lambdas, the new Java 8 feature. In Java, lambdas are implemented underneath with the invokedynamic instruction, but Dalvik/ART supports neither invokedynamic nor an equivalent. In short, Android's dex compiler does not support the invokedynamic instruction, so Android cannot support Java 8 directly.

What Android does instead is support it indirectly: it converts each lambda into a form it can already handle and executes that. After compiling, we can see that the output contains generated classes with Lambda in their names. That explains the approach: the generated code uses the scheme we commonly used in Java 7.

But do you think the new product will stop there? 🤫

  1. Improved compilation speed
  2. Smaller compiled dex files

R8

R8 is the replacement for ProGuard + Dex.

R8 combines D8 with the work ProGuard used to do.

As the successor to ProGuard, R8 inherits the original features and extends them. What did the developers achieve with R8? The figure shows it directly: R8 integrates the capabilities of ProGuard and Dex into one tool, which both improves compilation efficiency and benefits package size.

apkbuilder is the tool that assembles the pieces into the APK.

Signing

Why do Android applications need signatures? You have probably run into this: the same project is installed from two different machines onto the same phone, and you get an error about mismatched signatures. We might simply uninstall and reinstall, which gets around the problem, but the real cause is the signature. If both machines sign with the same key, the problem goes away.

What benefits do signatures bring us?

  1. Signing with a particular key can grant an application a number of additional permissions.
  2. Signatures verify that data has not been tampered with, which prevents an application from being overwritten by a malicious third party.

In the Generate Signed Bundle or APK dialog of Android Studio you can see two signature options: Jar Signature and Full APK Signature. What is the difference between the two?

Jar Signature / v1

What does a Jar Signature look like inside the APK? The v1 signing process is simple and consists of three steps:

  1. Compute a digest for every file that is neither a directory nor filtered out, and store the digests in the MANIFEST.MF file.
  2. Compute a digest of the MANIFEST.MF file as a whole and of each entry in MANIFEST.MF, and store them in the CERT.SF file.
  3. Sign the CERT.SF file with the specified private key, then write the signature, together with the digital certificate that contains the public key, to CERT.RSA.
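Steps 1 and 2 can be sketched with nothing but hashing. This is a simplified model under my own assumptions: SHA-256 with Base64 encoding, no 72-byte line wrapping, and no filtering of META-INF files. Step 3 is left out because it needs a real private key and a cryptography library. `build_manifest` and `build_signature_file` are hypothetical helper names.

```python
import base64
import hashlib


def b64_sha256(data: bytes) -> str:
    """Base64-encoded SHA-256 digest, the form used in manifest attributes."""
    return base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")


def build_manifest(files: dict[str, bytes]) -> str:
    """Step 1: one MANIFEST.MF entry per file, holding the file's digest."""
    sections = ["Manifest-Version: 1.0\r\n\r\n"]
    for name, content in files.items():
        sections.append(
            f"Name: {name}\r\nSHA-256-Digest: {b64_sha256(content)}\r\n\r\n"
        )
    return "".join(sections)


def build_signature_file(manifest: str) -> str:
    """Step 2: digest of the whole manifest plus a digest of each entry."""
    sections = [s + "\r\n\r\n" for s in manifest.split("\r\n\r\n") if s]
    entries = sections[1:]  # skip the main "Manifest-Version" section
    out = [
        "Signature-Version: 1.0\r\n"
        f"SHA-256-Digest-Manifest: {b64_sha256(manifest.encode())}\r\n\r\n"
    ]
    for entry in entries:
        name_line = entry.split("\r\n")[0]  # "Name: <path>"
        out.append(
            f"{name_line}\r\nSHA-256-Digest: {b64_sha256(entry.encode())}\r\n\r\n"
        )
    return "".join(out)


files = {"classes.dex": b"fake dex bytes", "res/icon.png": b"fake png bytes"}
manifest = build_manifest(files)
signature_file = build_signature_file(manifest)
print(manifest)
print(signature_file)
```

Changing a single byte of any file changes its manifest entry, which changes both the entry digest and the whole-manifest digest in CERT.SF, so the signature over CERT.SF no longer verifies.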

The process itself makes the weakness of this scheme fairly obvious: the final signature data is effectively exposed to the outside. With a little care about that data, someone can decompile an APK and then build it back.

Full APK Signature / v2

Now that we know how Jar Signature works, how does the newer scheme work?

APK Signature Scheme v2 is a whole-file signature scheme. It speeds up verification and strengthens integrity guarantees by detecting any change to the protected parts of the APK.

When an APK is signed with APK Signature Scheme v2, an APK Signing Block is inserted into the APK file immediately before the ZIP Central Directory section. Inside the APK Signing Block, the v2 signatures and the signer identity information are stored in an APK Signature Scheme v2 Block.

APK Signature Scheme v2 verification

  1. Locate the APK Signing Block and verify that:
    1. The two size fields of the APK Signing Block contain the same value.
    2. The ZIP Central Directory is immediately followed by the ZIP End of Central Directory record.
    3. No data follows the ZIP End of Central Directory.
  2. Locate the first APK Signature Scheme v2 Block inside the APK Signing Block. If a v2 block is present, proceed to step 3. Otherwise, fall back to verifying the APK with the v1 scheme.
  3. For each signer in the APK Signature Scheme v2 Block:
    1. Choose the strongest supported signature algorithm ID from signatures. The strength ordering is up to each implementation/platform version.
    2. Verify the corresponding signature from signatures against signed data using the public key. (It is now safe to parse signed data.)
    3. Verify that the ordered lists of signature algorithm IDs in digests and signatures are identical. (This prevents signatures from being stripped or added.)
    4. Compute the digest of the APK contents using the same digest algorithm as the one used by the signature algorithm.
    5. Verify that the computed digest matches the corresponding digest in digests.
    6. Verify that the SubjectPublicKeyInfo of the first certificate in certificates is identical to the public key.
  4. Verification succeeds if at least one signer was found and step 3 succeeded for every signer found.
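Step 1 of the list above can be sketched in a few lines. The sketch assumes the usual layout of the APK Signing Block (a uint64 size, the ID-value pairs, the same uint64 size again, then the 16-byte magic "APK Sig Block 42") and ignores ZIP64 and archive comments that happen to contain the EOCD signature. `find_signing_block` and `with_fake_block` are hypothetical helpers, and the block inserted in the demo carries dummy bytes rather than a real signature.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"
MAGIC = b"APK Sig Block 42"


def find_signing_block(apk: bytes):
    """Return the APK Signing Block bytes, or None if the archive has none."""
    eocd = apk.rfind(EOCD_SIGNATURE)
    if eocd < 0:
        raise ValueError("not a ZIP archive")
    cd_size, cd_offset, comment_len = struct.unpack_from("<IIH", apk, eocd + 12)
    # Checks 1.2 and 1.3: EOCD directly follows the central directory
    # and nothing comes after it.
    if cd_offset + cd_size != eocd or eocd + 22 + comment_len != len(apk):
        raise ValueError("unexpected data around the central directory")
    if cd_offset < 32 or apk[cd_offset - 16:cd_offset] != MAGIC:
        return None
    (size_after,) = struct.unpack_from("<Q", apk, cd_offset - 24)
    start = cd_offset - size_after - 8
    if start < 0:
        raise ValueError("signing block size out of range")
    (size_before,) = struct.unpack_from("<Q", apk, start)
    # Check 1.1: both size fields must agree.
    if size_before != size_after:
        raise ValueError("signing block size fields differ")
    return apk[start:cd_offset]


def with_fake_block(apk: bytes, payload: bytes) -> bytes:
    """Insert a dummy signing block before the central directory (demo only)."""
    eocd = apk.rfind(EOCD_SIGNATURE)
    (cd_offset,) = struct.unpack_from("<I", apk, eocd + 16)
    size = struct.pack("<Q", len(payload) + 8 + 16)
    block = size + payload + size + MAGIC
    patched = bytearray(apk[:cd_offset] + block + apk[cd_offset:])
    # The central directory moved, so fix its offset in the EOCD record.
    struct.pack_into("<I", patched, eocd + len(block) + 16, cd_offset + len(block))
    return bytes(patched)


buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("classes.dex", b"demo")
plain = buffer.getvalue()
signed = with_fake_block(plain, b"\x00" * 12)

print(find_signing_block(plain))
print(len(find_signing_block(signed)))
```

Steps 2 to 4 need certificate parsing and signature verification, which this sketch does not attempt.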

So how is the digest that v2 signs actually computed?

For details about the v2 computation, see APK Signature Scheme v2 Block.

  1. Each section is split into consecutive 1 MB chunks. The last chunk of each section may be shorter.
  2. The digest of each chunk is computed over byte 0xa5 + the length of the chunk + the contents of the chunk.
  3. The top-level digest is computed over byte 0x5a + the number of chunks + the concatenated chunk digests.

The data is split into chunks so that the computation can be sped up through parallel processing.
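The three steps translate almost directly into code. In this sketch I assume SHA-256 as the digest algorithm and encode the chunk length and chunk count as little-endian uint32 values, which is how I read the v2 specification. The sections passed in are stand-ins for the real protected sections of an APK.

```python
import hashlib
import struct

CHUNK_SIZE = 1024 * 1024  # 1 MB


def chunk_digests(section: bytes) -> list[bytes]:
    """Steps 1 and 2: digest every 1 MB chunk as 0xa5 + length + contents."""
    digests = []
    for start in range(0, len(section), CHUNK_SIZE):
        chunk = section[start:start + CHUNK_SIZE]
        prefix = b"\xa5" + struct.pack("<I", len(chunk))
        digests.append(hashlib.sha256(prefix + chunk).digest())
    return digests


def top_level_digest(sections: list[bytes]) -> bytes:
    """Step 3: digest 0x5a + chunk count + all chunk digests in order."""
    digests = [d for section in sections for d in chunk_digests(section)]
    prefix = b"\x5a" + struct.pack("<I", len(digests))
    return hashlib.sha256(prefix + b"".join(digests)).digest()


# Stand-in sections; the first one spans two chunks.
sections = [b"a" * (CHUNK_SIZE + 10), b"b" * 5]
print(len(chunk_digests(sections[0])), "chunks in the first section")
print(top_level_digest(sections).hex())
```

Because each chunk digest depends only on its own bytes, `chunk_digests` could be run over the chunks in parallel, which is the point of the chunking.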

v3 (Android 9 and later)

The v3 scheme adds support for certificate rotation: during an upgrade install, the APK can be signed with a new certificate and a new private key. The new certificate is vouched for by the old one, much like a certificate chain.

For details, see New v3 signature features in Android P

v4 (Android 11)

This scheme produces a new signature in a separate file (apk-name.apk.idsig) but is otherwise similar to v2 and v3. The APK itself is not changed. The scheme supports ADB incremental APK installation. Installing a large (2 GB+) APK on a device can take a long time. ADB (Android Debug Bridge) incremental installation installs enough of the APK to launch the application while streaming the remaining data in the background, which speeds installation up.

zipalign

zipalign is an archive alignment tool that provides an important optimization for Android application (APK) files. Its purpose is to ensure that all uncompressed data starts at a particular alignment relative to the start of the file. Specifically, it aligns all uncompressed data in the APK, such as images or raw files, on 4-byte boundaries.
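The arithmetic behind the alignment is small enough to show. The sketch below only computes how many padding bytes an entry would need so that its data starts on a 4-byte boundary. It does not rewrite a ZIP file, and `padding_needed` is a hypothetical helper rather than anything zipalign exposes.

```python
def padding_needed(data_offset: int, alignment: int = 4) -> int:
    """Bytes to insert so that data_offset lands on an alignment boundary."""
    return (alignment - data_offset % alignment) % alignment


for offset in (30, 32, 33):
    pad = padding_needed(offset)
    print(f"data at offset {offset}: add {pad} byte(s), new offset {offset + pad}")
```

The outer modulo keeps the result at 0 for offsets that are already aligned.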

When to use it

zipalign must be run at one of two specific points in the build process, depending on which signing tool you use:

  • If you use jarsigner, zipalign must be run after the APK file has been signed.
  • If you use apksigner, zipalign must be run before the APK file is signed. If you make further changes to the APK after signing it with apksigner, its signature is invalidated.

And with that, a runnable APK is born.

Running the APK on an Android phone

Since we are going to run it on a phone, we will probably need adb. As a refresher, the install command is adb install <dir>/XXXX.apk. On Android, what we need to know about are the two virtual machines, Dalvik and ART.

But first we need to know why, with the JVM already available, Google built its own DVM. Put as a question: Android programs are clearly written in Java and could run directly on the JVM, so why write another virtual machine?

Many articles give this explanation: the JVM lets the same code run everywhere, but in terms of performance it clearly cannot match operating on data directly in registers. I once heard another story, though: Oracle restricted Google's use of the JVM 😵, so Google built the DVM, which turned out to work better than the JVM, and it caught on.

So why does the JVM run slower than the DVM?

JVM | DVM
Stack-based | Register-based
Class files | Dex files
Loaded on demand | Loaded all at once

Before multidex was introduced, the DVM loaded everything in one go. That may make loading slower than on the JVM, but once loading is done the overall efficiency is higher, for these reasons:

  1. With on-demand loading, the JVM may not have a class ready at the moment it is needed.
  2. Being stack-based, the JVM needs more complex instruction sequences for the same operation.
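The stack-versus-register point is easiest to see with a toy interpreter for each style. The instruction names below are invented for illustration and only loosely mirror real JVM and Dalvik opcodes. Both programs compute c = a + b.

```python
def run_stack(program, variables):
    """Stack machine: operands are pushed and popped around every operation."""
    stack = []
    for op, *args in program:
        if op == "load":
            stack.append(variables[args[0]])
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "store":
            variables[args[0]] = stack.pop()
    return variables


def run_register(program, registers):
    """Register machine: one instruction names its sources and destination."""
    for op, dest, src1, src2 in program:
        if op == "add":
            registers[dest] = registers[src1] + registers[src2]
    return registers


stack_program = [("load", "a"), ("load", "b"), ("add",), ("store", "c")]
register_program = [("add", "c", "a", "b")]

stack_result = run_stack(stack_program, {"a": 2, "b": 3})
register_result = run_register(register_program, {"a": 2, "b": 3})
print(len(stack_program), "stack instructions ->", stack_result["c"])
print(len(register_program), "register instruction ->", register_result["c"])
```

The same addition takes four instructions on the stack machine and one on the register machine, at the cost of wider instructions that carry their operands.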

If Dalvik already sounds so good, why develop yet another virtual machine, ART?

ART's optimizations come from the following angles:

  1. AOT (ahead-of-time) compilation, which converts bytecode directly into machine code for the target device.
  2. A more efficient and finer-grained garbage collection (GC) mechanism.

AOT (ahead-of-time) compilation

JIT (just in time)

Bytecode is compiled to native machine code at run time.

Disadvantages:

  • The code has to be recompiled every time the application starts
  • Running the app uses more power (because compilation keeps happening)

AOT (ahead of time)

Bytecode is compiled to native machine code when the application is installed.

Disadvantages:

  • Optimization after installing an application or upgrading the system is time-consuming (the code has to be recompiled into machine code)
  • The optimized files take up extra storage space (the compilation results are cached)

JIT + AOT

Why did this combination come about? Because pure JIT and pure AOT each have their own advantages and disadvantages.

This is an era where mobile data matters, and the size and install time of a package are often the weak spot that decides whether users install it. The reasons are covered in "Technical analysis of competing apps (3): reducing the size of the installation package". That gives JIT an advantage: installation is faster because nothing is compiled at install time. But what about after the app is running? There JIT's advantage falls off a cliff, and AOT can speed up execution the next time the program starts. So what triggers AOT?

When the phone is idle or has been charging for a long time, the system runs the AOT compilation process and caches the generated machine code in a file. AOT is therefore a process that runs without any intervention and that we cannot control.

More efficient and fine-grained garbage collection mechanism (GC)

GC can be divided into the following aspects:

  1. Memory allocator
  2. Garbage collection algorithm
  3. Support for large object storage space
  4. Moving GC policy
  5. Diversity of GC scheduling policies

Here we cover only the garbage collection algorithm. First, a quick review of what you need to know about the JVM. I have previously mentioned three garbage collection algorithms in the JVM: copying, mark-sweep, and mark-compact. The JVM also manages the whole heap life cycle under its own rules, and it has many garbage collectors, such as the Serial collector, the ParNew collector, the G1 collector...

But that is the JVM. How does the DVM go about it?

The DVM's solution is simple: it stops the world, as the original JVM garbage collectors did, then runs its own collection algorithm, which marks the data still in use and then sweeps away the rest. The resulting user experience is hard to put into words.

How does ART improve performance while still keeping the stop-the-world concept? ART moves part of the garbage collector's work into the application itself, and that part is essentially the marking. As a blind guess, ART probably implements this with something like a mark bit whose value is updated continuously, so that by the time cleanup is needed the marking is already more or less complete. The cost may lie in maintaining that mark bit. To reduce the burden of the cleanup phase, Google also introduced a technique called packard pre-cleaning, which reduces the amount of work the GC has left to do and so improves efficiency.

See Android 5.0 ART GC vs. Android 4.x Dalvik GC for details

References

  1. An analysis of how Android supports Java 8 syntax features
  2. Shrink, obfuscate, and optimize your app
  3. More on the Android signature mechanism
  4. APK Signature Scheme v2
  5. New v3 signature features in Android P