Author: Su Yanjiao (Mu Lei)

Android projects typically use Gradle as their build and packaging tool. Gradle is often praised for its concise, dynamic features, but it is just as often criticized for slow builds.

In recent years, as Youku's features have grown richer, its codebase has grown sharply, and build times have grown with it. A full package build once took up to 35 min, which seriously hurt integration and iteration efficiency, so build speed optimization became imperative. By November 2021, Youku's build time optimization had achieved fairly good results (see below). This article records the approach we took in practice.

Android build type | 2020 | 2021
Debug package build time | 12 min | 2.5 min
Release package build time | 35 min | 12 min

Optimization measures and benefits chart:

Optimization approach

Technical optimization projects are generally carried out along three dimensions: setting data metrics, technical optimization, and protecting the results against regression. Broken down this way, we need to complete the following three sub-projects:

  1. Set data metrics: collect and select the core metrics that reflect the value of the optimization. In this article, build time, build failure rate and number of builds per hour were selected as the supporting data.
  2. Technical optimization: build speed is affected by both software and hardware factors, so the work is divided into software optimization and hardware optimization.
  3. Regression prevention: keep the optimized metrics from deteriorating so that the results last.

The rest of this article covers three parts: setting data metrics and regression prevention, technical optimization on the software side, and technical optimization on the hardware side.

Optimization scheme

Setting data metrics and preventing regression

An optimization project needs a sound metrics system, so that data can be used to evaluate each optimization item and to judge whether a given plan is effective. Before starting the build optimization, the author built data evaluation and dashboard monitoring on Alibaba's Aone FaaS (serverless) platform. The dashboard tracks metrics such as build type, build time, build success rate and per-task time, so that the project can break builds down by type, examine task frequency and investigate time-consuming tasks.

With this data capability in place, tracking and analyzing the two key metrics, build time and build success rate, showed that build time is driven mainly by time-consuming tasks, and build success rate mainly by unreasonable build tasks. Alerts on time-consuming tasks and on unreasonable tasks therefore let us detect and analyze build speed regressions quickly, which protects the build time gains.
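The alerting idea can be sketched in a few lines of Python. This is only an illustration: the `TaskRecord` schema, the 60-second threshold and the function names are assumptions made for the example, not the data model of the Aone FaaS dashboard.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    """One build-task timing sample (hypothetical schema)."""
    name: str
    seconds: float
    succeeded: bool


def slow_task_alerts(records, threshold_seconds=60.0):
    """Return the names of tasks slower than the threshold, slowest first."""
    slow = [r for r in records if r.seconds > threshold_seconds]
    slow.sort(key=lambda r: r.seconds, reverse=True)
    return [r.name for r in slow]


def success_rate(records):
    """Fraction of task runs that succeeded; 1.0 for an empty input."""
    if not records:
        return 1.0
    return sum(1 for r in records if r.succeeded) / len(records)
```

Running a check like this after every build, and comparing the result with the previous build, is one simple way to notice a regression as soon as a task starts to slow down.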

Software optimization

Removing Atlas from the build side

Atlas is a container framework for Android that grew out of the continuous development of Mobile Taobao; it is also called a dynamic bundle framework. It mainly provides decoupling, componentization and dynamic delivery support, and it addresses problems across the coding phase, the APK runtime phase and the subsequent operation and maintenance phase.

Atlas can be regarded as an OSGi-style implementation and componentization solution for mobile, built on a deeply customized artifact structure and a complex runtime framework with deep hooks. However, as the Youku mobile architecture changed and Youku's self-developed remote solution was rolled out, the Atlas runtime framework gradually lost its role as an OSGi framework, so Atlas was removed from the runtime.

Once the runtime no longer depended on Atlas, the complex Atlas build process (shown in the figure below) also lost its purpose. Youku then launched a project to remove Atlas from the build side, aiming to remove the Atlas build plugin and make the build native, clean and simple. Through native artifact formats, build task cleanup and a toolchain upgrade, the project removed Atlas from the build side and improved build performance along the way.

Benefit: debug package build time reduced by about 3 min; release package build time reduced by 4 to 5 min.

Gradle and Android Gradle Plugin upgrade

The Gradle team keeps improving Gradle's build speed and other performance metrics, and Google keeps improving the Android Gradle Plugin (AGP). To further improve build performance for the Youku Android client, we upgraded the build system: AGP from 3.0.1 (2017) to 3.4.3 (2019), and Gradle from 4.4 (2017) to 5.5 (2019).

Comparing build times before and after the upgrade shows improvements in the following areas:

  1. Tool upgrades: AAPT2, ProGuard and other build tools are upgraded along with AGP, and their build performance improves slightly.
  2. Better task scheduling and parallelization: AGP 3.4.3 consolidates and optimizes the signing, compression and alignment tasks.
  3. On-demand configuration and asynchronous loading: AGP 3.4.3 loads resources asynchronously, that is, the configuration phase only resolves dependencies and does not unpack, filter or merge artifacts, which avoids I/O congestion and CPU saturation.

Benefit: debug package build time reduced by about 2 min; release package build time reduced by about 4 min.

DX build optimization

After the upgrade to Android Gradle Plugin 3.4.2, AGP provides three dexing build parameters that can significantly speed up the processing of class files into dex. In our tests, setting the following three properties reduced the build time.

android.dexingNumberOfBuckets=16 
android.dexingWriteBuffer.size=256 
android.dexingReadBuffer.size=256

Reading the AGP source code shows that these parameters control the number of in-memory dexing buckets and the sizes of the dex read and write buffers. By default, dexingNumberOfBuckets is half the number of CPUs and the read and write buffers are 1 KB, which leads to heavy I/O and a busy CPU. Fewer disk writes and a larger cache reduce the build time significantly.
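As a rough illustration of why a larger write buffer helps, the sketch below counts how many buffer flushes are needed to write a given amount of dex output. The 10 MiB output size is a made-up figure, and treating the value 256 as kilobytes is an assumption; the actual behavior inside AGP may differ.

```python
KIB = 1024


def flush_count(total_bytes, buffer_bytes):
    """Number of buffer flushes needed to write total_bytes through a buffer."""
    if buffer_bytes <= 0:
        raise ValueError("buffer_bytes must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_bytes // buffer_bytes)


dex_output = 10 * 1024 * KIB  # hypothetical 10 MiB of dex output

small = flush_count(dex_output, 1 * KIB)    # 1 KB buffer, the default cited above
large = flush_count(dex_output, 256 * KIB)  # 256 KB buffer, assumed unit of the setting
```

Under these assumptions the larger buffer needs 256 times fewer flushes for the same output, which is the kind of reduction in disk writes the paragraph above refers to.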

Benefit: debug and release package build times are each reduced by about 3 min.

Cleaning up redundant tasks

As the platform kept iterating, some build tasks and build features were abandoned. But because of a peculiarity of the build system, namely that build artifacts cannot be rolled back, its task list only ever grew. And because cleaning it up is high-risk and low-reward and the logic is old, there was little drive to govern build speed.

To address this, we worked out what each build task does through logic review, configuration cleanup and single-task debugging, removed 30+ useless tasks such as postPackageDebug, and simplified core functions such as Transform management and task management. The following table lists the redundant build tasks that were cleaned up.

Task | Purpose | Retain?
postPackageDebug | APK post-processing | Can be removed
remoteSignAppDebug | Remote signing | Can be removed
DexCountDebug | Dex count | Can be removed
ChannelPackageDebug | Channel package build | Can be removed
generateAppInfoDebug | AppInfo generation | Can be removed
uploadBuildFilesDebug | File upload | Can be removed
buildPatchBaseApkDebug | Hotfix build | Can be removed
...

Benefit: debug package build time reduced by 15 s+; release package build time reduced by 20 s+.

Task pipelining

Gradle organizes a build as a set of scheduled tasks, and this extension point makes it possible to post-process build artifacts. APK post-processing, for example, includes channel processing, ARSC processing, alignment, signing, package splitting, image compression and other tasks. Each of these tasks decompresses, recompresses and copies the APK again, which wastes CPU and system I/O and lengthens the build.

To shorten the APK build and simplify these operations, we reworked and extended the existing tasks into an artifact pipeline that copies little, decompresses once and compresses once. The following figure shows the APK post-processing flow in the build.

Benefit: debug package build time reduced by 21 s; release package build time reduced by 11 s.
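The "decompress once, compress once" idea can be sketched with Python's zipfile module. This is a simplified illustration, not Youku's implementation: the `run_pipeline` function and the two example steps are hypothetical, and a real APK pipeline would also have to handle alignment and signing, which are left out here.

```python
import io
import zipfile


def run_pipeline(apk_bytes, steps):
    """Apply every step to each entry while reading and writing the archive once.

    Each step is a function (name, data) -> data, or None to drop the entry.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as src:
        with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
            for info in src.infolist():
                data = src.read(info.filename)  # the only decompression of this entry
                for step in steps:
                    data = step(info.filename, data)
                    if data is None:
                        break  # a step dropped the entry
                if data is not None:
                    dst.writestr(info.filename, data)  # the only compression
    return out.getvalue()


def drop_maps(name, data):
    """Hypothetical step: remove files that are not needed in the package."""
    return None if name.endswith(".map") else data


def stamp_channel(name, data):
    """Hypothetical step: append channel information to one asset."""
    if name == "assets/channel.txt":
        return data + b"\nchannel=demo"
    return data
```

Chaining the steps per entry, instead of running each step as a separate task over the whole archive, is what removes the repeated unpack, repack and copy work.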

Build template optimization

Android builds come in many variants by purpose: debug and release, remote and non-remote, and so on. Some plugin optimizations, such as Turbo dex reduction and 7-Zip compression, are unnecessary during development and can be disabled for those variants.

Benefit: debug package build time reduced by about 1 to 2 min; release package unchanged.

Reduce code size

Java code obfuscation accounts for about 60% of the total build time, and obfuscation time is positively correlated with code size, so build time is positively correlated with code size as well.

Code size grows partly because the business expands and partly because redundant code accumulates as the project decays. From the second half of 2020 to the first half of 2021, Youku ran regular package size governance, which achieved excellent results for package size and partly relieved the code decay problem.

As shown in the figure below, from the second half of 2020 to now, the Java code size of the Android client has decreased by 25%, which contributes about 45 s+ to build speed.

Benefit: release package build time reduced by about 45 s+; debug package build time reduced by about 5 to 10 s.

Hardware optimization

Private build tenant pools

We used Linux performance analysis tools such as iostat and tsar to analyze the build process of the Youku Android client (as shown in the figure below). CPU iowait is severe throughout the build: the build performs a large number of I/O operations, and because the build machines' I/O performance is insufficient, the build takes a long time. This also confirms why the software optimizations that reduce I/O were effective in speeding up the build.
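For readers without iostat at hand, the iowait share can be approximated from two samples of the aggregate "cpu" line in /proc/stat on Linux. The sketch below is a minimal version of that calculation; the function names are chosen for this example, and the sample lines in the usage note are invented values, not measurements from the Youku build machines.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of counters."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(v) for v in fields[1:]]


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two /proc/stat samples."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    # Counter order: user nice system idle iowait irq softirq steal guest guest_nice.
    # Only the first 8 are summed, because guest time is already counted in user/nice.
    total = sum(b[:8]) - sum(a[:8])
    if total <= 0:
        return 0.0
    return 100.0 * (b[4] - a[4]) / total
```

For example, with the invented samples "cpu 100 0 50 800 50 0 0 0 0 0" and "cpu 200 0 100 1200 500 0 0 0 0 0", the function reports that 45% of CPU time between the two samples was spent waiting on I/O.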

There are two ways to address the I/O bottleneck: use buffered I/O, or improve the I/O performance of the hardware. We evaluated them as follows:

  1. First, we built with only the AGP plugin retained and found no significant improvement in build speed, which shows that the custom plugins leave no room for I/O optimization.
  2. Second, analysis of the Android build process shows little I/O fragmentation, so buffered I/O offers little room for improvement.
  3. Finally, replacing mechanical hard disks with SSDs works better. Build comparison data on physical machines:
Machine type | Non-first build (dependencies cached, the main case) | First build (no dependency cache) | Hardware
Group 1 (Desktop) | 12min57s | 21min | SSD EX900 521G, write peak about 900Mb/s; Intel 2.9GHz, 16 threads; 16GB memory
Group 2 (Dell R740 Blade Server) | 25min | 40min | Mechanical hard disk, write peak about 254Mb/s; snapdragon 2.1GHz, 24 threads; 48GB memory
Group 3 (Dell WorkStation) | 19min58s | 27min | SSD EX900 521G, write peak about 900Mb/s; snapdragon 2.2GHz, 20 threads; 32GB memory
Group 4 (Desktop) | 10min10s | 23min | SSD 512G, write peak about 1G/s; AMD 3.5GHz, 24 threads; 32G memory
Group 5 (DevOps cluster) | 23min | 40min | Mostly mechanical hard disks, depending on which machine is dispatched

Benefit: debug package build time reduced by about 5 min; release package build time reduced by about 10 min.

Conclusion

To sum up, optimizing build speed and preserving the results comes down to the following work:

  1. Set build optimization metrics and establish a sound data evaluation system, so that targets can be set and optimizations evaluated.
  2. Most of the build speed gains can be obtained through software optimization: splitting build templates, clearing redundant tasks, upgrading Gradle and the Android Gradle Plugin, and tuning build parameters such as those for DX build optimization.
  3. Hardware optimization requires analyzing the key bottleneck of the build process first, and that bottleneck may differ from one application to another.

Limited by technical means and stability concerns, the following parts of build speed optimization remain unfinished:

  1. Governance and cleanup of obfuscation rules: the execution time of the obfuscation task is positively correlated with the number of obfuscation rules.
  2. Governance of project decay: the amount of useless code is an important indicator of project decay and an important factor in build speed. How to control decay in a large-scale project is the next important topic for application architecture and for application development as a whole.
  3. R8 build optimization: upgrade tests of R8, the Android dex build toolchain, showed a significant improvement in the build speed of the Youku release package. However, the R8 optimization is delayed by some Google bugs.

The Youku technical team will continue to optimize build time along these lines. You are welcome to discuss with us at any time.

[Related documents]

  • Linux performance tuning guide: www.processon.com/view/618255…
  • R8 related problems: issuetracker.google.com/issues/1923…
  • Atlas: github.com/alibaba/atl…
