1. Overview

With the rapid iteration of the business, the package size of Douyin on Android has grown sharply. Package size directly affects download conversion rate, promotion cost, runtime memory, and installation time, so slimming the APK is both necessary and highly rewarding. An APK consists of dex files, resources, assets, native libraries, and meta-data, and each part can be optimized separately. After a period of work on Douyin for Android, package size optimization has delivered its first results, and the work is ongoing.

–	Before optimization	After optimization	Reduction
Douyin	73 MB	61.5 MB	15.7%
Douyin Lite	10 MB	4.9 MB	51%

Resources account for a large share of the APK, so resource optimization is an important part of package size optimization. In the spirit of pursuing the best possible result, this article details the resource optimization measures taken on Douyin for Android.

2. Image compression

2.1 Image compression principle

Without compression, image size is calculated as: image size = width x height x bit depth. For an original 1920×1080 image in which each pixel is represented by 32 bits (RGBA, 4 bytes), the required storage is 1920 x 1080 x 4 = 8294400 bytes, about 8 MB. An image this large is not acceptable, so the images we use are all compressed. Image compression exploits two kinds of redundancy, spatial and visual:

Spatial redundancy exploits the coherence between the colors of neighboring sampling points in an image: data that would otherwise be stored pixel by pixel is merged and compressed, then decompressed and restored at load time. Lossless compression usually relies on spatial redundancy.
Visual redundancy refers to the fact that the human visual system, limited by its physiology, does not attend evenly to the whole image and does not perceive subtle color differences. For example, human vision can generally distinguish about 2^6 gray levels, while images are usually quantized with 2^8 gray levels, so visual redundancy exists. Lossy compression usually relies on visual redundancy, discarding information that the human eye cannot perceive.
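The uncompressed size formula from the start of this section can be checked with a few lines of Java. This is only an illustrative sketch; the class and method names are ours and are not part of McImage.

```java
/** Uncompressed bitmap size, following the formula in this section. */
final class RawImageSize {
    /** Returns width * height * bytesPerPixel, using long arithmetic to avoid int overflow. */
    static long bytes(int width, int height, int bytesPerPixel) {
        return (long) width * height * bytesPerPixel;
    }

    public static void main(String[] args) {
        // 32-bit RGBA means 4 bytes per pixel.
        long size = bytes(1920, 1080, 4);
        System.out.println(size + " bytes = " + size / (1024.0 * 1024.0) + " MiB");
    }
}
```

For the 1920×1080 RGBA example this gives 8294400 bytes, the figure quoted above.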

2.2 Advantages

The Douyin Android team developed McImage, a Gradle plugin that hooks resources at compile time and compresses them with the open source tools pngquant and Guetzli; it also supports WebP compression. Compared with known solutions such as TinyPNG, it has the following advantages:

McImage supports WebP compression, whose compression ratio is higher than TinyPNG's. WebP on Android needs a compatibility layer, which is described in detail below;
TinyPNG is not open source, and each account can compress only 500 images per month for free; the compression algorithms used by McImage are all open source;
McImage can compress not only the images in a module, but also the images inside JARs and AARs;
McImage's compression algorithms are extensible, so a better algorithm can easily be plugged in when one becomes available;
Compared with other solutions in the industry, McImage also supports WebP conversion of images with transparency, and its resource hook is compatible with AAPT2.

2.3 Benefits

McImage supports two optimization methods, which cannot be used at the same time:

Compress: pngquant compresses PNG images and Guetzli compresses JPG images;
ConvertWebp: PNG/JPG images are converted to WebP.

WebP achieves a higher compression ratio than pngquant and Guetzli, so ConvertWebp is the recommended method.

McImage is also used for image compression in several ByteDance products, with the following benefits:

Description	Size reduction
Douyin – Compress	9.5 MB
Douyin – ConvertWebp	11.6 MB
Huoshan – ConvertWebp	3.6 MB
Vigo – ConvertWebp	4 MB
Vigo AAB – Compress	1.2 MB
Vigo AAB – ConvertWebp	3.2 MB
Flash – ConvertWebp	3.5 MB

2.4 Other features

In addition to compressing and optimizing images, McImage provides the following features:

Large image detection. A log file, mcimage_log.txt, is generated in the app/build/mcimage_result directory. Besides the conversion results, the end of the log lists images with large pixel dimensions and images with large file sizes; the thresholds can be set in McImageConfig. This makes it convenient to optimize package size. Detection at compile time is also supported: when a large image is detected the build is blocked, so that newly committed large images are found in time;
Easy extension of compression algorithms. To add another compression algorithm, extend AbstractTask and implement the work method of the ITask interface;
Multi-threaded compression;
An image cache that further shortens packaging time. With multithreading plus the image cache, the whole McImage step takes less than 10 seconds when every image hits the cache. The cache path is configurable;
Configurable compression quality to meet different quality requirements. Cache files are saved and matched per compression quality;
Scanning for images without an alpha channel; the results are written to the app/build/mcimage_result directory.

3. Non-intrusive WebP compatibility

3.1 Choosing between TinyPNG and WebP

Which has the higher compression ratio, TinyPNG or WebP? We found no direct comparison of the two online, so we ran the following experiments to get a concrete answer.

We scanned the 1,960 images in the project and compared the results of compressing them with each algorithm:

Description	Size
Original images	13463.07 KB
WebP compression	4177.18 KB
TinyPNG compression	6732.18 KB

We then took 490 images from the project, created a new demo app, compressed the images with each algorithm, and compared the size of the resulting APK:

Description	Size
Original APK	9617.53 KB
APK with WebP compression	3924.06 KB
APK with TinyPNG compression	5386.80 KB

Both experiments show that WebP achieves a better compression ratio than TinyPNG. We also manually compressed all images in the Douyin project with the WebP tool, which reduced the package size by about 1.6 MB. We therefore chose WebP.
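To make the comparison explicit, the savings can be computed from the sizes in the two tables. The sketch below only does the arithmetic; the class and method names are ours.

```java
/** Percentage saved by compression, computed from the sizes in the two tables. */
final class CompressionSavings {
    /** Share of the original size removed by compression, in percent. */
    static double savedPercent(double originalKb, double compressedKb) {
        return (1.0 - compressedKb / originalKb) * 100.0;
    }

    public static void main(String[] args) {
        // Experiment 1: 1,960 project images.
        System.out.println("WebP images:    " + savedPercent(13463.07, 4177.18));
        System.out.println("TinyPNG images: " + savedPercent(13463.07, 6732.18));
        // Experiment 2: demo APK with 490 images.
        System.out.println("WebP APK:       " + savedPercent(9617.53, 3924.06));
        System.out.println("TinyPNG APK:    " + savedPercent(9617.53, 5386.80));
    }
}
```

On the image set, WebP removes about 69% of the original size against about 50% for TinyPNG; on the demo APK the figures are about 59% and 44%.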

3.2 Scheme Selection

Because WebP achieves a higher compression ratio than pngquant, Guetzli, and TinyPNG, converting images to WebP should be the better choice. However, WebP support on Android has compatibility issues: it is not fully supported until Android 4.3. According to the official documentation, using WebP with transparency directly in an application requires a minSdkVersion of at least 18. The candidate compatibility schemes compare as follows:

Scheme	Advantages	Disadvantages
Provide a dedicated compatibility API	Simple to implement	Too intrusive: images must be loaded through a specific interface or a specific View
Compatibility through LayoutInflater setFactory	Simple to implement	Every ImageView and its subclasses must be handled, and a unified Activity/Fragment base class is required
Hook key system methods at runtime	Non-intrusive through method replacement	More complex to implement

3.3 Scheme Implementation

Runtime hooks are the best choice for non-intrusive compatibility. However, the runtime hook solution needs to solve the following problems:

The selected hook scheme should be stable and reliable;
Hook points should be convergent enough to ensure that all parsing of the image is performed as expected.

3.3.1 Hook scheme should be stable and reliable

After researching and comparing common Android Java hook schemes such as Xposed, AndFix, Cydia Substrate, and Dexposed, we chose Dexposed, which does not require root and can hook system methods:

Dexposed is relatively stable on Dalvik, and the hook is needed only on devices below Android 4.3, so version compatibility and system upgrades are not a concern;
According to internal data, Douyin has few users below Android 4.3: on the order of 100,000, only a few ten-thousandths of all users, so the risk is low.

3.3.2 Hook points should be sufficiently convergent

Reading the source code shows that whenever an image is loaded and parsed into a Bitmap, methods in BitmapFactory are eventually called. For example, the call path of ImageView's setImageResource() is as follows:

In the setImageResource flow of ImageView, the Bitmap is created through BitmapFactory. The same holds for setBackgroundResource(int resid).

Looking at any API that loads an image, the flow is the same: Resources calls getDrawable, the relevant Drawable method is called, and BitmapFactory then decodes the resource, whatever its type (File, ByteArray, Stream, or FileDescriptor), into a Bitmap. We can infer that BitmapFactory is the unified interface through which Android loads Bitmaps from different resource types. The class documentation of BitmapFactory says the same:

Because the system's loading and parsing of Bitmaps converges on BitmapFactory, BitmapFactory is a good hook point.

With a stable hook scheme and a sufficiently convergent hook point, the implementation follows naturally: use Dexposed to replace the key methods of BitmapFactory.

4. Multiple DPI optimization

To adapt to devices with different resolutions or modes, Android lets developers place the same resource under multiple configuration-specific resource paths; when the app fetches an image through Resources, the system automatically loads the variant that matches the device configuration. The cost is that high-resolution devices carry useless low-resolution images, and low-resolution devices carry useless high-resolution images.

For the domestic app market, to reduce package size an app usually keeps a single DPI that has the highest market share and works on all devices (Google recommends xxhdpi). For overseas markets, most apps are packaged as an App Bundle and uploaded to Google Play, which delivers DPI-specific resources dynamically, so phones with different resolutions download image resources for different DPIs; we therefore need to provide multiple DPI sets to cover all devices. In our project, some images had only one DPI set while others had several. For these two scenarios we merge resources and copy resources, respectively, at packaging time to reduce package size.

4.1 DPI copying (App Bundle)

In domestic projects, to reduce the space taken by images, usually only the DPI with the highest market share is supported; for example, only xxhdpi images are kept. This causes two problems. First, more and more 2K-resolution phones are on the market; if xxxhdpi becomes the mainstream in the future, modifying thousands of images in the project will be very costly. Second, many of the company's overseas products are packaged as App Bundles and uploaded to Google Play, which can deliver different DPI resources to different devices; but with only xxhdpi in the project, xxhdpi images are still delivered to every device, and package size cannot be reduced through lower DPIs. In Brazil, 80% of our users have xhdpi or hdpi phones, and xxhdpi images are twice as large as hdpi images, so the potential gain is substantial.

We therefore generate low-resolution images from high-resolution ones by reducing their resolution. The project stores only the highest-DPI images, which are copied and filtered as needed at packaging time. We hook the image compression task: before compression, we collect all PNG images, including those in dependent libraries, reduce their resolution with Graphics2D, and place the results in the folders for the corresponding densities. The image compression task runs afterwards, so that any image whose file size grew after resampling is compressed again.

We only scale the resolution of the image and do not lower its sampling rate, so the displayed result is unchanged. Based on Google's definitions, the typical resolutions for each DPI are:

Density	LDPI	MDPI	HDPI	XHDPI	XXHDPI	XXXHDPI
Resolution (typical)	240×320	320×480	480×800	720×1280	1080×1920	2K
Ratio	3	4	6	8	12	16

Take copying a default logo from xxhdpi to the other densities as an example. The xhdpi and mdpi folders have no corresponding image, so a scaled copy is generated for each. The hdpi folder already has a corresponding image, so it is skipped. The xxxhdpi folder has none either, but to avoid degrading image quality we never copy to a higher-resolution folder, so it is skipped as well.
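The copy rule and the scaling step can be sketched in Java as follows. The density ratios come from the table above and Graphics2D is the API named in the text; the class name, the method names, and the choice of bicubic interpolation are our assumptions for illustration and are not taken from the actual plugin.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of DPI copying: decide whether to copy, then downscale. */
final class DpiCopySketch {
    /** Density ratios from the table (ldpi:mdpi:hdpi:xhdpi:xxhdpi:xxxhdpi = 3:4:6:8:12:16). */
    static final Map<String, Integer> RATIO = new LinkedHashMap<>();
    static {
        RATIO.put("ldpi", 3);
        RATIO.put("mdpi", 4);
        RATIO.put("hdpi", 6);
        RATIO.put("xhdpi", 8);
        RATIO.put("xxhdpi", 12);
        RATIO.put("xxxhdpi", 16);
    }

    /** Edge length after scaling, rounded to the nearest pixel and never below 1. */
    static int scale(int px, String from, String to) {
        return Math.max(1, Math.round((float) px * RATIO.get(to) / RATIO.get(from)));
    }

    /** Copy only into a lower density, and only if the target has no image yet. */
    static boolean shouldCopy(String from, String to, boolean targetExists) {
        return !targetExists && RATIO.get(to) < RATIO.get(from);
    }

    /** Draws the source into a smaller ARGB image; bicubic interpolation is our choice. */
    static BufferedImage downscale(BufferedImage src, String from, String to) {
        int w = scale(src.getWidth(), from, to);
        int h = scale(src.getHeight(), from, to);
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = dst.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            g.drawImage(src, 0, 0, w, h, null);
        } finally {
            g.dispose();
        }
        return dst;
    }
}
```

With these ratios a 144 px xxhdpi edge becomes 96 px in xhdpi, 72 px in hdpi, and 48 px in mdpi, while shouldCopy rejects both the existing hdpi image and the higher-density xxxhdpi folder.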

As shown in the figure, when the TikTok R&D team used this solution, the LDPI package was 2.5 MB smaller than the XXHDPI package. In addition, low-resolution phones load the image resources for their own DPI directly and no longer need to scale down high-resolution images, which improves performance.

Note the following when copying. To process all images, including those in dependent libraries, the copy happens in the resource merging stage, which adds many image resources under some cache directories. We therefore enable the plugin only on CI, so that local builds do not generate many extra images that might be committed to the repository. Also, because multiple copies are made in the cache, the extra DPI variants must be removed in the assemble packaging flow. Finally, CI has a concurrency scenario: if copying and compression run at the same time, both a.png and a.webp exist in the cache directory, which causes a duplicate resources error.

4.2 Multi-DPI deduplication

For ordinary packaging (an APK is produced directly, as for the Douyin package), we can keep only one high-resolution copy of each image. High-resolution devices get a suitable image, while low-resolution devices scale it automatically when fetching it through Resources, which still keeps runtime memory reasonable.

Multi-DPI images can be deduplicated with Android's resConfig, but that works only on a single qualifier; pixel density and screen size, for example, cannot be deduplicated at the same time. Douyin instead deduplicates drawable images in the AndResGuard stage, where the priority and scope of the different configurations can be defined and exactly one copy of each resource is kept according to the configured priorities. The optimization is shown in the following figure (data in grey will be deleted):

5. Duplicate resource consolidation

As the project iterates, identical resources inevitably get added to the resource paths more than once. Handling such files manually is not feasible, but they can be deduplicated automatically at packaging time.

Douyin analyzes all resources in the AndResGuard stage, keeps one copy of each group of resource files with the same MD5, and deletes the other duplicates. Then, when AndResGuard writes the ARSC file, the resource paths of the deleted files are pointed at the single retained file. The optimization works as follows:
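As a minimal sketch of the idea, and not the AndResGuard implementation, the redirect table can be built by hashing every resource and remembering the first path seen for each digest. The class name, the method names, and the in-memory map of path to file content are assumptions made for illustration.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: keep one file per MD5 and redirect the duplicates to it. */
final class Md5DedupSketch {
    /** Lowercase hexadecimal MD5 digest of the given bytes (32 characters). */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            return String.format("%032x", new BigInteger(1, md.digest(data)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /**
     * Maps every resource path to the path that is kept for its content.
     * The first path encountered for a digest is the one retained, so the
     * input map should have a stable iteration order.
     */
    static Map<String, String> redirects(Map<String, byte[]> resources) {
        Map<String, String> keptByDigest = new HashMap<>();
        Map<String, String> redirect = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> e : resources.entrySet()) {
            String kept = keptByDigest.putIfAbsent(md5Hex(e.getValue()), e.getKey());
            redirect.put(e.getKey(), kept == null ? e.getKey() : kept);
        }
        return redirect;
    }
}
```

Every path that maps to a different path is a duplicate that can be deleted from the package, and the mapping is what the ARSC writer needs in order to repoint the deleted entries.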

The following table shows the results for Douyin version 511 with multi-DPI deduplication and duplicate resource merging:

MD5 file deduplication	DPI image deduplication	Files removed by MD5 deduplication	Size removed by MD5 deduplication	Files removed by DPI deduplication	Size removed by DPI deduplication	APK size (bytes)	Reduction from original package
false	false	–	–	–	–	85030636	–
true	false	171	156.6 KB	–	–	84883829	143 KB
true	true	171	156.6 KB	391	312.9 KB	84507008	511 KB
false	true	–	–	422	434.5 KB	84523236	495 KB
true	true (all enabled)	171	156.6 KB	463	465.4 KB	84352272	662 KB

6. shrinkResources strict mode

6.1 Background

As the project progresses, many resources fall out of use but remain in the project. We can use ByteX, the company's open source bytecode plugin platform, to develop a plugin that scans for unused resources before ProGuard, but the results are incomplete because unused code has not yet been removed at that point. shrinkResources is Google's official way to remove such unused resources: it runs after ProGuard, marks all unused resources, and optimizes them away.

6.2 Benefits

With shrinkResources strict mode enabled on Douyin for Android, more than 600 resources were shrunk, saving 0.57 MB.

6.3 Access Methods

shrinkResources is an official tool from Google, so refer to the documentation on Google Developers for how to enable it.

6.4 How shrinkResources works

By default, the resource shrinker runs in safe mode: it recognizes code patterns such as val name = String.format("img_%1d", angle + 1) followed by val res = resources.getIdentifier(name, "drawable", packageName), so that resources looked up by reflection are still returned safely. According to the source code, the resource shrinker identifies the following five cases:

To find matching prefix/suffix strings, however, the resource shrinker uses the crudest but safest approach: it treats every string in the application as a potential prefix/suffix match.

As a result, in safe mode a resource that happens to match some string is kept even if it is unused. In our project, for example, com.ss.android.ugc.aweme.utils.PatternUtils contains the following code:

In safe mode, this code prevents every unused resource whose name starts with tt from being removed (which is why so many unused resources starting with TTLive_ are removed once strict mode is on).

Strict mode, when turned on, forcibly disables this string matching step:

With the string matching gone, resources obtained through getIdentifier() are no longer protected in strict mode. After turning on strict mode, be sure to check whether any of the removed resources are accessed by reflection!
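The difference between the two modes can be illustrated with a deliberately simplified model. This is not the logic of the real resource shrinker, which is considerably more involved; it only shows why a stray string constant such as "tt" keeps unused resources alive in safe mode. All names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, hypothetical model of safe mode versus strict mode. */
final class ShrinkModeSketch {
    /**
     * Returns the resources that survive shrinking in this model.
     * A resource is kept if it is referenced directly. In safe mode it is
     * also kept if its name starts with any string constant found in the app.
     */
    static List<String> kept(List<String> resourceNames, List<String> referenced,
                             List<String> stringConstants, boolean strict) {
        List<String> out = new ArrayList<>();
        for (String name : resourceNames) {
            boolean keep = referenced.contains(name);
            if (!keep && !strict) {
                for (String s : stringConstants) {
                    // Any non-empty constant is treated as a possible name prefix.
                    if (!s.isEmpty() && name.startsWith(s)) {
                        keep = true;
                        break;
                    }
                }
            }
            if (keep) {
                out.add(name);
            }
        }
        return out;
    }
}
```

In this model, an unused resource named tt_unused_icon survives in safe mode as soon as the app contains the constant "tt", and is removed in strict mode; a resource that is only ever reached through getIdentifier() is removed in strict mode as well, which is why the check mentioned above matters.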

6.5 shrinkResources compatibility with Dynamic Feature

App Bundle is a feature Google has pushed hard in recent years. It lets APKs be generated and delivered along different dimensions, and it also provides Dynamic Feature for delivering functionality on demand. If shrinkResources is enabled together with Dynamic Feature, the build fails with the following error:

In other words, Google does not officially support resource shrinking for App Bundles that use Dynamic Feature. Someone had already filed an issue about this on the Google Issue Tracker, and Google's response was blunt: it is planned, but there is no time for it yet:

Normally, however, if the project is well structured, the Dynamic Feature modules of an App Bundle rarely reference resources of the main module, and even when they do, the resource can be kept through keep.xml. In theory, then, it should be fine to run shrinkResources on the main module alone while paying attention to reflective access. Reading the source shows that the check on the shrinkResources configuration under Dynamic Feature is made in the configuration phase.

The idea, therefore, is to leave shrinkResources disabled during the configuration phase and to insert the shrinkResources task when the resource processing task executes:

This makes it possible to run the shrinkResources task with Dynamic Feature enabled. The code is simple, fewer than 50 lines:

7. Resource obfuscation (compatible with AAB mode)

The ARSC file records the mapping between each resource ID and the full path of the resource, and the app obtains a resource through its ID. Obfuscating (shortening) the resource path names in this mapping therefore reduces package size.

Douyin uses WeChat's open source AndResGuard for resource obfuscation and has extended it with MD5-based deduplication and with keeping only one copy of multi-DPI resources. Because many of the company's overseas products must be published to Google Play in AAB format, the team also built an AAB-compatible resource obfuscation tool, AabResGuard, which has been open sourced.

8. ARSC slimming

8.1 Background

resources.arsc takes up considerable space in many projects. Common optimizations are obfuscating with AndResGuard to shorten file names and directory names, compressing with 7z, and, for overseas products, delivering languages dynamically where possible. Having done all of these, and because many of the company's overseas products involve a large number of languages, we decided to optimize the ARSC further. After investigation we optimized three things: removing unneeded names, merging duplicate strings in the string pool, and removing unused strings, for a total saving of 1.6 MB. Before that we had also implemented MD5-based merging of duplicate image files on top of AndResGuard; the principle is the same.

8.2 Principle

This binary file has a fairly complex data structure. AndResGuard modifies only a small part of it and offers no way to change more, so we parse the file ourselves. The format is well documented online, so we will not repeat it here; we recommend the blogs of Lao Luo and Nicholas and the AAPT2 source code. The code of android-arscblamer (provided by Google) and of Apktool is also worth a look.

Here is a diagram to briefly describe the modification process:

As shown in the figure, strings are obtained by index. All strings are stored in two string pools (in the single-package case): the global string pool and the string pool under the package. We only need to modify the offset values that point to the global string pool. The following figure shows the binary locations of name and value.

8.3 Approach

8.3.1 Deleting unneeded names

AndResGuard also added this feature in July of this year, so let us look at how it works. Names are stored in the package string pool. Because that pool contains nothing but names, we can be somewhat heavy-handed: back up the pool, empty it, and add a single replacement string, [name_removed].

First, determine which names are looked up through getIdentifier and put them on a whitelist. Then traverse the name entries: if a name is not whitelisted, replace its offset with 0, which points to [name_removed]; if it is whitelisted, it must not be deleted, so look up its string in the backup pool, add it to the new pool, and point the offset at its new index.

TikTok reduced its package size by 70 KB through this optimization.
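The name removal steps described in this section can be sketched on plain Java collections. The sketch models the package string pool as a list of strings and each entry's name as an index into that list; it does not parse or write a real ARSC file, and the class and field names are ours.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of rebuilding the name pool around a whitelist. */
final class NameRemovalSketch {
    static final String PLACEHOLDER = "[name_removed]";

    /** The rebuilt pool and, for every entry, its new index into that pool. */
    static final class Result {
        final List<String> pool;
        final int[] entryNameIndex;

        Result(List<String> pool, int[] entryNameIndex) {
            this.pool = pool;
            this.entryNameIndex = entryNameIndex;
        }
    }

    static Result strip(List<String> backupPool, int[] entryNameIndex, Set<String> whitelist) {
        List<String> pool = new ArrayList<>();
        pool.add(PLACEHOLDER); // index 0 is shared by every removed name
        Map<String, Integer> added = new HashMap<>();
        int[] remapped = new int[entryNameIndex.length];
        for (int i = 0; i < entryNameIndex.length; i++) {
            String name = backupPool.get(entryNameIndex[i]);
            if (whitelist.contains(name)) {
                // Whitelisted names are restored from the backup, once each.
                Integer idx = added.get(name);
                if (idx == null) {
                    idx = pool.size();
                    pool.add(name);
                    added.put(name, idx);
                }
                remapped[i] = idx;
            } else {
                remapped[i] = 0;
            }
        }
        return new Result(pool, remapped);
    }
}
```

With a backup pool of app_name, ic_launcher, and img_1 and a whitelist containing only img_1, the rebuilt pool holds just [name_removed] and img_1, and every other entry points at index 0.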

8.3.2 Merging duplicate Strings

Values are stored in the global string pool. A pool, by its name, should not contain duplicate values, yet after scanning and sorting it we found many duplicate strings. (This problem does not occur with App Bundle packaging.) In the Douyin project the string pool contains more than 1,000 duplicate strings, so merging them is worthwhile.

We first iterate over all the data, merge the duplicate strings in the string pool, and record how each offset changes; finally, every reference to a changed value is pointed at the new offset. This involves the ResValue and ResTableMap structures of the ARSC, and all string values must be replaced.

TikTok reduced its package size by 30 KB through this optimization.
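The merge can be sketched in the same simplified model: the global string pool is a list, and every string value in the table is an index into it. Styled strings and the binary layout are ignored here, and the class and field names are ours.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of merging duplicate pool strings and remapping references. */
final class StringPoolDedupSketch {
    /** The merged pool and the references rewritten to point into it. */
    static final class Result {
        final List<String> pool;
        final int[] refs;

        Result(List<String> pool, int[] refs) {
            this.pool = pool;
            this.refs = refs;
        }
    }

    static Result dedup(List<String> pool, int[] refs) {
        Map<String, Integer> indexOf = new HashMap<>();
        List<String> merged = new ArrayList<>();
        int[] oldToNew = new int[pool.size()];
        // Pass 1: keep the first occurrence of each string and record the index change.
        for (int i = 0; i < pool.size(); i++) {
            String s = pool.get(i);
            Integer idx = indexOf.get(s);
            if (idx == null) {
                idx = merged.size();
                merged.add(s);
                indexOf.put(s, idx);
            }
            oldToNew[i] = idx;
        }
        // Pass 2: point every reference at the new index.
        int[] remapped = new int[refs.length];
        for (int i = 0; i < refs.length; i++) {
            remapped[i] = oldToNew[refs[i]];
        }
        return new Result(merged, remapped);
    }
}
```

The two passes mirror the description above: the first records how each index changes, and the second rewrites the references.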

8.3.3 Deleting unused copy

Strings stored in strings.xml are not optimized at all during packaging. As the project grows, abandoned copy, or copy that will only be used in a later version, ends up in the APK. Scanning again after ProGuard, we found more than 3,000 unused strings. In some of the company's overseas projects, copy is translated into more than 100 languages, which takes up a lot of space.

The deletion method is similar to the above: the offset is pointed at a replacement string. As shown in the figure, two different names may point to the same string, so before deleting a string you must check whether it has other references.

The benefit varies by project. In the company's overseas projects, replacing these unused strings reduced package size by about 1.5 MB.

8.4 Implementation

For normal assemble packaging, take the ARSC file directly from the ap_ file in the processResources step and modify it with our tool.

With App Bundle, modifying the ap_ file is useless, because the final artifact is the resources.pb file that aapt generates in proto format; the only option is to hook aapt. The structure of this file differs from that of the ARSC file, but fortunately the official Resources class can parse and generate the pb file, and it can be modified in a similar way.

The result of the modification is shown in the figure:

8.5 Further Optimization

The offset arrays in the ARSC still leave room for optimization, which we will try in the future. Opening the ARSC file in a binary editor shows that the file is full of FF values.

Where does this wasted space come from? As the figure below shows, each blank cell stands for the offset of a string that has no value in that configuration. Writing the default offset FF FF FF FF there wastes 4 bytes. In the figure, drawable has 4K+ images and 24 columns (configurations); most configurations contain only a few images, so about 4K x 23 x 4 ≈ 380 KB is wasted. By a rough estimate, TikTok could save about 1 MB (before compression).
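The estimate can be reproduced with a short calculation. It assumes, as a simplification of the figures above, that every drawable has a value in exactly one of the 24 configurations, so 23 offset slots per entry are empty; the class and method names are ours.

```java
/** Rough estimate of the space taken by empty offset slots. */
final class SparseOffsetWaste {
    /** Each missing entry is stored as a 4-byte offset of 0xFFFFFFFF. */
    static final int NO_ENTRY_BYTES = 4;

    /** Bytes spent on empty slots: entries x empty configurations x 4. */
    static long wastedBytes(int entries, int emptyConfigs) {
        return (long) entries * emptyConfigs * NO_ENTRY_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(wastedBytes(4000, 23) + " bytes"); // reading 4K as 4,000 entries
        System.out.println(wastedBytes(4096, 23) + " bytes"); // reading 4K as 4,096 entries
    }
}
```

Depending on whether 4K is read as 4,000 or 4,096 entries, this gives 368,000 or 376,832 bytes, in line with the rough 380 KB quoted above.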

The following figure shows how Facebook processes the ARSC file. IDs that have only one value in their row can be extracted and placed into a separate resource type; since each such ID has only one value, the waste described above is avoided. However, this changes the IDs, so the IDs in the corresponding code must also be modified, which means reversing and rewriting XML and DEX and raises the cost of the change. Another approach is to modify the aapt source code, but that is less flexible than modifying the ARSC directly.

9. Summary

The above covers what the Douyin Android team has tried and learned in optimizing package size on the resource side, always striving for the best possible result.

We have also taken many measures in other areas. For so (native library) optimization: merging so files, unifying the STL version, trimming the exported symbol table, and compressing so files. For code optimization: refining obfuscation rules, developing ByteX plugins for unused code scanning, access method inlining, and getter/setter inlining, removing line numbers, and so on.

Beyond optimization measures, a good package size monitoring system is the most important means of preventing regressions; otherwise the gains from optimization cannot keep up with the growth caused by rapid business iteration. Combining the CI and Cony platforms, Douyin for Android has developed a pre-merge check system that rejects any branch whose size increment exceeds the threshold. We have also developed a tool that monitors package size per line of business, which makes it easy to track the growth of each business line and to set a package size target for each.

Finally, the Douyin Android team is looking for people with a passion for technology. If you are interested, see the Douyin Android positions on the ByteDance recruitment website, or send your resume to [email protected].

More articles

Douyin BoostMultiDex optimization in practice: reducing first-launch time on low Android versions by 80% (2)

Douyin BoostMultiDex optimization in practice: reducing first-launch time on low Android versions by 80% (1)

Open source | AabResGuard: an AAB resource obfuscation tool

Welcome to the ByteDance technical team


Douyin Packet size optimization – Resource optimization