Special edition:

If you have any questions about the mPaaS server components, you are also welcome to reach out to the mPaaS server team.

I. Background introduction

Image loading has always been a hard problem for Android apps: loading speed and memory consumption are inherently in tension. Driven by the complex business scenarios of Alipay's super App, and drawing on industry-leading open source frameworks such as Fresco and Picasso, we kept what worked and discarded what did not, combining Ashmem, a Native memory cache, Bitmap reuse, a scene cache, an image-size cache, and other techniques into an integrated image loading stack that balances loading speed and memory consumption.

After three years of evolution, the xMedia multimedia image loading component has become an important engine of Alipay, carrying most of its image business. At the same time, we provide this stable image loading technology to external enterprises through the mobile development platform mPaaS.

II. Android memory fundamentals and challenges

Given the limited heap memory allocated to a single process on Android, and the uneven hardware performance and system versions across Android phones, using memory efficiently and sensibly has become an essential topic for large apps, especially those containing image loading components. As the saying goes, to do a good job one must first sharpen one's tools: if you want your App to use memory well, you first need some basic knowledge of Android system memory.

1. Android memory classification

As with computer hardware, a phone's storage is divided into ROM and RAM.

| ROM (Read Only Memory):

There are actually many kinds of ROM (read-only, read-write, and so on); it is mainly used to store data, and that data is not lost on power-off.

| RAM (Random Access Memory):

The physical memory a phone uses at runtime, responsible for running programs and exchanging data; its contents are lost on power failure. The memory space of a process is only virtual memory, while the actual physical memory behind it is RAM: the operating system maps the virtual memory a program requests onto physical RAM.

Memory in an Android application process can generally be divided into Heap memory, the Code area, Stack memory, Graphics memory, private non-shared memory, and system memory, where Heap memory is further divided into the Dalvik Heap and the Native Heap.

On Android, the current process's memory usage can be viewed with the command adb shell dumpsys meminfo <package name or pid>, as shown in Figure 1.

The memory categories are described as follows:

  • Native Heap: object memory allocated from C or C++ code, e.g. via malloc in native code. This memory is not limited by the Java object heap size; it can be used freely, though it is still bounded by the system, with an upper limit of typically 2/3 of system RAM.
  • Dalvik Heap: object memory allocated from Java or Kotlin code. Android limits the Dalvik Heap size per process; the maximum heap size can be obtained via SystemProperties.
  • Code: memory used by code and resources such as dex bytecode, optimized or compiled dex code, .so libraries, and fonts.
  • Stack: the system stack allocated by the operating system, mainly storing function addresses, parameters, local variables, and recursion state. Stack space is small, generally a few MB.
  • Cursor: memory occupied by cursors at /dev/ashmem/Cursor.
  • .* mmap: memory used by file mappings of .so, .dex, .apk, .jar, and .ttf files.
  • AshMem: anonymous shared memory, built on mmap; unlike plain mmap, AshMem registers a cache shrinker so the system can control reclamation.
  • Other dev: memory occupied by internal drivers.
  • EGL mtrack: Graphics memory, used for the pixel buffers queued for display on screen.
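
The per-process heap ceiling mentioned above can be inspected at runtime. A minimal sketch using a standard JVM API (on Android the same call reflects the Dalvik/ART heap limit; the printed value differs per device):

```java
public class HeapLimit {
    public static void main(String[] args) {
        // Maximum heap the VM will attempt to use; on Android this is
        // the per-process Dalvik/ART heap limit discussed above.
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("max heap = " + (maxBytes / (1024 * 1024)) + " MB");
    }
}
```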

Figure 1 gives a simple, intuitive view of the basic memory classification and usage of an Android process. For application developers, the memory they operate on directly is mainly the Dalvik Heap and the Native Heap, especially the Dalvik Heap, which is the one most often misused in applications.

The reason an application is prone to OOM is usually not that system RAM is insufficient, but that the system imposes a hard limit on the Dalvik Heap size of the VM process. Once the total Dalvik Heap the application has allocated exceeds the process threshold, the runtime throws an OOM exception up to the application layer.

2. Android memory reclamation mechanism

Since applications are prone to OOM, and Android upper-layer applications are mostly written in Java, developers do not explicitly allocate and free memory as in C/C++ development; most memory is managed by the system's garbage collector, so memory can feel out of the developer's control. During development, memory leaks and unreasonable memory usage cause the system to trigger GC frequently or hit OOM, and while the system is in GC, threads are suspended and the application runs slowly. It is therefore necessary for application developers to understand the memory reclamation mechanism.

Android GC memory reclamation has two layers: in-process reclamation and process-level reclamation.

| In-process memory reclamation:

This covers the virtual machine's own garbage collection, plus notifications the system sends to the application when memory state changes so that developers can release memory themselves. The VM's garbage collector monitors object creation and usage inside the VM and, under certain conditions, destroys useless objects and reclaims the memory they occupy; useless objects are usually identified via reference counting, object tracing (mark-and-sweep), or generational algorithms. Even though the VM automatically reclaims objects that are no longer referenced, developers can still exhaust memory and hit OOM. Developers generally need to confirm, at appropriate points, that certain objects are no longer in use and actively release references to them, so that useless objects are not held for a long time and leaked; the VM cannot free the memory of a leaked object during reclamation.

| Process-level memory reclamation:

The lower a process's priority, the more readily its memory is reclaimed. As shown in Figure 2, Android processes are divided into five priority levels by default, from lowest to highest: empty process -> background process -> service process -> visible process -> foreground process.

On Android, a process's oom_adj value represents its priority. You can run adb shell cat /proc/<process pid>/oom_adj to view it; a larger oom_adj value means a lower priority. Android memory reclamation is coordinated between the Framework layer and the Linux kernel layer, with the overall flow shown in Figure 3.

In the Framework layer, the Activity Manager Service (AMS) centrally manages process memory: it adjusts each process's oom_adj value and passes it down to the kernel layer, and it notifies applications of memory pressure based on system memory and process state, making it easy for developers to reclaim memory proactively.

The kernel layer has the Low Memory Killer (LMK) and the OOM Killer. The OOM Killer is the Linux memory reclamation mechanism of last resort: when the system runs out of memory and cannot allocate more, it selectively kills processes, by which point the system is already unstable. LMK is a multi-level killer that Android extends from the OOM Killer principle: it triggers memory reclamation early, before true OOM, according to a ladder of memory thresholds configured in user space. When free memory falls into one of these threshold bands, processes whose oom_adj values fall into the corresponding range are killed.

3. Image loading components in the industry

As described above, for apps that use images heavily, decoded images (bitmaps) occupy large amounts of memory and are more likely to trigger frequent GC. The mature open source image loading components in the industry, including Facebook's Fresco, Google-recommended Glide, and Square's Picasso, all use three-level caching, namely "memory cache + disk cache + network", with lookup priority memory cache -> disk cache -> network. For the memory cache, they cache Bitmap objects directly, and parts of their strategies are similar, as shown in Figure 4.
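
The shared three-level lookup can be sketched as follows. This is an illustrative sketch, not any library's real API: ThreeLevelLoader, the in-memory maps standing in for real storage layers, and fetchFromNetwork are all hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative three-level image lookup: memory cache -> disk cache -> network.
// All class and method names here are hypothetical sketches, not real library APIs.
class ThreeLevelLoader {
    final Map<String, byte[]> memoryCache = new HashMap<>();
    final Map<String, byte[]> diskCache = new HashMap<>();   // stands in for file storage

    byte[] load(String url) {
        byte[] data = memoryCache.get(url);                  // 1. memory cache
        if (data != null) return data;
        data = diskCache.get(url);                           // 2. disk cache
        if (data == null) {
            data = fetchFromNetwork(url);                    // 3. network
            diskCache.put(url, data);                        // persist for next cold start
        }
        memoryCache.put(url, data);                          // promote for next lookup
        return data;
    }

    byte[] fetchFromNetwork(String url) {
        return new byte[] {42};                              // placeholder download
    }
}
```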

| Fresco:

The memory cache uses CountingMemoryCache, which contains an in-use cache (mCachedEntries) and a to-be-evicted cache (mExclusiveEntries), both based on CountingLruMap. The memory cache holds both Bitmaps and undecoded image data (EncodedImage). The Bitmap cache is checked first; on a miss, the undecoded image in the memory cache is fetched and decoded. For the Bitmap memory cache:

  • On systems below 5.0, KitKatPurgeableDecoder exploits system behavior to put decoded Bitmap pixels into AshMem (in our tests, the Native Heap also holds one copy of the data). As Figure 1 shows, AshMem does not use Java Heap memory, so the Bitmap cache avoids consuming large amounts of Java Heap, reducing the GC and OOM pressure that images would otherwise cause.

  • On 5.0 and above, ArtDecoder calls BitmapFactory directly to decode images, and the resulting Bitmaps occupy Java Heap memory. During decoding, the inBitmap and inTempStorage fields of BitmapFactory.Options are backed by a BitmapPool and a SynchronizedPool respectively, to reuse and optimize memory as much as possible.

| Glide:

The memory cache combines an LruCache with WeakReferences and stores Bitmap objects directly; Bitmaps are reused from a BitmapPool, which reduces the frequent creation and reclamation of Bitmaps and thus memory churn.

| Picasso:

An LruCache built on LinkedHashMap stores Bitmap objects. Since Bitmaps live entirely in Java Heap memory, its maximum cache capacity is only 15% of the process's maximum heap.
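
The LinkedHashMap-based LRU pattern underlying this cache (and Android's android.util.LruCache) can be sketched in a few lines. As a simplifying assumption, capacity here is counted in entries rather than the byte-size accounting the real classes use:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Entry-counting LRU sketch of the LinkedHashMap-based cache pattern.
// Real implementations (Picasso's cache, android.util.LruCache) account
// for Bitmap byte sizes instead of entry counts.
class SimpleLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    SimpleLruCache(int maxEntries) {
        super(16, 0.75f, true);          // accessOrder=true: iteration order = LRU order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;      // evict least-recently-used when over capacity
    }
}
```

For Bitmaps, the budget would instead be a byte count (e.g. 15% of Runtime.getRuntime().maxMemory()), with each entry sized via Bitmap.getByteCount().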

By comparison, apart from Fresco, the other two components essentially use LruCache+Bitmap directly, with Bitmaps occupying Java Heap memory. On some system versions Fresco uses a trick to move Bitmap memory into AshMem, thereby reducing Java Heap usage.

The memory cache for the xMedia image component uses a multi-dimensional cache design, described in more detail later.

4. Technical challenges

For Alipay's complex business scenarios, xMedia initially used the common heap memory cache based on an LRU eviction policy, which could not balance experience against performance. Over the course of development it ran into the following pitfalls:

| Java Heap usage of the main process's image memory cache is too high

  1. A large image memory cache makes the App occupy too much Java Heap memory, which frequently triggers GC and causes pages to stutter.
  2. Background processes with high memory usage are easily killed, so the App must keep its memory low without hurting the experience.
  3. The image cache within the App must not take too much memory away from other services or functions, or their functionality and experience suffer.

| Large images accelerate eviction of the small-image cache

  1. With LruCache+Bitmap, decoding large images takes too much memory. For example, a 1280×1280 image decoded in ARGB_8888 occupies nearly 6M, while the total heap allocated to a single process on a low-end device is only about 100M. The image memory cache can therefore only be a few dozen MB, holding at most about ten large images, so LRU eviction churns the cache easily and hurts the loading experience for small images.

  2. When the image memory cache of ordinary business reaches its cap, it should be reclaimed effectively; however, some specific businesses do not want their entries reclaimed frequently, for example images that occupy little memory but are used very often.

  3. A GIF contains multiple frames. If each frame is decoded into its own Bitmap, one animation caches many Bitmaps, making eviction of ordinary images even more likely.
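
The arithmetic behind the ~6M figure above is simply width × height × bytes-per-pixel, with ARGB_8888 using 4 bytes per pixel:

```java
public class BitmapBytes {
    // Bytes a decoded bitmap occupies: width * height * bytes-per-pixel.
    static long argb8888Bytes(int width, int height) {
        return (long) width * height * 4;   // ARGB_8888: 4 bytes per pixel
    }

    public static void main(String[] args) {
        long bytes = argb8888Bytes(1280, 1280);
        // 6553600 bytes ≈ 6.25 MB, matching the ~6M figure in the text
        System.out.println(bytes + " bytes = " + bytes / (1024.0 * 1024.0) + " MB");
    }
}
```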

III. Refined memory cache

To solve these problems, the idea is to minimize the share of images cached in the Java Heap, via options such as a standalone image-cache process, raising the process Java Heap limit, or moving image memory into non-Java-Heap storage. xMedia ultimately chose the scheme shown in Figure 4, with three kinds of memory cache: the ordinary NativeHeap cache, the common Heap cache, and the temporary SoftReference cache.

1. Ordinary cache: NativeHeap

As the name implies, Native memory serves as the image memory cache. The main motivation is that Native memory is not governed by VM garbage collection, so it effectively reduces Java Heap usage and thus the probability of GC.

  • On system versions below 5.0, an LruCache directly manages Bitmaps decoded into AshMem memory.

AshMem memory differs from ordinary heap memory: it behaves like the Native memory area and is managed by the lower layers of the Android system. Setting the BitmapFactory options inPurgeable and inInputShareable ensures that the decoded Bitmap uses AshMem. This memory is not counted against the normal heap, so it is less likely to trigger GC or OOM.

  • On 5.0 and above, use NativeCache.

The NativeCache scheme occupies Native Heap memory and is recommended for frequently used images. The implementation works as follows: the upper layer uses an LruCache to manage cache metadata, where the key uniquely indexes the image and the value is a BitmapInfo holding a pointer to the Native memory copy of the Bitmap; when an entry is evicted, the corresponding Native memory is freed. Both schemes use 3/8 of the process's maximum heap as the cache budget, capped at 96M.

During the initial memory cache optimization we tried and compared multiple schemes. Given that Android 4.0 and above supports Bitmap reuse, we finally chose the Native scheme of managing C memory through a JNI interface.
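
The NativeCache structure described above can be sketched as an LRU map from keys to native pointers with a free hook on eviction. BitmapInfo, the pointer fields, and freeNative() are simplified stand-ins for xMedia's real JNI layer, not its actual API; the freedCount counter exists only so the sketch is observable:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the NativeCache idea: the Java side holds only small records with
// a pointer to the native copy of the pixels; eviction frees the native memory.
// BitmapInfo and freeNative() are simplified stand-ins for the real JNI layer.
class NativeBitmapCache {
    static class BitmapInfo {
        final long nativePtr;            // pointer returned by the JNI allocation
        final int byteCount;
        BitmapInfo(long ptr, int bytes) { nativePtr = ptr; byteCount = bytes; }
    }

    int freedCount = 0;                  // observable here; real code frees via JNI

    private final LinkedHashMap<String, BitmapInfo> map;

    NativeBitmapCache(final int maxEntries) {
        map = new LinkedHashMap<String, BitmapInfo>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, BitmapInfo> eldest) {
                if (size() > maxEntries) {
                    freeNative(eldest.getValue().nativePtr);  // release C memory on eviction
                    return true;
                }
                return false;
            }
        };
    }

    void put(String key, BitmapInfo info) { map.put(key, info); }
    BitmapInfo get(String key) { return map.get(key); }

    void freeNative(long ptr) {
        freedCount++;                    // would call into JNI: free(ptr)
    }
}
```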

The following compares memory read times; the results are shown in Figure 5 and Figure 6:

Figure 6. Native-memory Bitmap load times with and without Bitmap reuse

Test conditions:

Redmi Note 1, system version 4.4.2; the system allocates a default maximum heap of 128M per process.

Test results:

1) As Figure 5 shows, reads from the Native-based image memory cache are basically within 3ms, about 1ms slower than pure Heap-based memory. Reading from Native-backed memory can be considered essentially as fast as ordinary Heap memory.

2) As Figure 6 shows, when the Native-memory Bitmap is not reused (a new Bitmap is created by the system on every load), load times above 100ms appear periodically. The reason is that frequently creating a new Bitmap for every load increases heap overhead, causing memory churn and raising the system's GC frequency, especially on low-end devices, as shown in Figure 7.

2. Common cache: Heap

This is an ordinary heap memory cache with an LRU eviction policy. Its total size is 1/8 of the process's maximum heap, capped at 64M; its contents are the decoded Bitmap objects.

3. Temporary cache: SoftReference

This cache serves two scenarios: GIF-related objects and oversized image objects, both occupying Java Heap memory. It works by holding SoftReferences to Bitmap or GIF objects, so the GC can free the memory promptly when memory is tight. The main goal is to keep a single large image (by default, one occupying over 5M of memory) from evicting many small images, improving the user's image experience.
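
The temporary cache can be sketched as a map of SoftReferences: entries survive while memory is plentiful and become collectible under pressure. SoftCache and its method names are illustrative, not xMedia's real API:

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Sketch of the SoftReference temporary cache: the GC may clear entries
// under memory pressure, so get() must tolerate cleared references.
class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> map = new HashMap<>();

    void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) return null;
        V value = ref.get();                 // null if the GC has cleared it
        if (value == null) map.remove(key);  // drop the stale entry
        return value;
    }
}
```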

The entire image memory loading and storage process combined with the above three memory caches is shown in Figure 10 and Figure 11:

IV. Comparison with competing components

Test conditions:

On Android 4.4 and 6.0, load 20 local images on the same screen with each image component. The memory usage of each component is shown below; the results are in Figures 8 and 9.

Test results:

1. On Android 4.4

| Java Heap memory usage:

From high to low: Picasso -> Glide -> (Fresco and xMedia). The Fresco and xMedia image caches do not occupy Java Heap memory. On exiting the test screen and running GC, Picasso did not free its Java Heap memory, while Glide actively released it internally.

| Native Heap memory usage:

From high to low: Fresco -> xMedia -> (Picasso and Glide). Fresco uses its trick to put the image memory cache in AshMem, but AshMem and the Native Heap are actually two different memory regions: Fresco holds one copy in AshMem and one in the Native Heap, whereas xMedia occupies only AshMem, not the Native Heap. Picasso and Glide occupy neither Native Heap nor AshMem. Dumping the process memory shows why: in Figure 10, both Native Heap and AshMem usage change substantially before and after Fresco loads the images, while in Figure 11 only AshMem changes substantially before and after xMedia loads them.

Figure 11: Memory usage before and after xMedia loads images

2. On Android 6.0

| Java Heap memory usage:

From high to low: Fresco -> Picasso -> xMedia -> Glide; all four components use the Java Heap. xMedia does not cache Bitmaps directly, but the on-screen UI controls reference these Bitmaps, so xMedia still uses Java Heap memory. After exiting the test screen and running GC, that Java Heap memory is fully released; on the next visit, the image data is copied from Native memory into a newly created or reused Bitmap for display. Glide actively releases its entire image memory cache internally after the test screen exits, but must re-decode every image when the page is entered again, so its cache reuse rate is low.

| Native Heap memory usage:

From high to low: xMedia -> (Fresco, Picasso, and Glide). Only xMedia uses the Native Heap for its image cache; the other three use the Java Heap.

Overall, below 5.0, xMedia has the advantage in both Java Heap and Native Heap usage. On 5.0 and above, xMedia breaks through the heap limit by caching the image memory in the Native Heap. Judged purely by Java Heap or Native Heap occupancy, Glide's are the smallest, but Glide actively reclaims any Bitmap that is no longer in use and must decode again on the next load, so its cache reuse rate is low. xMedia, by contrast, holds double memory only for images currently being displayed and only Native Heap for images no longer displayed; its advantage is that the Native memory cache survives after the screen exits, so no re-decoding is needed next time, which is more efficient. Fresco and Picasso performed overall weaker than xMedia and Glide.

V. Other optimization points

  1. For ordinary large images, we reduce image dimensions and memory by capping the longest edge at 1280. For social images, we provide three sizes: thumbnail (120×120), large (1280×1280), and original; even for original images we cap the longest edge at 12000, and apply downsampling again at decode time.
  2. For the blurred thumbnails of social conversations, the server crops and scales the blurred image and pushes it to the client via push message, so it can be rendered and displayed directly, avoiding the gray placeholder that would otherwise appear while the network request completes when viewing an image message.
  3. When the App is pressed to the background, the image memory cache is actively cleaned up in stages, keeping the wallet's overall memory low in the background and reducing the probability of the background process being killed.
  4. Infrequently used memory cache entries are cleared periodically: each use refreshes an entry's last-used time, and a periodic scan actively evicts entries idle beyond a threshold.
  5. Loading pauses while a ListView, ViewPager, or RecyclerView is scrolling and resumes when scrolling ends, eliminating unnecessary task overhead.
  6. GIF images use a self-developed decoder that reuses a single Bitmap object to decode and display each frame, reducing the memory footprint.
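
The downsampling mentioned in point 1 typically works by choosing a power-of-two inSampleSize before decoding. A sketch of the standard calculation, as described in Android's BitmapFactory.Options documentation:

```java
public class SampleSize {
    // Power-of-two subsampling factor so the decoded bitmap stays at least
    // as large as (reqWidth x reqHeight) but no bigger than necessary.
    static int computeInSampleSize(int width, int height, int reqWidth, int reqHeight) {
        int inSampleSize = 1;
        if (height > reqHeight || width > reqWidth) {
            final int halfWidth = width / 2;
            final int halfHeight = height / 2;
            while ((halfWidth / inSampleSize) >= reqWidth
                    && (halfHeight / inSampleSize) >= reqHeight) {
                inSampleSize *= 2;       // BitmapFactory only honors powers of two
            }
        }
        return inSampleSize;
    }
}
```

The result is assigned to BitmapFactory.Options.inSampleSize before the real decode; for example, a 4000×4000 original decoded toward a 1280×1280 target gets inSampleSize 2, producing a 2000×2000 bitmap.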

VI. Summary and outlook

This article introduced xMedia's multi-dimensional, refined memory management scheme for the image memory cache, focusing on using JNI to manage Native C-layer memory as the image cache, which breaks through the Java Heap size limit. The scheme has one small flaw: while an image is being displayed, besides the decoded copy in Native memory, the Bitmap referenced by the business in the Java heap occupies another copy, so businesses should reuse ImageViews as much as possible and release them promptly after use. As mobile intelligence and big data develop, applying data-driven intelligent management to image memory should bring an even better technical experience.

If you are interested in mPaaS multimedia components, you are welcome to visit the mPaaS documentation page to learn more.

Past reading

Opening | Ant Financial mPaaS Core Component Service System Overview

Ant Financial mPaaS Server Core Component System Overview: Mobile API Gateway MGS

Ant Financial mPaaS Server Core Components: Analysis of the Mobile End-to-end Network Access Architecture under Hundred-Million-Level Concurrency

“Alipay App Building Optimization Analysis: Optimize Android Startup Performance through installation package Rearrangement”

“Alipay App Building optimization Analysis: Android Package size extreme compression”

Follow our official account for first-hand mPaaS technical practice.

DingTalk group: search "23124039" to join the group.

We look forward to having you!