ART vs. Dalvik

The Dalvik virtual machine was released in 2008 along with the Android system. At that time, mobile devices had only about 64 MB of system memory, and CPU frequencies ranged from 250 to 500 MHz. Hardware has changed dramatically since then. With the rise of smart devices, the performance of mobile chips has improved significantly in recent years. Today's smartphones have 6 GB or even 8 GB of RAM, and CPUs have moved into the 64-bit era, with frequencies of 2.0 GHz and beyond. Hardware upgrades are often accompanied by software upgrades, so it is not surprising that the Dalvik virtual machine was phased out.

ART replaced Dalvik for the following reasons:

  • Dalvik was designed for 32-bit CPUs and is not suitable for 64-bit CPUs.
  • Bytecode interpretation plus JIT compilation is slower than executing native machine code directly.
  • Interpretation and JIT compilation both happen at run time, and the work may have to be redone on every run, which wastes resources.
  • The old garbage collection mechanism was not good enough and caused stutter (jank).

The ART virtual machine improves on each of these areas. Aside from 64-bit support, there are two major improvements:

  • Ahead-of-time (AOT) compilation: AOT is the counterpart of just-in-time (JIT) compilation. JIT compiles bytecode to native machine code at run time, which is one reason Java is generally considered less efficient than C++: both interpretation and run-time compilation cost more time than executing machine code that was compiled in advance, as in C++. AOT resembles the C++ compilation process: when an APK is installed, a tool called dex2oat compiles the dex files in the APK into an oat file containing native machine code. The compiled machine code can then be used directly during execution, which speeds up the program.

  • Garbage collection improvements: Garbage collection (GC) is a very important feature of a virtual machine because its implementation affects every application running on it. A poor GC implementation can lead to problems such as jank, dropped frames, and slow UI response. Compared with the Dalvik VM, garbage collection in ART has the following improvements:

    • One GC pause instead of two
    • Parallelized processing during the one remaining GC pause
    • A collector with lower total GC time for the special case of recently created, short-lived objects
    • More efficient garbage collection overall, with concurrent collections performed in a more timely manner
    • Compacting garbage collection for background processes, which reduces memory fragmentation
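
To make the last point concrete, here is a minimal sketch of what compaction buys. It is a toy model, not ART's collector: the heap is just a Python list, `None` marks a free slot, and the names `compact` and `largest_free_run` are made up for this illustration. A real compacting collector also has to update every reference to the objects it moves.

```python
# Toy model of heap compaction. The heap is a list of slots;
# None means the slot is free. All names here are illustrative.

def largest_free_run(heap):
    """Return the length of the longest run of consecutive free slots."""
    best = run = 0
    for slot in heap:
        run = run + 1 if slot is None else 0
        best = max(best, run)
    return best


def compact(heap):
    """Slide live objects to the front (order preserved) so that
    all free space ends up in one contiguous block at the end."""
    live = [slot for slot in heap if slot is not None]
    return live + [None] * (len(heap) - len(live))


# Hypothetical heap after several collections: live objects separated by holes.
fragmented = ["a", None, "b", None, None, "c", None, "d", None]
compacted = compact(fragmented)

print(largest_free_run(fragmented))  # 2
print(largest_free_run(compacted))   # 5
```

The total amount of free memory is the same before and after (five slots), but only the compacted heap can satisfy an allocation that needs more than two adjacent slots.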

AOT compilation is done when the application is installed. The following figure shows the difference between the Dalvik virtual machine and the ART virtual machine (on Android 5.0) when an APK is installed:

Figure: APK installation process on the two virtual machines

From the figure we can see:

  • On the Dalvik virtual machine, the dex files in the APK are optimized into odex files during installation and are compiled into native code by the JIT compiler at run time.
  • On the ART virtual machine, the dex files are translated directly into an oat file by the dex2oat tool during installation. The oat file contains both the original content of the dex files and the compiled native code.

The oat files generated by dex2oat are located in the /data/dalvik-cache/ directory on the device. Because 32-bit and 64-bit machine code differ, the oat files are sorted into subdirectories of this directory by instruction set. For example, there are usually two directories on a phone:

  • /data/dalvik-cache/arm/
  • /data/dalvik-cache/arm64/
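
As a rough illustration of this layout, the sketch below maps an ABI name to its dalvik-cache subdirectory. The helper names `oat_cache_dir` and `oat_cache_path` are hypothetical, and the file-name scheme (the APK path with "/" replaced by "@" plus a "@classes.dex" suffix) is an assumption based on what is commonly seen on Android 5.0 devices, so treat it as an example rather than a specification.

```python
import posixpath

DALVIK_CACHE = "/data/dalvik-cache"

# ABI name -> instruction-set subdirectory under dalvik-cache.
ABI_TO_ISA = {
    "armeabi-v7a": "arm",
    "arm64-v8a": "arm64",
    "x86": "x86",
    "x86_64": "x86_64",
}


def oat_cache_dir(abi):
    """Directory that holds the compiled code for the given ABI."""
    return posixpath.join(DALVIK_CACHE, ABI_TO_ISA[abi])


def oat_cache_path(apk_path, abi):
    """Assumed naming scheme: flatten the APK path by replacing '/'
    with '@' and append '@classes.dex'."""
    flat_name = apk_path.lstrip("/").replace("/", "@") + "@classes.dex"
    return posixpath.join(oat_cache_dir(abi), flat_name)


# Hypothetical APK location, used only for illustration.
apk = "/data/app/com.example.app-1/base.apk"
print(oat_cache_dir("armeabi-v7a"))
print(oat_cache_path(apk, "arm64-v8a"))
```

With the example path above, the second call produces /data/dalvik-cache/arm64/data@app@com.example.app-1@base.apk@classes.dex.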

Profile Guided Compilation

Deciding when to run the JIT requires a tracking mechanism that determines which parts of the code should be JIT-compiled, that is, hot-spot detection. In Android, the technique built on this tracking is known as Profile Guided Compilation.
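
One simple way to picture hot-spot detection is a per-method invocation counter with a threshold. The sketch below assumes exactly that: the class name `HotMethodTracker`, the threshold of 3, and the method names in the trace are all invented for illustration and do not reflect ART's actual counters or thresholds.

```python
from collections import Counter

HOT_THRESHOLD = 3  # hypothetical value, chosen only for this example


class HotMethodTracker:
    """Counts method invocations and flags a method as hot once
    its call count reaches the threshold."""

    def __init__(self, threshold=HOT_THRESHOLD):
        self.threshold = threshold
        self.calls = Counter()
        self.hot = set()

    def record_call(self, method):
        """Count one invocation. Return True only at the moment the
        method first becomes hot (when a JIT would compile it)."""
        self.calls[method] += 1
        if method not in self.hot and self.calls[method] >= self.threshold:
            self.hot.add(method)
            return True
        return False


tracker = HotMethodTracker()
trace = ["onDraw", "onCreate", "onDraw", "onDraw", "onDraw"]
newly_hot = [m for m in trace if tracker.record_call(m)]
print(newly_hot)  # ['onDraw']
```

Here onDraw crosses the threshold on its third call and is reported once, while onCreate stays cold and would keep running in the interpreter.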

Profile Guided Compilation works as follows:

  • Step 1: When the application is first started, it is executed only through the interpreter, and the JIT steps in to optimize hot methods.

  • Step 2: While step 1 runs, profile information is produced at the same time and saved to a file.

  • Step 3: The steps above are repeated to refine the profile information. The profile records the hot methods that need offline optimization, the classes that affect startup speed (so that startup can be optimized further), and so on.

  • Step 4: When the device is idle and charging, the profile guided compilation service runs. It takes the profile file as input and produces binary machine code as output. This binary machine code replaces the corresponding parts of the original application.

  • Step 5: After the previous steps, on subsequent launches the application can choose the most appropriate execution mode among AOT, JIT, and the interpreter, depending on actual conditions.
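
The five steps above can be sketched as a small simulation. This is only a model under simplifying assumptions: the profile is a plain set of method names, and the functions `merge_profile`, `background_compile`, and `execution_mode` are hypothetical names, not part of any Android API.

```python
def merge_profile(profile, hot_methods):
    """Steps 2-3: add the hot methods seen in one run to the saved profile."""
    return profile | set(hot_methods)


def background_compile(profile, device_idle, device_charging):
    """Step 4: compile the profiled methods, but only while the
    device is both idle and charging."""
    if device_idle and device_charging:
        return set(profile)
    return set()


def execution_mode(method, compiled, jit_hot):
    """Step 5: prefer precompiled code, then JIT-compiled code,
    and fall back to the interpreter."""
    if method in compiled:
        return "aot"
    if method in jit_hot:
        return "jit"
    return "interpreter"


# Two runs of a hypothetical app refine the profile.
profile = set()
profile = merge_profile(profile, ["onDraw"])
profile = merge_profile(profile, ["onDraw", "loadFeed"])

# Nothing is compiled while the device is in use.
skipped = background_compile(profile, device_idle=False, device_charging=True)
# Later the device is idle and charging, so the profile gets compiled.
compiled = background_compile(profile, device_idle=True, device_charging=True)

# On the next launch, each method runs in the best mode available to it.
modes = {
    m: execution_mode(m, compiled, jit_hot={"parseJson"})
    for m in ["onDraw", "parseJson", "onCreate"]
}
print(modes)
```

In this run onDraw uses the precompiled code, parseJson (hot in the current session but not yet in the compiled profile) uses the JIT, and onCreate stays in the interpreter.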

Profiles in the Cloud:

Profile guided compilation, introduced in Android N, benefits storage space, memory, power consumption, and CPU usage. However, profiles are collected only locally, so an application must first be used long enough to produce a profile before the optimization takes effect. In the future, users' profiles will be collected and uploaded to the cloud (Google Play), and new users will receive a profile from the cloud at installation time for immediate use. The expected gain is about a 20% improvement in cold start performance.
