On December 3, 2017, Cheng Kun, Application R&D Manager at iFlytek, gave a talk titled “iFlytek Input Method: Android Architecture Evolution and Practice” at the IAS2017 Internet Architecture Summit. As the exclusive video partner, IT Dakashuo (WeChat ID: Itdakashuo) is authorized to publish the video with the approval of the organizer and the speaker.

Video and slides of the talk:
suo.im/5qUJXn


Abstract

This talk shares the challenges the iFlytek input method for Android faced, and the lessons learned, as it grew from its first release to maturity, along with the gradual evolution of its architecture. It closes with some of the team's practices in componentized architecture.

Architecture Evolution

An overview of architecture evolution

The iFlytek input method started with a simple MVC structure. A layered refactoring followed in March 2012, a multi-process architecture in December 2014, and a first attempt at componentization in December 2015.

How architecture evolves

When the architecture no longer matches how the business is developing, we need to change it so that it fits the business. Both organizational architecture and software architecture exist to address business problems, and solving those problems is the core goal of architecture.

Software architecture should also match organizational architecture, with a one-to-one mapping between them: the structure of the system design mirrors the structure of the organization.

The key to evolving an architecture is balance. Architects often pursue perfection: designing for extreme scalability, chasing the ultimate architecture, or pulling in the latest and coolest technologies. Whether the business actually needs any of this is uncertain, so reliability, scalability, and security must be balanced against time and cost.

The beginning of the product

The iFlytek input method project started in July 2010 with just two developers. The input method was to be shown as a demonstration product at the voice cloud conference in October 2010, so the requirements for the product were very high. Development time was only a little over three months, and many of the features were being built and tried for the first time.

Simple MVC

Given these early challenges, we did not really design an architecture when developing the product. Instead, we prioritized implementing features quickly and stably. The design drawn up before development placed the presentation layer at the top, containing the various functional modules, the business logic on the right, the engines for pinyin, handwriting, and voice input in the middle, and data storage at the bottom.

Our experience from this project: when you cannot yet tell whether the product will survive or meet expectations, do not spend too much effort on a very good architecture. Releasing the product quickly and stably is the core purpose, so our advice is to reuse as much as possible.

Rapid product development: early stage

During the rapid development of the product, our main work was filling in missing features and improving quality. The number of developers grew from 2 to 4. Problems also appeared in the development process: because all the code had been written in isolation, code reuse was poor, new features were expensive to develop, and the code was hard to maintain.

Because of this, we asked the product team to slow down the release pace, which gave us three months to refactor the product.

Layered refactoring

This refactoring split the product into abstraction layers, primarily for reuse. The bottom two layers are business-independent: the tool layer contains common utility classes, and the framework layer encapsulates common capabilities that do not depend on the business. The service layer and the business layer are business-related. For example, the service layer's logging builds on the framework layer's logging capability and adds business policies.
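The logging example can be sketched roughly as follows. This is a minimal illustration, not the team's actual code: the class names and the "private mode" policy are hypothetical, chosen only to show a business-independent framework logger being reused by a service-layer class that adds a business rule.

```java
import java.util.ArrayList;
import java.util.List;

// Framework layer: a logging capability that knows nothing about the business.
interface LogSink {
    void write(String line);
}

final class FrameworkLogger {
    private final LogSink sink;

    FrameworkLogger(LogSink sink) {
        this.sink = sink;
    }

    void log(String tag, String message) {
        sink.write(tag + ": " + message);
    }
}

// Service layer: reuses the framework logger and adds a business policy.
// The policy (skip events typed in a "private" mode) is hypothetical.
final class InputLogService {
    private final FrameworkLogger logger;

    InputLogService(FrameworkLogger logger) {
        this.logger = logger;
    }

    void logKeyEvent(String event, boolean privateMode) {
        if (privateMode) {
            return; // business policy: do not record private input
        }
        logger.log("ime", event);
    }
}

final class LayeredLoggingDemo {
    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        InputLogService service = new InputLogService(new FrameworkLogger(lines::add));
        service.logKeyEvent("candidate_selected", false);
        service.logKeyEvent("password_typed", true);
        System.out.println(lines); // prints [ime: candidate_selected]
    }
}
```

Because the policy lives only in the service layer, the framework logger stays reusable by any other product.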


After the move to a layered architecture, development efficiency improved greatly. Modules that had been poorly written or hard to maintain were reorganized and optimized, and many common base modules were encapsulated.

Rapid product development: middle stage

As the product gained more features and the code grew more complex with each iteration, the traditional layered architecture could no longer meet our needs. We therefore divided the team of 10 people into a business group and an architecture group. The architecture group is mainly responsible for performance and stability optimization.

Multi-process architecture

During the architecture group's optimization work, we changed the original architecture to a multi-process architecture. The original input method had only one process, which started very slowly, and if that process crashed, the input method became unusable. So we split the input method into five processes, put the functions that users rarely use into separate processes, and killed those processes as soon as they finished their work. The other goal was isolation: background functions such as logging, downloading, and push were moved into a separate process, so that a problem in that part does not affect the main process.

Cross-process calls are very cumbersome, so we abandoned the original singleton pattern and simplified how calls are made.

Rapid product development: later stage

In the later stage of product development, the team had grown to more than 20 people and become difficult to manage, so it was divided into four business teams and four architecture teams. The question here is which architecture fits each team's day-to-day development while also supporting faster product iterations.

To keep the product stable during rapid iteration, we began to move toward a componentized architecture.

Componentized architecture

There are now so many open source componentization frameworks that adopting one costs virtually nothing. At the time, our focus was on parallel development: teams are isolated from each other, and developers can test, validate, and release their own features once they have built them. The other concern was dynamic update: the framework had to support dynamic updates once it went live.

Along the way we found that componentized architecture has many pitfalls, so teams that have not started, or are planning to, should first decide whether they need it at all. There are two concepts here, parallel development and parallel publishing, and you can choose which one to aim for. Parallel development is relatively easy. Parallel publishing is a different matter: it involves not only the client but also the server side, big data, version management, compatibility, and more.

Componentized architecture practices

Why reinvent the wheel

Earlier componentization frameworks include Atlas/ACDD, DynamicLoadApk, DynamicApk, and Small, and two newer ones, VirtualAPK and RePlugin, have appeared recently. We looked at many frameworks but decided to build our own because of the particular nature of the input method business. An input method differs from an ordinary app: it places very high demands on the keyboard, which the open source frameworks could not meet.

Compatibility

Because the Android ecosystem is fragmented, compatibility is very troublesome to deal with. For example, when the phone switched to landscape mode, the input method showed only half of its content. We later found that resources were not refreshed when the screen rotated, so the old screen width was still being returned, and we fixed this with a hook in ActivityThread. The hook itself uses Java's reflection mechanism to replace the API's native implementation with our own.
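The hooking technique can be illustrated without any Android classes. The sketch below is an assumption-laden stand-in: `PlatformHost` plays the role of a platform class that returns a stale screen width, and `WidthHook` uses reflection to swap its private field for our own implementation. None of these names come from the talk, and the real hook in ActivityThread targets different fields.

```java
import java.lang.reflect.Field;

// Hypothetical stand-in for a platform interface we cannot modify directly.
interface WidthProvider {
    int screenWidth();
}

// Hypothetical stand-in for the platform class that holds the stale value.
final class PlatformHost {
    // Keeps returning the width that was cached before rotation.
    private static WidthProvider provider = () -> 1080;

    static int currentWidth() {
        return provider.screenWidth();
    }
}

final class WidthHook {
    // Replaces PlatformHost's private provider and returns the original one.
    static WidthProvider install(WidthProvider replacement) {
        try {
            Field field = PlatformHost.class.getDeclaredField("provider");
            field.setAccessible(true); // bypass the private modifier
            WidthProvider original = (WidthProvider) field.get(null);
            field.set(null, replacement);
            return original;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("hook point not found", e);
        }
    }
}

final class HookDemo {
    public static void main(String[] args) {
        System.out.println(PlatformHost.currentWidth()); // prints 1080
        WidthHook.install(() -> 1920); // pretend the screen rotated
        System.out.println(PlatformHost.currentWidth()); // prints 1920
    }
}
```

Returning the original implementation lets the replacement delegate to it for the cases it does not need to change. Hooks of this kind depend on private field names, which is one reason they are fragile across Android versions and vendors.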

Startup performance

Startup performance problems showed up mainly as slow keyboard startup and crashes caused by insufficient storage space. With the framework in place, startup does more work and takes longer than before. In addition, when existing users upgrade, component startup reads a configuration file that lists the components and sets their startup priority; the components are then extracted, optimized, and finally copied into place. Because extracting and optimizing components needs extra space, this is very unfriendly to older phones that are short on ROM space.

To address these problems, we turned the original manifest file into a class file, and grouped the parts needed for application startup into one Dex and the unrelated parts into another. This improves startup speed, because the Dex is optimized during installation.
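One way to read "manifest as a class file" is that the component list and its priorities become compiled code instead of a file read during startup. The sketch below follows that reading; it is an assumption, and the component names, priorities, and the `neededAtStartup` flag are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// One entry per component. All names and priorities here are hypothetical.
final class ComponentEntry {
    final String name;
    final int priority;            // lower value starts earlier
    final boolean neededAtStartup; // true: would be grouped into the startup Dex

    ComponentEntry(String name, int priority, boolean neededAtStartup) {
        this.name = name;
        this.priority = priority;
        this.neededAtStartup = neededAtStartup;
    }
}

// The component list lives in compiled code rather than in a configuration file.
final class ComponentManifest {
    static final List<ComponentEntry> ENTRIES = Arrays.asList(
            new ComponentEntry("skin_store", 30, false),
            new ComponentEntry("keyboard", 0, true),
            new ComponentEntry("speech", 10, true));

    // Names of the components to start with the keyboard, in priority order.
    static List<String> startupOrder() {
        List<ComponentEntry> startup = new ArrayList<>();
        for (ComponentEntry entry : ENTRIES) {
            if (entry.neededAtStartup) {
                startup.add(entry);
            }
        }
        startup.sort(Comparator.comparingInt(entry -> entry.priority));
        List<String> names = new ArrayList<>();
        for (ComponentEntry entry : startup) {
            names.add(entry.name);
        }
        return names;
    }
}

final class ManifestDemo {
    public static void main(String[] args) {
        System.out.println(ComponentManifest.startupOrder()); // prints [keyboard, speech]
    }
}
```

Components that are not needed at startup, such as the hypothetical `skin_store`, stay out of the startup path entirely.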

Multiple processes

We added a layer of encapsulation around calls between processes, so that cross-process capabilities can be used as if they were local. This covers converting remote interfaces to local interfaces, starting the remote process on demand, reconnection, and state recovery. In addition, to keep every component running normally, each component selects its process adaptively.
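The idea of exposing a remote capability through a local interface, with on-demand connection and reconnection, can be sketched with a plain Java dynamic proxy. This is only an illustration under stated assumptions: `DownloadService`, `RemoteDiedException`, and `ReconnectingProxy` are hypothetical names, the "remote" side is simulated in-process, and state recovery is not covered. A real Android implementation would sit on top of Binder.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

// Hypothetical local-looking interface for a capability hosted in another process.
interface DownloadService {
    String fetch(String url);
}

// Hypothetical signal that the remote process has died.
final class RemoteDiedException extends RuntimeException {
}

final class ReconnectingProxy {
    // Wraps a connector so callers see an ordinary local object.
    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> api, Supplier<T> connector) {
        InvocationHandler handler = new InvocationHandler() {
            private T remote; // connected lazily, on the first call

            @Override
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                if (remote == null) {
                    remote = connector.get();
                }
                try {
                    return method.invoke(remote, args);
                } catch (InvocationTargetException e) {
                    if (!(e.getCause() instanceof RemoteDiedException)) {
                        throw e.getCause();
                    }
                    remote = connector.get(); // reconnect once, then retry
                    try {
                        return method.invoke(remote, args);
                    } catch (InvocationTargetException retry) {
                        throw retry.getCause();
                    }
                }
            }
        };
        return (T) Proxy.newProxyInstance(api.getClassLoader(), new Class<?>[] {api}, handler);
    }
}

final class ProxyDemo {
    public static void main(String[] args) {
        int[] connects = {0};
        DownloadService service = ReconnectingProxy.wrap(DownloadService.class, () -> {
            int attempt = ++connects[0];
            // The first "connection" simulates a background process that died.
            return url -> {
                if (attempt == 1) {
                    throw new RemoteDiedException();
                }
                return "done:" + url;
            };
        });
        // prints "done:dict.zip after 2 connects"
        System.out.println(service.fetch("dict.zip") + " after " + connects[0] + " connects");
    }
}
```

Callers only see `DownloadService`, so connection handling and retry stay in one place instead of being repeated at every call site.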

Engineering structure

The shell project contains no code, only scripts and configuration files. It has a single branch and is generally not modified. It is used only for integration packaging and integration debugging, and it uses neither repo nor Git submodules.

Bundles in the business components can be compiled and debugged independently. Their build outputs include a test APK, a component APK, and an AAR, which are uploaded to a private Nexus server.