Background

Delivering releases quickly and continuously while keeping quality and performance in good shape is a perennial goal of R&D and project management teams. As mobile development technology has matured and industry competition has intensified, the leading apps have gradually become platforms and containers that serve as the main entry point to their companies’ businesses. Inevitably, this has made the apps themselves far more complex than earlier mobile apps.

In addition, the rapid pace of iterative development, complex user environments (network, device, usage scenario, etc.) and diverse application frameworks all pose great challenges for mobile development and management. In fact, the traditional software crisis is reflected to some degree in mobile development as well.

Against this background, this article starts from the quality and efficiency challenges facing mobile development and the specific situation of the iQiyi App, and describes the iQiyi technical team’s thinking and practice on this problem.

Challenges encountered

From a macro perspective, the complexity of an app as large as the iQiyi App comes from several sources:

  • Large business scale: there are dozens of business modules. Managing and isolating the coupling and conflicts between them, so that errors and faults do not spread, is no small challenge.
  • Large number of developers: many of these modules belong to different development teams whose development and integration cycles are not fully synchronized, which makes integration and release much more complex.
  • Complex technology forms: because of their business characteristics, different teams and businesses choose different technology stacks, including Native, RN, H5, applets, Flutter and others. Several cross-platform languages and technologies run side by side, such as JS, Lua, and C/C++. Keeping all of these environments properly isolated and coordinated has to be considered from many angles.
  • Fast iteration speed: the iQiyi App currently ships a release every two weeks, with back-end development done up front and the client side spending one week on development and one week on testing. Keeping the overall maintainability and testability of the App at a good standard within such a rapid cycle places new demands on the R&D architecture, processes and tools.

From a micro perspective, each business module also faces its own challenges to varying degrees: technical complexity, frequent requirement changes, and difficulty tracing accumulated history. During rapid iteration, the team is usually changing the wheels while the car is moving, so the code quality and functional quality of each module also need continuous attention.

The solution

Taken together, these complexities and uncertainties make sustained, high-quality delivery a huge challenge. Under these conditions, relying on manual test coverage alone to guarantee product quality is neither economical nor realistic.

After surveying industry practice, weighing it against iQiyi’s actual situation, and discussing it thoroughly, we settled on a governance approach built on four ideas: automation, standardization, proactiveness, and systematization. Specifically, we took the following measures:

First, for UI business scenarios with relatively fixed flows, we built automated tests. For UI business library interfaces, corresponding unit test cases are required and run automatically, which frees testers from tedious, repetitive work so they can focus on more complex and changeable scenarios. In addition, we introduced an automatic code coverage statistics mechanism on mobile, which works on both real devices and emulators and can measure either a single commit or the whole App. This provides real data on the quality and test coverage of each commit, so developers and testers can verify changes in a targeted way.
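The article does not describe how per-commit coverage is computed. As a rough illustration only, one common approach is to intersect the lines changed by a commit with the lines executed during testing. The sketch below assumes both inputs are already available as dictionaries mapping a file path to a set of line numbers; the function name and data shapes are hypothetical and are not iQiyi's implementation.

```python
def commit_coverage(changed_lines, executed_lines):
    """Return (ratio, uncovered) for one commit.

    changed_lines:  {file path: set of line numbers touched by the commit}
    executed_lines: {file path: set of line numbers hit during testing}
    Both shapes are assumptions made for this sketch.
    """
    total = 0
    covered = 0
    uncovered = {}
    for path, lines in changed_lines.items():
        hit = lines & executed_lines.get(path, set())
        total += len(lines)
        covered += len(hit)
        missed = sorted(lines - hit)
        if missed:
            uncovered[path] = missed
    # A commit that changes no executable lines counts as fully covered.
    ratio = covered / total if total else 1.0
    return ratio, uncovered
```

The uncovered line list is what makes verification targeted: it tells developers and testers which changed lines no test has exercised yet.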

Second, automated and manual testing can only find defects that have already been introduced, and at a very high cost, so we want to eliminate as many defects as possible during coding and debugging, where the cost is lowest. How can this be achieved? First, we needed agreement on the principle. After discussion, we all agreed that “the quality of the final product must be guaranteed by testing and development together”. With that agreed, we still needed effective means to reach the goal. After investigation and discussion, we concluded that logging is a good entry point, for the following reasons:

  • Wide coverage: a unified log system can cover all modules, so monitoring has no blind spots;
  • Detailed information: logs are added by the developers themselves to record the key nodes of a module’s main flow, the context of interest and the internal state, so they carry far more information than interface-level white-box tests, let alone black-box tests;
  • Persistent: log output evolves with the module code and accumulates over iterations. In practice it can be built up in stages by priority, and the results remain usable over the long term;
  • Covering the whole process: log information spans development and debugging -> testing -> internal beta -> external beta -> online release, and is useful at every stage;
  • Easy to support with tools and automation: a unified log SDK with a unified log format and rules makes logs easier to use and easier to process with tools. If supporting tools are developed alongside it, the use of logs can be largely automated at every stage, which improves efficiency and lowers the cost of use.
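The article does not publish the unified log format. As a minimal sketch of why a fixed format makes logs easier to process with tools, the example below assumes one JSON object per line with four hypothetical fields (timestamp, level, module, message); the field names and level set are illustrative assumptions.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical level set; the real SDK's levels are not given in the article.
LEVELS = ("DEBUG", "INFO", "WARN", "ERROR")


@dataclass(frozen=True)
class LogRecord:
    timestamp_ms: int
    level: str
    module: str
    message: str


def encode(record):
    """Serialize a record as a single line of JSON."""
    if record.level not in LEVELS:
        raise ValueError(f"unknown level: {record.level}")
    # json.dumps escapes line breaks, so one record is always one line.
    return json.dumps(asdict(record), ensure_ascii=False, sort_keys=True)


def decode(line):
    """Parse a line produced by encode() back into a LogRecord."""
    return LogRecord(**json.loads(line))
```

Because every module writes the same fields, a viewer, filter or analysis job can parse any line without module-specific rules.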

Based on these considerations, the team built a comprehensive diagnostic and debugging system for mobile with the log system at its core. The system provides tool support for the development, debugging, internal testing, release and other stages. Through standardization and defined processes, it exposes problems as early as possible and keeps the complete handling of each problem traceable. After release, it provides comprehensive diagnostic and location information for specific scenarios of specific users. In addition, uploaded logs and debugging information are stored and then analyzed and processed as big data according to the standardized format, and the system is connected with the intelligent customer service system. Together these form a complete closed loop of reporting, analysis, monitoring, intelligent customer service, and fix confirmation, which improves development and debugging efficiency while safeguarding overall quality. The entire system is shown in the following diagram:

Log SDK

The log SDK is integrated into the main App and the standalone Apps to provide log monitoring and tracking for the key nodes of each module. The main modules are already covered. In addition to basic functions such as log levels and segmented storage, the log SDK provides configurable filtering policies, storage media, reporting policies, and remote cloud-controlled configuration. To prevent logs from piling up, the log system also supports a storage capacity limit and automatic clearing of expired logs. The structure of the log system is shown as follows:

The SDK supports level-based log formatting and output to different targets. Besides console output, file storage, and server upload, logs can also be sent to LogDebugger, the Web viewer, and other tools, which is convenient for developers and testers.
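The two ideas in this section, routing logs to several targets by level and capping stored logs by size and age, can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the LogSDK API: the class names, the level values, and the in-memory store standing in for file storage are all hypothetical.

```python
from collections import deque

# Hypothetical level ordering, assumed for this sketch.
LEVELS = {"DEBUG": 10, "INFO": 20, "WARN": 30, "ERROR": 40}


class BoundedLogStore:
    """In-memory stand-in for file storage with a size cap and expiry."""

    def __init__(self, max_bytes, max_age_s):
        self.max_bytes = max_bytes
        self.max_age_s = max_age_s
        self._entries = deque()  # (timestamp_s, line), oldest first
        self._size = 0

    def write(self, timestamp_s, line):
        self._entries.append((timestamp_s, line))
        self._size += len(line.encode("utf-8"))
        # Drop the oldest entries while over capacity or past the age limit.
        while self._entries and (
            self._size > self.max_bytes
            or timestamp_s - self._entries[0][0] > self.max_age_s
        ):
            _, old = self._entries.popleft()
            self._size -= len(old.encode("utf-8"))

    def lines(self):
        return [line for _, line in self._entries]


class Logger:
    """Formats each log once and forwards it to every sink that accepts its level."""

    def __init__(self):
        self._sinks = []  # (minimum level value, write callable)

    def add_sink(self, min_level, write):
        self._sinks.append((LEVELS[min_level], write))

    def log(self, timestamp_s, level, module, message):
        line = f"[{level}] {module}: {message}"
        for min_level, write in self._sinks:
            if LEVELS[level] >= min_level:
                write(timestamp_s, line)
```

In this sketch a console sink could accept everything from DEBUG upward while the stored or uploaded copy keeps only WARN and above, so the more expensive targets receive less data.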

Supporting tool set

The log SDK only provides the necessary channel for functions such as monitoring and diagnostics; tools have to be built on top of it to take full advantage of it. By running platform, these tools can be divided into client, desktop, and Web tools. By function, they fall roughly into two categories, quality and efficiency:

Log debugger

A common problem with logging is that the IDE console mixes in a large amount of system log output, third-party library log output, and so on, and this irrelevant information drowns out the useful information. When developers and testers do not know which keywords to look for, or when no uniform keywords exist, it is hard for them to get a global, intuitive view of a module’s log output. To solve this problem and to encourage people to use logs to diagnose and locate problems, we developed a log debugger. It uses a client-server structure, supports local debugging, cross-debugging, and remote debugging over the intranet, and provides a unified log output pipeline, which keeps log information from being flooded and becoming impossible to extract. It also supports formatted display and monitoring by level, module and keyword.

After startup, the tool automatically searches for and discovers clients on the LAN, displays each client’s App name, version, device ID, connection status and other information, and automatically maintains the connection status, as shown below:

If a client is online, click connect to connect to the client. After the connection is established, the logs sent by the client through LogSDK will be displayed in the debugger window in real time, as shown below:

The log debugger can monitor logs by level, module, and keyword. Users can export and send the logs that match the specified filter conditions.
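Filtering by level, module and keyword can be sketched as a small predicate applied to each incoming record. The sketch below is an illustration only and assumes records arrive as (level, module, message) tuples; the class, its fields and the level values are hypothetical rather than taken from the log debugger.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical level ordering, assumed for this sketch.
LEVELS = {"DEBUG": 10, "INFO": 20, "WARN": 30, "ERROR": 40}


@dataclass(frozen=True)
class LogFilter:
    min_level: str = "DEBUG"
    module: Optional[str] = None   # None means "any module"
    keyword: Optional[str] = None  # None means "any message"

    def matches(self, level, module, message):
        if LEVELS[level] < LEVELS[self.min_level]:
            return False
        if self.module is not None and module != self.module:
            return False
        # Keyword matching is case-insensitive in this sketch.
        if self.keyword is not None and self.keyword.lower() not in message.lower():
            return False
        return True


def select(records, log_filter):
    """Keep the (level, module, message) records accepted by the filter."""
    return [r for r in records if log_filter.matches(*r)]
```

The list returned by select is the kind of filtered result a user would then export or send.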

Web viewer

We developed the Web viewer so that the people involved in development, debugging, and internal testing always have a way to obtain logs, internal state, and other information from the client being debugged. The Web viewer supports viewing real-time and non-real-time logs and the debugging information of specified modules. The following figure shows the scenario of viewing logs in real time:

Active diagnostics/debugging

To further improve debugging efficiency, and to cope with problems that are hard to reproduce or cases where the debugger itself interferes with the program being debugged (the observer effect; see Heisenbug), we borrowed debugging methods from embedded development and built an active diagnostic and debugging mechanism with the following characteristics:

  • From passive to active: plain log output and analysis is still relatively passive. We want modules themselves to support some active debugging functions, especially modules and businesses with complex, strongly stateful internals.
  • Contactless, even remote, debugging that covers the development, testing, and release phases: during development and beta testing, a remote command channel lets developers debug the App interactively through a shell, for example to pull logs more precisely, to read the client’s exact A/B test status and related configuration, or to run module-defined debugging commands that return module-specific debugging information. In the release phase, active debugging commands are sent through the Push token channel to obtain the client’s internal status, which helps locate problems more accurately and provides information for bug fixing and experience improvement.
  • An extensible framework: the framework implements the most common basic debugging commands, and each business can define its own extended debugging commands to suit its situation.
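The extensibility described in the last bullet can be sketched as a command registry: the framework registers basic commands, business modules register their own, and a shell or push channel hands each incoming command line to the registry. The class, method and command names below are hypothetical and serve only to illustrate the idea.

```python
class DebugCommandRegistry:
    """Maps a command name to a handler that takes a list of string arguments."""

    def __init__(self):
        self._handlers = {}

    def register(self, name, handler):
        if name in self._handlers:
            raise ValueError(f"command already registered: {name}")
        self._handlers[name] = handler

    def execute(self, command_line):
        parts = command_line.split()
        if not parts:
            return "error: empty command"
        name, args = parts[0], parts[1:]
        handler = self._handlers.get(name)
        if handler is None:
            return f"error: unknown command '{name}'"
        try:
            return handler(args)
        except Exception as exc:
            # A failing handler must not take down the debugging channel.
            return f"error: {name} failed: {exc}"


registry = DebugCommandRegistry()
# A basic command that a framework might provide (hypothetical name and output).
registry.register("ab_status", lambda args: "group=B")
# An extended command that a business module might add (hypothetical).
registry.register("player.state", lambda args: f"state=paused detail={args[0]}")
```

Calling registry.execute("player.state buffer") would run the module's own handler, while an unknown or failing command comes back as an error string rather than an exception, so a remote session can continue.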

Debuggable design

The previous section mentioned active debugging for modules. Why introduce it? The high complexity of software products and the fast iteration rhythm make reliable testing and debugging very difficult. One way to address this is to plan debugging functions during the design and development stage, taking precautions in advance and turning a passive position into an active one. This is the idea of Design for Debug. It originated in integrated circuit development, was later introduced into software, and has gradually gained attention.

More broadly, debuggability in mobile development should be considered to some extent at every level: the whole App, a single module, a single interface, and so on. This is of course a very large topic, and the module-level active debugging we have implemented so far is only a preliminary practice of the idea that needs further exploration.

Conclusion

The log system runs through the whole process of development, debugging, testing, and release, and is a good starting point for improving efficiency and quality. This article has described the comprehensive diagnostic and debugging system, and the supporting tools, that the iQiyi technical team built around it.

It has also introduced the principle of the active diagnostic and debugging functions we are exploring, and discussed the idea of debuggable design. The concept originated in the field of integrated circuits, and it offers useful inspiration today as software modules become more and more complex. We are still at an early stage of exploring debuggable design and welcome further discussion with others in the industry.


iQiyi technical product team
