Author: Xianyu Technology, Reed Bamboo

Background

As a vertical trading community app, Xianyu has complex and diverse business scenarios, including C2C, recycling and consignment, rental, face-to-face trading, and inspection-guaranteed trading, and its trading modes are complex and changeable. Take the inspection process as an example:

• 39 state machine nodes
• Across 10+ application systems
• Cooperation between 6 business departments
• Dozens of interfaces

We need to make sure that every interface and every scenario works. Even a small defect can cause real financial loss. In practice we ran into a variety of problems; the thorniest are listed below.


The problem

In a fast-iteration mode where winning business comes first, testing and verification relied entirely on manual effort. After the new functions were tested, the old functions had to be regression-tested as well. Even a small requirement took several person-days, and a PTM version had to go through regression several times.

1. The transaction business depends heavily on the middle office, so communication costs are high, cross-team collaboration is difficult, and iteration efficiency is low. How can the test environment become self-consistent?
2. How can we support steady iteration and daily regression verification of requirements under complex and diverse trading modes?


Test strategy – Automation

Xianyu's quality infrastructure is being built at a fast pace. Relying on manpower alone is not feasible given Xianyu's diverse trading modes. Following an interface -> link strategy, we explored and compared several schemes: first make sure each interface is correct, then guarantee the whole link on that basis.


The interface layer

In every large application, the number of interfaces keeps growing, the code changes more and more frequently, and the system is refactored from time to time. How can the quality of these interfaces be guaranteed? The traditional way of writing test scripts costs too much manpower and time. During actual testing we explored some new ideas for interface testing. The approach currently recognized as effective in the industry is automated testing based on traffic recording and playback. Implementations differ across the industry, but they all follow the same core ideas. The following summary, quoted here, is simple and clear:

The first idea is black-box testing. Online traffic (mainly request parameters and results) is collected while the online interface serves requests. The collected traffic is then used to trigger the requests again in an environment that matches the online one (shared database and so on), and the returned value is asserted to be consistent with the recorded one. This method suits Get-type (read) interfaces. For write operations the replayed requests easily pollute data, and the approach is further limited by the state of the data when the traffic was collected (data timeliness) and by environmental dependencies (various middleware, and the RPC calls made inside the interface). Because of these limitations it cannot meet the complex requirements of real test scenarios.

The second idea is closer to white-box testing and relies mainly on intelligent mocking. When traffic is sampled, the results returned by external middleware or RPC calls during code execution are recorded as well. When the traffic is replayed, the parts of the application's external dependencies that may have changed are mocked with the recorded results, so that the test focuses on the local logic of the interface code.
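The difference between the two ideas can be shown with a minimal sketch. Everything in it is hypothetical and for illustration only: the `handler` interface, the `lookup_price` dependency, and the `Recorded` structure are assumptions, not part of any tool mentioned in this article. It shows how a black-box replay can fail just because the underlying data changed, while a replay with the dependency mocked still checks only the local logic.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Recorded:
    """One piece of recorded traffic (hypothetical structure)."""
    item_id: str
    dependency_result: int  # what the external lookup returned while recording
    response: str           # what the interface returned while recording


def handler(item_id: str, lookup_price: Callable[[str], int]) -> str:
    """Hypothetical read interface whose result depends on an external lookup."""
    return f"{item_id}:{lookup_price(item_id)}"


def record(item_id: str, lookup_price: Callable[[str], int]) -> Recorded:
    """Capture the request, the dependency result, and the response."""
    price = lookup_price(item_id)
    return Recorded(item_id, price, handler(item_id, lambda _: price))


def black_box_replay(rec: Recorded, lookup_price: Callable[[str], int]) -> bool:
    """Re-send the request against the live dependency and compare responses."""
    return handler(rec.item_id, lookup_price) == rec.response


def white_box_replay(rec: Recorded) -> bool:
    """Replay with the dependency mocked to the value seen while recording."""
    return handler(rec.item_id, lambda _: rec.dependency_result) == rec.response


prices: Dict[str, int] = {"item-1": 100}
rec = record("item-1", prices.__getitem__)
prices["item-1"] = 120  # the underlying data changes after recording

print(black_box_replay(rec, prices.__getitem__))  # False: data drift, not a code bug
print(white_box_replay(rec))                      # True: local logic is unchanged
```

The black-box result is a false alarm caused by data timeliness, which is the limitation the summary describes.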

Inside Alibaba Group, two different traffic playback schemes have been built on this idea: Apocalypse/Blizzard, which is based on doom, and Phoenix, which is based on JVM-Sandbox. Both rely on JVM AOP.

Apocalypse/Blizzard

Apocalypse/Blizzard uses doom underneath for traffic recording. It works as follows:

![doom frame.PNG](https://gw.alicdn.com/imgextra/i4/O1CN01WOLEs31JGV9Tv4bwj_!!!!! PNG)

Doom schematic

1. A Java Agent client inside the JVM uses ASM-based AOP to collect the input parameters and return values of the main call (the entry method during collection or playback) and of its sub-calls (method calls made while the application executes). The collecting machine keeps each method's inputs and return values so that they can be mocked during playback, and uploads the collected data to the server (offline mode).
2. After receiving a playback request for the interface, the client executes the interface's local logic and mocks the input parameters and results of the sub-calls.
3. The collected traffic is compared with the playback data.
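The three steps above can be sketched in a few lines. This is only an illustration of the record, mock, and compare flow, assuming a decorator in place of the Java Agent and ASM bytecode weaving that doom actually uses. The `Agent` class and the `query_stock`, `can_buy`, and `can_buy_changed` functions are hypothetical names.

```python
import functools
from typing import Any, Callable, Dict, Tuple


class Agent:
    """Hypothetical stand-in for the recording agent (not doom's real API)."""

    def __init__(self) -> None:
        self.mode = "off"  # "off", "record", or "replay"
        self.sub_calls: Dict[Tuple[str, Tuple[Any, ...]], Any] = {}

    def sub_call(self, func: Callable[..., Any]) -> Callable[..., Any]:
        """Wrap a dependency: record its result, or mock it during replay."""
        @functools.wraps(func)
        def wrapper(*args: Any) -> Any:
            key = (func.__name__, args)
            if self.mode == "replay":
                return self.sub_calls[key]  # mocked: the real dependency is not called
            result = func(*args)
            if self.mode == "record":
                self.sub_calls[key] = result
            return result
        return wrapper


def record(agent: Agent, main: Callable[..., Any], *args: Any) -> Dict[str, Any]:
    """Step 1: collect the main call's arguments and result plus its sub-calls."""
    agent.mode, agent.sub_calls = "record", {}
    try:
        result = main(*args)
    finally:
        agent.mode = "off"
    return {"args": args, "result": result, "sub_calls": dict(agent.sub_calls)}


def replay(agent: Agent, main: Callable[..., Any], case: Dict[str, Any]) -> bool:
    """Steps 2 and 3: run the local logic with sub-calls mocked, then compare."""
    agent.mode, agent.sub_calls = "replay", dict(case["sub_calls"])
    try:
        return main(*case["args"]) == case["result"]
    finally:
        agent.mode = "off"


agent = Agent()
stock = {"sku-1": 5}


@agent.sub_call
def query_stock(sku: str) -> int:  # sub-call to an external dependency
    return stock[sku]


def can_buy(sku: str, quantity: int) -> bool:  # main call, version under recording
    return query_stock(sku) >= quantity


def can_buy_changed(sku: str, quantity: int) -> bool:  # changed version with a regression
    return query_stock(sku) > quantity


case = record(agent, can_buy, "sku-1", 5)
stock["sku-1"] = 0  # live data drifts, but replay uses the recorded value

print(replay(agent, can_buy, case))          # True
print(replay(agent, can_buy_changed, case))  # False: the changed logic is caught
```

Because the sub-call is mocked from the recording, the later change to `stock` does not affect the comparison; only a change in the local logic does.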

In doom mode, business applications need to introduce JAR packages, modify startup classes, and change the JVM settings to mount the agent, so the approach is partly intrusive.

Phoenix

Phoenix is also a traffic recording solution implemented with JVM AOP, and its concept is similar to doom. Its overall architecture is built on the atomic module capabilities provided by JVM-Sandbox (a non-intrusive runtime AOP solution for the JVM platform, open-sourced by Alibaba, which implements method-level AOP through bytecode enhancement). When recording, it captures the methods that were called, their input arguments, their return values, and the order in which the calls occurred, and stores them in a chained data structure. During playback it mocks the interface logic and sub-calls.

![phoenix recorded playback.PNG](https://gw.alicdn.com/imgextra/i2/O1CN01m49rqS1rsh7EMIakW_!!!!! 6000000005687-2-TPS-442-331.png)

Phoenix recording and playback

Phoenix requires no intrusive code changes and no changes to application startup parameters. It has relatively little impact on service code, but it does place requirements on the application structure. Considering cost and risk, as well as our application structure, Xianyu adopted the Sandbox-based Phoenix traffic recording and playback to safeguard changes to online processes.

During development we also ran into various traffic playback issues, such as use cases that expire and have to be cleared and re-recorded manually. We now use a scheduled task to clear expired cases automatically and re-record them.

Here is an example from one of our scenarios:

![image.png](https://gw.alicdn.com/imgextra/i3/O1CN01bR7Yqe1qfaA29uZCx_!! 6000000005523-2-tps-1318-418.png)
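As an illustration of recording calls together with their order, the sketch below replays a recorded chain of invocations and fails when the replayed code calls something other than the next recorded step. It is a simplified model under stated assumptions, not Phoenix's or JVM-Sandbox's actual data structure or API. `Invocation`, `ReplayChain`, and the order-placing functions are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Deque, List, Optional, Tuple


@dataclass(frozen=True)
class Invocation:
    """One recorded call: method name, arguments, and return value."""
    method: str
    args: Tuple[Any, ...]
    result: Any


class ChainMismatch(Exception):
    """Raised when replayed calls diverge from the recorded chain."""


class ReplayChain:
    """Serves recorded results strictly in the order they were recorded."""

    def __init__(self, recorded: List[Invocation]) -> None:
        self._pending: Deque[Invocation] = deque(recorded)

    def invoke(self, method: str, *args: Any) -> Any:
        if not self._pending:
            raise ChainMismatch(f"unexpected extra call: {method}{args}")
        expected = self._pending.popleft()
        if (expected.method, expected.args) != (method, args):
            raise ChainMismatch(
                f"expected {expected.method}{expected.args}, got {method}{args}"
            )
        return expected.result  # mocked result from the recording

    def finished(self) -> bool:
        return not self._pending


recorded = [
    Invocation("lock_stock", ("sku-1",), True),
    Invocation("create_order", ("sku-1", "buyer-9"), "order-42"),
]


def place_order(chain: ReplayChain, sku: str, buyer: str) -> Optional[str]:
    """Version that was recorded: lock stock first, then create the order."""
    if not chain.invoke("lock_stock", sku):
        return None
    return chain.invoke("create_order", sku, buyer)


def place_order_reordered(chain: ReplayChain, sku: str, buyer: str) -> Optional[str]:
    """Changed version that creates the order before locking stock."""
    order_id = chain.invoke("create_order", sku, buyer)
    chain.invoke("lock_stock", sku)
    return order_id


chain = ReplayChain(recorded)
print(place_order(chain, "sku-1", "buyer-9"), chain.finished())  # order-42 True

try:
    place_order_reordered(ReplayChain(recorded), "sku-1", "buyer-9")
    order_changed = False
except ChainMismatch:
    order_changed = True
print(order_changed)  # True
```

Keeping the order in the recording lets a replay detect changes in call sequence, not just changes in the final return value.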

The link layer

While testing interfaces with traffic recording and playback, we found that the mechanism guarantees quality within a single application, but it falls short for cross-application link verification, core write operations, outgoing calls, and large requirements such as system refactoring and scheme transformation. These cases call for link-level solutions.

• Thub + microservices

In the test environment, the full link depends strongly on upstream and downstream systems. One of our measures is to develop test servitization capabilities and make the environment self-consistent: the coupling to external parties such as the trading center and Cainiao Wrapping is removed, so that the full link can run as a closed loop inside the test environment. The first task in landing this is to sort out all service link nodes:

- Each MTOP interface on the trunk link, as well as the upstream and downstream dependencies of the interface
- Internal applications, mid-platform applications, and external merchant dependencies
- Data flow and TDDL sorting



Once the business sorting is complete, we develop the test service interfaces. Here is part of the link cases we sorted out:



At the same time, when a dependent party's unstable test environment makes our own test environment unstable, for example by blocking tests, we provide test service interfaces that encapsulate the dependency. They expose servitized capabilities such as ordering and inspection, which are embedded in the Xianyu quality platform for use in development and testing throughout the research and development process.
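A test service interface of this kind might look like the following minimal sketch. It assumes a hypothetical `TradingCenter` interface with an in-memory stand-in for the test environment. None of these names, states, or methods come from Thub or from Xianyu's real services.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol


class TradingCenter(Protocol):
    """Hypothetical view of the external dependency the link relies on."""

    def create_order(self, item_id: str, buyer_id: str) -> str: ...

    def pay(self, order_id: str) -> None: ...

    def status(self, order_id: str) -> str: ...


@dataclass
class InMemoryTradingCenter:
    """Stand-in used inside the test environment instead of the real system."""
    orders: Dict[str, str] = field(default_factory=dict)

    def create_order(self, item_id: str, buyer_id: str) -> str:
        order_id = f"test-order-{len(self.orders) + 1}"
        self.orders[order_id] = "CREATED"
        return order_id

    def pay(self, order_id: str) -> None:
        self.orders[order_id] = "PAID"

    def status(self, order_id: str) -> str:
        return self.orders[order_id]


def prepare_paid_order(trading: TradingCenter, item_id: str, buyer_id: str) -> str:
    """Test service: one call that drives an order to the paid state."""
    order_id = trading.create_order(item_id, buyer_id)
    trading.pay(order_id)
    return order_id


trading = InMemoryTradingCenter()
order_id = prepare_paid_order(trading, "item-1", "buyer-9")
print(order_id, trading.status(order_id))  # test-order-1 PAID
```

Because callers depend only on the interface, a test can reach the state it needs in one call without waiting on the dependent party's environment.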

• Sky Computing platform

The Sky Computing platform follows the shadow database and full-link stress testing model: online business data and test data are isolated, and the test database copies part of the data from the online database. Its main approach is to solidify and simulate online scenarios, execute them across the full link, and compare all data changes produced during execution. Users can choose any code version as the baseline to compare against the changed version.
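The comparison step can be illustrated with a small sketch: the same scenario is run by a baseline version and a changed version, each on its own copy of seeded shadow data, and the resulting data is diffed row by row. This is a simplified model under stated assumptions, with databases represented as nested dictionaries. The function names and the fee example are hypothetical and do not describe the platform's real implementation.

```python
import copy
from typing import Any, Callable, Dict, Optional, Tuple

Row = Dict[str, Any]
Table = Dict[str, Row]       # primary key -> row
Database = Dict[str, Table]  # table name -> table


def run_on_shadow(seed: Database, scenario: Callable[[Database], None]) -> Database:
    """Run a scenario on a private copy of the seed so the seed stays untouched."""
    shadow = copy.deepcopy(seed)
    scenario(shadow)
    return shadow


def diff(
    baseline: Database, candidate: Database
) -> Dict[Tuple[str, str], Tuple[Optional[Row], Optional[Row]]]:
    """Return every (table, key) whose row differs between the two runs."""
    changes: Dict[Tuple[str, str], Tuple[Optional[Row], Optional[Row]]] = {}
    for table in sorted(set(baseline) | set(candidate)):
        base_rows = baseline.get(table, {})
        cand_rows = candidate.get(table, {})
        for key in sorted(set(base_rows) | set(cand_rows)):
            if base_rows.get(key) != cand_rows.get(key):
                changes[(table, key)] = (base_rows.get(key), cand_rows.get(key))
    return changes


def pay_baseline(db: Database) -> None:
    """Baseline code version: marks the order paid and charges a fee of 10."""
    db["orders"]["o1"].update(status="PAID", fee=10)


def pay_changed(db: Database) -> None:
    """Changed code version: same flow, but the fee is computed differently."""
    db["orders"]["o1"].update(status="PAID", fee=12)


seed: Database = {"orders": {"o1": {"status": "CREATED", "fee": 0}}}
result = diff(run_on_shadow(seed, pay_baseline), run_on_shadow(seed, pay_changed))
print(result)
```

Here the diff contains a single entry for `("orders", "o1")`, showing that only the fee differs between the two versions, which a reviewer can then confirm as intended or flag as a regression.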

A flowchart



The Sky Computing capability basically covers Xianyu's trading links. Xianyu has built the shadow databases for the main link, and the shadow link is being debugged for full-link inspection of the trading server side. The shadow link also has some problems, such as shadow data expiring because of business changes. This scheme is therefore used mainly for stable business; new or continuously changing business is not fully covered by it.


Conclusion

To sum up, Xianyu trading currently uses the JVM-Sandbox-based traffic recording scheme at the interface layer, uses the shadow link for daily inspection, and has developed service orchestration capabilities for process self-testing and link automation. The schemes compare as follows:

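| | doom | jvm-sandbox | Thub + service | Shadow link |
| --- | --- | --- | --- | --- |
| Released externally | No | Yes (open source on GitHub) | No | No |
| Code intrusion | Yes | No | No | No |
| Application requirements | No | Yes | No | No |
| Stability | Average | Average | Average | Average |
| Full link test | No | No | Yes | Yes |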


Outlook

Building on this infrastructure, we will keep exploring test solutions for Flutter and fully intelligent testing on the server side, hoping to free more engineers from repetitive work and to protect Xianyu transactions with a three-layer quality net of governance, prevention, and control, so that users can sell and buy on Xianyu with confidence. We look forward to exchanging ideas about testing solutions with others in the industry! Thanks to the doom, Sandbox, Phoenix, Apocalypse, Blizzard, Full link Pressure test, Thub, and other teams for their support!