This article is from the project's official account, "AirtestProject". Copyright notice: reproduction is allowed, but the original link must be retained. Do not use it for commercial or illegal purposes.

Preface

Today I would like to share the story of a beginner's journey with Airtest, from getting started to giving up:

Little A is new to test automation. While browsing a testing forum, he stumbled upon Airtest, a UI automation framework based on image recognition.

Out of curiosity, Little A tried the framework and found that a few simple screenshots were enough to automate all kinds of operations on a device, so he was quickly won over.

A few days later, however, as he used screenshots more and more, Little A realized they were not as "perfect" as he had expected. Sometimes the program reported that it could not find the screenshot, sometimes it recognized the wrong position, and sometimes a script he had worked hard on stopped recognizing anything after he switched to another phone...

After the Nth failure, Little A finally gave up on the framework.

Seeing this, some of you may relate, because when you first started using Airtest you probably fell into the same pitfalls!

Airtest, as a self-developed testing framework, is certainly not "perfect". But besides waiting for our developers to deliver more accurate image recognition in the future, there are many tricks we can use today to improve the compatibility of our screenshots.

In this article, we first look at the principles behind Airtest image recognition, and then go through 11 screenshot techniques, with real examples, for improving screenshot compatibility.

What you should know about Airtest screenshots

1. Image recognition algorithms used by Airtest

By default, Airtest tries the SURFMatching, TemplateMatching, and BRISKMatching algorithms for image recognition.

TemplateMatching is a template matching algorithm, while SURFMatching and BRISKMatching are feature-point matching algorithms. Put simply, template matching relies on the feature vectors of an image, while feature-point matching relies on its feature points.

These algorithms work better on distinctive icons and images on the device screen, because those have more feature vectors/feature points, and worse on screenshots of plain text or with large blank backgrounds.

A plain-text screenshot contains only a few simple strokes and therefore has fewer feature vectors/feature points, so it is more likely than an image to produce a wrong match. In a screenshot of a blank background, the gray value of the pixels hardly changes, so there are almost no feature points; the program is then more likely to find no match at all, or a match that is far off.

2. How the program decides whether a matching screenshot has been found

So when we write a screenshot script and run it, how does the program use these image recognition algorithms to decide whether the target has been recognized?

Two terms matter here: threshold and confidence, both ranging from 0 to 1. Every screenshot statement has a threshold used to filter results, with a default value of 0.7.

When one of the three algorithms above produces an initial result during execution, the confidence of that result is calculated. If the confidence is greater than the threshold, the program considers the best match found; if it is below the threshold, the program considers that no match has been found.

When running a screenshot script, we can watch the log window to see the confidence of each recognition result:

① Confidence > threshold: the program decides that a match has been found.

② Confidence < threshold: the program decides that no match has been found, and keeps cycling through the three algorithms until the timeout.
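
The decision described above can be sketched in a few lines of plain Python. This is only an illustration, not Airtest's actual implementation: `find_match`, its parameters, and the idea of passing each algorithm in as a callable are assumptions made for the example, and treating a confidence exactly equal to the threshold as a pass is also an assumption.

```python
import time


def find_match(algorithms, threshold=0.7, timeout=20.0, interval=0.5):
    """Cycle through the algorithms until one result clears the threshold.

    `algorithms` is a list of zero-argument callables (hypothetical stand-ins
    for the matching algorithms). Each returns (position, confidence) or None.
    """
    deadline = time.monotonic() + timeout
    while True:
        for algorithm in algorithms:
            result = algorithm()
            if result is None:
                continue
            position, confidence = result
            if confidence >= threshold:
                # Confident enough: accept this result as the match.
                return position
        if time.monotonic() >= deadline:
            # Nothing cleared the threshold before the timeout.
            return None
        time.sleep(interval)
```

A result with confidence 0.95 is accepted on the first pass, while a result that never rises above 0.5 keeps the loop running until the timeout and then yields `None`.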

11 screenshot tips for easy screenshot-based automation

Now that we have covered the basics, we finally come to the screenshot tips. Note that the tips that apply differ greatly from scene to scene, so use them flexibly:

1. Avoid capturing too much background when taking a screenshot of an icon

A simple example: we want to open NetEase Cloud Music by clicking its app icon. To get better recognition results across devices, we should choose the first screenshot in the following figure, not the second one, which includes too much background:

To show the difference, we captured both screenshots on device 1 and ran each of them on device 2. The results are as follows:

The screenshot with little background is recognized with a confidence as high as 0.95, while the confidence of the screenshot that includes background around the icon drops to 0.88. Minimizing the background when capturing such icons improves the compatibility of the screenshots.

2. Start the app using start_app instead of screenshots

start_app() supports both Android and iOS devices, and is much cleaner and more compatible than starting the app through screenshots:

    from airtest.core.api import start_app

    # Open NetEase Cloud Music by package name
    start_app("com.netease.cloudmusic")

3. Use the image editor to check the confidence of recognition results

After recording or writing a screenshot script, we do not need to run it to check a screenshot. We can double-click the screenshot to open the image editor and click the "Snapshot + Recognition" button in the upper left corner to see how the screenshot is recognized on the current screen, including the recognized position and the confidence of the result:

This recognition result can serve as a reference for quickly debugging your screenshots.

4. Use target_pos to click different parts of a screenshot

First, what is target_pos? By default, a click lands on the center of the screenshot, which corresponds to target_pos=5. A screenshot has 9 target_pos values in total; setting target_pos to a different value makes the script click a different location within the screenshot:

Double-click the screenshot in the IDE to open the image editor, and change the value of target_pos on the right side:

After switching the script to code mode, we can see a target_pos parameter in the screenshot statement:

    Template(r"….png", target_pos=6, record_pos=(-0.434, -0.773), resolution=(900, 1600))
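
To make the nine positions concrete, here is a small sketch that computes where a click would land for a given target_pos. It assumes the positions are numbered row by row as a 3x3 grid (1 = top left, 5 = center, 9 = bottom right), which is consistent with 5 being the center. `click_point` and the rectangle format are hypothetical helpers for illustration, not part of Airtest.

```python
def click_point(rect, target_pos=5):
    """Return the (x, y) point that would be clicked inside a matched area.

    rect is (left, top, width, height) of the matched area in pixels.
    target_pos 1..9 is read as a 3x3 grid, numbered row by row (assumption):
        1 2 3
        4 5 6
        7 8 9
    """
    if target_pos not in range(1, 10):
        raise ValueError("target_pos must be between 1 and 9")
    left, top, width, height = rect
    # Row and column in the grid, each 0, 1 or 2.
    row, col = divmod(target_pos - 1, 3)
    # Columns map to the left edge, middle and right edge; rows likewise.
    x = left + col * width / 2
    y = top + row * height / 2
    return (x, y)
```

For a matched area at (100, 200) that is 60 pixels wide and 40 pixels high, `target_pos=5` gives the center of the area and `target_pos=6` gives the middle of its right edge.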

In automation we often run into stacked, identical icons. For example, in a NetEase Cloud Music playlist, three identical play buttons are listed on the right side:

If we need to click the middle button, a screenshot of a single play button is not enough to pick it out from the three.

In this case there are two ways to take the screenshot. One is to enlarge the screenshot vertically so that the middle button sits at target_pos=5:

The other is to enlarge the screenshot horizontally to include the song description on the left, with the middle button at target_pos=6:

Both methods make sure we hit the middle button (assuming the list of songs does not change).

So when a precise screenshot (capturing only one button or icon) cannot locate the target uniquely, consider enlarging the screenshot to include more feature points and make the positioning accurate.

5. Use coordinates for clicks and swipes

When we open an app, we sometimes meet splash animations or introduction pages. These may change with every version, so clicking them via screenshots costs a lot of effort in screenshot maintenance.

In fact, we can use coordinate clicks instead of screenshots here, because a click anywhere on the splash animation or introduction page skips it.

Similarly, the carousel on the home page of NetEase Cloud Music may be different every day. If we swipe or click it with screenshot scripts, we have to keep updating those scripts, so it is better to replace them with coordinate swipes and clicks, which saves a lot of effort:
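
A rough sketch of coordinate-based operation follows. Airtest's `touch` and `swipe` accept pixel coordinates as well as screenshots; here they are passed in as arguments so the sketch can be read without a connected device. The helper names, the relative positions, and the assumption that the carousel sits about a quarter of the way down the screen are all made up for illustration.

```python
def to_pixels(rel_pos, resolution):
    """Scale a relative position (0..1 on each axis) to pixel coordinates."""
    width, height = resolution
    return (round(rel_pos[0] * width), round(rel_pos[1] * height))


def skip_intro_then_swipe_carousel(touch, swipe, resolution):
    """Tap the screen center to skip an intro page, then swipe the carousel left.

    `touch` and `swipe` are expected to be the functions of the same name
    from airtest.core.api; `resolution` is the (width, height) of the screen.
    """
    # A tap anywhere skips the intro page, so no screenshot is needed.
    center = to_pixels((0.5, 0.5), resolution)
    touch(center)
    # Swipe right to left across the assumed carousel area.
    start = to_pixels((0.8, 0.25), resolution)
    end = to_pixels((0.2, 0.25), resolution)
    swipe(start, end)
    return center, start, end
```

With a device connected, this could be called as `skip_intro_then_swipe_carousel(touch, swipe, (900, 1600))` after `from airtest.core.api import touch, swipe`. Using relative positions keeps the script usable on devices with other resolutions.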

6. Replace back-button screenshot statements with keyevent("BACK")

Often we need to return from some page to the home page of the app. Some readers use a series of screenshot statements that click back icons to do this:

In fact, each of these back-icon screenshot statements can be replaced with keyevent("BACK").
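
As a small sketch of this idea, the helper below presses the back key a given number of times. `go_back` is a hypothetical name, and `keyevent` is passed in so the sketch runs without a device; it is expected to be the function of the same name from airtest.core.api. Note that "BACK" is an Android key name, so this applies to Android devices.

```python
def go_back(keyevent, times=1):
    """Press the Android BACK key `times` times and return the number of presses."""
    for _ in range(times):
        # One key press replaces one "click the back icon" screenshot statement.
        keyevent("BACK")
    return times
```

For example, `go_back(keyevent, 3)` after `from airtest.core.api import keyevent` returns from three pages deep without a single screenshot to maintain.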

7. Recording is convenient, but mind the compatibility of the recorded screenshots

The IDE has a built-in recording feature that helps newcomers get started quickly with an image-recognition-based test framework. However, the automatically recorded screenshot statements do not always match what we actually need, so we should not rely on recording too much.

After recording, check which screenshots were poorly captured and re-capture them manually to improve the compatibility of the whole script.

8. When the screen is switching, use wait or sleep before clicking

A common beginner mistake is to write many click operations in a row. The screen changes after each click, and if the next click runs while the screen is still loading, it can easily hit the wrong position or time out during recognition.

For example, when we first enter the NetEase Cloud Music app and agree to the terms of service, a very long startup animation plays. Only after the animation finishes can we click "Experience now" to continue; otherwise the click is likely to time out while the animation is still playing:

In addition, to keep consecutive clicks working, we can put a sleep(1.0) between them as a buffer, which reduces the impact of screen switching on consecutive clicks.
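
The wait-then-click pattern can be sketched as follows. `touch_when_ready` is a hypothetical helper; `wait`, `touch`, and `sleep` are passed in so the sketch runs without a device, and are expected to be the functions of the same names from airtest.core.api. The 20-second timeout is an arbitrary value chosen for the example.

```python
def touch_when_ready(wait, touch, sleep, target, timeout=20, pause=1.0):
    """Wait for `target` to appear, tap it, then pause before the next step.

    `target` is the screenshot to look for (a Template in a real script).
    """
    # wait() keeps looking for the target until it appears; in Airtest it
    # raises an error if the target is still missing after `timeout` seconds.
    position = wait(target, timeout=timeout)
    # Tap the position where the target was found.
    touch(position)
    # Give the screen time to settle before the next click.
    sleep(pause)
    return position
```

In a real script the first three arguments would come from `from airtest.core.api import wait, touch, sleep`, and `target` would be the screenshot of the button that only appears once the startup animation has finished.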

9. Adjust the threshold appropriately

As mentioned above, the threshold acts as a filter on results. If we set it too low, wrong results are more likely to pass; if we set it too high, correct results whose confidence falls short may be filtered out, making it hard to get a valid recognition result.

So adjusting the threshold appropriately helps us filter for the results we want. For example, suppose a screenshot uses the default threshold of 0.7, and after many runs we find that it sometimes recognizes the wrong result. We can then raise the threshold and see whether the rate of correct recognition improves; if it does, the adjustment is effective.

In the IDE, we can double-click the screenshot to open the image editor and change its threshold on the right side:

After saving and closing the image editor, right-click in the script window and switch to code mode. The screenshot statement now has a parameter threshold=0.8:

    Template(r"….png", threshold=0.8, record_pos=(-0.021, 0.121), resolution=(900.0, 1600.0))

Of course, we can also set the global threshold:

    from airtest.core.settings import Settings as ST

    ST.THRESHOLD = 0.7  # default threshold for statements other than assertions

However, the methods above only apply to screenshot statements other than assertions. If you open the image editor from the screenshot in an assertion statement and change the threshold there, it will not take effect, because assertions use a separate threshold, which can only be set as follows:

    from airtest.core.settings import Settings as ST

    ST.THRESHOLD_STRICT = 0.7  # threshold used by assertion statements

10. Use custom statements (such as a screenshot list)

Custom syntax can also improve compatibility across devices with different aspect ratios, resolutions, and fonts. To do this, connect each device on which the script has compatibility problems, capture the corresponding screenshot there, and add it to a list of screenshots to search. The script looks like this:

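    from airtest.core.api import exists, touch

    # pic1, pic2, pic3: screenshots of the same target captured on different devices
    pic_list = [pic1, pic2, pic3]
    for pic in pic_list:
        pos = exists(pic)
        if pos:
            touch(pos)
            break  # stop as soon as any image in the list has been found and touched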

Note: without the break statement in the for loop, the loop runs through every image in the list (touching each one it finds) instead of stopping at the first match.

This also applies when we want to click any one of several icons.

11. If the project supports the Poco framework, use Poco statements instead of screenshots

If the project under test can use the Poco framework, we recommend mixing Airtest and Poco statements in your automation scripts for better compatibility.

For example, to select the first 10 songs in a NetEase Cloud Music playlist with screenshots, you would need 10 screenshots. With the Poco framework, a few lines that traverse the nodes are enough (taking the first 3 songs as an example):
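
A minimal sketch of such a traversal is shown below. It assumes `poco` is a Poco driver instance (for Android, for example, `AndroidUiautomationPoco()` from `poco.drivers.android.uiautomation`) and that the songs are direct children of a single list node. `click_first_items` is a hypothetical helper, and the node name must be looked up in the UI tree of the real app.

```python
from itertools import islice


def click_first_items(poco, list_name, count=3):
    """Click the first `count` child nodes of the UI node named `list_name`.

    Returns the number of nodes that were actually clicked, which is smaller
    than `count` when the list has fewer children.
    """
    clicked = 0
    # poco(name) selects a node; children() selects its direct children.
    for item in islice(poco(list_name).children(), count):
        item.click()
        clicked += 1
    return clicked
```

The same function covers the first 10 songs by passing `count=10`, with no extra screenshots. If clicking a song navigates to another page, the script would need to return to the list before the next click.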

Screenshots also have to be updated whenever a song name changes, whereas operating on a stable node clearly improves the compatibility of the script.

Summary

These are our tips for improving screenshot compatibility, and you may well come up with others in your own automation work. All of these tips were distilled from hands-on practice, so the more you practice, the more and better ideas you will have for solving such problems!


Airtest website: airtest.netease.com/
Airtest tutorial website: airtest.doc.io.netease.com/
Build an enterprise private cloud service: airlab.163.com/b2b