Author | Martin Schneider    Compiled by | Flin    Source | Medium

Visual regression testing is most commonly performed by comparing screenshots against baseline images. However, other aspects of visual testing are also worth discussing. We'll cover template matching (using OpenCV), layout testing (using Galen), and OCR (using Tesseract), and show how these tools can be integrated into existing Appium and Selenium tests.

We use Java (and the Java wrappers for OpenCV and Tesseract), but similar solutions can be implemented through other technology stacks.

This article is a companion to a short talk delivered (in an abridged form) at Taqelah Singapore in September 2020 and at the Selenium Conference 2020. For a demonstration of the full functionality and more details, see www.justtestlah.qa/

I hope this summary helps you choose the tools that have the most impact on your use cases and gives you some ideas on how to integrate them into your own toolbox.

Template matching

The task of template matching is to find the given image (template) on the current screen.

Where’s Waldo?

For mobile testing, Appium added this feature in its 1.9 release in the form of an image locator strategy. (More information can be found in the documentation and in an early tutorial.) The idea is to pass the template image to the WebDriver as a Base64-encoded string.

  • Early tutorial: appiumpro.com/editions/32…
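
Producing the Base64 string needs nothing beyond the JDK. The following is a minimal sketch; the class name TemplateEncoder is hypothetical and not part of Appium or JustTestLah!.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

/** Hypothetical helper (not part of Appium): turns a template image into a Base64 string. */
final class TemplateEncoder {

  private TemplateEncoder() {}

  /** Encodes raw image bytes using the standard Base64 alphabet. */
  static String encode(byte[] imageBytes) {
    return Base64.getEncoder().encodeToString(imageBytes);
  }

  /** Reads the image file at the given path and encodes its content. */
  static String encode(Path imageFile) throws IOException {
    return encode(Files.readAllBytes(imageFile));
  }
}
```

The string returned by encode(...) for a template file (for example a made-up path such as templates/button.png) is what the image locator expects as its argument.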

Using the image locator, you can interact with the resulting element just like any other WebElement. For example:

WebElement element = driver.findElementByImage(base64EncodedImageFile);
element.click();

or

By image = MobileBy.image(base64EncodedImageFile);
new WebDriverWait(driver, 10).until(ExpectedConditions.presenceOfElementLocated(image)).click();

The Appium developers implemented this functionality as part of the Appium server and use OpenCV for the actual image recognition (which makes OpenCV a dependency of the machine running the Appium server).

Interestingly, the flow between client and server is as follows:

  1. Request a screenshot from the Appium server.

  2. Both the screenshot and the template are sent to the Appium server for matching.

This doesn't feel ideal, especially if we want to match multiple templates on the same screen.

When I first implemented template matching in 2018 (not knowing that the Appium team was already working on it at the time), I also chose OpenCV but ran it on the client side. Using the OpenCV Java wrapper, the core of my code looks as follows:

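// image is the screenshot and templ the template (both OpenCV Mat objects).
// The result matrix has one cell per possible position of the template.
int resultCols = image.cols() - templ.cols() + 1;
int resultRows = image.rows() - templ.rows() + 1;
Mat result = new Mat(resultRows, resultCols, CvType.CV_32FC1);
Imgproc.matchTemplate(image, templ, result, Imgproc.TM_CCOEFF_NORMED);
// For TM_CCOEFF_NORMED the best match is the maximum of the result matrix.
MinMaxLocResult match = Core.minMaxLoc(result);
if (match.maxVal >= threshold) {
  // found: match.maxLoc is the top-left corner of the matched region
}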
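
To make the score behind TM_CCOEFF_NORMED more tangible, here is a brute-force sketch in plain Java. It is an illustration of the normalised correlation coefficient only, not OpenCV's implementation (which is far faster); the class and method names are hypothetical, and the inputs are assumed to be grey-scale intensity arrays.

```java
/** Illustrative only: brute-force normalised correlation on grey-scale arrays. */
final class TemplateMatchSketch {

  private TemplateMatchSketch() {}

  /** Returns {row, column, score} of the best-scoring template position. */
  static double[] bestMatch(double[][] image, double[][] templ) {
    int templRows = templ.length;
    int templCols = templ[0].length;
    // One result cell per position at which the template fits inside the image.
    int resultRows = image.length - templRows + 1;
    int resultCols = image[0].length - templCols + 1;
    double templMean = mean(templ, 0, 0, templRows, templCols);

    double[] best = {-1, -1, Double.NEGATIVE_INFINITY};
    for (int row = 0; row < resultRows; row++) {
      for (int col = 0; col < resultCols; col++) {
        double windowMean = mean(image, row, col, templRows, templCols);
        double numerator = 0;
        double templEnergy = 0;
        double windowEnergy = 0;
        for (int y = 0; y < templRows; y++) {
          for (int x = 0; x < templCols; x++) {
            double t = templ[y][x] - templMean;
            double w = image[row + y][col + x] - windowMean;
            numerator += t * w;
            templEnergy += t * t;
            windowEnergy += w * w;
          }
        }
        double denominator = Math.sqrt(templEnergy * windowEnergy);
        // A completely flat window (or template) has nothing to correlate with.
        double score = denominator == 0 ? 0 : numerator / denominator;
        if (score > best[2]) {
          best = new double[] {row, col, score};
        }
      }
    }
    return best;
  }

  /** Mean intensity of the rows x cols window whose top-left corner is (top, left). */
  private static double mean(double[][] data, int top, int left, int rows, int cols) {
    double sum = 0;
    for (int y = 0; y < rows; y++) {
      for (int x = 0; x < cols; x++) {
        sum += data[top + y][left + x];
      }
    }
    return sum / (rows * cols);
  }
}
```

The score lies between -1 and 1, and an exact copy of the template scores 1. This is why the match is accepted by comparing the maximum against a threshold slightly below 1.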

This approach does not require the additional requests to the Appium server mentioned above. In fact, it doesn't require any WebDriver features other than the ability to capture screenshots, so it works with both Selenium and Appium. That said, it also adds a dependency on OpenCV, this time on the machine running the tests.

I've wrapped both approaches (client-side and server-side matching) behind a TemplateMatcher interface to demonstrate their use (think of it as a proof of concept).

You can find more details and examples in the JustTestLah! documentation:

  • JTL test framework: justTestLah.qa /#template-m…

Layout testing

Another type of visual testing involves verifying the layout of a page or screen. You can do this by comparing images, which implicitly checks the layout as well. An easier approach is to use a dedicated layout testing tool like Galen (one of the most underrated UI testing frameworks out there, in my opinion).

Galen uses a per-screen specification that defines all the (important) elements on the screen, their sizes, and their absolute or relative positions.

Let’s take the Google search page as an example:

We can express it using the following specification:

SEARCH_FIELD:
   below LOGO
   centered horizontally inside viewport
   visible

LOGO:
   above SEARCH_FIELD
   centered horizontally inside viewport
   width < 100% of SEARCH_FIELD/width
   visible

SEARCH_BUTTON:
   near LUCKY_BUTTON 20px left
   visible


Note that this uses the syntax of the JustTestLah! framework (UI elements are referred to by the unique keys defined in the page object's YAML file). In pure Galen, these need to be defined at the top of the spec file:

@objects
    LOGO          id        hplogo
    SEARCH_FIELD  css       input[name=q]
    ...

There are several ways to perform these checks. I prefer a verify method in the abstract BasePage class:

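@SuppressWarnings("unchecked")
public T verify() {
  String baseName = this.getClass().getSimpleName();
  // Map the package name to a folder path by replacing each dot with the file separator.
  String baseFolder = this.getClass().getPackage().getName().replace('.', File.separatorChar);
  // The spec file is looked up per platform and named after the page object class.
  String specPath = baseFolder
              + File.separator
              + configuration.getPlatform()
              + File.separator
              + baseName
              + ".spec";
  galen.checkLayout(specPath, locators);
  // Returning the page object itself allows further calls to be chained.
  return (T) this;
}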
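
To see which file such a lookup resolves to, the path construction can be isolated into a small pure function. This is a sketch only: the class SpecPathSketch is hypothetical, and the platform value is assumed to be a plain string such as "web".

```java
import java.io.File;

/** Hypothetical helper: derives the spec file path for a page object class. */
final class SpecPathSketch {

  private SpecPathSketch() {}

  /** Builds package-folder / platform / ClassName.spec using the platform's file separator. */
  static String specPath(Class<?> pageClass, String platform) {
    String baseFolder = pageClass.getPackage().getName().replace('.', File.separatorChar);
    return String.join(
        File.separator, baseFolder, platform, pageClass.getSimpleName() + ".spec");
  }
}
```

For a page object qa.demo.pages.GooglePage (a made-up package name) and the platform "web", this yields qa/demo/pages/web/GooglePage.spec on systems that use a forward slash as separator.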

This way, we can easily trigger the validation from a test whenever we first interact with a screen (incidentally, I use a similar approach to integrate Applitools for visual testing):

public class GoogleSteps extends BaseSteps {
  private GooglePage google;

  @Given("I am on the homepage")
  public void homepage() {
    google.verify().someAction().nextAction();
  }
}
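
The chaining works because verify returns the concrete page type rather than the base class. Below is a minimal sketch of that pattern with hypothetical stand-in classes. The recursive bound on T is an assumption about how such a base class could be declared, and the layout check is replaced by a simple counter.

```java
/** Hypothetical stand-in for a base page; the layout check is replaced by a counter. */
abstract class FluentPage<T extends FluentPage<T>> {
  private int verifications;

  @SuppressWarnings("unchecked")
  T verify() {
    verifications++; // a real implementation would run the layout check here
    return (T) this; // safe as long as each subclass passes itself as T
  }

  int verifications() {
    return verifications;
  }
}

/** Hypothetical concrete page whose own methods remain available after verify(). */
final class SearchPage extends FluentPage<SearchPage> {
  private String query = "";

  SearchPage enterQuery(String text) {
    this.query = text;
    return this;
  }

  String query() {
    return query;
  }
}
```

With these classes, new SearchPage().verify().enterQuery("galen") compiles without a cast, because verify() is typed as returning SearchPage.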

Optical Character Recognition (OCR)

Another form of visual assertion is optical character recognition, or OCR. This feature is useful whenever text is rendered as an image for some reason and cannot be verified using standard testing tools.

This may also be of interest to those who use Selenium for web scraping rather than testing, since rendering text as images is one of the countermeasures web developers use to make scraping harder.

We use Tesseract (an OCR tool originally developed by HP in the 1980s and currently sponsored by Google).

Our example is not the most practical one, but it shows how well Tesseract copes with different types of fonts: we'll verify that the Google logo actually spells "Google":

public class GooglePage extends BasePage<GooglePage> {

  @Autowired private OCR ocr;

  ...

  public String getLogoText() {
    return ocr.getText($("LOGO"));
  }
}

public class GoogleSteps extends BaseSteps {
  private GooglePage google;

  ...

  @Then("the Google logo shows the correct text")
  public void checkLogo() {
    assertThat(google.getLogoText()).isEqualTo("Google");
  }
}
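
OCR output can contain stray whitespace or line breaks, so a strict equality check may be more brittle than necessary. The following is a small sketch of a more tolerant comparison. The class OcrTextSketch is hypothetical, and whether ignoring case and whitespace is acceptable depends on what the test is meant to verify.

```java
import java.util.Locale;

/** Hypothetical helper for comparing OCR output with an expected text. */
final class OcrTextSketch {

  private OcrTextSketch() {}

  /** Trims, collapses runs of whitespace to a single space, and lower-cases the text. */
  static String normalise(String ocrText) {
    if (ocrText == null) {
      return ""; // treat a failed recognition like empty text
    }
    return ocrText.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
  }

  /** Compares two texts after normalisation. */
  static boolean sameText(String ocrText, String expected) {
    return normalise(ocrText).equals(normalise(expected));
  }
}
```

An assertion could then check OcrTextSketch.sameText(google.getLogoText(), "Google") instead of comparing the raw strings.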

The OCR service used looks as follows:

public class OCR implements qa.justtestlah.stubs.OCR {

  private static final Logger LOG = LoggerFactory.getLogger(OCR.class);

  private TakesScreenshot driver;
  private Tesseract ocr;

  @Autowired
  public OCR(Tesseract ocr) {
    this.ocr = ocr;
  }

  /**
   * @param element {@link WebElement} element to perform OCR on
   * @return recognised text of the element
   */
  public String getText(WebElement element) {
    return getText(element.getScreenshotAs(OutputType.FILE));
  }

  /** @return all text recognised on the screen */
  public String getText() {
    return getText(getScreenshot());
  }

  private String getText(File file) {
    LOG.info("Performing OCR on file {}", file);
    try {
      return ocr.doOCR(file).trim();
    } catch (TesseractException exception) {
      LOG.warn("Error performing OCR", exception);
      return null;
    }
  }

  /**
   * Usage:
   *
   * <pre>
   * new OCR().withDriver(driver);
   * </pre>
   *
   * @param driver {@link WebDriver} to use for capturing screenshots
   * @return this
   */
  public OCR withDriver(WebDriver driver) {
    this.driver = (TakesScreenshot) driver;
    return this;
  }

  /**
   * Usage:
   *
   * <pre>
   * new OCR().withDriver(driver);
   * </pre>
   *
   * @param driver {@link TakesScreenshot} to use for capturing screenshots
   * @return this
   */
  public OCR withDriver(TakesScreenshot driver) {
    this.driver = driver;
    return this;
  }

  private File getScreenshot() {
    return driver.getScreenshotAs(OutputType.FILE);
  }

  @Override
  public void setDriver(WebDriver driver) {
    this.driver = (TakesScreenshot) driver;
  }
}

This requires Tesseract to be installed on the machine where the tests are running. For the complete source code and a demo, see the JustTestLah! test framework.

  • www.justtestlah.qa/

Original link: medium.com/better-prog…
