Two ways to test asynchronous systems

Internet software systems keep evolving to handle more complex business scenarios and higher performance requirements as demand and user numbers grow. Software can evolve in many ways, and making a system asynchronous is one of them.

In general, asynchronous execution suits tasks that have no strict real-time requirement but are computationally intensive, process large amounts of data over a long time, or involve slow I/O. At the system level, this can mean introducing message-oriented middleware to decouple the system and placing time-consuming tasks behind the middleware so that they run asynchronously. At the method level, it can mean moving long-running tasks onto other threads.

Testing an asynchronous system or method differs from testing a synchronous one. When we test an asynchronous system (end-to-end or integration testing) or an asynchronous method (unit testing), the thread running the asynchronous task does not block the test thread, so the test becomes hard to control and fails intermittently. Take a unit test as an example; an asynchronous test written like this is not stable:
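As a minimal sketch of the method-level case, the long-running task below is handed to another thread with `CompletableFuture`. The class name `AsyncMethodSketch` and the method `slowTask` are hypothetical stand-ins, not part of any system described in this article.

```java
import java.util.concurrent.CompletableFuture;

class AsyncMethodSketch {
    // Hypothetical slow task standing in for heavy computation or slow I/O.
    static int slowTask() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return 42;
    }

    // Returns immediately; the task itself runs on another thread.
    static CompletableFuture<Integer> slowTaskAsync() {
        return CompletableFuture.supplyAsync(AsyncMethodSketch::slowTask);
    }
}
```

The caller gets a future back right away and only blocks if it explicitly waits for the result, for example with `join()`.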

@Test
public void testAsynchronousMethod() {
    callAsynchronousMethod();
    assertXXX(...);  // The asynchronous task may still be incomplete, so the assertion may fail
}

There are two types of asynchronous tasks:

After the task finishes, the initiator or caller is made aware of it, for example through an event or notification
After the task finishes, the initiator or caller is not made aware of it, but some state in the system has changed

Testing asynchronous tasks is discussed along the same two categories. For the first, we can test by listening:

import org.junit.Before;
import org.junit.Test;

public class ExampleTest {
    private final Object lock = new Object();

    @Before
    public void init() {
        new Thread(new Runnable() {
            public void run() {
                synchronized (lock) {  // Acquire the lock
                    monitorEvent();    // Listen for the asynchronous event; returns when it arrives
                    lock.notifyAll();
                }                      // The event has arrived; leaving the block releases the lock
            }
        }).start();
    }

    @Test
    public void testAsynchronousMethod() {
        callAsynchronousMethod();  // Call an asynchronous method that takes a long time to complete and triggers an event notification

        /*
         * While the event has not arrived, the lock taken in init() is still held,
         * so the test thread blocks here. Once the event arrives and the lock is
         * released, the assertion runs. This assumes the listener thread has
         * already acquired the lock before the test thread gets here.
         */
        synchronized (lock) {
            assertTestResult();
        }
    }
}
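The lock-based listener depends on the listener thread acquiring the lock first, and it puts no upper bound on the wait. One possible variation, sketched here under assumptions, uses a `CountDownLatch` with a timeout instead. The class `LatchListeningSketch` and its `callAsynchronousMethod(Runnable)` signature are hypothetical; the callback stands in for whatever event notification the real system sends.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class LatchListeningSketch {
    // Hypothetical asynchronous method: works on another thread, then
    // invokes the callback, standing in for an event notification.
    static void callAsynchronousMethod(Runnable onDone) {
        new Thread(() -> {
            try {
                Thread.sleep(50); // simulate a long-running task
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            onDone.run();
        }).start();
    }

    // Returns true if the event arrived within the timeout, false otherwise.
    static boolean waitForEvent(long timeoutMillis) {
        CountDownLatch eventArrived = new CountDownLatch(1);
        callAsynchronousMethod(eventArrived::countDown);
        try {
            return eventArrived.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A test would assert that `waitForEvent(...)` returns true and then check the result; if the event never arrives, the test fails after the timeout instead of hanging.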

The assumption here is that the event notification will arrive and be picked up by the listener. But what if it never arrives, for example because the task fails with an exception? The test would wait forever. We can introduce a timeout mechanism into the test, which also leads to the way of testing the second type of asynchronous task, which can be called polling mode. Suppose we have the following asynchronous system: an application sends messages to the NSQ middleware, and the Job under test listens for the messages and processes them when they arrive.

The whole link from the application to the Job can be tested. The NSQ message may be constructed directly by the application, or produced by transforming the MySQL binlog. From an integration-testing perspective, we can narrow the scope and construct the NSQ message directly in the test, covering only the link from the message-oriented middleware to the Job. Testing the long link takes a long time and requires knowing the message-triggering logic of the specific application before writing the test. Writing such tests is also slow, which increases the testing cost. So for such a system, we can use the integration test approach:

@Test
public void testAsynchronousJob() throws Exception {
    String msg = buildNsqMsg();         // Construct the NSQ message
    nsqClient.send(TOPIC, msg, false);  // Send the NSQ message

    with().pollInterval(ONE_HUNDRED_MILLISECONDS).     // Poll every 100 ms
            and().with().pollDelay(10, MILLISECONDS).  // Wait 10 ms before the first poll
            await("description").  // Alias describing this wait
            atMost(1L, SECONDS).   // 1 s timeout
            until(() -> "changed".equals(xxxService.getState()));  // Business-specific assertion logic
}

For the test above we introduced the Awaitility library for polling. Proper polling has at least the following features:

A timeout mechanism, since you cannot poll forever
A delay before the first poll
A polling frequency
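To make these three features concrete, here is a minimal hand-rolled polling helper. It is only a sketch of the idea and not how Awaitility is implemented; the class `PollingSketch` and the method `pollUntil` are hypothetical names.

```java
import java.util.function.BooleanSupplier;

class PollingSketch {
    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition became true in time.
    static boolean pollUntil(BooleanSupplier condition,
                             long initialDelayMillis,
                             long intervalMillis,
                             long timeoutMillis) {
        long start = System.nanoTime();
        long timeoutNanos = timeoutMillis * 1_000_000L;
        try {
            Thread.sleep(initialDelayMillis);      // delay before the first poll
            while (true) {
                if (condition.getAsBoolean()) {
                    return true;
                }
                if (System.nanoTime() - start >= timeoutNanos) {
                    return false;                  // timeout: stop polling
                }
                Thread.sleep(intervalMillis);      // polling frequency
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A test would trigger the asynchronous task, call `pollUntil` with a business-specific condition, and fail if it returns false.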

Finally, let’s discuss the reliability of the test results.

Assume an asynchronous system is tested in polling mode. After the asynchronous task is triggered, suppose the system state jitters between two polls for some reason, changing and then changing back. The next poll then sees no change and may wrongly conclude that the asynchronous operation has not completed or has failed, which leads to a misjudged test result.

In contrast, listening mode has no such problem. As soon as the system state changes, the listening test senses it immediately and can produce a reliable result.

Many asynchronous systems do not provide callbacks. In that case, polling is the only way to test asynchronous tasks, and the reliability of the polling test depends on the reliability of the system under test.

However, even a reliable system suffers from the same problem if it executes periodically: the code runs periodically, the system state changes periodically, and the tests become unreliable. For such a system, we can modify it for testability. Separate the business logic from the periodic execution logic and add an entry point that invokes the business logic directly, such as a RESTful interface. The timing and frequency of the business logic can then be controlled precisely during testing, which makes reliable testing possible.

Consider two types of Job tested in polling mode. One type interacts with Elasticsearch, a search engine that refreshes data every 5 seconds; polling tests for it are limited by long test times unless the Elasticsearch refresh rate is increased. The other type interacts with MySQL and Redis. Tests for this kind of Job work very well and can finish within 150 milliseconds, so they can be included in the continuous integration build like normal tests.
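The difference between the two modes can be illustrated with a small deterministic simulation. Everything here is hypothetical: `ObservableState` stands in for a system whose state can be read by a poller or observed through a change listener, and the two `set` calls play the role of a state change that jitters back between two polls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class JitterSketch {
    // Hypothetical system under test: state can be polled or listened to.
    static class ObservableState {
        private String state = "pending";
        private final List<Consumer<String>> listeners = new ArrayList<>();

        void addListener(Consumer<String> listener) { listeners.add(listener); }
        String get() { return state; }
        void set(String newState) {
            state = newState;
            for (Consumer<String> listener : listeners) listener.accept(newState);
        }
    }

    // Polling samples the state only at discrete points in time.
    static List<String> observedByPolling() {
        ObservableState s = new ObservableState();
        List<String> seen = new ArrayList<>();
        seen.add(s.get());   // first poll
        s.set("changed");    // the state changes between two polls...
        s.set("pending");    // ...and jitters back
        seen.add(s.get());   // second poll
        return seen;
    }

    // A listener is notified of every change as it happens.
    static List<String> observedByListener() {
        ObservableState s = new ObservableState();
        List<String> seen = new ArrayList<>();
        s.addListener(seen::add);
        s.set("changed");
        s.set("pending");
        return seen;
    }
}
```

In this sketch the poller never sees the "changed" state because it only samples before and after the jitter, while the listener records both transitions.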
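One way to picture this separation is sketched below, under assumptions. The class `PeriodicJobSketch` is hypothetical: `runOnce()` holds the business logic, while `startSchedule()` only decides when it runs. The sketch exposes the entry point as a plain method call; a RESTful endpoint could delegate to the same method.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PeriodicJobSketch {
    private final AtomicInteger processed = new AtomicInteger();

    // Business logic: one unit of work, callable on its own.
    void runOnce() {
        processed.incrementAndGet();
    }

    int processedCount() {
        return processed.get();
    }

    // Periodic execution logic: only decides when runOnce() is called.
    // The caller is responsible for shutting the scheduler down.
    ScheduledExecutorService startSchedule(long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::runOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

A test never starts the schedule. It calls `runOnce()` exactly when it wants the logic to execute and then asserts on the resulting state, so the outcome no longer depends on when the timer fires.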
