Abstract: Developer testing is a vital part of modern software engineering. Agile development, trunk-based development, and other advanced project management methods and processes all depend on thorough developer testing.

1. "Developer Testing" Means "Developers Do the Testing"

Developer testing is a vital part of modern software engineering. Agile development, trunk-based development, and other advanced project management methods and processes depend on thorough developer testing. When releases ship monthly or even weekly, it is impossible to devote a large number of test engineers to large-scale system-level testing; most of the testing pyramid must be automated.

What is "developer testing"? More than ten years ago, my R&D team had a clear division between development and testing: developers wrote the code, test engineers did the testing, and much of the time the two sides faced off like rival camps. In today's software engineering, dedicated test engineers are rare. At many companies the developer-to-tester ratio exceeds 10:1, and some departments have no test engineers at all. The test engineer's role is no longer that of a manual-test-case "laborer" but that of managing the product's test system and owning its test planning and analysis: summarizing functional test mind maps, designing test cases, and leading the R&D team through testing. The role is closer to a "test expert" or "test coach".

For example, my previous product was an online video-conferencing collaboration product. Our daily online meetings ran on our own product, and we held stand-up meetings on a test site running the newly developed features. Besides spending a small amount of time on daily updates, the test expert led the team (including the PO, the architect, the SM, and the developers) through focused half-hour test sessions as planned. In other words, developers write not only UT, API, and IT; they also do system-level test development. So "developer testing" means "developers do the testing", and many traditional test engineers face three ways out: grow, transform, or be eliminated.

The "test expert" also carried real authority in the project. My previous company used trunk-based development with a "one in, one out" review, and the team's test expert had veto power in it. The company even had test experts at the Principal Engineer level (equivalent to our grade 20-21 technical experts).

  • One in: to decide whether a feature may enter the release branch, its feature toggle is turned on in the release branch and release-level testing is performed.

  • One out: when the engineering release is produced, a qualified feature's toggle is allowed to be switched on in production.
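The toggle mechanism behind this "one in, one out" gate can be sketched in a few lines. All names below are hypothetical illustrations, not Huawei's actual tooling:

```python
# Minimal feature-toggle sketch: a feature ships dark until its toggle
# is turned on, first in the release branch ("one in"), then in
# production once release-level testing passes ("one out").

class FeatureToggles:
    def __init__(self):
        self._flags = {}

    def enable(self, name):
        self._flags[name] = True

    def is_enabled(self, name):
        # Unknown features default to off, so unfinished code stays dark.
        return self._flags.get(name, False)


def checkout_flow(toggles):
    # New behavior is guarded by the toggle; the old flow is the fallback.
    if toggles.is_enabled("new-checkout"):
        return "new checkout flow"
    return "legacy checkout flow"
```

Until the toggle is enabled, callers keep getting the legacy path, so incomplete features can live safely on the trunk.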

2. There Is No Test That Cannot Be Automated

Returning to the testing pyramid: code-level white-box testing is extremely valuable when measured along the four dimensions of testing: "development cost", "execution cost", "test coverage", and "problem localization".

  • Development cost: the cost of implementing a test case.

  • Execution cost: the cost of running a test case.

  • Test coverage: line coverage and branch coverage.

  • Problem localization: how efficiently a test pinpoints the source of a problem.

Evaluating the testing pyramid along these four dimensions, we can conclude:

  • Run low-level tests (LLT) as often as possible: they are faster and more stable than the higher-level test types and can be executed many times a day. LLT is generally run in continuous-integration build tasks, or even on every merge request, to guarantee the quality of code entering the repository.

  • Run medium-scale IT, ST, and UI tests under automation: their execution is slower, they depend more heavily on the environment, and they are relatively unstable. They are usually run once or twice a day (typically at night) to periodically check code quality and report problems.

  • Do as little large-scale manual testing as possible: it is slower and less stable than LLT, its labor cost is higher, it cannot be executed many times a day, and each run takes a long time to produce feedback. However, manual tests align more closely with real user scenarios, so run them periodically or at critical milestones to ensure software quality.

Many companies now iterate on shorter release cycles, some as short as two weeks. Manual testing clearly does not fit this development pattern, and the only way forward is to automate manual test cases through various technical solutions. At the code level, as long as the architecture is reasonably designed, everything from the underlying business code up to the UI code can be unit tested. Even top-level UI interaction tests can run automatically (most UI frameworks support automated UI testing through their accessibility interfaces). I have seen Huawei's mobile hardware department automate even the extreme "drop the phone" test; why can't software do the same? Some of the industry's leading technology companies already deliver on a cycle measured in days, from code submission and merge request to product delivery. Testing such a product simply cannot be done manually.

3. Developer Testing "Benefits the Present" and "Wins the Future"

Many people think that low-level developer testing costs a lot of time and a lot of code just to verify that a feature is correct, and that the test code must be modified every time the feature or the code structure changes; debugging and validating manually feels far more efficient, and some developers even write tests mainly to satisfy metrics. In fact, verifying code through UT and API tests is not very different from stepping through it manually in a debugger, and it does secure the quality of the current project iteration. But its more important function lies elsewhere. Our bug categories often include terms like Build Regression bug and Release Regression bug.

  • Build Regression bug: a feature under development works in a previous build but has a bug in a new build.

  • Release Regression bug: a feature works in a previous release but has a bug in a new release.

Every time we deliver code into production, no one can guarantee it is 100% problem-free. In the rapid iterations of agile development, full manual functional testing is impossible, but developer testing, especially low-level UT, API testing, and integration testing, can easily catch this class of problem. Preventing future changes from breaking existing code is one of developer testing's most important functions. So developer testing "benefits the present" and "wins the future".
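As a toy illustration (a hypothetical function, not taken from any real product), a low-level unit test pins down today's behavior so that a future "harmless" change surfaces as a build regression in CI rather than in production:

```python
# A tiny pricing rule and the regression test that protects it.
# If a later refactoring accidentally changes the rounding or the
# discount threshold, the assertions fail immediately in CI.

def discounted_price(price, quantity):
    """10% discount on orders of 10 or more items."""
    total = price * quantity
    if quantity >= 10:
        total *= 0.9
    return round(total, 2)


def test_discounted_price():
    assert discounted_price(5.0, 1) == 5.0     # no discount below threshold
    assert discounted_price(5.0, 10) == 45.0   # discount kicks in at 10
    assert discounted_price(3.33, 3) == 9.99   # rounding behavior pinned
```

Run on every build, this test turns a silent behavior change into an immediate, precisely located failure.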

4. "Grayscale" TDD: Don't Force Writing Test Code First

With TDD, the conventional wisdom is to write the test code first and then the implementation code. That is both right and wrong: conceptually correct, but not necessarily the most efficient when followed strictly, which is one reason TDD is so hard to roll out. We can divide coding into three activities: implementation code, test code, and debugging. TDD's concept is test, code, debug, in that order. But when we begin implementing, it is impossible to think everything through up front and pin the interface definitions down precisely; if we strictly follow test, code, debug, the test code will be rewritten constantly as the code changes. That is not a big problem in itself, and in practice many people are used to building the code framework and the test framework first, then coding and testing, and only debugging once the tests are complete. So, from the perspective of Huawei's "grayscale" (pragmatic, non-absolute) management philosophy, as long as the unit tests come before debugging, it can be called TDD. BTW, BDD is in vogue now; my point here is that if a team cannot even do TDD as described above, it should not consider BDD at all.
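The test-before-debug order can be shown with a minimal red-green step (a hypothetical example): the test below is written first and initially fails, then the implementation is filled in until it passes; debugging, if any, comes only after that.

```python
# TDD in miniature: the test was written first (red), then the
# implementation below it was written just large enough to pass (green).

def test_slugify():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Trim  me  ") == "trim-me"


def slugify(title):
    # Written after the test above, with only enough logic to satisfy it.
    return "-".join(title.lower().split())
```

Note that the test fixes the interface (`slugify(title) -> str`) before any implementation detail exists, which is exactly the design pressure TDD is meant to apply.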

Behavior-Driven Development: BDD combines the general techniques and principles of TDD with ideas from Domain-Driven Design (DDD). BDD is a design activity in which you gradually build blocks of functionality based on expected behavior. It focuses on the language and interactions used during software development: behavior-driven developers describe the purpose and benefit of their code in natural language combined with the language of domain-driven design. Teams using BDD can produce extensive "functional documentation" in the form of user stories with executable scenarios or examples attached. BDD often helps domain experts understand the implementation rather than exposing code-level tests. Scenarios are usually written in the GWT format: Given, When, Then.
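A GWT scenario does not require a BDD framework to express; the sketch below (a hypothetical banking domain) mirrors the Given-When-Then structure in plain test code:

```python
# Given-When-Then expressed directly in a plain test function.
# Feature: an account cannot be overdrawn.

class Account:
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            return False          # withdrawal refused
        self.balance -= amount
        return True


def test_cannot_overdraw():
    # GIVEN an account with a balance of 100
    account = Account(balance=100)
    # WHEN the holder tries to withdraw 150
    ok = account.withdraw(150)
    # THEN the withdrawal is refused and the balance is unchanged
    assert ok is False
    assert account.balance == 100
```

Tools such as Cucumber or behave lift the same Given/When/Then text into executable specifications that domain experts can read directly.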

5. 100% UT Coverage Is Actually Bad

In unit testing, everyone focuses on one metric: coverage. Whether module, function, line, or branch coverage, a certain percentage is always required. But driving each of them to 100% deserves a poor rating. It is not that you cannot do it; it is a question of cost and return. Take branch coverage, the hardest to achieve: if every memory-allocation failure or fault-tolerance branch must be exercised to reach 100%, your test cases may well double in number while adding no value. Some conditional branches are never even executed during the program's lifetime.

  • Module coverage: business-module code is covered by UT, and architecture-module code by IT; from a UT-coverage point of view, there is no need to unit test architecture code.

  • Function coverage: don't write UT for code that has no logic. For example, some functions are simple attribute getters/setters whose implementation merely assigns a variable; a UT for such a function exists only to raise the coverage number and has no real meaning.

  • Line coverage: generally speaking, 80% line coverage is a reasonable target; some code can stay at 0% and some needs 100%. Pushing all code above 90% is high cost for low value and is not recommended.

  • Branch coverage: the more complex the business logic, the more test cases are needed to cover it; but some branches, such as memory-allocation failure handling, can be judged correct without tests.
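The kind of defensive branch not worth chasing for coverage looks like the sketch below (illustrative; the `# pragma: no cover` marker is the exclusion syntax used by Coverage.py, the common Python coverage tool):

```python
# A defensive branch that is realistically never taken. Forcing a test
# to exercise it (e.g. by simulating an allocation failure) costs more
# than the confidence it buys; coverage tools let you exclude such
# lines from the metric instead.

def load_config(read_file):
    """read_file is injected, so the happy path is trivially testable."""
    try:
        return read_file("app.conf")
    except MemoryError:  # pragma: no cover  (defensive, untestable branch)
        # Out-of-memory while reading a tiny config file: practically
        # unreachable; fail safe with defaults rather than crash.
        return {"mode": "default"}
```

The happy path gets a real test; the excluded branch is documented as a deliberate trade-off rather than dragging the branch-coverage number down.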

6. Use Testing to Drive Architecture and Code Quality

When we talk about test-driven architecture and code quality, we are talking about making your code fully testable. What is code testability? Simply put: decouple classes and modules from one another by programming to interfaces, and pass dependencies in through injection rather than fetching them actively. When the program runs in production, the injected interface parameters are real business objects; in tests, you can pass in a mock or fake implementation. Of course, not every dependency needs this treatment: business-neutral utility libraries, or certain plain data-object implementations, can be called directly.
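Interface decoupling plus passive injection can be sketched as follows (hypothetical names; a clock makes the production-versus-test swap especially clear):

```python
# Programming to an interface with passive injection: the business code
# receives its clock dependency instead of fetching it, so production
# injects the real clock and tests inject a deterministic fake.

import datetime


class RealClock:
    def now(self):
        return datetime.datetime.now()


class FakeClock:
    def __init__(self, fixed):
        self._fixed = fixed

    def now(self):
        return self._fixed        # deterministic for tests


def greeting(clock):
    # Depends only on the clock interface, never on the wall clock itself.
    hour = clock.now().hour
    return "good morning" if hour < 12 else "good afternoon"
```

Production code calls `greeting(RealClock())`; a test can freeze time with `greeting(FakeClock(datetime.datetime(2024, 1, 1, 9, 0)))` and get a repeatable result.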

Fake and mock have now both come up; they belong to the family of Test Doubles. Each of the following has its own precise meaning, which you can look up online:

  • Dummy

  • Stub

  • Spy

  • Mock Object

  • Fake Object
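To make the distinctions concrete, here is a minimal sketch of three of these doubles for one hypothetical "mailer" dependency:

```python
# Three test doubles standing in for the same mailer dependency.

class StubMailer:
    """Stub: returns canned answers, records nothing."""
    def send(self, to, body):
        return True


class SpyMailer:
    """Spy: a stub that also records how it was called."""
    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))
        return True


class FakeMailer:
    """Fake: a genuinely working lightweight implementation
    (an in-memory outbox instead of a real SMTP server)."""
    def __init__(self):
        self.outbox = {}

    def send(self, to, body):
        self.outbox.setdefault(to, []).append(body)
        return True


def notify(mailer, user):
    # The business code depends only on the mailer interface.
    return mailer.send(user, "your build passed")
```

A stub just keeps the code running, a spy lets the test verify interactions afterwards, and a fake behaves like the real thing, which is why fakes are the most reusable of the three.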

At present, most of our developers use Mock Objects (in fact, many of them are really stubs whose return values are controlled by input parameters). Remember DHH's (David Heinemeier Hansson's) "TDD is Dead. Long Live Testing" from a few years back, where the excessive use of mocks/stubs in TDD led to architectural problems; Kent Beck, the father of TDD, says he almost never mocks. Although you can mock your code, the very need to mock usually means your code is more tightly coupled, your module has more explicit dependencies, and your module is less portable, especially in C programming. As a result, many modules today cannot be unit tested at all and rely on integration testing instead.

Why does this happen? Our senior architects focus on system-level architecture design and usually define the relationships between system modules and applications quite clearly. But the design and implementation of the concrete application business is left to lower-level architects. The amount of code inside these modules is not small; many contain hundreds of thousands or even millions of lines. At this point, the architect's skill determines the Clean Code quality of the code. Many of our company's current code problems are not system-architecture problems but problems of concrete business implementation, lacking strict requirements and sound architectural design. If an architectural discipline were imposed at the application level, module interfaces and module-to-module interactions could be made at least as clear as in the system design. The part that cannot be dictated from above is the thousands of lines of code inside each submodule.

A typical example came up in a recent technical review of a project. One team had written a socket library that depended on a platform-specific system library; porting the socket library to Linux or another platform would require a major refactoring. The refactoring approach is easy to understand: use the adapter pattern to abstract the underlying operations into an interface (for this socket library, the underlying system library is one kind of dependency; different scenarios should treat underlying libraries differently, so don't over-generalize), so that the calling code never touches any specific platform's implementation, and the problem is solved by providing an adapter for each platform. But in the early stage of development, while building the code framework and test framework, the team could already have discovered that the underlying library was a point of coupling, because the tests had to mock it to create a test double. Had a decoupled design been considered at that point, the architecture would naturally have supported different platforms without any later refactoring.
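The adapter refactoring described above can be sketched like this (the socket details are simplified to illustration level; names are hypothetical):

```python
# Adapter pattern for the socket library: the library programs against
# a small transport interface, and each platform supplies its own adapter.

class Transport:
    """The abstraction the socket library depends on."""
    def send(self, data):
        raise NotImplementedError


class LinuxTransport(Transport):
    def send(self, data):
        # A real adapter would call the Linux system library here.
        return f"linux sent {len(data)} bytes"


class InMemoryTransport(Transport):
    """Test double: just another adapter, so no mocking is needed."""
    def __init__(self):
        self.buffer = []

    def send(self, data):
        self.buffer.append(data)
        return f"buffered {len(data)} bytes"


class SocketLibrary:
    def __init__(self, transport):
        self._transport = transport    # injected, platform-agnostic

    def send_message(self, message):
        return self._transport.send(message.encode())
```

Porting now means writing one new adapter, and the test double falls out of the same design for free, which is exactly the point of letting testability drive the architecture.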

The reason for proposing test-driven architecture and code quality is that when a high bar is set for testing (some earlier projects explicitly prohibited the use of Mocks, with Catch2 as the only test framework), we are forced to solve the testing problems through the architecture. Once the testing problems are solved, a Clean Code L3 code architecture follows naturally.

7. From "I Write the Test Doubles for My Dependencies" to "I Provide the Test Doubles for My Dependents"

Strange as it may sound, this is a fundamental way to solve low-level developer testing at the root. No matter how sound the architecture is, dependencies between modules cannot be eliminated, whether they are handled through Mock or Fake; what we can do is decouple them through wise design. The first mindset, "I write the test doubles for my dependencies", means: when I implement my module, I write the test code, and I must also work out how to satisfy my test dependencies myself. Problems often followed. Suppose modules A1, A2, and A3 all depend on modules B1 and B2. Normally, each of the teams or individuals testing A1, A2, and A3 would write their own doubles for B1 and B2, producing duplicated test code; and if the module design is unreasonable and the dependencies too numerous, the cost of unit testing becomes too high. The second mindset, "I provide the test doubles for my dependents", means: when I implement my code, I also consider how the modules that depend on me will resolve that dependency in their tests, and I provide the Fake objects and implementations to help them. Then, when A1, A2, and A3 are tested, the doubles for B1 and B2 already exist, and the testers can focus directly on the test cases themselves. Specifically:

  • Shift the mindset to test-driven: when developing a module, don't think first about how to test it yourself; think about how to make testing easier for the modules that depend on you. The module's provider supplies not only the module code but also a reusable Fake object (supporting call verification, return values, parameter verification, parameter processing, behavior simulation, and so on).

  • The author of the module code implements its own Fake. Most of the fake's code is written once by the module's author and is reusable; dependent modules add only the small amount of code their particular business needs. This roughly follows the 80/20 rule.

  • Decouple dependencies architecturally and program to interfaces through dependency injection. In production, modules use the real implementation objects; in developer tests, they use the Fake objects.

  • By the time the test code is written, all the dependency interfaces are already in place, and the focus falls on the test cases themselves rather than on the test dependencies.
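The A1/B1 arrangement above can be sketched as follows: module B1's author ships a reusable fake alongside the real implementation, so every dependent tests against it without writing its own double (all names are hypothetical):

```python
# Module B1 ships both its real implementation and a reusable Fake;
# dependents such as A1 inject whichever one they need.

class B1:
    """Real implementation: would hit a live database in production."""
    def lookup(self, key):
        raise RuntimeError("needs a live database")


class FakeB1:
    """Shipped by B1's author: in-memory and reusable by every dependent."""
    def __init__(self, data=None):
        self.data = dict(data or {})
        self.calls = []                  # call verification for free

    def lookup(self, key):
        self.calls.append(key)
        return self.data.get(key)


class A1:
    def __init__(self, b1):
        self._b1 = b1                    # injected dependency

    def describe(self, key):
        value = self._b1.lookup(key)
        return f"{key} -> {value}"
```

A1's tests start from `A1(FakeB1({...}))` and concentrate entirely on A1's own cases; the effort of faking B1 is paid once, by the person who knows B1 best.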
