In unit testing, code coverage is often used as a measure of how good the tests are, or even of how complete the testing effort is. But I’m sure you’re not asking for coverage just for coverage’s sake. You need meaningful coverage that shows you have tested the software well.

Issues related to measuring code coverage have always caught my eye. On the one hand, I often find that companies and organizations don’t actually know how much code they cover during testing, which is really surprising! On the other hand, for some organizations the code coverage number is so important that the quality and effectiveness of testing become almost irrelevant. They blindly chase 100 percent code coverage and believe that if they hit that number, the software must be excellent. In fact, this is just as dangerous as not knowing what you’re testing, and may actually be more dangerous because it gives you a false sense of security.

Code coverage is a nice, interesting number for assessing software quality, but it’s important to remember that it’s a means, not an end. We don’t want coverage for coverage’s sake; we want coverage that shows we’re doing a good job of testing the software. If the tests themselves are meaningless, then higher coverage doesn’t mean better software. The real goal is to ensure that the code is actually tested, not just executed. There is rarely enough time or money to test everything, so at the very least make sure that everything that matters is tested.

That said, low coverage does mean we may be undertesting, while high coverage on its own is not necessarily a sign of high quality. The relationship is more complicated than that.

Obviously, the best state of affairs is a testing practice that gives you “enough” coverage to release the software, backed by a good, stable, maintainable test suite with “enough” tests. In practice, however, two coverage traps remain common.

Trap 1: “We don’t know our coverage”

There is no good excuse for not knowing your code coverage: coverage tools are cheap and plentiful. A friend of mine suggested that developers often already know their coverage is poor, so developers and testers simply don’t want to expose that poor number to management. I certainly hope that isn’t the norm.

One practical problem teams run into when trying to measure coverage is that the system is simply too complex. When an application is built up piece by piece, even knowing where to place the coverage counters can be a daunting task. If it’s genuinely hard to measure your application’s coverage, I suggest you take a second look at your architecture.

The second way to fall into this trap occurs in organizations that run plenty of tests but still have no real coverage number, because they haven’t found a proper way to aggregate the numbers from different test runs. If you’re doing manual, functional, unit, and end-to-end testing, you can’t simply add the numbers up. Even if each achieves 25% coverage, the combination is very unlikely to reach 100%; since the different runs tend to exercise much of the same code, the merged figure usually ends up far closer to 25% than to 100%.
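To see why the numbers don’t simply add up, here is a minimal, purely hypothetical sketch (not how any particular tool aggregates coverage internally): three test runs each cover 25% of a 1,000-line application, but they overlap heavily, so the honest combined figure is the union of the covered lines rather than the sum of the percentages.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration: coverage percentages from separate runs cannot be added.
// Each run reports which of the application's 1,000 lines it executed.
public class CoverageMerge {

    static double percent(Set<Integer> covered, int totalLines) {
        return 100.0 * covered.size() / totalLines;
    }

    static Set<Integer> linesBetween(int from, int to) {
        Set<Integer> lines = new HashSet<>();
        for (int i = from; i < to; i++) {
            lines.add(i);
        }
        return lines;
    }

    public static void main(String[] args) {
        int totalLines = 1_000;

        // Three runs (say unit, functional, and end-to-end), each covering 25%,
        // but mostly the same "easy" lines near the entry points.
        Set<Integer> unitRun = linesBetween(0, 250);
        Set<Integer> functionalRun = linesBetween(50, 300);
        Set<Integer> endToEndRun = linesBetween(100, 350);

        // Naive addition suggests 75% ...
        double naive = percent(unitRun, totalLines)
                + percent(functionalRun, totalLines)
                + percent(endToEndRun, totalLines);

        // ... but the honest number is the union of the lines actually covered.
        Set<Integer> merged = new HashSet<>(unitRun);
        merged.addAll(functionalRun);
        merged.addAll(endToEndRun);

        System.out.printf("Naive sum:      %.0f%%%n", naive);                        // 75%
        System.out.printf("Merged (union): %.0f%%%n", percent(merged, totalLines));  // 35%
    }
}
```

The principle holds no matter how the merging is done in practice: overlapping coverage has to be merged, not added.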

It turns out there is a way to measure and aggregate coverage meaningfully. The fine-grained data captured by the Parasoft DTP reporting and analysis tool can provide a comprehensive, aggregated view of code coverage. An application monitor collects coverage data directly from the application while it is under test and sends it to Parasoft DTP, which merges the coverage data from every test practice, test team, and test run.

If that sounds like a lot of information, you’re right! DTP provides an interactive dashboard to help you browse this data and make decisions about where to focus your testing. See the sample dashboard below:

If multiple tests cover the same code, that code is not double-counted, and the untested parts of the code are quick and easy to spot. This shows you which parts of the application are well tested and which are not. You can read all about it in the free white paper.

So there is no more excuse for not measuring coverage.

Trap 2: “Coverage is everything!”

There is a common misconception that coverage is everything. Once a team is able to measure coverage, leadership often says, “Let’s get that number up.” Eventually the number itself becomes more important than the tests. As Parasoft founder Adam Kolawa put it:

“It’s like asking a pianist to achieve 100 percent coverage of the piano keys rather than to hit only the keys the score calls for. When he plays the piece, whatever key coverage that produces is the coverage that makes sense.”

That’s the problem: blind coverage is like blindly hitting keys. Coverage needs to reflect real, meaningful use of the code; otherwise it’s just noise.

Speaking of “noise”… the cost of coverage rises as coverage rises. Keep in mind that you not only have to create tests, you also have to maintain them. If you aren’t going to reuse and maintain a test, you probably shouldn’t waste time creating it in the first place. As the test suite grows, the amount of “noise” grows in unexpected ways: twice as many tests can mean two or even three times as much “noise.” Meaningless tests end up producing more “noise” than good tests, because they have no real context yet still have to be dealt with every time the suite runs. Think technical debt! Useless tests are a real danger.

Now, in certain industries, such as safety-critical ones, a 100% coverage target is mandatory. But even then, it’s all too easy to treat any execution of a line of code as a meaningful test of that line, which simply isn’t true. I ask two basic questions to determine whether a test is a good test:

  • What does a test failure mean?
  • What does passing the test mean?

Ideally, when a test fails we know what went wrong, and if the test is really good, it points us in the right direction for fixing it. All too often, when a test fails nobody knows why, nobody can reproduce it, and the test is simply ignored. Likewise, when a test passes we should know exactly what was verified, meaning that a particular feature or behavior works.

If you can’t answer one of these questions, there is probably a problem with your test. If you can’t answer either of them, the test is likely causing more trouble than it’s worth.
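To make the contrast concrete, here is a small hypothetical JUnit 5 example; the DiscountCalculator class and the test names are invented purely for illustration. The first test executes every line and branch yet asserts nothing, so neither question has an answer. The other two tests answer both questions clearly.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

import org.junit.jupiter.api.Test;

// Hypothetical class under test, used only for illustration.
class DiscountCalculator {
    double apply(double price, double discountPercent) {
        if (discountPercent < 0 || discountPercent > 100) {
            throw new IllegalArgumentException("discount must be between 0 and 100");
        }
        return price * (1 - discountPercent / 100);
    }
}

class DiscountCalculatorTest {

    // Executes every line and branch, but a pass tells you nothing and a failure
    // is practically impossible: the code is covered, not tested.
    @Test
    void coverageOnly() {
        DiscountCalculator calc = new DiscountCalculator();
        calc.apply(100.0, 10.0);                        // happy path, nothing asserted
        try {
            calc.apply(100.0, 200.0);                   // error path, exception swallowed
        } catch (IllegalArgumentException ignored) { }  // both branches "covered"
    }

    // A failure means the discount math is wrong; a pass means a 10% discount works.
    @Test
    void tenPercentDiscountReducesPrice() {
        assertEquals(90.0, new DiscountCalculator().apply(100.0, 10.0), 0.0001);
    }

    // A failure means invalid input is silently accepted; a pass means it is rejected.
    @Test
    void discountAboveOneHundredPercentIsRejected() {
        assertThrows(IllegalArgumentException.class,
                () -> new DiscountCalculator().apply(100.0, 200.0));
    }
}
```

The first test alone reaches the same coverage as the other two combined, yet only the latter two contribute anything to quality.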

The way out of this trap is first to understand that coverage is not a goal in itself. The real goal is to create useful, meaningful tests. That takes time. For simple code, unit testing is easy, but in a complex real-world application it means writing stubs and mocks and working with a mocking framework. That can take a lot of time, and if you don’t do it constantly, it’s easy to forget the nuances of the APIs involved. Even if you take testing seriously, creating really good tests can take longer than you expect.
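As a rough illustration of the stub-and-mock effort involved, here is a minimal sketch using JUnit 5 and Mockito. The OrderRepository, Order, and InvoiceService types are hypothetical and exist only to show the common pattern of isolating the class under test from its dependencies.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import java.util.Optional;
import org.junit.jupiter.api.Test;

// Hypothetical dependency that would normally hit a database.
interface OrderRepository {
    Optional<Order> findById(long id);
}

// Hypothetical domain object.
record Order(long id, double total) { }

// Hypothetical class under test.
class InvoiceService {
    private final OrderRepository repository;

    InvoiceService(OrderRepository repository) {
        this.repository = repository;
    }

    double invoiceTotalWithTax(long orderId, double taxRate) {
        Order order = repository.findById(orderId)
                .orElseThrow(() -> new IllegalArgumentException("unknown order " + orderId));
        return order.total() * (1 + taxRate);
    }
}

class InvoiceServiceTest {

    @Test
    void addsTaxToStoredOrderTotal() {
        // Mock the repository so the test needs no database and stays fast and repeatable.
        OrderRepository repository = mock(OrderRepository.class);
        when(repository.findById(42L)).thenReturn(Optional.of(new Order(42L, 100.0)));

        InvoiceService service = new InvoiceService(repository);

        assertEquals(110.0, service.invoiceTotalWithTax(42L, 0.10), 0.0001);
        verify(repository).findById(42L);
    }
}
```

Even in this toy case, the mock setup is about as long as the logic being tested; in a real application that ratio only gets worse, which is exactly where tool support pays off.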

To make unit testing more efficient, Parasoft Jtest, a Java development testing tool, includes a technology built for exactly this purpose: the Unit Test Assistant. The Unit Test Assistant takes on the tedious work of getting mocks and stubs right. It also helps you extend existing tests efficiently to increase coverage, guiding you toward good unit tests and making recommendations to improve both test coverage and test quality.

Hopefully, you’ve learned that coverage is important and that improving coverage is a worthwhile goal. But keep in mind that simply chasing percentages is not as valuable as writing stable, maintainable, and meaningful tests.