Saturday, May 16, 2009

Don't Think - Just Write Tests, Right?

I gave a brief talk on Continuous Integration at the local Java User Group a few weeks ago, and I happened to mention that although I value automated regression testing very much, I do not think it gives you any real guarantee that your software is error-free.

I think this provoked some in the audience a bit - I got questions along the lines of "why do you hate unit testing so much?" and I realized that I had unwittingly stepped on some toes there. Sorry guys, I hope we are still friends.

Yes, I am a sceptic but for the record: I don't hate unit testing and I certainly do not think that people who do unit testing are idiots.

What I do have a problem with is when people set up their test suite, complete with a red light for errors and a green light for no errors - and then conclude the following:
  • Red light: there is a bug in the application code
  • Green light: there are no bugs in the application code

I think this assumption is incredibly naive, and I will try to explain why.

Your test system consists of two components: the application code and the test code itself (JUnit test cases, mocking etc.). Software bugs can be present in either component. This should be a known fact to anyone who has been working seriously with automated testing.

Consequently, a red light means you have either
  • A bug in your application code
  • OR a bug in your test code!

I admit that this is very useful information and it also gives you an idea about where the problem lies - it is either in the code being tested or in the test itself.
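To make the red-light case concrete, here is a minimal JUnit 4 sketch (the PriceCalculator class and its numbers are invented for this example). The application code is actually correct; the bug sits in the test, which asserts the wrong expected value - and the light still turns red.

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class PriceCalculatorTest {

        // Application code under test. It is correct: it applies a 10% discount.
        static class PriceCalculator {
            double discounted(double amount) {
                return amount * 0.90;
            }
        }

        @Test
        public void appliesTenPercentDiscount() {
            // Bug in the TEST: the expected value is wrong (110.0 instead of 90.0),
            // so this test fails even though the application code is fine.
            assertEquals(110.0, new PriceCalculator().discounted(100.0), 0.001);
        }
    }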

A green light, on the other hand, is much more tricky. Really, all it tells you is that

  1. You have no bugs anywhere...
  2. OR .. you have a bug in your application code but the test that was supposed to cover it is faulty as well!
  3. OR .. you have a bug in your application code that is not covered by a test

When we see the green light we want to believe that we are dealing with case #1 - no bugs. But we really have no way of knowing this for sure.

Case #2 is fairly common in large test systems and is generally detected by someone else down the line - either QA or the end user. How do you know that your test code is error-free anyway? Are you writing unit tests to test your unit tests? And if so, are you also testing the test code that is testing the original test code...?
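Here is an equally contrived JUnit 4 sketch of case #2 (the ShippingCost class and its business rule are made up for illustration): the application code has a bug, and so does the test that was supposed to catch it - yet the suite stays green.

    import org.junit.Test;
    import static org.junit.Assert.assertTrue;

    public class ShippingCostTest {

        // Application code under test. It is MEANT to charge 5.00 for orders
        // under 100.00 and nothing for larger ones, but the branches are
        // swapped - this is the application bug.
        static class ShippingCost {
            double forOrderTotal(double total) {
                return total < 100.00 ? 0.00 : 5.00;
            }
        }

        @Test
        public void smallOrdersAreChargedShipping() {
            // The test is faulty as well: it only checks that the charge is
            // non-negative, which the buggy code happily satisfies.
            // Green light, two bugs.
            assertTrue(new ShippingCost().forOrderTotal(50.00) >= 0.00);
        }
    }

A test like that keeps the light green right up until QA or a customer notices the swapped branches.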

This brings us to case #3, which has to do with test coverage. Yes, you can minimize the case #3 bugs by writing more tests. There is a catch, however: all the new test code you write could potentially have bugs, thus introducing more of the case #2 scenarios.

If you have ten lines of test code for every line of application code, it means there are roughly ten times as many opportunities for bugs in your test code as in your application code. There is some powerful mathematics working against us in our attempt to achieve full, error-free test coverage.
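To put a purely illustrative number on that: assuming a constant defect rate of, say, one bug per 200 lines regardless of whether the line is application or test code, 1,000 lines of application code would carry about 5 expected bugs, while the accompanying 10,000 lines of test code would carry about 50 - ten times as many places for a case #2 scenario or a false red light to hide.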

Integration testing is a way to test more lines of application code with fewer lines of test code, but the same principle applies: as your test suite grows, so does the likelihood of having bugs in the test suite.

All this doesn't mean that you should stop doing automated testing. What it does mean is that you should look at your test results with a certain amount of scepticism. A failed test means you have a problem - either with your application code or with your tests. No failed tests just means that any problems that remain are still undetected. And the way to solve these problems may not be to just blindly write more test code. Don't fire your QA people just yet.