Unit Test Coverage

S.Lott writes in his blog about unit test code-coverage: how much is enough?

Effective tests should account not only for code paths, but also input values and other application state or external environment that may affect the behavior.

For example, it may be easy to get 100% code coverage from tests for a function like the following:

divide(x, y) { return x/y; }

But unless you test for division-by-zero (when the parameter y is zero), you haven’t tested sufficiently.

The code-coverage metric doesn’t reveal whether you’ve tested a good variety of input values. It only measures whether your tests have visited the given lines of code, not what values were in each variable at the time.
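To make this concrete, here is a minimal sketch in Python. The language is my choice for illustration, since the divide function above isn't tied to any one language, and the test names are made up. The first test alone executes every line of divide, so a line-coverage tool would report 100% for it. The second test adds no coverage at all, yet it is the one that probes the dangerous input.

```python
def divide(x, y):
    """Python rendering of the divide function from the text."""
    return x / y


def test_typical_values():
    # This single call runs every line of divide, so line coverage
    # is already complete after this test.
    assert divide(6, 3) == 2


def test_zero_divisor():
    # Same lines, different input value. In Python, dividing by zero
    # raises ZeroDivisionError; other languages may behave differently.
    try:
        divide(1, 0)
    except ZeroDivisionError:
        return True
    return False
```

Both tests visit exactly the same lines, so the coverage report can't tell you whether the second one exists.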

The same goes for application state other than input parameters. Values in other application objects, the contents of databases or files, and the operating system environment can all affect the proper functioning of a class or function that you’re testing. These variations are not measured by code-coverage metrics.
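Here is a small sketch of the same problem with external state, again in Python. Everything in it is hypothetical: the report_path helper, the REPORT_DIR environment variable, and the file name are invented for illustration. Every call below runs the same lines of report_path, so coverage is identical in all three cases, but the result depends on what the environment happens to contain.

```python
import os


def report_path(name):
    # Hypothetical helper: the directory comes from the environment,
    # not from a parameter.
    base = os.environ.get("REPORT_DIR", "/tmp")
    return base + "/" + name


def check_with_env(value):
    """Call report_path with REPORT_DIR set to value (unset if None),
    then restore whatever was there before."""
    old = os.environ.get("REPORT_DIR")
    try:
        if value is None:
            os.environ.pop("REPORT_DIR", None)
        else:
            os.environ["REPORT_DIR"] = value
        return report_path("q1.txt")
    finally:
        if old is None:
            os.environ.pop("REPORT_DIR", None)
        else:
            os.environ["REPORT_DIR"] = old


unset_result = check_with_env(None)      # variable missing
normal_result = check_with_env("/data")  # variable set sensibly
empty_result = check_with_env("")        # variable set but empty
```

The empty-string case is the one a coverage report would never prompt you to write: it quietly produces a path at the filesystem root.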

It could be argued that if you’re testing for external state, you aren’t doing unit testing by its strict definition; you’re doing functional or system testing. Nevertheless, most people rely chiefly on unit testing tools, because automated unit testing tools that generate code-coverage metrics are pretty easy to use.

While it’s a worthwhile goal to try to get high code-coverage in your unit-testing, a score of 100% doesn’t guarantee that you’ve tested enough. Likewise, a score below 100% isn’t necessarily an indication of inadequate testing. Code-coverage is therefore not a goal in itself; it’s one way of measuring one type of testing.

TDD lesson from Sudoku

I am a Sudoku addict. I like to analyze the logic strategies for solving these puzzles. I even gave a presentation at OSCON 2006 about using SQL to solve Sudoku puzzles.

The image I’m including is a screenshot from jigsawdoku.com, copyright 2008 by Rachel Lee and Gideon Greenspan. This is my favorite Sudoku web interface currently.

It’s most satisfying to solve the puzzle with no “crutch.” That is, every number is placed in its square without having to guess, and you never have to take a number out of a square after finding that it’s incorrect. Computer-based Sudoku interfaces that give you “hints” are also unsatisfying. It’s like doing a crossword puzzle in pencil!

I was getting pretty good at solving the Hard puzzles, but I had hit a wall: I was solving them in about 5 or 6 minutes, and I couldn’t improve my time any further. One day I was in a hurry, and I wanted to finish the puzzle and go do something else. I started guessing as I placed the numbers. I used the “hint” button after each guess, to tell me if I had gotten it right.

I only guessed when I had narrowed down the choice to two possible squares. In those cases I had a 50% chance of being right, in which case the hint told me I had made no mistakes. If the hint told me that I had made a mistake, I knew it must be caused by the number I had just placed.

What I found was that I immediately cut my time in half. I could solve Hard puzzles regularly in under 3 minutes, sometimes under 2.5 minutes. This was surprising and a bit discouraging. It meant that solving the puzzle with rigorous logic and no guesswork cost twice as much time as solving it in a sloppy fashion. Where’s the satisfaction in being sloppy?

But I’ve been thinking about this. It’s an analogy to running tests frequently during incremental software development. Let me explain how.

There was a time when computers were massive machines, operated behind locked doors and fed with punch cards. As you designed your program, you had to imagine it running in your head, and anticipate the bugs and design flaws as a “thought experiment.” When you thought the program was ready, you’d put a rubber band around your stack of punch cards, and put them in the queue to be run by the operator. The next morning you’d get your result and see if your program ran correctly.

Today, in most cases, the computers can run your program thousands of times per day if you need them to. It’s very inexpensive to run a partially-finished program, so now you can use the computer instead of your imagination to find out if the code works correctly. You can even use testing tools that make it easy to run tests repeatedly and identically with the touch of a button.

The efficiency of running repeatable tests enables Test-Driven Development, or at least a hybrid approach in which you write code incrementally and employ tests frequently to validate your work.

Here’s where it comes back to Sudoku. As I was placing numbers in the Sudoku grid and using the “hint” button to tell me if I had made the correct choice or not, I was practicing Continuous Integration. That is, I made the smallest change I could to the system (placing a single number in the grid) and then I re-validated the result with a test that was automated and repeatable.
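The loop I was following can be sketched in a few lines of Python. This is only a toy under stated assumptions: I don't know how the real “hint” button works, so here it simply compares the grid against a known solution, and the 4x4 grid and the function names are invented for illustration.

```python
EMPTY = 0


def hint(grid, solution):
    """Return True if no filled square disagrees with the solution."""
    return all(
        cell == EMPTY or cell == answer
        for grid_row, solution_row in zip(grid, solution)
        for cell, answer in zip(grid_row, solution_row)
    )


def place_with_hint(grid, solution, digit, squares):
    """Try the digit in each candidate square, checking after every move."""
    for row, col in squares:
        grid[row][col] = digit        # the smallest possible change
        if hint(grid, solution):      # re-validate right away
            return (row, col)
        grid[row][col] = EMPTY        # the mistake must be this placement
    return None


# A tiny 4x4 puzzle instead of the usual 9x9, to keep the sketch short.
solution = [
    [1, 2, 3, 4],
    [3, 4, 1, 2],
    [2, 1, 4, 3],
    [4, 3, 2, 1],
]
grid = [
    [1, 0, 3, 4],
    [3, 4, 0, 2],
    [0, 1, 4, 3],
    [4, 3, 2, 0],
]

# Guess between two squares for the digit 2; the first guess is wrong.
placed = place_with_hint(grid, solution, 2, [(1, 2), (0, 1)])
```

Because the check runs after every single placement, a failed check always points at the move just made, which is the property that made guessing fast for me.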

I observed that I could solve the puzzle much more quickly and accurately using this approach. This illustrates the benefit of using software testing during development. You’ll get a more robust product in the end, and it will take less time than if you had written all the code up front and tested it only at the end.

Some people think that because writing software tests takes time of its own, it will make the project schedule longer. But I would point out that during the Sudoku game, I had to move my mouse to the “hint” button every time I placed a number on the grid. And yet I still solved the puzzle in half the time it took me to do it in the traditional way. The “overhead” of doing the testing, which one might assume is wasteful, in fact resulted in a net gain of productivity.

Now I don’t feel like I’m cheating by using the “hint” button in Sudoku. I’m just working in a more efficient manner.