Friday, December 5, 2014

Tests and Architecture

This post is a follow-up to my previous post on testing implementation details.

I believe that when you support something you should take extra care to ensure that you are not blinding yourself to the downsides. Since I'm a proponent of testing software, and TDD as the way to do that, it is good to ask: what are the downsides?

There is one obvious downside that I'm aware of: tests do not drive good architecture. Even though TDD is sometimes called Test Driven Design (Test Driven Development is also used), TDD doesn't actually drive good design! Sure, it works well for tiny programs. However, I have found that for large programs, where many programmers (more than 10) are working full time, you cannot rely on TDD alone to get good architecture: you need something more.

In the well-known TDD bowling game example, the authors talk about a list of frame objects that they expected to need in the early stages. However, as their program matured over the exercise, they never created one. Many people have done the same example, with the same tests, and ended up with a list of frames. Which is right? For the requirements given, both are equally correct. However, the requirements are such that one pair could write the entire program in a couple of hours.

So let's imagine what a real bowling program would look like if we worked for Brunswick. If you haven't been bowling in a few years, you should go to their website and investigate a modern bowling center. Bowling is not just about keeping score: the system needs to track which pins are still standing, and the clerk needs to turn lanes on/off when they are paid for. There is a terminal for interaction (entering bowlers' names), and a second screen that displays the score, or funky animations when a strike is bowled. There is the pin-setting machine, which we can assume has a computer of its own. There are probably a number of other requirements for different events that aren't obvious from the website, but are important anyway. The need to add these additional features places requirements on the architecture that do not exist in the example. Do any of these requirements demand a list of frames?

Now many of you are thinking YAGNI, but that is only partially correct. Even its most extreme proponents will tell you that it cannot work without continuous refactoring. You can look at something that is done, ask "is this right?", and re-design the architecture to fit. The entire cycle is only a few hours long: the full sequence of steps ends in refactor, and part of refactoring is creating good architecture. When you have something that works, you can step back and say "now that I have experience, what is a good design?" instead of trying to guess. This is great for getting the classes right in the small, much better than up-front design. However, I contend that YAGNI alone isn't enough for a large project.

YAGNI hits a wall on large projects. At some point you will realize that someone on the other side of the office (often the world!) did something similar to something that you have also done; and both of you did it a few months back and have been using your versions all over. The right thing to do is refactor to remove the duplication, but how do you do that when the APIs are slightly different and both are used everywhere? Creating modules is easy if the API in question was identified correctly in advance, but if you see the need late, it can be difficult. Tests do not help you create the right modules in advance. Tests can help you add those modules later when you find a need, but they will often actually hinder adding them.

So the only question is: how do you identify those modules in advance? Fortunately for me, this is a blog on testing. Therefore I can leave the discussion of how to figure out where you need those modules to someone else...

Unfortunately I can't just leave you with that cop-out; nobody sensible would accept it. I'm also out of my league here. There are the obvious places where a boundary exists, but those are also easy and probably already done. You would be a fool to write your own string implementation, for example (I'm assuming your project isn't writing the standard library for your language). There are somewhat less obvious places in your project that you can analyze in advance with just a little spike work. If you haven't built the one to throw away first, you should: a couple of weeks of effort can save you months in many ways, one of which is finding a number of modules. However, until you get deep into the real implementation, there will be something you miss.

Those things you miss are the important parts. If you mocked any of them, switching will be hard. You have to change both the test and the implementation at the same time, then change all of the tests to use a different mock, and then update every place in the code that calls the mocked thing to use the new API. When you change code and tests at the same time, there is always the danger that you make a change that is wrong, but the tests don't catch it. When many tests are broken at once, there is a long gap between feedback. What if you break real functionality? You may go for hours making tests pass before you notice the one failing test that is a real bug, which means you are hours from the code change where the problem came in.

Back to our Brunswick bowling machine. As the example played out, they have a game object which you can ask CurrentFrame, and also GetScoreForFrame. This interface might need some extension, but you can quickly see that the game is at the heart of any analysis of the score, and so this object is what gets passed around. The implementation behind it is hidden by the game abstraction: this looks like a good choice to me. Or does it? What if we are interested in looping across all throws? Now the game object gets a couple of new functions: GetNumberOfThrowsInFrame and GetThrowForFrame. However, this requires two nested loops, which is unexpected complexity: we have to loop across each frame and then across the throws in that frame.
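To make the complaint concrete, here is a minimal sketch of that game-object API. This is not the original example's code: the method names are adapted from the post, and the internal frame-splitting logic is my own simplification (real 10th-frame rules are omitted).

```python
class Game:
    """Sketch of a game object that hides its internal representation."""

    def __init__(self):
        # Internal detail, hidden behind the abstraction: a list of
        # per-frame throw lists, e.g. [[10], [7, 2], ...].
        self._frames = []

    def roll(self, pins):
        # Simplified: start a new frame on a strike or after two throws.
        if not self._frames or self._frame_done(self._frames[-1]):
            self._frames.append([])
        self._frames[-1].append(pins)

    def _frame_done(self, frame):
        return frame == [10] or len(frame) == 2

    def current_frame(self):
        return len(self._frames)

    def number_of_throws_in_frame(self, frame):
        return len(self._frames[frame - 1])

    def throw_for_frame(self, frame, throw):
        return self._frames[frame - 1][throw - 1]


def all_throws(game):
    """Collecting every throw requires the awkward nested loops:
    outer loop over frames, inner loop over throws in each frame."""
    throws = []
    for frame in range(1, game.current_frame() + 1):
        for t in range(1, game.number_of_throws_in_frame(frame) + 1):
            throws.append(game.throw_for_frame(frame, t))
    return throws
```

A strike followed by a 7 and a 2 produces two frames, and `all_throws(game)` returns `[10, 7, 2]`: correct, but only via the double loop.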

We can also imagine a different architecture where, instead of a game object, the abstraction we pass around is the list of frames: each frame holds up to three throws, plus flags to indicate which throws were actually taken. Reading a list of frames is easy, and the code to loop across all throws is clear. The list is also cache-efficient, which is useful on modern CPUs that pay a big price for cache misses. On the other hand, we are passing data around, which means if we ever want to change the data structure we need to change every place in the code that reads it; this is why data is kept private in nearly every architecture.
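The list-of-frames alternative can be sketched the same way. The `Frame` layout below is my own guess at the "three throws plus taken flags" shape described above; `record_throw` is a hypothetical helper, again ignoring real 10th-frame rules.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    """Plain data: up to three throws, with flags marking which were taken."""
    throws: list = field(default_factory=lambda: [0, 0, 0])
    taken: list = field(default_factory=lambda: [False, False, False])


def record_throw(frames, pins):
    # Start a new frame after a strike or after two throws (simplified).
    if (not frames
            or all(frames[-1].taken[:2])
            or (frames[-1].taken[0] and frames[-1].throws[0] == 10)):
        frames.append(Frame())
    f = frames[-1]
    i = f.taken.index(False)
    f.throws[i] = pins
    f.taken[i] = True


def all_throws(frames):
    # One flat, obvious pass over the data: the clarity the post points out.
    return [t for f in frames
            for t, was_taken in zip(f.throws, f.taken) if was_taken]
```

The trade-off is visible right in the sketch: `all_throws` is trivial, but every caller now depends on the exact `throws`/`taken` layout, so changing that layout means touching all of them.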

Note that you are not limited to either a list of frames or a game object. There are many other ways to design your system, each with different trade-offs. Maybe the real problem is that I created a poor API for getting throws from the game object? Or maybe the problem is that the game object is the abstraction: perhaps a different object (or set of objects) would provide the right abstraction. I leave the correct way to solve this problem up to your imagination.

In short, TDD is not a silver bullet that lets you fire all the architects. You still need their vision to get the interfaces right.

If there is any other downside to automated testing you think I've missed, please let me know.
