Erasing history in tests

Something I say about the ideal of Agile design is that, at any moment when you might ship the system, the code should look as if someone clever had designed a solution tailored to do exactly what the system does, and then implemented that design. The history of how the system actually got that way should be lost.

An equivalent ideal for TDD might be that the set of tests for an interoperating set of classes would be an ideal description-by-example of what they do, of what their behavior is. For tests to be documentation, the tests would have to be organized to suit the needs of a learner (most likely from simple to complex, with error cases deferred, and - for code of any size - probably organized thematically somehow).
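
To make that concrete, here is a toy sketch of what learner-ordered tests might look like. Every name here - and the little Stack class itself - is invented for illustration:

    import unittest

    class Stack:
        """A toy class under test."""

        def __init__(self):
            self._items = []

        def is_empty(self):
            return not self._items

        def push(self, item):
            self._items.append(item)

        def pop(self):
            if not self._items:
                raise IndexError("pop from empty stack")
            return self._items.pop()

    class StackDocumentationTest(unittest.TestCase):
        # 1. The simplest behavior first, for the newcomer.
        def test_a_new_stack_is_empty(self):
            self.assertTrue(Stack().is_empty())

        # 2. Then the core idea: last in, first out.
        def test_pop_returns_the_most_recently_pushed_item(self):
            stack = Stack()
            stack.push("first")
            stack.push("second")
            self.assertEqual(stack.pop(), "second")

        # 3. Error cases deferred until the basics are established.
        def test_popping_an_empty_stack_is_an_error(self):
            with self.assertRaises(IndexError):
                Stack().pop()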

That is, the tests would have to be more than what you’d expect from a history of writing them, creating the code, rewriting tests and adding new ones as new goals came into view, and so forth. They shouldn’t be a palimpsest with some sort of random dump of tests at the top and the history of old tests showing through. (“Why are these three tests like this?” “Because when behavior X came along, they were tests that needed to be changed and it was easiest to just tweak them into shape.”)

I’ve seen enough to be convinced that, surprisingly, Agile design works as described in the first paragraph, and that it doesn’t require superhuman skill. The tests I see - and write - remind me more of the third paragraph than the second. What am I missing that makes true tests-as-documentation as likely as emergent design is?

(It’s possible that I demand too much from my documentation.)

9 Responses to “Erasing history in tests”

  1. Michal Migurski Says:

    Aaron Cope from Flickr has a way of talking about documentation that roughly maps here: the documentation should be a story about the code, showing how it’s used and more importantly why. It should begin at the beginning and end at the end, and help the user follow a thread through the features rather than present a simple core dump of method-parameter descriptions.

    I love this idea of putting the code into a coherent narrative.

    Python has a package called doctest that implements this in a lovely way, where you start off a module with a textual description of how it’s used in the form of annotated code examples. If you do them right, they can be parsed, run, and verified by doctest in the form of a unit test. It’s quite awesome.
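
    A minimal, made-up example of the style - the temperature module and everything in it are invented:

        """temperature.py - a toy module whose docstring doubles as its manual.

        Start with the everyday case:

            >>> to_fahrenheit(100)
            212.0

        Then the boundary everyone knows:

            >>> to_fahrenheit(0)
            32.0
        """

        def to_fahrenheit(celsius):
            """Convert a Celsius temperature to Fahrenheit.

            >>> to_fahrenheit(37)
            98.6
            """
            return celsius * 9 / 5 + 32

    Running python -m doctest temperature.py replays every example above, so the narrative and the checks can’t drift apart.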

  2. Clarke Ching Says:

    I’ve no idea what the answer to your question is, Brian, but I love that you’ve asked the question.

  3. John Sumsion Says:

    I recently had a valuable experience watching new members of my team grapple with tests of the 3rd paragraph variety.

    We used the expressions of disbelief and confusion as hints for how we could simplify the code. We created simpler tests that were easier to organize properly, and we have started removing the disorganized and overly complex tests.

    In short, it is important to listen to a new person’s first impressions of a test and organize and rewrite until it seems obvious. Make the test suite discoverable.

  4. Brian Marick Says:

    When you say that you “simplified the code”, do you mean test code, or were there effects on the product code?

  5. John Sumsion Says:

    The production code was too complex and needed to be broken into pieces. The new tests for those pieces were straightforward to write and easy to understand without a lot of history.

    The existing test suite basically contained a lot of integration tests, where the real purpose of each test was buried inside some assertEquals() without a “why” comment, or, worse, in some kind of assertTrue() three call levels away from the test…() method. (Something like the sketch at the end of this comment.)

    Some of these complex tests were redundant and were removed. Some were so complicated that it was more work to remove them than we had time for. I want to get better at looking at a test suite (in a macro view, a la Edward Tufte) and cutting it down to the bare minimum.

    My point is that it is a Good Thing to have a low tolerance for artificial complexity — and the new members of a team usually bring that along with them.
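
    A contrived sketch of the kind of burial I mean, next to an intent-revealing rewrite - every class and value below is invented:

        import unittest

        class OpaqueOrderTest(unittest.TestCase):
            # The test method gives no hint of what matters...
            def test_order_processing(self):
                order = {"items": [10, 20], "discount": 0.5}
                self.check_order(order)

            def check_order(self, order):
                self.verify_totals(order)

            def verify_totals(self, order):
                # ...and the real point is buried two calls down, with no "why".
                self.assertTrue(sum(order["items"]) * order["discount"] == 15)

        class IntentRevealingOrderTest(unittest.TestCase):
            def test_half_price_discount_halves_the_item_total(self):
                # A 50% discount on a 30-unit order should cost 15.
                order = {"items": [10, 20], "discount": 0.5}
                self.assertEqual(sum(order["items"]) * order["discount"], 15)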

  6. Michael Bolton Says:

    What am I missing that makes true tests-as-documentation as likely as emergent design is?

    I’m not sure I understand the question. Do you mean to ask, “What would increase the likelihood that our tests would provide all the documentation that (some person) needed?”, or some other question?

  7. Brian Marick Says:

    Michael: I mean that, in coding, we have one thing - a design, and a good one - that emerges out of a seemingly unrelated activity (making the next test pass). Design comes about as a reliable-enough side effect of TDD. It doesn’t seem to me that good documentation falls out in the same way. It seems odd to ask “why not?” - a more sensible question might be “Why should it?” - but that’s what I’m asking.

  8. Michael Bolton Says:

    >I mean that, in coding, we have one thing - a design, and a good one - that emerges out of a seemingly unrelated activity (making the next test pass).

    I’m interested in probing these ideas. I’m still not sure whether you’re suggesting that TDD should lead to good documentation, or that good documentation should lead to good products. However…

    0) I genuinely believe that it’s more helpful to follow your example and use the word “example” instead of “test” in this context.

    1) Does a design emerge from making the examples pass, or is making each example pass simply a conceit, an alias, for asking (and answering) the next question about the design?

    2) Is what we see necessarily a good design, considering that good is always asked “compared to what?” Could it be a good design with respect to the questions that the product has been programmed to answer, but a crappy design with respect to other questions that haven’t been asked yet? Could it be both at the same time?

    3) Is making the next example pass really an unrelated activity to design? (I’ll answer for myself that I see a pretty strong relationship.) Is design really a side-effect of TDD, or is it the direct intention of TDD?

    4) If we construct something by a process of saying, “it should do this… and then this… and then this… and then this…”, and the “this”s are sufficiently numerous and general, AND we’ve eliminated a bunch of cases in which it didn’t do what we wanted it to, could it look designed without actually being designed? Consider the human eyeball as an example.

    5) I see documentation as someone’s stories about a product. That includes a set of assertions about what the product can do–examples sometimes provide those well enough. But good documentation also tends to include some suggestions about the context for using the product, some warnings about what not to do with it, ideas about the human’s potential relationships with the product. These are in the form of narratives, not simply true-or-false assertions that can be checked by a pretty dumb machine. So I’m unsurprised that TDD-style examples don’t lead to these stories; they tend not to be motivated much by Why? and Who?

    —Michael B.

  9. Brian Marick Says:

    1) Does a design emerge from making the examples pass, or is making each example pass simply a conceit, an alias, for asking (and answering) the next question about the design?

    To me, the design seems to emerge mostly from the refactoring after you make the test pass.

    2) Is what we see necessarily a good design, considering that good is always asked “compared to what?” Could it be a good design with respect to the questions that the product has been programmed to answer, but a crappy design with respect to other questions that haven’t been asked yet? Could it be both at the same time?

    To my mind, a good design is one that makes the next unanticipated change easy. “Easy” is compared to some historical average of code I’ve worked with.

    3) Is making the next example pass really an unrelated activity to design? (I’ll answer for myself that I see a pretty strong relationship.) Is design really a side-effect of TDD, or is it the direct intention of TDD?

    I don’t know how to answer that question. Certainly, now that people emphasize the “design” in TDD more than the “test”, they can’t willfully not think of design issues. (“Don’t think of an elephant.”) But design happens at different scales. What’s surprising about TDD is that working at a small scale seems often to yield better-designed larger-scale structures. I think.

    4) If we construct something by a process of saying, “it should do this… and then this… and then this… and then this…”, and the “this”s are sufficiently numerous and general, AND we’ve eliminated a bunch of cases in which it didn’t do what we wanted it to, could it look designed without actually being designed? Consider the human eyeball as an example.

    Dunno. How would my future actions differ if I believed that something “was designed” vs “looks designed”?

    5) I see documentation as someone’s stories about a product. That includes a set of assertions about what the product can do–examples sometimes provide those well enough. But good documentation also tends to include some suggestions about the context for using the product, some warnings about what not to do with it, ideas about the human’s potential relationships with the product. These are in the form of narratives, not simply true-or-false assertions that can be checked by a pretty dumb machine. So I’m unsurprised that TDD-style examples don’t lead to these stories; they tend not to be motivated much by Why? and Who?

    Sure they’re motivated by “Why?” and “Who?”. Who is the person writing the test and code, right this instant. Why is in support of some other code you can point at right now. What’s interesting about TDD is that you can achieve good design (by my lights) with such selfishly limited answers.

    The context is the context possessed by the person doing the work. In a well-functioning, long-lived team, that context is shared by all the programmers. (Remember: we’re talking about unit-ish tests here.) [The difference between TDD for teams vs. TDD for open source needs to be talked about more, since a big difference is in how easily context is shared.] So there are questions like: shared enough? what reminders need to be planted somewhere a programmer will notice them, and which can go without saying? The goal would be to write the minimal amount needed for the audience. I don’t know what that amount is, but when I’m the audience - and the producer - I feel like the minimum amount isn’t being written. Or that what is being written isn’t a superset of the minimum amount - too much is extraneous, not a help, a waste.
