Archive for the 'testing' Category

Position statement for functional testing tools workshop

Automated functional testing lives between two sensible testing activities. On the one side, there’s conventional TDD (unit testing). On the other side, there’s manual exploratory testing. It is probably more important to get good at those than it is to get good at automated functional testing. Once you’ve gotten good at them, what does it mean to get good at automated functional testing?

There is some value in thinking through larger-scale issues (such as workflows or system states) before diving into unit testing. There is some value (but not, I think, as much as most people think) in being able to rerun larger-scale functional tests easily. In sum: compared to doing exploratory testing and TDD right, the testing we’re talking about has modest value. Right now, the cost is more than modest, to the point where I question whether a lot of projects are really getting adequate ROI. I see projects pouring resources into functional testing not because they really value it but more because they know they should value it.

This is strikingly similar to, well, the way that automated testing worked in the pre-Agile era: most often a triumph of hope over experience.

My bet is that the point of maximum leverage is in reducing the cost of larger-scale testing (not in improving its value). Right now, all those workflow statements and checks that are so easy to write down are annoyingly hard to implement. Even I, staring at a workflow test, get depressed at how much work it will be to get it just to the point where it fails for the first time, compared to all the other things I could be doing with my time.

Why does test implementation cost so much?

We are taught that Agile development is about working the code base so that arbitrary new requirements are easy to implement. We have learned one cannot accomplish that by “layering” new features onto an existing core. Instead, the core has to be continually massaged so that, at any given moment, it appears as if it were carefully designed to satisfy the features it supports. Over time, that continual massaging results in a core that invites new features because it’s positively poised to change.

What do we do when we write test support code for automated large-scale tests? We layer it on top of the system (either on top of the GUI or on top of some layer below the GUI). We do not work the new code into the existing core—so, in a way that ought not to surprise us, it never gets easier to add tests.

So the problem is to work the test code into the core. The way I propose to do that is to take exploratory testing more seriously: treat it as a legitimate source of user stories we handle just like other user stories. For example, if an exploratory tester wants an “undo” feature for a webapp, implementing it will have real architectural consequences (such as moving from an architecture where HTTP requests call action methods that “fire and forget” HTML to one where requests create Command objects).
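To make that architectural shift concrete, here is a minimal Ruby sketch of the Command-object idea. The class names (RenameCommand, CommandHistory) and the hash standing in for a model record are hypothetical, chosen only for illustration; a real webapp would build a command from each HTTP request rather than by hand.

```ruby
# Hypothetical sketch: none of these names come from a real app.

# A command knows how to apply one change and how to reverse it.
class RenameCommand
  def initialize(record, new_name)
    @record = record
    @new_name = new_name
    @old_name = nil
  end

  def execute
    @old_name = @record[:name]   # remember what we are about to overwrite
    @record[:name] = @new_name
  end

  def undo
    @record[:name] = @old_name
  end
end

# Instead of "fire and forget", executed commands are retained so that
# the most recent one can be undone.
class CommandHistory
  def initialize
    @done = []
  end

  def run(command)
    command.execute
    @done.push(command)
  end

  def undo_last
    command = @done.pop
    command.undo if command      # nothing to undo on an empty history
  end
end

record = { name: "draft" }
history = CommandHistory.new
history.run(RenameCommand.new(record, "final"))
history.undo_last
puts record[:name]
```

The point of the sketch is that the retained history is something an exploratory tester can also inspect and replay, which is the sense in which the testing story reshapes the core.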

Why drive the code with exploratory testing stories rather than functional testing stories? I’m not sure. It feels right to me for several nebulous reasons I won’t try to explain here.

Functional testing tools workshop just before Agile 2008


Agile Alliance Functional Testing Tools Open Space Workshop
Call for Participation

Dates: Monday, August 4, 2008
Times: 8 AM - 6 PM
Location: Toronto, Ontario, at the Agile2008 venue

Description

This is the second Agile Alliance Functional Testing Tools workshop.
The first, held in October 2007 in Portland, Oregon, was a great
success. In this second workshop, we're increasing the size and
moving to an open-space-like format. The primary purpose of this
workshop is still to discuss cutting-edge advancements in and envision
possibilities for the future of automated functional testing tools.

As an open-space style workshop, the content comes from the
participants, and we expect all participants to take an active role.
We're seeking participants who have interest and experience in
creating and/or using automated functional testing tools/frameworks on
Agile projects.

This workshop is sponsored by the Agile Alliance Functional Testing
Tools Program. The mission of this program is to advance the state of
the art of automated functional testing tools used by Agile teams to
automate customer-facing tests.

There is no cost to participate. Participants will be responsible for
their own travel expenses.

Due to room constraints, we can accommodate up to 60 participants.
Registrations will be granted on a first-come, first-served basis to
participants who complete the registration process.

Registering for the AA-FTT Open Space Workshop

We will be using the conference submission system
(http://submissions.agile2008.org) to process the requests for
invitation (RFI). If you're interested in being invited to
participate in this workshop, please:
a) log in to the submission system (create an account if you don't have
   one already). NOTE: make sure your email address is correct.
b) click the 'propose a session' link to request an invitation,
   filling in the following required fields:
   - title: enter RFI
   - stage: select ‘AAFTT’
   - session type: select ‘other’
   - duration: select any of the values (not relevant for the RFI process)
   - summary: briefly answer the following three questions
     i) What do you see as the biggest issue for Functional
        Testing Tools on Agile projects?
     ii) What do you hope to contribute?
     iii) What do you hope to get?
c) click ‘create’

The AAFTT stage producers will review the RFI, and send you an
invitation to attend the workshop, along with further instructions for
pre-organizing openspace sessions.

Please register as soon as possible, before the workshop fills up.

Pass This Along
If you know of someone that would be a candidate for this workshop,
please forward this call for participation on to them.

Conference of the Association for Software Testing 2008

The third annual Conference of the Association for Software Testing (CAST) 2008 takes place in Toronto, July 14-16. Early bird registration ends May 30. Here’s what Michael Bolton has written about it:

A colleague recently pointed out that an important mission of our community is to remind people–and ourselves–that testing doesn’t have to suck.

Well, neither do testing conferences. CAST 2008 is the kind of conference that I’ve always wanted to attend. The theme is “Beyond the Boundaries: Interdisciplinary Approaches to Software Testing”, and the program is incredibly eclectic and diverse. Start with the keynotes: Jerry Weinberg on Lessons from the Past to Carry into the Future; Cem Kaner on The Value of Checklists and the Danger of Scripts: What Legal Training Suggests for Testers; Robert Sabourin (with Anne Sabourin) on Applied Testing Lessons from Delivery Room Labor Triage (there’s a related article in this month’s Better Software magazine); and Brian Fisher on The New Science of Visual Analytics. Track sessions include talks relating testing to improv theatre (Adam White), to music (yours truly and Jonathan Kohl), to finance and accounting (Doug Hoffman), to wargaming and Darwinian evolution (Bart Broekman, author of /Testing Embedded Software/ and one of the co-authors of the /TMap Next/ book), to civil engineering (Scott Barber), to scientific software (Diane Kelly and Rebecca Sanders), to magic (Jeremy Kominar), to file systems (Morven Gentleman), to data warehousing (Steve Richardson and Adam Geras), and to data visualization (Martin Taylor)… to four-year-olds playing lacrosse (Adam Goucher). There will be lightning talks and a tester competition. Jerry Weinberg will be doing a one-day tutorial workshop, as will Hung Nguyen, Scott Barber, and Julian Harty.

Yet another feature of the conference is that Jerry is launching his book on testing, /Perfect Software and Other Testing Myths/. I read an early version of it, and I’m waiting for it with bated breath. It’s a book that we’ll all want to read, and after we’re done, we’ll want to hand to people who are customers of testing. For some, we’ll want to tie them to a chair and /read it to them/.

The conference hotel is inexpensive, the food in Toronto is great, the nightlife is wonderful, the music is excellent…

More Information
================
You can find details on the program at http://www.cast2008.org/Program.

You can find information on the venue and logistics at http://www.cast2008.org/Venue.

Those from outside Canada should look at http://www.associationforsoftwaretesting.org/drupal/CAST2008/Venue#customs.

You can get registration information at http://www.cast2008.org/Registration.

Paying the Way
==============

If you need help persuading your company to send you to the conference, check out this: http://www.ayeconference.com/Articles/Mycompanywontpay.html.

And if all that fails, you can likely write off the cost of the conference against your taxes, even if you’re an employee. (I am not a tax professional, but INC magazine reports that you can write off expenses to “maintain or improve skills required in your present employment”. Americans should see IRS Publication 970 (http://www.irs.gov/publications/p970/ch12.html), Section 12, and ask your accountant!)

Come Along and Spread The Word!
===============================

So (if necessary) get your passports in order, take advantage of early bird registration (if you register in the next two weeks), and come join us. In addition (and I’m asking a favour here), please please /please/ tell your colleagues, both in your company and outside, about CAST. We want to share some great ideas on testing and other disciplines, and we want to make this the best CAST ever. And the event will only be improved by your presence.

So again, please spread the word, and come if you can.

Security mindset

A continual debate on the agile-testing mailing list is to what degree testers think differently than programmers and are therefore able to find bugs that programmers won’t. Without ever really resolving that question, the conversation usually moves on to whether the mental differences are innate or learnable.

I myself have no fixed opinion on the matter. That’s probably because, while vast numbers of testers are better than I am, I can imagine myself being like them and thinking like them. The same is true for programmers. In contrast, I simply can’t imagine being the sort of person who really cares who wins the World Cup or whether gay people I don’t know get married. (I’m not saying, “I couldn’t lower myself to their level” or anything stupid like that; I’m saying I can’t imagine what it would be like. It feels like trying to imagine what it is like to be a bat.)

However, I’ve long thought that security testers are a different breed, though I can’t articulate any way that they’re different in kind rather than degree. It’s just that the really good ones are transcendentally awesome at seeing how two disparate facts about a system can be combined and exploited. (A favorite example)

Bruce Schneier has an essay on security testers that I found interesting, though it doesn’t resolve any of my questions. Perhaps that’s because he said something I’ve been thinking for a while:

The designers are so busy making these systems work that they don’t stop to notice how they might fail or be made to fail, and then how those failures might be exploited. Teaching designers a security mindset will go a long way toward making future technological systems more secure.

The first sentence seems to make the second false. When I look back at the bugs I, acting as a programmer, fail to prevent and then fail to catch, an awful lot of the time their root cause wasn’t a gap in my knowledge. It’s that I have a compulsive personality and also habitually overcommit. As a result, there’s a lot of pressure to get done. The problem isn’t that I can’t flip into an adequate tester mindset; it’s that I don’t step back and take the time.

So, I suspect the interminable and seemingly irresolvable agile-testing debate should be shelved until we solve a more pressing problem: few teams have the discipline to adopt a sustainable pace, so few teams are even in a position to know if programmers could do as well as dedicated testers.

An alternative to business-facing TDD

The value of programmer TDD is well established. It’s natural to extrapolate that practice to business-facing tests, hoping to obtain similar value. We’ve been banging away at that for years, and the results disappoint me. Perhaps it would be better to invest heavily in unprecedented amounts of built-in support for manual exploratory testing.

In 1998, I wrote a paper, “When should a test be automated?”, that sketched some economics behind automation. Crucially, I took the value of a test to be the bugs it found, rather than (as was common at the time) how many times it could be run in the time needed to step through it manually.

My conclusions looked roughly like the following:

test tradeoffs in general

Scripted tests, be they automated or manual, are expensive to create (first column). Manual scripts are cheaper, but they still require someone to write steps down carefully, and they likely require polishing before they can truly be followed by someone else. (Note: height of bars not based on actual data.)

In the second column, I assume that a particular set of steps has roughly the same chance of finding a bug whether executed manually or by a computer, and whether the steps were planned or chosen on the fly. (I say “roughly” because computers don’t get bored and miss bugs, but they also don’t notice bugs they weren’t instructed to find.)

Therefore, if the immediate value of a test is all that matters, exploratory manual testing is the right choice. What about long-term value?

Assume that exploratory tests are never intentionally repeated. Both their long-term cost and value are zero. Both kinds of scripted tests have quite substantial maintenance costs (especially in that era, when testing was typically done through an unmodified GUI). So, to pull ahead of exploratory tests in the long term, scripted tests must have substantial bug-finding power. Many people at that time observed that, in fact, most tests either found a bug the first time they were run or never found a bug at all. You were more likely to fix a test because of an intentional GUI change than to fix the code because the test found a bug.

So the answer to “when should a test be automated?” was “not very often”.
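The argument above can be read as a break-even calculation. The sketch below is one way to frame it, not the model from the 1998 paper: the TestOption fields and every number in it are made-up units, chosen only to show the shape of the tradeoff.

```ruby
# Illustrative only: field names and all numbers are assumptions,
# not data from the paper or from any project.
TestOption = Struct.new(:name, :creation_cost, :upkeep_per_run,
                        :first_run_value, :value_per_rerun) do
  # Net value after the first run plus `reruns` later runs.
  def net_value(reruns)
    first_run_value - creation_cost +
      reruns * (value_per_rerun - upkeep_per_run)
  end
end

# Exploratory tests are assumed never to be repeated, so their
# long-term cost and value are both zero.
exploratory = TestOption.new("exploratory", 10, 0, 50, 0)

# Scripted automation costs more up front and needs upkeep on every
# rerun, but each rerun has some small chance of finding a bug.
automated = TestOption.new("automated script", 60, 5, 50, 6)

[0, 10, 100].each do |reruns|
  [exploratory, automated].each do |option|
    puts format("%-18s reruns=%-3d net=%d",
                option.name, reruns, option.net_value(reruns))
  end
end
```

With these made-up numbers the automated script only catches up with the exploratory test after 50 reruns, and if upkeep per rerun rises to equal the value per rerun it never does.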

Programmer TDD changes the balance in two ways:

Test tradeoffs for TDD

  1. New sources of value are added. Extremely rapid feedback reduces the cost of debugging. (Most bugs strike while what you did to create them is fresh in your mind.) Many people find the steady pace of TDD allows them to go faster, and that the incremental growth of the code-under-test makes for easier design. And, most importantly as it turns out, the need to make tests run fast and reduce maintenance cost leads to designs with good properties like low coupling and high cohesion. (That is, properties that previously were considered good in the long term—but were routinely violated for short-term gain—now had powerful short-term benefits.)

  2. Good design and better programmer tools dramatically lowered the long-term cost of tests.

So, much to my surprise, the balance tipped in favor of automation—for programmer tests. It’s not surprising that many people, including me, hoped the balance could also tip for business-facing tests. Here are some of the hoped-for benefits:

  • Tests might clarify communication and avoid some cases where the business asks for something, the team thinks they’ve delivered it, and the business says “that’s not what I wanted.”

  • They might sharpen design thinking. The discipline of putting generalizations into concrete examples often does.

  • Programmers have learned that TDD supports iterative design of interfaces and behavior. Since whole products are also made of interfaces and behavior, they might also benefit from designers who react to partially-finished products rather than having to get it right up front.

  • Because businesses have learned to mistrust teams who show no visible progress for eight months (at which point, they ask for a slip), they might like to see evidence of continuous progress in the form of passing tests.

  • People often need documentation. Documentation is often improved by examples. Executable tests are examples. Tests as executable documentation might get two benefits for less than their separate costs.

  • And, oh yeah, tests could find regression bugs.

So a number of people launched off to explore this approach, most notably with Fit. But Fit hasn’t lived up to our hopes, I think. The things that particularly bother me about it are:

  • It works well for business logic that’s naturally tabular. But tables have proven awkward for other kinds of tests.

  • In part, the awkwardness is because there are no decent HTML table editors. That inhibits experimentation: if you don’t get a table format right the first time, you’re tempted to just leave it.

    Note: I haven’t tried ZiBreve. By now, I should have. I do include Word, Excel, and their OpenOffice equivalents among the ranks of the not-decent, at least if you want executable documentation. (I’ve never tried treating .doc files as the real tests that are “compiled” into HTML before they’re executed.)

  • Fit is not integrated into programmer editors the way xUnit is. For example, you can’t jump from a column name to the Java method that defines it. Partly for this reason, programmers tend to get impatient with people who invent new table formats—can’t they just get along with the old one?

With my graphical tests, I took aim at those sources of friction. If I have a workflow test, I can express it as boxes and arrows:

a workflow test

I translate the graphical documents into ordinary xUnit tests so that I can use my familiar tools while coding. The graphical editor is pretty decent, so I can readily change tests when I get better ideas. (There are occasional quirks where test content has changed more than it looks like it has. That aspect of using Fit hasn’t gone away entirely.)

I’ve been using these tests, most recently on wevouchfor.org, and they don’t wow me. While I almost always use programmer TDD when coding (and often regret skipping it when I don’t), TDD with these kinds of tests is a chore. It doesn’t feel like enough of the potential value gets realized for the tests to be worth the cost.

  • Writing the executable test doesn’t help clarify or communicate design. Let me be careful here. I’m a big fan of sketching things out on whiteboards or paper:

    A whiteboard

    That does clarify thinking and improve communication. But the subsequent typing of the examples into the computer is work that rarely leads to any more design benefits.

  • Passing tests do continuously show progress to the business, but… Suppose you demonstrate each completed story anyway, at an end-of-iteration demo or (my preference) as soon as it’s finished. Given that, does seeing more tests pass every day really help?

  • Tests do serve as documentation (at least when someone takes the time to surround them with explanatory text, and if the form and content of the test aren’t distorted to cram a new idea into existing test formats).

  • The word I’m hearing is that these tests are finding bugs more often than I expected. I want to dig into that more: if they’re the sort of “I changed this thing over here and broke that supposedly unrelated thing over there” bugs that whole-product regression tests are traditionally supposed to find, that alone may justify the expense of test automation—unless I can find a way to blame it on inadequate unit tests or a need to rejigger the app.

  • (This is the one that made me say “Eureka!”) Tests alone fail at iterative product design in an interesting way. Whenever I’ve made significant progress implementing the next chunk of workflow or other GUI-visible change, I just naturally check what I’ve done through the GUI. Why? This checking makes new bugs (ones the automated tests don’t check for) leap out at me. It also sometimes makes me slap my forehead and say, “What I intended here was stupid!”

But if I’m going to be looking at the page both to find bugs and to change my intentions, I’m really edging into exploratory testing. Hmm… What if an app did whatever it could to aid exploratory testing? I don’t mean traditional testability features like, say, a scripting interface; I mean a concerted effort to let exploratory testers peek and poke at anything they want within the app. (That may not be different than my old motto “No bug should be hard to find the second time,” but it feels different.)

So, although features of Rails like not having to restart the server after most code changes are nice, I want more. Here’s an example.

The following page contains a bug:

an ordinary web page

Although you can’t see it, the bottom two links are wrong. They are links to /certifications/4 instead of /promised_certifications/4.

  1. Unit tests couldn’t catch that bug. (The two methods that create those types of links are tested and correct; I just used the wrong one.)

  2. One test of the action that created the page could have caught the bug, but did not. (To avoid maintenance problems, that test checked the minimum needed to convince me that the correct “certifications” had been displayed. I assumed that if they were displayed at all, the unit tests meant they were displayed correctly. That was actually almost right—every character outside the link’s href value was correct.)

  3. I missed the bug when I checked the page. (I suspect that I did click one of the links, but didn’t notice it went to the wrong place. If so, I bet I missed the wrongness because I didn’t have enough variety in the test data I set up—ironic, because I’ve been harping on the importance of “irrelevant” variety since 1994.)

  4. A user had no trouble finding the bug when he tried to edit one of his promised certifications and found himself with a form for someone else’s already-accepted certification. (Had he submitted the form, it would have been rejected, but still.)

That’s my bug: a small error in a big pile of HTML the app fired and forgot.
Suppose, though, that the app created and retained an object representing the page. Suppose further that an exploration support app let you switch to another view of that object/page, one that highlights link structure and downplays text:

The same page, highlighting link hrefs

To the eyes of someone who just added promised certifications to that page, the wrong link targets ought to jump out.

There’s more that I’d like, though. The program knows more about those links than it included in the HTTP Response body. Specifically, it knows they link to a certain kind of object: a PromisedCertification. I should be able to get a view of that object (without committing to following the link). I should be able to get it in both HTML form and in some raw format. (And if the link-to-be-displayed were an object in its own right, I would have had a place to put my method, and I wouldn’t have used the wrong one. Testability changes often feed into error prevention.)
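A rough Ruby sketch of what a retained page object with first-class links might look like. Everything here (the Link and Page classes, the path method on PromisedCertification, the link_view method) is hypothetical and is not the wevouchfor.org code; it only illustrates the two ideas in the paragraphs above: a link that knows its target object, and a second view of the same page that plays up link structure.

```ruby
# Hypothetical names throughout; a stand-in for a real model class.
PromisedCertification = Struct.new(:id) do
  def path
    "/promised_certifications/#{id}"
  end
end

# A link that keeps its target object rather than only an href string,
# so the href is derived from the target and cannot name the wrong kind.
class Link
  attr_reader :text, :target

  def initialize(text, target)
    @text = text
    @target = target
  end

  def href
    target.path
  end

  def to_html
    %(<a href="#{href}">#{text}</a>)
  end
end

# A page that is retained as an object and can be shown more than one way.
class Page
  def initialize(links)
    @links = links
  end

  # The ordinary view: what would go into the HTTP response body.
  def to_html
    @links.map(&:to_html).join("\n")
  end

  # An exploration view: downplay the text, show where each link goes
  # and what kind of object it points at.
  def link_view
    @links.map do |link|
      "#{link.href} (#{link.target.class}) <- #{link.text}"
    end
  end
end

page = Page.new([Link.new("edit", PromisedCertification.new(4))])
puts page.link_view
```

Because the href is computed from the target, the "used the wrong path method" mistake has nowhere to hide in this arrangement, which is the error-prevention side effect mentioned above.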

And so on… It’s easy enough for me to come up with a list of ways I’d like the app to speak of its internal workings. So what I’m thinking of doing is grabbing some web framework, doing what’s required to make it explorable, using it to build an app, and also building an exploration assistant in RubyCocoa (allowing me to kill another bird with this stone).

To be explicit, here’s my hypothesis:

An application built with programmer TDD, whiteboard-style and example-heavy business-facing design, exploratory testing of its visible workings, and some small set of automated whole-system sanity tests will be cheaper to develop and no worse in quality than one that differs in having minimal exploratory testing, done through the GUI, plus a full set of business-facing TDD tests derived from the example-heavy design.

We shall see, I hope.

Google talk references

One thing I meant to say and forgot: Just as the evolution of amphibians didn’t mean that all the fish disappeared, the creation of a new kind of testing to fit a new niche doesn’t mean existing kinds are now obsolete.

Context-driven testing:

Testing Computer Software, Kaner, Falk, and Nguyen
Lessons Learned in Software Testing, Kaner, Bach, and Pettichord
http://www.context-driven-testing.com
“When Should a Test Be Automated?”, Marick

Exploratory testing:

James Bach
Michael Bolton
Elisabeth Hendrickson
Jonathan Kohl

Left out:

The undescribed fourth age

Embedded vs. independent testers

Bruce Daley posts on how most humans are biased to think they’re less error-prone than they are. As far as I know, that’s a claim solidly based in empirical research. (See also Bruce Schneier’s The Psychology of Security.) From this, he concludes:

Given the nature of their work, software developers and software programmers suffer more from the illusion of knowledge and the illusion of control than most other professions, making them particularly subject to over-looking mistakes in their own code. Which is why software needs to be tested independently.

However. Consider the graph below.

Here, the programmer and independent tester start testing at the same time. (Bad programmer! Bad!) The programmer starts out with more knowledge of the app than the tester (the line marked P/+), but she also has a large amount of cognitive bias (P/-) and lacks testing skill. That makes her miss bugs her knowledge would otherwise allow her to find (the area under the red line). Moreover, her biases seem to be pretty impervious to evidence.

The tester starts out with less knowledge, but has no (relevant) cognitive biases at all. Also, his testing skill lets him ramp up his bug finding pretty fast—but it still takes him a while to overcome her advantage.

Which do you want doing the testing? If you’re shipping at time A, it looks like the programmer has the edge. (Compare the shaded areas under the curve.)

We could expect that advantage to erode over time. If the ship date is farther out, the independent tester would have an advantage, as this graph shows:

Even when all that matters is bug count, the decision is not straightforward, especially since it’s based on information you can’t know until after you’ve decided. (How long will it take the tester to get up to speed? How many and what kind of bugs will the programmer miss?)

On most projects, there are lots of other factors to consider.

So I encourage people not to make the assertion the post’s author does.

Project testing growth path

In response to a potential client, I wrote something very like the following. The interesting thing is that I’m placing more emphasis on manual exploratory testing. It’s not so much that I suddenly realize its importance as that automated business-facing tests continue to be hard to implement and adopt. More on that anon.

A short sketch of a reasonable growth path would go like this:

  1. Get the programmers sold on test-driven design. How difficult that is depends mainly on how much legacy code you have (where legacy code is, as Michael Feathers says, code without unit tests). Legacy code is hard to test, so programmers don’t see the benefits of testing as quickly, so it requires that much more discipline to get over what’s always a higher hump than with greenfield code. (Michael Feathers’ Working Effectively with Legacy Code is the gold standard book, though there’s an important strategy, “strangler applications”, that’s not covered in depth. Also, I’m the track chair for a new Legacy Code track at Agile2008, I just asked Feathers to give the keynote, and he says he has “a number of surprising proposals about how to make things better”.)

    I’ve come to feel that the most important thing to get across to programmers is what it’s like to work with code built on a solid base of tests. If they understand that early on, they’ll have a clear idea of what to shoot for, which helps with the pain of legacy code. I wrote a workbook to that end.

  2. At the same time, move testers away from scripted manual tests (if that’s what they’re doing) and toward a more exploratory style of manual testing. The people who are strongest on exploratory testing in Agile are Jonathan Kohl, Elisabeth Hendrickson, and Michael Bolton.

  3. As programmers do more unit testing, they will become accustomed to changing their design and adding code in support of their own testing. It becomes more natural for them to do the same for the testers, allowing them to do “automation-assisted exploratory testing”. (Kohl writes about this.) I like to see some of the testers learn a scripting language to help with that. Ruby is my favorite, for a variety of reasons. I wrote a book to help testers learn it.

  4. Over this period, the testers and programmers should shed most animosity or wariness they have toward each other. They’re working together and doing things to help each other. It helps a lot if they sit together.

  5. Once the programmers are sold on test-driven design, they will start wishing that the product owners would supplement what they say about what they want with clear, concrete, executable examples of what they want. That is: tests, written in the language of the business. That isn’t as easy to do as we thought it would be five years ago, but it can be done more or less well. Often, the testers will find a new role as helpers to the product owners. For example, they should get involved early enough to ask questions that lead to tests that prevent bugs (which is better than discovering the bugs after you’ve paid some programmers to implement them).

  6. Throughout this, some kinds of testing (like performance testing) don’t change all that much. For performance testing, I trust Scott Barber.

As a side note: I’m quite fond of the new The Art of Agile Development by Shore & Warden: enough to publicly declare that I’ll bring a copy to every team I work with. Lots of good from-the-trenches experience summarized there.

An occasional alternative to mocks?

I’m test-driving some Rails helpers. A helper is a method that runs in a context full of methods magically provided by Rails. Some of those methods are of the type that’s a classic motivation for mocks or stubs: if you don’t want them to blow up, you have to do some annoying behind-the-scenes setup. (And because Rails does so much magic for you, it can be hard for the novice to have a clue what that setup is for helpers.)

Let’s say I want a helper method named reference_to. Here’s a partial “specification”: it’s to generate a link to one of a Certification's associated users. The text of the link will be the full name of the user and the href will be the path to that user’s page. I found myself writing mocks along these lines:

mock.should_receive(:user_path).once.
     with(:id=>@originator.login).
     and_return("**the right path**")
mock.should_receive(:link_to).once.
     with(@originator.full_name, "**the right path**").
     and_return("**correct-text**")

But then it occurred to me: The structure I’m building is isomorphic to the call trace, so why not replace the real methods with recorders? Like this:

  def user_path(keys)
    "user_path to #{keys.canonicalize}"
  end

  def link_to(*args)
    "link to #{args.canonicalize}"
  end

  def test_a_reference_is_normally_a_link
    assert_equal(link_to(@originator.full_name, user_path(:id => @originator.login)),
                 reference_to(@cert, :originator))
  end

This test determines that:

  • the methods called are the right ones to implement the specified behavior. There’s a clear correspondence between the text of the spec (“generate a link to”) and calls I know I made (link_to).

  • the methods were called in the right order (or in an order-irrelevant way).

  • they were called the right number of times.

  • the right arguments were given.

So, even though my fake methods are really stubs, they tell you the same things mocks would in this case. And I think the test is much easier to grok than code with mocks (especially if I aliased assert_equal to assert_behaves_like).
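For anyone who wants to try the recorder idea outside Rails, here is a minimal, self-contained sketch. Nearly everything in it is an assumption made for illustration: the User and Certification structs, the HelperContext class, the sample user, and above all the body of reference_to, which is only one guess at a helper that would satisfy the partial specification. To stay short, the recorders use inspect where the real ones call canonicalize, which is good enough here because the only hash has a single key.

```ruby
# Hypothetical stand-ins for the real models; only the fields the
# helper touches are modeled.
User = Struct.new(:login, :full_name)
Certification = Struct.new(:originator)

# Recorders: instead of building HTML, each fake returns a string that
# describes the call it received, so nested calls produce a nested trace.
module Recorders
  def user_path(keys)
    "user_path to #{keys.inspect}"
  end

  def link_to(*args)
    "link to #{args.inspect}"
  end
end

# One possible implementation of the helper under test (an assumption,
# not the code from the original project).
module ReferenceHelper
  def reference_to(cert, role)
    user = cert.public_send(role)
    link_to(user.full_name, user_path(:id => user.login))
  end
end

# Plays the part of the context Rails would normally supply to a helper.
class HelperContext
  include Recorders
  include ReferenceHelper
end

context = HelperContext.new
originator = User.new("dawn", "Dawn Example")
cert = Certification.new(originator)

# The expected value is built by calling the recorders directly, so it
# reads like the specification.
expected = context.link_to(originator.full_name,
                           context.user_path(:id => originator.login))
actual = context.reference_to(cert, :originator)

raise "helper made unexpected calls" unless actual == expected
puts actual
```

If reference_to passed the wrong arguments, called user_path twice, or skipped link_to, the two strings would differ, and the difference would show which call went wrong.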

What I’m wondering is how often building a structure to capture the behavior of the thing-under-test will be roughly as confidence-building and design-guiding as mocks. The idea seems pretty obvious (even though it took me forever to think of it), so it’s probably either a bad idea or already widely known. Which?

Alternatively, I’m still missing the point of mocks.

P.S. For tests to work, you have to deal with the age-old problems of transient values (like dates or object ids) and indeterminate values (like the order of elements in a printed hash). I’m fortunate in that I’m building HTML snippets out of simple objects, so this seems to suffice:

class Object
  def canonicalize; to_s; end
end

class Array
  def canonicalize
    collect { | e | e.canonicalize }
  end
end

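class Hash
  def canonicalize
    # Sort on the key's string form so that equal keys always sort alike,
    # even when they are distinct objects (as equal Strings usually are).
    to_a.sort_by { | a | a.first.to_s }.canonicalize
  end
end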
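To see what canonicalizing buys you, here is a small standalone check that two hashes built in different orders come out the same. It repeats the three definitions so that it runs on its own. The sample hashes are made up, and the sketch orders pairs by each key's string form, an assumption chosen so that the result never depends on object identity.

```ruby
class Object
  def canonicalize; to_s; end
end

class Array
  def canonicalize
    collect { |e| e.canonicalize }
  end
end

class Hash
  # Assumption for this sketch: order pairs by the key's string form.
  def canonicalize
    to_a.sort_by { |pair| pair.first.to_s }.canonicalize
  end
end

# Same contents, different insertion order (hypothetical sample data).
a = { :id => "dawn", :format => "html" }
b = { :format => "html", :id => "dawn" }

raise "insertion order leaked through" unless a.canonicalize == b.canonicalize
p a.canonicalize   # prints [["format", "html"], ["id", "dawn"]]
```

Transient values such as dates or object ids would still need handling of their own; nothing here addresses them.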

A tagging meme reveals I short-change design

There’s one of those tagging memes going around. This one is: “grab the nearest book, open to page 123, go down to the 5th sentence, and type up the 3 following sentences.”

My first two books had pictures on p. 123.

The next three (Impro: Improvisation and the Theatre, AppleScript: the Definitive Guide, and Leviathan and the Air-Pump: Hobbes, Boyle, and the Experimental Life) didn’t have anything that was amusing, enlightening, or even comprehensible out of context. So I kept going, which is cheating I suppose. The last, How Designers Think, had this:

The designer’s job is never really done and it is probably always possible to do better. In this sense, designing is quite unlike puzzling. The solver of puzzles such as crosswords or mathematical problems can often recognize a correct answer and knows when the task is complete, but not so the designer.

That’s a hit. It made me realize a flaw in my thinking. You see, it reminded me of one of my early, semi-controversial papers, “Working Effectively With Developers” (referred to by one testing consultant as “the ‘how to suck up to programmers’ paper”). In its second section, “Explaining Your Job”, I explicitly liken programmers to problem solvers:

A legendary programmer would be one who was presented a large and messy problem, where simply understanding the problem required the mastery of a great deal of detail, boiled the problem down to its essential core, eliminated ambiguity, devised some simple operations that would allow the complexity to be isolated and tamed, demonstrated that all the detail could be handled by appropriate combinations of those operations, and produced the working system in a week.

Then I point out that this provides a way for testers to demonstrate value. I show a sample problem, then write:

Now, I’d expect any programmer to quickly solve this puzzle - they’re problem solvers, after all. But the key point is that someone had to create the puzzle before someone else could solve it. And problem creation is a different skill than problem solving.

Therefore, the tester’s role can be likened to the maker of a crossword or a mathematical problem: someone who presents a good, fully fleshed-out problem for the programmer to master and solve:

So what a tester does is help the programmer […] by presenting specific details (in the form of test cases) that otherwise would not come to her attention. Unfortunately, you often present this detail too late (after the code is written), so it reveals problems in the abstractions or their use. But that’s an unfortunate side-effect of putting testers on projects too late, and of the unfortunate notion that testing is all about running tests, rather than about designing them. If the programmer had had the detail earlier, the problems wouldn’t have happened.

Despite this weak 1998 gesture in the rough direction of TDD, I still have a rather waterfall conception of things: tester presents a problem, programmer solves it, we all go home.

But what that’s missing is my 2007 intellectual conception of a project as aiming to be less wrong than yesterday, to get progressively closer to a satisfactory answer that is discovered or refined along the way. In short—going back to the original quote—a conception of the project as a matter of design that’s at every level of detail and involves everyone. That whole-project design is something much trickier than mere puzzle-solving.

I used the word “intellectual” in the previous paragraph because I realize that I’m still rather emotionally attached to the idea of presenting a problem, solving it, and moving on. For example, I think of a test case as a matter of pushing us in a particular direction, only indirectly as a way of uncovering more questions. When I think about how testing+programming works, or about how product director + team conversations work, the learning is something of a side effect. I’m strong on doing the thing, weak on the mechanics of learning (a separate thing from the desire to learn).

That’s not entirely bad—I’m glad of my strong aversion to spending much time talking and re-talking about what we’ll build if we ever get around to building anything, of my preference for doing something and then taking stock once we have more concrete experience—but to the extent that it’s a habit rather than a conscious preference, it’s limiting. I’ll have to watch out for it.