Exploration Through Example

Example-driven development, Agile testing, context-driven testing, Agile programming, Ruby, and other things of interest to Brian Marick

Fri, 30 Dec 2005

Working your way out of the automated GUI testing tarpit (part 5)

part 1, part 2, part 3, part 4

In the last installment, I made an automated GUI test faster—but it still takes three seconds to run. In this installment, I'll bring it down to unit-test speed. In fact, I'll argue that it really is a unit test. The next-to-last step out of the GUI testing tarpit is to convert existing GUI tests into unit tests of rendering and of the business logic behind what's rendered. (The final step is to create workflow tests that really do have something to do with the GUI.)

The existing tests call enter and press methods on a Browser object (after going through differing types of indirection). That Browser object turns presses into HTTP requests. They're sent to localhost:8080 and received by a Server that's a separate process. The server picks apart the HTTP Request and sends commands like login and new case to an App. The App manipulates a Model, then returns the name of the next page to display. The server renders that page and sends it back to the browser.

We can speed up the declarative test by cutting out the network. NullBrowser has the same interface as Browser, but it calls the App directly. The test now runs in around 0.5 seconds. Almost all of that time is spent in XML parsing and XPATH searching. I wish the test were faster, but not enough just now to find a different XML parser.
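In sketch form, the idea looks something like this. (The class internals below are invented for illustration; only the "same interface as Browser, but direct calls to the App" shape is taken from the real code.)

```ruby
# Sketch of a NullBrowser: same enter/press interface as the
# HTTP-speaking Browser, but it calls the App directly.
# (Internals are hypothetical; see the downloadable code for the real thing.)
class App
  def login(args)
    'case_display_page'   # the App returns the name of the next page to display
  end
end

class NullBrowser
  def initialize(app)
    @app = app
    @pending = {}
  end

  # Record a field value, just as the real Browser would.
  def enter(field, value)
    @pending[field] = value
  end

  # Instead of turning the press into an HTTP GET, call the App directly.
  def press(button_name)
    command = button_name.downcase.tr(' ', '_').to_sym
    @current_page_name = @app.send(command, @pending.values)
    @pending = {}
    @current_page_name
  end
end
```

A test would then drive `NullBrowser.new(App.new)` exactly as it drove the networked Browser, with no server process and no round trips.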

(You can skip this section unless you care about what power the rewritten test loses.)

Have I weakened the test? This sequence of the Server's code (spread among several methods) is now unexercised:

      @dispatched << [command, args]
      @current_page_name = @app.send(command, args)
      @current_xhtml = @renderer.send("#{@current_page_name}_for", @app)
      response.body = @current_xhtml
      raise HTTPStatus::OK

But how many tests do I need to be confident this code works? And does this test need to be one of them? I think not, so we can live with this weakening, but I'll make a note to later ensure that some test checks the sequence.
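When that later test gets written, a pair of test doubles could pin the sequence down without any HTTP. A rough sketch (RecordingApp and RecordingRenderer are invented stand-ins, not classes from the app):

```ruby
# Invented doubles standing in for the real App and renderer.
class RecordingApp
  def login(args)
    'case_display_page'             # name of the next page to display
  end
end

class RecordingRenderer
  def case_display_page_for(app)
    '<html>case display</html>'
  end
end

# The server's dispatch-and-render sequence, inlined for checking.
app = RecordingApp.new
renderer = RecordingRenderer.new
dispatched = []

command, args = :login, ['unimportant', 'unimportant']
dispatched << [command, args]
current_page_name = app.send(command, args)
current_xhtml = renderer.send("#{current_page_name}_for", app)
```

Assertions against `dispatched`, `current_page_name`, and `current_xhtml` would then check each step of the sequence in isolation.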

Some server setup now also goes untested. It looks like this:

  def install_UI_servlets
    install_generic_proc('/') { | request, app |
      # ... (body elided) ...
    }
    install_command(:login, 'login', 'password')
    install_command(:record_case, 'client', 'clinic_id')
    install_command(:record_visit, 'diagnosis', 'charges')
    install_command(:record_audit, 'auditor', 'variance')
  end

If, for example, record_audit were misspelled in its install_command line, our changed test would no longer detect that. So we need at least one test that exercises each application command through HTTP. It could be a separate test for each command, or one test for all the commands together, or anything in between—but this test no longer has anything to do with that. I'll defer the issue of those tests until what I think will be part 7. (Note that exercising each command will check the dispatching and rendering code shown earlier, so I can erase my earlier reminder.)

The real HTTP server renders a page for each command, so the earlier version of this test did as well. The new version only renders the one page it cares about. So certain bugs in rendering might not be caught by this test. (They'd have to be very unsubtle bugs, since even the earlier version never actually checked any of the HTML along the way to the page-under-test. Only something like a thrown exception would be noticed.) Still, we need at least one test that checks each rendered page. I'll keep that in mind as I continue.

I next did a little cleanup, removing the fake browser object from the execution path since it really adds no value. I'll skip the details. Suffice it to say that the effort surfaced some duplication hidden behind this surface:

  def test_cannot_append_to_a_nominal_audit
    as_our_story_begins {
      we_have_an_audit_record_with(:variance => 'nominal')
    }

    assert_page_title_matches(/Case \d+/)
  end

The duplication made me wonder: what's this test really about? Does it have anything at all to do with movement through pages? No, it's about the rendering of pages in the presence of model state that ought to affect what gets rendered. These kinds of tests are better described like this:

Given an app with particular state,
when rendering a particular page:
    I can make certain assertions about that page.

Or, in code:

  def test_nominal_audit_prevents_the_add_audit_action
    given_app_with {
      audit_record('variance' => 'nominal')
    }
    when_rendering(:case_display_page) {
      # ... assertions that the page offers no way to add an audit ...
    }
  end

This is a business-facing test in that it describes a business rule: if you've got one nominal audit, there should be no way to add any more audits. It's also like a unit test in that it gives very specific instructions to a programmer. In my case, the fact that this test fails instructs me to change a particular localized piece of code:

  def case_display_page(app)
    # ... elided, down to the nested call that renders the button:
                        submit('Add an Audit Record'))))
    # ...
  end

(I'll talk about my rendering peculiarities in some later installment.)

A lot of Fit tests share this property of being about localized business rules (or business rules that should be localized). It seems to be a distinct category of business-facing test, one that often gets overlooked because of the assumption that a customer/acceptance/functional test must be end-to-end and must go through the same interface as the user does.

My test here should be one of a file-full of tests that describe what's most important—from a business point of view—about the presentation of a particular place (or interaction context) in the application. Another test of that sort would be this one:

  def test_typical_case_display_page
    given_app_with {
      case_record('clinic_id' => 19600219)
    }
    when_rendering(:case_display_page) {
      assert_page_title_matches(/^Case 19600219/)
      # ... plus assertions that the add-visit and add-audit actions are available ...
    }
  end

This test describes three facts about the Case Display page's default appearance that must survive any fiddling with how it looks: it must have a title that includes the case's clinic ID, and there must be a way to cause the add-visit and add-audit actions in the App. (This test passes, by the way, though the previous one continues to fail.)

Consider this test something like a wireframe diagram in code.

Most tarpit GUI tests are addressing, explicitly and implicitly, several issues all jumbled together. If you separate them, you get something that's both faster and much more clear. Here, I've addressed the particular issue of what must be true of a page. Later, I'll address the particular issue of what must be true of navigation among pages. But first, I'll make my test pass and see what that suggests about hooking business rules into rendering.

See the code for complete details.

## Posted at 20:30 in category /testing [permalink] [top]

Tue, 20 Dec 2005

Two Agile Alliance programs you may be interested in

  1. The Academic Research program "aims to encourage researchers to focus on research questions and issues concerned with agile software development. Researchers are encouraged to apply for small grants to support activities such as conducting a series of visits to practitioner sites, performing interviews, supporting a researcher for a short time to extend existing work into agile development, running workshops, and so on." I'm on the approval committee. We've approved two proposals so far.

  2. The "Agile Times" newsletter is being reborn. The first of the new issues will be out in March 2006. Remaining issues will come out quarterly. The editor is Rebecca Traeger, formerly editor of Better Software. Both Mike Cohn and I have worked with her. She's good. She's also being paid, so the deadline is more firm than it can possibly be with volunteer work.

    The newsletter is currently looking for agile how-to articles, case studies, opinion pieces, book reviews, conference reports, and emerging trends pieces. Articles can cover soft skills (selling, cooperation, interactions) or hard skills (coding, testing, etc.). Deadline for the first issue: January 23.

    Also contact Rebecca if you want to advertise.

## Posted at 11:11 in category /misc [permalink] [top]

Mon, 19 Dec 2005

Working your way out of the automated GUI testing tarpit (part 4)

part 1, part 2, part 3

The story so far: One of my main goals for tests is that they contain no excess words. That means that a GUI test should not describe the path by which it gets to the page under test. In part 1, I described a declarative format. With it, the test writer specifies all and only the facts that should be true of the app at the point the test begins. Part 2 gives a simple implementation that figures out a path through the app that makes those facts true. Part 3 recommends that you migrate tests to this format only as they fail.

The new format tests, though, run as slowly as they did before being migrated. Now it's time to make them faster. I'll do that in two steps. The first doesn't even double their speed. That's hardly sufficient, but the implementation has a side effect that helps the programmer and exploratory tester.

Previously, I only pretended the app talked across the network. Since that fakery would make any timings useless, it's now running on a real server (WEBrick), fielding real live HTTP. So localhost:8080 shows this stunningly attractive UI:

Welcome to the Case Management System

Authorized Users Only


In part 1, I wrote three versions of a test. All three of them communicate with the server in exactly the same way: they send eight different HTTP GET commands (just as a browser would if you visited the app and then pressed seven buttons on seven pages).

To speed up the test, I've made it remember all eight of the commands the first time it runs. (That all happens behind the scenes; there are no changes to the test.) Now later runs can send the commands in a big glob via a side channel. That avoids seven of the round trips.
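In miniature, the record-and-replay split might look like this (the CommandRecorder and TinyApp classes are illustrative inventions; part 4b describes the real implementation):

```ruby
# Illustrative sketch: record each command as it is dispatched, then
# replay the whole list in one glob instead of one round trip apiece.
class CommandRecorder
  attr_reader :recorded

  def initialize(app)
    @app = app
    @recorded = []
  end

  def dispatch(command, args)
    @recorded << [command, args]
    @app.send(command, *args)
  end

  # Replay a previously recorded command list with no network traffic.
  def replay(command_descriptions)
    command_descriptions.each { |command, args| @app.send(command, *args) }
  end
end

# A stand-in app that just logs which commands reached it.
class TinyApp
  attr_reader :log
  def initialize; @log = []; end
  def login(name, password); @log << :login; 'front_page'; end
  def new_case; @log << :new_case; 'case_entry_page'; end
end
```

Replaying the recorded list against a fresh app reproduces the same application state the original command-by-command run produced.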

The results are underwhelming. The original test takes 5.1 seconds. The version that sends the big glob takes 3.2 seconds. The more complicated the test setup path, the greater the speedup would be, but still—is this worth the trouble?

Not so far, but it will be after the next speedup. I hope. In the meantime, there's a useful spinoff feature. One of the reasons I hate anything to do with improving a UI is that every time you tweak a page, you have to navigate to it to check whether the change looks right. Having to do that four or five times in a row drives me wild. So I wish this were a universal law:

You can get to any page in an app in one step.

Now that we can remember application state, that's possible here. Imagine the following:

You have to tweak a particular page in the UI. You navigate to that page, then type this:

  ruby hyperjump.rb --snapshot myfix

You go into the code, make a change, reload the app, and return the app to its previous state like this:

  ruby hyperjump.rb myfix --open

The --open tells hyperjump to open localhost:8080/refresh in the browser. That shows the page corresponding to the saved state, which is the page you're tweaking.

This jump-to-page feature would also be useful for exploratory testing. It's common to go to the same place in the program multiple times during a bout of exploratory testing. Perhaps you're trying to learn more about the circumstances in which a bug occurs (a kind of failure improvement). Or you're trying different paths through the program, each of which starts some distance into it.

There's nothing new about using captured commands to accelerate tasks. People have been using GUI capture/replay tools for this kind of thing since the dawn of time. But it's nice that the feature fell out of a different goal.

For more about the implementation, refer to part 4b. The code has the complete details.

## Posted at 22:49 in category /testing [permalink] [top]

Working your way out of the automated GUI testing tarpit (part 4b)

Here are some (decidedly optional) details about the implementation described in part 4.

Consider this test:

  def test_cannot_append_to_a_nominal_audit
    as_our_story_begins {
       we_have_an_audit_record_with(:variance => 'nominal')

    assert_page_has_no_button_labeled('Add Audit')

as_our_story_begins sets up the application state by deducing a sequence of commands to send to the browser. After that's done the first time, the sequence is stored in a file devoted to a single test method. The one for the test we've been using is path-cache/declarative-test.rb/test_cannot_append_to_a_nominal_audit. Its contents look like this:

[[:login, ["unimportant", "unimportant"]],
  [:new_case, []],
  [:record_case, ["unimportant", "213"]],
  [:add_visit, []],
  [:record_visit, ["unimportant", "100"]],
  [:add_audit, []],
  [:record_audit, ["unimportant", "nominal"]]]

The next time declarative-test.rb is run, as_our_story_begins notices there's a cache file, and sends its contents over an XMLRPC connection. The server turns it into an array:

  command_descriptions = eval(command_string)
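That write/read round trip is plain Ruby-literal serialization: `inspect` to produce the file's contents, `eval` to turn them back into data. A minimal sketch of the idea (the file handling here is invented; eval is tolerable only because the file is generated by our own trusted test run):

```ruby
require 'tempfile'

commands = [[:login, ['unimportant', 'unimportant']],
            [:record_audit, ['unimportant', 'nominal']]]

# Write: #inspect emits a Ruby array literal.
cache = Tempfile.new('path-cache')
cache.write(commands.inspect)
cache.close

# Read: eval the literal back into the same array. Safe only because
# the file came from our own test run, not from untrusted input.
command_descriptions = eval(File.read(cache.path))
```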

Then each command is dispatched to the App object:

  command_descriptions.each { | one |
    @current_page_name = dispatch(*one)
  }

  def dispatch(command, args)
    @dispatched << [command, args]
    @app.send(command, *args)
  end

That dispatch method is exactly the same method used to react to requests from the browser:

    @current_page_name = dispatch(command, values(request, *required_args))

By doing that, I reduce the suspicion that the restored state is somehow different from the one the app had at the moment the snapshot was taken.

The only difference between the two different routes into the app is what happens after dispatching. dispatch returns the name of the next page to send to a browser. When the request comes from a browser, the page is rendered and sent back. When it comes by the XMLRPC side channel, nothing is done, but the most recent page name is stashed away. When the browser visits localhost:8080/refresh, the name is used to render the page:

  install_generic_proc('/refresh') { | request, app |
    # ... render the page named by @current_page_name ...
  }


  • There's a way in which all of my tests could be broken (even before caching). My test doesn't drive a real browser. Instead, I use a Browser object that sends GET requests directly to the server. As a result, nothing in the test will fail if the wrong pages are rendered. The tests would appear to work perfectly fine if every GET request returned a blank page instead of any of the correct forms.

    That wouldn't be a problem in real life. In real life, my tests would be issuing commands to a browser via Watir or Selenium. I should probably use one of them for this demo but (1) Watir only works with Windows IE and I use a Mac, and (2) I'm too lazy to learn Selenium right now.

  • The list of commands is stored in the app, not in the test. When it's time to cache the state, the test asks the app for the list. No thought went into the decision to do it that way. Maybe some should have.

  • Previously, the tests bogusly succeeded. They fail now, so that I can later write the code to make them pass.

  • The tests launch the server in a subprocess (see test-util.rb). They use fork(), kill(), and wait(). I don't know if those work on Windows.

Credit: The idea of replaying server-level commands just popped into my head. It might have been put there by Michael Silverstein's "Logical Capture/Replay".

## Posted at 22:49 in category /testing [permalink] [top]

Thu, 15 Dec 2005

Bugs matter

A working undo command can sure be handy:

After initially denying any responsibility for the J-Com snafu, exchange executives acknowledged this week that flaws in their electronic trading system prevented Mizuho from correcting its order and minimizing losses. Mizuho traders realized their mistake within 85 seconds of placing the erroneous order and made four attempts to cancel it. It was rejected each time. [...]

On Monday, Japan's market regulator estimated Mizuho's loss at $331 million.

Evidence continues to mount that no honorable election official should support voting machines without a paper trail:

[...] in three separate attempts over a four month period, computer experts Dr. Herbert Thompson and Harri Hursti visited the Leon County Elections Office in their efforts to penetrate the county voting tabulation equipment and alter election data. [...]

Granted the same access as an employee of our office, it was possible to enter the computer, alter election results, and exit the system without leaving any physical record of this action. [...]

Based upon the data developed out of this exercise it is the opinion of the Leon County Supervisor of Elections that any effort to limit or remove the manual examination of paper ballots to confirm the correctness of election results is not in the public interest.

## Posted at 07:35 in category /misc [permalink] [top]

You know you overuse particular examples on your site when...

... you receive mail like this:

I'm impressed with your professional website related to goat and cow milking devices. My website is a NON-COMPETING informational site offering collections of articles, diagrams, and other publications that relate to and may help supplement your website. Since some of your site visitors may be interested in learning more about animal milking devices, feel free to link to my site. By linking to my site, you will be adding value to your own site by providing relevant content for your site visitors. Following is the link I think will be the most helpful for your customers: http://www.braindex.com/products/175+-ANIMAL-(GOAT-&-COW)-MILKING-DEVICE-RELATED-PATENTS-ON-CD-29.htm. Please visit this link yourself and consider inserting it on your website.

(I don't think this is phishing or some variant, which would be less interesting than having a search bot—albeit one with wide tolerances and low cleverness—think my site is actually about cows.)

## Posted at 07:35 in category /junk [permalink] [top]

Tue, 13 Dec 2005

Convention over configuration workbook?

Item: I'm fond of Bill Wake's Refactoring Workbook because it shows lots of examples of refactorings in action.

Item: Rails is hot, hot, hot these days.

Item: One reason it's popular is convention over configuration, which...

... places ease of use for the majority of situations ahead of the need to provide maximum flexibility for the few. The way this is done is through the adoption of coding conventions that automatically embed a certain amount of configuration right into the framework. Convention makes certain assumptions about how things will be put together and by making these assumptions implicit in the code it frees the framework from the burden of having to spell out every intention through explicit configuration. The conventions can be overridden to handle cases where the convention might not be optimal but speed and ease of use are the big benefit that comes from adopting them.

Item: Berin Loritsch says:

Java applications can be developed using [convention over configuration], but often aren't. The problems come into play when the framework you are using works against you. Other times its just too difficult to do right. You will have to resort to reflection and other black magic tricks.

Item: Better Software has had an author drop out. Three times before when that's happened, I've quickly written a replacement article. Two of them have worked out rather well, I think. (You can see them on the sidebar: "Behind the Screens" and "Bypassing the GUI".)

Therefore, I'm thinking of writing an article on convention over configuration in Java-style languages. (Despite the chain of thought implied here, the idea was really Mike Cohn's.) The problem is, I don't have any personal experience to draw on. Do you have examples that would let me produce an article with something of the flavor of Wake's book? If so, you know how to reach me.

## Posted at 07:19 in category /misc [permalink] [top]

Mon, 12 Dec 2005

Working your way out of the automated GUI testing tarpit (part 3)

part 1, part 2

In the real world, you can't leap out of a tarpit in one bound. The same is true of a metaphorical tarpit. Here's a scenario to avoid:

  • You have 2500 tests. At any given moment, some 200 of them are failing. Most of the failures are because of irrelevant interface changes, not because the code has a bug. As a result, hardly anyone looks at the failing tests.

  • Someone invents a much more compact, much more maintainable way of writing tests.

  • Someone (likely that same person) is assigned the task of rewriting all the tests in the new form.

  • She gets through about 300 before something urgent comes up. Rewriting the tests becomes a background task, one so tedious that it somehow never makes it back to the foreground.

  • A year later, you have 2500 tests. 336 of them are rewritten (perhaps not the most important ones—no one knows which of the old suite are the important ones). At any given moment, those 336 are trustworthy, but 173 of the unconverted tests are failing for the same old reason. No one looks at those tests.

Even if the task is plowed through to the end, it has not changed the habits of the team, so there's no counterforce to whatever forces caused the problem in the first place. I'm with William James on the importance of habit:

only when habits of order are formed can we advance to really interesting fields of action [...] consequently accumulate grain on grain of willful choice like a very miser; never forgetting how one link dropped undoes an indefinite number.

Therefore, my bias is toward having everyone convert the test suite one failure at a time:

  • As part of every story, spend around 20 minutes fixing failing tests in the untrustworthy suite. You're probably better off just fixing the next failing one than trying to find which one is most worth fixing.

    If a test has found a legitimate bug, either fix that bug immediately (if the fix doesn't take long) or put it on the backlog to be scheduled as a story.

  • Fixed tests get moved over to a reliable suite. That suite is run as part of the continuous integration build. No story is done if any of those tests fail. (I would not include tests for backlog bugs in this suite.)

  • This process continues ad infinitum. You may never eliminate the untrustworthy suite. If some test there never fails, it will never get converted.

Some fraction — perhaps a large fraction — of the old tests are likely to be worthless. (More precisely, they're worth less than the cost of reviving them.) It's hard to persuade people to throw away tests, but nonetheless I'd try. (There are unknown risks to throwing tests away. My bias would be to do it and let the reality of escaped bugs make the risks better known. Tests can always be un-thrown away by retrieving them from Subversion.)

A tempting alternative is simply to delete the old test suite and start over. Spend the 20 minutes writing a new test instead of reviving a failed one. That might well be time better spent. But it's a tough sell because of the sunk cost fallacy.

## Posted at 13:06 in category /testing [permalink] [top]

UI design links from Jared M. Spool

What makes a design intuitive? is a nice, readable short article about the two ways to make an interface that people will call intuitive.

Designing embraceable change is a follow-on that talks about how to introduce a new UI to an existing community. This has relevance to Agile projects that are continually tinkering with the UI.

The series ends with The quiet death of the major relaunch. Here's a trivial example of the approach:

At eBay, they learned the hard way that their users don't like dramatic change. One day, the folks at eBay decided they no longer liked the bright yellow background on many of their pages, so they just changed it to a white background. Instantly, they started receiving emails from customers, bemoaning the change. So many people complained, that they felt forced to change it back.

Not content with the initial defeat, the team tried a different strategy. Over the period of several months, they modified the background color one shade of yellow at a time, until, finally, all the yellow was gone, leaving only white. Predictably, hardly a single user noticed this time.

The key point in this last article is this:

Our findings show that consistency in the design plays second fiddle to completing the task. When users are complaining about the consistency of a site, we've found that it is often because they are having trouble completing their tasks.

## Posted at 10:22 in category /links [permalink] [top]

Agile consultants

In my role as the overcommitted and underskilled Agile Alliance webmaster, I add new corporate members to the site. I realized today that we really have quite an impressive variety there. You can find companies in out-of-the-way places (Topeka, Kansas, USA). It's less easy to find companies that have particular skills, since the blurbs don't generally focus on a company's specific competitive advantage. Nevertheless, I recommend it to you if you're looking for a consultancy.

P.S. Not me, though. Exampler Consulting isn't a corporate member because I've never gotten around to getting a logo.

P.P.S. Corporate membership was Rebecca Wirfs-Brock's idea.

## Posted at 10:22 in category /agile [permalink] [top]

Sun, 11 Dec 2005

Working your way out of the automated GUI testing tarpit (part 2)

In the previous installment, I described a test that looked like this:

  def test_cannot_append_to_a_nominal_audit
    @browser.as_our_story_begins {
      we_have_an_audit_record_with(:variance => 'nominal')
    }

    assert_page_has_no_button_labeled('Add Audit')
  end

The test doesn't tell how to get to the case display page, create an audit record, create the visit record that audit records require, etc. The code behind the scenes has to figure that out.

I won't show that code. You can find it here. It's a spike, so don't give me a hard time about the lack of tests. What matters is that it works off a description of what transitions are possible in the program. (The transition descriptions themselves are in the downloadable code.)

Given a complete set of definitions, statements like this one:

we_have_an_audit_record_with(:variance => 'nominal')

name "milestones" along the path the program has to take to get ready for the test. A simple breadth-first search constructs a complete path out of the milestones. The path contains appropriate instructions to fill in fields and press buttons. Thus a declarative test is turned into a procedure.
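A toy version of that search, with an invented transition table (the spike's real table is richer, but the shape is the same):

```ruby
# Toy breadth-first search over page transitions. Page names and
# button presses here are invented for illustration.
TRANSITIONS = {
  :login_page        => { :press_login        => :front_page },
  :front_page        => { :press_new_case     => :case_entry_page },
  :case_entry_page   => { :press_record_case  => :case_display_page },
  :case_display_page => { :press_add_audit    => :audit_entry_page },
  :audit_entry_page  => { :press_record_audit => :case_display_page },
}

# Return the shortest list of presses that moves the app from
# +start+ to +goal+, or nil if no path exists.
def path_to(start, goal)
  queue = [[start, []]]
  seen = [start]
  until queue.empty?
    page, presses = queue.shift
    return presses if page == goal
    (TRANSITIONS[page] || {}).each do |press, next_page|
      next if seen.include?(next_page)
      seen << next_page
      queue << [next_page, presses + [press]]
    end
  end
  nil
end
```

Each milestone names a goal state; chaining one search per milestone yields the full procedural script of fields to fill and buttons to press.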

P.S. In the declaration, the button's name is given. That's wrong. It should be the HTML id. Like I said: a spike.

## Posted at 09:36 in category /testing [permalink] [top]

Thu, 08 Dec 2005

Working your way out of the automated GUI testing tarpit (part 1)

In this series, I'll present two ideas that have been percolating in my head for a while. Last week, I began thinking they might be appropriate for a client. We ended up taking a different approach, but not until after I'd spent an evening building a prototype. Yesterday, I was so sick of replying to mail, chipping away at a task backlog that's metastasized during recent travel, and slogging through other things I really ought to be doing that I rebelled and decided to rewrite the prototype. It was fun.

The general idea here is (1) to gradually work your way toward declarative tests that generate their own page navigation and (2) to use caching to speed up tests and maybe improve program structure.

I've never tried these ideas for real. They might be impractical in the wild.

Here are three GUI-oriented tests, in increasing order of goodness. The scenario has something to do with a veterinary clinic (of course). In each test, a case record is created, an animal visit is recorded, and an audit record is appended. (All the steps are necessary, because you can't record a visit until there's a case, and you can't create an audit record until there's a visit.) Normally, there can be multiple audits attached to a case. But if the first audit is marked as "nominal", it's the only one that can ever be created. If so, there should be no "Add Audit" button on the Case Management page. That's what the test checks. (It also uses the title of the page to make sure the assertion is checking the right page.)

The first test is like one you might get from a straightforward use of Watir or jWebUnit.

  def test_cannot_append_to_a_nominal_audit

    enter(:login, 'unimportant')
    enter(:password, 'unimportant')

    press('New Case')

    enter(:client, 'unimportant')
    enter(:clinic_id, '213')
    press('Record Case')

    press('Add Visit')

    enter(:diagnosis, 'unimportant')
    enter(:charges, '100')
    press('Record Visit')

    press('Add Audit')

    enter(:auditor, 'unimportant')
    enter(:variance, 'nominal')
    press('Record Audit')

    assert_page_title('Case Management')
    assert_page_has_no_button_labeled('Add Audit')
  end

What are the problems with this test?

  • In all this code, what's important? Only two lines: the one that enters 'nominal' into the :variance field and the final assertion about the missing button. Nothing in the code makes those two stand out from the incidental setup around them, so such tests are hard to read.

  • The test is fragile in the face of change. Change the name of a field, introduce another field that has to be filled in, split a page in two: all of these will break this test and many, many others besides. Now you get to fix them all. Because they're hard to read, it's easy to fix them badly. (There are a lot of tests out there that inadvertently no longer test what they're supposed to test.)

  • The test is likely to be slow, because it drives a browser. Programmers who are used to a fast test-code-refactor cycle won't put up with that. So the tests will be run infrequently, and they'll provide information well after it'd be most valuable.

To solve the problem of fragility, some people put a library between the tests and the browser. Here's what such a test would look like:

  def test_cannot_append_to_a_nominal_audit

    login('unimportant', 'unimportant')
    new_case('unimportant', '213')
    new_visit('unimportant', '100', nil)
    new_audit('unimportant', 'nominal')

    assert_page_title('Case Management')
    assert_page_has_no_button_labeled('Add Audit')
  end

  • The test is easier to read, but it has some problems. The fact that an audit record exists is essential to the test, whereas the existence of a visit is incidental. Yet they're given equal prominence. The use of the "unimportant" token makes the use of "nominal" stand out - that particular value must be important to this test. But what about "213" and "100"? They're not important, but there's no convenient "ignore this value" token for numbers.

  • It is more resistant to change than the previous test. If there are changes within a page, you might only have to change one library method.

    But other changes can still break a bunch of tests. In the next iteration, suppose an FDA contact record has to be added before an audit can happen. That means every test that goes directly from adding a visit to adding an audit record will become broken. Either you fix all the tests or you change new_visit to silently add an FDA contact record - which I guarantee will make for some frustrating debugging down the road.

  • It's just as slow as the previous version.

I believe such a test is still not good enough. It's still procedural - it's still of the form "do this... now this... now this... finally you can check what you care about." Here's a better test:

  def test_cannot_append_to_a_nominal_audit
    @browser.as_our_story_begins {
      we_have_an_audit_record_with(:variance => 'nominal')
    }

    assert_page_has_no_button_labeled('Add Audit')
  end

  • This test is declarative. It says that there must be a case with an audit record, but it doesn't say how that record's created. Moreover, it strives to be minimal, to use no word unless it's clearly related to the intention of the test. It says nothing about any of the fields that the previous tests described as "unimportant". It's even silent on the existence of case records and visits, simply assuming that whatever's required for there to be an audit record has happened. (Presumably, requirements like "you can't add an audit record unless there's been a visit" have been tested elsewhere.) All of this makes the test still easier to read.

  • The test is even more resistant to change. Because there's no sequence of steps in the test - no workflow - changes to the workflow will require localized changes in the support code, not to the tests themselves.

  • However, the test is still just as slow as the other ones, so there's room yet for improvement.
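One plausible shape for the support behind that test, sketched here with the helper names taken from the test itself. Everything else is my assumption - the real version appears in the next installment. The point of the shape: the workflow lives in exactly one method, so a workflow change means editing this class, not every test.

```ruby
# A sketch (assumed, not the real implementation). Steps are recorded
# rather than sent to a browser, just to keep the example self-contained.
class DeclarativeBrowser
  attr_reader :steps

  def initialize
    @steps = []
  end

  def as_our_story_begins(&block)
    instance_eval(&block)   # lets the test call helpers without a receiver
  end

  # The whole workflow is here, in one place. If the next iteration
  # inserts an FDA-contact step, only this method changes.
  def we_have_an_audit_record_with(fields)
    @steps << [:login, 'unimportant', 'unimportant']
    @steps << [:new_case, 'unimportant', '213']
    @steps << [:new_visit, 'unimportant', '100']
    @steps << [:new_audit, 'unimportant', fields[:variance]]
  end
end

browser = DeclarativeBrowser.new
browser.as_our_story_begins {
  we_have_an_audit_record_with(:variance => 'nominal')
}
```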

In the next installment, I'll show what the code behind the scenes looks like. Right now, I want to emphasize that all three tests do the same thing. Here's an execution log for the third test:

~/src/procedural2declarative 601 $ ruby declarative-test.rb
Loaded suite declarative-test
Go to <http://app.com/app>
Enter "unimportant" into field :login
Enter "unimportant" into field :password
Press "Login"

Press "New Case"

Enter "unimportant" into field :client
Enter "213" into field :clinic_id
Press "Record Case"

Press "Add Visit"

Enter "unimportant" into field :diagnosis
Enter "100" into field :charges
Press "Record Visit"

Press "Add Audit"

Enter "nominal" into field :variance
Enter "unimportant" into field :auditor
Press "Record Audit"

Finished in 0.005032 seconds.

1 tests, 1 assertions, 0 failures, 0 errors

## Posted at 08:02 in category /testing [permalink] [top]

Mon, 21 Nov 2005

Story card style

I've been corresponding with Rachel Davies about story card style. She said something wise. Here's a slightly edited version of the correspondence (with her permission).

It all began when I asked a question about the "As a [role], I want [ability], so that [benefit]" style of writing stories. (See Mike Cohn's User Stories Applied for a description.)


Incidentally, I dropped this story format about 2 years ago because it encourages people to think of story cards as mini requirements documents and encourages the grumpy refrain "but it says on the card". I now encourage teams to write only the story name with a marker pen in large caps on 6x4 unlined index cards (easier to read by people standing around the planning table or board) because this encourages conversation to continue during the iteration.


Interesting. What now encourages focus on the benefit and person who benefits?


I agree that novice teams need to be encouraged to ask their customer about this information in the planning game. I recommend using a checklist for each story (do we understand business value, story beneficiary, acceptance test). However, if this information all gets transcribed onto the card then developers just read from the card during the iteration and even if the customer is sitting nearby they don't tend to ask questions. If you leave only the story name on the card then the developers are forced to replay the conversations with the customer (which is a good thing).

## Posted at 09:38 in category /agile [permalink] [top]

Two oblique commentaries on abuse

Not all Americans wanted to [treat prisoners well]. Always some dark spirits wished to visit the same cruelties on the British and Hessians that had been inflicted on American captives. But Washington's example carried growing weight, more so than his written orders and prohibitions. He often reminded his men that they were an army of liberty and freedom, and that the rights of humanity for which they were fighting should extend even to their enemies. Washington and his officers were keenly aware that the war was a contest for popular opinion, but they did not think in terms of 'images' or 'messages' in the manner of a modern journalist or politician. Their thinking was more substantive. The esteem of others was important to them mainly because they believed that victory would come only if they deserved to win. Even in the most urgent moments of the war, these men were concerned about ethical questions in the Revolution.

David Hackett Fischer, Washington's Crossing, p. 276

Confirmation bias is a phenomenon wherein decision makers have been shown to actively seek out and assign more weight to evidence that confirms their hypothesis, and ignore or underweight evidence that could disconfirm their hypothesis [...]

Among the first to investigate this phenomenon was Wason (1960), whose subjects were presented with three numbers (a triple):

2 4 6

and told that the triple conforms to a particular rule. They were then asked to discover the rule by generating their own triples and using the feedback they received from the experimenter. Every time the subject generated a triple, the experimenter would indicate whether the triple conformed to the rule (right) or not (wrong). The subjects were told that once they were sure of the correctness of their hypothesized rule, they should announce the rule.

While the actual rule was simply "any ascending sequence," the subjects seemed to have a great deal of difficulty in inducing it, often announcing rules that were far more complex than the correct rule. More interestingly, the subjects seemed to only test "positive" examples; that is, triples that subjects believed would conform to their rule and thus confirm their hypothesis. What the subjects did not do was attempt to falsify their hypotheses by testing triples that they believed would not conform to their rule.

Confirmation Bias, Wikipedia.

In an October 2002 speech in Cincinnati, for example, President Bush said: "We've learned that Iraq has trained al Qaeda members in bomb-making and poisons and gases." Other senior administration officials, including Secretary of State Colin L. Powell in a speech to the United Nations, made similar assertions. Al-Libi's statements were the foundation of all of them.

Al Qaeda-Iraq Link Recanted, Washington Post, July 31, 2004.

According to CIA sources, Ibn al Shaykh al Libbi, after two weeks of enhanced interrogation, made statements that were designed to tell the interrogators what they wanted to hear. Sources say Al Libbi had been subjected to each of the progressively harsher techniques in turn and finally broke after being water boarded and then left to stand naked in his cold cell overnight where he was doused with cold water at regular intervals.

His statements became part of the basis for the Bush administration claims that Iraq trained al Qaeda members to use biochemical weapons. Sources tell ABC that it was later established that al Libbi had no knowledge of such training or weapons and fabricated the statements because he was terrified of further harsh treatment.

CIA's Harsh Interrogation Techniques Described, ABC News, Nov. 18, 2005.

## Posted at 09:15 in category /misc [permalink] [top]

Fri, 18 Nov 2005

Two milestones, noticed while paying bills

I am now a million-mile member of the American Airlines frequent flier program. This entitles me to two luggage tags.

I am also entitled to a Free! "Guide to Planning and Promoting Your Business Anniversary" in honor of fifteen years of business.


P.S. Hugh Sasse points out that the Ruby extensions library has a method like the one that's mentioned below. After a quick glance, I think it's better than mine.

P.P.S. Oh, OK, I also get eight upgrade segments and permanent Gold membership. And a membership card that says "1 Million" on it.

## Posted at 11:38 in category /junk [permalink] [top]

Mon, 14 Nov 2005

Attractive Ruby tests that use multi-line strings

Suppose you're testing some method whose input is a multi-line string. You could write something like this:

  def test_tags_can_be_deeply_nested
    table = "<table>
             <tr><td>
               <table>
                <tr>
                 <td>
                   <table>
                     <tr>
                       <td>
                           Way nested
                       </td>
                     </tr>
                   </table>
                 </td>
                </tr>
               </table>
             </td></tr>
             </table>"
    slices = TagSlices.new(table, "table")
    # blah blah blah

That's fine - unless whitespace in the middle of the string is significant. The above method has no whitespace on the string's first line, but a whole lot on the others. What if I needed it all to be flush left? This is ugly:

  def test_tags_can_be_deeply_nested
    table =
"<table>
 <tr><td>
   <table>
    <tr>
     <td>
       <table>
         <tr>
           <td>
               Way nested
           </td>
         </tr>
       </table>
     </td>
    </tr>
   </table>
 </td></tr>
 </table>"
    slices = TagSlices.new(table, "table")
    # blah blah blah

I could argue that the ugliness makes it too hard to see the structure of the test and too hard to skim a file quickly and see what the tests are. That argument may even be true, but the real reason I don't like it is that it's ugly.

So I write such tests like this:

  def test_tags_can_be_deeply_nested
    table = "<table>
            . <tr><td>
            .   <table>
            .    <tr>
            .     <td>
            .       <table>
            .         <tr>
            .           <td>
            .               Way nested
            .           </td>
            .         </tr>
            .       </table>
            .     </td>
            .    </tr>
            .   </table>
            . </td></tr>
            . </table>".unindent
    slices = TagSlices.new(table, "table")
    # blah blah blah

unindent removes the whitespace at the beginnings of lines, together with the discreet margin made of dots. Its code looks like this:

class String
  def unindent
    gsub(/^\s*\./m, '')
  end
end

I've fiddled around with unindent to a ridiculous degree, changing its name, how it works, how the margin is indicated. I think I've settled on this one.
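As a quick sanity check of the current version (self-contained; note that the single space after each dot survives, so the relative indentation of the HTML is preserved):

```ruby
class String
  def unindent
    # Strip, at each line start, the whitespace up through the dot margin.
    gsub(/^\s*\./m, '')
  end
end

html = "<table>
       . <tr><td>hi</td></tr>
       . </table>".unindent
# html is now "<table>\n <tr><td>hi</td></tr>\n </table>"
```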

## Posted at 17:00 in category /ruby [permalink] [top]

Sun, 13 Nov 2005

Throwing tests away

In part of "When should a test be automated?", I look at tests that have broken because the intended behavior of the code changed. My urge is to fix them, but I stop myself and ask a question: if this test didn't exist in the first place, would I bother to write it? If not, I shouldn't bother to fix it. I should just delete it.

Here's an example where that practice would have led me astray.

I've been hacking away at the code in RubyFit that parses HTML. I'm changing it so that it will support Rick Mugridge's FitLibrary. To that end, I created a class, TagSlices, that splits HTML text at tag boundaries. For example, the TagSlices of foo<tag x="y">bar</tag>quux would be foo, <tag x="y">, bar, </tag>, and quux.

I'd looked at the Java implementation before starting. That code operates on a lowercased string for tag-matching, but returns chunks of the original string. In the implementation I started moving toward, maintaining the two strings was inconvenient, so I talked myself into thinking I could downcase the original string at the start and work only with that. Stupid (people sometimes do use capital letters in web pages), but I was backing away from a frustrating implementation closely modeled after the Java one - and thus un-Ruby-like and hard to get right. I was so focused on tags that I thought what was OK for them was OK for everything.

I'd generated TagSlices using a set of tests that did not reveal the bug. After I was done, I reran the unit tests for the old version. (I hadn't used them for development because they "chunked" the problem in a way that didn't fit the path I was taking.)

Here's one of those tests:

  def test_parsing
    p = Parse.from_text 'leader<Table foo=2>body</table>trailer', ['table']
    assert_equal 'leader', p.leader
->  assert_equal '<Table foo=2>', p.tag
    assert_equal 'body', p.body
    assert_equal 'trailer', p.trailer
  end

It failed on the line marked with an arrow. I thought about that. Was the failure due to a bug? No, I'd decided it was harmless for tags to change case. Did any other assertion fail? No. Was the test completely redundant with other tests? It seemed so. So I should have thrown the test away. But I hesitated. After all, the changed behavior was a side effect, an implementation convenience. It would be just as harmless for tags to keep their case and pass the test. Maybe that wouldn't be as hard as I'd thought when I'd started. I looked at the code and it suddenly flashed on me that lowercasing the whole string wasn't harmless at all.

And, moments later, I realized that Ruby is a scripting language, after all; as such, it lives for regular expressions. Maybe in the Java world, it makes sense to search a lowercased string for "<table". In the Ruby world, it's better to search the original string for /<table/i.
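The difference is easy to see in a few lines (a sketch, using a made-up string):

```ruby
html = 'leader<Table foo=2>BODY</table>trailer'

# Java-style: lowercase a copy, then search. The tag is found, but the
# non-tag text 'BODY' has been mangled along the way.
lowered = html.downcase
lowered.include?('<table')   # => true
lowered.include?('BODY')     # => false - the body text lost its case

# Ruby-style: leave the string alone and match case-insensitively.
html =~ /<table/i            # => 6 (matches '<Table')
html.include?('BODY')        # => true - the original text is untouched
```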

So I created a test that talks specifically about case in tags and non-tag text. I made it pass. The old test passed, too. I could have thrown it away. And yet... what else might it uncover someday? So I kept it.

I shouldn't extrapolate too much from a single example, but it makes me wonder. Seven years ago, when I wrote the paper, I was solidly embedded in the testing culture, a culture of scarcity, one in which:

  • Automated tests were expensive to write because they had to go through an interface not designed for testing.

  • Programming time to fix that was almost entirely unavailable.

  • You were never anywhere close to having as many tests as you thought you needed, so the opportunity cost of fixing an old test was high.

Those assumptions are less true today. Because of that, it makes more sense to change old tests on the off chance you might find bugs or learn something. One of my other testing guidelines is to seek cheap ways to get dumb luck on your side. I'm not smart enough to do without luck. (That's not false modesty: I bet you aren't either.) Fiddling with tests is perhaps now cheap enough to be a luck generator.

(P.S. The bug would have certainly been caught by even the simplest realistic use, so it wouldn't have survived long.)

## Posted at 18:02 in category /testing [permalink] [top]

A rant: filenames

Java is a good thing. Rails is a good thing. But just as Java /perpet[ur]ated/ the horrors of StudlyCaps on those of us who like to read quickly, Rails is /perpet[ur]ating/ underscores in filenames on those of us who like to write quickly.

There are legions of examples of people not acting according to their rational self-interest. Yet another is the prevalence of filenames like webrick_server.rb over webrick-server.rb. Does an underscore take more energy to type than a dash? Yes. Does avoiding a typo require more coordination? Yes. So why this pathology?

  • C used to be the programmer's lingua franca. Since '-' in C means subtraction, variable names conventionally contain underscores. Did the trailblazer Unix programmers not realize that filenames don't have to follow the same rules?

  • Is it because VMS only allowed underscores in filenames, and VMS is just so totally cool?

Whatever the reason, we must consider the result. Thousands upon thousands of people already suffer from Emacs Pinky. To add injury to injury, those fragile pinkies must suffer additional unnecessary damage striking the shift key. How many people have been forced to switch to vi because those underscores pushed them over the edge to pinky RSI? That's a tragedy no caring person can ignore.

You have a lot to answer for, David HH. I was this close to convincing the world to use dashes, and now there's no hope.

## Posted at 18:02 in category /junk [permalink] [top]

Sun, 06 Nov 2005

Errors as essential

[Austin's procedure] consists in recognizing that [...] failure is an essential risk of the operations under consideration; then, in a move that is almost immediately simultaneous [...] it excludes that risk as accidental, exterior, one which teaches us nothing about the [...] phenomenon being considered.

Jacques Derrida, Limited Inc, p. 15.

This puts me in mind of a commonplace of UI design: that a popup error dialog should prod you to reexamine the system. Can it be changed to make the error impossible? or a modal dialog unneeded? For the latter, see Ward Cunningham's Checks pattern language, which - if I recall correctly - treats entering bad input as an inherent part of entering input, something to be fixed at the human's convenience, not something to interrupt the flow.

It also reminds me of my insistence that Agile projects are learning projects, and that you're probably not learning how to do something right unless you try variations and extensions that turn out to be wrong. But there has to be a way of talking about it that doesn't use the words "mistake" or "wrong" because - hard as it may be to believe - a lot of people think those are bad things.

## Posted at 22:01 in category /misc [permalink] [top]

Fri, 04 Nov 2005

Coming to AYE? Bring trinkets

In my first AYE session ("An amateur's guide to communicating requirements"), up to 1/3 of the participants will teach some skill to other members of their group. It might be a card trick, a coin trick, origami, building a house of cards, juggling, situps, headstands, or flipping pancakes over in a skillet. It's best if the skill involves some object. Due to Circumstances Beyond Our Control, attendees weren't sent email asking them to bring objects if they have a skill to demonstrate.

So if you're coming to my AYE session, please bring any objects you need to demonstrate your skill. Thanks, and please spread the word to anyone you know is coming.

## Posted at 07:35 in category /misc [permalink] [top]

Wed, 02 Nov 2005

A thought, inspired by the CSS2 specification

Specifications are a tool for resolving disputes. They are not a communication or teaching tool.

Sentences in a specification are the tangible and checkable evidence that a dispute among specifiers has been resolved. The specification is also used as strong evidence when two programmers have an implementation dispute, or when a tester and a programmer do. But almost no one, given the option, would choose to learn CSS by reading the specification.

That suggests that a specification should not be written to a consistent level of precision. Precision is needed only where disputes have already occurred or are likely. You can be happy when politics and economics allow you to let all precision be driven by actual, rather than anticipated, disputes.

## Posted at 21:46 in category /misc [permalink] [top]

Three useful links

Here are three links I plan to point clients at:

## Posted at 21:46 in category /fit [permalink] [top]

Tue, 01 Nov 2005

Four questions

Jonathan Kohl and I have been having a little conversation prompted by my comments on the Satir model of communication. He listed three questions he asks himself as he interacts with his team:

  • Am I trying to manipulate someone (or the rest of the team) by what I'm saying? An example he gave me is exaggerating a testing problem so that a programmer will look at a bug that's being ignored. (Sometimes testing can't proceed until a bug is dealt with.)

  • Am I not communicating what I really think? One example would be agreeing with people to avoid conflict. (That's different than disagreeing with a proposal, acknowledging the disagreement, and then agreeing to try the proposal anyway. After all, you're roughly as likely to be wrong as anyone else.)

  • Am I putting the process above people? An example that Jonathan gives is deciding that the Customer is by definition right on assessments of value and that the programmers should swallow their discomfort and start coding. I'm sometimes guilty of that.

He thinks of these in terms of Satir's notion of congruence, which is not an idea that rocks my world. (I'm more interested in external behavior than internal state: what I do in the world rather than my position in relationship to it.) The value of the questions is independent of their background, I think.

I've added a fourth:

  • Will those people have good reason to trust me more after this conversation?

I think I use that to square Jonathan's sentiments with my suspicion that there's a lot of useful manipulation out there.

Consider what I do as a coach. An ideal coach - which I am not - will act mostly by exploiting opportunities to jiggle someone else into discovery. I learn best through discovery, and it seems most people I work with are the same. So I am actively training myself to hold back, keep my mouth shut, and let my pair run with an idea, all the while being ready to say things that will help her realize what's happening. Then we can talk about it. (I've noticed that Ron Jeffries is substantially better at this than I am.)

The nice thing about this approach is that it gives me room to be wrong. A lot of the time, what I thought would happen doesn't - her idea was better than mine - and a variant lesson gets learned (by both of us).

Nevertheless, it'd be fair to call me manipulative. The saving grace is that I'm happy for people to know what I'm doing; I don't believe writing this note will make people trust me less.

The focus on trust also keeps me from overdoing it. I resent it when teachers put me in an artificial scenario where they know precisely the troubles I'll have and what lessons I will not be able to avoid learning. I don't trust such people. From experience, I doubt they'll be tolerant of the perverse conclusions I tend to draw. So when I draw them, things turn into an Authority Game with a dynamic of them proclaiming Trvth at me and me being resistant.

(The devolution into such a game is an example of putting process above people. I bet my fourth question is, strictly, subsumed by the remaining three. But "men need more often to be reminded than informed" (*). Given that I do have a regrettable authoritarian streak, redundancy is OK.)

I think Jonathan's questions will help me, going forward.

(*) Warren Teitelman, I think. He was giving a talk on the Cedar programming language and programming environment.

## Posted at 07:35 in category /agile [permalink] [top]

Sun, 30 Oct 2005

Welcome, Better Software readers

In my editorial for the November Better Software, I use the string of numbers at the top of my blog as an example of a big visible chart. They've helped me lose about 25 pounds. Especially once I started meeting people who told me they tracked them, my impulse to show the world steady progress remained strong for about 20 of those 25 pounds. (I've even met someone who said he was inspired to do the same with his blog.)

More recently, progress has been more fitful, as can be seen by the abundance of red. (Green means I've lost at least two pounds in the week, red means I've lost some but not enough, and bolded red means I've gained weight.) My just-ended three-week trip (to PNSQC, RubyConf, OOPSLA, and a client site) did special damage. Naturally, I did that damage just as the November issue is being mailed. But next week will be green.

## Posted at 21:48 in category /misc [permalink] [top]

A thought on mocking filesystems

(Sorry about the earlier version of this. I wrote some notes locally and then accidentally synchronized with testing.com.)

Sometimes people mock (or even just stub) some huge component like a filesystem. By doing that, you can test code that uses the filesystem but (1) have the tests run much much faster, (2) not have to worry about cleaning up because the fake filesystem will disappear when the test exits, (3) not have to worry about conflicting with other tests running at the same time, and (4) have more control over the filesystem's behavior and more visibility into its state.

I was listening to someone talk about doing that when I realized that any component like a filesystem has an API that's adequate for a huge number of programs but just right for none of them.

So it seems to me that a project ought to write to the interface they wish they had (typically narrower than the real interface). They can use mocks that ape their interface, not the giant one. There will be adapters between their just-right interface and the giant interface. Those can be tested separately from the code that uses the interface.

Arguing against that is the idea that giant mocks can be shared among projects, thus saving the time spent creating the custom mocks required by my alternative. But I'm inclined to think it's a good practice to write an adapter layer anyway. Without one, it's painful to swap out components: uses of the old component's API are smeared throughout the system.
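A sketch of that shape in Ruby (every name here is invented): application code depends on a store interface exactly as wide as it needs, one adapter maps that interface onto the real File API, and the test double apes only the narrow interface.

```ruby
# Invented names throughout - a sketch of the narrow-interface idea.

# The interface we wish we had: just what this application needs.
class FileStore
  def initialize(root)
    @root = root
  end

  # Adapter methods: the only place the giant File API is touched.
  # These get their own (slow, filesystem-using) tests.
  def save(name, text)
    File.write(File.join(@root, name), text)
  end

  def fetch(name)
    File.read(File.join(@root, name))
  end
end

# The test double apes the narrow interface, not the whole filesystem,
# so it fits in a hash and disappears when the test exits.
class FakeFileStore
  def initialize
    @files = {}
  end

  def save(name, text)
    @files[name] = text
  end

  def fetch(name)
    @files[name]
  end
end
```

Swapping out the real store later means rewriting one adapter, not chasing File calls smeared through the system.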

## Posted at 21:28 in category /coding [permalink] [top]

Thu, 13 Oct 2005

A Watir win

At PNSQC, Michael Kelly gave a talk. Among other things, it covered converting functional test scripts into performance test scripts. He gave examples using several tools, one of them Watir.

As he described how little of the test script had to change to make it a performance test script, I realized that there was a way to make it require no changes. I didn't quite finish a demo during his talk, but I did shortly after. Here's an example:

Load (a fake version of) Watir, get an object representing IE, and ask it to go to a URL.

irb(main):001:0> require 'watir'
=> true
irb(main):002:0> ie = IE.new
=> #<IE:0x329144>
irb(main):003:0> ie.goto('url')
=> "here's a bunch of HTML"

(I faked out Watir because I didn't have net access and, anyway, this machine is a Mac.) What you can't see in the above is that goto delays a random number of seconds before returning.

Now I want to run the same "test", timing all gotos.

irb(main):004:0> require 'perf'
=> true
irb(main):005:0> IE.time(:goto)
=> nil
irb(main):006:0> ie.goto('url')
=> "here's a bunch of HTML"

It took just over a second.

Here's the code. Mike will be seeing about getting something like it into the Watir distribution.

class IE
  def self.time(method)
    method = method.to_s
    original_method = '__orig__' + method
    new_def = "alias_method :#{original_method}, :#{method}
               def #{method}(*args, &block)
                  start = Time.now
                  retval = #{original_method}(*args, &block)
                  puts Time.now - start
                  retval
               end"
    class_eval(new_def)
  end
end

Take that, WinRunner!

## Posted at 15:08 in category /ruby [permalink] [top]

Tue, 11 Oct 2005

Hoist by my own petard

I started my PNSQC talk by asking for three volunteers. I handed each a Snickers bar and told them to eat it. After they did, I asked whether they were confident their body would be successful at converting that food into glucose and replenished fat cells. Then I gave them part of the CSS specification. I asked them whether they thought they could be successful at converting that information into a conformant implementation. Unsurprisingly, they thought digestion would work and programming wouldn't. How odd, I said, that digestion and absorption works so much better than the simpler process of programming.

The idea here was to set the stage for an attack on the idea that (1) we can adequately represent the world with words or concepts, and (2) we can transmit understanding by encoding it into words, shooting it over a conduit to another person, and having them decode it into the same understanding.

Things did not go exactly as planned. After I gave them the Snickers bars, I was surprised when they balked and asked me all kinds of questions about eating it. I thought they were deliberately giving me a hard time, but one of them (Jonathan Bach) later told me that he was honestly confused. He said something like, "it would have been much clearer if you'd shown us what you wanted by eating one yourself."

... if I hadn't tried to transmit understanding down the conduit...

... if I'd explained ambiguous words with an example. In a talk about the importance of explaining with examples.

I'm glad Jonathan was clever enough to catch that, because the irony of it all would have forever escaped me.

P.S. It now occurs to me that another problem was that they didn't know why they were to do it. That's something I also covered in the talk: "justify the rules" from the list of tactics. I don't mind not telling them why, since telling them would have spoiled the effect, but not using an example just makes me slap my head.

## Posted at 07:39 in category /agile [permalink] [top]

Communication between business and code

In a few hours, I'll be giving a presentation at PNSQC. It's on communication between the business experts and the development team. After some audience participation involving Snickers® bars, trapezes, and Silly Putty® (actually, only Snickers bars) and some airy-fairy theorizing, I get down to discussion of 16 tactics. Here they are.

When it comes to teaching programmers and testers about a domain, examples matter more than requirements. It's fine to have statements like "always fill stalls up starting with the lowest number, except for the sand stalls and the bull stall". But when it comes time to be precise about what that means, use examples (aka tests). I think of requirements statements as commentaries on, or annotations of, examples.

It's best to build examples with talk around a whiteboard. There, a business expert explains an example to a small audience of programmers, testers, technical writers, etc. The conversation includes these elements:

  • People should ask questions about details. If the business expert casually says, "So we have a cow in stall 1", ask why it's in stall 1. The answer might be, "well, actually, it probably wouldn't be in stall 1, because that's the bull stall" - which now alerts everyone that there are rules surrounding which animals go in what stalls. Those rules might not matter soon, but it doesn't hurt to be aware of them.

  • Turn stories into rules. If the business expert says things like "well, since the bull stall is reserved for dangerous animals, we'd put an ordinary case in the next available stall," you have a rule that stalls are allocated in increasing order. That rule is something that will probably be found, in some form, in the code.

  • Still, favor complete examples over complete rules. The rules don't have to be precise; they're mainly a reminder to write precise examples. Expect the real heavy lifting of creating rules to be part of the programming process; the programmers will discover rules that cover the examples. (See my old favorite, the Advancer story.)

    Nevertheless, some early attention to rules helps shift the emphasis from procedural examples to declarative examples and from UI gestures to business logic.

  • Participants should ask the business expert to justify the rules. Why is it that stalls are allocated in increasing order? There might be no particular reason, but it might be that stalls are numbered counterclockwise, so by housing cases in numerical order, a student working on her cases in stall order would walk directly from case to case instead of having to plan a route.

    What's happening here is that the development team is learning facts about the domain. Any set of requirements, examples, or other kinds of instructions to the team will leave them underconstrained. At some point, they'll make decisions that are not forced by anything the business expert said. If they understand the "why" behind statements, they're more likely to make sensible decisions.

  • People should ask about exceptions: "when is it not done like that?" It's the exceptions that make rules tricky, and the exceptions that will drive the creative part of programming.

    Now, it's awfully easy to ask an expert for exceptions to the rules, much harder for the expert to think of them. So there are tactics for eliciting exceptions (as well as new rules and new domain knowledge).

    • Ask for stories from different points of view. The most natural point of view is probably that of a user of the system. So find opportunities to ask for the story of a medical case from the first call to get an appointment to the last time someone touches its record. Or look at the path of an inventory item through the system. (As an example of this, see the opening scenes of the movie Lord of War. I consider that a spoiler, but seemingly every critic saw fit to describe it.)

    • When telling the story of a user, you have the opportunity to pick a persona. Don't always use a normal one. Consider how Bugs Bunny (a trickster character, a rule breaker) would use the system. How about the Charlie Chaplin of the factory scenes in Modern Times: the completely overwhelmed worker who can't keep up? (I learned this trick from Elisabeth Hendrickson.)

    • You can also try Hans Buwalda's soap opera testing (an example). In soap opera testing, you construct an example of use that bears the same relationship to normal use as a soap opera does to real life: dramatically compressed, full of cascading and overlapping implausibilities.

    • Be alert for synonyms. Suppose a clinician uses the words "release" and "discharge" in different contexts but cannot articulate the difference between them. It's natural to just pick one of them and use it henceforth. I'm more likely to want to make the system support both words (by one common routine) in the hopes that a distinction will eventually emerge.
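A minimal Ruby sketch of that tactic (the class and method names are invented for illustration): both of the expert's words stay in the vocabulary, backed by a single routine until a distinction emerges.

```ruby
# Hypothetical sketch: support both "discharge" and "release" with one
# common routine, so neither word is lost while the distinction is unclear.
class Clinic
  attr_reader :residents

  def initialize
    @residents = []
  end

  def admit(animal)
    @residents << animal
  end

  # If the clinician ever articulates a difference between the two words,
  # the alias can be split into its own method without touching callers.
  def discharge(animal)
    @residents.delete(animal)
  end
  alias_method :release, :discharge
end
```

Callers can use whichever word fits their context; the code records that, for now, they mean the same thing.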

In all of this, attend to pacing. The programmers have to learn about the domain. It's easy to overwhelm them with exceptions and special cases while they're still trying to grapple with the basics. So start with basic examples and consider the elaborations once they've demonstrated (through working code) that they're ready for them.

Give the product expert hands-on fast feedback. Anything written down (like examples, tests, or requirements documents) puts the reader at one remove from the actual thing. Consider the difference between test-driving a car and reading about a test drive of a car. So quickly implement something for the product owner to look at. That will allow her to correct previous examples and also learn more about how to communicate with the team.

It's also important for everyone to work the product. You don't learn woodworking by looking at examples and listening to someone talking about woodworking. You learn by working wood. The programmers, testers, etc. on a team don't need to become experts in the business domain, but they do need to learn about it (again, so they can make unforced choices well). Having people use the product realistically, especially in pairs, especially with the business expert near, will help them. I recommend exploratory testing techniques. James Bach's site is the best place to learn about them.

I think of the team as building a trading language. This is a language that two cultural groups use to cooperate on a common goal. (See also boundary objects.) In a trading language, the programmers and business expert will both use words like "bond" or "case" -- indeed, it's best if those words are reified in code -- but they will inevitably mean different things by them. It's important to accept that, but also to attend to cases where the different meanings are causing problems. I happen to also think that the business expert should become conversant in the technology domain, just as programmers become conversant in the business domain. That doesn't mean to become a programmer, but it does mean to come to understand enough of the implementation to understand implementation difficulties and opportunities.

Finally, since understanding is built, not simply acquired, it's important to attend to learning through frequent mini-retrospectives. Is the development side of the team learning the domain? Is the business side learning about the implementation? Is the business side learning about the domain? -- I think any project where the business expert doesn't gain new insights into the business is one that's wasted an opportunity. Is everyone on the team learning about communication?

## Posted at 07:39 in category /agile [permalink] [top]

Sun, 09 Oct 2005

PNSQC annotated bibliography

In my Pacific Northwest Software Quality Conference talk, I'm going to throw out a blizzard of references. Here they are.

"The conduit metaphor: A case of frame conflict in our language about language", Michael J. Reddy, in Metaphor and Thought (2/e), Andrew Ortony (ed.)

Shows how our standard metaphor for communication is one of shipping something from one mind to another via a conduit. Many examples. I don't think that's the way communication really works, so the metaphor misleads us into thinking requirements documents are a good idea, and that the reason they so often fail is that we're not smart or dedicated enough.

Philosophy and the Mirror of Nature, Rorty

Takes apart the idea that concepts and words are direct mappings of hard-edged categories in the world. Sometimes that works. Sometimes it doesn't. Requirements documents assume that words can capture what the solution to a problem essentially is. But if that's not possible, in general...

Women, Fire, and Dangerous Things: What Categories Reveal About the Mind, Lakoff

A more empirical treatment of the same idea. Categories have fuzzy edges, partly because most lack a single defining characteristic that must be present.

Personal Knowledge: Towards a Post-Critical Philosophy, Polanyi

A discussion of tacit knowledge.

Refactoring, Fowler

I tell the Advancer story, which uses the Method Object refactoring, described in Fowler.

Cognition in the Wild, Hutchins

I suggest the Advancer story is an example of Hutchins's distributed cognition, the notion that it sometimes doesn't make sense to say that particular people solve a problem. Instead, it's more useful to point to an assemblage of people and things as doing the thinking. So the common statement "the code is trying to tell us something" is not meaningless.

Fit for Developing Software: Framework for Integrated Tests, Mugridge and Cunningham

I use Fit tables as examples of examples.

"Soap Opera Testing", Buwalda

Soap opera tests have the same relationship to normal uses of the product as soap operas have to real life: drastically condensed and exaggerated.

Image and Logic: A Material Culture of Microphysics, Galison

Galison discusses how different groups of people collaborate on shared goals. He claims that they develop "trading languages" (like trading pidgins and creoles) to organize their work. I believe his analysis fits project teams.

Domain-Driven Design, Evans

Evans's "ubiquitous language" is an example of a Galison trading language.

"The Good, the Bad, and the Agile Customer", Alesandro, Better Software, November/December 2005.

An example of gradually tuning communication to suit both a business representative and a development team.

Situated Learning: Legitimate Peripheral Participation, Lave & Wenger

A description of how types of learning like apprenticeship work. Not an easy read. I wrote a review and summary.

How to Do Things with Words (2/e), Austin

Austin talks about "performatives", which are sentences that don't describe the world but rather do things in it. ("I now pronounce you husband and wife.") Performatives don't really fit in the "words map to categories in the world" framework.

Limited, Inc., Derrida

Derrida (I'm told) argues here that all statements are performatives. (I haven't read the book yet - I've not found Derrida to be easy reading.) That makes sense to me: we utter things to change the world. Even when I'm defining a word for my daughter, I'm changing the world inside her head.

That's where I end, having given 16 tactics for improving communication, each compatible with this nonstandard view. My ending motto is this:

As intermediaries, we do not need to send abstractions down the conduit from the business to the programmer. Anything that provokes the programmer to write the right code is fine by us.

## Posted at 09:49 in category /misc [permalink] [top]

Sat, 08 Oct 2005

Blaming and lecturing

Satir's models of communication, change, and communication stances are influential among those who worry about software team dynamics. I'm uneasy about them on two grounds.
  • One is that the categories they draw strike me as too big. Consider the communication stances. The model identifies three "things in the world": Self, Other, and Context. People take bad communication stances when they (try to) ignore one or more of those things. For example, a Placating person will ignore Self in favor of Other and Context.

    My difficulty is that there are so many Others and pieces of Context ready-to-hand at every moment (even if you're talking to one person about one thing) that I'm uneasy about the idea of ignoring Context or Other. That probably means ignoring a lot of the Context or Other, but the parts you don't ignore are probably awfully important. (And, as a teensy bit of a postmodernist, I'm not 100% sure it's always that useful to think of a unitary self, so even ignoring Self is maybe not such a straightforward idea.)

    Now, I expect the model has been expanded, but my informal encounters haven't shown me the elaborations. Perhaps I will at the AYE conference.

  • The other source of unease is that Satir's models are grounded in family therapy. That, it seems to me, often leads to overconcentration on the negative. Function becomes the absence of dysfunction, joy becomes the absence of frustration. One becomes "congruent" by ceasing to ignore one, two, or three of the things in the world.

    For example, in the change model, crisis kicks off change. Change must push through resistance. Again, that's certainly often true (and consultants must often deal with resistance). But that's not the way all change happens. Some people like change, and others are agnostic (the change threatens nothing they particularly care about). My impression is that a lot of the elements of XP were more motivated by a harkening back to an idyllic time at Tektronix Labs than by stark necessity.

    (A preference for Satir may be a product of selection bias. Back when I was a pure testing consultant, I - like an awful lot of consultants - got called almost exclusively into companies with problems. There, the Satir model is so often appropriate that it must be easy to see dysfunctional family life everywhere. Now that I'm consulting in Agile, I more often go to companies that are doing perfectly OK and want to do better. That promotes a sunnier view of life.)

But that's not what I mainly wanted to write about. In the communication stances model, the ignoring of Other leads to Blaming behavior. If I model my own behavior that way, I'd say Blaming is not often the result. What I do more is Unstoppable Framing and Advice-Giving. It's figuring out what the problem and its context are, plus throwing out all kinds of potential solutions. That's different than Satir's Super-reasonable behavior, which is "cool, aloof, reasonable, and intellectual". I'm not cool or aloof; I'm usually passionate and determinedly optimistic - "hey, how about this. It would turn the problem on its head and make it a neat opportunity."

That's helpful behavior except when it becomes more about me and less about the Other I'm supposedly helping, when it becomes a way to shift the issue away from what the other person needs to what I'm good at doing: problem-solving, idea generation, and talking. I'm using the Context as a way of making my Self comfortable. The solution (there I go again) is to make sure to let the Other guide the conversation.

I bet Dawn (who's witnessed more of this from me than anyone else has) would describe it as stereotypically male behavior. It probably is statistically more common among males. But I'd be willing to bet it's an occupational hazard for consultants.

So, by Box's criterion that all models are wrong, but some are useful, Satir's model is useful. I don't use it much, though.

## Posted at 18:31 in category /misc [permalink] [top]

Wed, 05 Oct 2005

Tweak to CalculateFixture

I've become fond of FitLibrary's CalculateFixture. The second table below is an example:

create clinic with 7 stalls
which are sand stalls? 4, 6
which are bull stalls? 1


When there's no room for an animal in a normal stall, put it in a sand stall, or in a bull stall as a last resort.

how stalls are assigned
special stall? | stalls in use    |  | stall assigned
no             | 2, 3, 5, 7       |  | 4
no             | 2, 3, 4, 5, 6, 7 |  | 1

The columns to the left of the blank column are input values. Those to the right are expected results. The blank column is a nice visual separator.
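To make the mechanics concrete, here's an invented Ruby analogue of such a fixture for the table above (the real CalculateFixture is Java and its conventions differ; the sand and bull stall numbers come from the setup table).

```ruby
# Invented Ruby analogue of a CalculateFixture-style fixture (illustration
# only; the real FitLibrary fixture is Java). One method per expected-result
# column; its arguments come from the input columns of each row.
class StallAssignmentFixture
  SAND_STALLS = [4, 6]  # taken from the setup table above
  BULL_STALLS = [1]
  ALL_STALLS  = (1..7).to_a

  # The "special stall?" input is unused in this simplified sketch.
  def stall_assigned(special_stall, stalls_in_use)
    in_use = stalls_in_use.split(",").map { |s| s.strip.to_i }
    free   = ALL_STALLS - in_use
    # The business rule under test: normal stalls first, then sand, then bull.
    normal = free - SAND_STALLS - BULL_STALLS
    chosen = normal.first ||
             (free & SAND_STALLS).first ||
             (free & BULL_STALLS).first
    chosen.to_s
  end
end
```

Each row of the table becomes one call: the framework feeds in the input cells and compares the return value against the "stall assigned" cell.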

Each line of a table should be easy to understand. Sometimes that means annotation. I've hacked my copy of CalculateFixture to allow notes after yet another blank column. Like this:

create clinic with 7 stalls
which are sand stalls? 4, 6
which are bull stalls? 1


When there's no room for an animal in a normal stall, put it in a sand stall, or in a bull stall as a last resort.

how stalls are assigned
special stall? | stalls in use    |  | stall assigned |  | notes
no             | 2, 3, 5, 7       |  | 4              |  | Only sand stalls are free, so use one.
no             | 2, 3, 4, 5, 6, 7 |  | 1              |  | No place to go but bull stall

That's easily done in the 9Feb2005 version. In bind:

for (int i = 0; heads != null; i++, heads = heads.more) {
    String name = heads.text();
    try {
	if (name.equals("")) {
+           if (pastDoubleColumn) break;
//          if (argCount > -1)
//              throw new FitFailureException("Two empty columns");
            argCount = i;
            targets = new MethodTarget[rowLength-i-1];

And remove an error check in doRow:

if (row.parts.size() != argCount+methods+1) {
    exception(row.parts,"Row should be "+(argCount+methods+1)+" cells wide");

I'd like to see this change become part of FitLibrary. It cannot break existing tables (because of the error check). The error check would no longer catch mistakes in the table, but I can't see such a mistake lasting past the first time someone tried to make the test pass. It would be easy enough to change the error check to take blank columns into account, but I doubt I'd bother.
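The rule the patch implements can be stated language-neutrally. Here's a sketch in Ruby (the function name and cell representation are invented for illustration): the first blank cell divides inputs from expected results, and a second blank cell means everything after it is ignorable notes.

```ruby
# Sketch of the row-splitting rule: cells is an array of strings, one per
# table cell, with "" standing for a blank cell.
def partition_row(cells)
  first_blank = cells.index("")
  return { inputs: cells, expected: [] } unless first_blank
  rest = cells[(first_blank + 1)..-1]
  second_blank = rest.index("")
  { inputs: cells[0...first_blank],
    expected: second_blank ? rest[0...second_blank] : rest }
end
```

Everything past the second blank cell simply never makes it into the comparison, which is why adding a notes column can't break an existing table.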

## Posted at 15:31 in category /fit [permalink] [top]

Sat, 01 Oct 2005

Need project pictures

For my talk at the Indianapolis Quality Enrichment Conference (October 7), I could sure use some pictures of an Agile project in action. I'd like to show a daily standup and a product owner explaining a story, preferably using a whiteboard. If anyone can mail me some, I'd be much obliged and would give credit. Thanks.

I've got other pictures I think I need, but if you can think of events or settings that people really ought to see, send them along. Even if I don't use them this time, this probably won't be my last talk about Agile.

## Posted at 18:01 in category /agile [permalink] [top]

Fri, 30 Sep 2005

First mover disadvantage

A while back, I stayed at a Marriott hotel. Around US$150 at a reduced conference rate, plus US$10 for wired high-speed internet. Not long before then I'd stayed at a no-name hotel for US$70. It had free wireless.

Why the difference? I suppose it's just whatever it is that makes hoteliers think the more expensive the room, the more expensive should be the bottle of Aquafina water placed in the room.

But I also toyed with the thought that it could be a form of first-mover disadvantage. Marriott no doubt put in high-speed internet before wireless was even an option. The other hotel waited. I suspect it's a lot more expensive to string wires all over the hotel than to put up wireless hubs. Having strung those wires, was Marriott now stuck when wireless came along? Were there financial or emotional (sunk cost fallacy) reasons for sticking with a solution that's been superseded?

Are early adopters of Agile subject to the same? Are they (we?) prone to getting stuck at local maxima? I suppose that's inevitable. It's sometimes said that the fiercest critics of this generation's avant-garde aren't the boring old mainstream; they're last generation's avant-garde.

Bob Martin said at Agile 2005 that industry seems to be converging on a standard Agile approach: Scrum with a subset of the XP practices. That's good, but what's the next step beyond that? I personally think it's about the Product Owner / Customer. People used to say things like "programmers can't test their own code" or "programmers are asocial" or "requirements must be complete, testable, and unambiguous." Turns out we now know (mostly) how to make those things untrue. There are a lot of statements about product owners that are like that. How many of them have been made untrue? Doesn't seem like many. So I bet there's work to do.

One such statement is "To the product owner, the UI is the software." I've heard that as a support for Fit ActionFixture over DoFixture. Not because ActionFixture describes the actual UI, but because product owners can grasp that metaphor, whereas DoFixture would be too abstract.

Now, I'm biased, because I think that it's almost always the business logic that delivers the value. The UI is a way to make that value accessible. It's secondary. So I want the product owner to think, um, like me. Or better than me - to come to some product understanding / conceptualization / language that's surprising, that reveals product possibilities in the way that refactoring reveals code possibilities.

The question is: what nitty-gritty techniques can be used for that? It makes no sense to go up to someone and say, "Reconceptualize your domain!" I have a low opinion of teaching people how to think, and a high opinion of teaching them how to do.

## Posted at 19:53 in category /agile [permalink] [top]

Wed, 28 Sep 2005

Upon the occasion of a school meeting

Our children go to publicly-funded schools, partly because I buy the argument, from The End of Equality, that it's been important for US society that we've had places where citizens of different incomes and classes mix.

My son has some developmental difficulties. Nothing major, nothing romantic - but a need for help with fine motor control, speech, and some social interactions.

I've worked with big organizations and small, old ones and young ones, monopolies and competitors. I have a bias - both instinctive and learned - toward the small, the young, and the competitive. (I was once on the technical board of a small company that had a niche market to itself, and I watched how the appearance of a competitor concentrated their mind on bettering the product. It was a wonderful example of how competition is a burden placed on organizations for the benefit of the rest of us.)

So I should be - am - naturally suspicious of the public school system, which is large, set in its ways, bureaucratic, and much like a monopoly. But I'm here to tell you that the people - teachers, administrators, school therapists - have, almost without exception, been wonderful. I've been around. I know the difference between people just doing a job and people motivated to do a good job. These people would be a credit to any organization, even the smallest, youngest, and leanest.

It's almost as if employees can be motivated by something other than money and status.

So the next time you hear a politician speaking scornfully of the teachers' union or the school system, just remember there are a lot of good people in those organizations. Not only are they working hard with resources so limited they make your organization look like the US Congress funding public works in Alaska, they're doing it while enduring constant insults.

## Posted at 07:53 in category /misc [permalink] [top]

Being wrong

Hardly anyone thinks the software industries do a satisfactory job of getting the requirements / architecture / design right up front. The reaction to that, for many many years, has been that we should work smarter and harder at being right. Agile flips that around: we should work smarter and harder at being wrong. We should get so good at being wrong that our predictive failings do no harm to us, our project, our employer, or our users. In fact, we should strive to make mistakes---the need to redo---a constructive resource.

## Posted at 07:53 in category /agile [permalink] [top]

Fri, 23 Sep 2005


Long ago, I learned to fly gliders. Since they don't have engines, they're towed up into the air. So you spend the first minutes of your flight at the end of a long rope that's tethered to a tow plane. As a pilot, your job is to keep your glider in a good position relative to it.

As a novice pilot, I had a problem with "porpoising." I might drift up out of the right position, so I'd push the stick forward to descend, but I'd overcorrect so I'd descend too far, so I'd pull the stick back but this time get even higher out of position, so... Eventually, you can oscillate so badly that you become a danger to the towplane.

One time, my instructor gave me some advice. "Don't just do something, sit there," he said. Let your status stabilize and become clear before you correct. And, I extrapolate, make small corrections that stabilize faster so that you know more quickly what you've done.

I think of that slogan every once in a while.

## Posted at 09:45 in category /misc [permalink] [top]

Wed, 21 Sep 2005

A tour through a Fit episode

On the agile-testing list, someone asked this question:

What I mean is a workflow requirement like:

If A then
  do something (which may be an entire function in itself)
Else if B
  then do something else
Else if C
  Then do nothing

Is it possible to express a requirement of this sort using Fit/Fitnesse?

Here's my answer, which goes afield into business-facing test-driven design.

My inclination would be to test the business rule directly. Here, I'm using Rick Mugridge's CalculateFixture. With it, the inputs and expected results are separated by a blank column.

All Significant Events in the Reactor's Life
condition          |  | operator notified? | auto shutdown?
really hot         |  | YES                | YES
slightly hot       |  | YES                | no
temp goes in range |  | no                 | no
(An alternative would be to have a single column named reaction?. Each cell would contain a list of all relevant reactions for a condition. My problem with that is it's too easy to overlook a reaction if you're just going down a reaction column and filling in what comes to mind. Enumerating the possibilities in columns forces you to come up with an exhaustive list of all noteworthy reactions and also think about each reaction for each condition.)

What are we really testing here? Assuming that the reactor is event-driven-ish, it's that the reactor has the right reaction to a particular event. Were the fixture written in Ruby, it would have two methods like this:

  def operator_notified_question_condition(condition)
    # create an event for the condition.
    # Send the event into the system.
    # check whether the operator has been notified.
  end

The name of the method indicates that it's the one called to check operator notified? after the reactor code processes condition. CalculateFixture, unlike ColumnFixture, doesn't stuff values into fields (which isn't something that bothers me, though it does bother some people).
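As a guess at the mangling rule implied by that name (this is not FitLibrary's actual code, just an invented sketch of the header-to-method mapping):

```ruby
# Invented sketch: downcase the headers, turn "?" into the word "question",
# join words with underscores, then append the mangled input-column header.
def fixture_method_name(result_header, input_header)
  mangle = lambda { |s| s.downcase.gsub("?", " question").split.join("_") }
  "#{mangle.call(result_header)}_#{mangle.call(input_header)}"
end
```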

We have to decide what the system under test does and what the fixture does. (A truly hardcore TDD person would put everything in the fixture until there was some reason to move it into the product; I'll skip that.) I'm going to assume the system under test is in the variable @model, and flesh out the above this way:

  def operator_notified_question_condition(condition)
    event = create_event_for(condition)
    @model.accept_event(event)            # send the event into the system
    was_operator_notified? ? "YES" : "no" # matches the table's expected cells
  end

  def create_event_for(condition)
    # how to create the event?
  end

  def was_operator_notified?
    # how to check that?
  end

The two questions in that code raise questions for the team. Consider the second. It's the product expert we go to when we want to know what form operator notification should take. (Hoping all the while that she knows more about human factors than we do.)

  • Her response might be that such and so a light should flash and such and so a message should appear on such and so a screen. In that case, the code could check whether the right messages are sent to the right hardware interfaces (or mocked-out versions of them).

  • But I'd be tempted to ask the product owner about the log at this point. A reactor presumably has some sort of logging (perhaps some form of "black box"). Each test column names a significant reaction. Should each be logged? If so, we can just check the log:

      def operator_notified_of?(event)
        notification = model.log.most_recent_entry(:type => :operator_notification)
        return "yes" if notification.timestamp.later_than?(event.timestamp)
        "no"  # returns "no" unless previous line returned "yes".
      end

What the second choice lets us do is defer the decision about the details of operator notification until later tests. All I need to know now is that it at least makes a log entry. Never decide today what you can put off til tomorrow.

So I could write accept_event like this:

  def accept_event(event)
    case event
    when ReallyHotTemperatureEvent
      # first bits of event handling and logging go here...
    end
  end

So what this test is forcing me (as the programmer) to do is lay down the first bits of event-handling code and possibly the first bits of logging code (which would be a matter of starting to use Log4R or some other logger). Similarly, writing create_event_for is going to mean a little translation of strings and then calls into (as yet nonexistent) event creation code.
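That translation might begin as nothing fancier than a lookup table. A sketch, with event classes invented to match the table's conditions (real code would call into the as-yet-nonexistent event-creation layer):

```ruby
# Invented event classes, one per condition in the table above.
class TemperatureEvent
  attr_reader :timestamp
  def initialize(timestamp = Time.now)
    @timestamp = timestamp
  end
end
class ReallyHotTemperatureEvent   < TemperatureEvent; end
class SlightlyHotTemperatureEvent < TemperatureEvent; end
class TemperatureInRangeEvent     < TemperatureEvent; end

# Translate a condition cell's string into an event object.
def create_event_for(condition)
  { "really hot"         => ReallyHotTemperatureEvent,
    "slightly hot"       => SlightlyHotTemperatureEvent,
    "temp goes in range" => TemperatureInRangeEvent }.fetch(condition).new
end
```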

So this single high-level test only produces some high-level structural code. Each of the inexact terms - really hot, auto shutdown - will later be examplified (sic) with tests. I find this sort of top-down test-driven approach comfortable to work with. For me, it does seem to result in a lot of backtracking to remake bad decisions about structure. It seems I ought to be able to do better, but that's the way I feel about pretty much everything I do.

## Posted at 12:48 in category /agile [permalink] [top]

Ruby makes the big time

My son is going to a birthday party this weekend. He wanted to get the birthday boy (Jody) two plush toy giant microbes. He chose Ebola and flesh-eating bacteria. (Yes, this was after I made several attempts to get him to choose something less gruesome, like the common cold or ulcer, because I didn't want his mother to think "what kind of bizarre child does my Jody hang around with?" But Paul insisted that Jody really likes that kind of thing.) (What kind of bizarre child does my Paul hang around with?)

In any case, he prevailed, and I purchased the two toys at ThinkGeek. Along the way, I took their 46.4 second feedback survey. It asks what your favorite language is. I scanned down to the Rs and was appalled to find that Ruby wasn't listed, though Rebol and Rexx were and Python was just above them. But I later noticed that the list is not entirely alphabetical. Certain languages have been promoted to the front of the list: C, C++, Objective C, Perl, Java, ..., and Ruby. This is the big time, folks.

P.S. If you're from ThinkGeek: sorry about the rant in your Suggestions / Comments / Thoughts / Rants text area. That was before I noticed Ruby at the top.

## Posted at 09:25 in category /ruby [permalink] [top]

Wed, 14 Sep 2005

Behind closed doors

It seems that while I've been struggling with a single chapter of Scripting for Testers, Esther Derby and Johanna Rothman have written an entire book: Behind Closed Doors: Secrets of Great Management.

(I think the fourth complete rewrite will nail this chapter & the hardest part of the book. Smooth sailing from then on, he said with a confident grin, unaware of the iceberg looming behind him.)

(Just kidding, by the way: they've been working on this book for quite a while, and I'm sure it will show.)

## Posted at 07:29 in category /misc [permalink] [top]

An agile test plan template

Janet Gregory has provided a sample test plan she's used on agile projects. (In Word and RTF formats.)

It's common for the words "test plan" to be used for a list of descriptions of test cases. (Thus, you sometimes hear people on agile projects responding to the question "Where's your test plan?" by pointing to the automated test suite.) That's never made sense to me.

When I think of test planning, I think in terms of handoffs. At some point, the code is handed off from one set of people to another. For example, a release is a handoff from the production team to the users. That's the main handoff, but a geographically-distributed, multi-team project might have internal handoffs as well.

Everyone should want the receiver of the code to smile. Testing is the activity of predicting whether they actually will smile. Test planning is making sure that everything is in place for that activity to happen smoothly when it's time for it to happen. For example, machines may need to be bought for configuration testing, and the purchasing process may need to start several iterations before the testing starts.

I wrote a paper about that once (PDF).

## Posted at 07:12 in category /agile [permalink] [top]

Thu, 08 Sep 2005

Chad Fowler's book

The Pragmatic Bookshelf has published Chad Fowler's book My Job Went to India (And All I Got Was This Lousy Book). I haven't read it, but I know Chad and he did write an article for me. The article was about ways to think about career choices, tricks for keeping your skills up to date, and advice on marketing yourself for the marketing-averse. The book appears to be on the same topics. Check it out.

## Posted at 11:49 in category /misc [permalink] [top]

Unit tests used to generate documentation

Grig Gheorghiu has been poking around with the idea of using tests to generate documentation. Here is what he's been doing. Here is a sample of API documentation that another project generates from this code. (Note: the tests aren't written in xUnit style with explicit asserts, but rather by using a line markup style that I first saw in Updoc.)

(Grig's summary note here.)
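I haven't seen the internals of Grig's tool, but the line-markup style can be sketched as a transcript replayer (the markup format here is invented for illustration): each "? expr" line is evaluated and its value compared against the "# value:" line that follows it.

```ruby
# Sketch of an Updoc-like transcript checker (markup format invented).
# Returns a list of mismatch descriptions; an empty list means all lines
# in the transcript still describe the code's actual behavior.
def check_transcript(text)
  failures = []
  text.lines.map(&:strip).reject(&:empty?).each_slice(2) do |expr_line, expect_line|
    expr     = expr_line.sub(/\A\? /, "")
    expected = expect_line.sub(/\A# value: /, "")
    actual   = eval(expr).inspect
    failures << "#{expr}: expected #{expected}, got #{actual}" if actual != expected
  end
  failures
end
```

The appeal is that the document and the test are the same artifact: when the transcript drifts out of date, the checker says so.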

See also Brian Button's PDF article about tests as documentation (from Better Software). Brian's article isn't about generating documentation, it's (mainly) about how to write tests so that they're useful explanations of how to use an API.

Fit uses the "look at the tests themselves" style, since the tests are HTML. I find myself fiddling around, trying to make tests nicely readable, but I find I'm hobbled by the lack of a good HTML editor (much less a good Fit-aware HTML editor). I've been put off using Word because (a) my versions generate horrible-looking web pages, (b) I am morally offended by the garbage it generates as HTML source, (c) I try to avoid using binary files with version control systems because you can't usefully diff them, (d) it sometimes produces incorrect HTML (most memorably when it added extra columns to tables), and (e) I don't know how to script Word so that running the test I'm working on is one button click in IDEA. I haven't tried Open Office (because I think it's lagging on the Mac), though I've heard people prefer it to Word for HTML generation.

I like simple layouts, so simple HTML would be fine for me (were it easier to edit). If a colocated team wanted prettier documentation, I'd be suspicious of their focus and priorities. But if the onsite product owner has to bring the Marketing Department into the decision loop, and the Marketing Department is three floors up with important representatives in another city, and they're used to attractive Word documents, weaning them off of requirements documents in favor of annotated customer-facing tests will sometimes require a certain amount of flash. For example, it should be easy to drop in a wire-frame sketch of a GUI.

## Posted at 07:56 in category /agile [permalink] [top]

Fri, 02 Sep 2005

Faith in people

I don't have a television, but I've been following the Katrina situation a bit obsessively on the net while Sophie's been home sick from school. I'm afraid of a narrative taking hold about New Orleans: that the people trapped there are less deserving of sympathy because (1) they had the chance to get out and didn't take it, and (2) they are lawless looters and shooters.

I'd like people to read this first-hand account of eight people behaving nobly, written by a volunteer rescuer. Because his permalink doesn't work (at least for me), I reproduce it in full:

The Positive Stories Must Get Out
Name: Robert LeBlanc

Home: 9858769172

Email: rlrenrec@aol.com

Subject: My Hurricane Story -- The Positive Stories Must Get Out

Story: Please help me to get this story out. We need to get the truth out and these people helped.

Jeff Rau, a family and now personal friend to whom I will forever be linked, and I were volunteering with a boat and pulling people out of the water on Wednesday. I have a first-hand experience of what we encountered. In my opinion, everything that is going on in the media is a complete bastardization of what is really happening. The result is that good people are dying and losing family members. I have my own set of opinions about welfare and people working to improve thier own lot instead of looking for handouts, but what is occurring now is well beyond those borders. These people need help and need to get out. We can sort out all of the social and political issues later, but human beings with any sense of compassion would agree that the travesty that is going on here in New Orleans needs to end and people's lives need to be saved and families need to be put back together. Now.

I will tell you that I would probably disagree with most of the people that still need to be saved on political, social, and cultural values. However, it must be noted that these people love thier friends and families like I do, desire to live like I do, and care for their respective communities (I was even amazed at the site of seemingly young and poor black people caring for sickly and seemingly well-to-do white people and tourists still needing evacuation from New Orleans' downtown area) the same way I care for mine.

Eight people in particular who stood out during our rescue and whose stories deserve to be told:

1.) We were in motor boats all day ferrying people back and forth approximately a mile and a half each way (from Carrolton down Airline Hwy to the Causeway overpass). Early in the day, we witnessed a black man in a boat with no motor paddling with a piece of lumber. He rescued people in the boat and paddled them to safety (a mile and a half). He then, amidst all of the boats with motors, turned around and paddled back out across the mile and a half stretch to do his part in getting more people out. He refused to give up or occupy any of the motored boat resources because he did not want to slow us down in our efforts. I saw him at about 5:00 p.m., paddling away from the rescue point back out into the neighborhoods with about a half mile until he got to the neighborhood, just two hours before nightfall. I am sure that his trip took at least an hour and a half each trip, and he was going back to get more people knowing that he'd run out of daylight. He did all of this with a two-by-four.

2.) One of the groups that we rescued were 50 people standing on the bridge that crosses over Airline Hwy just before getting to Carrolton Ave going toward downtown. Most of these people had been there, with no food, water, or anyplace to go since Monday morning (we got to them Wed afternoon) and surrounded by 10 feet of water all around them. There was one guy who had been there since the beginning, organizing people and helping more people to get to the bridge safely as more water rose on Wednesday morning. He did not leave the bridge until everyone got off safely, even deferring to people who had gotten to the bridge Wed a.m. and, although inconvenienced by loss of power and weather damage, did have the luxury of some food and some water as late as Tuesday evening. This guy waited on the bridge until dusk, and was one of the last boats out that night. He could have easily not made it out that night and been stranded on the bridge alone.

3.) The third story may be the most compelling. I will not mince words. This was in a really rough neighborhood and we came across five seemingly unsavory characters. One had scars from what seemed to be gunshot wounds. We found these guys at a two-story recreational complex, one of the only two-story buildings in the neighborhood. They broke into the center and tried to rustle as many people as possible from the neighborhood into the center. These guys stayed outside in the center all day, getting everyone out of the rec center onto boats. We approached them at approximately 6:30 p.m., obviously one of the last trips of the day, and they sent us further into the neighborhood to get more people out of homes and off rooftops instead of getting on themselves. This at the risk of their not getting out and having to stay in the water for an undetermined (you have to understand the uncertainly that all of the people in these accounts faced without having any info on the rescue efforts, how far or deep the flooding was, or where to go if they want to swim or walk out) amount of time. These five guys were on the last boat out of the neighborhood at sundown. They were incredibly grateful, mentioned numerous times 'God is going to bless y'all for this'. When we got them to the dock, they offered us an Allen Iverson jersey off of one of their backs as a gesture of gratitude, which was literally probably the most valuable possession among them all. Obviously, we declined, but I remain tremendously impacted by this gesture.

I don't know what to do with all of this, but I think we need to get this story out. Some of what is being portrayed among the media is happening and is terrible, but it is among a very small group of people, not the majority. They make it seem like New Orleans has somehow taken the atmosphere of the mobs in Mogadishu portrayed in the book and movie "Black Hawk Down," which is making volunteers (including us) more hesitant and rescue attempts more difficult. As a result, people are dying. My family has been volunteering at the shelters here in Houma and can count on one hand the number of people among thousands who have not said "Thank You." or "God Bless You." Their lives shattered and families torn apart, gracious just to have us serve them beans and rice.

If anything, these eight people's stories deserve to be told, so that people across the world will know what they really did in the midst of this devastation. So that it will not be assumed that they were looting hospitals, they were shooting at helicopters. It must be known that they, like many other people that we encountered, sacrificed themselves during all of this to help other people in more dire straits than their own.

It is also important to know that this account is coming from someone who is politically conservative, believes in capitalism and free enterprise, and is traditionally against many of the opinions and stances of activists like Michael Moore and other liberals on most of the hot-topic political issues of the day. Believe me, I am not the political activist. This transcends politics. This is about humanity and helping mankind. We need to get these people out. Save their lives. We can sort out all of the political and social issues later. People need to know the truth of what is going on at the ground level so that they know that New Orleans and the people stranded there are, despite being panicked and desperate, gracious people and they deserve the chance to live. They need all of our help, as well.

This is an accurate account of things. Jeffery Rau would probably tell the same exact stories.

Robert LeBlanc


Update: If you follow the original link and read the cries for help, it will break your heart. It will also put the lie to the "they should have known better" narrative. Some should have. Not all:


## Posted at 17:10 in category /misc [permalink] [top]

Wed, 31 Aug 2005

Pacific Northwest Software Quality Conference

I've long had affection for PNSQC. It's a regional conference, but the only evidence of that is its size. The content is more like what you'd expect from a national conference. It's in Portland, Oregon, which is a little bit of a hotbed of Agile: Ward Cunningham, Jim Shore, Rebecca Wirfs-Brock, Steven Newton - all these people are Portland locals (though I suppose Ward may be completely gone now). Of the six keynote and invited speakers, four are pretty firmly associated with Agile: Mike Cohn, Esther Derby, Linda Rising, and me.

They have a PDF brochure.

October tenth through the twelfth.

## Posted at 16:39 in category /conferences [permalink] [top]

Static typing, again

Sometimes when demoing Ruby, I get the reaction that runtime typing is too error-prone. I'm not alone in responding by saying that unit tests will catch the bugs a static typechecker would, plus others. So why use both?

That's something of a flip response, and it's not really true. Here's an example I stumbled over while working on a chapter of Scripting for Testers.

Here's code that sets a variable:

start_date = month_before(Time.now)

Here's one of the methods that consume it:

def svn_log(subsystem, start_date)
  credentials = "--username notme --password not-the-real-one"
  timespan = "--revision 'HEAD:{#{start_date}}'"
  root = "https://svn.pragprog.com/Bookshelf/titles/BMSFT/Book/code/"

  `svn log #{credentials} #{timespan} #{root}/#{subsystem}`
end

Here's how I called that method in a test:

svn_log('inventory', '2005-08-01')

The test passed. All the methods were tested, and all the tests passed. But the program failed:

$ ruby churn.rb
subversion/clients/cmdline/main.c:832: (apr_err=205000)
svn: Syntax error in revision argument 'HEAD:{Wed Aug 03 12:51:38 CDT 2005}'
subversion/clients/cmdline/main.c:832: (apr_err=205000)
svn: Syntax error in revision argument 'HEAD:{Wed Aug 03 12:51:38 CDT 2005}'
Changes since 2005-08-03:
inventory (-1)
churn (-1)

The problem is that I unit-tested svn_log with a string in the format Subversion expects, but month_before returns a Ruby Time object. When svn_log gets one of those, it formats it into the wrong format, which makes Subversion complain.

The root cause of the bug was my cretinous choice of a variable name. I should not have named a variable expected to contain a Time with the name "date", especially not when the idea "date" is part of the usage domain. Nevertheless, static typing would have caught it while dynamic typing did not.

However, it seems to me that such errors will be caught if the whole-system tests exercise every call in the app. Suppose you're doing TDD starting from business-facing tests. Every call in the app would have to be traceable to a test whose failure motivated (perhaps very indirectly) that call, therefore every call in the app would be exercised by the complete suite (modulo some smallish background rate of human coding/design error), therefore there is no need for typechecking. But I'm perhaps being too flip.
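Whatever the merits of typechecking here, the bug itself is easy to guard against. Here's a sketch (the helper name is mine, not from the chapter) that normalizes either a Time, which is what month_before returns, or an already formatted string, which is what my unit test passed, into the YYYY-MM-DD form Subversion's --revision syntax wants:

```ruby
# Accept either a Time or a pre-formatted date string and emit
# the format Subversion expects.
def svn_date(start_date)
  return start_date.strftime('%Y-%m-%d') if start_date.respond_to?(:strftime)
  start_date.to_s
end
```

With svn_log calling svn_date on its argument, both the unit test's string and the program's real Time would produce the same 'HEAD:{2005-08-01}'-style revision argument.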

P.S. Reviewers of Scripting: the example has been slightly tweaked for this posting.

## Posted at 16:17 in category /agile [permalink] [top]

Tue, 30 Aug 2005

Video capture tools

The author of the configuration video I mentioned earlier writes:

I stumbled across Wink whilst looking for a way to more easily describe how to use Hermes - text is woefully inadequate for a complicated GUI.

I've also used Wink to capture examples of GUI bugs and then pass them on to the companies/developers and attach them to bug reports.

It only takes about a quarter of the time to create a quick tutorial or demo compared to trying to describe in words and pictures - it's also more fun as an author and when time is tight on an opensource project it's invaluable.

Wink creates Flash which is easily viewable by all - and best of all it's free. Do the author a favour and spread the word as it's a great tool to use in testing.

It's for Windows and x86 Linux. What do people use on Macs? I was going to use Snapz Pro X and iMovie - but I'm completely ignorant.

## Posted at 06:36 in category /misc [permalink] [top]

Growing a test harness

I'm fond of this article (pdf), by Kevin Lawrence, about growing a test harness.

I have worked with many testing organizations where a common pattern is repeated over and over. It goes something like this - we realize there is more manual testing to do than time available, we decide to automate the testing, and we begin working on a test harness. Several weeks later, we have the start of a harness, but it's barely useful and we still have not written any tests. At this point, we're behind - so we abandon the harness and revert to manual testing.

## Posted at 06:23 in category /testing [permalink] [top]

Sun, 28 Aug 2005

Hints instead of rules

Via John D. Mitchell, an interesting article about a minimalist reaction to the tunnel vision rules promote:

Hans Monderman is a traffic engineer who hates traffic signs. Oh, he can put up with the well-placed speed limit placard or a dangerous curve warning on a major highway, but Monderman considers most signs to be not only annoying but downright dangerous. To him, they are an admission of failure, a sign - literally - that a road designer somewhere hasn't done his job. "The trouble with traffic engineers is that when there's a problem with a road, they always try to add something," Monderman says. "To my mind, it's much better to remove things."

Monderman is one of the leaders of a new breed of traffic engineer - equal parts urban designer, social scientist, civil engineer, and psychologist. The approach is radically counterintuitive: Build roads that seem dangerous, and they'll be safer.

Update: In response, Randy Mcdonald writes:

Build roads that seem dangerous, and they'll be safer.

was a principle that I first encountered when I read that Kowloon Airport in Hong Kong (I think that is its name, ICBW) had a remarkable safety record, given that the landing process was very terrifying. Ever since I read that, I use the term I put in my subject line [The Kowloon Airport Syndrome] for such situations.

Having flown into the old Hong Kong airport, that strikes a chord with me.

It occurs to me that I want to take the idea beyond safety. What I want to remember from the article is how physical space can be engineered to cause people to attend to what's easily ignored. What they might not see "in a state of nature" can be made visible. Ditto for those things that rules divert attention from (because following the rules is supposed to eliminate the need for attention).

Consider our friend the big visible chart. They are supposed to fall in that middle ground - not rules, but still acting to nudge people. Sometimes they don't work. Why not? Ditto for standup meetings, bullpens, etc. How can they be better engineered?

## Posted at 08:55 in category /misc [permalink] [top]

Fri, 26 Aug 2005

Logging and children

A program should be able to answer the two questions parents so often ask young children:

  • All right, which of you [subsystems] did it?

  • What on earth were you thinking?
    (What chain of events led to the wrong action? What relevant "facts" were believed at that moment?)

Logging is an important tool toward that end. I think it's underused and too often misused. I wrote some patterns for using logging for PLoP 2000 (pdf). Pretty mediocre. Who's written what I should have?

The TextTest people advocate using logging to capture the expected results of a business-facing test. I like that idea when working on ugly legacy code.

## Posted at 08:57 in category /coding [permalink] [top]

Wed, 24 Aug 2005

Still more on counterexamples

Due to conversations with Jonathan Kohl and John Mitchell, a bit more on counterexamples.

I now think that what I'm wondering about is team learning. I want to think more about two questions:

  • Say someone comes up with a counterexample, perhaps that one kind of user uses the product really differently. How is that integrated into the mindset of the team? That is, how does it become an example of an extended model of product use? (I fear too often it stays as an awkward, unintegrated counterexample.)

    Take the blocks world example. In Winston's work, he taught a computer to identify arches by giving it examples and counterexamples. (Eugene Wallingford confirms that the counterexamples were necessary.) In that world, an arch was two pillars of blocks with a crosspiece. The counterexamples included, if I remember correctly, arches without a top (just two pillars) and maybe a crosspiece balanced on a single pillar.

    It's fine and necessary for a researcher to teach a computer - or a product owner a development team - about already understood ideas like "arch". But it's even more fine when the process of teaching surprises the teacher with a new, useful, and more expansive understanding of the domain. I want more surprise in the world.

  • Is there a way to give counterexamples elevated importance in the team's routine action? So that it isn't exceptional to integrate them into the domain model?

    One thing testers do is generate counterexamples by, for example, thinking of unexpected patterns of use. What happens when those unexpected patterns reveal bugs? (When, in Bret Pettichord's definition of "bug", the results bug someone.) The bugs may turn into new stories for the team, but in my experience, they're rarely a prompt to sit down and think about larger implications.

    An analogy: that's as if the refactoring step got left out of the TDD loop. It is when the programmer acts to remove duplication and make code intention-revealing that unexpected classes arise. Without the refactoring, the code would stay a mass of confusing special cases.

    Sometimes - as in the Advancer example I cite so compulsively - the unexpected classes reflect back into the domain and become part of the ubiquitous language. So perhaps that reflection is one way to make incorporating counterexamples routine. We tend to think of the relationship between product expert and team as mainly directional, one of master to apprentice: the master teaches the apprentice what she needs to know. Information about the domain flows from the master to the apprentice. There's a conversation, yes, but the apprentice's part in the conversation is to ask questions about the domain, to explain the costs of coping with the domain in a certain way, to suggest cheaper ways of coping - but not to change the expert's understanding of the domain. Perhaps we should expect the latter.

    Put another way: suppose we grant that a project develops its own creole - its own jargon - that allows the domain expert(s) and technical team to work effectively with each other. Something to keep casual track of would be how many nouns and verbs in the creole originated in the code.

## Posted at 08:02 in category /ideas [permalink] [top]

More on video

In response to my note on Jim Shore's video, Chris McMahon points to HermesJMS, an open source tool for managing message queues. He says:

Configuring any such tool is a chore, but the HermesJMS author has included video on how to configure all of the options of the tool: check out the "Demos" links from the left side of the home page for a really elegant use of video to explain complex activity in a sophisticated tool.

I should also mention the video for Ruby on Rails. The speed with which things get done is much more apparent on video than it would be in text.

## Posted at 08:02 in category /misc [permalink] [top]

Mon, 22 Aug 2005

OOPSLA hotel room promotion

Richard P. Gabriel, program chair of OOPSLA, has started a promotion. The person who refers the most attendees by September 15 (up to a max of one per day) gets to stay in rpg's suite. (Hotels "comp" conference organizers with free suites. It's one of the compensations for doing the work.) I don't care about staying in a suite. What attracts me is the idea of a free hotel room. If you register through the link in this paragraph or the image on the right, you will subsidize my cheapskatery.

Or you may prefer to compete with me.

## Posted at 09:29 in category /conferences [permalink] [top]

More on counterexamples

Andy Schneider responded to my counterexamples post with the following. I think they're neat ideas.

  1. I express project scope in terms of what the project is delivering and what it is not delivering. I learnt to do this in 1994, after listening to a bunch of people interpret my scope statements in different ways, depending on what they wanted to read into them. On the surface it seems daft to list all the things a project is not, it'd be a long list. However, there is always some obvious set of expectations you know you aren't going to fill and some obvious confusions. I use those to draw up my 'Is Not' Lists.

  2. I'm writing a lot of papers laying down design principles for common architectural scenarios, trying to get some re-use at the design level and also trying to improve productivity by having the boring stuff already sorted for 80% of the cases. I communicate my principles with a narrative text within which the user 'discovers' the principles (which I highlight so it can be read by consuming the principles). At the end of the paper I normally write a section labelled something like 'Implications'. Here I walk through a set of counter-examples that describe practices that contradict the principles. This gets people to think about the implications of what's being said. Creates me a bunch of work working through the feedback, as these sections always elicit more feedback than the rest. If I didn't provide counter-examples no one would consider the space not covered or excluded by the principles.

    So, I've learnt it is useful, I have seen the fact it gets people to think about what something is not and the feedback from people is always better for it. In many ways it is opposite to a politician's approach, where they avoid counterexamples because they want you to read into their words what you want to. They don't want you to consider the space not covered or excluded.

(Reprinted with permission.)

## Posted at 08:47 in category /ideas [permalink] [top]

Sun, 21 Aug 2005


At Agile2005, Josh Kerievsky gave an invited talk. He advocated using what I guess you could call computer-aided instruction. That was partly inspired by the amount of time his coaches spend re-covering old ground and partly by what he's seen at the North American Simulation and Gaming Conferences. (That conference has been on my list for years - because of Josh - but to my shame I've yet to go.)

One of the things he pointed out is that instructional videos are now much much easier to create and distribute. In his talk, I resolved to produce one about using Fit in a test-first style. James Shore apparently also resolved to create one, and he did. It's a slide show explanation, rather than a demo, but I thought the use of animation got ideas across in a much tighter and memorable way than text+pictures could have.

It's on NUnitAsp, in which I have absolutely no interest. But now I know something about it, and I'm glad.

## Posted at 10:06 in category /misc [permalink] [top]

Flow-style vs. declarative-style Fit fixtures

Jim Shore has a note about the dangers of flow-style Fit fixtures (ActionFixture, DoFixture, etc.). One of the dangers has been known for a while in the pure testing world (Fit is not the first tool to use a tabular format for tests): it's that the temptation to add programming language constructs is strong, and the resulting programming languages are usually bad ones.

But the more novel point is that flow-style fixtures divert attention from the essence of what is to be programmed (typically a business rule). By doing so, they stop team learning short. There's no question but that business experts like to think in flow style: do this, see that, do this, see that... However, the point of team communication is to build a group understanding of the domain, problem to be solved, and good solution. It is not to draw forth some existing solution buried in the business expert's head and shoot it across to the technical side.

Building that group understanding means building a way of talking to each other that lets the group make progress. Just as programmers should not stop with their first design, but continue to work the code in response to stories, the team should not stop with the first way of notating tests. They should shape useful abstractions.

To my mind, that's especially important because I believe that the story of a project should partly be a story of how the business expert came to understand more deeply the domain or what people need to do within it. A project in which that doesn't happen is a failure. Sticking with the first and easiest way of describing tests is a way to fail.

I'm only a little bit bold in saying that skittishness about flow fixtures is an emerging consensus in the Fit world. Which is not to say that you should never use them. (And, if you do, you ought to be using the DoFixture.)

## Posted at 09:52 in category /fit [permalink] [top]

Capture/replay below the GUI

What with this whole test-first thing, some people's attention has drifted away from capture/replay GUI testing tools. I think that's good. But I was reminded today of an oldish article from Mark Silverstein on doing capture below the GUI (pdf). It relies on a clear separation between GUI and guts, with the GUI talking to the guts using the Command pattern.

If you're going to do capture, that seems the way to do it, especially since it's yet another way to encourage layering. If - as is extremely likely - the resulting scripts will be read by humans, not just executed, it might also encourage good (business-facing) names for the guts API.

## Posted at 09:52 in category /testing [permalink] [top]

Mon, 08 Aug 2005


In my thinking about tests as examples, I've been thinking of them as good examples:

The right system behaves like this. And like this. And don't forget this.

But what about counterexamples?

A system that did this would be the wrong system. And so would a system that did this.

There's some evidence that differences are important in understanding.

  • The linguist Ferdinand de Saussure taught that the meaning of the word "boat" isn't "a small vessel for travel on water." Rather, the meaning of "boat" is generated by contrast with other words like "ship", "raft", "yawl", "statue of a boat", etc. (Derrida would later go on to make perhaps too much of the fact that there's no limit to the recursion, since all those other words are also defined by difference.)

  • In the early '70s, Patrick Winston wrote a program that learned the concept of "arch" from a series of examples and "near misses". My copy of his book has long since gone to the place paperclips, coathangers, and individual socks go, so I can't check if the near-miss counterexamples merely improved the program or were essential to its success.

  • My kids are now of the age (nine and ten) where they ask for dictionary-like definitions of words. But when they were younger, they more obviously learned by difference: they'd point at something, give the wrong name, then accept the correction without further discussion. ("Duck." "No, that's a goose." "Dog." "Yes, a nice dog.") Presumably the counterexamples helped with that amazing burst of vocabulary young kids have.

So what about those times when the programmer proudly calls the product owner over to show the newest screen and watches her face fall just before she says, "That's not really what I had in mind"? Or those times when a small group is talking about a story and a programmer pops up with an idea or a supposed consequence that's wrong? That's an opportunity to - briefly! - attend to what's different about the way two people are thinking.

Does anyone make explicit use of counterexamples? How? What have you learned?

## Posted at 20:05 in category /ideas [permalink] [top]

Link: Crispin on test-first customer tests

Lisa Crispin has a nice, short article on how her team uses business-facing tests to drive development. A few points I particularly like:

  • Good examples of questions someone (often the tester) should ask about even a simple story.
  • An emphasis on just-in-time test creation.
  • How the need to support tests encouraged business logic to go in the right place.

## Posted at 11:07 in category /agile [permalink] [top]

Sun, 07 Aug 2005


As of today, I've lost 20 pounds. (Naturally, I'm now heading off for New Hampshire, Land of the Giant Ice Cream Cone, for a week of vacation.) It seems to me I need to lose around 10 more pounds of fat and gain back at least 5 pounds of muscle to be in non-pathetic shape. Blogging my weight has definitely helped.

## Posted at 20:28 in category /misc [permalink] [top]


Somewhere around 1983, I took Ralph Johnson's very first course in object-oriented programming. That led to Ralph becoming my graduate advisor, which led to my being part of his reading group, which led to my reviewing some of Martin Fowler's books in draft, which led to Martin knowing me as a tester guy, which led to him getting me invited to a workshop in Utah, which meant I became an author of the Manifesto for Agile Software Development, which concentrated my attention on Agile, which is the reason I'm writing this sentence.

It's not just me: even the most talented and virtuous of us need luck. The same is true of fields like Agile. Suppose Kent Beck hadn't met Ward Cunningham? Suppose they hadn't become Smalltalk programmers? (I've talked a little to Ward about Smalltalk's influence on Agile, and one of the things he said was, "Smalltalk people were more offended by software engineering than other people." So they did more about it.) What if Alistair Cockburn hadn't early on gotten the job of surveying projects within IBM to see what really worked? I don't know the whole story of refactoring, but would it have made the big time if Ralph Johnson hadn't been immersed in the Smalltalk world at the same time Bill Opdyke was looking for a dissertation topic?

One of the things I've long taught testers is that they must work to get dumb luck on their side. When filling a database, it's better to make up new names than to reuse the same ones out of habit - even if what you're testing has nothing to do with name-handling code. The more variety in the names you use, the more likely you are to stumble over a name-handling bug.
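As a sketch of that advice (the particular names and the helper are mine, purely illustrative), a test-data generator can bake the variety in so nobody has to remember to supply it:

```ruby
# Names chosen to accidentally exercise name-handling code: an embedded
# quote, internal spaces, very short, very long, non-ASCII.
CURIOUS_NAMES = [
  'Pat',
  "O'Malley",
  'van der Berg',
  'Ng',
  'a' * 64,
  'José',
].freeze

# Cycle through the list, appending a counter to keep each name unique,
# instead of reusing the same name out of habit.
def varied_name(counter)
  "#{CURIOUS_NAMES[counter % CURIOUS_NAMES.size]} #{counter}"
end
```

Filling a database becomes `(0...n).map { |i| varied_name(i) }`, and a name-handling bug gets a chance to surface even when names have nothing to do with what you're testing.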

Moreover, all things being equal, one complicated test that incorporates five test ideas is better at accidentally finding bugs than five simple tests. (This is for after-the-fact testing, note, and all things aren't equal. You have to be aware of the tradeoffs.)

There are a whole bunch of talented people out there poised to learn new Agile ideas and techniques. If we get luck on their side - our side - they'll learn and then teach us. If we don't, some of them will take their talents elsewhere. Others will learn, but what they've learned won't spread.

The Gordon Pask award, I'm now realizing, was partly motivated by that. It's a way of shining a spotlight on people, of saying "Pay attention to these two." That will cause their ideas to spread faster. And, as a public and tangible acknowledgement of accomplishment, it will motivate them to work harder for us.

It has its flaws, though:

  • It's only two people, and the pool of deserving talent is much larger. (I especially worry that deserving people will feel slighted and perhaps reduce their efforts.)

  • There's no mechanism for rewarding groups, such as user groups. The award shares some of the problems of applying traditional, individual-centric mechanisms of employee evaluation and promotion to Agile teams.

  • It's maybe uneasily focused on the potential to become a Big Name. Becoming a big name is not usually merely a matter of invention, discovery, or synthesis; it also requires the ability to explain and to promote. To the extent that we reward all of those things, are we shortchanging what we need most?

  • It's a big reward - something like getting tenure. But academics don't just toil away for years and then either get tenure or not. Instead, they proceed incrementally, paper after paper, amassing a cumulative record of smaller rewards (in the form of their Curriculum Vitae). Should there be an equivalent?

  • Maybe amplifying the rewards for smaller accomplishments - like experience reports - would produce a better return.

  • It's subjective, and that means both bias and the perception of bias are inevitable. The political minefields are many.

I'm still happy about it, but I said we'd be improving it over the next year. What more should be done?

(Thanks to Steve Freeman for conversation about this.)

## Posted at 17:55 in category /conferences [permalink] [top]

Sat, 06 Aug 2005

Two solutions

(After a couple of days trying, off and on, to make a narrative of threats to Agile work, I give up. I also give up on the cybernetics tie-in. It gave me ideas, but I can't use it to express them. So it comes down to three solutions and their justifications. And every time I write about the last of them, I see it in a different way, so I'm going to cut my losses and put up the first two until I can say something coherent about the third.)

What we need is...

... an undergraduate-level textbook and course

I'm thinking here of an Agile alternative to texts like Sommerville and Pressman: books that intend to be the only text for a survey/project course, books that go through edition after edition, books that are referred to by author name rather than title.

I'm inclined to agree, with Kuhn, that textbooks both mark and help cause the transition of a particular style ("paradigm") from something revolutionary to something normal. As Agile goes mainstream, a textbook will be a significant marker.

A textbook will, I'm assuming, come with instructor support: teaching guides, sample problems and solutions, advice on fitting Agile teamwork into student schedules and the physical infrastructure (dorm rooms, and carrels filled with people who don't want to hear chatter). I get the impression that Agile courses are more work for instructors than conventional high-ceremony courses, so such materials will be needed to get overworked instructors to switch to something new.

The existence of such courses is important in two ways. First, it will route more students into Agile. They'll have a "routinized" way of discovering it, instead of relying on chance contacts with inspiring people or texts. Second, it will slot Agile into the routine of hiring. Yes, that routine is impersonal and lossy in many ways. But companies hiring, say, undergraduate physics majors have an advantage. They can look at a resume and be pretty sure that the candidate knows a particular set of facts and has been drilled in a set of techniques. So scarce interview time can build on that base. We know nothing of the sort: does the person know TDD? Has she ever refactored? Does she understand the test-code-refactor cycle? Has she ever been in a standup meeting? Does she know why they work? (This ties in with Josh Kerievsky's claim at Agile2005 that the amount of time coaches spend teaching basics is retarding the spread of Agile within companies.)

I realize I'm talking about something intensely political: even when the author describes a superset of Agile ideas and techniques, rather than a common core, such a book serves to define a field. When ideas get left out, people get mad. I am convinced my schtick about tests as examples will get left out, and that's wrong, wrong, wrong. But I'll have to live with it.


  1. At Agile2005 a couple of instructors told me that they happily assign multiple books: Refactoring, a test-driven-design book, etc. I'd prefer that - if only because such books stand a chance of being opened after the final exam - so I could be wrong about some of the need for a survey textbook. But there's still the need for instructor support materials.

  2. I'm a fan of alternative education, like Dave West's Bachelor of Arts in Software Development Technology or RoleModel's apprenticeships or the Planetary Collegium. But we'd be foolish to bet against society's dominant form of job training, despite its well-known inadequacies.

  3. Bill Wake's Refactoring Workbook seems to me a great instructor guide or supplementary text. I wonder how much it's used?

... to take over a computer science department

If you say "Agile in academia" I think of people: Rick Mugridge, James Noble, Grigori Melnik, Robert Biddle, etc. The only university name that comes to mind is North Carolina State, but I just checked the home pages of all the faculty there, and it appears that Laurie Williams is the only one interested enough in Agile to mention it.

Scattered individuals have extra trouble making progress. Compare to my wife's department. There are several professors whose interests are close enough to hers that they routinely publish together. Because they're all in the same place, they can bounce ideas off each other at random moments. And I'm especially interested in how they trade favors with each other and thus exploit comparative advantage to the benefit of all. Conferences are great because they're concentrated events that raise people's emotional energy and increase cultural capital, but they're not enough.

The lack of departments with a concentration on Agile hurts development of the field in other ways. In 1976, I wanted to be an astronomer. Going to Caltech was an obvious choice. Eight years later, I was peripherally involved in the big AI boom (as a Lisp implementer). Had I wanted to get a graduate degree in AI, there would have been obvious places to go: MIT, CMU, Stanford. Going to a department with a specialty in AI is less risky than going to work with a person who specializes in AI: it's easier to recover from bad luck - personality mismatches, your mentor leaving for another university. Because of that, some potentially productive people will go into a different field. We want them all.

Accomplishing this takeover is an exercise left to the reader.

## Posted at 12:56 in category /conferences [permalink] [top]

Wed, 03 Aug 2005

Prelude to an explanation of a fear

In my Agile2005 co-keynote, I talked about why I think Agile is akin to cybernetics, at least the British cybernetics of the late 40's and on. They share an emphasis on these things:

Performance over representation

Ross Ashby wrote in 1948 that "...to the biologist the brain is not a thinking machine, it is an acting machine; it gets information and then it does something about it." What we usually think of as thinking - building a model in the brain of the world outside it - is secondary; a tool that can be used when useful, ignored when not.

Similarly, an Agile team doesn't depend on a requirements document - a model of a solution - for success. The team gets a demand for more features and does something about it: it uses whatever tools and internal resources generate the features and keep the paychecks flowing. A good model of the end product turns out not to be such an important tool.


In an as-yet-unpublished manuscript, Andrew Pickering writes that, to these cyberneticians, "... the brain's special role [is] to be an organ of adaptation. The brain is what helps us to get along in situations and environments we have never encountered before."

Similarly, the Agile team's job is to produce a steady stream of working software. Whenever the environment changes and upsets that steadiness, the team adapts until it reaches equilibrium (steady productivity or consistent growth in productivity).


The cyberneticians were fond of building complex things, working with them, and opportunistically taking advantage of their unplanned capabilities. I gave the example of Gordon Pask's ear: a device that was trained like a neural net. It took in electrical inputs, sent out electrical outputs, and the outside world's judgments on the outputs trained it to get better and better at producing favored outputs. One way it differed from neural nets is that it didn't start with artificial neurons and connections. It grew them from an undifferentiated soup.

This device had one sense designed in: the ability to detect electrical inputs. In what seems to me a later bout of whimsy, the experimenters attached a microphone to it. By vibrating the whole assembly, the experimenters provoked it to grow a sense of hearing - a wholly unplanned and unexpected sense good enough for it to distinguish 50-cycle from 100-cycle sounds.

In agile programming, it's considered unsurprising if the coding takes you in an unexpected direction and even causes the team to develop new concepts. (See Ward Cunningham's story of Advancers.)

And then I said that's not what I wanted to talk about.

What I wanted to talk about is the fact that cybernetics fizzled. If we share its approaches, might we also share its fatal flaws?

More tomorrow.

## Posted at 16:34 in category /conferences [permalink] [top]

Sat, 30 Jul 2005

The Gordon Pask Award

[image: a cybernetic device]

One day before the start of Agile2005, the Agile Alliance board voted to create the Gordon Pask Award for Contributions to Agile Practice. Here's its description:

The Gordon Pask Award recognizes people whose recent contributions to Agile Practice demonstrate, in the opinion of the Award Committee, their potential to become leaders of the field.

The award comes with a check for US$5000 and some physical object inscribed with the recipient's name. We expect two recipients per year.

The idea behind the award is that we in the Agile community need to do more to promote and encourage the rising stars of tomorrow. These are people who help other people: both indirectly, by producing tools or ideas other people use, and also through direct support of some Agile community. Rather than planning out the award, thinking through all the gotchas of deciding on recipients, and giving the first award in 2006, we decided to give the Award at the conference's closing banquet just five days later, trusting our membership to be tolerant of mistakes so long as they lead to improvement next year.

The two recipients this year are:

J. B. Rainsberger, for spending a great deal of time helping people on the testdrivendevelopment mailing list, for writing JUnit Recipes, for XP Day Toronto, and for being this year's Agile2005 tutorial chair.

Jim Shore, for his performance as a paper shepherd; for a fine experience report he gave at ADC2003 that, together with his blog, suggests a cast of thought that deserves cultivation; for his work on the Fit specification and the C# version of Fit; and for being a person who holds the Fit world together by doing the sort of organizational and cleanup tasks that are usually thankless.

The selection committee was Rachel Davies, Big Dave Thomas (a different person than Pragmatic Dave Thomas and Wendy's Dave Thomas), and me. We'll be evolving our notion of the prototypical recipient over the years. Putting nominations from the conference members into affinity clusters got us well started on that. But it was remarkably hard to narrow down to a cluster of two: there were six or seven obvious choices.

(Next year, we'll be taking nominations from the entire Agile Alliance membership. And we'll start much sooner.)

I'm really happy we did this.

Oh - you wonder who Gordon Pask is? As constant readers know, I harp on the similarities between the attitudes and approaches that characterize Agile and those of the British cyberneticians of the last century. Of those people, Gordon Pask is perhaps the closest to us. Also, we needed a logo, some cybernetic device seemed appropriate (given how the idea tied into my keynote), and Pask's ear was the only one I had to hand. (It's the picture above.)

## Posted at 07:52 in category /conferences [permalink] [top]

Thu, 14 Jul 2005

What I want to accomplish at Agile 2005

  • Survive giving a joint keynote with someone far more charismatic.

  • Sit down with people to work further on my design-driven test-driven design example. (I've been too busy.)

  • For the benefit of people considering business-facing test-driven design, find collaborators to discuss these questions:

    1. In what situations is it a bad idea?

    2. What groundwork needs to be laid? (For example, do you need to have programmer testing well ingrained before making the leap? Or can product-level TDD work before unit-level TDD?)

    3. Anyone doing something they haven't before will hit snags surprising to them but predictable by others. For example, a beginning weightlifter might not expect progress to come in spurts separated by distressingly long plateaus. They'd be more likely to succeed if warned. What snags should new "examplers" be warned about?

    Write up the results.

  • Talk to Eric Evans, Rick Mugridge, and others about the intersection between tests and ubiquitous language.

  • I sometimes think our rhetoric is too behaviorist: stimulus comes from outside the project, the project responds appropriately and also reconfigures its "circuitry" to be better at responding to such stimuli. As my foreword to the Fit book hints, I'm hung up on the notion that surprise can be internally generated - that ideas from within the project can shape the systems that "drive" it. (See Ward's story of Advancers for an example.)

    If that's (a) possible and (b) desirable, we've woefully understudied the internal conditions that bring it about. There's got to be more to it than refactoring, removing duplication, etc. I'd like to provoke some conversations about what else there is. As Pasteur would have said had he been terser, "Chance favors the prepared mind." Prepared how? (Being attracted to Hutchins's notion of distributed cognition, I'm more interested in the preparation of the team, environment, and flow of events than in preparation of the individual.)

## Posted at 09:35 in category /conferences [permalink] [top]

Fri, 08 Jul 2005

Fit book released

Mugridge and Cunningham's Fit for Developing Software is out now. If you're using or evaluating Fit, you ought to buy it. You can find my reasons here (pdf). There are sample chapters here.

## Posted at 17:25 in category /fit [permalink] [top]

Wed, 06 Jul 2005

Breaking tests for understanding

Roy Osherove has a post about breaking code to check test coverage. That reminded me of a trick I've used to counter a common fear: that changing something here will break something way over there.

I started to proudly write up that trick. But the process of putting words down on pixels makes me think I was all wrong and that it's a bad idea. Read on to see how someone who's supposed to know what he's talking about goes astray. (Or maybe it's a good idea after all.)

The picture is of a standard layered architecture. The green star at the top level is a desired user-visible change. Let's say that support for that change requires a change at the lower level - updating that cloud down at the bottom. But other code depends on that cloud. If the cloud changes, that other code might break.

The thing I sometimes do is deliberately break the cloud, then run the tests for the topmost layer. Some of those tests will fail (as shown by the upper red polygon). That tells me which user-visible behaviors depend on the cloud. Now that I know what the cloud affects, I can think more effectively about how to change it. (This all assumes that the topmost tests are comprehensive enough.)

I could run the tests at lower layers. For example, tests at the level of the lower red polygon enumerate for me how the lowest layer's interface depends on the cloud. But to be confident that I won't break something far away from the cloud, I have to know how upper layers depend on the lowest layer's to-be-changed behaviors. I'm hoping that running the upper layer tests is the easiest way to know that.
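
If the trick's unfamiliar, here's a minimal sketch of it. Every class name here is invented for illustration: sabotage the low-level "cloud," then probe the top-level behaviors to see which ones now blow up.

```ruby
# An illustrative sketch, all names invented: break the low-level code,
# then see which top-level behaviors depend on it.

class TaxTable                        # the "cloud" other code depends on
  def rate_for(state)
    0.05
  end
end

class Invoice                         # top-level code that uses the cloud
  def total(amount, state)
    amount * (1 + TaxTable.new.rate_for(state))
  end
end

class Banner                          # top-level code that doesn't
  def greeting
    "Welcome"
  end
end

# Deliberately break the cloud (reopening the class to redefine the method)...
class TaxTable
  def rate_for(state)
    raise "broken on purpose"
  end
end

# ...then run the top-level checks. The ones that fail are exactly the
# user-visible behaviors I must think about before changing TaxTable.
dependents = []
{ "Invoice#total"   => -> { Invoice.new.total(100, "IA") },
  "Banner#greeting" => -> { Banner.new.greeting } }.each do |name, probe|
  begin
    probe.call
  rescue StandardError
    dependents << name
  end
end
p dependents                          # => ["Invoice#total"]
```

In real life the probes are just the existing top-level test suite, and the sabotage is a temporary one-line raise that you revert once you've mapped the dependencies.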

But does this all stem from a sick need to get it right the first time? After all, I could just change the cloud to make the green star work, run all tests, then let any test failures tell me how to adjust the change. What I'm afraid of is that I'll have a lot of work to do and I won't be able to check in for ages because of all the failing tests.

Why not just back the code out and start again, armed with knowledge from the real change rather than inference from a pretend one? Is that so bad?

Maybe it's not so bad. Maybe it's an active good. It rubs my nose in the fact that the system is too hard to change. Maybe the tests take too long to run. Maybe there aren't enough lower-level tests. Maybe the system's structure obscures dependencies. Maybe I should fix the problems instead of inventing ways to step gingerly around them.

I often tell clients something I got from The Machine that Changed the World: the Story of Lean Production. It's that a big original motivation behind Just In Time manufacturing was not to eliminate the undeniable cost of keeping stock on hand: it was to make the process fragile. If one step in the line is mismatched to another step, you either keep stock on hand to buffer the problem or you fix the problem. Before JIT, keeping stock was the easiest reaction. After JIT, you have no choice but to fix the underlying problem.

So, by analogy, you should code like you know in your heart you should be able to. Those places you fall through the floor and plummet to your death are the places to improve. I guess not by you, in that case. By your heirs.

## Posted at 10:40 in category /coding [permalink] [top]

Test-driven Development and Beyond

Dave Astels and I will be hosting a workshop at Agile 2005 titled Test-driven Development and Beyond.

Test Driven Development is becoming a mainstream practice. However, it is a step along the way, not a final destination. This workshop will explore what the next steps and side paths are, such as Behavior Driven Development, Example Driven Development, and Story-Test Driven Development.

The idea is that we'll spend up to an hour having participants give brief pitches for what they think "lies beyond," then split into focus groups to discuss particular topics. The deliverable is a short paper outlining the various approaches discussed, together with people involved in each.

You need not submit a position paper to attend. If you'd like to present your idea (briefly! which rules out fumbling with a projector), send us your topic. Thanks.

## Posted at 08:55 in category /misc [permalink] [top]

Mon, 04 Jul 2005

Amplifying your effectiveness

I'll be leading two half-day hands-on sessions at the Amplifying Your Effectiveness conference (November 6-9, Phoenix Arizona, USA). The titles of my sessions bear witness to my belief that I'm no expert in my topics---but I do expect that collectively we can get somewhere good.

An Amateur's Guide to Communicating Requirements

We're all familiar with traditional requirements gathering: interview and observe a subset of users, then try to write clear, unambiguous, complete, and testable statements of their requirements. Many of us have tried hard to do that and failed. From that, some of us conclude that we should try harder and smarter. I conclude that the whole idea is broken. You not only can't write precise statements in here that represent the world out there, you can't even come close enough.

In this session, I hope to convince you that my claim is at least plausible. The next question is: "And then what?" We'll start to explore ways of putting ourselves in situations where we can create better systems without being able to specify requirements.


Key points:

  • Flaws with the default model.
  • At least one technique that doesn't depend on the default model.
  • The merits of practice vs. observation.
Another Amateur's Guide to Communicating Requirements

Since Plato, at least, we've been talking about creating mental models of the world. We usually think of them as like pictures, where everything you can point to in the picture matches something in the world. What if that kind of mental model is mostly beside the point?

Using exercises, we'll ask two questions: What if the power of a mental model isn't inherent in the model itself, but in the way you explain it to someone else? And what if model-building is powerful when it builds on our expertise, as social animals, at predicting what actions will make someone smile?

This session is related to An Amateur's Guide to Communicating Requirements. It's not necessary to attend both sessions.

Key points:

  • You can explain many things using examples and not much more.
  • We extrapolate better about specific people than about abstractions.

## Posted at 21:02 in category /misc [permalink] [top]

Thu, 30 Jun 2005

Hard Deadlines

A month ago, I made a consulting trip. I recommended that the client do the usual Agile thing: have shippable code with new features every few weeks, break each of those releases into stories with user-visible results, make the stories more granular than seems possible, make tasks within a story short with sharp end-points, etc.

Yesterday, I asked how it was going. I got this in reply (reprinted with permission):

I was actually really fortunate. Shortly after you were here, my wife's pregnancy came to full term. As such, I had to think of life in terms of 2-3 hour slices in case she called to tell me she was in labor. That made it much easier to have the discipline to break a project into smaller chunks.

So here's my new Agile training regimen:

  • The team must be composed of people who plan on having a child soon.

  • All pregnancies must commence roughly seven months before the project is to start. Remember: on Agile teams, it's not someone's job, it's everyone's job.

  • I do not require the use of drugs to keep the mother-to-be on the edge of labor until I judge that she or her partner (whichever is the programmer) really gets the idea of short tasks. As a liberal, I don't believe family life should be subordinated to the needs of the corporation.

It's odd, with innovative ideas like this, that I don't get more business.

## Posted at 09:16 in category /agile [permalink] [top]

Wed, 29 Jun 2005

Agile 2005 may sell out

As of Monday, Agile2005 had over 500 people registered. The hotel is full, and the conference may be capped at 600. If you're a wait-until-the-last-minute type, this may be the last minute.

## Posted at 09:56 in category /agile [permalink] [top]

Thu, 16 Jun 2005

Where to put code

On the agile-testing list, we were discussing how reuse could work in an Agile project. How do you find code you can reuse? What prevents you from reinventing it? I rambled for a bit, then finished with this:

As I paraphrased at my PhD prelim exam (to unimpressive effect), "Knowledge is of two kinds. You can know a thing, or you can know where to find it." (Samuel Johnson) In a system that's constantly migrating toward a good structure, the problem of knowing where to look it up is simpler than in the typical big ball of mud system. You can look in the place it ought to be. If it's not there, there's a reasonable chance it's not anywhere.

To this, Kevin Lawrence gave this wonderful advice about what to do after you don't find the code and start writing it yourself:

Put it where you looked.

He gives the C2 wiki as the source, but I can't find it there.

## Posted at 17:16 in category /coding [permalink] [top]

Wed, 15 Jun 2005

User modeling in exploratory testing

Jonathan Kohl has an illuminating story about how he used an informal, observation-based user model to repeat a previously unrepeatable bug.

The developer thought I was insane when he saw me rocking my desk with my knee while typing...

## Posted at 08:59 in category /testing [permalink] [top]

Sat, 11 Jun 2005

Reviewers needed for Scripting for Testers

I have four chapters of Scripting for Testers almost ready for review. Right now, I'm mainly looking for people who don't know any programming language or people who have a passing knowledge of some scripting language (such as the ones GUI testing tools use). I want to see if the tone and level are right.

If interested, send me mail.

## Posted at 15:19 in category /testing [permalink] [top]

Wed, 08 Jun 2005

A certain sense of balance (links)

Some time ago, I wrote a post about adding an if statement. I mentioned a satire about conceptual cleanliness leading to complicated code. This isn't it, I don't think, but it's along the same lines. (Thanks to John Wiseman and a couple of others I forget.)

Richard P. Gabriel passed along a piece he wrote on the Y combinator. Like the one I linked to, it uses factorial. Quite interesting to compare and contrast the two explanations.

## Posted at 08:16 in category /misc [permalink] [top]

Tue, 07 Jun 2005

My niche as an editor

I'm dialing down my work on Better Software magazine to one article an issue. I'll concentrate on helping early-adopter authors bring now-safe techniques to the early mainstream. Not entirely coincidentally, these are the articles that I pull out of my back pocket for clients. "You need an X, and here's an article about it."

Here are some examples:

There are other examples, but those are ones that came to mind both because of recent experience and because I didn't have to search for links. In upcoming issues, I'll have articles on using examples (Fit tests) for specification, what it's like to be an XP customer, and building a strangler application.

If you have ideas for an article that matches this theme, send me a proposal. Forgive tardy replies: I'm digging myself out of a pile of urgency.

## Posted at 09:10 in category /misc [permalink] [top]

Mon, 06 Jun 2005

Threat trees and others

Eric Jarvi writes a post on threat trees in response to my post on fine-grained guidance in exploratory testing. It reminded me of a short paper by Cem Kaner that I've always liked: Negotiating Testing Resources: a Collaborative Approach. Should be appealing to the big visible charts crowd. I recommend both links to your attention.

## Posted at 11:57 in category /testing [permalink] [top]

Form, content, and the structure of relationships

From an op-ed piece by Stanley Fish:

On the first day of my freshman writing class I give the students this assignment: You will be divided into groups and by the end of the semester each group will be expected to have created its own language, complete with a syntax, a lexicon, a text, rules for translating the text and strategies for teaching your language to fellow students. The language you create cannot be English or a slightly coded version of English, but it must be capable of indicating the distinctions - between tense, number, manner, mood, agency and the like - that English enables us to make.

You can imagine the reaction of students who think that "syntax" is something cigarette smokers pay, guess that "lexicon" is the name of a rebel tribe inhabiting a galaxy far away, and haven't the slightest idea of what words like "tense," "manner" and "mood" mean. They think I'm crazy. Yet 14 weeks later - and this happens every time - each group has produced a language of incredible sophistication and precision.

(Hat tip to rpg for the link.)

One of my pet obsessions these days is learning and teaching two guidelines of well-factored code:

Don't repeat yourself

Eliminating duplication seems key, but it's surprising how many programmers don't care about even the kind of duplication a program could find, much less subtler forms like boolean arguments.
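
To make the subtler kind concrete, here's a hedged sketch (names invented) of a boolean argument as duplication: the flag merges two methods into one, so every call site quietly repeats the same decision.

```ruby
class Formatter
  # Before: two behaviors merged behind a flag. Every caller duplicates
  # the knowledge of what a bare `true` means here, and no tool that
  # looks for repeated text will ever notice.
  def render(text, as_html)
    as_html ? "<p>#{text}</p>" : text
  end

  # After: two intention-revealing methods. The decision is made once,
  # at the point where the difference is actually understood.
  def render_html(text)
    "<p>#{text}</p>"
  end

  def render_plain(text)
    text
  end
end

f = Formatter.new
puts f.render("hello", true)    # => <p>hello</p>
puts f.render_html("hello")     # => <p>hello</p>
```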

And yet, we have people like (I believe) Ralph Johnson saying:

Once and Only Once is a profound concept, but difficult to apply. I've spent my entire professional life (25 years) learning how to apply it to programs.

And we have testimonials like Ron Jeffries':

I once saw Beck declare two patches of almost completely different code to be 'duplication', change them so that they WERE duplication, and then remove the newly inserted duplication to come up with something obviously better. [Ron: it would be tremendous if you could reconstruct the code.]

(Both quotes found here.)

Could we come up with words akin to "tense" and "mood" that make relationships of duplication clear to people who are middling language-writers but not language-understanders?

Intention-revealing names

As discussions of good method names show, we don't even have the right words to express this idea. "Intent" isn't quite right, and it's not exactly that good names say "what" vs. "how", and it's not really that names say "why" - but it is something. How to talk about that something?

Now, I'm certainly a fan of tacit understanding of what's inadequately captured by rules, so I don't believe that the two guidelines above can be completely captured by some syntax governing relationships, any more than I believe everything that natural language does is a matter of syntax. But if a knowledge of syntactic rules can help English writers, I expect a syntax of relationships (not the BNF of the language) could help Java writers.

I'm aware that I'm boldly heading to the 50's and structuralism, at least two intellectual fads ago. What of it?

## Posted at 08:51 in category /misc [permalink] [top]

Fri, 03 Jun 2005

Random thoughts

Anniversary: Next year will be the 30th anniversary of the first time I sent email. As my anniversary gift, I would like a spammer ground up and used to seed 1700 pearl oysters (1700 being the approximate number of messages in my Junk mailbox last time I emptied it). If possible, make it the "Greets! I offer you full base of accounts with passwords..." guy.

How Agile will destroy open source: I wanted to talk about Gold Cards in a consulting report, so I went to the original paper: "Innovation and Sustainability with Gold Cards" (Higman, Mackinnon, Moore, and Pierce). There I read:

It's good for [a programmer's] skills to be able to practice new techniques [...] Some of us tried to work on this stuff after working hours, however a full day of guilt-free paired programming is extremely tiring.

Where does open source software come from? From programmers who are frustrated that their day job doesn't involve very much, you know, programming. So they do it for free at night. And when the evening finds them sated...?

## Posted at 15:46 in category /junk [permalink] [top]

Sun, 15 May 2005

Expert code

I've been working on the chapter in Scripting for Testers where readers start defining classes. As the story behind the following text begins, I have to figure out how to make a failing test pass.

Here's a thought about a solution: After pushing the new context line onto potential_context, I could check if it's too long. If so, I should shorten it. That would look like this:

def unusual_lines(line_array)
  return_value = []
  potential_context = []
  line_array.each { | one_line |
    if unusual?(one_line)
      return_value += potential_context
      return_value.push(one_line)
      potential_context = []
    else
      potential_context.push(one_line)
      potential_context.shift if potential_context.length > 5
    end
  }
  return_value
end

It would pass the test, but am I happy with it?

No. I fear if statements. I especially fear if statements within other if statements. They're too hard to get right and too confusing to read. Fortunately, there are often ways to set things up so that the code always makes the same decision---and hence doesn't need an if. Here, there are two possibilities: make it so that the code always throws away the first element, or so that it never has to throw away any element.

When I thought about it that way, I realized that if potential_context always had five elements, the code would always shift the first one away...

Then there's text on how to make that work. I'm teaching an old idea: replace explicit runtime decisions with appropriate setup.
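
Here's a sketch of where that leads - my reconstruction, not the book's actual listing, with a stand-in definition of unusual? so it runs: pad potential_context to its full length up front, so every push pairs with an unconditional shift.

```ruby
# Reconstruction sketch, not the book's listing. unusual? is an invented
# stand-in: here, "unusual" lines start with "!".
def unusual?(line)
  line.start_with?("!")
end

def unusual_lines(line_array)
  return_value = []
  potential_context = [nil] * 5      # always exactly five elements
  line_array.each { | one_line |
    if unusual?(one_line)
      return_value += potential_context.compact  # drop the nil padding
      return_value.push(one_line)
      potential_context = [nil] * 5
    else
      potential_context.push(one_line)
      potential_context.shift        # unconditional: no length check, no nested if
    end
  }
  return_value
end

p unusual_lines(["a", "b", "!boom", "c"])   # => ["a", "b", "!boom"]
```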

Just after writing that, I flashed on something that Gary Klein writes in Sources of Power. His research group was preparing to interview expert firefighters about their decision-making.

We asked the commander to tell us about some difficult decisions he had made.

"I don't make decisions," he announced to his startled listeners. "I don't remember when I've ever made a decision." [...]

He agreed there were options, yet it was usually obvious what to do in any given situation. He insisted he never [compared options]. There just was no time. (pp. 11-12)

It was not that the commanders were refusing to compare options; rather, they did not have to compare options. [...] [They] could come up with a good course of action right from the start. [...] Even faced with a complex situation, the commanders would see it as familiar and know how to react.

The commanders' secret was that their experience let them see a situation, even a nonroutine one, as an example of a prototype, so they knew the typical course of action right away. (p. 17)

[T]here are times for deliberating about options. Usually these are times when experience is inadequate and logical thinking is a substitute for recognizing a situation as typical. [...] Deliberating about options makes a lot of sense for novices [...] (p. 23)

What makes someone an expert is the ability to act appropriately without resorting to conventional rational thought, except perhaps for after-the-fact explanations. (And as both Sources of Power and oodles of old work in expert systems show, experts often find it difficult or impossible to explain the reasons behind their actions, even to themselves.)

If the parallel isn't completely bogus, expert code will have few if statements and be hard to debug.

## Posted at 16:14 in category /coding [permalink] [top]

Wed, 11 May 2005

Approaches to legacy code

Some recent and pending encounters with legacy code have made me think there may be three different broad approaches.

Rewrite and throw away
In this approach, you declare that the application will be broken for some period of time. Like a cartoon character, you dive into it, produce a furious cloud of dust that obscures you (picture gears and such flying out of the cloud), and finish with a new thing (or a thing with a shiny new component installed).

Refactor into submission
This is the approach that Feathers teaches in his fine Working Effectively With Legacy Code. In it, you gradually wrap pieces of the application in tests, gingerly fixing dependencies to make that tractable. Then, when some part of the code is wrapped and decoupled, it can be changed safely. I think of this as islands of order growing amongst the seething chaos of the product.

Strangle it
The term strangler application is due to Martin Fowler. The image is of a vine growing up around a tree, gradually killing it, until eventually the only thing left alive is the vine, roughly in the shape of the original tree. You don't (much) fix up existing code. Instead, when you need something new or changed, you begin by building fresh, greenfield code. The legacy code can call into the greenfield code, but access in the other direction is minimized and highly controlled. (I think it's fair to say a project Strangling its application is using Feathers's Chia Pet Pattern (pdf, slide 9) as the overwhelmingly dominant tool/metaphor.)
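In miniature, the strangler shape looks like this. (A Ruby sketch with invented names, just to show the direction of the calls.)

```ruby
# Greenfield code: new, tested, growing -- the vine.
module Greenfield
  class InvoiceCalculator
    def total(line_items)
      line_items.sum { |item| item[:price] * item[:quantity] }
    end
  end
end

# Legacy entry point: existing callers still use it, but it has been
# reduced to a thin shim into the greenfield code. Calls flow from
# legacy into greenfield; the reverse direction is avoided.
def legacy_invoice_total(line_items)
  Greenfield::InvoiceCalculator.new.total(line_items)
end
```

Over time, more and more of the legacy surface becomes shims like this, until the old code is only a shape around the new.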

As the last parenthetical remark suggests, these approaches shade into one another. Nevertheless, they seem to me three distinguishable stances toward legacy code. My question is: how do you decide which stance to take?

In the 60's, renowned software engineer Rocket J. Squirrel drew on his extensive experience to make the definitive comment on Rewrite and Throw Away: "That trick never works!". Which isn't invariably true, of course, but it's a risky strategy. The other two methods can still deliver a steady stream of features to the business, albeit at a slower rate (in the short term). In this one, the team vanishes from view of the business, doing who knows what, promising great things for the future. That's an unstable situation, because the business is liable to get fed up and start insisting on features ASAP. It's also a dangerous situation for the programmers. Because they have to finish everything before anything works, they might not know they're way off course until far too late.

Still, Rewrite and Throw Away is easier than refactoring into submission. When faced by a tangled mass of code, it's hard to know where to start. And, in my (lack of) experience, you're much more likely to have a refactoring that flames out and has to be rolled back, just because of some complex interconnectedness you didn't grasp at first. Until you've learned-through-doing, it's frustrating, and frustration leads to short-cuts (which is how you got the legacy code in the first place, probably).

Strangling the application has the advantage that you've consciously decided you shan't fix the old code, so the dangers of touching it are (somewhat) lessened. It also has the advantage that you can quickly create a concrete architecture (a set of layered subsystems, say) to guide your forward movement. (This insight is due to Michael Thomas, who's writing me an article on strangling code.) I don't have the experience to speak to Strangling's disadvantages (except that I can imagine long debates about the concrete architecture preventing people from getting moving).

I have this idea that there must be a set of patterns or lore that would help people navigate among those choices. Who will write it down?

## Posted at 09:14 in category /coding [permalink] [top]

Tue, 10 May 2005

Fine-grained guidance in exploratory testing

I'm going to be hosting a couple of sessions at the AYE conference. I was tailoring my standard biographical blurb for it when a phrase leapt out at me:

[My] approach is evolving. Today, [I] emphasize [...] exploiting the similarities between exploratory testing and incremental design [...]

A lot of what I've been talking about is bringing the tools, attitudes, and biases of exploratory testing to bear on program and product design. But what about the reverse direction?

Consider: I make frequent use of a quote from Ron Jeffries:

Beck has those rules for properly-factored code: 1) runs all the tests, 2) contains no duplication, 3) expresses every idea you want to express, 4) minimal number of classes and methods. When you work with these rules, you pay attention only to micro-design matters.

When I used to watch Beck do this, I was sure he was really doing macro design "in his head" and just not talking about it, because you can see the design taking shape, but he never seems to be doing anything directed to the design. So I started trying it. What I experience is that I am never doing anything directed to macro design or architecture: just making small changes, removing duplication, improving the expressiveness of little patches of code. Yet the overall design of the system improves. I swear I'm not doing it.

-- Agile Alliance authors' mailing list, July 19, 2001

As I've since learned (and sometimes documented), such heuristics - interpreted with imagination and guided by analogies to past experience - really have a wonderful way of guiding the performance of design. They make it less likely that you'll go off into the weeds.

What fine-grained guiding heuristics are there for exploratory testing? I confess that I can't think of any. That alone doesn't mean much, since I don't claim to be a particularly good exploratory tester. But I also can't think of anything written that quite gets at what I'm looking for. Bach & Bach's session-based test management is something like it, since the short sessions force a pause for course correction. Bach, Kaner, and Bolton have written well on using risk to guide testing and on particular heuristics to use when strategizing where to test next. Elisabeth Hendrickson has some techniques particularly good at breaking people out of mental ruts.

But somehow, and it might just be me, these things seem on a larger scale than "eliminate duplication." While coding, I can use that heuristic to choose the very next thing I do, the next keyboard gesture I make. After a time, it becomes a perceptual thing as much as a cognitive one. (I think it's no accident that the phrase "code smells" is so popular. It helps toward the useful goal of removing right action from the realm of conscious decision to the realm of instant expert action.) I wonder what the equivalent in exploratory testing is, and has anyone written it down?

P.S. I learned about the book Sources of Power (the previous link goes to a review of it) from Rachel Davies.

P.P.S. Blogging is light because writing energy is going into Scripting for Testers. Today: Test::Unit.

## Posted at 09:28 in category /testing [permalink] [top]

Tue, 19 Apr 2005

For US residents, mainly


## Posted at 05:00 in category /misc [permalink] [top]

Fri, 15 Apr 2005

Clams got legs!

Not exactly, but two species of octopus walk - nay, scurry - on two legs. Video here.

## Posted at 10:53 in category /misc [permalink] [top]

Design-Driven Test-Driven Design (Part 4)

For background, see the table of contents in the right sidebar.

The story so far: I got a Fit test describing a user workflow through a medical records system. I made part of that workflow work using a Model View Presenter style and Mock Views. I'm now ready to start building the real views that will talk to the Macintosh's GUI framework.

Here's my normal practice when driving coding with Fit business-facing examples: I constantly want to make the next cell of the Fit table green. When I know that will take too long for comfort, I divide-and-conquer the problem by breaking the interval into shorter ones punctuated by xUnit green bars. There are some disadvantages to treating xUnit tests as secondary to business-facing tests, but that's the way I do it.

For this task, I'm doing something similar. In the previous episode, I got to a green Fit cell, creating two Presenter objects, two Mock Views, and two abstract View classes along the way. Now I'm going to flesh out the meaning of statements like "choosing Betsy with owner Rankin". The interaction designer's wireframe sketch shows that done through a list. But when I make that list, a host of questions pop up, like "When you add a new case, is it automatically selected in the list?" Talking about properties of particular UI elements seemed like a job for unit-size tests.

It was also painfully obvious that I needed more UI. There's a place in the wireframe to select a case, but no place to add one. For the sake of expediency, I hung that UI off the side of what I was given, well aware that I'll probably change it later. Because I've never programmed to this Macintosh UI framework, I made a simplified UI for my first leap into the unknown. The end result looked like the picture on the right.

I now worked through the Fit workflow, adding Macintosh detail to each step. At first, I began each step with a spike so I could find out what kind of messages flow from the UI to my Views. (Yes, I could have read all the documentation - think of the spike as guiding me through the reference manuals.) For my UI's Views to be really thin, there has to be a one-to-one correspondence between messages from the UI and the messages flowing to the Presenter. For example, here's what happens when a user presses TAB or ENTER in a text field:

public void animalNameEntered(Object sender) { /* IBAction */

As I learned more, I got bolder about unit-testing behavior into existence with a Mock View and only then making that behavior work with the real View. Sometimes I regretted that, discovering that my assumption about what the UI must do was wrong.

It was a little harder than usual to muster enthusiasm for the junit tests. For example, when I wanted to make the Presenter send a "highlight row 0" message to the table, I wrote a test that had the Presenter send that message to the Mock View and then make checks like these:

    assertEquals(0, inpatientView().getHighlightedRowIndex());

... and then I had to go make the Mock View remember the message was called so that my test could ask about it. Seems like a lot of work to drive one silly message from the View to the UI. (Perhaps a mock-builder would help?)

This bothers me because I have a principle that if programmers are finding testing annoying, that's a problem to fix. One way to fix it is to make the work provide value to new people.

Consider: I was working on the code that handles entering a patient name, entering an owner name, and clicking the "add case" button. At that point in implementation, it was natural for me to ask an imaginary UI designer about error cases. What should happen if you hit the button before typing in an animal name? What happens if you hit the button twice? Etc. "We" decided not to code up some of the error cases right away. But some we did. Suppose my designer had said that pressing the Add Case button without both fields filled in should produce an error. Here's a table that might record our conversation:

Behavior when Add Case is pressed:

| animal field | owner field | result? | attention set to? | animal field now has? | owner field now has? |
|---|---|---|---|---|---|
| typed in | typed in | new case added to case list | case list, on the new case | nothing | nothing |
| typed in | left empty | error | owner field | same as before | "owner?", selected |
| left empty | typed in | error | animal field | "animal name?", selected | same as before |
| left empty | left empty | error | animal field | "animal name?", selected | "owner name?" |

The reason I like this is the reason I like being systematic. When you're constructing unit tests one at a time, it's easy to overlook a situation like both fields being empty. Suppose I'd only thought of the first three tests, in that order. My checking code would almost certainly first check for an empty owner name. So the behavior for that situation would also be the behavior for the situation where neither name was given. That's actually bad. Because the owner entry field follows the animal entry field, there's a fair chance the user would respond to the error by entering the owner name, hitting Enter, then getting annoyed by another error message. It's better for her to be directed to the first field in the tab order.

That particular example isn't a big deal, but sometimes the case you overlook is a very big deal indeed. By making a quick table, I not only increase the chance of thinking of all important cases, I also reduce the chance of overlooking a kind of result (like what gets highlighted).
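Had we gone the error-message route, the cure for that overlooked case is simply to validate in tab order. A sketch, with invented names rather than the project's actual code:

```ruby
# Check fields in tab order, so an all-empty form points the user at
# the *first* field instead of whichever check happens to come first.
def add_case_error(animal_name, owner_name)
  return [:animal_field, "animal name?"] if animal_name.empty?
  return [:owner_field, "owner?"]        if owner_name.empty?
  nil                                    # no error: the case can be added
end
```

The fourth row of the table is what forces the ordering of those two checks; without it, either order looks equally fine.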

Now, as it happens, my imaginary designer chose a different way to solve the problem. The Add Case button should be greyed out until both names are available. That behavior was driven by unit tests like these:

public void testCreationRequiresBothNames() {
    assertEquals(disallowed, inpatientView().getAddCaseCommands());

    inpatientView().animalNameEntered("animal data");
    assertEquals(disallowed, inpatientView().getAddCaseCommands());

    inpatientView().ownerNameEntered("owner data");
    assertEquals(nowAllowed, inpatientView().getAddCaseCommands());
}

public void testEnteringCaseErasesPreviousNames() {
    assertEquals(disallowedAgain, inpatientView().getAddCaseCommands());
}

So. I didn't think of making that table; I wrote JUnit tests instead. I'm tempted to keep coding. But the point of this exercise is not to produce a program, it's to learn stuff. So what I'll do is back up, write some Fit tests, and ask myself questions like these:

  • How much more painful is it to drive the code from those tests than from JUnit tests?
  • Is a conversation that turns into such Fit tests better for the customer than one that turns into JUnit tests?

(Note that throughout I'm assuming the hypothetical Good Fit Editor, one that makes creating and modifying tables as easy as creating and modifying a small RTF document with your editor of choice. We need a Good Fit Editor!)

As usual, you can find the current code in a zip file. There are two separate projects - the core code and the Cocoa interface. I'd rather they were both in one directory structure, but I couldn't get IDEA and Xcode/IB to play nice together.

## Posted at 10:42 in category /fit [permalink] [top]

Tue, 12 Apr 2005

Usability errors in medical interfaces

Jakob Nielsen points to an interesting study on life-threatening errors induced by poor user interfaces. I can well believe it. Unless things have changed in the past ten years, my wife (the veterinarian) has to deal with drugs whose standard doses are given in one of three units of measure. And yes, she's stopped students from giving animals huge overdoses due to conversion errors.

## Posted at 08:06 in category /misc [permalink] [top]

Wed, 06 Apr 2005

Introducing Agile to a legacy project

I was talking with a potential client about what I might do for them. Suddenly I realized I'd had this conversation before. So I decided to write down my talking points.

I'm almost always contacted about testing in an Agile project. My biases about that are described in "A Roadmap for Testing on an Agile Project". A problem with that roadmap is that it really assumes either a greenfield project or an Agile project that's well underway. But some people I talk to are just starting with Agile, working on a legacy code base that really needs cleaning up, and don't have much in the way of testing. That makes that roadmap less applicable.

My talking points follow. Notice how few of them are about testing. When I first started consulting with Agile projects, I tried - well, some - to stick to my testing knitting. But either it made no sense to wall off testing from other concerns or the existing wall was clearly a problem. So now I stick my nose into all sorts of business.

  • Have at least half the programmers read Michael Feathers' wonderful Working Effectively with Legacy Code before bringing me in. It describes many tricks and ways of thinking for getting your arms around legacy code - mainly by wrapping it with tests. And if you bring Michael in instead of me, that's fine.

  • Be wary of large-scale code cleanups. Ron Jeffries has an excellent analogy: cleaning up legacy code is like cleaning up a horrendously messy kitchen. You can set aside a day, work like a dog, and leave it pristine. Great. What happens next? It gets dirty again, dish by dish. You don't have the habits that help you keep it clean.

    The better alternative is to clean up gradually. Every day, wash the dishes you dirty, plus a couple more. Over time, the kitchen will get clean - and you'll have the habit of immediately cleaning up your mess.

    A dirty kitchen has an advantage over software projects: you can see it getting cleaner. When projects declare that they're going to spend the next N months making the code right, then get back to adding features, that's N months where no one outside the project can see anything happening. That's an unstable situation, and those who pay the bills quite often put a stop to the cleanup partway through - which usually means the code is not a whole lot more tractable.

    For that reason, I recommend weaving cleanup into the delivery of business value. Suppose you're fixing a bug. You might have occasion to look at several classes as you follow the execution path that leads to the failure. Leave each of those classes slightly better than you found it, even if the change has nothing to do with fixing the bug. Over time, the system will get better, and you'll have the right habits.

    That latter point is important. Being a good Agile programmer means learning how to incrementally grow a good design while at the same time doing something else (like adding features). You don't learn how to do that by doing something different, like a rewrite.

  • Suppose you have a team that's not doing programmer testing, especially not doing test-driven design. I've come to believe the test-writing part of TDD is actually easier to learn than the refactoring part. A surprising number of programmers have no visceral dread of duplication, especially its more subtle forms like boolean parameters. Or they think of picking intention-revealing names as just a help for someone later, not a tool for thinking more clearly about the code in front of them. And so on.

    So the team has to commit to learning those and other similar things. The project has to explicitly become a learning project. I recommend pair programming and colocation as ways of spreading learning quickly. Use Big Visible Charts to nudge people toward more learning. (For example, consider putting up a list of Feathers' tricks and having people initial those they've used.) When people do some clever cleanup, they should announce it at the daily standup and offer to demonstrate. Actually, I'd rather they be so jazzed about what they've just done that they immediately grab someone to show. Even working on legacy code, even in learning mode, an Agile project should have a buzz.

  • When the project is ticking over smoothly, programmers will test out of desire, not duty. They will be nervous enough about not having a strong safety net of tests that they will make sure it's always strong. But getting there can be tough. You're not starting with a safety net, so the first tests you write won't help much. And writing them will be slow and painful because of all the dependencies. (See Feathers' book.) It's easy to give up halfway. All I can recommend is to consciously (if subjectively) track whether things are getting better. If they're not, do something about it.

  • It probably makes sense to pursue programmer testing and automated whole-product testing at the same time. Since the whole-product tests test from the outside, they're less affected by a rat's nest of dependencies. If they're faster to write, they'll provide confidence sooner. But avoid having certain programmers assigned to whole-product test automation. Testing should belong to everyone on the team, and everyone should be willing and able to add new tests. All the programmers should be ready to extend the test harness or other test support code.

  • Testers on the team should focus on stabilizing the product. It is not enough for tests to be good at finding bugs; they must also support the programmers. Mainly that means the tests should be automated, able to run on any machine, and quick to run. You want the programmers to run them frequently to bolster their confidence as they change code.

    When the programmers sign up to work on a task, that means the testers are also signed up to create tests that stand a good chance of finding unintended consequences of the particular changes the programmers are going to make. That will involve much more communication than before, including pair work. Therefore, if the programmers are in a bullpen, the testers should be there too. Also, the testers should teach the programmers how to "think like a tester" so that they can avoid bugs in the first place.

    In this way, the testers serve the programmers (a role some are uncomfortable with). In return, the programmers should be prepared to make changes to the code that are directly in service of making whole-product tests easier to write. (If programmers are unwilling or grudgingly willing, the team has a task: turning duty into desire.)

    Where there's a Customer or product owner on the project, the tester also serves her. One way is to help her express what she wants in a clear form that can be turned into checkable examples (that is, tests). The tester should also try to think of questions the programmers aren't asking. That implies boning up on the business domain. (Testers often already know it well, if idiosyncratically). I often think of testers as shepherds of the conversation between Customers and programmers. It's a more social role than a lot of testers have had before.

  • Sometimes teams are jumping into Agile to avoid the horror of the previous release. The code was too buggy, or too late, or cost too much per feature, or all three. In such a case, the programmer team is probably not trusted by the business people. If so, trust is an essential deliverable. It's not enough to be better; you have to be visibly better soon. Delivering tested, working features at frequent intervals is a key way to get trust back. Another way is close cooperation with a product owner that demonstrates that the team's orientation is toward helping her meet her goals. But more generally, the team should pay active attention to how well they're doing at building trust, not just at building code.

That's all I can think of now. What else?

## Posted at 21:01 in category /agile [permalink] [top]

array[array.size], (car nil), and a certain sense of balance

Last night, I was disappointed by a bug. It was in some Python code I was writing for the Agile Alliance web site. In Python, indexing one past the end of an array raises an IndexError. In Ruby, it returns nil. Since I'm used to Ruby, I unconsciously took advantage of Ruby's behavior.
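Concretely, on the Ruby side (a minimal illustration, not the site's actual code):

```ruby
array = [10, 20, 30]

# Ruby: one past the end quietly yields nil...
p array[array.size]              # prints nil; no error raised

# ...so the guard that Python forces on you is optional here.
off_the_end = array[array.size]
puts "nothing there" if off_the_end.nil?
```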

To fix the problem, I added an if statement. A small price to pay, a Pythonist might say, for conceptual clarity. How is it sensible to talk about the value of something that isn't there?

Perhaps so. But I was at that moment reminded of a wonderful email. I think it was from George Carrette to some Common Lisp mailing list, circa 1984. He wrote of a dream in which he was inspired to eschew all the hackish shortcuts of regular Lisp and translate a program into Scheme. First he stopped taking advantage of the way traditional Lisp lets you ask for the first element of an empty list (that is, (car nil) is nil). Then he stopped treating "false" and "the empty list" as the same thing. Etc.

As he went on, the code kept getting bigger and not so nice, but he maintained the pretense of being an enthusiastic convert to logical purity. It was a tour de force of sarcasm, I thought, and I'd love to get a copy if anyone saved it.

That makes him seem just snarky, but a Google search shows Carrette also involved as a contributor to the Scheme intellectual family tree. The Great Lispers of the Past had that ability to flip rapidly between the ideal and the practical: understanding the power that comes from conceptual simplicity and uniformity, but also realizing that there are non-logical special cases with pragmatic value. Able to be at the most ethereal levels (the Y combinator) at one moment, in amongst PDP-10 assembly code the next.

It's a balance I'd like to have.

(See also the Null Object pattern. How often do null objects actually make sense when you think about them, and how often are they just ways to remove if statements?)
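For what it's worth, the Null Object move in miniature (an invented example):

```ruby
class RealLog
  def info(message)
    "logged: #{message}"
  end
end

# The null object answers the same message and quietly does nothing,
# so callers never need an `if log.nil?` check.
class NullLog
  def info(message)
    nil
  end
end

def run_job(log)
  log.info("job started")    # works for either log -- no if statement
end
```

Whether that's a genuine concept ("a log that discards everything") or just an if statement in a trenchcoat is exactly the question above.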

## Posted at 07:20 in category /misc [permalink] [top]

Sat, 02 Apr 2005

A Big Visible Chart for a big visible belly

In my mid twenties, I weighed about 190 pounds. Then I had an early midlife crisis and got into shape, using the novel method of eating less and exercising more. Now, twenty years on, what with kids, age, some chronic injuries, and resurgent gluttony, I'm back at the start. I've resolved that this shall not stand, but the past few months have shown that I need some extra oomph behind the project.

What do Agile teams do when they need some constant added pressure to do as they know they need to do? They make the important facts widely visible. So I will do the same. In the header of this blog, I'll post a running record of what the scale shows for me + hiking boots (to ease achilles tendonitis when stair-climbing). This blog gets around 120,000 hits per month. Even though the vast majority don't have an actual human behind them, there are enough that my pride will not let me fail.

Twenty years ago, losing two pounds a week was comfortable, so that shall be my progress goal. My lowest weight was 157, but I had absurdly low body fat. This time I'll shoot for 167.

## Posted at 19:30 in category /misc [permalink] [top]

Cocoa and Java

Is it possible to get to a place where you can comfortably use Cocoa and Java, or is the path always full of rocks?

<boring_tale_of_woe>I am finding plugging a Mac interface on my Jar file no fun. I am distressed that too often building produces mysterious behavior that goes away when I hit 'clean all' first. It's bizarrely cumbersome to include a jar file in the project. The link between the Nib and whatever other magic is involved in launching got so scrozzled that even backing up all the way to the Nib and regenerating all the Java sources yielded an executable that couldn't find the java UI objects. I had to start completely from scratch, redraw the UI, generate the Java sources, and paste in the code from the previous version.</boring_tale_of_woe>

Interface Builder is cool, but once past that things go downhill. I'm particularly wondering if the whole system is fragile in the face of lots of renaming of methods and redrawing of interfaces.

<boring_tale_of_woe>For example, the runtime once unilaterally ceased being able to find the action method addCasePressed. I had to rename it addCasePressed2, change the name in IB, and then it found it fine.</boring_tale_of_woe>

## Posted at 19:30 in category /mac [permalink] [top]

Wed, 30 Mar 2005

Design-Driven Test-Driven Design (Part 3)

For background, see the table of contents in the right sidebar.

The story so far: the way methods have clustered in a Fit DoFixture has persuaded me that I need to make some new classes. Heaven help me, the image that comes to mind is viral particles budding off a cell. (The image on the right shows flu viruses.) I also know that I want to move toward the Model View Presenter pattern, because it should let me get most of the UI code safely under test. And I'm driving my work using scripts / tests of the sort a User Experience (UX) designer might write.

The first time I worked through this script, I created a single class per method-cluster. For the methods that referred to the "patient detail interaction context", I created a PatientDetailContext class. The fixture would send it commands like chooseToEnterANewRecord and query results with methods like thePatientDetailContextNames.

The problem was that no test ever drove me to split that one class into the View and Presenter classes. If you think about it, that makes sense. In order to preserve design flexibility and keep detail from obscuring the essence, the test script I'm working from is deliberately vague about the details of the user interface. Since Model-View-Presenter calls for the View to do little more than relay low-level GUI messages over to the Presenter, the programmatic interface between the two (which I think of as like the mitotic spindle in cell division - what's with me today?) has to name a bunch of detail that the script leaves out. Where can that detail come from?

I think it has to come from my knowledge of my end goal. So in this do-over, I'll start by assuming such low-level messages. But rather than jump into two classes right away, I'll start by budding out the Presenter and have the DoFixture act as the View (the self-shunt pattern).

Let's take two steps in the script to work from:

choose patient Betsy with owner Rankin
now the patient detail context names Betsy

Here would be the object structure and pattern of communication for the first line:

A portion of the DoFixture acts as a kind of mock View for the Inpatient Presenter, pretending to respond to user actions. Its implementation of choosePatientWithOwner looks like this:

    public void choosePatientWithOwner(String animalName, String ownerName) {

Those look rather like UI messages. The Inpatient Presenter knows that when "Choose Patient" is clicked, it should tell the Patient Detail Presenter about it. The Patient Detail Presenter in turn has to update its View. Its View is, again, the DoFixture. So the Presenter calls a new method on the DoFixture, setPatientName:

    // patient detail view
    private String patientName;

    public void setPatientName(String patientName) { // called by presenter
        this.patientName = patientName;
    }

Now when the next test line claims that "now the patient detail context names Betsy", Fit calls a query method that returns the right result:

    public String thePatientDetailContextNames() {
        return patientName;
    }

That works (given the right code in the presenters). Having gotten green, I can now refactor. There are two uglinesses in this code.

  1. Here's the top of the file that declares the InpatientPresenter:

    package com.testingthought.humble.ui;

    import com.testingthought.humble.fixtures.dofixtures.InteractionDesign;

    The Presenter isn't a testing class - it lives with other UI classes - but it depends on a testing class.

    That's easily fixed by creating an interface InpatientView that's in the UI package, having the Presenter depend on it, and having the DoFixture implement it.

  2. The DoFixture is now talking about detail like text entry fields and button presses. That kind of thing obscures what I think an interaction design DoFixture ought to be: a facade over a collection of cooperating presenters and views. It shouldn't do work itself - it should just tell other objects to do work.

    (In general, I'm starting to think of DoFixtures as adapters that translate between object-speak and procedural-speak. In procedural-speak, you don't want to manage specific pointers to objects. Instead, you want to refer to them in a looser way. For example, sometimes you want to refer to them implicitly, trusting the reader or DoFixture to know which one you meant.)
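Here's roughly what the first fix looks like in miniature. This is a sketch with made-up names, not the real code: the actual interface is InpatientView, but I've sketched the patient-detail side (whose setPatientName method you just saw) and dropped the package declarations so it stands alone.

```java
// Sketch only: the Presenter depends on an interface, so the ui package
// never imports a testing class. Names here are stand-ins, not the real code.
interface PatientDetailView {
    void setPatientName(String patientName);
}

class PatientDetailPresenter {
    private final PatientDetailView view;

    PatientDetailPresenter(PatientDetailView view) { this.view = view; }

    // Called when another presenter reports that a patient was chosen.
    void patientChosen(String name) { view.setPatientName(name); }
}

// The DoFixture plays the View role (self-shunt); a plain class stands in here.
public class FixtureAsView implements PatientDetailView {
    private String patientName;

    public void setPatientName(String patientName) { this.patientName = patientName; }

    public String thePatientDetailContextNames() { return patientName; }

    public static void main(String[] args) {
        FixtureAsView fixture = new FixtureAsView();
        new PatientDetailPresenter(fixture).patientChosen("Betsy");
        System.out.println(fixture.thePatientDetailContextNames()); // prints Betsy
    }
}
```

The point of the shape: the dependency arrow now runs from the fixture to the UI package, never the other way.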

By an amazing stroke of luck, I can now extract the View-ish methods into full-fledged mock View objects that implement my new interface. They will:

  1. take interaction design actions and turn them into GUI-level messages to the Presenter.

  2. accumulate GUI-level messages from the Presenter and remove detail irrelevant to an interaction design.

Like this:

I won't show the code itself, but you can find it in a zip file. As before, the build.xml file in the top level runs the Fit tests. To read it, start with the DoFixture's choosePatientWithOwner method and trace through the calls.
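If you'd rather not trace through the zip, here's a guess at the shape of the first conversion. Every name below is my stand-in, not the code in the zip.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical presenter API - the real names are in the zip file.
interface InpatientPresenter {
    void patientNameEntered(String name);
    void ownerNameEntered(String name);
    void choosePatientPressed();
}

public class MockInpatientView {
    private final InpatientPresenter presenter;

    public MockInpatientView(InpatientPresenter presenter) { this.presenter = presenter; }

    // Conversion 1: one interaction-design action fans out into the low-level
    // GUI messages a real View would send as the user types and clicks.
    public void choosePatientWithOwner(String animalName, String ownerName) {
        presenter.patientNameEntered(animalName);
        presenter.ownerNameEntered(ownerName);
        presenter.choosePatientPressed();
    }

    // A spying presenter records the fan-out so we can look at it.
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        MockInpatientView view = new MockInpatientView(new InpatientPresenter() {
            public void patientNameEntered(String n) { log.add("patient=" + n); }
            public void ownerNameEntered(String n) { log.add("owner=" + n); }
            public void choosePatientPressed() { log.add("choose"); }
        });
        view.choosePatientWithOwner("Betsy", "Rankin");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [patient=Betsy, owner=Rankin, choose]
    }
}
```

Conversion 2 runs the other way: the mock View receives GUI-level messages from the Presenter and keeps only what an interaction design cares about.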

We've now achieved the View/Presenter separation. The mock Views do conversions. The "from" part of the conversion comes from the Fit tests, but where does the "to" part of the conversion come from? What tests directly drive the decision about what messages to send from Views to Presenters? Well, what always drives the coding of classes distant from the "top" of the system? - unit tests.

That's kind of an interesting inversion. Usually, we think of the code that handles UI widgets as being on the outside of the system, the part closest to the user. But here that code is utility code, not really different from database access code. By burying utility code behind a layer (be it a persistence layer or a DoFixture derived from an interaction design), we let people thinking about larger issues ignore the grotty technology that lies behind the business value of the system.

Next: interlacing Views and mock Views. Plus, I finally get around to learning to program Cocoa.

## Posted at 08:21 in category /fit [permalink] [top]

Tue, 29 Mar 2005

Improv Everywhere

Via the Good Experience newsletter, I got a pointer to pranksters Improv Everywhere.

Improv Everywhere causes scenes of chaos and joy in public places [...] Improv Everywhere is, at its core, about having fun. We're big believers in "organized fun". In the process we bring excitement to otherwise unexciting locales and give strangers a story they can tell for the rest of their lives. We're out to prove that a prank doesn't have to involve humiliation or embarrassment; it can simply be about making someone smile.

I like this one best so far.

## Posted at 21:11 in category /misc [permalink] [top]

Sat, 26 Mar 2005

Design-Driven Test-Driven Design (Part 2)

For background, see the first installment
and the table of contents in the right sidebar.

When working on a Fit test in the "flow" style (step-by-step tests), it's my custom to start by creating empty methods for each of the steps in the test. I do that because I usually find something I don't like about the test, and I'd rather discover that early.

Here's the resulting Fit output:


At the beginning of the day, the caregiver navigates to the SOAP context and fills in today's SOAP.

the starting context is inpatient
choose patient Betsy with owner Rankin
now the patient detail context names Betsy expected
should return a patient actual

navigate to the SOAP context
etc. etc. etc.

As I was writing the Java code for this most rudimentary DoFixture, I noticed that the methods fell into groups. In what follows, I've separated those groups with rules.

package com.testingthought.humble.fixtures.dofixtures;

import fit.DoFixture;
import fit.Parse;

public class InteractionDesign extends DoFixture {

    // --------------
    public void theStartingContextIs(String contextName) {
    }

    public void navigateToTheContext(String contextName) {
    }

    public String theContextBecomes(String contextName) {
        return "some visibility";
    }

    // --------------
    public void choosePatientWithOwner(String animalName, String ownerName) {
    }

    // --------------
    public String thePatientDetailContextNames() {
        return "should return a patient";
    }

    public String thePatientDetailContextShows() {
        return "should return a record";
    }

    public void chooseToEnterANewRecord() {
    }

    public String bothTheContextAndTheContextAre(
            String firstContext, String secondContext) {
        return "some visibility";
    }

    // --------------
    public void recordThatTheAnimalIs(
            String threeCharacteristics) {
    }

    public void recordThatItsTemperatureIs(String value) {
    }

    public void indicateThatTheSOAPIsFinished() {
    }

    // aliases for check - this is a clever hack due
    // to Rick Mugridge.
    public void now(Parse cells) throws Exception {
    }

    public void noteThat(Parse cells) throws Exception {
    }
}
Except for the first one, the different groups talk about different interaction contexts. I could add comments to explain that, like this:

    // navigation
    // inpatient context
    // patient detail context
    // soap entry context

But that would be wrong. To my mind, all the groups but one are crying out to be extracted into classes. Those classes sure look like they'll become the Presenter objects that are going to implement our interaction contexts. I'll extract them next time.
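To preview the shape of that extraction (a sketch with stand-in names and canned data, not the actual code from the next installment): each group becomes its own class, and the DoFixture method turns into a one-line delegation.

```java
// Sketch: the patient-detail group budded off into its own class.
// The canned data ("Betsy") exists only so the sketch runs.
class PatientDetailPresenter {
    private String patientName = "Betsy"; // stand-in state for the sketch

    String namedPatient() { return patientName; }
    String shownRecord() { return "yesterday's SOAP record"; }
}

// The DoFixture keeps its Fit-facing names but becomes a thin facade.
public class InteractionDesignSketch /* extends DoFixture */ {
    private final PatientDetailPresenter patientDetail = new PatientDetailPresenter();

    public String thePatientDetailContextNames() { return patientDetail.namedPatient(); }

    public String thePatientDetailContextShows() { return patientDetail.shownRecord(); }

    public static void main(String[] args) {
        System.out.println(new InteractionDesignSketch().thePatientDetailContextNames());
    }
}
```

The Fit tests keep passing while the work migrates out of the fixture.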

In the meantime, I've packaged up everything in a zip file. Feel free to fiddle. There's an ant build file in the top-level directory. The default target compiles any changed files, runs JUnit, then runs Fit. (I like to run JUnit before Fit, figuring that there's no point in running Fit if any JUnit tests are failing.) All the jar files you need should be in the jars directory.

The test results for each HTML file in the fit-tests directory are in a file with the same name in the fit-test-results directory.

## Posted at 16:54 in category /fit [permalink] [top]

Guy Steele, Tester

Ever since I read the "lambda the ultimate" papers back in my Lisp days, I've been awed by Guy Steele. So I read this with interest:

He also has an ability to focus very systematically on what he describes as "nits and corner cases" -- an ability that came in handy when he was asked to co-write the specification for the Java language.

"I pestered James [Gosling] with lots and lots of questions," he recalls. "How does the language behave when you write this particular statement, even though you'd never think of writing it in a real program?"

His aim was to eliminate unintended consequences.

"I made a big matrix," he says. "The rows were the places you could use a type [a description of the set of values a variable can take on] and the columns were the kinds of types you could write. Then I checked each entry in the matrix to make sure the specification addressed what happened in that case.

I was pleased to read that because one of my habits is to take any state diagram I get and turn it into a state table. The state table contains a square for every event in every state. It forces you to question what really happens in that case. (And you should be careful not to leap to the conclusion that "it's impossible".) State diagrams, in contrast, make it easier not to think about a case - a missing arc is much less visible than an empty cell.
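The same trick works in code. Here's a toy sketch (my invented states and events, nothing from Steele's matrix) that counts the cells a diagram would let you silently ignore:

```java
import java.util.EnumMap;
import java.util.Map;

// Toy example: a state table forces a decision for every (state, event) cell,
// while a state diagram makes the missing arcs easy to overlook.
public class StateTable {
    enum State { IDLE, RUNNING, DONE }
    enum Event { START, FINISH, RESET }

    static final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);
    static {
        // Only the "obvious" arcs from the diagram are filled in.
        put(State.IDLE,    Event.START,  State.RUNNING);
        put(State.RUNNING, Event.FINISH, State.DONE);
        put(State.DONE,    Event.RESET,  State.IDLE);
    }

    static void put(State s, Event e, State next) {
        table.computeIfAbsent(s, k -> new EnumMap<>(Event.class)).put(e, next);
    }

    // Every empty cell is a question to ask: what really happens here?
    static int missingCells() {
        int missing = 0;
        for (State s : State.values())
            for (Event e : Event.values())
                if (!table.containsKey(s) || !table.get(s).containsKey(e)) missing++;
        return missing;
    }

    public static void main(String[] args) {
        // 3 states x 3 events = 9 cells; 3 filled, so 6 need an answer.
        System.out.println(missingCells() + " cells still need an answer");
    }
}
```

Six of nine cells are unanswered - each one a "what happens if FINISH arrives while IDLE?" question the diagram never made you face.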

Such trudging-through-gruntwork is characteristic of testers, but it needn't be isolated to them. It should be a property of the team. I post the above quote to make it more glamourous.

Of interest to those who promote the idea that programmers should practice, as musicians do, is this:

Steele, a 10-year Sun veteran and winner of the 2005 Dr. Dobb's Excellence in Programming Award, has written half a dozen programming languages that exist simply as folders in his filing cabinet.

"Designing technically competent programming languages is not that difficult," he says.

To him they're like the finger exercises he used to do when he played the piano -- a way to learn.

A final note: he has a huge shower, in which he spends about twelve hours a day. I don't absolutely know that, but I deduce it from the time I heard him say he only gets good ideas in the shower.

## Posted at 12:28 in category /testing [permalink] [top]

Thu, 17 Mar 2005

Design-Driven Test-Driven Design (Part 1)

At the Canadian Agile Network workshop earlier this week, Jeff Patton had what we think could be a great idea:

  • The conceptual connection between Interaction Design and programming can be the Presenter class (from Model View Presenter).

  • The performative connection can be Fit DoFixture scripts.

To illustrate, I'm going to walk through a user experience (UX) design from the beginning, through workflow design, into a Fit representation of a workflow, finishing off with test-driven design using Fit and JUnit. This recapitulates and extends what Jeff and I did during an action-packed 26 hours in Calgary.

We pretended we were constructing a medical records system for my wife (something close to her heart right now). We went through Jeff's normal UX process. He was the UX designer; I filled in for Dawn and the other users.

First, we talked about system goals, together with measurements. There were two goals:

  1. Make tracking of cases easier without increasing clinician work, measured by clinician time spent on writing medical records.

  2. Reduce the time to respond to medical records requests, measured by the time it takes the records manager to collect complete patient records.

We talked about where the software would be used, its usage context. One place is pretty typical: a reception office where secretaries meet clients. The other isn't: the patient ward. Since this is a ward for large animals, it's rather noisy, messy in spots, and has limited computer access.

We developed personae representing four kinds of users. Dawn is a representative clinician (teaching professor), Jamie is a representative caregiver (who can be either a clinician or a medical student), Hazel is a secretary who handles the creation of records and things to do with money, and Bill is a record keeper who tries to respond to data requests in the face of caregivers who are, in his view, insufficiently appreciative of order.

These personae would be turned into big pictures on the wall of the project bullpen. Acting as Big Visible Charts, they'd constantly remind the team of who they were building the system for. When the team had underconstrained decisions to make about the user experience (which will always happen), the posters would help them make decisions the users would like. In the absence of a reminder, people tend to make decisions that they themselves would like. Trust me when I say that veterinarians are different from programmers.

I then brainstormed a list of tasks that the different kinds of users have. I wrote them on cards. Jeff asked me to arrange the cards so that tasks done by similar people at similar times are close to each other (an affinity diagram). You can see from the picture that we picked only a small subset of the tasks that would be supported by a real medical records system.

Each cluster of cards can represent a "place" in the software that people visit to do a set of related tasks. In the jargon, such places are interaction contexts. From the set of places, you can create a navigation map that shows how people get from place to place and a wireframe diagram of a place's visual appearance.

Jeff's wireframe is patterned after a typical email client. The top pane contains a list of cases (like a list of email senders & subjects). Click on a case, and the bottom pane fills up with information about it. The bottom pane has tabs because there are three kinds of information caregivers will want to look at. Sometimes they'll be filling in that morning's SOAPs (subjective [observations], objective [observations], assessment, and plan), sometimes making notes throughout the day about observations made or orders given or orders filled, sometimes digging into a case's history.

I then wrote an example of how a caregiver would use that interaction context. The tricky thing was to write it so that it supported navigation and key tasks while remaining resistant to changes in the wireframe. The script below will be easier to follow, probably, if you substitute "pane" for "context". I used bolding to highlight changes in contexts and italics to draw attention to what happens after a user action.


At the beginning of the day, the caregiver navigates to the SOAP context and fills in today's SOAP.

the starting context is inpatient
choose patient Betsy with owner Rankin
now the patient detail context names Betsy

navigate to the SOAP context
now the patient detail context shows yesterday's SOAP record
choose to enter a new record
now both the SOAP context and SOAP entry context are visible
(The SOAP entry context lets you enter today's SOAP while still viewing yesterday's. On a Mac, for example, the entry panel might slide out from the side of the main window.)

record that the animal is BAR
record that its temperature is 99.5
indicate that the SOAP is finished
now the SOAP entry context becomes invisible
note that the patient detail context shows today's SOAP record

That's both a natural-language description that a veterinarian can follow (I just checked) and a DoFixture script that a programmer can implement. The next installment will talk about how.

## Posted at 19:33 in category /fit [permalink] [top]

Mon, 07 Mar 2005

That pernicious sense of completeness

Here's a mistake that seems easy to make.

  1. You have a large test suite. In any given run, a lot of the tests fail. Some of the tests fail because they are signaling a bug. But others fail for incidental reasons. The classic example is that the test drives the GUI, something about the GUI changes, so the test fails before it gets to the thing it's trying to test.

  2. Because so many of the failures are spurious, people don't look at new failures: it's too likely to be a waste of time. So the test suite is worthless.

  3. Someone comes up with an idea for factoring out incidental information, leaving the tests stripped down to their essence. That way, when the GUI changes, the corresponding test change will have to be made in only one place.

  4. Someone begins rewriting the test suite into the new format.

It's that last step that seems to me a mistake, in two ways.

  1. The majority of the tests don't fail. There's no present value in rewriting such a test into the new format. There's only value if that test would have someday failed because of some GUI change.

    Rather than a once-and-for-all rewrite, I prefer to let the test suite demand change. When a test fails for an incidental reason, I'll fix it. If it continues to run in the old format, I'll leave it alone. Over time, on demand, the test suite will get converted. And in the steady state, new failures are worth looking at. They're either a bug or a reason to convert a test.

  2. The "convert them all and get it over with" approach also falls prey to what James Bach has called the "wise oak tree" myth. There's an assumption that each test in the test suite is valuable just because someone once found it worth writing. But what's worth writing may not be worth rewriting.

    If you're examining tests on demand, it's easier to make a case-by-case judgment. For each test, you can decide to fix it or throw it away. Does this failing test's expected future value justify bringing it back to life?

For more on this way of thinking, see my "When should a test be automated?" (pdf). Some of the assumptions are dated, but I'm still fond of the chain of reasoning. It can be applied to more modern assumptions.

## Posted at 07:22 in category /testing [permalink] [top]

Thu, 03 Mar 2005

A last call for authors

Because of my writing and consulting load, I'm going to scale back on my editing for Better Software magazine. As soon as possible, I'll be working on one article per issue.

In the past, I've been more or less guided by an editorial calendar, looking for articles on certain topics. In the future, I'll be looking for articles on any relevant topic (but keeping in mind the need to have balance over an entire year). What I'm looking to help produce is a steady stream of the kind of articles I keep telling clients they should read. Those usually fall into two categories:

  1. Gathering Tide articles introduce early mainstream people (Geoffrey Moore's pragmatists) to ideas and techniques that have been proven out by the early adopters (Moore's visionaries). Jeffrey Fredrick's recent article on continuous integration is an example. Continuous integration isn't that big a deal any more: CruiseControl is solid, a lot of people know how to use it, people who don't can go to a book (Mike Clark's Pragmatic Project Automation) to find out, and there are articles (like Fredrick's) that give beginners important tips so that they don't have to learn them painfully themselves. Those are the ingredients that reassure pragmatists that it's time to adopt the new idea. I want articles that tell pragmatists of an idea they haven't heard of yet, and tell them in a persuasive enough way that they give it a try - or at least investigate further.

  2. Try This At Work articles are for people who want specific techniques, explained well, that they can put into practice soon. My own article about using Ruby to test a product with a SOAP interface is an example.

So if you have an idea for an article, contact me. Articles can be of two lengths. Feature articles are 2500-3000 words (which is too short to treat anything but the narrowest topic in depth). Front Line articles are 1200-1500 words. They tell a story of something you did that you learned from. They'll typically start with the story, tell the specific lesson, and maybe generalize out from there.

## Posted at 21:13 in category /misc [permalink] [top]

Ten most influential computer books of the past ten years

I was asked to make a list of the above. My first reaction was "How should I know?" But then I figured I could at least list books that I believe have been influential in my circles. Here they are.

    Design Patterns by Gamma, Helm, Johnson, and Vlissides (1995)
    Although the true publication date puts it too early for this list, I'm going to include it because it marks the beginning of an important trend: that of programmers drawing attention to what they repeatedly do, on a rather small scale. It's the beginning of a shift away from grand theorizing to observation on the ground. It laid the groundwork for...
    Refactoring, by Martin Fowler (1999)
    What an absurd idea: a catalog of how to change code so it does the same thing as it did before. But this book led to refactoring IDEs, which have given programmers enormous power to shape programs like a potter shapes clay. It also made a revolutionary claim: it's good to do things over.
    The Pragmatic Programmer: from Journeyman to Master, by Andy Hunt and Dave Thomas (1999)
    Once, people really did think that programmers needed to know only programming and design languages plus a few big ideas. (Build software like bridges!) Books like those above chipped away at that. This book capped the trend by unequivocally treating programming as a craft. We need no longer long to be engineers.
    Extreme Programming Explained: Embrace Change, by Kent Beck (1st edition 1999)
    This book grew out of the same trend, but strongly emphasized two additional ideas: that programming is a social activity, and that programmers gain freedom by shaping themselves and the code in response to the arbitrary demands of the business. It also spurred the coalescing of a variety of underground methods into a visible alternative to conventional software development.
    Agile Software Development, by Alistair Cockburn (2001)
    The best summing-up of what the Agile methods have in common, concentrating on the social. Many people who haven't read the book have been infected by ideas someone else learned from it.
    Programming Perl, by Larry Wall, Tom Christiansen, Jon Orwant (2000, 3d edition)
    This book's influence is tightly tied up with the influence of Perl itself, which allowed many nonprogrammers (like testers) to automate what they otherwise wouldn't have, blurred the boundaries between systems and quick 'n' dirty scripts, and paved the way for a reconsideration of dynamically typed languages that unashamedly favor programmer power over execution speed.
    Working Effectively With Legacy Code, by Michael Feathers (2004)
    Relentlessly practical, "WELC" is, to my knowledge, the first book to successfully attack the problem of how to shape programs that have already hardened into an ugly and unmaintainable form. Just published, it hasn't had time to be hugely influential, but it will be.
    Lessons Learned in Software Testing, by Cem Kaner, James Bach, and Bret Pettichord (2001)
    This is the testing book for non-testers to read. It treats software testing as it is, not as it should be - and it shows that testing as it is, if treated seriously, can be very good indeed.
    UML Distilled, by Martin Fowler (1st edition 1997)
    A wonderful example of telling less than you know, this book is the one for the person who wants UML as part of her toolkit, but doesn't intend to make a way of life out of it. I've heard that its unexpectedly good sales made thin computer books respectable again, reason enough for inclusion on this list.
    Structure and Interpretation of Computer Programs, by Abelson and Sussman (1996, 2nd edition)
    OK, this is cheating, since only the second edition was published within the last ten years. But it's an enduring description of how simple, powerful ideas build upon themselves - something easy to forget, but the reason computers are both so marvelous and so useful.

(I distrust lists with ten items. I always suspect the author squeezed something out, or strained to come up with a last item, just to hit that magic number. By coincidence, I really listed ten, first try.)

## Posted at 20:07 in category /misc [permalink] [top]

More on Fit style

Jim Shore has some thoughts about my earlier posting on Fit style. I'm intrigued by the idea of using, uh, relational verbs like "is" and "has" to make Fit statements look more like they're for understanding than for testing.

## Posted at 20:03 in category /fit [permalink] [top]

Fri, 25 Feb 2005

Fit style

Requirements documents or specifications explain how a program is supposed to behave for all possible inputs. Automated tests explain how a program is supposed to behave for certain possible inputs. The understanding gained by reading tests duplicates some of the understanding gained by reading documents. Duplication is (often) bad. One of my goals is to find out how to write and annotate tests so that the redundant parts of those other documents can be eliminated.

Fit has potential for that because the test tables can be embedded in whatever HTML you like. Rick Mugridge's Fit Library increases that potential by providing an improved set of default tables. But we still have to realize that potential by using them well. I've been exploring how. Here's an example, patterned after a table I recently wrote for a client. I have some comments about the style after the table.

A person reading this page would come to it knowing some context. She would know that things called "nodes" are arranged hierarchically. (In the original table, what I'm calling "nodes" were something less abstract.) She would know that nodes are sometimes visible, sometimes invisible.



Making a node invisible makes its descendants invisible, no matter where the search begins.

  • node 1
    • invisible node 1.1
      • node 1.1.1
check that visibility from invisible node 1.1 is (nothing)
check that visibility from node 1.1.1 is (nothing)
check that visibility from node 1 is node 1

Siblings are not affected.

  • node 3
    • invisible node 3.1
    • node 3.2
      • node 3.2.1
check that visibility from invisible node 3.1 is (nothing)
check that visibility from node 3.2 is node 3.2, node 3.2.1
check that visibility from node 3 is node 3, node 3.2, node 3.2.1

  • First, notice the little Java class name up at the top right. That's a table with the border turned off. The class is a DoFixture, and it will interpret all the other tables on the page. I've made it small and out of the way because that name has no meaning for the business. It's technology-facing, and I want business people to quickly learn not to notice it.

    This is to be the only technology-facing name on the page. I think that's important.

  • The next table is a row in the DoFixture that does setup for the test. Our friend Rick has written code that turns bulleted lists in a cell into a tree data structure. That's just what I need, so I use it.

    I could have made this setup table a ColumnFixture or Rick's new SetupFixture, but both of those would have required more in the table. I will only grudgingly add non-data words to a test. They make it harder to read (usually).

  • The next set of sentences are more DoFixture rows. I've again turned off the border, this time because I don't want that ornamentation to distract the reader. I want the checks to look like sentences you'd read in a textbook example. (It would be better if "check" could be written "observe that". Maybe I can talk Rick into that.)

    I did, however, follow the convention of making the non-data words in italics as a way of emphasizing what's data. (I left "check" in non-italic font because it's an important signal to the reader.)

  • But wait: Those check lines violate the "I will only grudgingly add non-data words to a test" rule. Why all those repetitions of "that visibility from "? Would it be better to put the checks in a ColumnFixture?

    Given this hierarchy:

    • node 3
      • invisible node 3.1
      • node 3.2
        • node 3.2.1

    expect that visibility from
    node                    is()
    invisible node 3.1      (nothing)
    node 3.2                node 3.2, node 3.2.1
    node 3                  node 3, node 3.2, node 3.2.1

    I'm not sure. My reaction to the two versions is different. The first is more like an explanation of the feature. The second is more like a checklist than something you read for understanding. For example, when reading the second, I'm more bothered that not all the nodes are listed and that node 3 doesn't come first. (The order is not an accident - I wrote the check sentences in the order I'd explain it to a person while pointing at nodes with my finger.)

  • It was well after supposedly finishing the tables that I thought of greying out the invisible nodes. I did it because at one point I glanced at a node, wondered if it were visible, then looked up the tree to check. Too much work.
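For the record, the ColumnFixture version would bind to a class shaped roughly like this - the "node" column fills the public field, and the "is()" column calls the method. The visibility logic below is a toy stand-in, not the client's code:

```java
// Toy stand-in for the client's fixture. In Fit's ColumnFixture convention,
// a header without parentheses binds to a public field and a header ending
// in "()" calls a method, comparing its result against the cell.
public class VisibilityFixture /* extends fit.ColumnFixture */ {
    public String node;

    public String is() {
        // The real fixture would consult the hierarchy built during setup;
        // this stand-in only demonstrates the column-to-member binding.
        return node.startsWith("invisible") ? "(nothing)" : node;
    }

    public static void main(String[] args) {
        VisibilityFixture row = new VisibilityFixture();
        row.node = "invisible node 3.1";
        System.out.println(row.is()); // prints (nothing)
    }
}
```

Notice how little of the class is about the feature and how much is about the checklist format - which is the tradeoff discussed above.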

## Posted at 15:31 in category /fit [permalink] [top]

Tue, 22 Feb 2005

Improving NTF

A clever story about achieving metrics nirvana from Kevin Lawrence.

## Posted at 12:56 in category /misc [permalink] [top]

Adjectives and adverbs

In exploratory testing, you need a way to kick yourself out of your rut, your habitual way of thinking about the software. One of the ways Elisabeth Hendrickson does that is to use lists of adjectives and adverbs, chosen randomly or no, to prompt new ideas. She's got a short writeup here, together with a phrase generator.

My overeducated alter ego can't help pointing out the similarity between what Elisabeth does and the written version of the Surrealist game called "exquisite corpse".

## Posted at 07:17 in category /testing [permalink] [top]

Mon, 21 Feb 2005

Rant 2

Sometimes you just gotta say people are more attractive in the abstract than in reality.

The Constant Reader will have seen my earlier rant about how some spyware producer has chosen to take over machines and direct them to testing.com. Since putting up the note on my front page and resume page, the emails have slowed down to somewhere around four a day, but the remainder are largely of a strikingly aggressive stupidity, such as this gem from "mp3pirate":

Get off my Mother-Blankin Machine You Bl Ank Licking Blank...........
(typography reproduced for full effect)

I can't help but think:

  • It's excusable, I suppose, to get spyware on your machine.
  • You have to be a little dim to suppose that my site is a plausible destination for real spyware, and that a spyware author would leave his real email address and phone number on the site.
  • And isn't it aggressively stupid to send mail like that to someone who you think already 0wnz your machine?

I dutifully reply with a little note reiterating what's on my site and asking if they saw it there (if not, I want to put it where they would see it). Not one has replied.

Heirs to the legacy of the Enlightenment. Bah!

## Posted at 13:37 in category /junk [permalink] [top]

Links to burndown charts

[Update: two more good links]

I'm writing a report for a client, and I mention variant burndown charts. I want to put all my links in one place.

## Posted at 08:58 in category /agile [permalink] [top]

Wed, 16 Feb 2005

Scripting for testers

Prodded by Bret Pettichord, I've finally committed to writing Scripting for Testers. The manuscript is due by the end of the year, to be published in Dave Thomas and Andy Hunt's Pragmatic Bookshelf.

Here's a version of the plan I sent to Dave, followed by a request for help.

Target size is typical for the Pragmatic series (unit test, CVS, etc.).

I have three goals for the book:

  1. Teach testers how to program, and program at least moderately well.

  2. Move testers from a position of dependence to one of self-sufficiency. Too often I see testers who are helpless to do a lot of things for themselves and are cut off from lots of conversation, both of which reinforce their peripheral position. For example, if they knew programming, they'd both be able to ask for and receive better testability hooks.

  3. Push Ruby toward being the tester's language (which is going to require some catching up with Python and Perl in terms of libraries).

The style will be similar to Dive Into Python: learning by examining and building examples. I'll make the examples as progressive as possible.

The first example will build on a way of doing exploratory testing, which is to bang away at the GUI while watching the log scroll along in a nearby window and noting exceptions.

  • It will start with simple IRB usage: reading from a file.
  • Reading from a file with a program.
  • Reading from a file that's continuously filling.
  • Filtering / regexes.
  • Adding a simple GUI
  • Maybe making the app beep or flash when it sees something interesting.
  • Different types of filters - pluggability - unit testing
  • XML parsing and tree-munging

I will probably add a simple HTTP server as a way of demystifying networking.

This is all toward the end of showing that testers are not limited to using tools just for running tests. They can build what they need to build.

Then I move to testing. I'll probably start with a simple introduction to Watir (testing that uses IE via COM). There's a possibility I'll also introduce Selenium (which is a JavaScript test harness that lives in the browser and has Java and Python bindings today). I'll definitely do a web-services chapter (again, as much to demystify as anything else). Then on to the real problem: testing a windowing app. I prefer to test under the GUI here (actually, with web clients too). The web services chapter and the HTTP server chapter have primed them for that: now it will seem to them quite unexceptional to stick something in the app that talks some protocol and translates remote calls into whatever below-the-GUI API is there.

It would be nice to talk about JRuby for testing Java apps. I will if it seems robust enough.

Last part (depending on space) is about using various tools - whatever's out there - to do various testing-ish tasks.

So the request for help: what do you think of that? What kinds of tasks should be covered? What tools should I talk about? Mail me.

## Posted at 07:24 in category /testing [permalink] [top]

Tue, 15 Feb 2005

Links to burndown charts

I'm writing a report for a client, and I mention variant burndown charts. I want to put all my links in one place.

  • A Mike Cohn variant that also tracks when work is added.

  • My rendering of a Kelly Weyrauch variant to the same end.

  • One from Wayne Allen that uses area charts instead of bars.

  • A burnup chart from Ron Jeffries.

## Posted at 11:07 in category /agile [permalink] [top]

Fri, 11 Feb 2005

Rick Mugridge rocks

This week I traveled to a client who was using FitNesse, the Wiki-enabled version of Fit. They were using ActionFixture, which I've never been fond of. Earlier I'd replaced it with my own StepFixture, but I knew Rick Mugridge had a new type of fixture called the DoFixture, so I prevailed upon him to let me take a copy with me.

I'm in the first flush of enthusiasm, but I think it's a big step beyond StepFixture and similar fixtures. When I read about it, I thought it was an improvement, but I didn't appreciate how well it suits my goals for acceptance tests. Here's an example:

charge | 50 | dollars against account | 89-64P
schedule | 10 | dollar payment | monthly | from | checking

I strongly prefer Fit tests to be business-facing, written in the language of the business instead of the language of the implementation. Because the rows can be read as sentences, there's less of a translation gap between what a product owner says about the desired product and the way the tests are written. That seems to help keep them business-facing.

Moreover, it may be easier to envision a succession of tests as a progressive explanation. If reading the tests is supposed to help you understand a feature, the tests should start with the simple cases and build up progressively to the complicated ones. There's a tendency in test writing to set up an enormously complicated state once, then execute a series of actions and checks. I think that's bad for two reasons. One is that the test becomes hard to understand. The other is that it gives the programmer no obvious place to grab hold and get started. With a series of progressive tests, the programmer, product owner, and tester can devise a few simple cases right away. Then the programmer can launch off into coding while the product owner and tester mull over the more complicated cases. The programmer doesn't face the choice between coding without a concrete goal or waiting around for something to do.

Another nice feature of the DoFixture is the way it wraps other fixtures. Before explaining that, here's how a DoFixture would handle the above table. The table translates into calls to these two methods on the DoFixture:

  void chargeDollarsAgainstAccount(int dollars, String account)
  void scheduleDollarPaymentFrom(int dollars, String frequency, String source)

The DoFixture doesn't care much about table boundaries. It can run any number of tables in succession. But when it encounters the first row of a table, it does a special check. If the row begins with a fixture name, it handles the table in the regular Fit style. But if the first cell names a DoFixture method that returns a fixture instance, the DoFixture recursively uses that instance to process the table, returning to the DoFixture after the table's done.
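The naming convention behind that mapping — odd cells glued into a camel-case method name, even cells passed as arguments — is simple enough to sketch in a few lines of Ruby (an illustration of the idea, not Rick's actual Java implementation):

```ruby
# How a DoFixture row becomes a method call: the 1st, 3rd, 5th... cells
# are camel-cased and glued together into the method name; the cells in
# between become the arguments. A sketch of the convention only.
def method_name_and_args(cells)
  keywords, args = [], []
  cells.each_with_index do |cell, i|
    (i.even? ? keywords : args) << cell
  end
  name = keywords.map { |k| k.split.map(&:capitalize).join }.join
  name = name[0, 1].downcase + name[1..-1]   # leading letter stays lowercase
  [name, args]
end
```

So the cells `charge | 50 | dollars against account | 89-64P` come out as `chargeDollarsAgainstAccount` with arguments `50` and `89-64P`.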

One practical result of that is that a table that once looked like this:

com.exampler.writer.fixtures.TemplateCreator | letter template
field | default
Salutation | To whom it may concern:
Closing | Respectfully yours,

... can now have a first row with the same effect but a different style:

create template named | letter template | with these fields
field | default
Salutation | To whom it may concern:
Closing | Respectfully yours,

That doesn't seem like such a big deal, but the difference between a DoFixture-style page and a standard Fit page is that the DoFixture page has only one mysterious technology-facing row, the one that says something like com.exampler.writer.fixtures.FixtureName. The visual appearance of the page is more pleasant; it looks more like a document written for humans.

The DoFixture isn't in general release yet, I don't think, but it should be soon. Watch for it. Thanks, Rick.

P.S. There's one thing that I think Fit still needs to convert it from a tool for the test-infected visionary to a mainstream tool, and that's a Fit-specific editor. For example, if I were in "row-fixture mode", I'd want to type the above table like this:

[command to start a row fixture]
charge[tab]50[tab]dollars against account[tab]89-64P[return]
schedule[tab]10[tab]dollar payments[tab]monthly[tab]from[tab]checking[return]

The table would grow itself around the words I'm typing instead of my having to create it first. That would allow me to type the table quickly, instead of boring the product owner while I fiddle with table formatting. And the editor would take care of italicizing the keywords. (Although that's not required, I think it helps.) It might also touch up the table with the appropriate colspan attributes to keep it looking tidy. Oh, and it should support modifying tables so that changing the tests is as pleasant as changing the code in an editor like IDEA. And it would be cool if certain table refactorings (like "extract method" from DoFixture tables) also automagically refactored the underlying Java code. I ain't asking for much.

## Posted at 21:36 in category /fit [permalink] [top]

Sun, 06 Feb 2005

Three books

I'd read chunks of Michael Feathers' book, Working Effectively With Legacy Code, before publication, but it's only on the last few plane rides that I've read it straight through. It's really good: gobs of experience distilled, delivered in a consistently readable style and with an encouraging, even gentle, tone.

The Pragmatically Publishing Programmers have just come out with Mike Mason's Pragmatic Version Control Using Subversion. Since CVS and I have a stormy relationship, particularly regarding deleting directories, I bought the paper+PDF version post-haste.

Rick Mugridge and Ward Cunningham's Getting Fit for Developing Software is in copyedit now. Theirs is a particularly difficult game: writing both for nontechnical Customers and for testers and programmers, weaving themes and a common thread of examples through what is inherently a lot of not-essentially-connected subtopics, and tackling both How To and Why Bother. I was pleased and honored to write the foreword. But there's more: the book brings with it Rick's Fit library, including his DoFixture, which is along the lines of my StepFixture but seems to be much better (though I haven't used it yet). That and Rick's other fixtures will help Fit a lot, I think.

## Posted at 21:26 in category /misc [permalink] [top]

Notes from a business traveller

You can get an entire exit row to yourself if you travel to a Superbowl team's home city on the evening of the Superbowl.

But once you check in, those same old questions come crowding into your mind:

  • What's with having no power outlets within reach of anything moveable that might need power? It was bad enough when the iron couldn't be plugged in except in the most awkward corner of the room, but it's truly annoying in this day of lounging on the bed with a low-battery laptop.

  • Who exactly is it that thinks those bathtub/shower drains that you open by pulling them up and close by stepping on or near them - like, say, one might while rinsing one's hair - are a good idea? Does anyone but hotels buy them?

  • And why do fitted sheets for king-size beds never fit?

## Posted at 21:25 in category /junk [permalink] [top]

Sat, 05 Feb 2005


So I'm a Macintosh user. I get tons of spam, a big chunk of it from zombie PCs. I can live with that. My spam filter works OK. But now I find that some rapidly spreading spyware takes over the screen of the PC it's installed on and displays my site, www.exampler.com/testing-com. I suppose the spyware authors were testing their program and "testing.com" came to mind.

Here I am, having to actively fend off outraged PC owners even though I paid extra money to avoid sharing their wretched, spyware-infested, decent-shell-lacking, backslashian lives.

Classic market failure. A negative externality. Grr.

## Posted at 14:51 in category /junk [permalink] [top]

Mon, 31 Jan 2005

I'm a Customer on an Agile project

The Agile Alliance is revamping our website. Micah Martin of ObjectMentor wrote the first version and did a fine job, but the site's been around for four years, and technology moves on. I've been keeping it up for a year, tinkering here and there, but I don't have time to do the complete rewrite we need.

So we took bids for the revamp. Unsurprisingly, we're running the project in an Agile style. (If the second iteration goes well, we're going to start deploying in two weeks.) And I'm the partly-on-site Customer for the project, since I live about five miles from the company that's doing the work.

I'm having a blast so far. It's great fun steering the project against the background of our original story list. But it's serious fun, because all three bids came in at about the same price, and it was about twice what we wanted to pay. So we're paying what we wanted to pay, which means that I'm steering the project knowing that I have to pack the most value into the time we have, since a lot of stories will fall off the end.

I hope this will give me a hint of the insider's view of what product owners go through, especially since I hope to concentrate a chunk of my effort this year on supporting them.

## Posted at 20:06 in category /aa-project [permalink] [top]

Tue, 25 Jan 2005


Mike Clark has a nice little video showing CruiseControl in action. He needs to affix a disclaimer, though, that he writes his books with Emacs.

Dave Hoover tells of an interesting pairing technique: Ping-Pong Programming.

## Posted at 09:28 in category /misc [permalink] [top]

Sun, 23 Jan 2005

Agile 2005 call for papers

The Agile 2005 call for papers is out. Key dates are March 1 and March 15. (But you'll have to follow the link to see key dates for what.)

Agile 2005 is July 24-29 in Denver, USA.

## Posted at 16:48 in category /agile [permalink] [top]

Usability testing

Jonathan Kohl has an interesting note on team usability testing using personas. It's his position paper for the Third Canadian Agile Network Workshop, where Jeff Patton (Mr. Agile Interaction Design) and I will be leading the group in (we hope) figuring out more about how interaction design, testing, and the customer-facing parts of agile projects hang together. I don't know if the workshop is closed yet.

## Posted at 09:11 in category /testing [permalink] [top]

Sat, 22 Jan 2005

Testers who can script

I would like testers on an agile project to be able to code, preferably in some scripting language like Ruby or Python, secondarily in the languages their programmers are using to write products. In a note on the software testing list, Cem Kaner challenged that assumption. Here's my interpretation of his point. It may not be a correct interpretation, but it's the one I want to address.

As Bret Pettichord has pointed out (PDF), testers tend to be generalists. They know testing, but they also need to know the product's business domain. They might have a wide though uneven understanding of technology issues (like differences between Windows NT and Windows 2000, or between IE and Firefox). They need to have the "soft skill" of interpreting the desires of multiple interest groups because part of what they're doing is looking for the Customer's mistakes. As I've been learning from Jeff Patton, they ought to have some background in user-centered design. They're likely to switch projects more often than programmers, so they also need to be quick studies (which Bret also notes). So why then do people like me make such a big deal out of programming?

  1. I would think less of a programmer who was unwilling to learn about the things testers know. So I don't believe I'm putting a burden on testers that programmers are exempt from.

  2. I would think less of a tester who was unwilling to learn those things. What's interesting to me is that everyone I consider a reasonable tester would agree with me. Programming sticks out (to me) as somehow being treated specially: it's a burden many testers think they should be exempt from.

  3. That's weird. So much of what afflicts testers is because they're completely dependent on the programmers to make the product testable. Being able to program reduces that dependence, but being able to talk to programmers in their own terms probably helps even more. I've noticed that often, and Bret makes a point of it too.

On balance, it seems to me that programming is underemphasized as a part of a balanced tester's skill set. So I am justified in emphasizing it.

I suspect programming is singled out because of historical accident, though my hunch may be skewed by the circles in which I've moved. This is what I saw:

  • In the 80's, when I was coming up, testing was too often a dumping ground for failed programmers. If you weren't good enough to be a programmer, you were sent to what became, by definition, a job for people without (the right) skills. It's natural to rebel against the value system of those who consider you inferior.

  • In the 90's, especially during the .com bubble, more testers seemed to be hired from non-technical fields. A history major I worked with is emblematic to me. She'd been hired fresh out of college into a startup. Given that programmers were magical (having the mystical power to turn elevator speeches into gold), it was easy to think they had mental powers mere testers couldn't hope to grasp. (Whereas I think competent programming isn't all that much harder to learn than lots of other skills.)

History needn't be destiny.

## Posted at 19:50 in category /agile [permalink] [top]

Tue, 11 Jan 2005

Another tester succumbs

From Elisabeth Hendrickson, testing consultant and my occasional partner in training:

I'd been resisting Ruby for such a long time, thinking that I already knew enough scripting languages. I figured I'd be better off spending my time learning Java and C#. After seeing what WATIR could do and how neat Ruby is, I became a convert. I dug in and learned the basics of Ruby over the weekend (though there is still much I need to learn).

## Posted at 20:42 in category /ruby [permalink] [top]

Mon, 10 Jan 2005

OOPSLA call for papers is out

The OOPSLA Call for Papers is out. I'm chair of the Essays track. Here's its blurb:

Some ideas are the result of research and others of reflection. Sometimes it takes someone sitting down and just thinking about how things are connected, what a result really means, and how the world really is. Some of the most impressive products of civilization are its essays - philosophy, for example, is reflection captured in essays. An essay presents a personal view of what is, explores a terrain, or leads the reader in an act of discovery. Some contributions to computing come in the form of philosophical digressions or deep analysis. An essay captures all these - one at a time or all at once.

Each essay will be afforded a 45-minute speaking slot and allocated about 20 pages in the proceedings.

I've assembled a wide-ranging committee: a business school professor, head of the MIT AI lab, head of the Illinois sociology department, professors of philosophy, sociology (again), and statistics, a Forrester researcher, a Pragmatic Programmer, the director of the Warren-Wilson Master of Fine Arts program, another software consultant, and me.

As that list implies, we're looking for submissions from both those within the software fold and those outside it. Spread the word, please.

See also this:

Submissions are due March 18.

## Posted at 14:25 in category /oopsla [permalink] [top]

Sun, 09 Jan 2005

Programming the PDP-1

A story, following up on my post about the virtues of knowing languages close to the machine. One of the things that always impressed me about the great Lisp hackers was the way they moved effortlessly between levels of abstraction. At one moment, they could be thinking extremely abstract ideas like call-with-current-continuation (often abbreviated call/cc). The next, they could be hacking PDP-10 assembler. But are the two levels so unconnected?

There was an OOPSLA workshop organized around reading PDP-1 assembler. An interesting machine, the PDP-1. It had one general-purpose register, one IO register that could be used for scratch space, and no stack pointer.

In the workshop, we read parts of Peter Deutsch's first Lisp for the PDP-1, struggling through the unfamiliar idioms. Even for people who know assembly, there's a big difference between idioms that assume at least six available registers + a stack pointer (the PDP-11) and those that assume two.

One idiom looked something like this (eliding any complications due to having such a small peephole to memory):

  104:  store 106 in 100
  105:  goto 203
  106:  next instruction in the computation

  203:  do something
  204:  goto 303

  303:  do something
  304:  goto 404

  404:  do something
  405:  goto the instruction stored in 100

The main program invokes a subcomputation by storing the eventual return address and jumping to the first subroutine. All the components of the subroutine know they're part of a chain of computation, so they just jump to the next link. The final one looks in a known place to find where the main program's next instruction is and jumps there. No stack.

When we figured out what was going on (I think Dick Gabriel had to explain it), the thought that flashed through my brain was "call/cc!" It's not - it's closer to setjmp/longjmp - but I wondered whether experience at that low level primed the Lisp hackers to be receptive to ideas like call/cc. The idea of a function call is sort of a closed conceptual universe. It encourages you to think about what you can do with function calls, not necessarily about what you can do that's like function calling. But if you know assembler, especially from before there were modern-day function calls, you're perhaps more likely to think of a function call as a bricolage: something made up of pieces, pieces that can be assembled in different ways or used independently in combination with new pieces. Maybe call and return don't have to go together.
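For readers who haven't met it: call/cc reifies "the rest of the computation" as a callable object. Ruby borrows the idea under the name callcc, so here's a minimal sketch in which a continuation serves as nothing more than a stored return address, much like location 100 in the PDP-1 idiom. (The function and its name are my own invention for illustration.)

```ruby
require 'continuation'  # Ruby's home for callcc

# Use a continuation as an escape procedure: calling `ret` jumps straight
# back out of the callcc block, handing it the value we pass.
def first_big_number(numbers)
  callcc do |ret|
    numbers.each { |n| ret.call(n) if n > 100 }
    nil   # reached only if nothing escaped early
  end
end

first_big_number([3, 7, 250, 9])   # => 250
```

Here the "call" and the "return" really have come apart: the return is an ordinary value you can pass around, store, or invoke from anywhere.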

I suspect that's all bogus. Call/cc probably has more to do with Lispers' tendencies to try to make everything a function. (Witness Steele and Sussman's Lambda the Ultimate papers.) And there's all that denotational semantics / lambda calculus stuff. Perhaps experience using assembly to code something vaguely similar didn't till the soil for call/cc. But isn't it pretty to think so?

## Posted at 12:51 in category /misc [permalink] [top]

Thu, 06 Jan 2005

Low-level is higher now

For old fogies who think there's no progress in software, this advice from Joel Spolsky to college students:

Learn C before graduating

C. Notice I didn't say C++. Although C is becoming increasingly rare, it is still the lingua franca of working programmers. It is the language they use to communicate with one another, and, more importantly, it is much closer to the machine than "modern" languages that you'll be taught in college like ML, Java, Python, whatever trendy junk they teach these days. You need to spend at least a semester getting close to the machine or you'll never be able to create efficient code in higher level languages.

I agree wholeheartedly with that advice. However, I can't help but be amused that in 1981 (when I got my BS), that advice could be - and was - expressed in these terms:

Learn an assembly language before graduating

Assembly. Although assembly is becoming increasingly rare, it is still the lingua franca of working programmers. It is the language they use to communicate with one another, and, more importantly, it is much closer to the machine than "modern" languages that you'll be taught in college like Pascal, C, whatever trendy junk they teach these days. You need to spend at least a semester getting close to the machine or you'll never be able to create efficient code in higher level languages.

It sounds hard to believe people said that of C, which has been called "portable assembly language", but people did argue (rightly, I think) that people who only knew C did not really understand the costs of things like function calls and passing structures as arguments (as opposed to pointers to structures). Thus, they wrote inefficient C code. I remember serious and heated arguments on USENET about whether C compilers could possibly get good enough to allow C to be used for serious applications like operating systems. (Though I do think the assembly defenders were definitely losing by then.)

Nowadays, I wouldn't recommend just learning C, I'd recommend learning the C-coded virtual machine of some higher-level language (Ruby, Python, Lisp, Smalltalk). Learning how closures/blocks/lambdas really work is the modern equivalent of understanding function calls. Ditto garbage collection and passing structs. Not only will you learn about efficiency, you'll learn what these features really do - so you'll more readily recognize situations where they apply, and you'll make fewer puzzling mistakes.

Then the next important thing to master is when not to pay attention to what you know about efficiency.

## Posted at 08:52 in category /misc [permalink] [top]

Wed, 05 Jan 2005

Something for fans of CRC cards, Big Visible Charts, and the like

I have never followed a science, rich or poor, hard or soft, hot or cold, whose moment of truth was not found on a one- or two-meter-square flat surface that a researcher with pen in hand could carefully inspect.

-- Bruno Latour, Pandora's Hope, p. 53.

## Posted at 07:08 in category /agile [permalink] [top]

Mon, 03 Jan 2005

Are companies more like wheat farmers or rice farmers?

Twice a year, Dawn and I drive from the Grandparents' house to the White Mountains to hike. On the way, we always read a nonfiction book to each other. This vacation, it was James Surowiecki's The Wisdom of Crowds. I may have more to say about it later, but for now, this extended quote:

In [the Green Revolution in India] rice farmers and wheat farmers made their decisions about new crops in very different ways. In the wheat-growing regions [...], land conditions were relatively uniform, and the performance of a crop did not vary much from farm to farm. So if you were a wheat farmer and you saw that the new seeds substantially improved your neighbor's crop, then you could be confident that it would improve your crop as well. As a result, wheat farmers paid a great deal of attention to their neighbors, and made decisions based on their performance. In rice-growing regions, on the other hand, land conditions varied considerably, and there were substantial differences in how crops did from farm to farm. So if you were a rice farmer, the fact that your neighbor was doing well (or poorly) with the new crop didn't tell you much about what would happen on your land. As a result, rice farmers experimented far more with the new crop on their own land before deciding to adopt it.

Pages 120-121 in the large print edition, which was the only one our traditional Barnes and Noble had when we stopped by on the way north.

Seems fairly straightforward, but it made me think a bit about the effort to make Agile methods more mainstream. Let's assume that Agile development works for the visionary crowd. Now we want to get early mainstream project managers and executives to adopt it. Geoffrey Moore calls these people pragmatists. He has a rule: pragmatists almost always adopt only after their peers do. Those peers are other pragmatists in the same industry. A visionary entrepreneur or someone in another industry is like a neighboring rice farmer: so different that their experience doesn't predict much. It's only the fellow pragmatists who can be treated as fellow wheat farmers.

This produces a first-mover problem: how do you persuade the first pragmatist to try something? I'm not going to talk about that here, but see Moore's Crossing the Chasm. Instead, I'm going to assume that we've succeeded at that. For whatever reason, MegaCorp has one Agile project going and producing good results. Are we over the hump? I think experience tells us not, and Surowiecki gives us a way to talk about why.

Let's consider the other project managers and executives in MegaCorp. When they look around themselves within the company, they see a huge variation in project results. (That's why they've probably already flirted with CMM: they crave predictability.) I think of every project as being like its own little rice patch. Why should the manager of the next-door rice patch believe that another project's success with XP means anything? Especially if that project was a pilot project: it was probably staffed with enthusiasts, it was probably small, it was quite likely a greenfield project, the Hawthorne Effect was surely in play, etc. etc. All these are reasons to avoid the risk of change. (In Moore's terminology, software encourages managers to be either visionaries or conservatives, leaving a bigger-than-usual chasm.)

To me, this suggests that those who want to encourage the spread of Agility should perhaps concentrate more on the second Agile project in a company than the first. One of the two goals of the Agile Alliance is to help more Agile projects be created. During the next year, we want to reach out more toward project sponsors and other executives. (I should have been doing a better job of that over the last year.) Perhaps the best way to do that is to provide specific support for moving beyond the pilot project. That'd at least be novel, and I think it demonstrates a reassuring long-term commitment.

## Posted at 20:34 in category /agile [permalink] [top]

About Brian Marick
I consult mainly on Agile software development, with a special focus on how testing fits in.

Contact me here: marick@exampler.com.



