Exploration Through Example

Example-driven development, Agile testing, context-driven testing, Agile programming, Ruby, and other things of interest to Brian Marick

Tue, 19 Dec 2006

Lively wireframes

Tests are better than requirements documents because they're more lively. Not only do they describe what the system is to do, they give strong hints about whether it does it. Requirements documents just sit there. The liveliness of tests makes up for the occasional awkwardness of their descriptions. (It's harder to write for two audiences—the human and the test harness—than it is to write for one.)

In a series of talks I gave earlier this year, I described three types of business-facing tests: ones based on business logic, ones based on workflow, and ones based on wireframe mockups of a user interface. I talked about wireframes last, and what I had to say compared poorly to the previous two. Those tests had been simultaneously executable and OK-to-good at communicating. But, when it came to wireframes, the best I could do was draw one on a flipchart and say, "I wish I could lift that off and put it in the computer. The closest I can come is this..."


  def test_structure_without_audits_or_visits
    wireframe_looks_like {
      # a textual description of the window's structure
    }.given_that {
      # setup that puts the application into the pictured state
    }
  end
That's bad because we have two separate representations, each of which is lousy for one of the two audiences. I now think I have something better. Here's a wireframe:

It's a drawing created with OmniGraffle Pro (using a stencil from John Dial). That kind of wireframe is easy for a whole team to talk about, but it's too ambiguous for a testing tool. (How would it know whether a given rectangle is a text box, a text field, or the decoration at the bottom of the window?) Fortunately, OmniGraffle allows you to attach notes to graphics. The yellow tooltip-ish rectangle shows the annotations attached to a text field; they remove the ambiguity.

Here's a test that uses that wireframe:

The image is just there for human consumption. In real life, I'd want the human to work exclusively on the Graffle document and not think about PNG files at all. Instead, I'd have a script watch for changes to Graffle files and regenerate all the PNG images.
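That watcher script can be sketched in a few lines of Ruby. Everything here is an assumption: the directory layout, the polling approach, and especially the `graffle-export` command, which stands in for whatever would actually ask OmniGraffle to export a PNG.

```ruby
require 'find'

# A .png is stale when it's missing or older than its .graffle source.
def stale?(graffle, png)
  !File.exist?(png) || File.mtime(png) < File.mtime(graffle)
end

# Walk a directory tree and, for every Graffle file whose PNG is out
# of date, regenerate it. Returns the list of PNGs (re)generated.
def regenerate_stale_pngs(root)
  regenerated = []
  Find.find(root) do |path|
    next unless path.end_with?('.graffle')
    png = path.sub(/\.graffle\z/, '.png')
    if stale?(path, png)
      # system('graffle-export', path, png)   # hypothetical exporter
      regenerated << png
    end
  end
  regenerated
end
```

In practice you'd run something like this in a loop (or from cron) so the images stay current without anyone having to think about them.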

The actual test ignores the image. Instead, it parses the Graffle file ("normal-run.graffle"), hooks the program up to a fake window system that records messages like setStringValue and selectAll, starts the program, waits for it to do all its UI initialization, then compares the state of the windows against what the Graffle document claims. When the tests run, the results look like this:

The error messages could do a better job of pointing to the right control, and it's a shame that the image doesn't appear in the output. (Fit swallows it along with any other HTML tags in the test input. No doubt I could work around that.) However, this output is only for programmers already deep in the code. It doesn't have to be as friendly as output aimed at a wider audience.
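The "fake window system" mentioned above amounts to a recorder that stands in for real controls. Here's a minimal sketch: the message names setStringValue and selectAll come from the post, but the class and its interface are invented for illustration.

```ruby
# A stand-in for a real control. It records every message sent to it
# and remembers the arguments of setters, so a test can later compare
# the control's state against what the Graffle document claims.
class FakeControl
  attr_reader :name, :messages

  def initialize(name)
    @name = name
    @messages = []
    @state = {}
  end

  # Accept any message, recording it. Setter arguments are remembered
  # so the comparison step can ask about them afterward.
  def method_missing(message, *args)
    @messages << [message, *args]
    @state[message] = args.first if message.to_s.start_with?('set')
    self
  end

  def respond_to_missing?(_message, _include_private = false)
    true
  end

  def state_of(setter)
    @state[setter]
  end
end
```

A test would hand the program a window full of these, let UI initialization run, then check each control's recorded state against the annotations parsed out of the Graffle file.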

I still have two big open questions.

  • How much time would it take to make a fake window system that could maintain all the state anyone cares to express in a test? (And what is it that should be expressed in such tests? I'll have more to say on that later, probably.)

  • How fragile will these tests be in the face of change? Updating the annotations and the tests has to be a small part of changing the wireframes and the UI code.

The next installment ties this into the Atomic Object style of model/view/controller, as described here (PDF) and in a forthcoming Better Software article. But first, I have to figure out how to parse canvases out of Graffle files. And there's that whole vacation thing.

## Posted at 21:50 in category /fit [permalink] [top]

Wed, 13 Dec 2006

Best signature ever

Plucked from the bottom of mail from Pete McBreen:

"For a list of all the ways technology has failed to improve the quality of life, please press three." - Alice Kahn

## Posted at 12:26 in category /misc [permalink] [top]

Tue, 12 Dec 2006

Primary Loyalties

This is an editorial that expanded on an offhand earlier post. It was rejected. While it does have two potentially offensive analogies, I figure I have more leeway in what I publish here. It's had a postscript added and is now filled with hyperlink goodness.

A recent UN report states that "New explosive devices are now used in Afghanistan within a month of their first appearing in Iraq." (Reuters, September 27, 2005). Compare that to the rate of diffusion of technology in our field. I'll use continuous integration as an example. It's a well-established technology that's easy to deploy, is practically without risk, has considerable benefit, was first widely described in 2000, and has had a solid open source tool supporting it for at least three years. But there's a reasonable chance you've never heard of it. (Note: true of original audience; likely not true of this blog's audience.) If you tried to deploy it, it might well take months and months to get permission, to round up a build machine, and to get the first people using it.

Something is desperately wrong with this picture. Why is it that people living in isolated harsh conditions where people are trying to kill them can move faster than we can in our offices?

John Robb, a software executive and former Air Force counterterrorism operative, describes what the guerrillas do as open-source warfare, and he's developed a rather elaborate theory of how that works. One underpinning of the theory is what he calls primary loyalties. "A primary loyalty is a connection to a non-state group that is greater than loyalty to a state. These loyalties include those to clan, religion, tribe, neighborhood gang, etc. These loyalties are reciprocated through the delivery of political goods [...] by the group that the state cannot or will not deliver."

Professional-class employees like you and me once had something like a primary loyalty to our employer, especially if it was a large company. In the US and elsewhere, that employer delivered "goods" to us like steady employment, guaranteed pension, medical care, a career path, and the training we needed to advance along it. Under Anglo-American capitalism, at least, corporations no longer deliver many of those things. Instead, as is described in Jacob Hacker's The Great Risk Shift, companies have given us the opportunity and responsibility to provide those things for ourselves. For example, instead of being given a guaranteed pension, we're given money to invest. If we invest well, we'll end up with more retirement money than the pension a company would have given us; if not, well, tough luck.

Whether that's a good or bad shift, employees have acted like people in Iraq and other failed states: they've shifted their primary loyalty elsewhere. In the US, we've seen rising nationalism, increased devotion to religious groupings, and more loyalty to political "tribes" (though not increased formal party membership). None of those loyalties have anything to do with work. Therefore, according to Robb, we're missing a key part of the infrastructure that supports fast diffusion and implementation of technologies at the office.

I think that's bad. We need groups that deliver the goods and are deserving of loyalty. Existing structures (unions, professional societies) aren't working, and I'm loath to wait for them to start. The best I can offer is the autonomous team. I'm not talking about collections of individuals who've sleepwalked through "team-building exercises," but actual teams that work together very closely (often in pairs), learn together quickly, and provide cover for each other. When a team is working, the business comes to view it as a single specialist, a unit, with authority over what happens within itself. If the team decides to try continuous integration, it will deploy it without ever thinking to ask permission.

I acknowledge that it's offensive, at some gut level, to suggest emulating killers. But if this decade has a notable example of the "learning organization", it is—sadly—groups of insurgent cells with high internal loyalty and loose connections to both each other and also to the overarching sources of goals and funding.

P.S. John Robb's ideas haven't convinced me yet—sometimes his analogies seem more than a bit strained—but you may find his site worth a read. Hacker's notion of a risk shift has also drawn some scorn, though that particular link misses the point that matters to me. If you're an investor in the stock market, you expect stocks with higher volatility to pay higher returns over time. The higher returns are your payment for accepting higher volatility, usually tagged as "risk". What I take from Hacker is that a career today has higher volatility than in the past, but that higher risk has not come with significantly higher returns—instead, the US real median income has increased by 31% from 1967 to 2005 (source, PDF, p. 5). That's an annual real return of 0.6%. For comparison, that's a bit less than the real return on short-term US Treasury bills, historically the world's least risky investment.

## Posted at 07:53 in category /misc [permalink] [top]

Mon, 11 Dec 2006


My wife is writing the chapter on mammary gland health and disorders for Large Animal Internal Medicine, a standard reference. Her current draft is 119 double-spaced pages. It has 532 citations. The scary thing is how much she remembers—off the top of her head—about the contents of the papers. She is truly a fox.

Me, I can barely remember the difference between Facade and Adapter.

## Posted at 14:14 in category /misc [permalink] [top]

Thu, 30 Nov 2006

Did I say "Scripting for Testers"?

Scripting for Testers has been renamed Everyday Scripting in Ruby because a couple of reviewers argued that pretty much all that was required to make it suitable for a larger audience was changing the title and the bit of Introduction that says who the book is for. So we did.

I hope testers still pick it up. The subtitle says "for teams, testers, and you", which helps Google find it when you type in "scripting for testers." (It's the top hit.)

Sadly, the scheduled ship date is a bit after Christmas. Since it would be sad if testers didn't get the book under their tree, we've decided to delay the holiday.

Thanks to those who helped me on it: Mark Axel, Tracy Beeson, Michael Bolton, Paul Carvalho, Tom Corbett, Bob Corrick, Lisa Crispin, Paul Czyzewski, Shailesh Dongre, Gunjan Doshi, Danny Faught, Zeljko Filipin, Pierre Garique, George Hawthorne, Paddy Healey, Andy Hunt, Jonathan Kohl, Bhavna Kumar, Walter Kruse, Jody Lemons, Iouri Makedonov, Chris McMahon, Christopher Meisenzahl, Grigori Melnik, Sunil Menda, Jack Moore, Erik Petersen, Bret Pettichord, Alan Richardson, Paul Rogers, Tony Semana, Kevin Sheehy, Jeff Smathers, Daniel Steinberg, Mike Stok, Paul Szymkowiak, Dave Thomas, Jonathan Towler, and Glenn Vanderburg.

UPDATE: People have pointed out the lack of links. I am a master of Marketing.

## Posted at 06:47 in category /misc [permalink] [top]

Open Office for Fit

I've started using OpenOffice (in its Mac-ified NeoOffice form) for writing Fit tables. It's working considerably better than Word. Not only does it produce decent HTML (valuable when you're trying to figure out exactly what's going on), it does a better job of producing an HTML file that looks similar to the original WYSIWYG editor view, both when displayed through a browser and when read back into the editor.

I should note that I'm still using Word X for the Mac, so others might have better luck with Word than I've had. But if Word isn't working well for you, check out OpenOffice.

## Posted at 06:47 in category /fit [permalink] [top]

Fri, 10 Nov 2006

Nerd humor

These are all mentioned in Crypto-Gram.

First, the recent torture-lite bill boiled down to C code.

if (person = terrorist) {
    /* ... */
} else {
    /* ... */
}

There's more than one relevant bug. More here.

Next, check out the comments to A Million Random Digits. I like B. McGroarty's best.

Next, 19 Year Old Diebold Technician Wins Us Presidency.

## Posted at 14:29 in category /misc [permalink] [top]

Wed, 01 Nov 2006

IEEE Software issue on TDD

IEEE Software will have a special issue on test-driven development (May/June 2007). I'm a reviewer, and I've been asked to spread the word. The Call for Papers is here. The deadline is December 1.

## Posted at 07:53 in category /misc [permalink] [top]

Response to an essay

At OOPSLA, I was tapped to give the response to Jim Waldo's essay, On System Design. The essay isn't online yet, so here are what I see as its main points:

  • Almost all systems have designs. Most are bad. Some are good. (There is some tension in the essay about the word "design". It sometimes feels like design is inherent in the system, sometimes like it requires a description of the system, and sometimes that it is an act performed with a certain attitude over a prolonged period of time.)

  • There is no single design method, though conversation and thoughtful reflection and iteration seem to be required. Good design comes from good designers. Design is learned through apprenticeship.

  • Design was more possible in the past. (The essay does a nice job of listing the forces working against design.)

  • All is not lost, though. Open source allows people to learn system design by examining designs, and it evades some of the forces that work against design. Agile software development has the conversations and iterations characteristic of the design process.

  • Designers must have the courage to push back against the forces pushing against good design.

Unusually, for me, I wrote my talk down as an essay and read it to the audience. (Bad idea—my first flop sweat experience in a long time.)

My job is to give a thoughtful reaction to this essay, to describe what it means to a person with my perspective. Here goes.

You've just heard a description of a fall from a Golden Age when the world allowed us our values—to a world where the people and structures that hold power over us are uninterested, immune to our influence, and unwilling to leave us to putter in peace, despite our heartfelt claims that we'd all be better off if they did.

We're not the first people in this situation—many have been in far worse—and as I reread Mr. Waldo's essay one day, I thought it might be instructive to see how those others have handled it.

One response, the default perhaps, is despair and retreat from engagement. I think we're all familiar with that feeling, and with those who've succumbed to it, so I won't discuss it further.

The next two responses come from the Hellenistic period of Greek history, which followed the Classical period and was a time of turmoil, during which you might easily and uncontrollably go from great wealth to poverty or from power to slavery. This raised the practical question: how do you make yourself happy in a hostile world?

Zeno of Citium's answer has come to be called Stoicism. In this tradition, happiness comes from the possession of the genuinely good, and the only things that are genuinely good are the characteristic virtues of humans: wisdom, justice, temperance, courage, and so forth. We might include the desire to apprehend elegance in design as a virtue.

The wise person—the happy person—makes decisions based on how they align with the genuinely good. The results of those decisions have nothing to do with happiness: the Stoic would prefer they lead to wealth, health, and life, but is ultimately indifferent if they lead instead to poverty, sickness, and death. Epictetus puts it this way:

Our opinions are up to us, and our impulses, desires, aversions--in short, whatever is our doing. Our bodies are not up to us, nor our possessions, our reputations, or our public offices... if you think that [those] things ... are your own, you will be thwarted, miserable, and upset, and will blame both the gods and men.

From this, we get the popular image of the Stoic as someone who does what's right, because it's right, and is immune to attempts to sway her through non-rational emotions like fear of death. Marcus Aurelius, a later Stoic, put it this way:

Say to yourself in the early morning: I shall meet today ungrateful, violent, treacherous, envious, uncharitable men... I can [not] be harmed by any of them, for no man will involve me in wrong.

The Stoic approach to our problem would be to do thoughtful design because it is a good, and to be indifferent to the consequences. We would, for example, not care if the only company that would allow us to design well pays poorly, builds mundane software, and has no free soda in the kitchen. Stoicism is, I believe, what Mr. Waldo advocates.

But Stoicism was not the only philosophy that sprang from the chaos of the Hellenistic period. Epicureanism was another.

This is Epicurus, the founder of Epicureanism. In Epicureanism, happiness means having your desires satisfied and pain avoided. The virtues—courage, wisdom, and the like—are useful because they lead to the satisfaction of desires, not in and of themselves (as in Stoicism).

The best strategy toward happiness is to pare your desires down to the minimum, which are then easily satisfied. One should avoid desires that are inherently unlimited, such as those for wealth, power, fame, and the like, in favor of desires that can be readily satisfied—by, say, filling your stomach when hungry. Moreover, simple food is easier to obtain than fancy food and fills the stomach just as well; therefore, you should strive to be happy with simple food, though equally happy to eat fine food when it's there.

When I think of Epicureanism today, I think of the open source programmer who comes home from an unsatisfying job and spends part of the evening working on Firefox plugins or Ruby packages, designing them to meet the highest standards. Since, to Epicurus, current pain is outweighed by the mental pleasure of remembering past pleasures and anticipating future ones, the next day at work is thus made tolerable.

A third reaction is, to a Western audience, most associated with the period after the stability of the Roman empire collapsed. It is a negotiated retreat from the field of battle.

Here is a monastery. I suspect that it was built on the edge of a cliff not because of the view but because that's a defensible location.

Monasteries had defenses because they were liable to attack. Many of the attacks were like those of the Vikings on England, Ireland, Scotland, and elsewhere. Those attacks came from outside the existing, fragile social order. But there were also attacks from closer to home. A record of entries from around 1400 shows that monasteries in Ireland had troubles long after the Vikings ceased to be a threat:

1394: The monastery of Loch Seimhdille was burned by the family of Ó Ceallaigh thirty-one years after it had been previously burned by Cathal Óg Ó Conchobhair.

1398: Mac Diarmada of Magh Luirg, ..., went to provision Carraig Locha Cé and compelled the monastery of Boyle to supply Carraig.

1402: A foray by Ó Ceallaigh and Clann Chonnmhaigh on the monastery of Comán.

This being roughly a thousand years after Christianity reached Ireland, but before Martin Luther, I'm speculating here that these attackers were Catholic Christians, yet they were not deterred by the presumed anger of the Christian God at attacks on His monks. Hence: walls, cliffs, and towers to which the monks could retreat while the raiders plundered.

Still, the monks did not simply disappear from society behind walls. They provided value to those they'd left behind.

For example, they would pray for the souls of your departed relatives.

And monasteries were a convenient place to stash the still-living bodies of inconveniently undeparted relatives. The picture is of Sophia, inconvenient to Peter the Great, in a nunnery.

And, of course, in Belgium there was beer.

I am sure these services gained them some protection.

When I think of monasticism today, I think of Agile projects. In Agile projects that are running well, there is an implicit or explicit deal between the team and the business. The team promises to deliver shippable business value at frequent intervals and not to whine when the business changes its mind about what it wants. In return, the business leaves the team alone to build the product as they like. That allows people who crave good design to do it—provided they can mesh it with the need to deliver frequently. In practice, that means that code becomes the whiteboard on which the design is discussed, rediscussed, and refined. This—in the best cases—seems to me exactly the same process Mr. Waldo describes. By that, I mean that the attitudes of people toward the design are the same, the conversations have the same air, the values informing the conversations are the same, and the code—in roughly the same time frame—comes to have as satisfying a design.

I claim the monasticism of the Agile project is a more sustainable model than Stoicism or Epicureanism. It requires less of us because we get to lean on each other. Even programmers, notoriously not team players, gain strength from each other.

Perhaps that's a claim we can discuss.

For my part, I've recently become obsessed with the weakness of Agile Monasticism. Here is a story I heard from an ex-employee of a company I'll call Frex:

[That] year came dreadful fore-warnings over the land of [Frex], terrifying the people most woefully: these were immense sheets of light rushing through the air, and whirlwinds, and fiery dragons flying across the firmament. [By this, he refers to the acquisition of Frex by a larger company.] These tremendous tokens were soon followed by a great famine [the new head of marketing moved the Customer out of the project team room] and not long after, on the sixth day before the ides of January in the same year, the harrowing inroads of heathen men [a new VP of Development] made lamentable havoc in the church of God in Holy-island by rapine and slaughter. [The imposition of a "more mature" development process caused all but one of the team to quit.]

Such stories are common. Agile projects have no real defensive walls; all they can do is deliver return on investment and hope the business values it. But we all know that ROI is only a part of what moves businesses. Those in the Agile world all know of resistance to Agile from those middle managers who see it as a threat to their power to command and control. Telling such a person that her sabotage endangers the company's ROI is like an abbot standing in the path of Christian raiders and threatening them with loss of their immortal souls: sometimes it works, but nowhere near often enough. And it never works with the worshippers of Odin.

The universe of Agile teams is like a school of fish. Every once in a while, a predator sweeps through us, grabs a team in its mouth, and destroys them. We flail around in panic for a few moments, talk about the stupidity of it all with our nearest neighbors, then reform as before, ready for the next predator.

This is—I repeat—still better than before. Teams do tend to protect their members. Testers are less likely to be offshored. Those who obsess about design can do it without justifying themselves to the unsympathetic. But the teams themselves, as wholes, have no structure of protection.

Mr. Waldo's essay, paradoxically, is leading me to seek answers to the current problems of Agility in collective action exactly because its focus on individual courage calls attention to our biggest blind spot: we believe that each of us must alone contend against aggregates possessing decades of institutional power. We don't even think about standing shoulder to shoulder.

What path we should take, I don't know. Unionism is so foreign to the professional class in the US that I'm nervous about admitting I've ever even had the word in my mind. The ACM appears to me an organization for extracting money from people in return for papers printed in 9-point type, papers placed in bibliographic categories that don't seem to have changed since the seventies. Neither it nor the IEEE has enough spunk. The Agile Alliance, on whose board I sit, doesn't seem to have the right leverage. So I don't know what we should do, together, but I'll be thinking on the problem, and that's because of On System Design.


Special thanks to Donnchadh Ó Donnabháin, who tutored me in Gaelic pronunciation.

The photo of Despair is copyright by Carl Robert Blesius and was retrieved from http://blesius.org/gallery/photo?photo_id=1061. It is distributed under the Creative Commons Attribution-NonCommercial-ShareAlike 1.0 license.

The monastery is used by permission of historylink101.net and was found at the Greek Picture Gallery.

The fish picture is copyright by Toni Lucatorto and was retrieved from http://flickr.com/photos/toniluca/61782380/. It is distributed under the Creative Commons Attribution-NonCommercial 2.0 license.

The picture of the dinosaurs is used by permission of clipart.com.

The other photographs did not have copyright notices.

## Posted at 07:47 in category /agile [permalink] [top]

Tue, 24 Oct 2006

Wanted: sound file

My life would be ever so much better if I had a snippet of the ka-chunk sound that a 35mm slide projector makes when changing slides. Anyone got one or can record one?

Update: My life is ever so much better. There are at least three people in the world much better at Google than I am. Thanks, all.

Update: I've gotten requests for what I used. Here it is: http://freesound.iua.upf.edu/tagsViewSingle.php?id=4868. It uses a Creative Commons license.

## Posted at 14:49 in category /conferences [permalink] [top]

Gaelic pronunciations needed

For a mini-talk I'm giving at OOPSLA and possibly part of later talks, I probably need acceptable pronunciations of Gaelic words. I tried Gaelic pronunciation guides I found on the net, but what I'm coming up with can't be right. Here are the words. Do you know how they're pronounced? (Send a description or, better, a sound file.)

Comán (Koe-mahn)

Chonnmhaigh (Chawn-way, where "ch" is as in "loch". Is it really pronounced as strongly as a "ch" at the end?)

Ceallaigh (Kallagh, where "gh" is like the "ch" in "loch" but based on a G - whatever that means.)

Seimhdille (this is where I gave up - I can't figure out anything for that combination.)

Óg (that's with a long o, right? So it's pronounced like the first two letters of "ogre".)

Cé (Kay?)

Update: A Gaelic speaker will be teaching me the right pronunciation at OOPSLA before my talk. Isn't that cool?

## Posted at 14:48 in category /misc [permalink] [top]

Thu, 19 Oct 2006

The learning organization

A recent UN report states that "New explosive devices are now used in Afghanistan within a month of their first appearing in Iraq." (Reuters, September 27, 2005). How long does it take your organization to put a new technology or technique into use? How does that make you feel?

Offensive though it is, this decade's most notable examples of "learning organizations" are groups of insurgent cells with high internal loyalty and loose connections to both each other and also to the overarching sources of goals and funding.

## Posted at 05:06 in category /misc [permalink] [top]

Tue, 17 Oct 2006

A fixture for Boolean-valued business logic

I've implemented the fixture described earlier. It takes a table in a particular format, generates a new ColumnFixture table, and causes that table to be executed. You can see the Fit output a programmer works with here.

The source and jar file are at http://www.exampler.com/testing-com/tools/fitlibrary-extensions-0.1.zip. The README.txt file will tell you about examples.

I believe it works correctly, and I put it to the test at a client's on Monday. Nevertheless, it is an early version: I made no attempt to handle malformed input gracefully. I haven't made it work with DoFixture yet. I need to clean up the source directory structure. (JUnit tests are intermingled with source files.) I realize that the fixture knows almost enough to generate much of the ColumnFixture code for you, so I'm going to add that.

The version in the zip file was compiled under Java 1.4, though it is likely to compile under earlier versions.

## Posted at 16:57 in category /fit [permalink] [top]

Mon, 09 Oct 2006

An OOPSLA tutorial

If you're going to OOPSLA, I recommend you attend the tutorial Programmers are from Mars, Customers are from Venus: A Practical Guide to Working with Customers on XP Projects. I haven't taken it myself (never been in the right place at the right time), but I've talked to the lead presenter, Angela Martin, at length about the topic, and she's given me the complete notes. For those who don't know of her, Angela is one of the first names that springs to mind when you think about customers / product directors / product owners. Not only does she have practical experience, but she's also done some extremely interesting anthropological-ish research. Her co-presenters are Robert Biddle and James Noble, who have quite a good reputation as presenters. (Their postmodern programming presentation a few years back is a classic, a St. Crispin's Day event.)

I mention this because the description at the OOPSLA site has two flaws. "Working with Customers on XP Projects" in the title makes it seem that it's not for Scrum projects, but it is. And the blurb does not say, "This is for you too, programmers." From Angela's description, it most emphatically is.

## Posted at 09:02 in category /conferences [permalink] [top]

Thu, 28 Sep 2006

Off the rails

I'm too sick to write what I should be writing, and I can't sleep, so I decided to collect my thoughts and references about a current political topic I've been studying as I have time. The normal sort of posts will return shortly, but for the moment I'll use whatever reputation I have for careful-but-sympathetic thought to push back against an all-but-inevitable failure.

My understanding is that habeas corpus is a method by which prisoners can challenge their imprisonment before a judge. The idea has worked pretty well for 700 years. It fits with John Adams's phrase "a government of laws, not men": no one has exclusive power; everyone is subject to being checked and balanced.

It is now due to be removed in a hastily considered bill. Despite what some say, the idea of habeas is not to "give terrorists rights"; it is to preserve the rights of those wrongly accused as terrorists or unlawful combatants. There have been many such people already. Sometimes people are just detained; some are sent to Syria and tortured.

The bill allows for review of detentions by military commissions, but to date only ten have been held. People can be held forever without any recourse. (Some people have continued to be held even after review found them innocent, though a large number have been released.) In newer versions of the bill, "people" can include US citizens. Unlike the military's current definition of unlawful combatant, which covers only "those who engage in acts against the United States or its coalition partners in violation of the laws of war and customs of war during an armed conflict," the new one covers anyone who "has engaged in hostilities or who has purposefully and materially supported hostilities against the United States" or its military allies. Who are our allies? What does "material support" mean? I guess some of us might just find out.

During detention, what? It is simply not the case, as the President stated, that Geneva Convention Common Article 3 is impossibly vague about the treatment of prisoners. Ironically, on the same day as that statement, the US military released its new procedures, which explicitly conform to the Geneva Conventions. It's not surprising that, in over fifty years, we've been able to come to agreement about what the Conventions require. But now we're going to replace them with new language that will have to be freshly interpreted. Everyone, save the people who'll actually be making the decisions—who refuse to commit themselves—is running around saying "waterboarding is allowable" or "waterboarding is not allowable", but that's silly. In the absence of court review, what's allowable is whatever's done. The legal principle is that "there is no right without a remedy."

(Stories about what's being done will leak, I suppose, as they always do. That could lead to some sort of remedy. I wonder if leaking, receiving, or reporting leaks counts as "material support"?)

The pros apparently don't think torture is effective:

I am absolutely convinced [that] no good intelligence is going to come from abusive practices. I think history tells us that. I think the empirical evidence of the last five years, hard years, tell us that. . . . Moreover, any piece of intelligence which is obtained under duress, through the use of abusive techniques, would be of questionable credibility, and additionally it would do more harm than good when it inevitably became known that abusive practices were used. And we can't afford to go there.

Some of our most significant successes on the battlefield have been -- in fact, I would say all of them, almost categorically all of them, have accrued from expert interrogators using mixtures of authorized humane interrogation practices in clever ways, that you would hope Americans would use them, to push the envelope within the bookends of legal, moral and ethical, now as further refined by this field manual.

We don't need abusive practices in there. Nothing good will come from them.

(From the announcement of the new procedures, above.)

But there's no institutional check on the non-professionals or the rogue professionals. We'll just have to rely on the moral character of everyone involved. That's of course entirely opposed to the American tradition of rule by laws, not men, but <sarcasm> apparently we face a threat more grave than the 45,000 nuclear warheads the Soviet Union had at its peak and a struggle more threatening than World War II, and so cannot afford the traditions that have worked for more than two hundred years </sarcasm>.

We certainly face threats—always have, always will—but I don't see any reason to give in to the "this time it's different" fallacy.

All this matters to me because my parents grew up in Nazi Germany. I grew up knowing that cultures can descend into madness, and that it can happen without the majority ever really explicitly willing it or being really conscious of it. No, I'm not saying that America is just like Nazi Germany; I'm saying that men like my grandfather—not politically involved, just trying to live their lives—somehow, through fear or anger or depression or just passivity, let decency slip out of their grasp.

It also matters because I grew up knowing that the Americans were the good guys. My father (in the German Navy) was captured near Marseille. He didn't mind; he and his fellows didn't fight back. They wanted out of the war, and they wanted to surrender to the safest force: the Americans. Prisoner of war camp (American and French) was no picnic—my father weighed 130 pounds when he got out—and there was abuse, but it was not institutionalized (except in one camp, for a short time). He got what he expected, and he believes he has no cause for complaint.

In contrast, my Uncle Paul was captured by the Soviets on the eastern front. I imagine he fought harder than my father to avoid capture, because everyone knew what happened to Russian prisoners. And it did happen: it was 1950 before he even knew the war was over, and he came home broken for life.

There's practical value to being seen as the good guys, the just guys, the humane guys. That's not just true when fighting Germans; it works in the Middle East, too.

Out of fear or anger or depression or just passivity, we're letting our elected representatives—our employees—reinforce hysteria to no effective end. If that bothers you, here is your Senator's contact information, and here is your Representative's.

Although this bill is being pushed by Republicans, I believe it should not be a partisan issue. The bill does not square with the conservative tradition of Chesterton's gate. It's being rushed through because what being a Republican politician today means is all about winning at domestic politics. (Just as being a Democratic politician appears to be all about not losing.) I can echo the author of this fantastic essay: I miss Republicans. I miss Eisenhower; he'd surprise you.

## Posted at 00:09 in category /misc [permalink] [top]

Wed, 27 Sep 2006

More on boolean expressions

My examples below use a simple rule for deciding what values of a boolean expression to test. I should probably describe it and justify it.

Given an expression with all ands like X1 and X2 and ... and Xn, you use these test values:

  1. One case: all the Xi's are true.

  2. N cases. In each, all the Xi's are true except for one that's false. (A different one every time.) The way I think about it is that, for each Xi, there's an example that shows the whole expression is false exactly and only because of it.

So the table for (A and B and C) would be this:

A and B and C
A B C   expected result
t t t   t
F t t   F
t F t   F
t t F   F

The case for or-expressions is similar: just flip all the trues and falses:

A or B or C
A B C   expected result
f f f   f
T f f   T
f T f   T
f f T   T
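The two rules are mechanical enough that a script can produce the cases. Here's a small Ruby sketch (the method names are mine, just for illustration):

```ruby
# Generate the n+1 test cases for an expression that ands together
# n boolean variables: one all-true case, plus n cases that are
# false because of exactly one variable.
def and_cases(n)
  all_true = [true] * n
  single_false = (0...n).map do |i|
    row = all_true.dup
    row[i] = false
    row
  end
  [all_true] + single_false
end

# The or-rule is the and-rule with every value flipped.
def or_cases(n)
  and_cases(n).map { |row| row.map { |value| !value } }
end
```

So `and_cases(3)` reproduces the four rows of the (A and B and C) table above, in the same order.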

The reasoning behind these rules is based on mutation testing, the name for a long thread of academic research on testing. The way I state it (which is different in an unimportant way from how it's usually put) is that mutation testing involves assuming that the code is incorrect in some definable way, then asking for a test suite that can distinguish the incorrect code you have from the correct code you should have.

Now, for any given program, there are an infinite number of variants, so mutation testing depends on picking a definition-of-incorrectness that (a) lets you generate a reasonably small set of alternatives, but (b) gives you confidence that you've caught all the plausible errors. The usual approach is to assume one-token errors.

For example, suppose you are given (A and B or C). Maybe it should be (A or B and C) or (A and not B or C) or (A and (B or C)).1

One-token errors aren't the only ones you could make. For example, you might completely forget that D ought to be involved in the expression—it should be (A and B and C and D). That's a fault of omission, and mechanical techniques aren't good at them. Nevertheless, one-token errors seem to work pretty well for boolean expressions.

Suppose you have the original (A and B or C) and a variant (not A and B or C). The test value (A=true,B=true,C=true) distinguishes the two, because the given expression yields true while the possibly-more-correct variant would yield false. So, when you run the original program and its variant2, that test case will produce one answer in the original and a different one in the variant. One of them's got to be wrong. If it's the original program, you've found a bug. If it's the variant, you know that variant cannot be the correct program (the original is not incorrect in that way). In the jargon, the mutant is killed.

Trying all the possible combinations of variable values will either find a one-token error or kill all the mutants. But you never have to try all of them. There will be some test inputs that don't add anything: any mutant they kill will be killed by some other test input. So you can construct a minimal set for any given expression.
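That construction can be sketched in a few lines of Ruby. The mutants here are hand-listed, and every name (ORIGINAL, MUTANTS, kills?) is my invention, not part of any mutation-testing tool:

```ruby
# The expression under test and a hand-picked list of its one-token
# mutants (a real mutation tool would generate these).
ORIGINAL = ->(a, b, c) { a && b && c }
MUTANTS = [
  ->(a, b, c) { !a && b && c },
  ->(a, b, c) { a && !b && c },
  ->(a, b, c) { a && b && !c },
  ->(a, b, c) { (a && b) || c },
  ->(a, b, c) { a || (b && c) },
  ->(a, b, c) { a && b },
  ->(a, b, c) { a && c },
  ->(a, b, c) { b && c },
]

# A test input kills a mutant if the original and the mutant
# disagree about its value.
def kills?(input, mutant)
  ORIGINAL.call(*input) != mutant.call(*input)
end

# The four cases the and-rule picks.
RULE_CASES = [
  [true,  true,  true],
  [false, true,  true],
  [true,  false, true],
  [true,  true,  false],
]

def all_mutants_killed?(cases)
  MUTANTS.all? { |mutant| cases.any? { |input| kills?(input, mutant) } }
end
```

With these inputs, `all_mutants_killed?(RULE_CASES)` is true, while the single case (true, true, true) on its own lets several mutants survive.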

If you look at the table below, you can see that the rule for and-expressions I gave above is justified: the cases it produces kill all the mutants. (In the table, the first row is for the expression as given; each row below it is a mutant. An X in a cell means that column's test case kills that mutant.)

              A,B,C =  TTT   TTF   TFT   FTT   TFF   FTF   FFT   FFF
A && B && C            T     f     f     f     f     f     f     f
!A && B && C           f X   f     f     T X   f     f     f     f
A && !B && C           f X   f     T X   f     f     f     f     f
A && B && !C           f X   T X   f     f     f     f     f     f
A && B || C            T     T X   T X   T X   f     f     T X   f
A || B && C            T     T X   T X   T X   T X   f     f     f
A && B                 T     T X   f     f     f     f     f     f
A && C                 T     f     T X   f     f     f     f     f
B && C                 T     f     f     T X   f     f     f     f

Remember all this assumes that tests powerful enough to catch one-token errors will catch more complicated (but still plausible) errors. A way to convince yourself is to try and find a variant of (A and B and C) that won't be caught by these test cases. Ask yourself if it's at all plausible that you'd make such an error. (Remember: we've already conceded faults of omission.)

These rules are easy to memorize. The cases for expressions that mix and and or are not. A long time ago, I wrote a program that generates probably-minimal test sets for any given boolean expression (including relational operators like a<b). Timothy Coulter and Curtis Pettit, students of Cem Kaner, made it more capable and gave it a web UI. Here it is: http://www.oneofthewolves.com/multi/applet.html.

When using the style I described earlier, I don't think you need multi, because I'm tentatively advocating always breaking tables that combine ands and ors into separate tables that do not.

1 I can't remember if the transformations I used when working all this out included substituting one variable for another (like (A and B and A)). Multi, described after this footnote, doesn't. I don't think it would make a difference—certainly it doesn't in this particular example—but I'm not going to bother to check.

2 I'm leaving what it means to "run a program" vague. That gets to the difference of whether the mutation is "weak" or "strong". See this post by Ivan Moore. I didn't find much in the online literature about mutation testing; if you want to know more, you'll have to go to the library. There are some starting references at the end of this paper (PDF).

## Posted at 08:36 in category /fit [permalink] [top]

Sun, 24 Sep 2006

Describing yes/no choices in Fit

Using Fit to describe boolean (yes/no) decisions can be much clearer if you just insist that all decisions be expressed in multiple, uniform, simple tables. That means no boolean expression in the code may mix ands and ors, but that's not a bad idea anyway in this age of small methods and ubiquitous language.

Suppose you're given a jumble of three packs of cards. You are to pick out every red numbered card that's a prime, not rumpled, and is from either the Bicycle pack or the Bingo pack (but not from the Zed pack). Here is a way you could write a test for that using CalculateFixture:

which pack?   color?   prime?   rumpled?     select?
Bicycle       red      3        no           yes
Bingo         red      3        no           yes
Zed           red      3        no           no
Bingo         black    3        no           no
Bingo         red      4        no           no
Bingo         red      Queen    no           no
Bingo         red      Ace      no           no
Bingo         red      3        yes          no
I bet you skimmed over that, reading at most a few lines. The problem is that the detail needed to make an executable test fights with the need to show what's important. This is better:

which pack?   color?   prime?   rumpled?     select?
Bicycle       red      3        no           yes
Bingo         red      3        no           yes
*Zed*         red      3        no           no
Bingo         *black*  3        no           no
Bingo         red      *4*      no           no
Bingo         red      *Queen*  no           no
Bingo         red      *Ace*    no           no
Bingo         red      3        *yes*        no

That highlights what's important: any card must successfully pass a series of checks before it is accepted. This test better matches what you'd do by hand. Suppose the cards were face down. I'd probably first check if it were rumpled. If so, I'd toss it out. Then I'd probably check the back of the card to see if it had one of the right logos, flip it over, check if it's black or a face card (two easy, fast checks), then more laboriously check if it matches one of the prime numbers between 2 and 10 (discarding Aces at that point).

The code would be slightly different because it has different perceptual apparatus, but still pretty much the same:

return false if card.rumpled?
return false if card.maker == 'Zed'
return false if card.color == 'black'
return false unless ['2', '3', '5', '7'].include?(card.value)
return true

It does bug me that the table looks so much more complex than the code it describes. It still contains a lot of words that don't matter to either the programmer or someone trying to understand what the program is to do. How about this?

All the following must be true to accept a card:
description              example          counterexample
the right manufacturer   Bicycle, Bingo   Zed
the right color          red              black
the number is prime      2, 3, 5, 7       4, Ace, Queen, etc.
the card is unrumpled    yes              no

From this, the Fit fixture could generate a complete table of all the given possibilities, run it, and report on it. (Side note: why did I pick Queen as a counterexample instead of Jack or King? Because if the program is storing all cards by number, the Queen will be card 11. Since I'm not going to show all non-primes—believing that to be more trouble than it's worth—I should pick the best non-primes.)
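One way such a fixture might expand the table is to apply the all-and rule to the concrete values: one row where every check passes, plus one row per counterexample. This is only a sketch; CHECKS and generated_rows are hypothetical names, not part of Fit:

```ruby
# Each check has a name, passing example values, and failing
# counterexample values.
CHECKS = [
  ['which pack?', ['Bicycle', 'Bingo'], ['Zed']],
  ['color?',      ['red'],              ['black']],
  ['prime?',      ['2', '3', '5', '7'], ['4', 'Ace', 'Queen']],
  ['rumpled?',    ['no'],               ['yes']],
]

# One row where every check passes (select? == yes), plus one row per
# counterexample where only that check fails (select? == no).
def generated_rows(checks)
  passing = checks.map { |_, examples, _| examples.first }
  rows = [passing + ['yes']]
  checks.each_with_index do |(_, _, counters), i|
    counters.each do |counter|
      row = passing.dup
      row[i] = counter
      rows << (row + ['no'])
    end
  end
  rows
end
```

For the card example that yields seven rows: one accepted card and six rejections, each rejected for exactly one reason.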

The same sort of table could be created for cases where any one of a list of conditions must be true.

Now, many conditions are more complicated than all of or none of or any one of. However, all conditions can be converted into one of those forms. Here's an example.

Suppose you're allowed to pay a bill from an account if it has enough money and either the account or the "account view" allows outbound transfers. That would be code like this:

class Account
  def can_pay?(amount)
    balance >= amount && (self.may_transfer? or view.may_transfer?)
  end
end

However, that could also be written like this:

class Account
  def can_pay?(amount)
    balance >= amount && is_money_source?
  end

  def is_money_source?
    self.may_transfer? or view.may_transfer?
  end
end

I claim that code is just as good or even better. It's better because there's less of a chance of a typo leading to a bug (writing a && b || c instead of a && (b || c)). It's also arguably better because a new word and perhaps idea have been introduced into the project language: "money source". I think finding the right words is often important.

The corresponding tables would be like this:

All of the following are required to pay a bill:
the balance must be sufficient
the account must be a money source

One of the following is required to be a money source:
the account may transfer
the account's view may transfer

In this particular case, I left off the Example and Counterexample columns because they're obvious. I'd expect the fixture to fill them in for me. I didn't include a table about the balance being sufficient because I wouldn't think the programmers would need it, nor would others need it to believe the programmers understand it.

One thing that worries me about this is that the table doesn't rub your nose in combinations. A table of combinations is more likely to force you to discover business rules you'd forgotten about, that you'd never known about, or that no one ever knew about. (Well, it does that for a while - until the tedium makes your mind glaze over.) In a way, this fixture makes things too easy.

On the other hand, there's something to be said for protecting later readers from the process through which you convinced yourself you understood the problem.

I'm tempted to launch into implementing this, but I have other things to work on first.

## Posted at 11:49 in category /fit [permalink] [top]

Thu, 21 Sep 2006

Do good work

I read Tom Wolfe's The Right Stuff a zillion years ago. One passage hit me then, and it's stuck with me. The time is somewhere in the beginning of the Mercury program:

Asking Gus [Grissom] to "just say a few words" was like handing him a knife and asking him to open a main vein. But hundreds of workers are gathered in the main auditorium of the Convair plant to see Gus and the other six, and they're beaming at them, and the Convair brass say a few words and then the astronauts are supposed to say a few words, and all at once Gus realizes it's his turn to say something, and he is petrified. He opens his mouth and out come the words: "Well... do good work!" It's an ironic remark, implying "... because it's my ass that'll be sitting on your freaking rocket." But the workers start cheering like mad. They started cheering as if they had just heard the most moving and inspiring message of their lives: Do good work! After all, it's little Gus's ass on top of our rocket! They stood there for an eternity and cheered their brains out while Gus gazed blankly on them from the Pope's balcony. Not only that, the workers—the workers, not the management but the workers!—had a flag company make up a huge banner, and they strung it up high in the main work bay, and it said: DO GOOD WORK.

That came to mind when I read this abstract:

This paper presents a fully independent security study of a Diebold AccuVote-TS voting machine, including its hardware and software. We obtained the machine from a private party. Analysis of the machine, in light of real election procedures, shows that it is vulnerable to extremely serious attacks. For example, an attacker who gets physical access to a machine or its removable memory card for as little as one minute could install malicious code; malicious code on a machine could steal votes undetectably, modifying all records, logs, and counters to be consistent with the fraudulent vote count it creates. An attacker could also create malicious code that spreads automatically and silently from machine to machine during normal election activities—a voting-machine virus. We have constructed working demonstrations of these attacks in our lab. Mitigating these threats will require changes to the voting machine's hardware and software and the adoption of more rigorous election procedures.

Since this is by no means the first report, I feel safe in saying Diebold is not DOING GOOD WORK.

I wish people who could matter—that especially means you, Fourth Estate—cared. We're all on top of the freaking rocket. (Not just the US, since the size of our military and economy puts much or all of the world on the rocket too.)

I'm sure there are people at Diebold who feel embarrassed or even humiliated by what their company is selling. If any one of them wants to throw caution and good sense to the winds and hang up a DO GOOD WORK banner, I'll buy it for you. Seriously.

## Posted at 20:10 in category /misc [permalink] [top]

Learning from you

I've been invited to the Software Practice Advancement Conference. The idea appeals: expense-paid trip to London, opportunity to rouse the rabble along some lines I'll be previewing here as I have time, and a conference that's said to be good (I've never been). On the other hand, I hate overseas flights because I can't sleep on planes, and Dawn almost certainly can't come with.

Here's what would tip me over the edge. There are lots of people I could learn from in London. If there are teams there who do something really well (making small stories, writing FIT tests, release planning, etc. - anything), I would like to come work with you for several days. Not just visit and watch, but act as much like a team member as I can. Let me know.

P.S. The idea of visiting practice is part of what I want to rouse the rabble to, something that lives in the same space as the MFA for Software, something that's part of my formal discussion of Jim Waldo's OOPSLA essay On System Design, which will be titled something like Surviving in a World of Ever-Looming Malignity: Or, Monasticism for the Married.

UPDATE: Yes, I'm not expecting to be paid for the visits.

## Posted at 19:16 in category /conferences [permalink] [top]

Mon, 11 Sep 2006

Across the chasm

A couple of years ago, at the Agile conference in Calgary, a big topic of discussion was whether Agile was poised to cross the chasm from visionary early adopter types to the early mainstream. This year at Agile2006, it sure seemed to me we had.

If I recall the high-tech adoption curve correctly, a big difference between the Visionary early adopters and the Pragmatist early mainstream is who they talk to. The Visionaries talk to the Technology Enthusiasts to find ways to have big wins. The Pragmatists talk to other Pragmatists, especially ones in the same industry, to find ways to have safe wins.

My main client these days is a good example of a Pragmatist. Before adopting Scrum, they methodically went to visit other companies that had been using Scrum successfully. That's the first time I've seen that.

Agile in the mainstream is definitely a good thing, but every silver lining comes with a cloud. I worry that the clear sunshine of innovation will be obscured by the mists of scale. (Sorry about that...)

If you believe Moore, the mainstream market naturally shakes out into a single dominant "gorilla" and several "chimps" that scrabble for the leavings. He uses Oracle as an example of the gorilla, companies like Sybase as examples of chimps. Or you could think of the relational model in general vs. other ways of organizing and accessing persistent data.

On the one hand, that's good for innovation: the chimps have to find some angle to distinguish themselves from the safer gorilla choice. On the other hand, the innovation is constrained: it can't be too wildly different from the gorilla or else you're no longer in the mainstream market. (The distinction here might be between object databases—never made it in the mainstream—and adding object-ish features to relational databases or just figuring out how to make object-relational mapping work.)

But more important, to me, is a redirection of talent. The gorilla of Agile is Scrum + a selection of XP practices (perhaps most often the more technical ones like continuous integration or TDD). Consultants and consultancies can make more money, grow their practice faster, and have more influence by helping new teams start with Scrum+XP and by taking steps to make Scrum+XP more palatable to large segments of the mainstream market (the later mainstream, what Moore calls Conservatives). People doing that don't have time to do other things.

We saw that at Agile 2006, where the proportion of novices perhaps reached some sort of tipping point that made it more like a conventional conference. That's not a criticism: the Agile Alliance is there to help Agile projects start and Agile teams perform—says so right on the website—and making sure the beginner is served is absolutely necessary to those goals.

So that's all good. But I'm not comfortable unless I've got the feeling that there's something just beyond the horizon poised to surprise me. I'm not usually the one to find it: I'm more of a synthesizer, amplifier, or explainer than an innovator. So I selfishly need people out there searching, not teaching Scrum+XP.

I'm getting a sense that some significant chunk of people are ready for Agile to take a surprising jump forward. See, for example, what Ron Jeffries has recently written. Some part of my next year will be spent in support of that. I have at least one whacky idea, a bit related to the MFA in software.

I'll be poised to spring into action soonish. Just let me get this book done, please let me get it done, without any of the changes in response to reviewer comments introducing a nasty bug.

## Posted at 21:34 in category /agile [permalink] [top]

Tue, 29 Aug 2006

Two conferences

Give a thought to going to the second Continuous Integration and Testing Conference in London on October 6-7. I went to the first one and liked it. I'd go to this one, but I understand you can't take water on planes now and I'm mostly water.

I will be going to the Simple Design and Test conference near Philadelphia (USA), on October 27-29.

And RubyConf is sold out already. Rats. That'll teach me.

## Posted at 09:18 in category /conferences [permalink] [top]

Mon, 28 Aug 2006

Two claims about clean code

Agile depends critically on programmers keeping the code clean. Lots of us know important steps in making code cleaner: remove duplication, rename methods and classes as their purpose changes, be wary of if statements, keep methods composed, move methods that display feature envy, and the like.

I make two claims.

  1. A lot of the craft of being a good programmer is how you sequence those individual steps, how you make them work together.

  2. Standards of cleanliness ought to be situational. For example, consider an application in an extremely fluid domain, one where there's a considerable business advantage to having a code base that's ridiculously flexible, one whose capabilities suggest new features. Contrast it to a purely CRUD app. The first ought to be much more aggressive about naming, I bet, and I wouldn't be surprised to see the programmers favoring embedded domain-specific languages. (As someone once said, "All large systems eventually end up with a Lisp implementation inside them.")

I wonder how I could learn more about that? The best way would be to work with other people on several disparate systems for a long time—which is not in the cards.

## Posted at 11:17 in category /agile [permalink] [top]

Sat, 12 Aug 2006

Over the hump

I have finished the review draft of Scripting for Testers. I am going on holiday.

## Posted at 16:29 in category /testing [permalink] [top]

Fri, 04 Aug 2006

Workshop on Agile performance testing - Exeter, England

Posted at the request of Ross Collard, organizer.

WOPR7 is a workshop on agile performance testing. See www.performance-workshop.org for details, including how to apply.

It will be held in Exeter, England on October 12 - 14, 2006 (Thursday - Saturday). WOPR7 is a peer-level mini-conference, invitation-only, deliberately small and collegial, and free. James Bach will conduct a related one-day tutorial on Wednesday, October 11, also free.

Prior WOPR workshops have uniformly been rated outstanding. WOPR6, for example, was held on the Google campus in California last April, and attracted 120 applicants for the 20 seats available.

Some myths may make people ambivalent about applying to attend WOPR7 --

Myth 1.. Travel to the U.K. is a hassle / we do not have a budget for European travel / the food is bad in England. Actually, Exeter is a delightful, semi-medieval university town. The climate will be great in October. So will the camaraderie.

Myth 2.. "In my organization, we do not do agile performance testing and thus do not have any experience reports (ERs) to share on this topic." In response, I ask questions like: "Well, do you test and compare the performance iteratively?", and "Do you have to be agile and respond adroitly to fast changing conditions, often under tight deadlines?" They always say: "Of course! But we do not rigorously follow XP, Scrum, etc."

Myth 3.. "I would like to go to WOPR7 as a participant but have no ER to submit, so I know I will not be selected." We have room for motivated beginners as well as experts.

Myth 4.. "I would like to go to WOPR7 and I do have a story (an ER), but it could not possibly be selected over ones by Harty, Sabourin, Pearl, Barber, etc.". See comment above on myth #3.

## Posted at 07:43 in category /conferences [permalink] [top]

Mon, 31 Jul 2006

The Gordon Pask Award 2007

Each year at the Agile200X Conference, the Agile Alliance presents the Gordon Pask Award for Contributions to Agile Practice. Here's its description:

The Gordon Pask Award recognizes two people whose recent contributions to Agile Practice demonstrate, in the opinion of the Award Committee, their potential to become leaders of the field. The award comes with a check for US$5000.

Last year's recipients were J. B. Rainsberger and Jim Shore. This year's are:

Laurent Bossavit, for translating Extreme Programming Explained into French, for early and helpful activity on the English-language XP mailing list, for organizing a French-language site, mailing list, and wiki, for XP Day France, for the (incipient) thoughts on his blog, and for his championing of code dojos.

The collaborators Steve Freeman and Nat Pryce for helping found XP Day, for their long-time involvement in the Extreme Tuesday Club, for their joint role in the development, evolution, and popularization of the idea of mock objects and its realization in jMock, and for the networks of collaborations they're involved in (storytelling in Fit and scrapheap programming, for example).

(That "network of collaboration" thing presents a problem. Steve and Nat are extreme examples of a problem the Pask award faces: given the collaborative nature of Agile, any boundary you draw that says "this idea, here, is due to that set of people, there" is bound to leave out contributors. Steve and Nat are far from the only people who've worked on mock objects, and they've both collaborated with other people on other things. Where do you draw the award's line? There'd be some justification for giving it to the whole of London, or at least to the whole Extreme Tuesday Club.

(The committee—Rachel Davies, J.B. Rainsberger, Jim Shore, and me—discussed such matters for two and a half hours, maybe more, one night [causing me to rudely skip dinner with Laurent, for what I hope he now thinks is a good reason]. At times, I found myself thinking that maybe the whole idea was too much trouble. Where I ended up is that we should not avoid doing greater good because we cannot distribute all the credit that's deserved. I hope no one gets upset. Believe me, trying to pick two awards from many possibilities is just no fun at all.)

Our criteria are evolving (and, starting with this second year, they're mainly in the hands of the past recipients). We are looking for people who provide both ideas and actions. We want people who are advancing the state of the practice. But we also want people who are spreading knowledge of the existing state of the practice, so that Agile teams know what more there is to learn. And we also want people who are helping people on a personal level, not just at the abstract level of ideas.

## Posted at 09:31 in category /conferences [permalink] [top]

Sun, 30 Jul 2006

An unhappy trend: a looming humanist/technologist split

This trend is one I had trouble explaining at Agile 2006, so bear with me. (Or skip the whole thing - might be the best use of your time.)

Imagine telling the story of how the bicycle evolved. You could tell it as a story of technology. In it, the bicycle evolved from a crude prototype to today's designs because of improvements in materials technology, a greater understanding of applying human power to spinning wheels, and changing "ecological niches" (from unpaved or poor roads to both roads that allow greater speed and also steep paths ridden purely for recreation).

You wouldn't really include people in that story. Yes, tires got wider because people all of a sudden chose to ride down mountains, but once that niche was chosen, the form of the bicycle can be seen as inevitable. Or you might note that the frames of some bicycles are shaped differently because (first) women riders wore skirts and (later) because of tradition. But, allowing for that, the form of the woman's frame follows function.

In such a story, one of technological determinism, it would be absurd to say that a mountain bike would look different if, say, society's class structure were different.

But there's another kind of story, one of social determinism, where human relations play a driving role. A socially deterministic story of ethernet might point out that squirting packets into the ether, checking for collisions, and possibly resquirting isn't an inevitable design. After all, at one point, token ring networks were a pretty serious contender, and they were much more orderly: you wait until you get a token, then you talk. No collisions allowed. A socially determinist story would point out that ethernet was developed at a deliberately freewheeling, relatively unstructured laboratory, not too many miles from one of the most try-it-and-see-what-happens cities in the world (San Francisco). The story would try to work through how the design of ethernet reflected the overlapping societies of the actual humans participating in its creation.

A true socially determinist story sounds weird to me (and, I suspect, you). After all, surely Ethernet was a better design than token ring: no complexity of worrying about a machine crashing while it has the token, for example. And, therefore, someone would have invented it anyway, and it was just happenstance that they worked at Xerox Palo Alto Research Center.

But we technologists tell pretty weird stories, too. Remember "information wants to be free" and "the Net interprets censorship as damage and routes around it"? Those are pure technology determinism, and they seem at least a tad less plausible today than they did around the time of the Netscape IPO.

As something of an instinctive middle-of-the-roader, stories that combine the human/social and the technological make the most sense to me. Agile is noteworthy for telling such stories. For example, the story of an XP project is not the story of a progression of work artifacts (as many processes are); instead, it's a story that includes people sitting in particular physical configurations and deliberately not replicating the ownership relations of the society around them (when it comes to code and expertise).

But at the same time, XP isn't a story you can tell well without talking about technology. It's not a story of a surgical team or a squad of soldiers: it's a story of working software, changed frequently in behavior-preserving and behavior-adding ways.

So, for example, continuous integration is partly about a social reaction to a shifting technological practice. Suppose you're working alone on a machine. You write code that passes the test that motivated it. You also run a whole bunch of other tests that take a few seconds to run. When one of them fails, that's no big deal, so there are no social pressures to be extra careful to avoid them.

In contrast, failing nightly builds disrupt the project much more, so—often—peer pressure is used to prevent them. (In one company, anyone who broke the build had to keep the Frog of Shame on their monitor for all to see.)

Jeffrey Fredrick's article on continuous integration shows how a particular technology—semi-fast notification of semi-substantial test runs—requires a social contract different from both the super-fast local build and the unbearably-slow nightly build:

CI is different [from a nightly build]. Its builds don't need durable build products to be worthwhile. They are a way for a developer to have a conversation with the system, to get reassurance that he has done his part, at least for now. And with a CI build, the cycle time is short, the number of affected parties small, and the cost of failure low. This change in the cost of failure makes for a significant change in behavior—if you'll let it. I've met people who want CI failures to be a shaming event, similar to what happens when the nightly build breaks. But given the nature of a CI build, does this make sense?

[... A] CI build should be tuned to surface failure feedback as quickly as possible, but this feedback is not a management tool; it's an enabling tool. It allows the developer to take responsibility for each check-in in a way that isn't possible (or at least not cost effective) in the absence of such a system. [...] Tracking the failures caused by each individual would only discourage the behavior of frequent check-ins, which you want to promote.

Fredrick's article demonstrates a nice back-and-forth between the technical and social. It's that integrated story that I worry is slipping away. One way it will happen is for those with a technologist bias (most people on our teams) to vote with their feet. The dominant methodology today is "Scrum plus some of XP." The parts of XP that often seem to get left out of the "some" are the human ones: pairing, shared code ownership. Whatever you may think about the merits of XP's particular practices, they do tend to make it obvious that a team has to form some sort of a social contract. Maybe the habit sticks. Maybe it won't when the team chooses from a buffet of practices, picking the sweet corn of refactoring over the Brussels sprouts of shared code ownership.

Perhaps because I have a technologist bias, I'm more alarmed by social stories that include no technology. These are stories that involve how Placating people interact with Blaming people, or how INTJs interact with ENFPs—but don't involve what they're interacting about. Such models apply as well to a surgical team as to a software team, despite the fact that "crash" has a profoundly different meaning to each of them.

I'm not denying value to pure-technology or pure-social discussions. I just think they're seductively easy. I want more discussions like one that was had in Jeff Grover's and Zhon Johansen's wonderful discovery session at Agile 2006. They began with exercises demonstrating particular human quirks, but the talk afterward seemed to zero in on specific practices.

One exchange sticks out in my memory. There was an exercise about people's personal space. That, in itself, is nothing special (if you already know about it), but I thought the resulting discussion of pairing went in a nice direction. Personal space surely matters in pairing, but someone observed that sitting side by side is different than sitting face-to-face, and that the focus on a shared object (something external to gesture at) allows a smaller personal space. Someone else then noted that personal space is why he so wants chairs with wheels in pairing environments. That way, when people need to have a longer discussion, they can turn toward each other and simultaneously scoot back to maintain comfort. I thought that was cool. It's about social organization of people in a particular physical environment doing a particular task.

Images from the Project for the Scientific and Cultural Aspects of the Bicycle (Amstel Institute), Webopedia.com, The Frog Store, and RoleModel Software.

## Posted at 11:42 in category /agile [permalink] [top]

Wed, 26 Jul 2006

An unhappy trend: leadership

At Agile 2006, I'm seeing or inventing several unhappy trends that I want to call out.

At the first Agile Development Conference (the predecessor conference), I noted with surprise how often the word "trust" came up. At this conference, the surprisingly common word is "leadership." As in: "what's needed to make Agile succeed is executive leadership." Noteworthy: what was once called the Executive Summit is now the Leadership Summit.

As an inveterate champion of the little guy, I've always hated the Great Man theory of business. That's the idea that it all depends on the brilliance and Will of the Jack Welches and Chainsaw Als. I'm seeing that theory accepted as a matter of course in Agile, and it bugs me. It's part of the domestication of Agile: the fitting of something potentially disruptive into the comfortable patterns of life.

Imagine, if you will, the Great Man theory of the Scrum Master: "a team needs the leadership of their Scrum Master to excel." That's the opposite of the truth: the Scrum Master is not a master of the team; she's a master of Scrum: she knows best how the team can use Scrum to succeed. The team leads her, rather than vice versa. As both Mike Cohn and Ken Schwaber have said to me, one of the hardest parts of being a Scrum Master is not leading: keeping your mouth shut and insisting that the team solve their problem rather than depending on someone else to tell them what to do.

I view executive leadership in the same way. We know how to do software better. It's the executive's job to support us in doing that—to clear obstacles out of the way of our practice—and not to lead us. We already know where to go. We know how to do our job. We need to be assisted, not led.

## Posted at 06:01 in category /agile [permalink] [top]

Thu, 20 Jul 2006

People who want to learn Ruby in Cleveland

Someone from the NOSQAA is being relentless about getting me to do something at their annual Quality Expo in Cleveland, Ohio, USA, in early November. (It happens that I have a client in Cleveland these days.)

That ties in with some thoughts about the long-overdue Scripting for Testers book. (Which is getting close, honest!) I'm not a fan of two- or three-day 60-people-in-a-room training courses. Even if there are lots of exercises, most of the course doesn't stick. It doesn't cause the kind of change that I want to cause.

So, when people call me and tell me they want me to train their testers in Ruby, I'm not planning on offering them such a course. Instead, I'm going to pattern my offering on the way I do consulting, which is to fly in for a week per month, sit down with people at computers and do work on their product, repeating the trips until they decide I'm no longer worth the money.

The Ruby variant would go like this: I won't train the testers in Ruby. I wrote a book that's supposed to allow them to self-train. So I want the company and testers to demonstrate that it won't all be a waste of time by working through parts 1 through 3 of the book on their own and starting to apply Ruby to their own projects. I'll come in, once or more, to help them with those projects, make observations, give impromptu mini-courses on topics I think they should know. That will be more expensive and time-consuming than a stand-up course, but it will have a much higher chance of working.

But I can do more, tying Ruby into my normal consulting. Suppose I'm flying to a city once a month anyway. What I'd like to do is organize something akin to a flash mob: a flash user group of testers (and others) who want to learn scripting. They'd learn it on their own, in concert or individually. When I'm in town, we'd have dinners devoted to the topic. At some point, we'd cap it off with a one-day mini-conference on Ruby and testing. I'm envisioning that the morning would be devoted to enticing beginners. Again, I'd downplay the lecture. What I'd want is the members of the existing flash user group to pair up with newbies and show them the Wonders of Ruby. In the afternoon, we'd have advanced topics. Perhaps something like RubyConf would work: have people present how they've used Ruby in their job. That way people would get ideas, hook up with people doing similar things.

Then, having gotten things going, I would ride off into the sunset.

To see if that works, I'd like to do a dry run in Cleveland. The question is whether there's interest. If you're near Cleveland and interested, drop me a line. Forward this URL to people in Cleveland. Let's see if we can get a critical mass going. If so, I'll tell Ms. Persistent-Far-Beyond-the-Call-of-Duty-They're-Lucky-to-Have-Her that she's won me over.

## Posted at 09:03 in category /ruby [permalink] [top]

Wed, 19 Jul 2006

Agile 2006 Topics

Here are things on my mind these days. If you're at Agile 2006, and you have experience to offer, please let me listen to your story.

  • When people start working on business-facing tests, where do they end up? I'd like to hear stories of what you tried, what you persisted in doing, what you slid back from, what you rightfully rejected.

    (Background: I don't have the same knowledge of the steady state of testing as I do of other aspects of the project. For example, if you told me you have standups, but didn't pair, I'd shrug. Not unusual. If you told me your team always programmed in pairs but didn't have standups, I'd be surprised. For business-facing testing, I don't know what to be surprised at and what to shrug at.)

  • Part of what happens in some Agile projects is a transition from this style of testing:

    to this:

    In the first, there's a limited amount of testing, almost exclusively manual, through the same interface the user uses. The transition is to a lot more automated testing of pieces of the system and fewer end-to-end tests (be they automated, manual exploratory, or a combination). This frequently causes concern: how will we be sure the pieces fit together?

    What happened in your project? Did the concern go away? Was it time that did it, or did you find some other way to convince people? What did you do before the concern went away?

  • Tell me the story of your product director. How uncomfortable was she at the start? What were her big worries? How comfortable did she get? How did that happen?

Thanks. I should be easy to spot. I still look roughly like the picture at the top of the page, though with a mustache and dorky goatee now. Something like this:

## Posted at 16:05 in category /conferences [permalink] [top]

Tue, 18 Jul 2006

Refactoring, defined

A while back, I sat in while Ralph Johnson gave a dry run of his ECOOP keynote. Part of it was about refactoring: behavior-preserving transformations. The call was for research on behavior-changing transformations that are like refactorings: well-understood, carefully applied, controlled.

Ralph mentioned that persistent question: what does "behavior-preserving" mean? A refactoring will change memory consumption, probably have a detectable effect on speed, etc.

My reaction to that, as usual, was that a refactoring preserves behavior you care about. Then I thought, well, you should have a test for behavior you care about. ("If it's not tested, it doesn't work.") That, then, is my new definition of refactoring:

A refactoring is a test-preserving transformation.

If you care about performance, a refactoring shouldn't make your performance tests fail.
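To make the definition concrete, here's a tiny Ruby sketch. The class and data are invented for illustration: two versions of the same method, where the transformation counts as a refactoring exactly because the test we care about still passes.

```ruby
# An invented illustration of "test-preserving": Order#total is
# refactored from explicit accumulation to inject.
class Order
  def initialize(prices)
    @prices = prices
  end

  # Before the refactoring: explicit accumulation.
  def total
    sum = 0
    @prices.each { |price| sum += price }
    sum
  end
end

class RefactoredOrder < Order
  # After the refactoring: same observable behavior, new shape.
  def total
    @prices.inject(0) { |sum, price| sum + price }
  end
end

# The test we care about. If this failed -- or if a performance
# test we also cared about failed -- the transformation wouldn't
# count as a refactoring under the definition above.
prices = [300, 200, 50]
raise "not a refactoring" unless Order.new(prices).total == 550
raise "not a refactoring" unless RefactoredOrder.new(prices).total == 550
```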

## Posted at 21:30 in category /coding [permalink] [top]

Tue, 11 Jul 2006

National debt

I stumbled across a bunch of graphs about the US national debt, courtesy Mark Wieczorek. Keeping in mind my suspicion of simplistic use of numbers, one graph is still pretty interesting for someone who grew up, as I did, hearing all about "tax and spend" liberals of the Lyndon Johnson variety. It's below, showing the yearly increase in the debt in constant dollars. I've overlaid color. Republican Presidential administrations are red, Democratic blue. The bars across the top show control or near-control of Congress.

Note: the vertical lines are approximate. If I were truly serious, I'd make some sort of effort to determine if the lines should be shifted to the right (since Presidents don't have an instantaneous effect). However, I'm mainly doing this because I'm stuck on something I'm supposed to be writing, I need a break, and I'm in a hotel room in Cleveland.

The current situation is worse than that graph makes it appear, as the high level of yearly debt under Bush is projected to continue, as shown on the right (non-constant dollars).

The final picture is my family. The small ones get to pay it off—the "bridge to nowhere", an overpriced prescription drug plan, Paris Hilton's tax break on unearned income, sloppy accounting in Iraq, a culture of corruption that's far beyond what Democrats achieved in their days of power, all of it. That's shameful. I expect my children's generation will look at the adults of today and call us lazy, feckless, self-centered, and stupid. With justice.

## Posted at 20:03 in category /misc [permalink] [top]

Sat, 08 Jul 2006

The bloat trochar and the rulebook

Jeffrey Fredrick and Kevin Lawrence liked this, so I'm posting it here.

Background: On the Agile Testing list, someone wrote:

To perform surgery, you always first scrub.

Always always always. One size fits all. No exceptions. Hire a nurse whose only job is to make sure everyone does it.

Someone replied:

This statement ignores context, and its application breeds contempt not only for context but for nurses.

I was in one of those moods, so I wrote this:

I've been talking about scrubbing for surgery with my wife (who both does it, and has a grant proposal out to study something related to it). What strikes me about it is something that's been said here before about testing in Agile projects, but I think needs to be said again.

One thing about scrubbing is there is universal agreement about the goal: minimize the amount of "trash" (bacteria, etc.) that gets into the wound.

Even though, in a non-emergency, you do always always always scrub, I was surprised at how much variation there is. Some people have a rule that you scrub each of four sides of each finger ten times. Some people think you don't have to count; you just have to scrub for ten minutes. Some scrub for five. People scrub with different things. And so on.

Although the rules vary, they are rules, rather than judgment calls. People do not scrub according to today's context. They scrub the way they always scrub, which is likely the way they were taught or the way their colleagues do it. It's not really possible for them to judge context -- there's just too much noise in the causal chain from scrubbing to surgical outcome. That also makes experimental justification of scrubbing techniques hard. Still, if pressed, a surgeon could make an argument for her style in terms of the agreed-on goal.

The other thing that struck me is the degree to which the (rich) world has been constructed around the goal of sterility.

  • Being as sterile as surgeons ought to be is unnatural. Therefore, there are mechanisms in place to make it harder to give into nature. The role of scrub nurse is one such mechanism.

    Obsessive drilling in the rules is another. Because surgeons have to push in the opposite direction from what's Only Natural, strict rules are safer on average. In the way that most people think they're better-than-average drivers, most people think they're better-than-average judges of context. If they err, they'll err on the side of doing what's Natural. So you must either push the pendulum a little further in the other direction or reduce the number of times they feel the need to make a judgment rather than follow the rule.

  • The trappings of the world are built assuming you will not be able to scrub sometimes. Field dressings don't have to be sterilized; they come that way. Ditto for instruments. Ranchers and veterinarians carry bloat trochars. If they come upon a cow with life-threatening bloat, they don't have to stab it with a pocket knife to let the gas out. Instead, they can screw in the bloat trochar. That both releases the gas and brings the stomach flush against the body wall, minimizing the trash that gets out of the stomach into the body, greatly reducing the chance of peritonitis.

    It's really impressive when you think about it: there's this vast mechanism, mobilizing the efforts of thousands upon thousands of people, all with the aim of making minimizing trash something that just happens in the expected course of things. It's like what Whitehead said about notation:

    By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems, and in effect increases the mental power of the race.

Testing in Agile projects:

  • There is not agreement about the goal. This is repeatedly apparent in the sometimes implicit, sometimes explicit subtext of the conversations between Michael Bolton and Ron Jeffries. Michael's emotional center of gravity appears to be about finding unanticipated problems in supposedly finished work (including the work of thinking). Ron's appears to be about smoothly doing the work with confidence that the end result will be as instructed.

  • There are differing assumptions about an actor's role in the world. Michael is context-driven: understand the context and optimize your behavior within the context. Ron is goal-driven: understand the goal and change the world so that the goal is achievable and even "just happens in the expected course of things".

Those are the extremes, of course. I'm sure Michael takes advantage of opportunities to change the context, and I've seen Ron adapt to the context. However, the founding document of the context-driven school (Kaner et al.'s Testing Computer Software) says, right on page vii, in bold italic font, "This book is about doing testing when your coworkers don't, won't, and don't have to follow the rules."

I switched from the context-driven approach to what I saw as a different approach because I saw Agile as making two key shifts with respect to testing:

  • Programmers are ready to follow rules they wouldn't have before because the test-first people found the trick to making those rules be of immediate benefit to programmers.

  • The change-embracing attitude of Agile largely eliminates the emotional distinction between feature and bug (replacing both with "a unit of work"). That, together with collocation and frequent releases, can make "testers are part of the team" a fact rather than a slogan.

If I am right and the debate is really about emotional comfort and personal identity, I don't expect argumentation per se will resolve it. Of the people who talk about idea change in a convincing (to me) way, only Feyerabend gives much of a role to argumentation. His Against Method is (in large part) about how Galileo argued in favor of the Copernican system in Dialogue Concerning the Two Chief World Systems. According to Feyerabend, Galileo cheated. He misrepresented the opponents' arguments, ridiculed their conclusions by surreptitiously substituting his own assumptions for theirs, studiously avoided the weaknesses behind his favored theory, and appealed to his readers' desire to hang with the cool kids.

## Posted at 12:04 in category /agile [permalink] [top]

Tue, 27 Jun 2006

The Agile 2006 Fringe

At an Agile Alliance board meeting, some of us were fretting that Agile 200X might go the way of a lot of conferences: the vast bulk of the attendees would be novices to the field, there would be a fixed set of experienced constant attendees (mostly the presenters), and the middle layers of experience would be missing. The middle layers wouldn't come because so much of the content would be tailored to novices.

There's nothing wrong with novices. More: a conference should cater to novices. However, that middle layer is necessary to advance the field and keep the conference lively and changing.

Jeff Patton said something—I forget what—and I spun his idea into the idea of the Agile Fringe. It's based on the Edinburgh Fringe, which "surrounds" the Edinburgh Arts Festival (and, in fact, dwarfs it). My understanding of the Fringe is that anyone willing to rent space can present anything they want. Fringe events can be more avant-garde than would fit in the regular Festival.

My idea is for an Agile 2006 Fringe. People willing to donate the proportional cost of a room to the Agile Alliance (or do something else that indicates they're serious) can have it for that time to do something of their choosing. They may throw it open to the public—post notices all over the conference—or they may confine it to a secretive cabal of insiders. Whatever they want.

My preference would be for something that involves doing, rather than only talking, since there are Open Space sessions for group discussions. But it'll be your space and your time: whatever you want is fine with me. For example, I could imagine continuing an Open Space discussion with a subset of like-minded participants.

As has been the case all year, I'm too overwhelmed to do an adequate job at any of my wild-eyed (or even staid) volunteer activities. I'm pretty sure we have the room. I don't know the cost yet. I've put little thought into it. It's up to you. If you want to make something of the opportunity, feel free. Contact me to tell me what should happen.

P.S. The Agile 2006 hotel is full. I believe attendance is already well over last year's, and another sell out would surprise no one.

## Posted at 07:31 in category /conferences [permalink] [top]

Operations manager for the Agile Alliance

The Agile Alliance is hiring a part-time operations manager:

POSITION: Operations Manager for a non-profit software professional organization, 15-25 hours per week to start, with the possibility of later becoming a full-time position as the organization grows.

SALARY: 25-50$/hour depending on experience


A position with the Agile Alliance, a non-profit organization that supports individuals and organizations who use Agile approaches to develop software. Initially a half-time position, the duties include directing other paid staff and contractors, coordinating activities of the Agile Alliance, and supporting the Board of Directors. Find more about the Agile Alliance at http://www.agilealliance.org.

More here. Pass it on.

## Posted at 07:31 in category /misc [permalink] [top]

Sun, 25 Jun 2006

RubyConf proposals

David A. Black reminds me to remind you that RubyConf proposals are due June 30. Here's the proposal link: http://proposals.rubygarden.org/.

## Posted at 10:24 in category /ruby [permalink] [top]

Agile with mainframes

I have a client that has many, many mainframes. Every project I might coach involves mainframes to a much greater extent than I've experienced before. I'd like to help the mainframe people with their programming and, especially, testing. If anyone has experience reports for me to read or stories to tell me, please do. I've already ordered Agile Database Techniques and Refactoring Databases.

I will set something up on the topic at Agile2006, both in the Open Space sessions and in the Agile 2006 Fringe (to be explained later).

I'll summarize anything I find out. If you like to write, your experience might fit in either Better Software or Agile Times.

## Posted at 10:13 in category /agile [permalink] [top]

Balancing forces in business-facing tests

This week, I gave seven (!) presentations of a live demo of testing and design in an Agile project. I started with a product director's idea for a story; showed the business-facing tests used to nail down that idea for the programmers; demonstrated how a programmer can use testing to make every step a small, safe, checked one; and ended (in some versions) with a working feature to be demoed and then manually tested (in an exploratory style). The idea was to get across a gut feel for how development feels, plus show some key principles in action.

Here's something that really came into focus as I (at first) kept radically changing the presentation and (later) tweaked it:

  1. A business-facing test describes some facts that should be true of the new feature. (The feature uses this business logic, or appears something like this on the screen, or is used in this workflow.)

  2. The product owner must be able to read the test to check that the team has captured the most important parts of the conversation in which the feature was described. (But note that the document does not end the conversation.)

  3. At any given moment, the test document can be used to ask which of the facts it describes actually are true of the product. (That is, it's executable.) That property lets programmers use it to drive programming.

I expect product directors to read these documents collaboratively, sitting down with at least one programmer or tester. So the product director has to be semi-comfortable with the notation. (I also like it if that notation lends itself to looking at the feature in a different way. For example, a tabular notation for state machine designs encourages you to think through more cases than a node-and-arc notation does. That's also why Fit tests are good for business rules.)

So we want readability by a non-technical audience. However, the need for the documents to be executable pushes the notation in the direction of the product's implementation language.

It's balancing those two forces that's the trick.

There are two other sets of forces to balance:

Fragility vs. comprehensiveness

The more detail there is in the tests, the more fragile they become: a change to a single fact about the program will break many of them, and the breaking of any particular test may tell you nothing new about the program. That's wasteful.

And yet, detail that is not tested may not be gotten right in the first place. If it is right, but then goes wrong, you may well not notice it.

Excess detail seems to cause the most problems in the user interface. Today, my solution is to have the tests describe intermediate results from user-experience design (as I have glancingly learned it, mainly from Jeff Patton). Today's two types are:

  • Wireframe diagrams (as shown on the right). The tests are a textual representation of the picture. I earlier showed an example, though I now believe it has too much technology-specific language and detail. (Note that there's an argument against wireframe diagrams. I interpret it as an argument against making them too early. But they make a good level of detail to hand to a programmer at some point.)

  • Tests that are akin to scenarios, workflows, or semi-detailed use cases. Here's a snippet from one of my latest examples:

        in_sidebar {

    Notice that the test is (almost) exclusively about how a user moves from one place to another and what information she uses in each place. It's deliberately vague about the form that information takes: what does the notification look like? How does Adam jump? (A link? a button?) Those details will come, in part, from the wireframe test, or they will not be written down in a test at all.

    The details not worth writing down are those that (1) are easily communicated when two people—one a programmer—look at a page, point at page elements, and talk about what they should look like; and (2) are highly unlikely to be changed by accident when working on some other part of the program. (See the latter parts of "When should a test be automated?" for a discussion of that last point.)
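As a sketch of the flavor of such a workflow test (every name here is invented for illustration, not taken from my real tests), the test can be a tiny DSL that records movements and information while staying silent about links, buttons, and layout:

```ruby
# Invented sketch: a workflow test records what the user sees and
# where she goes, deliberately saying nothing about widgets.
class WorkflowTest
  attr_reader :steps

  def initialize(&block)
    @steps = []
    instance_eval(&block)
  end

  def sees(information)
    @steps << [:sees, information]
  end

  def jumps_to(place)
    @steps << [:jumps_to, place]
  end
end

flow = WorkflowTest.new do
  sees     "a notification that an audit is overdue"
  jumps_to "the overdue audit's page"
  sees     "which parts of the audit remain undone"
end

# A real checker would walk these recorded steps against the running
# application; here we only show what the DSL captured.
flow.steps.each { |step| p step }
```

Because the steps are data, the same declaration can drive a checker today and survive a redesign of the page tomorrow.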

People vs. process

I sometimes refer to myself as a "recovering abstracter." I used to jump to abstractions way too fast. Now I believe in building them gradually by implementing examples.

Nevertheless, abstractions are important. In many programs, the real value comes from the business logic. Those are abstractions (of what's already worked for the business, I hope). All of my tests above abstract out detail. More importantly, the story of a project's ubiquitous language is one of developing shared abstractions.

But the majority of business people, it seems, are not practiced at thinking in abstractions (at least, our kind of abstractions). Notoriously, they want to see the user interface right away, they want it to be pretty (that is, detailed), and they want to talk in terms of what's on a screen rather than the concepts behind it. Their desire to do that conflicts with our desire to abstract away fragile and confusing detail.

We need to strike a balance. Over time, we need to show them that they can get what they want from us more easily if they tolerate our need to write things down in weird and hard-to-visualize notations. (It worries me that I don't see what we're giving up in exchange.)

## Posted at 10:01 in category /agile [permalink] [top]

Thu, 15 Jun 2006


I'm practicing for a set of five demos I'm doing next week. In each, I'll work through a story all the way from the instant the product director first talks about it, through TDDing the code into existence, and into a bit of exploratory testing of the results. Something interesting happened just now.

Step one of the coding was to add some business logic to make a column in a Fit table pass.

In step two, I worked on two wireframe tests that describe how the sidebar changes. These tests mock out the application layer that sits between the presentation layer and the business logic.

What remained was to change the real application layer so that it uses the new business logic. That, I said (imagining my talk), is so simple that I'm not going to write a unit test for it. Even if I do mess it up (I claimed), I have end-to-end tests that will exercise the app from the servlets down to the database, so those would catch any problem.

You can guess the results. I made the change and ran the whole suite. It passed. Then I started up the app to see if it really worked, and it didn't. The problem is in this teensy bit of untested code:

      def may_add_user?

The problem is that I have an extra s in @current_sesssion. In Ruby, a previously unmentioned instance variable has value nil. It happens that, to the business logic, nil means "no session, so not logged in, so not allowed to create a user."

From this, we can draw two lessons:

  • Maybe all those people who say that even code within a class should go through accessors to get to instance variables are right. Had I done that, the program would have failed—in the end-to-end tests—with a "no such method" error.

  • Hey, one-time program chair of the Pattern Languages of Programs conference, there's this pattern called Null Object...
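The first lesson can be sketched concretely: route the same typo through an accessor instead of a raw instance variable read, and Ruby fails loudly (again, names here are invented for illustration):

```ruby
class AccessorSession
  attr_reader :current_session

  def initialize
    @current_session = :logged_in
  end
end

s = AccessorSession.new
s.current_session        # => :logged_in

# The same extra-"s" typo, as a method call, raises instead of yielding nil:
begin
  s.current_sesssion
rescue NoMethodError => e
  puts "caught #{e.class}"
end
```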

I'm still not inclined to write a unit test.

This is the neatest thing to happen to me today. But nothing like it better happen in the real demo.

## Posted at 20:09 in category /coding [permalink] [top]

Wireframe style for tests

Earlier, I wrote about sentence style tests for rendered pages. Based partly on conversations about that entry with Steve Freeman and partly on bashing against reality, I've changed the style of those tests.

Since they are about the part of the app that the user sees and since I'd like them to be readable by the product director, I found myself asking where they would come from in a business-facing-test-first world and how the product director would therefore think about them. I imagined that, sometime early on, someone makes a sketch, paper prototype, or a wireframe diagram. So I came to think that this test ought to be a textual, automatically-checkable wireframe diagram. Like this:

  def test_structure_without_audits_or_visits
    wireframe_looks_like {
      # (page-layout expectations elided)
    }.given_that {
      # (setup elided)
    }
  end

One interesting thing is that I put the setup for the test after the checking code. That's because the page layout seems more important.

How well does that test describe this page? (The sidebar is described in tests of its own.)

I'll let you be the judge.

## Posted at 11:45 in category /testing [permalink] [top]

Tue, 13 Jun 2006

Agile software development and Glade air freshener (a pet peeve)

It really gripes me when people argue that their particular approach is "agile" because it matches the dictionary definition of the word, that being "characterized by quickness, lightness, and ease of movement; nimble." While I like the word "agile" as a token naming what we do, I was there when it was coined. It was not meant to be an essential definition. It was explicitly conceived of as a marketing term: to be evocative, to be less dismissable than "lightweight" (the previous common term).

Discussing the characteristics of Agile software development by reference to the dictionary is akin to discussing the product characteristics of Glade Air Freshener according to the definition of "glade" as "an open space in a forest". There is some limited use: I can imagine an S.C. Johnson and Son executive objecting to a proposed new scent by saying a user smelling it is more likely to think of a day at the beach - "that briny smell" - than Bambi at the edge of the forest. In the same way, I can imagine someone saying of a development team that since it doesn't respond nimbly to changes in the business environment, it sure doesn't seem to be "agile" from the perspective of those paying the bills.

But it would be unreasonable for our executive to object to chemists adding tri-nitro-benzo-dawnocaine because it's extracted from sea water, not meadow earthworms. By the same token, the nimbleness of the Agile methods from the point of view of the business may be achieved by being inflexible about frequent releases of shippable software. Or the project might insist on a path toward faster feedback (like unit testing) even if that path's short-term costs are higher than some alternative and the long-term benefits of feedback aren't clear in this case.

In a way, context-driven testing may be more agile than Agile testing in that it relies on individual rationality and choice in cases where XP and even Crystal would at least begin by following rules and precedents.

That's why I habitually capitalize the "agile" in Agile testing, etc. It doesn't mean "nimble" any more than Bill Smith means "a metalworker with a hooked blade and a long handle."

## Posted at 08:09 in category /agile [permalink] [top]

Two updates

Richard P. Gabriel is reported to have used the scrapheap metaphor in a 1986 talk about "Used Software".

Robert Chatley and Tom White have been working on sentence style for tests in Java:


## Posted at 07:59 in category /misc [permalink] [top]

Fri, 09 Jun 2006

Sentence style for tests

Steve Freeman and Nat Pryce will have a paper titled "Evolving an Embedded Domain-Specific Language in Java" at OOPSLA. It's about the evolution of jMock from the first version to the current one, which is something of a domain-specific language for testing. It's a good paper.

I've been doing some work recently on an old renderer-presenter project, and I was inspired by the paper to rip out my old tests of a rendered page and replace them with tests in their style. Here's the result. It first has flexmock sentence descriptions of how the renderer uses the presenter. Then come other sentence descriptions of the important parts of the page structure.

  def test_patient_display_page_minimal_structure
    # No audits, no visits, no "add audit" button
    # ... but these do:
    # (sentence-style expectations elided)
  end

I rather like that, today at least. It's much more understandable than my previous tests. After only a few months, I had to go digging to figure them out, but I doubt I'll have to do that for these. Moreover, I think these tests would be more-or-less directly transcribable from a wireframe diagram or sketch of a page on a whiteboard. They're also, with a little practice, reviewable by the product director.

(I'm still very much up in the air about how much automated testing how close to the GUI we should do, but this has nudged my balance toward more automated tests.)

I also remain fond of workflow tests in this style:

  def test_approvals_can_happen_without_logging_in
    # without logging in...
    # (browser-driving steps elided)
    # no place else to go for now.
    # can't show further approvals because not logged in.
    # this will change in some later iteration.
  end

These workflow tests can be derived from interaction design work as easily as Fit tests are. They're less readable than Fit tests, but not impossibly code-like. These workflow tests are end-to-end. They go through HTTP (using my own browser object, rather than Watir or Selenium), into the renderer/presenter layer, down into the business logic, and through Lafcadio into MySQL.

And, finally, I also am starting to write RubyFIT tests in the style that I've heard Jim Shore call "business-facing unit tests":

Adding a Case

Students can add cases on their own, but they must be approved by the clinician.

When students add a case, clinicians get sent email with a clickable link; clicking it takes them to the approval page. Their sidebar also shows that same approval URL.


  user type    allowed?()   approval needed?()
  clinician    yes          no
  student      yes          yes
  admin        no           -
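That table reads almost directly as business logic. A hypothetical Ruby rendering of the rules (the method and symbol names are mine, not the project's):

```ruby
# Hypothetical encoding of the case-addition table above.
def case_addition_rules(user_type)
  case user_type
  when :clinician then { allowed: true,  approval_needed: false }
  when :student   then { allowed: true,  approval_needed: true  }
  when :admin     then { allowed: false, approval_needed: nil }
  end
end

case_addition_rules(:student)  # => {:allowed=>true, :approval_needed=>true}
```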

I feel reasonably comfortable with the way this project is test-driven from (what would be in reality) business-facing sketches on the whiteboard down to individual lines of code. Well, except for the Javascript. That wasn't test-driven.

## Posted at 13:18 in category /testing [permalink] [top]

Thu, 08 Jun 2006

Three ages of programming (update)

Update to the previous article from Andy Schneider.

The term "scrapheap programming" may date back to Noble & Biddle's OOPSLA '02 Postmodern Programming Extravaganza. (The Biddle link is an old one, but the newer one I found had less on it.)

Pete Windle was the coauthor on the paper of Andy's I cited. Sorry, Pete.

## Posted at 11:32 in category /coding [permalink] [top]

Tue, 06 Jun 2006

Three ages of programming

Let's pretend there have been three ages of programming: the Age of the Library, the Age of the Framework, and the Age of the Scrapheap. They correspond to three ages of documentation: the Age of Javadoc, the Age of Javadoc (plus the occasional tutorial), and the Age of Ant.

The first substantial program I ever wrote was a reimplementation of Plato Notes (think USENET news) for the TOPS-10 operating system. To do that, I only had to learn two things: Hedrick's Souped Up Pascal and the operating system's API. I don't remember the documentation for Hedrick's Pascal - probably I mainly used Jensen and Wirth. If you've read most any book defining a programming language, you'd recognize the style. The operating system was documented with a long list of function calls and what they do. Anyone who's seen Javadoc would find it unsurprising—and vice-versa.

This style of documentation says nothing in particular about how to organize your program or how the pieces should fit together. The next Age provides more structure in the form of frameworks. JUnit is a familiar example: you get a bunch of classes that work together but leave some unfilled blanks, and you construct at least a part of your application by filling in those blanks. A framework will usually come with Javadoc (or the equivalent for the framework's language). There's likely to be some sort of tutorial cookbook that shows you how to use it, plus—if you're lucky—a mailing list for users.

The third age is the age of Scrapheap Programming (named after a workshop run by Ivan Moore and Nat Pryce at OOPSLA 2005). In this style, you weave together whole programs and large chunks of whole programs to solve a problem. (See Nat's notes.) The scraps have a sideways influence on structure: unlike frameworks, they are not intended to shape the program that uses them. But they have a larger influence on the structure than the APIs do. APIs still allow the illusion of top-down programming, where you match the solution to the problem and don't worry about the API until you get close to the point where you use it. In Scrapheap programming, it seems you rummage through the scrapheap looking for things that might fit and structure the solution around what you find.

What of documentation? Programming has always benefited from a packrat memory. One of the first things I did in my first Real Job was to read all the Unix manpages, sections 1-8, and just last year I surprised myself by remembering something I'd probably learned in 1981 and never used since. But I'm not so good at learning by using, which seems more important in scrapheap programming than in the previous ages.

There are two parts to that learning. You need to somehow use the world to direct your attention to those tools that will be useful someday: Greasemonkey, Cygwin, Prototype, and the like. Next, you have to play with them efficiently so that you quickly grasp their potential and their drawbacks.

Perhaps what's needed today is not only a Programming Language of the Year club, but a Dump Picking of the Month club. I'm fighting the temptation to start one right now.

There's a variant of dump picking that plays to my strengths. Once last month, I was faced with a problem and I said "Wait - I remember reading that RubyGems does this. I wonder how?" A short search of the source later, and I found some code to copy into my program. Last week I used something Rake does to guide me to a solution to a different problem.

Which raises another issue of skill. I'm halfway good at understanding Ruby code, even at figuring out why a Ruby app isn't working. As I've discovered when looking for a Java app to demonstrate TDD, I'm much worse at dealing with Java apps. When I download one, type 'ant test', and see about 10% of the tests fail (when none should), I don't know the first obvious thing a Java expert would do.

I liken this to patterns. There was a time when the idea of Composite was something you had to figure out instead of just use. There was a time when Null Object was an Aha! idea. As happened with small-scale program design, the tricks of the trade of learning code need to be (1) pulled out of tacit knowledge, (2) written down, (3) learned how to be taught, and (4) turned into a card game. I don't know who's working on that. A couple of sources come to mind: Software Archaeology, by Andy Hunt and Dave Thomas, and Software Archaeology, by Andy Schneider.

## Posted at 21:34 in category /coding [permalink] [top]

Fri, 26 May 2006


OK, I don't care if it roasts my lap, I want a MacBook now. (explanation)

## Posted at 08:27 in category /misc [permalink] [top]

A hunter-gatherer in the organizational environment

The economies of scale that favor large corporations come with diseconomies for many of the people who work within them. It's kind of like agriculture that way.

But large corporations are not closed systems. The customers of the large corporations get the benefit in lower prices (though not without hidden costs).

The people who win out are the economic hunter-gatherers who live on the fringes of the Large. People like me. We get the benefit of economies of scale without paying our share of the price. Sorry about that.

## Posted at 07:12 in category /misc [permalink] [top]

Thu, 25 May 2006

Notes toward integration testing (1)

Any time you write code that sits on top of a third party library, your code will hide some of its behavior, reveal some, and transform some. What are the testing and cost implications?

By "cost implications," I mean this: suppose subsystem USER is 1000 lines of code that makes heavy use of library LIB, and NEW is 1000 lines that doesn't (except for the language's class library, VM, and the operating system). I think we all wish that USER and NEW would cost the same (even though USER presumably delivers much more). However, even if we presume LIB is bug free, we have to test the interactions. How much? Enough so that an equal-cost USER would be 1100 lines of unentangled code? 1500? 2000? It is conceivable that the cost to test interactions might exceed the benefit of using LIB, especially since it's unlikely we're making use of all of its features.

More likely, though, we'll under-test. That's especially true because I've never met anyone with a good handle on what we're testing for. Tell me about a piece of fresh code, and I can rattle off things to worry about: boundary conditions, plausible omissions, special values like nil or zero. I'm much worse at that when it comes to integrated code, and I think I'm far from alone.

The result of uncertain testing is a broken promise. Given test-driven design, bug reports should fall into two categories:

  1. Something that was omitted from any of the driving tests. Most of those can be fairly classified as new or changed requirements. They can be estimated and scheduled in the normal way (presuming they're not so simple to fix that you just do it right away). Such are more like new features than what most people mean by "bug," and seeing them shouldn't be cause for surprise or disappointment.

  2. A real bug. Everyone agrees that, given the tests driving the code, this previously untried example should have worked. But it doesn't. That's a surprise and a disappointment.

The TDD promise is that there should be few type 2 real bugs. But if we don't know how to test the integration of LIB and USER, there will be many of what I call fizzbin bugs: ones where the programmer fixing them discovers that, oh!, when you use LIB on Tuesday, you have to use it slightly differently.

Since fizzbin bugs look the same to the product director or user, greater reuse can lead to a product that feels shaky. It seems to me I've seen this effect in projects that make heavy use of complex frameworks that the programmers don't know well. Everyone's testing as best they can, but end-of-iteration use reveals all kinds of annoyances.

I (at least) need a better way to think about these problems. More later, if I think of anything worth writing.

## Posted at 07:54 in category /testing [permalink] [top]

Wed, 24 May 2006

Another hint for revising

Here's an addition to my earlier hints for revising. What a reader sees as a digression often seems central to an author. To see how important it really is, try removing it. Then ask what text later in the piece has to be changed because of that. If the answer is "not much," you've got a digression.

The trick for an author alone is to tell which paragraphs to check. (After all, the whole problem is she's blind to what the reader sees.) Checking them all would be proportional to the square of the number of paragraphs—ick. All I can think of is to focus attention on changes of topic.

Once you've found a removable paragraph, you can either remove it (probably the safest choice) or make the rest of the piece depend upon it.

## Posted at 19:29 in category /misc [permalink] [top]

Business-facing tests vs. rework

I've been working as product director for a project. As many do, I find that I ought to be spending more time at it than I can. I've written only a few business-facing tests (as examples). Would things have gone better if I'd written more? In some cases, yes. In other cases, no. It's actually worked fine to have the programmer implement his understanding of what I mean, then have me point at the mis-fits and describe tweaks. That's true even though he's remote and I'm doing the describing mainly by email and IM (with some voice).

This is a special case: reimplementation of an existing system, nothing exotic about the domain, etc. etc. What I'd like is a better understanding of when to use each of the following development tactics (and blends between them):

  Tactic A:

  1. Problem description by conversation and business-facing tests.
  2. Problem solution by programmer testing and coding.
  3. Some revision due to product director's reaction to the finished story (including exploratory testing).

  Tactic B:

  1. Problem description solely by conversation.
  2. Problem solution by programmer testing and coding.
  3. More revision due to product director's reaction to the finished story (including exploratory testing).

Note—and I think it's important—that I am assuming a full set of rigorous TDD-style tests. So the issue here has little to do with untested code; it has more to do with tradeoffs between styles of explanation.

## Posted at 13:04 in category /agile [permalink] [top]

Wed, 17 May 2006

A Java app to demonstrate test-driven design

When teaching TDD, what I like best is to work with people on real changes to their own code. Sometimes that doesn't work. There may be logistical problems. The code may have such legacy nature that progress is way too slow to give them any feel for what a day in the life of a test-driven programmer is like (which is a big part of my goal).

When their code doesn't work out, demoing with toy applications is an unsatisfying alternative all around. I'd like to demo with some substantial open source Java application that was built test-first (so is testable). Does anyone have a recommendation? If so mail me.

## Posted at 09:45 in category /agile [permalink] [top]

Sun, 14 May 2006

Gradual descent

I used to teach the occasional class at the University of Illinois. One summer, I taught "CS397BEM: Being Wrong." The idea of the class was that any solution to a problem brings its own problems. The first example I gave in the class was the body's immune system. It's a solution to a problem: bacteria that want to eat us. So the body has neutrophils that eat the bacteria. But there's a problem: neutrophils exude antimicrobial crud. When they swarm to a site of infection as part of inflammation, the crud damages the body. The solution to that problem is to make the neutrophils short-lived. Once the bacteria are eaten, no more neutrophils are attracted and the existing ones die off before too much damage is done. (My resident expert says this explanation is "simplistic, but OK.")

I thought this class was important because we too often solve the problem in front of us, then stop. We don't try, even casually, to predict the accompanying problems. More importantly, we don't attend to the problems when they surface, so we let the inflammation get worse for too long.

Here's a problem I've noticed but ignored: Big Visible Charts lose their effectiveness over time. They cease providing the same pressure to improve or maintain. Part of it is that they become invisible; the eye ignores what it's seen a zillion times before. Another part, I think, is that people are bad at maintaining a level pace. We randomly jitter, sometimes in the worse direction, sometimes in the better. It's always easier to stay worse than to get better, so eventually one jog worseward isn't corrected with a jog betterward. Now you're at a worse level, and the fact that you've tolerated that makes tolerating the next jog worseward easier.

That's by way of explaining why my weight kept creeping up until the scale said 180.2 pounds. It's not just a disgusting lack of willpower: it's a universal law!

At some point in the decline, you need to stop, take serious stock of things, remind yourself of what you're trying to accomplish, adjust yourself, and return to the task with renewed energy. That's what I've done. It's back to the 2 pounds lighter per week regime, which I think is sustainable down below my previous low. Then the trick will be not to let the supposedly steady state get quite so out of hand next time.

The Big Visible Blog did help me take stock, especially once I crossed a multiple-of-ten threshold. Because other people were watching, I eventually couldn't stomach showing the trend without explanation. But an explanation would be too lame unless it were part of a description of a correction. Hence this post.

(All this might be sophistry, though. The fact is that it's been a truly lousy two months in most all of the spheres I care about—my family, my wife's job, the exhibited character of my nation, and parts of my work life. While Clif Builder Bars are not the junkiest of food, I overdo them as comfort food in black times.)

## Posted at 11:56 in category /misc [permalink] [top]

Tue, 25 Apr 2006

Raise my taxes, please

I don't know what the US should do in Iraq. I believe we're morally obligated to make it come out the best we can. I don't know if we're now doing that, if we could do better by changing course, or if our presence irretrievably does more harm than good. Since the Administration is unwilling to be truthful with the citizenry, since the press is unable to travel in Iraq and is in any case broken as an institution, and since I'm certainly not competent to collect and judge the data myself, I expect I won't know for twenty years, if ever.

However, it is wrong for me to sit here, fat and happy, paying no price while Iraqis, the US military, the UK military, and their families suffer. The least I can do is not add to the trouble of others by expecting my children to pay for all this.

Using figures from the US Internal Revenue service and the Congressional Research Service, I figure my family's share of the Iraq+Afghanistan wars to date is around US$2749, and our share of ongoing costs is US$630 per year. Since we are a two-income family, make rather more than the average, and I believe in a progressive income tax, my rough guess is that we should pay a lump sum somewhere between US$5000 and US$7500, and then between US$1000 and US$1500 yearly. I urge the Congress to raise my taxes accordingly.

I'm serious. Who's with me?

Calculations based on 131,301,697 individual returns (2004), US$6.9 billion per month cost of operations and US$361 billion spent to date (October 2005).
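The per-return figures are straightforward division from those numbers; as a quick sketch:

```ruby
# Checking the per-return arithmetic behind the figures above.
returns       = 131_301_697          # individual returns, 2004
spent_to_date = 361_000_000_000.0    # US$361 billion through October 2005
monthly_cost  = 6_900_000_000.0      # US$6.9 billion per month

share_to_date = spent_to_date / returns      # roughly US$2749
share_yearly  = monthly_cost * 12 / returns  # roughly US$630 per year
```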

I believe in a progressive income tax because a dollar is worth less to me than to someone who makes minimum wage.

## Posted at 07:25 in category /misc [permalink] [top]

Fri, 21 Apr 2006

Agile definitions

No doubt because I'm a self-proclaimed skeptic about definitions, I was asked:

[...] can you supply dictionary-like definitions for:

test-driven design

example-driven design

agile methodology


My answer:

test-driven design

A style of program design that begins by writing one simple test, then writing just enough code to pass it. Then another simple test is written, and code is added to pass both it and the previous test. The programmer then looks for opportunities to improve the code by generalizing it, removing duplication, restructuring it, or making it more understandable. The test-code-improve cycle repeats until there are no more tests to be had. It is claimed that a good global design emerges from (1) the need to decouple the code to make tests run fast, and (2) the local heuristic rules for code improvement. The tests are retained and run frequently to prevent unintended effects of changes to the design.

example-driven design

A variant of test-driven design with a particular answer to the question "What test should be next?" The tests are written as if they were a series of examples being used to teach someone how to use the code, beginning with simple cases and moving toward the trickier ones.

agile methodology

A style of software development characterized by its release schedule, attitude toward change, and patterns of communication. (1) The product is developed in iterations, usually one to four weeks long. At the end of each iteration, it has additional, fully implemented business value--not just more code--and is ready to be deployed (although it may not be). (2) The design horizon usually extends only to the end of the current iteration; little code is written in anticipation of future needs. Since the project is seen by the programmers as a stream of unanticipated requirements, the team and product are forced to learn to accommodate change. There is no concept of "requirements churn" and no need for a requirements freeze. (3) Written natural language communication is considered a usually-inefficient compromise. Face-to-face communication is higher bandwidth (but transient). Executable documentation—code and tests—is permanent, less ambiguous, and self-checking (but slower to write and read). Agile projects prefer a combination of the latter two over the first.

## Posted at 17:00 in category /agile [permalink] [top]

How to be a product director

I've finished a first draft of "How to be a Product Director." PDF, 17 pages (with pictures and screen shots and pull quotes and sidebars). Comments welcome.

Some projects call it "the Customer." Others call it "the product owner." Some call it "the goal donor." I and a few others call it the Product Director. Like a film director, the product director is the one person with the clearest vision of the final result. But—like a film director—the role isn't a passive one of "expressing the vision." It's an active role, one of pointing the work of other people in a particular direction, evaluating the results, and adjusting the direction based on the reality of what the last bout of work produced.

It's the hardest job on an Agile project.

The job revolves around four verbs: inform, observe, adjust, and represent.

The product director informs the programmers of what to do next by providing them with stories. [sidebar defining "story"] In order to estimate the cost of a story and implement it successfully, the programmers need:[...]

## Posted at 07:33 in category /agile [permalink] [top]

Wed, 19 Apr 2006

How to be a product director

For a current client, I've decided to write up a short (for me) guide titled "How to be a Product Director." If you have suggestions for a "further reading" section, please send them to me. Thanks.

I will put the finished product up on the web.

## Posted at 07:39 in category /agile [permalink] [top]

Pair programming with users

An interesting story about pairing with users to implement new stories. I found it intriguing because it repeats some of my hobbyhorses from a different perspective (that of an APL programmer). Notice that he gives up on English in favor of examples. Notice also the adaptation of a programming language into a more-ubiquitous cross-domain language. (This is made easier by the close fit APL starts out with.)

The part of the story about moving processing in from a distant site reminds me of what I've started calling a service-oriented development strategy. The idea is to forget about the program as an object but rather think of the project team as providing a repetitive service to the user base. In the company described in the link, someone—probably the product director—would funnel claims to process into the team. One person on the team would be an expert in manually processing claims. She'd be a bottleneck, so she'd enlist other available people—the programmers—to handle the simpler claims. Programmers being lazy, they'd quickly write code to make repetitive parts of the task go away. They'd also (and thereby) learn the domain. As they did, they could handle more complex claims, which would lead to more capable code. Lather, rinse, repeat.

Now, the truly lazy service person won't want to even type information into the program; she'll want the users to do it themselves. That means seducing the users into trading the ease of just forwarding a claim to the team for the benefit of higher throughput. So now—and only now—the programmers have to focus on making the UI usable by normal humans. First, they'll make it good enough for those few technology enthusiasts among the users. Then they'll improve it enough for the pragmatists and even the conservatives. (Thus, the standard high-tech adoption lifecycle is followed within a single project.)

At some point, you run out of claims that the product director thinks should be automated. So you stop and send the programmers off to do something else.

I've not convinced anyone to actually try this. I probably never will.

(Note: This strategy doesn't really match the story, since the users do nothing but claims processing. It's probably better suited to situations where the software is a necessary but non-central part of the job.)

One immediate objection is that this will lead to a lousy, patched-on-after-the-fact UI. For what it's worth, Jeff Patton (Mr. Agile Usability) doesn't think that's necessarily so. In fact, when I talked to him about it, he said that committing to a UI too early can hamper the project if there's not been time yet to incrementally make a decent model of the users' world(s).

## Posted at 07:35 in category /links [permalink] [top]

Tue, 18 Apr 2006

Example-based specifications workshop

Bill Wake and I will be hosting a workshop at Agile2006. Here's the blurb:

Some tests are thinly-disguised walkthroughs of a user interface. Other tests are so voluminous that you suspect that even their author didn't read them. But some tests seem designed to facilitate conversations about the key objects and processes in a domain.

How can we foster the latter style, to create specifications useful to both machines and people, including end users, product owners/customers/product directors, testers, and programmers?

We're going to try to gear this toward advanced practitioners, while recognizing there will probably be a majority of novices in the room. What we'll do first is divide into groups, each with a Customer (preferably someone who's held that role on a real project). That Customer will repeatedly describe a story, and the team will turn it into tests. After that, we'll reconvene, discuss, identify 2-4 areas of interest, break into smaller discussions, address them, and produce a poster summary for each.

Because of the italicized bit above, I'd like you to encourage all your Real Customers to come to the conference. Even if they don't come to our workshop, this conference needs to be a place where Customers can learn from each other.


## Posted at 09:31 in category /conferences [permalink] [top]

Wed, 12 Apr 2006

Tests and specifications

I'm not one to quibble over definitions. If someone points at something that's obviously a cow and says "deer", I usually don't argue the point. While we're arguing about what it is we're about to feed, the poor beast will starve.

Still, it creeps me out when people refer to tests (aka examples) as specifications. There's an important distinction:

A specification describes a correct program, while a test provokes a correct program.

In math geek terms, specifications are universally quantified statements, ones of the form "for all inputs such that <something> is true of them, <something else> is true of the output." Tests are constant statements, ones with no variables.* They look like this: "given input 5, the output is 87."

This matters because, while both kinds of statements can be true or false, the only way to deduce the truth of a universally quantified statement from a set of constant statements is to exhaustively list all possible inputs. That's rarely possible.

To make the point concrete, a set of tests allows the programmer to write this:

if (the input is that of test 1)
   make the output what test 1 expects
elsif (the input is that of test 2)
   make the output what test 2 expects
else
   do something for all remaining cases

Given that code, the tests say absolutely nothing about the correctness of the something that's done for all remaining cases.

Absurd example? An employee of a beltway bandit once told me his project had done exactly that. Proudly told me, no less.

But let's pretend we live in an ethical culture. There, the tests combine with certain habits and memories to provoke particular actions. Consider a programmer faced with two tests:

assert_equal(1, f(1))
assert_equal(4, f(2))

Those tests could be passed with this code:

def f(x)
   case x
   when 1 then 1
   when 2 then 4
   # default?
   end
end

But a programmer who's been raised well has a fastidious distaste for the case statement and its cousin if, a habit of leaping to abstractions, a learned distrust of incomplete enumerations of cases, and a keen nose for any whiff of duplication. So she will want to change that code. She will further have a memory that the whole point of it all is to square numbers (though the person who told her that was kind of vague on what it means to "square" a number). So she will leap to change that code to this:

def f(x)
   x * x
end

The assertions themselves are like two pebbles rolling downhill. Whether they start an avalanche depends on what they roll into: the hill has to be ready. For the test-driven, the avalanche is a procedural assurance that the program computes x * x for all x, not a logical one.

That's why I don't like calling sets of tests a specification. In practical terms, I don't like it because it always, always, always leads to someone making the argument about universal quantification vs. tests or quoting Dijkstra to the effect that "program testing can be used very effectively to show the presence of bugs but never to show their absence." The ensuing discussion is rarely, in my opinion, a good use of time. So what am I doing having it?


* Actually, a test statement can be seen as having variables, being a quantified statement like this:

For all a, b, c, ..., x, y, z: given input 5, the output is 87

where each of the variables is something you hope is irrelevant to the output. The trick is to capture all the relevant variables, pin them down, and feed them into the process.

## Posted at 13:08 in category /agile [permalink] [top]

Sun, 09 Apr 2006

The purpose of continuous integration

At yesterday's successful Continuous Integration and Testing Conference, it occurred to me that the aim of continuous integration is to addict people to particular feelings. When they don't feel them, they'll do things to produce them. Those actions are good ones; they'll solve or head off problems. Those feelings are:

  1. A well-founded feeling that the product is making steady forward progress.

  2. The feeling of being part of a buzz of activity. People are moving fast and making things happen.

  3. A feeling of openness. What people are doing is visible; one person's effect on another is immediately at hand.

## Posted at 15:24 in category /agile [permalink] [top]

Thu, 06 Apr 2006

Extreme Test Makeover

During one of the days of Agile2006, Bill Wake and I will be hosting a set of "extreme test makeovers." Throughout the day, we'll have makeover artists who are experts in unit and acceptance testing, with tools like Fit, JUnit, NUnit, Watir, and more. Some of the people who've expressed interest in helping touch up tests are Brian Button, Ward Cunningham, Janet Gregory, Ron Jeffries, Rick Mugridge, Bret Pettichord, Charlie Poole, and Jim Shore.

The idea is that people will bring their laptop, already loaded with tests that can be run. Best would be tests for real product code; that way, you can go back to work and justify the conference trip by slapping your laptop down on your boss's desk and showing improved tests. Tests for any substantial chunk of code are OK, though.

Did I mention that people should bring tests that can be run? That's really important.

Sessions will be 90 minutes each, with five minutes for expert speechifying at the beginning and ten minutes to record lessons learned and stick them up on the wall.

For at least some of the sessions, we'll provide some way for observers to see what's happening (a projector and a microphone).

We'll have a signup sheet at the conference, but I've also started a mailing list where people with tests (that can be run at the conference) can hook up with makeover artists. It's http://groups.yahoo.com/group/test-makeover. People who want to help out can also announce that there.

There may also be informal sessions after the formal ones.

## Posted at 20:49 in category /conferences [permalink] [top]

The Gordon Pask Award 2006

Each year at the Agile200X Conference, the Agile Alliance presents the Gordon Pask Award for Contributions to Agile Practice. Here's its description:

The Gordon Pask Award recognizes two people whose recent contributions to Agile Practice demonstrate, in the opinion of the Award Committee, their potential to become leaders of the field. The award comes with a check for US$5000.

Last year's recipients were:

J. B. Rainsberger, for spending a great deal of time helping people on the testdrivendevelopment mailing list, for writing JUnit Recipes, for XP Day Toronto, and for being the Agile2005 tutorial chair.

Jim Shore, for his performance as a paper shepherd; for a fine experience report he gave at ADC2003 that, together with his blog, suggest a cast of thought that deserves cultivation; for his work on the Fit specification and the C# version of Fit; and for being a person who holds the Fit world together by doing the sort of organizational and cleanup tasks that are usually thankless.

You can see that we are looking for people who provide both ideas and actions. We want people who are advancing the state of the practice. But we also want people who are spreading knowledge of the existing state of the practice, so that Agile teams know what more there is to learn. And we also want people who are helping people on a personal level, not just at the abstract level of ideas.

Send nominations making the case for a particular person to pask-nominations@agilealliance.org. The deadline for nominations is May 31.

## Posted at 20:14 in category /conferences [permalink] [top]

Thu, 16 Mar 2006

Pushing the envelope

From the Dept. of Negative Externalities: How suspicious does a credit card application have to be before it's rejected? More than this, it turns out.

From the Dept. of Painstaking Even-Handedness: Yes, but the credit card company's explanation that it's not as bad as it seems does have some weight. The author didn't in fact simulate a meth addict rooting through someone's trash.

## Posted at 05:03 in category /misc [permalink] [top]

Wed, 15 Mar 2006

Two formative experiences

Why is it that I so stubbornly believe that code can get more and more malleable over time? — Two early experiences, I think, that left a deep imprint on my soul.

I've earlier told the story of Gould Common Lisp. The short version is that, over a period of one or two years, I wrote thousands of lines of debugging support code for the virtual machine. Most of it was to help me with an immediate task. For example, because we were not very skilled, a part of implementing each bytecode was to snapshot all of virtual memory,* run the bytecode in a unit test, snapshot all of virtual memory again, diff the snapshots, and check that the bytecode changed only what it should have.

The program ended up immensely chatty (to those who knew how to loosen its tongue). There are two questions any parent with more than one child has asked any number of times: "all right, who did it?" and "what on earth was going through your mind to make you think that was a good idea?" The Lisp VM was much better at answering those questions than any human child.

I was only three or so years out of college, still young and impressionable. Because that program had been a pleasure to work with, I came to think that of course that was the way things ought to be.

Later, I worked on a version of the Unix kernel, one that Dave Fields called the Winchester Mystery Kernel, so I became well aware that not all programs were that way. But at the time, I was also a maintainer of GNU Emacs for the Gould PowerNode. With each major release of Emacs, I made whatever changes were required to make it run on the PowerNode, then I fed those changes back to Stallman. Part of what I worked on was unexec.c, which is the code that dumped a version of Emacs into an executable file after a bunch of Lisp libraries had been loaded. As you might imagine, the code was highly machine-dependent. Since Emacs ran on a gazillion different machines, you'd expect that file to turn into a maze of twisty little #ifdefs, all different. But an amazing thing happened: over time, it got cleaner and cleaner and cleaner. Instead of surrendering to complexity, Stallman and company used patches for new machines to help them find the right abstractions to keep things tractable.

That experience also made an impression on me, and probably accounts for a tic of mine, which is to hope that each change will give me an excuse to learn something new about the way the program ought to be.

I've been lucky in the experiences I've had. A big part of my luck was being left alone. No one really cared that much about the Lisp project, so no one really noticed that I was writing a lot of code that satisfied no user requirement. GNU Emacs was something I did on my own time, not as part of my Professional Responsibilities, so no one really noticed that Stallman pushed harder for good code than those people who were paid to push hard for good code.

I'm not sure whether people on the Agile projects of today have it better or worse. On the one hand, ideas like the above are no longer so unusual, so it's easier to find yourself in situations where you're allowed to indulge them. On the other hand, people's actions are much more visible, and they tend to be much more dedicated to meeting deadlines—deadlines that are always looming. I'm wondering these days whether I'm disenchanted with one-week iterations. I believe that the really experienced team can envision a better structure and move toward it in small, safe steps that add not much time to most every story. I'm not good enough to do that. I need time floundering around. To get things right, I need to be unafraid of taking markedly more time on a story than is needed to get it done with well-tested code that's not all that far from what I wish it were (but makes the effort to get there one story bigger and so one story less likely to be spent). It's tough to be unafraid when you're never more than four days from a deadline.

So I think I see teams that are self-inhibiting. When I work with programmers (more so than with testers), I find it difficult to calibrate how much to push. My usual instinct is to come on all enthusiastic and say, "Hey, why don't we merge these six classes into one, or maybe two, because they're so closely related, then see what forces—if any—push new classes out?" But then I realize (a) I'm a pretty rusty programmer, (b) I know their system barely at all, (c) they'll have to clean up any mess we make, not me, and (d) there's an iteration deadline a few days away and a release deadline not too far beyond that. So I don't want to push too hard. But if I don't, someone's paying me an awful lot of money to join the team for a week as a rusty programmer who knows their system barely at all.

It ought to be easier to focus just on testing, but the same thing crops up. There, the usual pattern goes like this: I like to write business-facing tests that use fairly abstract language (nowadays usually implemented in the same language as the system under test). My usual motto is that I want to see few words in the test that aren't directly relevant to its purpose. Quite often, that makes the test a mismatch for the current structure of the system. It's a lot of work to write the utility routines (fixtures) that map from business-speak to implementation-speak. Now, it's an article of faith with me that one or both of two things will probably happen. Either we'll discover that the fixture code is usefully pushed down into the application, or a rejiggering of the application to make the fixtures more straightforward will make for a better design. But..., (a) if I'm wrong, someone else will have to clean up the mess (or, worse, decide to keep those tests around even though they turned out to be a bad idea), and (b) this is going to be a lot of work for a feature that could be done more easily, and (c) those deadlines are looming.

I manage to muddle through, buoyed—as I think many Agile consultants are—by memories of those times when things just clicked.

* The PowerNode only had 32M of virtual (not physical) memory, so snapshotting it was not so big a deal.

## Posted at 18:40 in category /coding [permalink] [top]

At the end of every story

A question to ask every time you finish a story: "What's now easier to do?" Be ever so slightly disappointed unless each story contributes, in some small way, to making the system more malleable.

## Posted at 04:53 in category /agile [permalink] [top]

Tue, 07 Mar 2006

Mail received today

Subject: http://agilemanifesto.org/principles.html

Your Agile Principles are very interesting
In particular:

Simplicity--the art of maximizing the amount of work not done--is essential.

This is called unemployment, isn't it? I specialise in this and I'm glad to see you value the lifestyle.

## Posted at 05:57 in category /agile [permalink] [top]

Mon, 06 Mar 2006

Article on getting started

One of the things that interests me is how a team gets into alignment with its Product Director (sic). The path is often rocky, especially with someone who's never been a Product Director before.

Something I hear a lot of complaints about—and have griped about myself—is Product Directors who are too attached to the UI:

  • They describe stories in terms of the UI they want to see. When the nouns and verbs that make up the business domain are tied to GUI elements, it takes longer for the programmers to understand those words in the way that works best—by encoding them into business logic and fiddling with them in response to new stories.

  • Instead of being happy with a first-draft GUI that's two text fields and a submit button, their stories call for JavaScript input support and validation, nice-looking CSS, and so on. That kind of work adds friction to the project; the necessary early flailing around is slower and harder when each change affects more code.

I think the cause is one part lack of experience, one part fear, one part necessity, and three parts reasons I don't know yet. The Product Directors are not used to talking about programs in abstract terms, in conversations where they're not pointing at a UI element but instead pointing at (say) nodes in business workflows. The fear is that crudeness of interface extends all the way down through the code, so that it represents everything being tackily done, not just the top layer. Will the programmers ever be able to put a good GUI on?

Those are (I claim) bad causes. The good cause is that the Product Director is the project's representative outward to the business. She will be showing the product to lots of people even more likely to judge a book by its cover, a product by its UI. A snazzy UI may be the path of least resistance.

Still, I think we'd be better off if we knew how to make a persuasive case for growing the UI as gradually as we grow the feature list. When the product is halfway to release, it should have half the features and half the UI glitz, not 1/4 of the features and all of the glitz.

If you have a track record at persuading Product Directors to hold off on the GUI, or if you have anything profound to say about any aspect of aligning the Product Director and the team*, I'd like you to write me an article for the princely sum of US$500 and undying fame.

* I don't want to give the impression that I think the Product Director does all the shifting of perspective and the rest of the team does none. It's not a matter of whipping the Product Director into shape. The alignment is a matter of trust going in all directions; I just happen to focus on having the Product Directors trust the programmers and process because lack of that trust makes it harder for me to get example-driven development going.

## Posted at 13:36 in category /misc [permalink] [top]

Tue, 28 Feb 2006

Speaking text aloud

In my hints for revising, I wrote:

Read your text aloud. You don't have to write like you speak, but reading aloud changes your perspective. Awkwardness will jump out at you.

Reading aloud is one way to get some distance, to separate the piece from your memory of writing it. Putting it aside for a day or, better, a week does the same thing. I find that reading a printed copy helps me see things I don't see on a screen. Can you find other tricks? Richard P. Gabriel tells the story of one writer who would tape his work to a wall, go to the other side of the room, and read it through binoculars.

I hardly ever read my text aloud without remembering an incident from my days as an English major. In one class, we had to write a poem. Other people read them aloud. When someone read mine, I discovered that what sounded OK when I read it sounded awful when he did. There were places where I slowed down, sped up, or placed emphasis and he did not. He didn't because there were no cues in the text to tell him to do that. All the cues were in my auditory memory or imagination.

Recently I've been experimenting with having my Powerbook read the text to me (Program -> Services -> Speech). "Vicki's" rather odd intonation helps me find awkwardnesses that I don't otherwise notice. She's not a replacement for my own reading, but I think it's worth listening to her reaction.

## Posted at 09:35 in category /misc [permalink] [top]

Mon, 27 Feb 2006

Congratulations, California

The responsible among you will be using software with "a number of security vulnerabilities [...]. Although the vulnerabilities are serious, they are all easily fixable." A cynical person—not me!— might take "a serious flaw in the key management of the crypto code" which "was openly published two and a half years ago in a famous research paper, and is now known by anyone who follows election security, and can be found through Google"—but is not yet fixed—to suggest that the bugs, including that one, might not be allocated the "few hours [required] to do the whole job" any time soon.

Have no fear, though, since "the security issues are manageable by a reasonably careful combination of short-and long-term approaches." I'm sure that everyone involved is reasonably careful at all important times. Have fun!

## Posted at 14:33 in category /misc [permalink] [top]

Wed, 22 Feb 2006

Table of contents for the "GUI testing tarpit" series

My "working your way out of the GUI testing tarpit" series really ought to be put into a single paper with the rough transitions smoothed over. Until that happens, if ever, what I've got will have to serve. Here's the table of contents.

  1. Three ways of writing the same test: click-specific procedural, abstracted procedural, and declarative. The first two are usually inefficient solutions to their problems. Life is better if you get rid of as many procedural tests as possible. That's what this series is about.

  2. A declarative test might require a lot of test-specific work behind the scenes. To avoid that, build an engine that can deduce paths through the app.

  3. Trying to convert a whole test suite at once is failure-prone. Therefore, convert the suite one failure at a time.

  4. Capturing abstract UI actions behind the scenes doesn't provide much speedup, but it allows a dandy programming, debugging, and testing tool that lets you get to any page in one step.

  5. If you have your tests avoid the network, you'll discover that many tests boil down into assertions about the structure and contents of a single page. There's no reason those can't be fast, targeted, robust unit tests.

  6. But if most tests are about single pages, how do you prevent changes from introducing dead links? The renderer can check links without clicking on them, at unit-test time.

  7. Not everything can be turned into a fast, network-avoiding unit test. Workflow tests remain GUI tests, but they should clearly focus on workflow and not test things better tested elsewhere. Such tests can be an integral part of the design of application flow.

## Posted at 09:35 in category /testing [permalink] [top]

Mon, 20 Feb 2006

I need a break

The Big Visible Chart of my weight at the top of the blog has been red for too long. This has been a stressful month and a lousy, lousy week, with stress coming in from multiple directions and time to exercise coming from nowhere. The added stress of falling short of the two-pounds-per-week goal every week is working against meeting it. It's the wrong kind of feedback. Therefore, from next week until things get better, green will mark any weight below 170, red any weight over 173, and grey the range between. I expect no better than grey.

## Posted at 18:11 in category /misc [permalink] [top]

End-to-end tests and the fear of truthiness

The American Dialect Society voted "truthiness" the 2005 word of the year. It "refers to the quality of preferring concepts or facts one wishes to be true, rather than concepts or facts known to be true." For me, 2006 is turning into the year of replacing end-to-end tests with unit tests. One risk to face is that unit tests can play into truthiness. This picture illustrates the problem:

Everything seems fine here. The tests all pass. What the picture doesn't show is that the Widget tests require the strings from the Wadget to be in ascending alphabetical order. The fake Wadget dutifully does that. The Wadget tests don't express that requirement, so the real Wadget isn't coded to satisfy it. The strings come back in any old order.

Truthiness would be wishing that unit tests add up to a working system. But the truth is that those two units would add up to a system like this:

We know that those sorts of mismatches happen in real life. So we should fear unit tests.

More tests are a proper response to fear. Hence the desire to wrap the entire chain above in an end-to-end test that 'sees what the user sees'. However, such tests tend to be slow, fragile, etc. So I want to replace them with smaller tests or other methods that are fast, robust, etc., thus reducing the need for end-to-end tests to a bare minimum.

Two such methods are:

  1. Value Objects can be the authority over data format. Suppose neither the fake nor real Wadget returned an array of strings but rather an InventoryNames object. Before, both the real and fake Wadget were supposed to know about order. Now only one object needs to. The requirement on Wadget turns into two requirements: the first that it use InventoryNames, and the second that InventoryNames always yields names in the right order.

  2. I earlier described one way for any code that generates <form action="want_new_case_form"...> to check at runtime whether there's any code corresponding to want_new_case_form. Because of that, a unit(ish) test that generates a page automatically checks link validity.

I expect there are a host of other tricks to learn (but I'm not at this moment aware of places where they're written down). What seems to me key is to take the strategy of "something could go wrong somewhere, so here's a kind of test with a chance of stumbling over some wrongness" and replace it with (1) a host of tactics of the form "this could go wrong in places like that, so here's a specific kind of test or coding practice highly likely to prevent such bugs" and (2) a much more limited set of general tests (including especially manual exploratory testing).

P.S. I don't like the word "truthiness." It seems statements should have truthiness, not people. A question for you hepcats out there who are down with the happening slang: which is more copacetic, "that's a truthy statement" or "that's a truthish statement"?

## Posted at 18:11 in category /testing [permalink] [top]

Sat, 18 Feb 2006

Model-Renderer-Presenter: MVP for web apps?

A client and I were talking over how Model-View-Presenter would work for web applications. The sequence diagram to the right (click on it to get a bigger version in a new window) describes a possible interpretation. Since the part that corresponds to a View just converts values into HTML text, I'm going to call it the Renderer instead. The Renderer can be either a template language (Velocity, Plone's ZPT, Sails's Viento) or—my bias—an XML builder like Ruby's Builder.

I did a little Model-Renderer-Presenter spike this week and feel pretty happy with it. I'm wondering who else uses something like what I explain below and what implications it's had for testing. Mail me if you have pointers.

(Prior work: Mike Mason just wrote about MVP on ASP.NET. I understand from Adam Williams that Rails does something similar, albeit using mixins. So far handling the Rails book hasn't caused me to learn it. I may actually have to work through it.)

Here's the communication pattern from the sequence diagram:

  1. After the Action gets some HTTP and does whatever it does to the Model, it creates the appropriate Presenter (there is one for each page) and asks it for the HTML for the page.

  2. The Presenter asks the Renderer for the HTML for the page. The Renderer is the authority for the structure of the page and for any static content (the title, etc.). The Presenter is the authority for any content that depends on the state of the Model.

  3. When the Renderer needs to "fill in a blank", it asks the Presenter. In this case, suppose it's asking for a quantity (like a billing amount).

  4. The Presenter gets that information from the Model.

  5. The Renderer can also ask the Presenter to make a decision. In this case, suppose it asks the Presenter whether it should display an edit button. I decided that the Presenter should either give it back an empty string or the HTML for the button. That works as follows:

  6. First, the Presenter asks the Model for any data it needs to make the decision. Suppose that it decides the button should be displayed. But it's not an authority over what HTML should look like, so it...

  7. ... asks the Renderer for that button's HTML. As part of rendering that button, the Renderer needs to fill in the name of the Action the button should invoke. It could just fill in a constant value, but I want the program to check—at the time the page is displayed—whether that constant value actually corresponds to a real action. That way, any bad links will be detected whenever the page is rendered, not when the link is followed. Since there are unit tests that render each page, there will be no need for slow browser or HTTP tests to find bad links. Therefore...

  8. ... the Renderer asks the Presenter for the name to fill in.

  9. The Presenter is the dispenser of Action names. Before giving the name to the Renderer the Presenter asks the Action (layer) whether the name is valid. The Action will blow up if not. (It would make as much sense—maybe more—to have the Action be the authority over its name but this happened to be most convenient for the program I started the spike with.)

What good is this? Classical Model-View-Presenter is about making the View a thin holder of whatever controls the windowing system provides. It does little besides route messages from the window system to the Presenter and vice versa. That lets you mock out the View so that Presenter tests don't have to interact with the real controls, which are usually a pain.

There's no call for that in a web app. The Renderer doesn't interact with a windowing framework; it just builds HTML, which is easy to work with. However, the separation does give us four objects (Action, Model, Renderer, and Presenter) that:

  1. can be created through test-driven design,

  2. can be tested independently of each other,

  3. and can be tested in a way that doesn't require many end-to-end tests to give confidence that almost all of the plausible bugs have been avoided.

The second picture gives a hint of the kinds of checks and tests that make sense here. (Click for the larger version. Safari users note that sometimes the JPG renders as garbage for me. A Shift-Reload has always fixed it.)

More later, unless I find that someone else has already described this in detail.

## Posted at 10:31 in category /testing [permalink] [top]

Thu, 16 Feb 2006

Programmers as testers, again

Over at the Agile-Testing list, there's another outbreak of a popular question: are testers needed on Agile projects? To weary oldtimers, that debate is something like the flu: perennial, sneakily different each time it appears so that you can't resolve it once and be done with it, something you just have to live with.

After skimming the latest set of messages on the topic, I returned to editing a magazine article and then I had a thought that might just possibly add something.

Editors are supposed to represent readers (and others), just as testers are supposed to represent users (and others). To an even greater extent than testers, editors do exactly what the represented people do: they read the article. And yet, you can't take J. Random Reader and expect her to be a good editor. Why not?

It seems to me that as readers we're trained to make allowances for writers. We're so good at tolerating weak reasoning, shaky construction, and muddled language that a given reader will notice only a fraction of the problems in a manuscript. A good editor will notice most of them. How?

Some of it is what "do we need testers?" discussions obsessively circle: perspective. Editors didn't write the manuscript (usually...), so their view of what it says is not as clouded by knowledge of what it should have said. Editors also do not have their ego involved in the product.

But that perspective is shared by any old reader. What makes editors special is, first, technique. I put those techniques into two rough categories:

  • Model-building techniques. Esther Derby has described cutting a manuscript into pieces and rearranging them on the floor into something like an affinity map. She created a content model she could use to explore structure. In testing, James Bach and Michael Bolton teach many model-building techniques under the umbrella name Rapid Testing. Programmers who learned from them would do better (modulo perspective effects).

  • Something I don't know how to name. Call them attentiveness techniques. It sometimes happens that I have a niggling feeling of something wrong with an article. It's easy to ignore those. But if you have techniques to explore them, you're more likely to. For example, sometimes I find it useful to use the literary technique of deconstruction to figure out an article's implicit assumptions or contradictions.

    It seems to me the exploratory testing world would be the place to look for something that could be adapted to programmers. I'm somewhat disconnected from the leading edge of that world these days, so I don't know how explicitly this topic has been taken up. (If you're looking for something to read, Mike Kelly has quite a list of books that have influenced exploratory testers.)

But there's something else that editors and testers have that programmers don't have: leisure. When I'm acting as a pure reader, I intend to get through it and out the other side quickly. As an editor, there's no guilt if I linger. There's guilt if I don't. One problem that Agile projects have is a lack of slack time, down time, bench time. There's velocity to maintain—improve—and the end of the iteration looms. Agile projects are learning projects, true, but the learning is in the context of producing small chunks of business value. There's no leisure for the focus to drift from that. (I'm using "leisure" rather than "permission" because so much of the pressure is self-generated.)

My hunch is that perspective is less important than technique and leisure for producing good products. If the testing and programming roles are to move closer together (which I would like to see), the real wizards of testing technique need to collaborate with programmers to adapt the techniques to a programmer's life. (I tried to do that a few years ago. It was a disaster, cost me two friendships. Someone else's turn.) And projects need some way to introduce leisure. (Gold cards?)

## Posted at 06:56 in category /testing [permalink] [top]

Wed, 15 Feb 2006

Continuous integration and testing conference

A message from Paul Julius and Jeffrey Fredrick about a conference:

Jeffrey Fredrick and Paul Julius are cohosting an event that will focus on [continuous integration and testing]. The event will use Open Spaces to structure conversation, understanding and innovation.

What: Open Space event discussing all aspects of CI and Testing, together
Where: Chicago, IL
When: April 7 & 8, 2006
Who: Everyone interested in CI and Testing
Cost: Free
Info: http://www.citconf.com

We'll be inviting people from all manner of projects and places. In fact, feel free to pass this invitation along to anyone that you think will be interested.

For us to finalize the details of time and place we need to get a feel for how many people are likely to attend. If you are interested in attending, please join the CITCON mailing list at:


and post an introductory message. In your message it would be useful if you could indicate any topics of special interest and also how likely you are to attend.

## Posted at 07:53 in category /testing [permalink] [top]

George Washington on warrantless surveillance

Author's note: I know that at most five people want to read my thoughts on the traditional separation of powers. It's just that public discourse in the US is so broken, unserious, and partisan that I sometimes get this image of my college-age children, ten years hence, asking what I did about it. And then I write something so that I can tell them then that back in 2006 I commanded the tide to stop. Take heart, though. Coming up are a few postings on model-view-presenter, web applications, and the testing implications.

So here's the way I understand it.

Congress established a court and a law, FISA, governing the wiretapping of foreign intelligence agents. That court has rarely denied a warrant, though they've modified some larger number. Warrantless surveillance is allowed for fifteen days (after declaration of war), three days (to gather evidence to be used for a warrant application), or one year (but only of foreign nationals).

The Bush administration has a program which admittedly collects communications of US citizens without a FISA warrant. This program is justified in several ways:

  • The three-day period is too short.

  • The authorization to use military force that followed 9/11 is a statute, the FISA law explicitly says it applies "except as authorized by statute", and the statute acted to trigger that exception. Therefore, judicial oversight is not needed. (Some of the lawmakers have denied that the law was intended to modify FISA in this way, pointing to explicit provisions that were rejected, but the Administration's argument is that the law says what it says.)

  • The President's constitutional role as Commander in Chief overrules, in this case, the Congress's constitutional role of making the laws the Executive executes. Therefore, congressional authorization is not needed.

  • Other presidents exercised the same power. (There is much argument about how true that is.)

My opinion on all this is in accord with Ronald Reagan's "trust, but verify" (1989) and George Washington's Farewell Address (1796), in which he said:

It is important, likewise, that the habits of thinking in a free country should inspire caution in those entrusted with its administration, to confine themselves within their respective constitutional spheres, avoiding in the exercise of the powers of one department to encroach upon another. The spirit of encroachment tends to consolidate the powers of all the departments in one, and thus to create, whatever the form of government, a real despotism. A just estimate of that love of power, and proneness to abuse it, which predominates in the human heart, is sufficient to satisfy us of the truth of this position. The necessity of reciprocal checks in the exercise of political power, by dividing and distributing it into different depositaries, and constituting each the guardian of the public weal against invasions by the others, has been evinced by experiments ancient and modern; some of them in our country and under our own eyes. To preserve them must be as necessary as to institute them. If, in the opinion of the people, the distribution or modification of the constitutional powers be in any particular wrong, let it be corrected by an amendment in the way which the Constitution designates. But let there be no change by usurpation; for though this, in one instance, may be the instrument of good, it is the customary weapon by which free governments are destroyed. The precedent must always greatly overbalance in permanent evil any partial or transient benefit, which the use can at any time yield.

The Administration has steadfastly refused to describe limitations on its powers. When signing new laws (such as the recent torture ban), the President has expressly reserved the right to bypass them because of his commander-in-chief power. Other presidents have used "signing statements" in the same way, but this one uses them far more often. (As far as I know, the legal force of signing statements has yet to be decided.)

Further, to my knowledge, the Administration has not proposed any bills to remedy the claimed defects in FISA. (In the searching that led to all these links, I found claims that the Republican majority had offered such, but were rebuffed. I didn't find primary sources, though.) By going the legislative route, they would involve all three branches of government in these important decisions.

The Executive branch is not showing the caution that Washington called for. It's unconservative, since conservatism—if it is to mean anything—ought to mean a healthy distrust of messing with what works in hope of something better. I'm a strange mixture of conservative and what (in the US) is called liberal. But when it comes to the American presumption that you need a system designed to work despite being run by knaves and scoundrels, not because it's run by wise men, I'm conservative.

Until the Administration demonstrates that they are being cautious about encroaching on the other branches (by, say, giving examples of potential wartime powers they do not claim), or argues publicly that they require more powers, the people of the US should urge their representatives in Congress to push back against any appearance of usurpation.

P.S. I know from Google searching that many people can not distinguish an argument about separation of powers from a desire to leave Al Qaeda's phones untapped. So just let me say that I don't have enough information to have an opinion about the specific surveillance in question.

## Posted at 07:52 in category /misc [permalink] [top]

Tue, 07 Feb 2006


Update: I forgot that Kevin Rutherford also suggested the word "director". Great minds, etc.

Mark Smeltzer has come up with an alternative to Appraisers: Product Directors. I like that. It puts the focus on the product, not on managing the team. It connotes movement and responsiveness. Unlike the common metaphor of driving projects, it doesn't imply that other people are passive passengers. Instead, they're active participants in a joint project. If the word makes you think of a movie director, it also brings to mind that person charged with having the clearest idea of the end product during production. It also ought to have connotations of balancing features and cost, of producing the most you can within a given budget. Sometimes it does, though directors are notorious for the opposite.

We also talked a bit about "ScrumMaster" and what might be a more business-friendly term. Mark points out:

Making an analogy to the film industry, the AD (Assistant Director) role embodies many of the ideas and responsibilities associated with ScrumMasters. In the end, that may be what I go with: Assistant Product Director.

One thing I learned from the smattering of email: if told to link a name to one property of the role, different people have very different ideas of what that one property should be. Trying to pick a name within the project might lead to a useful discussion (reminiscent of Gause and Weinberg's heuristic for naming projects in Exploring Requirements). Or, in the wrong hands, it might lead to the most tedious and pointless discussion possible.

## Posted at 04:16 in category /agile [permalink] [top]

Sun, 05 Feb 2006


I'm pleased to have been part of the inspiration for bellygraph.com. As you can see above, my own cruder bellygraph has not recovered from the holidays. Too much to do + more travel + winter + general stress = more eating & less exercise. Humbug.

## Posted at 12:32 in category /misc [permalink] [top]

Two projects I will never do...

... so someone else should.

  • An oral history of the influence of Smalltalk on the Agile methods.

  • A workshop exploring how lessons from improvisatory theater might apply to Agile teams. Invite Michael Bolton, Chad Fowler, Jonathan Kohl, and John D. Mitchell. (I owe three of these four people mail, which gives you some idea of why I'll never do these projects.)

## Posted at 12:32 in category /misc [permalink] [top]

Tue, 31 Jan 2006

PNSQC Call for Papers

The Pacific Northwest Software Quality Conference is one of my favorite conferences. I think it usually runs about 200 people, so it's small enough to meet people. As a regional conference always in the same place (Portland, OR, USA), there's a continuity of attendees that allows some papers to be less introductory than in other conferences.

They tell me:

This year's theme is: "Quality - A Competitive Advantage". If you know of someone doing work worthy of sharing with other interested listeners, please share this with them. PNSQC has been an ardent supporter of both new and experienced speakers and would like to continue to do so.

All that is needed to submit to the conference is a short abstract that gives us enough information to be able to determine the fit of the paper to our conference. Not much work is required. You can submit the abstract at http://pnsqc-m.org/?q=node/174.

Deadline is March 31.

## Posted at 21:21 in category /conferences [permalink] [top]

OOPSLA Essays 2006

Last year, I was track chair for the OOPSLA essays track. This year, thankfully, it's Richard P. Gabriel, who will be more successful than I was. I'm on the committee. A quote from the track page:

An essay is a rigorously peer-reviewed reflection on technology, its relation to human endeavors, or its philosophical, sociological, psychological, historical, or anthropological underpinnings. An essay can be an exploration of technology, its impacts, or the circumstances of its creation; it can present a personal view of what is, explore a terrain, or lead the reader in an act of discovery; it can be a philosophical digression or a deep analysis.

What makes for a successful essay? At its best, an essay is a clear and compelling piece of writing that enacts or reveals the process of understanding or exploring a topic important to the OOPSLA community. It may or may not have a conclusion, but it must provide some insight or argument. A successful essay shows a keen mind coming to grips with a tough or intriguing problem; as Virginia Woolf wrote, "it explains much and tells much." [from the preface of "Memoirs of a Working Woman's Guild"].

The idea of essays is one of those oddities that have made OOPSLA so interesting and productive over the years. You should submit. By March 18.

## Posted at 06:01 in category /oopsla [permalink] [top]


I'm tired of having to write "Customers (product owners, business experts, etc.)" when talking about the particular project role XP calls "customer" (or "Customer," in a largely fruitless effort to short-circuit the association with someone buying something in a store).

We don't have this problem with "programmer" or "tester", so what's up with that other role? Maybe it's that its name is not based on a verb. It's kind of clear what the central activity of a programmer or tester is—to program or to test—but what is it that a Customer does? Customate? A product owner presumably owns, but "to own" is a pretty passive concept.

Maybe things would be clearer if (a) the noun we used for the Customer role was linked to a verb, and (b) that verb had something to do with the central activity of a Customer (product owner, etc.).

And what is that central activity? I think it's to determine the value of a particular proposed change. The verb that comes to my mind is "appraise." So the role would be named Appraiser. Here's a definition:

1: one who estimates officially the worth or value or quality of things
2: one who determines authenticity (as of works of art) or who guarantees validity

I like the word "officially," which hints at the making of a final judgment. I also like "authenticity" and "validity." They have connotations of determining whether something is real or not. In software, the Appraiser determines whether something that could become real should become real.

The only active-verb-based alternative in semi-common use is Goal Donor. I think it's inferior to Appraiser because it's about what that role does from the perspective of a programmer. From the perspective of the business, the judging of value is more important than the giving of goals.

Therefore, unless I get a better suggestion by February 15, 2006, on that date all references to "Customer" in XP books or "Product Owner" in Scrum books will retroactively change to "Appraiser," in exactly the same way that "test-driven" became "example-driven" in late 2003.

## Posted at 06:01 in category /agile [permalink] [top]

Sun, 29 Jan 2006

Agile, five years later

It's just shy of five years since the Agile Manifesto was written. I've often said that I dread the day when I look back on the me of five years ago without finding his naivete and misconceptions faintly ridiculous. When that day comes, I'll know I've become an impediment to progress.

So what about the me of 2001? I do find him a bit ridiculous, though not enough for comfort. During a shortish plane ride, I came up with this list of what I didn't know then:

Tools are important. I'm flying back from working a week at a Delphi shop. Doing... anything... in... Delphi... is... just... so... tedious... that... it... makes... you... want... to... scream. I think it no coincidence that so many of the Agile Manifesto authors had past experience with Smalltalk (or, in my case, Lisp). That kind of background makes it easier to think of software as something you could readily change. I don't think Agile would have taken off without semi-flexible languages like Java and the fast machines to run them.

Moreover, each new tool—JUnit, Cruise Control, refactoring IDEs, FIT—makes it easier for more people to go the Agile route. Without them, Agile would be a niche approach available only to the ridiculously determined.

People get stuck. What I seem to see often is a team making a big leap. They become more productive, they become happier, the business becomes happier with them. Then they plateau. Now, I know from my weightlifting days that plateaus are a part of growth, but it seems surprisingly hard to make the next leap.

Sometimes I find other Agile consultants surprisingly wistful. The projects they're working with are doing better than they ever did before, but somehow they're not making it to that peak experience the consultant remembers.

The customer role is far harder than I'd anticipated. Five years ago, I wouldn't have said the customer role is the hardest on the project. Now I say it all the time. I also greatly underestimated how central the role is. Sometimes I tell people that I think of good Agile teams as like a compass with the magnetic pole being the customer. You can divert their actions away from the customer, but they'll always push to orient themselves that way. It's an unusually personal relationship.

Testers aren't translators. My image—only half conscious—was of the tester taking business-speak and translating it into tests for the programmers to pass. Now I think of the tester as much more someone who makes nudges that encourage and streamline direct conversations. The translation out of business speak should happen in the code.

Making business-facing tests is difficult and subtle. I pretty much thought I knew how to write "black box" tests, and that the tester's job would be to write those same tests, just earlier and based on much more intensive conversation with the customer. But the tests I advocate today are quite different than the ones I remember thinking about back then, and I'm still coming up with what appear to be important twists.

The interaction between testing and design complicates things. Five years ago, I viewed "test infected" programmers as an uncomplicated good. Programmers, I said, were so enthusiastic about testing that they'd willingly add the hooks testers have always wanted. I'm now thinking it's more complicated. Test-first unit testing leads to small-scale changes in design. Test-first large-scale testing seems to require similar changes in architecture. (See my recent interminable series for hints along those lines.)

Back then, I thought of testers as getting technical stories added to the mix. A tester could do tests of type X much more easily if the programmers did Y, makes the business case to the customer, who can decide to add a story to do Y. Or I thought of testers as writing particular stories in a particular format. When the programmers made those tests pass, the usual rules about minimizing duplication, etc. would cause the architecture to emerge naturally.

I now think that the interaction between tests and architecture will require much closer and sustained conversation than that (will be much less of a waterfall)—unless we're content to rest on a plateau.

Exploratory testing isn't an obvious fit. Back then, I was very taken with how the exploratory coding you see in Agile shops feels like exploratory testing. At a workshop I organized, Michael Feathers also remarked on that. I still think there's a strong connection, and I still talk to teams about exploratory testing, but it remains an obscure practice. When done, it seems still to be mostly about bugs, not—as I used to say—about exploring the business domain and design space. I wish I knew why.

## Posted at 08:27 in category /agile [permalink] [top]

Sat, 28 Jan 2006

Working your way out of the automated GUI testing tarpit (part 7)

part 1, part 2, part 3, part 4, part 5, part 6

Where do we stand?

  • I've prototyped a strategy for gradual transformation of slow, fragile, hard-to-read tests into fast tests that use no unnecessary words and are therefore less fragile.

  • Every page has tests of its layout. They may be comprehensive or not, depending on the needs of the project.

  • Sometimes, parts of pages are generated dynamically according to "presentation business rules". For example, a rule might govern whether a particular button appears. The page tests can (should) include tests of each of those rules.

  • The page tests are not mocked up. That is, the page renderer is not invoked with an artificially constructed app state. Instead, the app state is constructed through a series of actions on the app.

  • Nevertheless, the tests are declarative in the sense that they do not describe how a user of the app would navigate to the page in question. Instead, the test figures navigation out for itself.

  • As the app grows a presentation layer, the page tests can run at unit test speeds by calling directly into it.

  • There are no tests that check for dead links by following them. Instead, the output-generator and input-handler cooperate so that any attempt to generate a bad link will lead to an immediate failure. Therefore, dead links can be discovered by rendering all parts of all pages. The page tests do that, so they are also link-checking tests.

I want to end this series by closing one important gap. We know that links go somewhere, but we don't know that they go to the right place, the place where the user can continue her task.

We could test that each link destination is as expected. But if following links is all about doing tasks, good link tests follow links along a path that demonstrates how a user would do her work. They are workflow tests or use-case tests. They are, in fact, the kind of design tests that Jeff Patton and I thought would be a communication tool between user experience designers and programmers. (At this point, you should wonder about hammers and nails.)

Here's a workflow test that shows a doctor entering a new case into the system.

 def test_normal_new_patient_workflow

I've written that with unusual messages, formatted oddly. Why?

Unlike this test, I think my final declarative tests really are unit tests. According to my definition, unit tests are ones written in the language of the implementation rather than the language of the business. My declarative tests are about what appears on a page and when, not about cases and cows and audits. They're unit tests, so I don't mind that they look geeky.

Workflow tests, however, are quintessential business-facing tests: they're all about asserting that the app allows a doctor to perform a key business task. So I'm trying to write them such that, punctuational peculiarities aside, they're sentences someone calling the support desk might speak. I do that not so much because I expect a user to look at them as because I want my perspective while writing them to be outward-focused. That way, I'll stumble across more design omissions. Sending all messages to an object representing a persona (dr_dawn) also helps me look outward from the code.

Similarly, I'm using layout to emphasize what's most important. That's what the user can do and what, having done that, she can now do next. The actual checks that the action has landed her on the right page are less important—parenthetical—so I place them to the side. (Note also the nod to behavior-driven design.)
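To make that shape concrete, here's a sketch in that style. The persona object, the action names, and the page checks are guesses based on the description above, and a tiny fake app stands in for the real browser-driving code:

```ruby
# A hedged sketch of a workflow test. FakeApp and the action bodies are
# stand-ins; in the real suite the persona's actions would drive the browser.
class FakeApp
  attr_accessor :page_title
end

class Persona
  def initialize(app)
    @app = app
  end

  # Each action navigates the app the way a user would (stubbed here).
  def logs_in
    @app.page_title = "Cases Available to Dmorin"
  end

  def adds_a_case
    @app.page_title = "Case 112"
  end
end

class WorkflowSketch
  attr_reader :app

  def initialize
    @app = FakeApp.new
    @dr_dawn = Persona.new(@app)
  end

  # Layout emphasizes what the user does; the parenthetical checks
  # that she landed on the right page sit off to the side.
  def test_normal_new_patient_workflow
    @dr_dawn.logs_in;          (should_be_on_the_main_page)
    @dr_dawn.adds_a_case;      (should_be_on_the_case_display_page)
  end

  # The checks stay light; detailed checking belongs to the
  # declarative page tests.
  def should_be_on_the_main_page
    raise "wrong page" unless @app.page_title =~ /Cases Available to Dmorin/
  end

  def should_be_on_the_case_display_page
    raise "wrong page" unless @app.page_title =~ /^Case /
  end
end

WorkflowSketch.new.test_normal_new_patient_workflow
```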

The methods that move around (like adds_a_case) talk to the browser in the same way that the earlier abstracted procedural tests do, and the parenthetical comments turn into assertions:

   def should_be_on_the_main_page
      assert_page_title_matches(/Cases Available to Dmorin/)
   end

As you can see, I don't check much about the page. I leave that to the declarative page tests.

That's it. I believe I have a strategy for transforming a tarpit of UI tests into (1) a small number of workflow tests that still go through the UI and (2) a larger number of unit tests of everything else.

Thanks for reading this far (supposing anyone has).

What's missing?

The tests I was transforming didn't do any checking of pure business logic, but in real life they probably would. They could be rewritten in the same way, though I'd prefer to have at least some such tests go below the presentation layer.

There are no browser compatibility tests. If the compatibility testing strategy is to run all the UI tests against different browsers, the transformation I advocate might well weaken it.

There are no tests of the Back button. Should they be part of workflow tests? Specialized? I don't know enough about how a well-behaved program deals with Back to speculate just now. (Hat tip to Seaside here (PDF).)

Can you do all this?

The transformation into unit tests depends on there being one place that receives HTTP requests (Webrick). Since Webrick is initialized in a single place, it's easy to find all the places that need to be changed to add a test-support feature. The same was true on the outgoing side, since there was a single renderer to make XHTML. So this isn't the holy grail—a test improvement strategy that can work with any old product code. Legacy desktop applications that have GUI code scattered everywhere are still going to be a mess.

See the code for complete details.

## Posted at 07:56 in category /testing [permalink] [top]

Sun, 22 Jan 2006

The right way to put Unicode on the pasteboard

The previous solution to my copy-Unicode problem turns out not to work for non-Unicode characters, at least not for the sort of screwy characters testers like to paste into apps. So I had to solve it right. I put the solution here in hopes that it'll be found in a web search someday and save someone some time.

require 'osx/cocoa'

# Cocoa's constant for the UTF-8 string encoding.
UTF8_ENCODING = OSX::NSUTF8StringEncoding

# Example: unicopy %w{ 03b4 03d4 03a6 }
def unicopy(hex_string_array)
  copy_with_encoding(utf8(hex_string_array), UTF8_ENCODING)
end

# Utilities

def utf8(hex_string_array)
  number_array = hex_string_array.collect do | hex_name |
    hex_name.hex             # "03b4" => 0x3b4
  end
  number_array.pack("U*")    # code points to a UTF-8 byte string
end

def copy_with_encoding(string, encoding)
  data = OSX::NSData.dataWithRubyString(string)
  ns_string = OSX::NSString.alloc.initWithData(data, :encoding, encoding)
  pb = OSX::NSPasteboard.generalPasteboard
  pb.declareTypes(["NSStringPboardType"], :owner, nil)
  pb.setString(ns_string, :forType, "NSStringPboardType")
end

For the Windows version and for copying non-Unicode, look here: http://www.exampler.com/testing-com/review-copies/test-strings-0.1.zip. That's an alpha version of a collection of utility methods oriented toward helping testers mess with text fields. They're inspired by James Bach and Danny Faught's perlclip. They work on both the Mac and Windows. The source will eventually live on the Scripting for Testers site.

## Posted at 21:52 in category /ruby [permalink] [top]

Tue, 17 Jan 2006

Hack for unicode to the pasteboard

See, just explaining the problem and sleeping on it makes the solution wave to attract your attention:

One way to put unicode on the Mac OS X pasteboard is to use FUJIMOTO Hisakuni's rubyaeosa to execute Applescript.

require 'osx/aeosa'

OSX.execute_applescript(%Q{
   set the clipboard to \xc7data utf8CEA3CEA6\xc8
})


(The hex characters are Mac-Roman "chevrons" that vaguely look like ‹‹ and ››. Applescript doesn't use 7-bit ASCII. The glop after "utf8" is sigma and phi in the Greek alphabet.)

I could dig further into rubyaeosa to find a Ruby message send equivalent to "set the clipboard", but maybe that's a bad idea. This is an example for Scripting for Testers, and I think the message of getting the job done with baling wire and twine and moving on is a good one.

Now on to Windows...

## Posted at 08:56 in category /ruby [permalink] [top]

Mon, 16 Jan 2006

The origins of camelcase

Speaking of screwy character encodings, I have a theory about the origins of theAbominationThatIsCamelCase.

In the late 70's, I was a computer operator for a PDP-10. We had spiffy VT100-compatible terminals. But there was this odd CRT off in a corner that we referred to as "the European terminal". On it, the character code that we know and love as ASCII underscore displayed as a left arrow. I remember being disconcerted by a program that used underscore for assignment, and I was told that the language (whatever it was) assumed European terminals. More to the point, I think I remember being told that the_camel_case_naming_style was used either because it would look silly to have names like a←variable←name or because such names would be syntax errors in the Mystery Language.

I have since then assumed that this once-necessary convention stuck in people's heads after it became unnecessary or even harmful, like that pop song you loved when you were 14 or the English units of measure. (I'll spare you any pop economics about path dependence.)

(This story is similar to Wikipedia's Alto Keyboard Origin, though it would seem to put the origin closer to ASCII-63 (which had the left arrow and no underscore), ASCII-67 (which might have perpetuated the arrow), or the early ECMA standards (ditto).)

## Posted at 21:46 in category /misc [permalink] [top]

Unicode to the pasteboard

I am blissfully ignorant of Unicode.

Nevertheless, I want to write a Ruby script that puts Unicode characters (the Greek alphabet, say) onto the Mac OS X pasteboard. It has to be pure Ruby (no writing in C). 7-bit ASCII I can do, and 8-bit Mac-Roman, both using pbcopy. However, I can't see a way to do Unicode.

Please let me know if I'm wrong.

I don't care about the encoding the Ruby code works with. UTF-8, UTF-16, Punycode, whatever.
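The encoding itself isn't the obstacle—pure Ruby can produce UTF-8 bytes with Array#pack's "U*" directive. The problem is telling the pasteboard what those bytes are:

```ruby
# Capital sigma (U+03A3) and phi (U+03A6), encoded as a UTF-8 byte string.
sigma_phi = [0x03a3, 0x03a6].pack("U*")   # UTF-8 bytes: 0xCE 0xA3 0xCE 0xA6
```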

P.S. Interesting how much more understandable the Wikipedia pages on Unicode are than the official site is.

P.P.S. Seeming bug in TextEdit on 10.4.3: if I create a file full of Greek characters and save it as UTF-16, I can open it and see the same characters. If I save it as UTF-8, when I reopen it, it looks like it's full of Mac-Roman characters.

## Posted at 18:55 in category /ruby [permalink] [top]

Tue, 10 Jan 2006

Life is a meaningless void of gibbering chaos whose soundtrack is the thin monotonous whine of accursed flutes

I'm working on code in which a particular object's startdate property can be either a DateTime or a datetime. One is not a subclass of the other. No duck typing here: one exposes the year through a year() method; the other through a year field. (This is Python, so it matters.)

It's reasonable to suppose I have fewer than 400,000 hours left in my life, and I spent one of them finding that out.

Who knew HP's source material was code?

## Posted at 11:44 in category /junk [permalink] [top]

Mon, 09 Jan 2006

Working your way out of the automated GUI testing tarpit (part 6)

part 1, part 2, part 3, part 4, part 5

Here, I dispose of another reason to run tests through the GUI: bad links and other ways of getting to pages. These bugs can be found with unit tests instead. The mechanism fits in well with business-facing test-driven design.

Let's start with a bug. In build 343, an Activity Summary page is added to the app. Links to that page are added to thirteen other pages. In build 582, someone changes the URL of the Activity Summary page and dutifully changes twelve of the thirteen pages that link to it. It's a user who finds that the thirteenth link wasn't updated.

A link-checking program won't find all such bugs because it probably can't get to all the pages of the program. So, the claim is, you should have a GUI testing tool traverse every link. Here, I'll change the sample app to show a better way.

Because I was frightened by DTML as a small child, I lean away from template languages with embedded code and toward code that generates XHTML. (We can argue the merits of the two approaches another day.)

My Renderer class is nothing fancy. A bunch of core methods generate simple XHTML. From them, I've built up more complicated methods, such as the ones used here:

   def case_display_page
      case_record = @app.current_record
      page("Case #{case_record.clinic_id}",
               p("Owner: #{case_record.client}"),
               ...)
   end

Now suppose I want to add a help link to that page, using a method called help_link_for(topic). Here's a simple implementation of that method:

   def help_link_for(topic)
      %Q{<a href="javascript:standard_popup('help?topic=#{topic}')">Help</a>}
   end

The method generates a link to a javascript popup, but I think it should also check that the topic exists, like this:

   def help_link_for(topic)
      assert(@app.has_help_for?(topic), "Creating link to nonexistent help topic #{topic}.")
      %Q{<a href="javascript:standard_popup('help?topic=#{topic}')">Help</a>}
   end

has_help_for? checks that the topic exists, using the same mechanism that the help action uses to find the help to display. Therefore, you do not need to follow the link to discover that it's bad, you merely need to generate it. Which means generating the page that contains it. Which we already do with a fast renderer unit test:

   def test_typical_case_display_page
      given_app_with {
         case_record('clinic_id' => 19600219)
      }.when_rendering(:case_display_page) {
         assert_page_title_matches(/^Case 19600219/)
         ...
      }
   end

The test doesn't explicitly check the help link, but it doesn't have to: the renderer assertion will nevertheless check it for us. Here's what will happen if the link is bad:

   2) Error:
StandardError: Programmer error. Creating link to nonexistent page "bogus_page". Please report this error to bugs@example.com.
      ./util.rb:3:in `assert'
      ./renderer.rb:106:in `help_link_for'
      ./renderer.rb:82:in `case_display_page'

(Note: I later added an explicit assertion that the help link exists because I consider it an essential part of the page. The implicit check only fails if the link exists but is bad; the explicit assertion fails if it doesn't exist at all.)
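To make the generate-to-check idea concrete, here's a minimal, runnable sketch. The in-memory help store, the `raise` in place of the blog's `assert` helper, and the standalone methods are all stand-ins for this example.

```ruby
# Hypothetical in-memory help store; the real app would use whatever
# mechanism the help action itself uses to find help text.
HELP_TOPICS = { 'audits' => 'Audits explained...' }

def has_help_for?(topic)
  HELP_TOPICS.has_key?(topic)
end

# Generating the link checks the topic -- no browser, no link traversal.
def help_link_for(topic)
  raise "Creating link to nonexistent help topic #{topic}." unless has_help_for?(topic)
  %Q{<a href="javascript:standard_popup('help?topic=#{topic}')">Help</a>}
end

good = help_link_for('audits')                        # renders the link
bad  = (help_link_for('bogus') rescue $!.message)     # fails at generation time
```

Any unit test that renders a page containing the link exercises this check for free.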

The link-creation routine checks that the particular help topic exists, but it doesn't check that "help" is the right action to get to the help pages. It's easy to ask whether the app responds to an action named help: @app.respond_to?('help'). So I could add another assertion to help_link_for, but I'd like to handle the risk of an incomplete renaming in a different way. To get there, let me start a seeming digression and fix that long-standing bug in our program (that it prompts you with a button to add an audit even when no more audits are allowed).

Here's the code that adds the button to the page:

   def add_audit_button
      ...
                                    submit('Add an Audit Record')))
   end

The renderer could ask the app before generating the add_audit form, like this:

   def add_audit_button
      return unless @app.further_audits_allowed?
      ...
                                    submit('Add an Audit Record')))
   end

And, since I'm changing the method anyway, I might as well have it make sure that want_add_audit_form is an action the app responds to:

   def add_audit_button
      assert(@app.respond_to?('want_add_audit_form'), ...)
      return unless @app.further_audits_allowed?
      ...
                                    submit('Add an Audit Record')))
   end

But that's starting to bug me. I'm asking the App more and more, not telling it. Is this Feature Envy? Do I want to worry that other methods that generate this action will have to duplicate the knowledge of which checks are appropriate?

It seems to me that the renderer should hand a potential presentation to the app and ask it to apply whatever rules are relevant, but in a way that insulates the app from any knowledge of the presentation (that it'll be in XHTML, etc.). That can be done using a closure as a callback:

   def add_audit_button
      @app.fill(:template_for_want_add_audit_form) { | action |
         ...
                                          submit('Add an Audit Record')))
      }
   end

The App would look like this:

   def fill(template_name, *args, &block)
      self.send("fill_#{template_name}", *args, &block)
   end

   def fill_template_for_want_add_audit_form(&block)
      return unless current_record.accepts_more_audits?
      block.call(checked('want_add_audit_form'))
   end

   def checked(action_name)
      assert(respond_to?(action_name),
             "#{action_name} is not a defined action.")
      action_name
   end

fill bounces the work off to a particular method. That checks whether the action is allowed by the business presentation rules. If not, it returns nil (which renders as nothing). Otherwise, it passes the correct action name to the closure (after checking that no one's renamed it out from under us) and lets that closure render away.

(Note: the renderer could call fill_template_for_want_add_audit_form directly—the same knowledge is required—but this form seemed more convenient for unit tests.)
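Pulling the excerpts together, here's a hedged, self-contained sketch of the fill-and-callback arrangement. The record hash, the action whitelist, and the raise-based check are stand-ins for this example; only fill, checked, and the template name come from the excerpts above.

```ruby
class App
  KNOWN_ACTIONS = %w{want_add_audit_form help}

  def initialize(record)
    @record = record
  end

  # Dispatch to the method that judges this particular template.
  def fill(template_name, *args, &block)
    send("fill_#{template_name}", *args, &block)
  end

  # The business rule lives here, insulated from any XHTML knowledge.
  # Returning nil renders as nothing.
  def fill_template_for_want_add_audit_form(&block)
    return unless @record[:accepts_more_audits]
    block.call(checked('want_add_audit_form'))
  end

  private

  # Catch actions renamed out from under the renderer at generation time.
  def checked(action_name)
    raise "#{action_name} is not a defined action." unless
      KNOWN_ACTIONS.include?(action_name)
    action_name
  end
end

# Renderer side: hand the potential presentation over as a closure.
def add_audit_button(app)
  app.fill(:template_for_want_add_audit_form) { |action|
    %Q{<form action="#{action}"><input type="submit" value="Add an Audit Record"/></form>}
  }
end

button  = add_audit_button(App.new(:accepts_more_audits => true))   # the form
nothing = add_audit_button(App.new(:accepts_more_audits => false))  # nil
```

The renderer never asks the app a series of questions; it hands over a closure and lets the app decide whether to call it.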

This division of responsibility works well with test-driven design.

  1. The customer says "You shouldn't be able to add any audits if the last audit was nominal." After discussion, everyone agrees the story is to leave the "Add Audit" button off the Case Display page and update that page's help with an explanation.

  2. There's an existing test that checks everything important about the Case Display page. (It's test_typical_case_display_page, above.) A new test is written that claims the Add Audit button is missing when the last audit is nominal. Like test_typical_case_display_page, it avoids fiddly details of XHTML structure.

  3. Making the test pass is going to require some new business logic. That leads to three unit tests describing how fill responds when client code asks it to fill in a template_for_want_add_audit_form:

    • if there are no audits, the template is filled in (with the right action name),
    • if there's an audit with nominal variance, nil is returned instead of the filled-in template, and
    • if there are n audit records (none nominal), the template is filled in.

    None of these tests refer to text at all, much less XHTML text.

  4. Those tests are made to pass.

  5. The original test should now pass. If it doesn't, that means the renderer doesn't call the app to judge the template. How is this possible, since it's supposed to always use this mechanism to get the action name? Bad renderer! But easily fixed.

  6. The story's not done until the Customer sees the new version of the Case Display page, probably by walking through the workflow of creating a nominal audit and then observing that there's no button to create another. That might lead to tweaks of the presentation, especially those aspects not important enough to be described in a test.

  7. If the Customer wants, the same business rule can be used to check incoming actions. (Just because we don't provide a form to let people add to nominal audits doesn't mean that someone couldn't send the appropriate HTTP anyway.)
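As a sketch of step 3, here are the three unit tests run against a stand-in App that implements the nominal-audit rule. Everything except fill and the template name is hypothetical, and, as promised, no test mentions XHTML.

```ruby
# Stand-in App implementing the story's rule: no more audits once the
# last audit's variance is nominal.
class App
  def initialize(audit_variances)
    @audit_variances = audit_variances
  end

  def fill(template_name, *args, &block)
    send("fill_#{template_name}", *args, &block)
  end

  def fill_template_for_want_add_audit_form(&block)
    return nil if @audit_variances.last == :nominal
    block.call('want_add_audit_form')
  end
end

filled = lambda { |app| app.fill(:template_for_want_add_audit_form) { |action| action } }

# 1. No audits: the template is filled in with the right action name.
fail unless filled.call(App.new([])) == 'want_add_audit_form'

# 2. Last audit nominal: nil comes back instead of a filled-in template.
fail unless filled.call(App.new([:high, :nominal])).nil?

# 3. Several audits, none nominal: the template is filled in.
fail unless filled.call(App.new([:high, :low])) == 'want_add_audit_form'
```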

(As usual, I should note that I have not seen these ideas applied at the scale of a real app. If I ever have time to create a Giant Microbes fan site for my kids, I'll explore them further.)

At long last returning to the help popup, I can change the code that generates the link to this:

   def help_link_for(topic)
      @app.fill(:template_for_help_link, topic) { | action |
            %Q{<a href="javascript:standard_popup('#{action}?topic=#{topic}')">Help</a>}
      }
   end

The App code that would rule on the template would be:

   def fill_template_for_help_link(topic, &block)
      assert(has_help_for?(topic),
                "Creating link to nonexistent help topic '#{topic}'.")
      block.call(checked('help'))
   end

Any unit test that generated a help link would auto-check for a bad action or bad topic. It would not check whether the javascript standard_popup routine pops up a window, pops up a reasonably-sized window, pops it up somewhere not annoying, etc. That could be tested with JsUnit, Watir, or Selenium. Personally, I'd just test it by hand and trust myself to retest it if I change it.

One final note: we are still working our way out of the tarpit. I haven't stressed it in this installment, but both of the old-format tests continue to work. As always, the goal is to gradually reduce the need for slow and fragile tests.

See the code for complete details.

## Posted at 11:51 in category /testing [permalink] [top]

Thu, 05 Jan 2006

Donate miles to families of injured troops

It really gripes me when people like me are accused of not supporting troops overseas when in fact it's the government's poor planning and execution of post-war reconstruction that we don't support. (Not to mention the stinginess when it comes to troop and veteran benefits.) So when I heard about a program to donate frequent-flier miles to benefit the troops and their families, I did. Unfortunately, of the three airlines I fly, the only one still accepting donations (Northwest) is the one I had the fewest miles on. Apparently everyone else heard about this long ago. If you haven't, now you have, and I urge you to donate. Get a jump on next Christmas's charitable rush.

## Posted at 08:59 in category /misc [permalink] [top]


When I finally upgraded to Mac OS X Tiger, my old Emacs broke again. I hunted around for a replacement, tried a couple, and settled on Aquamacs. It has a few glitches, but it not only works like Emacs should, it also does a surprisingly decent job of acting Maclike. Some things I like:

  • It has the expected Emacs keystrokes, but the Apple key can also be used for normal Mac actions. Apple-O opens a new file, Apple-W closes the current buffer, etc. (You can use the Option key as Emacs's Meta modifier, but I've never been able to undo the 20-year-old hardwiring that has me typing ESC.)

  • By default, marking a region highlights it. As in other Mac apps, typing then replaces the highlighted region. I thought I would hate that, but I actually like it better than the old behavior (which is still available). But you must know that CTL-G undoes the highlighting!

  • Emacs kill (CTL-W), yank (CTL-Y), and related actions are independent of the system clipboard, which you get to with Apple-C, Apple-V, etc. I was surprised by how well that fits with some of my common workflows.

It's good enough that I dropped a donation on its author.

## Posted at 08:58 in category /mac [permalink] [top]
