Archive for January, 2010

Attitude of the development team to the product

Back in the old days, most of my Agile consulting involved coming into well-performing Agile teams that wanted to talk about improving their testing. So I got to see a lot of seasoned teams. What struck me most, and made me happiest, was their stance toward their product and the product owner. Proud. Enthusiastic. Engaged.

Dave Hoover of Obtiva has been working on a product called Mad Mimi for some time. At my request, he’s collected some of his tweets about it. He exemplifies that attitude I so enjoy. To pique your interest, here are some samples:

Wow, @madmimi has had a great 24 hours. Seems like a tipping point is coming soon. Really exciting to witness this stuff first-hand.
Thu Jul 10 17:34:40 +0000 2008

Sheesh, I’m really impressed by the emails that people are putting together with @madmimi:
Thu Nov 20 13:47:40 +0000 2008

@ByENNEN: Your wish is @madmimi’s command. Have a look now. :)
Tue Dec 30 20:14:24 +0000 2008

In awe of @madmimi today. Went from $100k/year in revenue to $200k/year in revenue in just 2 months. Zero dollars spent on advertising, FTW!
Wed Jan 14 22:04:50 +0000 2009

Fell asleep at 9 last night in the living room “tent” with the boys. Up at 4 this morning. Couldn’t stop myself from hacking on @madmimi.
Mon Feb 16 13:31:44 +0000 2009

@Kellypaull Yes, I use @madmimi. I also built it. :)
Mon Apr 20 05:29:20 +0000 2009

Twitter Driven Development: 13 days ago @misterpowell complained about @mailchimp’s RSS-to-Email. Today, @madmimi released its RSS-to-Email!
Wed Jun 03 21:48:36 +0000 2009

I’m so biased nowadays. Every newsletter I receive via Constant Contact makes my stomach turn. Hard not to reply with @madmimi evangelism.
Mon Jun 22 21:34:06 +0000 2009

This song about @madmimi just made my day
Wed Jul 08 19:54:48 +0000 2009

Can’t sleep after reading @mfeathers‘ blog post and checking the latest @madmimi subscription numbers…
Thu Sep 10 05:10:12 +0000 2009

Here’s the whole list.

Up in the Air

I saw “Up in the Air” last night. While it’s a well-done movie, albeit entirely predictable, the world it portrays is a lot more disturbing than it seems at first. To explain, I recommend John Holbo’s close reading of David Frum’s Dead Right. Here’s the most relevant part:

[Frum writes] “The great, overwhelming fact of a capitalist economy is risk. Everyone is at constant risk of the loss of his job, or of the destruction of his business by a competitor, or of the crash of his investment portfolio. Risk makes people circumspect. It disciplines them and teaches them self-control. Without a safety net, people won’t try to vault across the big top. Social security, student loans, and other government programs make it far less catastrophic than it used to be for middle-class people to dissolve their families. Without welfare and food stamps, poor people would cling harder to working-class respectability than they do now.”

[Now Holbo] The thing that makes capitalism good, apparently, is not that it generates wealth more efficiently than other known economic engines. No, the thing that makes capitalism good is that, by forcing people to live precarious lives, it causes them to live in fear of losing everything and therefore to adopt – as fearful people will – a cowed and subservient posture: in a word, they behave ‘conservatively’. Of course, crouching to protect themselves and their loved ones from the eternal lash of risk precisely won’t preserve these workers from risk. But the point isn’t to induce a society-wide conformist crouch by way of making the workers safe and happy. The point is to induce a society-wide conformist crouch. Period. A solid foundation is hereby laid for a desirable social order.

*Some Spoilers*

“Up in the Air” is almost unrelentingly conservative in that it simply assumes that the world of work today is the one Frum [seems to] desire. In the movie, Fate–impersonal forces of industry that are faceless until they hire George Clooney–strikes people unexpectedly, and the only solace or support they can have is their family. They get no help from the Agile Alliance, ACM, IEEE, craftsman guilds, AFSCME, AFL-CIO, their community, or the keen-eyed watchdogs of the press whipping up public opinion. There’s some help from the company–of what quality we don’t know–and the government offers only unemployment benefits that are explicitly called out as derisory. There’s some talk of protecting the firing process from lawsuits [government intervention in the process], but in the one case where a lawsuit is credibly threatened, a simple lie by Clooney makes it go away.

In the movie, the people below the airplanes have no option but a socially desirable [to some] defensive crouch. To the extent that movies capture the zeitgeist, that’s pretty disturbing. The movie comes out against atomization–being alone is Clooney’s unhappy fate–but the atoms can’t form molecules bigger than a family. There are no other bonds. OK: there’s Clooney’s last, human gesture to Natalie, but that’s something delivered in extremis, not a way of life. The character of Alex is more representative of the movie’s view of the world: outside the family, there are no norms and no obligations.

The movie shows a world in which Phil Gramm, one of the architects of our current troubles, was on point when he talked about the US as “a nation of whiners”. One strong element of conservatism is the preservation of hierarchy. To Gramm–who need never fear unemployment–I suspect that we-the-people are not particularly relevant. How annoying it must be when we complain! We should shut up, take our lumps, scrape through, and–as in the movie’s closing narrative–watch the planes containing our fate fly far overhead.

A European pair-touring trip

I’ll be speaking at the Scandinavian Developer Conference on March 16-17. I may be speaking at the Scottish Ruby Conference on March 26-27. I have to be back in the USA for Philly Emerging Tech on April 8th. I’d like to do a pair tour of Europe sometime in that time. I’m thinking of something like this route:


Göteborg, Sweden -> Amsterdam -> Paris -> Madrid -> London -> Edinburgh. If you’d be interested in working together, contact me. I’m looking to move into programming-process consulting (like TDD) and cutting-edge programming (like Clojure). I’d also like to somehow accelerate my learning of Spanish.

The rules for a pair tour are that you let me “couch surf” (sleep on your floor or couch) and provide some food (dinners, for example). I’m a poor American–I can’t afford European prices.

Delivering value

It’s quite the rage today to talk about “delivering value” as opposed to “delivering software”. That scares me.

Consider the contractor who had our house in chaos for what seems like forever. We now have more counter space in the two upstairs bathrooms. I think you could make a strong argument that he would have delivered more value to us if he’d taught us–two of us in particular–how to more efficiently use the space we already had. And yet I’m glad he didn’t. I preferred him to deliver what we wanted to pay him to deliver.

On the other hand, my dad built houses. I remember him once telling a prospective client that the architect’s plans they had were wrong. Given the man’s job, he’d come home dirty every day, so he’d probably want to go directly to the shower. The plans had him tromping all through the house. Instead, he needed a doorway from the garage to a little hall with a bathroom one step away. I think they let him redraw the plans, thereby delivering more value.

So there’s a tricky balance here. Are we up to it?

As my wife put it the first time she met my friends, “You software people have really strong opinions… about everything!” The average software person is more likely than, say, the average veterinarian to talk to someone briefly about her job and immediately come up with ten ways she ought to do it better. Telling such a jumper-to-conclusions that the team should look beyond what the business wants to what it needs is… maybe tempting them to play to their weaknesses.

I would instead tell a team they must, over the course of many months, prove to the business that they are worthy of being invited to the larger conversation about what the business needs. And to do that, they must–first and foremost–deliver what the business wants, while also acting as a humble student of that business.

A parable about mocking frameworks

Somewhere around 1985, I introduced Ralph Johnson to a bigwig in the Motorola software research division. Object-oriented programming was around the beginning of its first hype phase, Smalltalk was the canonical example, and Ralph was heavily into Smalltalk, so I expected a good meeting.

The bigwig started by explaining how a team of his had done object-oriented programming 20 years before in assembly language. I slid under the table in shame. Now, it’s certainly technically possible that they’d implemented polymorphic function calls based on a class tag–after all, that’s what compilers do. Still, the setup required to do that was surely far greater than the burden Smalltalk and its environment put on the programmer. I immediately thought that the difference in the flexibility and ease that Smalltalk and its environment brought to OO programming made the two programming experiences completely incommensurable. (The later discussion confirmed that snap impression.)

I suspect the same is true of mocking frameworks. When you have to write test doubles by hand, doing so is an impediment to the steady cadence of TDD. When you write a statement in a mocking framework’s pseudo-language, doing so is part of the cadence. I bet the difference in experience turns into a difference in design, just as Smalltalk designs were different from even the most object-oriented assembler designs (though I expect not to the same extent).

Mocks, the removal of test detail, and dynamically-typed languages

Simplify, simplify, simplify!
Henry David Thoreau

(A billboard I saw once.)

Part 1: Mocking as a way of removing words

One of the benefits of mocks is that tests don’t have to build up complicated object structures that have nothing essential to do with the purpose of a test. For example, I have an entry point to a webapp that looks like this:

get '/json/animals_that_can_be_taken_out_of_service', :date => '2009-01-01'

It is to return a JSON version of something like this:

{ 'unused animals' => ['jake'] }

Jake can be taken out of service on Jan 1, 2009 because he is not reserved for that day or any following day.

In typical object-oriented fashion, the controller doesn’t do much except ask something else to do something. The code will look something like this:

  get '/json/animals_that_can_be_taken_out_of_service' do
    # Tell the “timeslice” we are concerned with the date given.

    # Ask the timeslice: What animals can be reserved on/after that date?
    # (That excludes the animals already taken out of service.) 

    # Those animals fall into two categories:
    # - some have reservations after the timeslice date. 
    # - some do not.
    # Ask the timeslice to create the two categories.

    # Return the list of animals without reservations.
    # Those are the ones that can be taken out of service as of the given date.
  end

If I were testing this without mocks, I’d be obliged to arrange things so that there would be examples of each of the categories. Here’s the creation of a minimal such structure:

  jake = Animal.random(:name => 'jake')
  brooke = Animal.random(:name => 'brooke')
  Reservation.random(:date =>, 1, 1)) do
    use brooke
    use Procedure.random
  end
The random methods save a good deal of setup by defaulting unmentioned parameters and by hiding the fact that Reservations have_many Groups, Groups have_many Uses, and each Use has an Animal and a Procedure. But they still distract the eye with irrelevant information. For example, the controller method we’re writing really cares nothing for the existence of Reservations or Procedures–but the test has to mention them. That sort of thing makes tests harder to read and more fragile.

In contrast to this style of TDD, mocking lets the test ignore everything that the code can. Here’s a mock test for this controller method:

    should 'return a list of animals with no pending reservations' do
      brooke = Animal.random(:name => 'brooke')
      jake = Animal.random(:name => 'jake')

      during {
        get '/json/animals_that_can_be_taken_out_of_service', :date => '2009-01-01'
      }.behold! {
        @timeslice.should_receive(:animals_that_can_be_reserved).
                   and_return([brooke, jake])
        @timeslice.should_receive(:hashes_from_animals_to_pending_dates).
                   with([brooke, jake]).
                   and_return([{brooke => [,1,1),,1,1)]},
                               {jake => []}])
      }
      assert_jsonification_of('unused animals' => ['jake'])
    end
There are no Reservations and no Procedures and no code-discussions of irrelevant connections amongst objects. The test is more terse and–I think–more understandable (once you understand my weird conventions and allow for my inability to choose good method names). That’s an advantage of mocks.

Part 2: Dynamic languages let you remove even more irrelevant detail

But I’m starting to think we can actually go a little further in languages like Ruby and Objective-J. I’ll use different code to show that.

When the client side of this app receives the list of animals that can be removed from service, it uses that to populate the GUI. The user chooses some animals and clicks a button. Various code ensues. Eventually, a PersistentStore object spawns off a Future that asynchronously sends a POST request and deals with the response. It does that by coordinating with two objects: one that knows about converting from the lingo of the program (model objects and so forth) into HTTP/JSON, and a FutureMaker that makes an appropriate future. The real code and its test are written in Objective-J, but here’s a version in Ruby:

should 'coordinate taking animals out of service' do
  during {
    @sut.remove_from_service("some animals", "an effective date")
  }.behold! {
    @converter.should_receive(:removal_route).
               and_return('some route')
    @converter.should_receive(:removal_content).
               with(:date => 'an effective date',
                    :animals => 'some animals').
               and_return('post content')
    @future_maker.should_receive(:spawn_POST).
                  with('some route', 'post content')
  }
end
I’ve done something sneaky here. In real life, remove_from_service will take actual Animal objects. In Objective-J, they’d be created like this:

  betsy = [[Animal alloc] initWithName: @"betsy" kind: @"cow"];

But facts about Animals–that, say, they have names and kinds–are irrelevant to the purpose of this method. All it does is hand an incoming list of them to a converter method. So–in such a case–why not use strings that describe the arguments instead of the arguments themselves?

    @sut.remove_from_service("some animals", "an effective date")

In Java, type safety rarely lets you do that, but why let the legacy of Java affect us in languages like Ruby?

Now, I’m not sure how often these descriptive arguments are a good idea. One could argue that integration errors are a danger with mocks anyway, and that not using real examples of what flows between objects only increases that danger. Or that the increase in clarity for some is outweighed by a decrease for others: if you don’t understand what’s meant by the strings, there’s nothing (like looking at how test data was constructed) to help you. I haven’t found either of those to be a problem yet, but it is my own code after all.

(I will note that I do add some type hints. For example, I’m increasingly likely to write this:

    @sut.remove_from_service(["some animals"], "an effective date")

I’ve put “some animals” in brackets to emphasize that the argument is an array.)

If you’ve done something similar to this, let’s talk about it at a conference sometime. In the next few months, I’ll be at Speakerconf, the Scandinavian Developer Conference, Philly Emerging Tech, an Agile Day in Costa Rica, and possibly Scottish Ruby Conference.

Some preliminary thoughts on end-to-end testing in Growing Object-Oriented Software

I’ve been working through Growing Object-Oriented Software (henceforth #goos), translating it into Ruby. An annoyingly high percentage of my time has been spent messing with the end-to-end tests. Part of that is due to a cavalcade of incompatibilities that made me fake out an XMPP server within the same process as the app-under-test (named the Auction Sniper), the Swing GUI thread, and the GUI scraper. Threading hell.

But part of it is not. Part of it is because end-to-end tests just are awkward and fragile (which #goos is careful to point out). If such tests are worth it, it’s because some combination of these sources of value outweighs their cost:

  • They help clarify everyone’s understanding of the problem to be solved.

  • Trying to make the tests run fast, be less fragile, be easier to debug in the case of failure, etc. makes the app’s overall design better.

  • They detect incorrect changes (that is, changes in behavior that were not intended, as distinct from ones you did intend that will require the test to be changed to make it an example of the newly-correct behavior).

  • They provide a cadence to the programming, helping to break it up into nicely-sized chunks.

In working through #goos so far (chapter 16), the end-to-end tests have not found any bugs, so zero value there. I realized last night, though, that what most bugged me about them is that they made my programming “ragged”–that is, I kept microtesting away, changing classes, being happy, but when I popped up to run the end-to-end test I was working on, it or another one would break in a way that did not feel helpful. (However, I should note that it’s a different thing to try to mimic someone else’s solution than to conjure up your own, so some of the jerkiness is just inherent to learning from a book.)

I think part of the problem is the style of the tests. Here’s one of them, written with Cucumber:

   Scenario: Sniper makes a higher bid, but loses
       Given the sniper has joined an ongoing auction
       When the auction reports another bidder has bid 1000 (and that the next increment is 98)
       Then the sniper shows that it's bidding 1098 to top the previous price
           And the auction receives a bid of 1098 from the sniper

       When the auction closes
       Then the sniper shows that it's lost the auction

This test describes all the outwardly-observable behavior of the Sniper over time. Most importantly, at each point, it talks about two interfaces: the XMPP interface and the GUI. During coding, I found that context switching unsettling (possibly because I have an uncommonly bad short- and medium-term memory for a programmer). Worse, I don’t believe this style of test really helps to clarify the problem to be solved. There are two issues: what the Sniper does (bid in an auction) and what it shows (information about the known state of the auction). They can be talked about separately.

What the Sniper does is most clearly described by a state diagram (as on p. 85) or state table. A state diagram may not be the right thing to show a non-technical product owner, but the idea of the “state of the auction” is not conceptually very foreign (indeed, the imaginary product owner has asked for it to be shown in the user interface). So we could write something like this on a blackboard:
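A single-transition example in the same Cucumber style as the earlier scenario might read like this–my sketch of what the blackboard text could look like, reusing the prices from the scenario above, not the book’s wording:

```gherkin
Scenario: a price event moves the sniper from PENDING to BIDDING
    Given the sniper is PENDING in an auction it has joined
    When the auction reports a price of 1000 (with a minimum increment of 98)
    Then the sniper is BIDDING, with a last price of 1000 and a last bid of 1098
```

Note that it talks only about the state of the auction, not about the GUI.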

Just as in #goos, this is enough to get us started. We have an example of a single state transition, so let’s implement it! The blackboard text can be written down in whatever test format suits your fancy: Fit table, Cucumber text, programming language text, etc.

Where do we stand?

At this point, the single Cucumber test I showed above is breaking into at least three tests: the one on the blackboard, a similar one for the BIDDING to LOSING transition, and something as yet undescribed for the GUI. Two advantages to that: first, a correct change to the code should only break one of the tests. That breakage can’t be harder to figure out than breaking the single, more complicated test. Second, and maybe it’s just me, but I feel better getting triumphantly to the end of a medium-sized test than I do getting partway through a bigger end-to-end one.

The test on the blackboard is still a business-facing test; it’s written in the language of the business, not the language of the implementation, and it’s talking about the application, not pieces of it.

Here’s one implementation of the blackboard test. I’ve written it in my normal Ruby microtesting style because that shows more of the mechanism.

context 'pending state' do

  setup do
    start_app_at( => PENDING))
  end

  should 'respond to a new price by counter-bidding the minimum amount' do
    during {
      @app.receive_auction_event(AuctionEvent.price(:price => 1000,
                                                    :increment => 98,
                                                    :bidder => 'someone else'))
    }.behold! {
      @observer.should_receive(:update).
                with( => BIDDING,
                                         :last_price => 1000,
                                         :last_bid => 1098))
    }
  end
end
Here’s a picture of that test in action. It is not end-to-end because it doesn’t test the translation to-and-from XMPP.

In order to check that the Sniper has the right internal representation of what’s going on in the auction, I have it fling out (via the Observer or Publish/Subscribe pattern) information about that. That would seem to be an encapsulation violation, but this is only the information that we’ve determined (at the blackboard, above) to be relevant in/to the world outside the app. So it’s not like exposing whether internal data is stored in a dictionary, list, array, or tree.

At this point, I’d build the code that passed this test and others like it in the normal #goos outside-in style. Then I’d microtest the translation layer into existence. And then I’d do an end-to-end test, but I’d do it manually. (Gasp!) That would involve building much the same fake auction server as in #goos, but with some sort of rudimentary user interface that’d let me send appropriately formatted XMPP to the Sniper. (Over the course of the project, this would grow into a more capable tool for manual exploratory testing.)

So the test would mean starting the XMPP server, starting the fake auction and having it log into the server, starting the Sniper, checking that the fake auction got a JOIN request, and sending back a PRICE event. This is just to see the individual pieces fitting together. Specifically:

  • Can the translation layer receive real XMPP messages?
  • Does it hand the Sniper what it expects?
  • Does the outgoing translation layer/object really translate into XMPP?

The final question–is the XMPP message’s payload in the right format for the auction server?–can’t really be tested until we have a real auction server to hook up to. As discussed in #goos, those servers aren’t readily available, which is why the book uses fake ones. So, in a real sense, my strategy is the same as #goos’s: test as end-to-end as you reasonably can and plug in fakes for the ends (or middle pieces) that are too hard to reach. We just have a different interpretation of “reasonably can” and “too hard to reach”.

Having done that for the first test, would I do it again for the BIDDING to LOSING transition test? Well, yeah, probably, just to see a two-step transition. But by the time I finished all the transitions, I suspect code to pass the next transition test would be so unlikely to affect integration of interfaces that I wouldn’t bother.

Moreover, having finished the Nth transition test, I would only exercise what I’d changed. I would not (not, not, not!) run all the previous tests as if I were a slow and error-prone automated test suite. (Most likely, though, I’d try to vary my manual test, making it different from both the transition test that prompted the code changes and from previous manual tests. Adding easy variety to tests can both help you stumble across bugs and–more importantly–make you realize new things about the problem you’re trying to solve and the product you’re trying to build.)

What about real automated end-to-end tests?

I’d let reality (like the reality of missed bugs or tests hard to do manually) force me to add end-to-end tests of the #goos sort, but I would probably never have anywhere near the number of end-to-end scenario/workflow tests that #goos recommends (as of chapter 16). While I think workflows are a nice way of fleshing out a story or feature, a good way to start generating tests, and a dandy conversation tool, none of those things require automation.

I could do any number of my state-transition tests, making the Sniper ever more competent at dealing with auctions, but I’d probably get to the GUI at about the same time as #goos.

What do we know of the GUI? We know it has to faithfully display the externally-relevant known state of the auction. That is, it has to subscribe to what the Sniper already publishes. I imagine I’d have the same microtests and implementation as #goos (except for having the Swing TableModel subscribe instead of being called directly).

Having developed the TableModel to match my tests, I’d still have to check whether it matches the real Swing implementation. I’d do that manually until I was dragged kicking and screaming into using some GUI scraping tool to automate it.

How do I feel?

Nervous. #goos has not changed my opinion about end-to-end tests. But its authors are smarter and more experienced than I am. So why do they love–or at least accept–end-to-end tests while I fear and avoid them?

Want to pair with me?

I’ll be flying out of Miami for SpeakerConf. I’m thinking of taking the train there, and stopping to pair with people along the way (in the manner of Corey Haine’s pair programming tour). The basic idea is that we should pair on some real software that either you or I are working on right now. The type of software or programming language doesn’t matter. If you feed me or give me a place to sleep, that would be fine - but not necessary.

If you’d like to do that, send me mail.

I could take one of two trains to Washington, DC. The first follows this route:


or this one:


From DC, I’d travel down the coast along this route: