Sat, 31 Jan 2004
Well, this is depressing.
Some background: Dawn spent one month early in our marriage visiting the mastitis research labs of the U.S. I tagged along. Now, mastitis research labs are often found in less exciting places. What was I to do with my time?
Part of what I did was work through problems from John Tukey's Exploratory Data Analysis. John Tukey invented many ways of visualizing and exploring data. For example, he invented the box-and-whisker chart.
His book basically created a subfield of statistics, dubbed EDA.
I was attracted to EDA because of my long-standing love/hate relationship with metrics. Tukey seemed to me to have a love of numbers, of the detail they reveal, of the insights they can spark. Yet this dean of statisticians was at the same time wary of how easy it is to misuse numbers. Rather than jumping right to means, standard deviations, and curve-fitting, he emphasized pondering outliers and shapes of curves as a first way to get insight into the process under the data. Tukey seemed so much more sensible than so many software metrics people.
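For anyone who hasn't met the idea, here is a minimal Python sketch of the "look at the outliers first" habit: a five-number summary plus the usual 1.5-times-the-spread fences that a box-and-whisker chart draws. The function names and the build-time numbers are made up for illustration, and the hinge calculation is one common reading of Tukey's hand method, not a quotation from the book.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, lower hinge, median, upper hinge, max) for a non-empty list."""
    data = sorted(values)
    n = len(data)
    # Each half includes the median when n is odd, which matches hinges
    # counted in from the ends by hand.
    half = (n + 1) // 2
    lower_hinge = median(data[:half])
    upper_hinge = median(data[n - half:])
    return data[0], lower_hinge, median(data), upper_hinge, data[-1]


def outliers(values, k=1.5):
    """Return the values lying more than k hinge-spreads beyond either hinge."""
    _, lower_hinge, _, upper_hinge, _ = five_number_summary(values)
    spread = upper_hinge - lower_hinge
    low_fence = lower_hinge - k * spread
    high_fence = upper_hinge + k * spread
    return [v for v in values if v < low_fence or v > high_fence]


# Hypothetical data: minutes taken by nine nightly builds.
build_minutes = [12, 13, 13, 14, 14, 15, 15, 16, 48]
print(five_number_summary(build_minutes))
print(outliers(build_minutes))
```

With these invented numbers, the one 48-minute build is what gets flagged, and it is also what drags the mean up to nearly 18 minutes while the median stays at 14. That single point is the thing to go ponder before fitting anything.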
So today I was reading a blog entry about metrics by Alberto Savoia, someone who I think has a pretty sensible attitude toward numbers. (Full disclosure: I've received consulting dollars from Alberto's company, and I plan to receive more in the future. But I chose him to write three articles on load testing for what is now Better Software magazine in part because of his attitude, not because I foretold he'd give me money years later.) While I was thinking my boringly habitual cautionary thoughts - "How are the bad managers out there going to abuse this?" - I suddenly remembered EDA. I thought I would recommend on this blog that people get the book. I envisioned conference discussions about incorporating shapes and outliers into Big Visible Charts and intranet dashboards.
Then I noticed the price of the book. US$118 on bn.com. Completely unavailable at Amazon. Jeez. Other EDA books I remember seem to be out of print; the one I could find is $100. Both it and Tukey's book used to be priced for undergraduate courses, and now they're priced for niche readers.
Back then, I'd bought Stata, a stats package, because it emphasized EDA. It still has the graphs and the stats, but Google and I could find only one reference to "exploratory" or "EDA" on the site (in the $100 book's blurb).
So that's what's depressing: a promising subfield that I'd hoped to turn my betters on to... seems to have practically vanished. Bummer.
Readers might want to check Tukey's book out of a university library. The book is pre-computer, so you get text on how to tally accurately by hand (probably not the way you do it). And Tukey's writing and typography styles are idiosyncratic. I found them kind of charming; you might not.
Jim Weirich has a nice little writeup on bindings in Ruby. Though he's writing in a different style, his piece's feel of progressive revelation is very much what I'm aiming at in my A Little Ruby, a Lot of Objects. I need to find some deadline mechanism to make me restart that book.