Archive for March, 2012

My story about cyclomatic complexity

In the 1980s, I was interested in metrics, maybe even a fan, and cyclomatic complexity was trendy. I was never entirely comfortable with it: all the justification in terms of “basis paths” seemed hand-wavily disconnected from the reality of code, and the literature seemed to go out of its way to obscure the fact that the cyclomatic complexity of a routine is just the number of branches + 1 (including the implicit branches in boolean expressions like a && b). (I don’t remember whether there are any restrictions on that identity.)
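To make that identity concrete, here’s a toy routine of my own invention (not one from the literature) with the decision points marked:

/* Three decision points: the while, the if, and the && inside it.
   Branches + 1 gives a cyclomatic complexity of 4. */
int count_small_positives(int *values, int length)
{
    int count = 0;
    int i = 0;
    while (i < length) {                        /* branch 1 */
        if (values[i] > 0 && values[i] < 100)   /* branches 2 and 3 */
            count++;
        i++;
    }
    return count;
}

Count the branches, add one, and you have the number the metric reports; nothing else about the routine figures into it.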

Then I read a paper in CACM on the cyclomatic complexity of Unix commands, and it was quite a revelation. You see, I worked for a Unix porting shop. We were regularly presented with broken commands whose code we didn’t know well, commands we had to dive into and fix. Two of the commands whose code was ranked least complex by the paper were notorious among us for being hard to understand. Otherwise fearless programmers wept like children when told they’d have to work on `/bin/sh` and `adb`.

Why were those two commands so hard to understand, even if not complex? The one I remember best was `/bin/sh`. This was an older version, written by Steve Bourne, I believe. At that time (I assume he’s gotten over it), he was enamored with the idea of turning C into Algol using the C preprocessor.

That is, his code included a header file that looked like this:


#define BEGIN {
#define END }
#define AND &&

That meant the parts of C that are easy to understand were defined into another language. But you can’t do that with the parts of C that are hard to understand—“Is this a pointer to a function returning an array or a pointer to an array of functions?”—so all of that was left (mostly) alone. The end result was like a combination of German and Sanskrit, which turns out to be hard to read even if you know both languages.
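To give a feel for the effect, here’s an invented fragment in that style (my reconstruction, using only the three definitions above, not actual Bourne shell source):

/* Assumes the BEGIN/END/AND definitions shown earlier. */
int (*(*table)[8])(void);   /* the hard part stays plain C: a pointer to
                               an array of 8 pointers to functions
                               returning int */

int dispatch(int i, int ready)
BEGIN
    if (ready AND table != 0)
    BEGIN
        return (*(*table)[i])();
    END
    return 0;
END

The everyday structure reads as Algol, the genuinely tricky declaration reads as C, and neither half helps you with the other.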

(As I recall, the “Algol” part wasn’t just a matter of lexical substitution: it also meant using Algol idioms instead of familiar C ones. I don’t remember the details, but it might have been things like not using C’s null-terminated strings and malloc/free, but instead inventing new libraries to learn.)

None of this affected the branchiness of the `/bin/sh` code, so it didn’t affect the cyclomatic complexity. It just made the job of understanding the code much more, um, complex.

I don’t remember as much about `adb`. I think it was partly the complexity of the domain (it was an assembly-level debugger), partly that it was heavily data-driven (so that the complexity was in the setup and interconnection of data, not in the code that then processed it).

The point of all of this is that merely using a word from the domain of human psychology allows you to import, under cover of night, tons and tons of assumptions, making them harder to notice. By saying “complexity” instead of “branchiness”, the purveyors of one simplistic measure were able to make it seem profound. They could hide, perhaps even from themselves, the fact that they were, at best, feeling one part of the elephant.

The same applies, I suspect, to the common statement that “functional programs are easier to reason about.” I happen to believe that’s a heck of a lot closer to being true than that complexity is all about branchiness, but still: it’s important to realize that the motivation for that broad statement has historically been the much narrower claim that functional languages make it easier to prove that programs satisfy their specifications.

When most of us slap down our coffee cup next to the keyboard and open up some code in the editor, our conscious goal is not to make proofs. It’s to accomplish other ends. Therefore, a straight extrapolation from “easier to write proofs about” to “easier to reason about” is a big leap. It smuggles in an assumption that “to reason” is one sort of thing, and that one sort of thing is formal logical deduction. That’s a really strong and dubious claim about the nature of thought, one that hasn’t led to overwhelming practical success when used in, oh, artificial intelligence or as logical positivism.

As usual, we ought to leave the grand claims about “the way humans are” or “the way that it is best to live/work” to psychologists and preachers. Amongst ourselves, perhaps we should just say things like “I’ve been doing this one kind of fairly specific thing recently, and I’ve been surprised to find that X has been really helpful to me. Maybe it will help you too.”

Rubactive: functional reactive programming in Ruby

I was trying to figure out what functional reactive programming (FRP) is. I found the descriptions on the web too abstract(*) and the implementations to be in languages that weren’t easy for me to use. I could have been more patient, but I’ve always found rewriting (documentation and implementation) a good way to understand, so I (1) implemented my own (extremely trivial) library in Ruby and (2) changed the names to ones that made sense to me (since the terms commonly used struck me as opaque tokens rather than names that pointed me toward meaning).

And—why not?—I put the code up on github. I also wrote the documentation/tutorial I wish that I’d found early on.

I should point out that it’s quite possible I never really did understand FRP, making my tutorial a Bad Thing.

(*) I don’t mean to criticize those abstract descriptions. They were written by people in particular interpretive communities for other members of those communities. I just happen not to be one of those members.