Archive for the 'clojure' Category

Some thoughts on classes after 18 months of Clojure

I had some thoughts about classes that wouldn’t fit into a talk I’m building about functional programming in Ruby, so I recorded them as a video.

Topics:

  • Using hashes instead of classes.

  • Classes as a documentation tool—specifically, as a way of making functions easy to find.

  • Preferring module inclusion to subclassing (which is akin to preferring adjectives to nouns as a way of organizing the documentation of verbs). (Vaguely similar to duck-typing in Haskell.)

  • Object dot notation as a more readable way of writing function composition. (Similar to the motivation for the -> macro in Clojure or type-directed name resolution in Haskell.)

Here are three Ruby functions…

Here are three Ruby functions. Each solves this problem: “You are given a starting and ending date and an increment in days. Produce all incremental dates that don’t include the starting date but may include the ending date. More formally: produce a list of all the dates such that for some n >= 1, date = starting_date + (increment * n) && date < = ending_date.

Solution 1:

Solution 2:

Solution 3 depends on a lazily function that produces an unbounded list from a starting element and a next-element function. Here’s a use of lazily:

As a lazy sequence, integers both (1) doesn’t do the work of calculating the ith element unless a client of integers asks for it, and also (2) doesn’t waste effort calculationg any intermediate values more than once.

Solution 3:

The third solution seems “intuitively” better to me, but I’m having difficulty explaining why.

The first solution fails on three aesthetic grounds:

  • It lacks decomposability. There’s no piece that can be ripped out and used in isolation. (For example, the body of the loop both creates a new element and updates an intermediate value.)

  • It lacks flow. It’s pleasing when you can view a computation as flowing a data structure through a series of functions, each of which changes its “shape” to convert a lump of coal into a diamond.

  • It has wasted motion: it puts an element at the front of the array, then throws it away. (Note: you can eliminate that by having current start out as exclusive+increment but that code duplicates the later +=increment. Arguably, that duplicated increment-date action is wasted (programmer) motion, in the sense that the same action is done twice. (Or: don’t repeat yourself / Eliminate duplication.))

The second solution has flow of values through functions, but it wastes a lot of motion. A bunch of dates are created, only to be thrown away in the next step of the computation. Also, in some way I cannot clearly express, it seems wrong to stick the inclusive_end between the exclusive_start and the increment, given that the latter two are what was originally presented to the user and the inclusive_end is a user choice. (Therefore shouldn’t the exclusive_start and increment be more visually bound together than this solution does?)

The third solution …

  • … is decomposable: the sequence of dates is distinct from the decision about which subset to use. (You could, for example, pass in the whole lazy sequence instead of a exclusive_start/increment pair, something that couldn’t be done with the other solutions.)

  • … eliminates wasted work, in that only the dates that are required are generated. (Well, it does store away a first date — excluded_start — that is then dropped. But it doesn’t create an excess new date.)

  • … has the same feel of a data structure flowing through functions that #2 has.

So: #3 seems best to me, but the advantages over the other two seem unconvincing (especially given that programmers of my generation are likely to see closure-evaluation-requiring-heap-allocation-because-of-who-knows-what-bound-variables as scarily expensive).

Have you better arguments? Can you refute my arguments?

I’m trying to show the virtues of a lazy functional style. Perhaps this is a bad example? [It’s a real one, though, where I really do prefer the third solution.]

Using functional style in a Ruby webapp

Motivation

Consider a Ruby backend that communicates with its frontend via JSON. It sends (and perhaps receives) strings like this:

Let’s suppose it also communicates with a relational database. A simple translation of query results into Ruby looks like this:

(I’m using the Sequel gem to talk to Postgres.)

On the face of it, it seems odd for our code to receive dumb hashes and arrays, laboriously turn them into model objects with rich behavior, fling some messages at them to transform their state, and then convert the resulting object graph back into dumb hashes and arrays. There are strong historical reasons for that choice—see Fowler’s Patterns of Enterprise Application Architecture—but I’m starting to wonder if it’s as clear a default choice as it used to be. Perhaps a functional approach could work well:

  • Functional programs focus on the flow of data through code, rather than on objects with changing state. The former seems more of a match for a typical webapp.

  • It’s common in functional languages to lean toward a few core datatypes—like hashes and arrays—that are operated on by a wealth of functions. We could skip the conversion step into objects. Rather than having to deal with the leaky abstraction of an object-relational mapping layer, we’d embrace the nature of our data.

Seems plausible, I’ve been thinking. However, I’ve never been wildly good at understanding the problems of an approach just by thinking about it. It’s more efficient for me to learn by doing. So I’ve decided to strangle an application whose communication with its database is, um, labored.

I’m going to concentrate on two things:

  • Structuring the code. More than a year of work on Midje has left me still unhappy about the organization of its code, despite my using Kevin Lawrence’s guideline: if you have trouble finding a piece of code, move it to where you first looked. I have some hope that Ruby’s structuring tools (classes, modules, include, etc.) will be useful.

  • Dependencies. As you’ll see, I’ll be writing code with a lot of temporal coupling. Is that and other kinds of coupling dooming me to a deeply intertwingled mess that I can’t change safely or quickly?

This blog post is about where I stand so far, after adding just one new feature.
(more…)

Top-down design in “functional classic” programming

While waiting for my product owner to get back to me, I was going through open browser tabs. I read this from Swizec Teller:

The problem is one of turning a list of values, say, [A, B, C, D] into a list of pairs with itself, like so [[A,B], [A,C], [A,D], [B, C], [B,D], [C,D]].

He had problems coming up with a good solution. “I can do that!” I said, launched a Clojure REPL, and started typing the whole function out. I quickly got bogged down.

This, I think, is a problem with a common functional approach. Even with this kind of problem—the sort of list manipulation that functional programming lives for—there’s a temptation to build a single function bottom up. “I need to create the tails of the sequence,” I thought, “because I know roughly how I’ll use them.”

For me (and let’s not get into all that again), it usually works better to go top down, mainly because it lets me think of discrete, meaningful functions, give them names, and write facts about how they relate to one another. So that’s what I did.

First, what am I trying to do? Not create all the pairs, but only ones in which an element is combined with another element further down the list. Like this:

As I often do when doing list-manipulation problems, I lay things out to visually emphasize the “shape” of the solution. That helps me see more clearly what has to be done to create that solution. There’s one set of pairs, each headed by the first element of the list, then another set, each headed by the second element. That is, I can say the above fact is true provided two other facts are true:

It’s easy to see how I get the heads—just map down the list (maybe I have to worry about the last element, maybe not—I’ll worry about it if it comes up). What are those heads combined with? Just the successive tails of the original list: [2 3], [3]. I’ll assume there’s a function that does that for me. That gives me this entire fact-about-the-world-of-this-program:

Given all that, downward-pairs is easy enough to write:

It’s important to me that I have reached a stable point here. When building up a complicated function from snippets in a REPL, I more often create the wrong snippets than when I define what those snippets need to do via a function that uses them. (I’m not saying that I never backtrack: I might find a subfunction too hard to write, or I may have carved up the world in a way that leads to gross inefficiency, etc. I’m saying that I seem to end up with correct answers faster this way.)

Now I just have to solve two simpler problems. Tails I’ve done before, and here’s a solution I like:

For those who don’t know Clojure well, this produces this sequence of calls to drop:

This does roughly the same work as an iteration would do because of how laziness is implemented.

That leaves me headed-pairs, which is a pretty straightforward map:

This seems like a reasonable solution, doesn’t strike me as being terribly inefficient (given laziness), has a readability that I like, doubtless took me less time than developing it bottom-up would have, and comes with tests (including an end-to-end test that I won’t bother showing).

The whole solution is here.

UPDATE: Yes, I could have used do or even gotten explicit with the sequence-m monad, but that wouldn’t have addressed the original poster’s point, which I took to be how to think about functional problems.

How mocks can cut down on test maintenance

After around 11 months of not working on it, I needed to make a change to Critter4us, an app I wrote for the University of Illinois vet school. The change was simple. When I tried to push it to Heroku, though, I discovered that my Ruby gems were too out of date. So, I ended up upgrading from Ruby 1.8 to 1.9, to Sinatra 1.3 from a Sinatra less than 1.0, to a modern version of Rack, etc. etc. In essence, I replaced all the turtles upon which my code-world was resting. There were some backwards-compatibility problems.

One incompatibility was that I was sending an incorrectly formatted URI to fetch some JSON data. The old version of Rack accepted it, but the new one rejected it. The easy fix was to split what had been a single `timeslice` parameter up into multiple parameters. [Update: I later did something more sensible, but it doesn’t affect the point of this post.] “Crap!”, I thought. “I’m going to have to convert who knows how much test data.” But I was pleased when I looked at the first test and saw this:

The key point here is that neither the format of the URI parameters nor the resulting timeslice object is given in its real form. Instead, they’re represented by strings that basically name their type. (In my Clojure testing framework, Midje, I refer to these things as “metaconstants“.)

The only responsibility this code has toward timeslices is to pass them to another object. That object, the `internalizer`, has the responsibility for understanding formats. The test (and code) change is trivial:

The test is even (and appropriately) less specific than before. It says only that the GET parameters (a hash) will contain some key/value pairs of use to the internalizer. It’s up to the internalizer to know which those are and do the right thing.

The point here is that the size of the test change is in keeping with the size of the code change. It is unaffected by the nature of the change to the data—which is as it should be.

This application is the one where I finally made the important decision to use mocks heavily in the Freeman and Pryce “London” style and—most importantly—to not fall into the trap of thinking “Mocks are stupid!” when I ran into problems. Instead, I said “I’m stupid!” and, working on that assumption, figured out what I was doing wrong.

I made that decision halfway through writing the app. One of the happy results of the mocking that followed was that a vast amount of test setup code devoted to constructing complete data structures went away. No more “fixtures” or “object mothers” or “test factories.”

Clojure and the future for programmers like me

TL;DR: Attitudes common in the Clojure community are making that community increasingly unappealing to people like me. They may also be leading to a language less appealing to people like me than a Lisp should be.

I think people like me are an important constituency.

The dodgy attitudes come from the Clojure core, especially Rich Hickey himself, I’m sad to say. I wish the Clojure core would be more supportive of programmers not like them, and of those with different goals. Otherwise, I’m afraid Clojure won’t achieve a popularity that justifies the kind of commitment I’ve been making to it. What should I do?

I’ve caused a bit of kerfuffle on Twitter about Clojure and testing. I’ve been somewhat aggressive because I’m trying to suss out how much effort to put into Clojure over, say, the next year.

[Note to the reader: scatter “I think” and “I could be wrong” statements freely throughout the following.]

Here’s the deal.

What’s at issue is a style of working (roughly XPish). After some years of work by many, it’s entirely possible in the Ruby world to work in that style. After doing the whole Agile spokesmodel thing for a decade, that’s all I want: a place where there are enough people with my style that we can (1) find each other to work together, and (2) work together without having to re-justify our style to every damn person we meet. In Ruby, XP style is mainstream enough that we can let the product of our work speak for us. (Hence, my recently co-founded contract software company.)

I think it was at the 2nd RubyConf that Nathaniel Talbott introduced Test::Unit, a Ruby version of jUnit. I can’t be positive, but I’m pretty sure Matz (the author of Ruby) didn’t care all that much about testing or, especially, TDD. Wasn’t his style. Perhaps still isn’t. However: he did not, to my recollection, express any reaction other than his usual “People are doing things with my language! I am happy!” And while Matz gives keynotes about design and such, they are pretty obviously about how he feels about design and about how he approaches it. They are invitations, not instructions.

One of the strengths of Ruby in the early days was this air of invitation. (I don’t really follow that community closely any more.)

Rich Hickey, for all his undoubted brilliance as a language implementor and his fruitful introspection into his own design process, presents a less welcoming demeanor than Matz’s. From what I’ve seen, he is more prone, in his public speaking and writing, to instruction than to invitation.

Which is OK with me, all things being equal. But there’s a problem. Brilliant solo language designers set a tone for the community that gathers around the language, especially in the early days. We’ve seen that with Python and Ruby. (Rails too, which acts as a contrast to the “Ruby Classic” tone.) It’s happening again in Clojure, where I’d say the picture of the ideal developer is someone who, with grim determination, wrestles a hard problem to the ground and beats it into submission with a multi-issue bound copy of The Journal of Functional Programming. I’m all for people who do that! So I don’t have to! But I personally prefer something more akin to the style of Ruby’s _why, who’d sneak up on a problem and tickle it until it was laughing too hard to be a problem any more.

I didn’t always prefer the Ruby style. Just after I started my first post-graduate job, Tony Hoare’s Turing Award lecture came out, containing the famous quote:

I conclude that there are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.

I positively thrilled to that. That was the man I wanted to be.

As time went on, though, I ran into difficulties. What if you can’t solve the problem with that first method, whether because you’re not smart enough, or the problem domain is inherently messy, or the problem keeps changing on you? What then? Do you just give up? Or do you try to find a third method?

Those difficulties, plus some timely encounters with some strange thinkers, notably Feyerabend and his Against Method, put me on my current career path. Which causes me to say this:

For someone doing core infrastructural design, such as the huge task of tricking the JVM into running a Clojure program speedily, an emphasis on purity and simplicity—one bordering on asceticism—seems to me essential. God speed Rich Hickey for that!

But when a community adopts purity, simplicity, and asceticism as core tenets—ones approaching matters of morality—you can end up with a community that makes things unpleasant for not just the impure and hedonistic, but also for just plain normal people. And they will go elsewhere.

Much as I love Lisp and admire Clojure, if it can’t grab the enthusiasm of, say, the median pretty-good Ruby or Java programmer who has the TDD habit, it’s not economically sensible for a cofounder of a new company to put as much effort into it as I have been putting.

If Clojure turns into a tool for hardcore people to solve hardcore problems, without making concessions for people with softer cores—if it remains a niche language, in other words—I should probably spend more time with Ruby. And I do think, right now, that Clojure is more likely to be a niche language than I thought last year.

So in the last few weeks, I’ve been sending out little probes, mainly to see if anyone with some star power can provisionally buy the story above. If so, I can hope to see an intentional broadening in a direction designed to shackle me and those like me more tightly into the Clojure community—and so help it succeed.

So far, meh.

If I’m sensible, a continuing “meh” response will prompt me to reduce my effort around Clojure—to resist the temptation to make another tutorial, to cut back on new Midje features (but not on fixes and the 1.3 port, I hasten to reassure its users!), and so on. If Clojure then wins big, I can jump back on the bandwagon. I will have lost some early mover advantage, but that doesn’t look to be huge anyway. (Not like it was with Agile, or would have been with Ruby if I hadn’t drifted away before Rails came out.)

I haven’t decided what to do. Reaction to this post might help. Probably the best thing would be for me to be convinced I’m nuts. Hey, it’s happened before!

A postscript on the validity of what I claim

A postscript on worries about the effect of performance on expressiveness

Further reading for “Functional Programming for Object-Oriented Programmers”

I’ve been asked for things to read after my “Functional Programming for Object-Oriented Programmers” course. Here are some, concentrating on the ones that are more fun to read and less in love with difficulty for difficulty’s sake.

If you were more interested in the Clojure/Lisp part of the course:

  • The Little Schemer and other books by the same authors. These teach some deep ideas even though they start out very simply. You’ll either find the way they’re written charming or annoying.

  • Land of Lisp is a strange book on Lisp.

  • If you were totally fascinated by multimethods, you could try The Art of the Metaobject Protocol.

If you were more interested in the functional programming part of the course, the suggestions are harder. So much of the writing on functional programming is overly math-fetishistic.

  • Learn You a Haskell for Great Good is another quirky book (lots of odd drawings that have little to do with the text), but it covers many topics using the purest functional language. There’s both a paper version and a free online version. The online version has better drawings (color!).

  • I haven’t read Purely Functional Data Structures, though I think I have it somewhere. @SamirTalwar tweeted to me that “I’m enjoying Purely Functional Data Structures. It’s quite mathsy, but with a practical bent.” The book is derived from the author’s PhD thesis.

The approach I took in the course was inspired by the famous Structure and Interpretation of Computer Programs (free online version). Many people find the book more difficult than most of the ones above. Part of the reason is that the problems are targeted at MIT engineering students, so you get a lot of problems that involve resistors and such.

Functional Programming for the Object-Oriented Programmer

Here is a draft blurb for a course I’d like to teach, possibly in Valladolid, Spain, in September. I’d like to see if enough people are interested in it, either there or somewhere else. (There’s a link to a survey after the description.)

Short version

  • Target audience: Programmers who know Java, C#, C++, or another object-oriented language. Those who know languages like Ruby, Python, or JavaScript will already know some—but not all!—of the material.

  • The goal: To make you a better object-oriented programmer by giving you hands-on experience with the ideas behind functional languages. The course will use Clojure and JavaScript for its examples. You’ll learn enough of Clojure to work through exercises, but this is not a “Learn Clojure” course.

  • After the course: I expect you to return to your job programming in Java or Ruby or whatever, not to switch to Clojure. To help put the ideas into practice, you’ll get a copy of Dean Wampler’s Functional Programming for Java Developers.

Syllabus

  • Just enough Clojure
  • Multimethods as a generalization of methods attached to objects
  • Protocols as a different approach to interfaces
  • Functions as objects, objects as functions
  • Mapping and folding in contrast to iteration
  • Functions that create functions
  • Laziness and infinite sequences
  • Time as a sequence of events
  • Generic datatypes with many functions versus many small classes
  • Immutable data structures and programs as truth-statements about functions
Interested?

Longer version

Many, many of the legendary programmers know many programming languages. What they know from one language helps them write better code in another one. But it’s not really the language that matters: adding knowledge of C# to your knowledge of Java doesn’t make you much better. The languages are too similar—they encourage you to look at problems in pretty much the same way. You need to know languages that conceptualize problems and solutions in substantially different ways.

Once upon a time, object-oriented programming was a radical departure from what most programmers knew. So learning it was both hard and mind-expanding. Nowadays, the OO style is the dominant one, so ambitious people need to proactively seek out different styles.

The functional programming style is nicely different from the OO style, but there are many interesting points of comparison between them. This course aims to teach you key elements of the functional style, helping you take them back to your OO language (probably Java or C#).

There’s a bit more, though: although the functional style has been around for many years, it’s recently become trendy, partly because language implementations keep improving, and partly because functional languages are better suited for solving the multicore problem* than are other languages. Some trends with a lot of excitement behind them wither, but others (like object-oriented programming) succeed immensely. If the functional style becomes commonplace, this course will position you to be on the leading edge of that wave.

There are many functional languages. There are arguments for learning the purest of them (Haskell, probably). But it’s also worthwhile to learn a slightly-less-pure language if there are more jobs available for it or more opportunities to fold it into your existing projects. According that standard, Clojure and Scala—both hosted on the JVM—stand out. This course will be taught mainly in Clojure. Where appropriate, I’ll also illustrate the ideas in JavaScript, a language whose good parts were strongly influenced by the functional style.

While this particular course will concentrate on what you can do in Clojure, it’d be unfair to make you do all the work of translating these ideas into your everyday language. To help you with that, every participant will get a copy of Dean Wampler’s Functional Programming for Java Developers, which shows how functional ideas work in Java.

Your instructor

Brian Marick was first exposed to the functional style in 1983, when the accident of knowing a little bit of Lisp tossed him into the job of technical lead on a project to port Common Lisp to a now-defunct computer architecture. That led him to a reading spree about all things Lisp, the language from which the functional style arguably originated. He’s been a language geek ever since, despite making most of his living as a software process consultant. He’s the author of the popular Midje testing library for Clojure and has written two books (Everyday Scripting with Ruby and Programming Cocoa with Ruby).

For examples of Marick’s teaching style, read his explanations of the Zipper and Enlive libraries or watch his videos on Top-down TDD in Clojure and Monads.

Interested?

 

*The multicore problem is that chip manufacturers have hit the wall making single CPUs faster. The days of steadily increasing clock rates are over. Cache sizes are no longer the bottleneck. The big gains from clever hardware tricks have been gotten. So we programmers are getting more cores instead of faster cores, and we have to figure out how to use multiple cores in a single program. That’s a tough problem.

Enlive tutorial

Enlive is a templating library for Clojure. Instead of the usual substitute-into-delimited-text approach, it works by editing node trees selected by CSS selectors. I’ve written a tutorial for it. Comments welcome.

A small note about with-group and hiccup

As novice user of Hiccup, a Clojure HTML-layout format, I wanted to write a form that gathered information about N animals. Each animal had a name, a nickname, etc. I wanted to use Ring’s ring.middleware.nested-params so that I could iterate over new animals.

I had some trouble getting it to work, so perhaps this note will help someone else. Note that this applies to hiccup-0.3.5.

My first attempt was something like this:

This was derived from the description of with-group here. However, this didn’t work. Only the last text-field appears in the HTML page. [Note: I was using Noir to render pages.]

Having written a builder-style renderer like Hiccup before, I thought. “Ah. I must make a vector of the rendered functions”:

That lead to a scary exception:

java.lang.RuntimeException: java.lang.RuntimeException: java.lang.IllegalArgumentException: [:label {:for “animal1-true-name”} “Name: “] is not a valid tag name.

The solution I found was to wrap the sub-form inside another element, in this case a :p:

It’s ironic that I’d do something like that as a matter of course, except that I was exploring the API. Which led me astray.

UPDATE: Chris Granger points out “Hiccup treats any vector as a tag, but any other seq will work. Vectors have semantic meaning, the others don’t.” Instead of wrapping the separate steps in a vector, I should have used a list:

I should have realized that in the beginning (though one could argue that the two uses for a vector are distinguishable so why not make a vector work as well as a list?).

Chris also writes “In my experience the best solution is to wrap the block in (html … ). Then you never worry.” That is: