Archive for the 'ruby' Category

How to install the Postgres pg gem on OSX Mountain Lion 10.8.2 (Ruby)

I have Mountain Lion on my new machine. (What a horrible release.) I innocently decided to upgrade my Postgres installation to 9.2.2. I had a lot of trouble getting the Ruby pg gem to build. I found no single source that solved the problem, so I thought I’d post the solution.

The symptom is this spewage:

ERROR:  Error installing pg:
	ERROR: Failed to build gem native extension.

        /Users/marick/.rvm/rubies/ruby-1.9.2-p320/bin/ruby extconf.rb
checking for pg_config... yes
Using config values from /usr/local/bin/pg_config
checking for libpq-fe.h... *** extconf.rb failed ***

Note: if the “checking for pg_config” line fails, you have a different problem. Either you need to set PGDATA correctly, or make sure that the `pg_config` command is in your $PATH. A number of pages out there on the web explain what’s going on.

  1. If you look at the `mkmf.log` file, you’ll see that (for some reason) the installer is trying to use `/usr/bin/gcc-4.2`. That doesn’t exist, so:

         sudo ln -s /Developer/usr/bin/gcc /usr/bin/gcc-4.2
    

    Thanks to Stackoverflow user dfrankow for that bit of the solution.

  2. That still fails because the program you’re compiling innocently tries to include `<stdio.h>` (etc.) Unlike Unix systems since the beginning of time, there is no `stdio.h` in `/usr/include`. To populate /usr/include you have to start Xcode (version 4.X), go to Preferences, pick the “Downloads” tab, and install the Command Line Tools.

    Thanks to Tim Burks on Twitter for this piece of the puzzle.

Rubactive: functional reactive programming in Ruby

I was trying to figure out what functional reactive programming (FRP) is. I found the descriptions on the web too abstract(*) and the implementations to be in languages that weren’t easy for me to use. I could have been more patient, but I’ve always found rewriting (documentation and implementation) a good way to understand, so I (1) implemented my own (extremely trivial) library in Ruby and (2) changed the names to ones that made sense to me (since I found the terms commonly used to be more opaque tokens than ones that pointed me toward meaning).

And—why not?—I put the code up on github. I also wrote the documentation/tutorial I wish that I’d found early on.

I should point out that it’s quite possible I never really did understand FRP, making my tutorial a Bad Thing.

(*) I don’t mean to criticize the abstract descriptions. They were written by people in particular interpretive communities for other members. I just happen not to be one of those members.

Some thoughts on classes after 18 months of Clojure

I had some thoughts about classes that wouldn’t fit into a talk I’m building about functional programming in Ruby, so I recorded them as a video.

Topics:

  • Using hashes instead of classes.

  • Classes as a documentation tool—specifically, as a way of making functions easy to find.

  • Preferring module inclusion to subclassing (which is akin to preferring adjectives to nouns as a way of organizing the documentation of verbs). (Vaguely similar to duck-typing in Haskell.)

  • Object dot notation as a more readable way of writing function composition. (Similar to the motivation for the -> macro in Clojure or type-directed name resolution in Haskell.)

Here are three Ruby functions…

Here are three Ruby functions. Each solves this problem: “You are given a starting and ending date and an increment in days. Produce all incremental dates that don’t include the starting date but may include the ending date.” More formally: produce a list of all the dates such that for some n >= 1, date = starting_date + (increment * n) && date <= ending_date.

Solution 1:

Solution 2:

Solution 3 depends on a lazily function that produces an unbounded list from a starting element and a next-element function. Here’s a use of lazily:

As a lazy sequence, integers both (1) doesn’t do the work of calculating the ith element unless a client of integers asks for it, and also (2) doesn’t waste effort calculating any intermediate values more than once.
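A minimal sketch of what a `lazily` helper could look like, assuming it takes a starting element and a next-element block. The class name `LazySeq` and its internals are hypothetical, not the code from this post; the sketch is only meant to show the two properties described above: elements are computed on demand, and each one is computed at most once.

```ruby
# Hypothetical sketch: an unbounded sequence that caches what it computes.
class LazySeq
  include Enumerable

  def initialize(first, &next_element)
    @cache = [first]              # elements computed so far
    @next_element = next_element  # how to get from one element to the next
  end

  # Compute (and cache) elements up to index i, then return the ith.
  def [](i)
    @cache << @next_element.call(@cache.last) while @cache.size <= i
    @cache[i]
  end

  # Yield elements one at a time, without end; callers stop the
  # iteration themselves with first, take_while, and so on.
  def each
    return enum_for(:each) unless block_given?
    i = 0
    loop do
      yield self[i]
      i += 1
    end
  end
end

def lazily(first, &next_element)
  LazySeq.new(first, &next_element)
end

integers = lazily(1) { |n| n + 1 }
integers.first(3)   # => [1, 2, 3]
integers[4]         # => 5
```

Asking for `integers[4]` after `integers.first(3)` only computes the two elements that were not already cached.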

Solution 3:

The third solution seems “intuitively” better to me, but I’m having difficulty explaining why.

The first solution fails on three aesthetic grounds:

  • It lacks decomposability. There’s no piece that can be ripped out and used in isolation. (For example, the body of the loop both creates a new element and updates an intermediate value.)

  • It lacks flow. It’s pleasing when you can view a computation as flowing a data structure through a series of functions, each of which changes its “shape” to convert a lump of coal into a diamond.

  • It has wasted motion: it puts an element at the front of the array, then throws it away. (Note: you can eliminate that by having current start out as exclusive+increment but that code duplicates the later +=increment. Arguably, that duplicated increment-date action is wasted (programmer) motion, in the sense that the same action is done twice. (Or: don’t repeat yourself / Eliminate duplication.))

The second solution has flow of values through functions, but it wastes a lot of motion. A bunch of dates are created, only to be thrown away in the next step of the computation. Also, in some way I cannot clearly express, it seems wrong to stick the inclusive_end between the exclusive_start and the increment, given that the latter two are what was originally presented to the user and the inclusive_end is a user choice. (Therefore shouldn’t the exclusive_start and increment be more visually bound together than this solution does?)

The third solution …

  • … is decomposable: the sequence of dates is distinct from the decision about which subset to use. (You could, for example, pass in the whole lazy sequence instead of an exclusive_start/increment pair, something that couldn’t be done with the other solutions.)

  • … eliminates wasted work, in that only the dates that are required are generated. (Well, it does store away a first date — excluded_start — that is then dropped. But it doesn’t create an excess new date.)

  • … has the same feel of a data structure flowing through functions that #2 has.
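To make the decomposition concrete, here is a rough sketch of the lazy style applied to the dates problem. It uses Ruby's standard `Enumerator` rather than this post's `lazily`, and the method names `dates_from` and `dates_after` are invented for illustration; it is not the third solution itself.

```ruby
require 'date'

# Hypothetical sketch: an unbounded sequence of dates, starting at `start`
# and stepping by `increment` days. Nothing is computed until a caller asks.
def dates_from(start, increment)
  Enumerator.new do |yielder|
    current = start
    loop do
      yielder << current
      current += increment
    end
  end
end

# A separate decision about which part of the sequence to keep:
# drop the excluded start, then keep dates up to and including the end.
def dates_after(exclusive_start, increment, inclusive_end)
  dates_from(exclusive_start, increment)
    .lazy
    .drop(1)
    .take_while { |date| date <= inclusive_end }
    .to_a
end

dates_after(Date.new(2011, 1, 1), 3, Date.new(2011, 1, 10))
# => the dates for January 4, 7, and 10
```

Because `dates_from` knows nothing about the end date, the same sequence could be handed to code that makes a different choice about where to stop.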

So: #3 seems best to me, but the advantages over the other two seem unconvincing (especially given that programmers of my generation are likely to see closure-evaluation-requiring-heap-allocation-because-of-who-knows-what-bound-variables as scarily expensive).

Have you better arguments? Can you refute my arguments?

I’m trying to show the virtues of a lazy functional style. Perhaps this is a bad example? [It’s a real one, though, where I really do prefer the third solution.]

TDD Workflow (Sinatra / Haml / jQuery) Part 1

Introduction

This is a draft. Worth continuing the series? Let me know.

Critter4Us is a webapp used to schedule teaching animals at the University of Illinois Vet School. Its original version was a Ruby/Sinatra application with a Cappuccino front end. Cappuccino lets you write desktop-like webapps using a framework modeled after Apple’s Cocoa. I chose it for two reasons: it made it easy to test front-end code headlessly (which was harder back then than it is now), and it let me reuse my RubyCocoa experience.

Earlier this year, I decided it was time for another bout of experimentation. I decided to switch from Cappuccino to jQuery, Coffeescript, Haml because I thought they were technologies I should know and because they’d force me to develop a new TDD workflow. I’d never gotten comfortable with the testing—or for that matter, any of the design—of “traditional” front-end webapp code and the parts of the backend from the controller layer up.

I now think I’ve reached a plateau at which I’m temporarily comfortable, so this is a good time to report. Other people might find the approach and the tooling useful. And other people might explain places where my inexperience has led me astray.

Using functional style in a Ruby webapp

Motivation

Consider a Ruby backend that communicates with its frontend via JSON. It sends (and perhaps receives) strings like this:

Let’s suppose it also communicates with a relational database. A simple translation of query results into Ruby looks like this:

(I’m using the Sequel gem to talk to Postgres.)

On the face of it, it seems odd for our code to receive dumb hashes and arrays, laboriously turn them into model objects with rich behavior, fling some messages at them to transform their state, and then convert the resulting object graph back into dumb hashes and arrays. There are strong historical reasons for that choice—see Fowler’s Patterns of Enterprise Application Architecture—but I’m starting to wonder if it’s as clear a default choice as it used to be. Perhaps a functional approach could work well:

  • Functional programs focus on the flow of data through code, rather than on objects with changing state. The former seems more of a match for a typical webapp.

  • It’s common in functional languages to lean toward a few core datatypes—like hashes and arrays—that are operated on by a wealth of functions. We could skip the conversion step into objects. Rather than having to deal with the leaky abstraction of an object-relational mapping layer, we’d embrace the nature of our data.
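As a sketch of what skipping the conversion step might look like, assume query results arrive as an array of hashes and the response leaves as JSON. The row fields and method names below are invented for illustration and are not taken from the application discussed here.

```ruby
require 'json'

# Hypothetical rows, shaped like the array of hashes a database
# library might hand back. The field names are made up.
rows = [
  { name: 'betsy', kind: 'cow',   in_service: true  },
  { name: 'jake',  kind: 'horse', in_service: false },
  { name: 'fred',  kind: 'horse', in_service: true  }
]

# Each step takes plain data and returns plain data; nothing is mutated.
def in_service(rows)
  rows.select { |row| row[:in_service] }
end

def names_by_kind(rows)
  rows.group_by { |row| row[:kind] }
      .transform_values { |group| group.map { |row| row[:name] } }
end

# The whole request is a pipeline from rows to a JSON string.
def to_response(rows)
  JSON.generate(names_by_kind(in_service(rows)))
end

puts to_response(rows)
# prints {"cow":["betsy"],"horse":["fred"]}
```

No model objects are built along the way: the data stays as hashes and arrays from the query result to the JSON string.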

Seems plausible, I’ve been thinking. However, I’ve never been wildly good at understanding the problems of an approach just by thinking about it. It’s more efficient for me to learn by doing. So I’ve decided to strangle an application whose communication with its database is, um, labored.

I’m going to concentrate on two things:

  • Structuring the code. More than a year of work on Midje has left me still unhappy about the organization of its code, despite my using Kevin Lawrence’s guideline: if you have trouble finding a piece of code, move it to where you first looked. I have some hope that Ruby’s structuring tools (classes, modules, include, etc.) will be useful.

  • Dependencies. As you’ll see, I’ll be writing code with a lot of temporal coupling. Is that and other kinds of coupling dooming me to a deeply intertwingled mess that I can’t change safely or quickly?

This blog post is about where I stand so far, after adding just one new feature.

How mocks can cut down on test maintenance

After around 11 months of not working on it, I needed to make a change to Critter4Us, an app I wrote for the University of Illinois vet school. The change was simple. When I tried to push it to Heroku, though, I discovered that my Ruby gems were too out of date. So, I ended up upgrading from Ruby 1.8 to 1.9, to Sinatra 1.3 from a Sinatra less than 1.0, to a modern version of Rack, etc. etc. In essence, I replaced all the turtles upon which my code-world was resting. There were some backwards-compatibility problems.

One incompatibility was that I was sending an incorrectly formatted URI to fetch some JSON data. The old version of Rack accepted it, but the new one rejected it. The easy fix was to split what had been a single `timeslice` parameter up into multiple parameters. [Update: I later did something more sensible, but it doesn’t affect the point of this post.] “Crap!”, I thought. “I’m going to have to convert who knows how much test data.” But I was pleased when I looked at the first test and saw this:

The key point here is that neither the format of the URI parameters nor the resulting timeslice object is given in its real form. Instead, they’re represented by strings that basically name their type. (In my Clojure testing framework, Midje, I refer to these things as “metaconstants”.)

The only responsibility this code has toward timeslices is to pass them to another object. That object, the `internalizer`, has the responsibility for understanding formats. The test (and code) change is trivial:

The test is even (and appropriately) less specific than before. It says only that the GET parameters (a hash) will contain some key/value pairs of use to the internalizer. It’s up to the internalizer to know which those are and do the right thing.

The point here is that the size of the test change is in keeping with the size of the code change. It is unaffected by the nature of the change to the data—which is as it should be.
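To illustrate the idea of standing in for data with strings that name their type, here is a small self-contained sketch. Everything in it is hypothetical: `TimesliceController`, `Stub`, and the method names are invented, and a hand-rolled stub is used in place of a real mocking library. It is not the Critter4Us test.

```ruby
# Hypothetical code under test. It only passes things along;
# it never looks inside the parameters or the timeslice.
class TimesliceController
  def initialize(internalizer, availability)
    @internalizer = internalizer
    @availability = availability
  end

  def animals_available(params)
    timeslice = @internalizer.find_timeslice(params)
    @availability.animals_for(timeslice)
  end
end

# A hand-rolled stand-in: it records the argument it was given
# and returns a canned value.
class Stub
  attr_reader :received

  def initialize(method_name, canned_result)
    @received = []
    define_singleton_method(method_name) do |arg|
      @received << arg
      canned_result
    end
  end
end

internalizer = Stub.new(:find_timeslice, 'a timeslice')
availability = Stub.new(:animals_for, 'some animals')
controller   = TimesliceController.new(internalizer, availability)

result = controller.animals_available('the GET parameters')

internalizer.received  # => ["the GET parameters"]
availability.received  # => ["a timeslice"]
result                 # => "some animals"
```

Since the strings only name what they stand for, changing the real format of the parameters or of the timeslice would leave a test like this untouched.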

This application is the one where I finally made the important decision to use mocks heavily in the Freeman and Pryce “London” style and—most importantly—to not fall into the trap of thinking “Mocks are stupid!” when I ran into problems. Instead, I said “I’m stupid!” and, working on that assumption, figured out what I was doing wrong.

I made that decision halfway through writing the app. One of the happy results of the mocking that followed was that a vast amount of test setup code devoted to constructing complete data structures went away. No more “fixtures” or “object mothers” or “test factories.”

Looking for contract work

As I mentioned earlier this year, I’m looking to make one of my decadal career shifts. Since that decision, I’ve been doing part-time contract work on a RubyCocoa application, and I’ve found it satisfying to deliver working software to people who are happy to get it. It’s also helped with the nagging dread that—while I can talk the talk about programming, testing, refactoring, and all that—I wouldn’t be able to walk the walk. It turns out I can. Although I’m slower than I’d like, I do respectable work.

In my ideal contract, I’d:

  • … code in Clojure, Ruby, or Javascript. Other than that, I don’t require super-advanced or cool technology, but I do have a hankering to work on something that could somehow be the inspiration for another book. I don’t care about the domain.

  • … devote 1/2 to 3/4 of my time to a single project, working with a single team, over a period of months. Some portion of that—a week or two a month—would be spent onsite. (Chicago would be the best place because it’s easily accessible by train. I live in Central Illinois.)

  • … work at a sustainable pace, and be given the leeway to do a good job by my standards. I’m trying to be artisanal about my code.

    (“I want to be artisanal” might raise red flags: will I decide I know what’s needed better than those who are paying for it? My saving grace is that I have a Labrador-like eagerness to please. I want product owners to smile when they think of me.)

  • … be able to stretch by occasionally going slower while I experiment with techniques. (My work on outside-in TDD in Clojure is an example.) I’m willing to be paid less in order to improve faster.

  • … be in frequent contact with the people who’ll viscerally appreciate the features they get for the money they spend. That given, I don’t care whether I am working directly for a product company or as a subcontractor on behalf of a contract programming company.

  • … work in an Agile style. (I almost didn’t think to include this, since I assume anyone interested in hiring me would expect or accept that. I’m not interested in a job teaching the glories of continuous integration or TDD or refactoring. I’m interested in learning how to do them ever better, and in working with people who have the same interests.)

However, there may not be an ideal, and I don’t intend to be rigid about opportunities. I could see, for example, working with several teams at once, being someone who helps convert a daily grind into an exploration of new techniques. That’d be more like my consulting past, but I’d be more hands-on than I have been in the past, involved for longer, and feel more responsible for the product.

Also: although I listed Chicago as my desired location, it has drawbacks when it comes to (1) winter and (2) helping me with my (currently somewhat faltering) attempt to learn Spanish. I wouldn’t mind working in Costa Rica, Argentina, or elsewhere in Latin America (probably for a longer continuous chunk of time onsite).

I don’t have a huge portfolio of code to show you. What I have is on Github. Critter4Us shows my Cappuccino and Ruby code. My Clojure code is limited to Midje, which is a programmer’s tool rather than an end-user project.

My email address is marick@exampler.com.

Combining related expectations into a single one (mocking, midje)

Here’s a simplified example of a Ruby test:

  def test_queuing
    during {
      @sut.sendMail(:ignored_sender)
    }.behold! {
      @mailPayload.receives(:attachment_path) { "attachment path" }
      @allContacts.receives(:selectedContact) { "contact" }
      listeners_receive_notification_with_info(News::QueueOutgoingAttachment,
                                               :source_path => "attachment path",
                                               :recipient => "contact")
      listeners_receive_notification(News::CheckOutboxQueue)
    }
  end

I find this annoying. It obscures what the test is about because the “definition” of the attachment path is far away from its use. If you were using your native tongue to describe what’s happening, you would not say:

First, let’s say that there’s a thing called “the attachment path”. You get it by asking the Mail Payload for its attachment path. There’s also a “contact”. It’s the contact selected in the list of All Contacts.

When the “send mail” button is clicked, a notification is generated. Its “source path” is the attachment path and the “recipient” is the contact.

(Well, you might if you were so completely corrupted by years of mathematics training that its definition-theorem-proof style seems natural.) If I were listening to you, I’d prefer to hear something like this:

When the “send mail” button is clicked, a notification is generated. (Key fact comes first.) Its “source path” is the Mail Payload’s attachment path and the “recipient” is the contact currently selected in All Contacts. (Definitions and uses in the same place.)

In code, that’d look more like this:

  def test_queuing
    during {
      @sut.sendMail(:ignored_sender)
    }.behold! {
      listeners_receive_notification_with_info(News::QueueOutgoingAttachment,
                                               :source_path => @mailPayload.attachment_path,
                                               :recipient => @allContacts.selectedContact)
      listeners_receive_notification(News::CheckOutboxQueue)
    }
  end

That emphasizes an important fact: that I don’t really care what the attachment path or selected contact are, because whatever they produce gets stuffed into the recipient. Indeed, the real selected contact isn’t anything like a string. In the first example, I can use “contact” to represent it because the code treats it as any-old opaque object, and a string is easy for me to type. (It’d probably be better to use symbols for this, I just now realize, because they’re less likely to be taken to be an example of the actual object.)

In OO Land, you could make an argument that my annoyance is a sign I’m Doing It Wrong: I’ve given responsibilities to the wrong objects. I think that’s certainly so in some cases, though (probably) not this one.

But the same thing happens in Functional Land, where the responsibility argument is harder to make because dumb objects are more natural. Here’s an example that uses my Midje TDD library for Clojure:

(fact "unless overridden, each procedure can be used with each animal"
  (all-procedures-no-exclusions) => { ...procedure-name... [] }
  (provided
    (procedures) => [ ...procedure... ]
    (procedure-names [ ...procedure... ]) => [ ...procedure-name... ]))

This is a little terser, but I have the same pattern of two expectation-statements to express one idea — in this case, the idea that the program wants to reach out and grab a list of all the names of all the procedures.

I’m thinking of allowing this kind of fact in Midje:

(fact "unless overridden, each procedure can be used with each animal"
  (all-procedures-no-exclusions) => { ...procedure-name... [] }
  (provided
    (procedure-names (procedures)) => [ ...procedure-name... ]))

You’d get an expectation failure for line 4 if either (1) procedures was never called or (2) procedure-names didn’t use what procedures produced.

Comments? (from residents of either land) Who’s done things like this before? How has it worked out? Gotchas? (Replies might go in the Midje mailing list.)

Today’s fun ruby fact

It looks as if block_given? is scoped like a local variable. Try running this:
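A small sketch of the behavior, assuming a lambda created inside a method and called later from outside it. The method name `capture` is made up for illustration, and this is not necessarily the snippet originally shown here.

```ruby
# The lambda is created inside `capture`, so block_given? inside it
# reports on the block passed to `capture`, even when the lambda is
# called long after `capture` has returned.
def capture
  lambda { block_given? }
end

with_block    = capture { :unused }
without_block = capture

with_block.call     # => true
without_block.call  # => false
```

In other words, `block_given?` is resolved against the method the lambda was defined in, much as a local variable of that method would be.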

Shoulda uses this.