String Interpolation in Clojure

Update: An updated version of the string interpolation function is now available as part of the core.incubator project, nestled safely in the clojure.core.strint namespace (as of version 0.1.1-SNAPSHOT of core.incubator).  Seriously, use it instead.


It's strange how some days or weeks have running themes. One theme for me this week, programming-wise, has been string interpolation.

I've become weary of format of late, and all of the other formats out there aren't any more pleasant – variadic (and even keyword or named-argument) string replacement is just a dull tool compared to real interpolation.

The Scala implementation post was the last straw for me, especially because (with all due respect to Vassil, as he's doing very well with the materials he has at his disposal) it showcases so many of the aspects of Scala that I came to dislike in the course of using it for a year or so: the tortured syntax; the rope, nay, the barbed wire that is implicit conversions; the bear trap of traits.

A Clojure Implementation

OK, enough flame-bait. What I'm really here to do is show how easy it is to add string interpolation to Clojure, and how simple its implementation is:

[sourcecode language="clojure"]
(ns commons.clojure.strint
  (:use [clojure.contrib.duck-streams :only (slurp*)]))

(defn- silent-read
  [s]
  (try
    (let [r (-> s java.io.StringReader. java.io.PushbackReader.)]
      [(read r) (slurp* r)])
    (catch Exception e))) ; this indicates an invalid form -- s is just string data

(defn- interpolate
  ([s atom?]
    (lazy-seq
      (if-let [[form rest] (silent-read (subs s (if atom? 2 1)))]
        (cons form (interpolate (if atom? (subs rest 1) rest)))
        (cons (subs s 0 2) (interpolate (subs s 2))))))
  ([#^String s]
    (let [start (max (.indexOf s "~{") (.indexOf s "~("))]
      (if (== start -1)
        [s]
        (lazy-seq (cons
                    (subs s 0 start)
                    (interpolate (subs s start) (= \{ (.charAt s (inc start))))))))))

(defmacro << [string]
  `(str ~@(interpolate string)))
[/sourcecode]

Don't mind the namespace – that's just where we put extensions to Clojure-the-language. The public macro << (named as an homage to heredocs) takes a single string argument, and emits a str invocation that concatenates the string data and evaluated expressions contained within that argument.
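To see what that emitted str invocation actually looks like, here is a minimal sketch of the same idea that should run on a current Clojure without clojure.contrib. It is an adaptation, not the listing above verbatim: it assumes clojure.core's slurp in place of slurp*, uses the newer ^String hint syntax, and picks the earliest ~{ or ~( marker instead of calling max (max would skip an earlier ~{ when a ~( appears later in the same string). The var n is only there for illustration.

```clojure
(defn- silent-read
  "Reads one form from the head of s. Returns [form remainder-string],
  or nil when the head of s is not a readable form."
  [s]
  (try
    (let [r (-> s java.io.StringReader. java.io.PushbackReader.)]
      [(read r) (slurp r)])
    (catch Exception _ nil))) ; invalid form -- the head of s is just string data

(defn- interpolate
  "Returns a lazy seq of literal strings and read forms."
  ([s atom?]
   (lazy-seq
     (if-let [[form rest] (silent-read (subs s (if atom? 2 1)))]
       ;; for ~{...}, drop the closing brace from the remainder
       (cons form (interpolate (if atom? (subs rest 1) rest)))
       ;; nothing readable after the marker: keep the marker as literal text
       (cons (subs s 0 2) (interpolate (subs s 2))))))
  ([^String s]
   ;; assumption: choose the earliest marker rather than the max index
   (if-let [start (->> ["~{" "~("]
                       (map #(.indexOf s ^String %))
                       (remove neg?)
                       sort
                       first)]
     (lazy-seq (cons (subs s 0 start)
                     (interpolate (subs s start)
                                  (= \{ (.charAt s (inc start))))))
     [s])))

(defmacro << [string]
  `(str ~@(interpolate string)))

(def n 99) ; illustrative value

;; All of the parsing happens during macroexpansion; what remains is a str call.
(macroexpand-1 '(<< "~{n} bottles, then ~(dec n)"))
;; => (clojure.core/str "" n " bottles, then " (dec n) "")

(<< "~{n} bottles, then ~(dec n)")
;; => "99 bottles, then 98"
```

The empty strings at either end of the expansion are just the (empty) literal chunks before the first marker and after the last one; str ignores them.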

Example Usage

First, let's get a value we can refer to:

[sourcecode]commons.clojure.strint=> (def n 99)[/sourcecode]

You can do simple value replacement:

[sourcecode]
commons.clojure.strint=> (<< "There's ~{n} bottles of beer on the wall...")
"There's 99 bottles of beer on the wall..."
[/sourcecode]

And evaluate arbitrary code:

[sourcecode]
commons.clojure.strint=> (<< "There's ~(dec n) bottles of beer on the wall...")
"There's 98 bottles of beer on the wall..."
commons.clojure.strint=> (<< "There's ~(seq (range n 90 -1)) bottles of beer on the wall...")
"There's (99 98 97 96 95 94 93 92 91) bottles of beer on the wall..."
[/sourcecode]

You can use any functions or macros you have available in your Clojure environment:

[sourcecode]
commons.clojure.strint=> (defn- some-function [] {:name "Chas" :zip-code "01060"})
#'commons.clojure.strint/some-function
commons.clojure.strint=> (<< "My name is ~(:name (some-function)), it's nice to meet you.")
"My name is Chas, it's nice to meet you."
[/sourcecode]

…including interop with Java methods:

[sourcecode]
commons.clojure.strint=> (<< "You have approximately ~(.intValue 5.5) minutes left.")
"You have approximately 5 minutes left."
[/sourcecode]

Caveats

First, let's say what's wrong with this implementation compared to, say, Ruby's string interpolation (I may be missing other points, I'm no Ruby hacker):

  1. Strings cannot be used within interpolated expressions; e.g. this fails with a straightforward compile-time exception:
    [sourcecode]
    commons.clojure.strint=> (<< "~(str n "another string")")
    #<CompilerException java.lang.IllegalArgumentException: Wrong number of args passed to: strint$-LT--LT-
    [/sourcecode]
    The Clojure reader sees this as providing four arguments to the << macro (two strings, with the symbols another and string between them). Being able to use strings within interpolated expressions would require a "native" Clojure reader macro for interpolated strings, or the ability to define reader macros in "userspace" (Clojure's read table cannot currently be modified from Clojure code; this is an intentional design decision). Update: pmjordan mentioned on Hacker News that you can get around this by escaping the nested strings, like so:
    [sourcecode]
    commons.clojure.strint=> (<< "~(str n \" another string\")")
    "99 another string"
    [/sourcecode]
    Very true, and very useful in a pinch, but I would definitely consider it to be a wart (and an issue that is insurmountable from Clojure userland right now).
  2. Heredocs aren't available. That's a far more general shortcoming compared to other languages, but is still related to string interpolation. This is significantly mitigated by the fact that Clojure strings are multiline already, but it would be nice in some circumstances to be able to specify a block of text using different delimiters for one-off templating, etc.
  3. Lazy sequences need to be made strict in order for them to print as they do at a REPL (thus the additional seq invocation around (range n 90 -1) in the example above).
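The first and third caveats are easy to poke at from a REPL without the macro at all. This is a small sketch, not part of the implementation: the var names are made up for illustration, map stands in for "some lazy sequence", and pr-str is shown only as one possible alternative to seq.

```clojure
;; Caveat 1: the unescaped call, held in a string so we can hand it to the reader.
(def unescaped-call "(<< \"~(str n \"another string\")\")")

;; Everything after the << symbol is what the macro would receive as arguments.
(def args (rest (read-string unescaped-call)))

(count args)        ; more than the single argument << expects
(map string? args)  ; strings at both ends, symbols in between

;; Caveat 3: str on a lazy seq yields an opaque object identifier...
(def lazy (map inc [1 2 3]))

(str lazy)          ; something like "clojure.lang.LazySeq@..."

;; ...whereas forcing it with seq, or printing it with pr-str, shows the elements.
(str (seq lazy))    ; => "(2 3 4)"
(pr-str lazy)       ; => "(2 3 4)"
```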

Advantages

I'm sure a lot of people will look at this implementation and say, "so what?". Well, it's got a lot going for it:

  1. Simple implementation. Unless you've got a Pavlovian aversion to parentheses (but are somehow immune to piles of braces?), it's very comprehensible.
  2. It's user-land code. Many languages would require a compiler extension or modifications to the language core to pull this off.
  3. The interpolation happens at compile-time! The only processing that occurs at runtime is the concatenation of the chunks of each string; all of the string and expression parsing happens before your code using the << macro would hit a customer's server or desktop. This is decidedly in contrast with the Scala interpolation implementation, where all of the string parsing is done at runtime; to my knowledge, doing anything else would require a compiler plugin there.
  4. It's fully composable with all other Clojure code. There's no restriction on where you can use the << macro, and no restriction on what Clojure (or Java!) code you can include in interpolation expressions.
  5. There's no magic. Many languages make it very easy to inject magical – as in, opaque – behaviour into your code. The Scala interpolation implementation is no different – to get that special behaviour out of a String, one must call a magical method i in order to rope in the machinery around the InterpolatedString implicit conversion. On the other hand, all of the effects and actors involved in the << macro are local, and its semantics and calling conventions are exactly the same as any other Clojure macro.

Exhale...

So, hopefully that puts string interpolation behind me. I'd love to see something like this become a reader macro in Clojure someday (maybe in conjunction with heredoc support), but in the meantime, this will make a lot of one-off templating jobs a whole lot easier in Clojure compared to using the usual variadic string replacement methods that are otherwise available.