Highlights of the 2011 Clojure/conj

The second annual Clojure/conj came to a close last November. I was able to attend and wanted to share some impressions of the conference in general and a couple of the more inspiring talks in particular. I’ll be happy to further detail any of the talks for any interested parties.

Clojure, for those unfamiliar, is a functional Lisp dialect created by Rich Hickey. Rich wanted to bring the flexibility and productivity of dynamic languages to the Java world. Clojure compiles to Java bytecode, allowing it to run on the JVM and thus benefit from the entire Java library/deployment ecosystem. There are additional Clojure compilers targeting the CLR and JavaScript platforms.

Chief among Clojure’s aims are performance, programmer productivity and simplified concurrent programming. Clojure is a functional language that looks beyond Moore’s law to Amdahl’s law, predicting that the increasing number of compute tasks that can benefit from multi-core processing will require the adoption of a programming model that fully embraces concurrency.

A pragmatic, general-purpose language, Clojure is finding purchase in a wide array of pursuits. Because Clojure is perhaps best known for its success with Big Data and concurrency tasks, I was impressed by the diversity of the topics: the conference was a series of success stories in such areas as machine intelligence, performance engineering, genome processing, web development, and live music coding.

What surprised me about the attendees was how many of them, myself included, were drawn to Clojure despite having little experience with it. It was a pretty even split between Rubyists and Java programmers, with a substantial Lisp contingent intermixed.

As the conference was single-track, I was able to attend every talk. While they were, with a very few exceptions, all very interesting, a couple in particular held immediate practical interest.

Recently acquired by Twitter via BackType, Cascalog is a Clojure-based query language that bills itself as “data processing on Hadoop without the hassle”. Cascalog emphasizes function composition to make complicated social-graph-style queries easy. Its creator, Nathan Marz, demonstrated a variety of queries that answer questions that can be awkward in the SQL world, such as “Which Twitter ‘following’ relationships are between two users that each have more than 2 following relationships?” In Cascalog, the query looks very straightforward (and is shorter than the previous sentence in this paragraph)…

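;; Subquery: people who follow more than two others.
(let [many-follows (<- [?person]
                       (follows ?person _)
                       (c/count ?count)
                       (> ?count 2))]
  ;; Print each follow relationship whose two users both qualify.
  (?<- (stdout) [?person1 ?person2]
       (many-follows ?person1)
       (many-follows ?person2)
       (follows ?person1 ?person2)))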

To break this down, the let form binds many-follows to a subquery that, for each person, counts the people they follow and keeps only those people whose count is greater than 2.

Then, with stdout as the output sink and ?person1 and ?person2 as the output fields, the main query keeps only the pairs in which each person appears in many-follows and person1 follows person2.

The above is computed over a series of map/reduce jobs that retrieve the necessary data, break it into chunks, farm those chunks out to nodes, and recombine the output of those nodes.
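For readers without a Hadoop cluster handy, the logic of the query can be approximated in plain Clojure over an in-memory collection. This is only a sketch of what the query computes, not how Cascalog executes it; the sample data and the function names many-follows and heavy-follow-pairs are made up for illustration.

```clojure
;; Hypothetical sample data: each pair [a b] means "a follows b".
(def follows
  [["alice" "bob"] ["alice" "carol"] ["alice" "dave"]
   ["bob" "alice"] ["bob" "carol"] ["bob" "dave"]
   ["carol" "alice"]
   ["dave" "alice"] ["dave" "bob"] ["dave" "carol"]])

(defn many-follows
  "Set of people who appear as the follower in more than n pairs."
  [pairs n]
  (->> pairs
       (map first)                        ; keep only the follower of each pair
       frequencies                        ; follower -> number of people followed
       (filter (fn [[_ cnt]] (> cnt n)))  ; keep followers above the threshold
       (map key)
       set))

(defn heavy-follow-pairs
  "Pairs [a b] where a follows b and both a and b follow more than n people."
  [pairs n]
  (let [heavy (many-follows pairs n)]
    ;; A set acts as a membership predicate when called as a function.
    (filter (fn [[a b]] (and (heavy a) (heavy b))) pairs)))

(heavy-follow-pairs follows 2)
;; => (["alice" "bob"] ["alice" "dave"] ["bob" "alice"]
;;     ["bob" "dave"] ["dave" "alice"] ["dave" "bob"])
```

Here carol follows only one person, so every relationship that involves her drops out. The Cascalog version expresses the same two steps declaratively and leaves the distribution of the work to Hadoop.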

Chas Emerick’s talk, “Modeling the world probabilistically using Bayesian networks in Clojure”, was particularly interesting. Chas described his project Raposo, a Clojure library for Bayesian inference and modeling that he’s preparing for release. His use case for Raposo is the analysis of large bodies of documents in a constantly changing set of ad-hoc formats, such as SEC filings, which need to be parsed into structured data for extraction. Chas has clearly done much work in this area, and this talk got me very interested in learning more about it.

There were several talks on ClojureScript, a Clojure compiler targeting JavaScript. ClojureScript aims to be a modern, robust language that can reach to all the places that JavaScript does, bringing with it Clojure features that are too painful to code in JavaScript directly, like namespaces.

ClojureScript is often confused with projects like CoffeeScript. While CoffeeScript is a small additional layer of syntax that compiles to JavaScript, ClojureScript is Clojure hosted on the JavaScript VM, and it offers very good feature parity with Clojure as well as excellent JavaScript interoperability. A particular concern I had here was integration with third-party libraries, but by all reports, using external libraries like jQuery from within ClojureScript is trivially easy.

One of the more exciting talks on ClojureScript was an ad-hoc after-hours demonstration of the ClojureScript REPL, which David Nolen used from within Emacs to send Clojure commands to a running browser instance, enjoying full control over browser state and the DOM. An interesting consequence of this REPL is that there now exists a JavaScript debugger for Internet Explorer going back to version 6. Debugging JavaScript in Internet Explorer has long been a pain point and has started improving only in recent versions.

Of less practical value, but very cool nonetheless, was Sam Aaron’s demonstration of Overtone, a live music coding platform. Typing Clojure forms using the Overtone syntax into Emacs and executing them let Sam programmatically create music on the fly. He used it to assign controls to a Monome, a wired box with lit buttons, allowing him to produce beats, synth sounds, and samples by pressing physical buttons. His impromptu jam session got a standing ovation and the lion’s share of the post-conference press on Twitter.

Also deserving of a shoutout is Daniel Spiewak, whose talk on data structures was so animated, emphatic and incisive that it kept the entire room awake during a talk on data structures. In fact, calling it a ‘talk’ is probably under-selling it; it was more of a divinely inspired, Robin-Williams-style conniption fit.

A few of the more notable names in attendance were Rich Hickey himself; Daniel P. Friedman, author of The Little Schemer and a host of other Lisp-related texts; William Byrd, co-author with Prof. Friedman of The Reasoned Schemer; “Uncle Bob” Martin, object-oriented author and speaker; Phil Bagwell, inventor of the VList data structure; Ola Bini of JRuby fame; and a large and boisterous subset of the Clojure/core team. There were certainly others I’m not hip enough to recognize.

That’s my run-through of the more inspiring talks at Clojure/conj 2011. My side-project TODO list is now a few items longer, and I’m already tinkering with the domestic schedule to see if I can attend Clojure/West in San Jose in mid-March 2012 (one week after PyCon 2012 and 15 minutes away, incidentally).

Some links for more information…

Slides for all talks are being gathered at…

https://github.com/relevance/clojure-conj/tree/master/2011-slides

Videos of the talks will eventually be available at…

http://blip.tv/clojure/

Cascalog slides…

https://github.com/relevance/clojure-conj/raw/master/2011-slides/nathan-marz-cascalog.pdf

Cascalog example code…

https://github.com/nathanmarz/cascalog-conj/blob/master/src/clj/cascalog/conj/play.clj

Bayesian Network slides…

https://github.com/relevance/clojure-conj/raw/master/2011-slides/cemerick-modeling-the-world-with-bayesian-networks.pdf

Rich Hickey unveils ClojureScript…

Sam Aaron’s Overtone slides…

https://github.com/relevance/clojure-conj/raw/master/2011-slides/samaaron-overtone.pdf

Overtone introduction video…

Daniel Spiewak’s slides on functional data structures…

https://github.com/relevance/clojure-conj/raw/master/2011-slides/daniel-spiewak-extreme-cleverness.pdf

 
