Clojure – Fascinated, Disappointed, Astonished

brackets-banner I’ve had a pleasure to work with Piotrek Jagielski for about two weeks on Clojure project. I’ve learned a lot, but there is still a lot to know about Clojure for me. In this post I’ll write what fascinated, disappointed and astonished me about this programming language.

Clojure & InteliJ IDEA tips

Before you start your journey with Clojure: * Use Cursive plugin for InteliJ IDEA. In ’14 Edition it was not in the standard plug-in repository (remove La Clojure plug-in and Cursive repository manually). For IDEA ’15 it is in repository. * Colored brackets help me a lot. You can find configuration for colored brackets on Misophistful Github.

Fascinated

Syntax

For many people Clojure brackets are reasons to laugh. Jokes like that were funny at first: “How many brackets did you write today?” I have to admit, that at the beginning using brackets was not easy for me. Once I’ve realized that the brackets are just on the other side of the function name, everything was simple and I could code very fast. After few days I’ve realized that this brackets structure forces me to think more about the structure of the code. As a result the code is refactored and divided into small functions. Clojure forces you to use good programming habits.

Data structure is your code

Clojure is homoiconic, which means that the Clojure programs are represented by Clojure data structures. This means that when you are reading a Clojure code you see lists, maps, vectors. How cool is that! You only have to know few things and you can code.

Do not restart your JVM

Because Clojure code is represented as data structures, you can pass data structure (program) to running JVM. Furthermore, compiling your code to bytecode (classes, jars) may be eliminated.

For example, when you want to test something you are not obligated to start new JVM with tests. Instead you can just synchronize your working file with running REPL and run the function.

Traditional way of working with JVM is obsolete.

clojure-repl

In the picture above, on the left you can see an editor, on the right there is running REPL.

The same way you can run tests, which is extremely fast. In our project we had ~80 tests. Executing them all took about one second.

Simplicity is the ultimate sophistication – Leonardo Da Vinci

After getting familiar with this language, it was really easy to read code. Of course, I was not aware of everything what was happening under the hood, but consistency of the written program evoked sense of control.

Disapointed

Data structure is your code

When data structure is your code, you need to have some additional operators to write effective programs. You should get to know operators like ‘->>’, ‘->’, ‘let’, ‘letfn’, ‘do’, ‘if’, ‘recur’ …

Even if there is a good documentation (e.g. Let), you have to spend some time on analyzing it, and trying out examples.

As the time goes on, new operators will be developed. But it may lead to multiple Clojure dialects. I can imagine teams (in the same company) using different sets of operators, dealing with the same problems in different ways. It is not good to have too many tools. Nevertheless, this is just my suspicion.

Know what you do

I’ve written a function that rounds numbers. Despite the fact that this function was simple, I wanted to write test, because I was not sure if I had used the API in correct way. There is the test function below:

(let [result (fixture/round 8.211M)]
  (is (= 8.21M result))))

Unfortunately, tests were not passing. This is the only message that I received:

:error-while-loading pl.package.calc-test
NullPointerException   [trace missing]
(pst)
NullPointerException

Great. There is nothing better than a good exception error. I’ve spent a lot of time trying to solve this, and solution was extremely simple. My function was defined with defn-, instead of defn. defn- means private scope and test code, could not access testing function.

Do not trust assertions

Assertions can be misleading. When tested code does not work properly and returns wrong results, error messages are like this:

ERROR in math-test/math-operation-test (RT.java:528)
should round using half up
expected: (= 8.31M result)
 actual: java.lang.IllegalArgumentException: Don't know how to create ISeq from: java.math.BigDecimal

I hadn’t got time to investigate it, but in my opinion it should work out of the box.

Summary

It is a matter of time, when tools will be better. Those problems will slow you down, and they are not nice to work with.

Astonished

The Clojure concurrency impressed me. Until then, I knew only standard Java synchronization model and Scala actors model. I’ve never though that concurrency problems can be solved in a different way. I will explain Clojure approach to concurrency, in details.

Normal variables

The closest Clojure’s analogy to the variables are vars, which can be created by def.

(defn a01 []
  (def amount 10)
  (def amount 100)
  (println amount))

We also have local variables which are only in let scope. If we re-define scope value of amount, the change will take place only in local context.

(defn a02 []
  (let [amount 10]
    (let [amount 100]
      (println amount))
    (println amount)))

The following will print:

100
10

Nothing unusual. We might expect this behavior.

Concurrent access variables

The whole idea of concurrent access variables can be written in one sentence. Refs ensures safe shared access to variables via STM, where mutation can only occur via transaction. Let me explain it step by step.

What is Refs?

Refs (reference) is a special type to hold references to your objects. As you can expect, basic things you can do with it is storing and reading values.

What is STM?

STM stands for Software Transactional Memory. STM is an alternative to lock-based synchronization system. If you like theory, please continue with Wikipedia, otherwise continue reading to see examples.

Using Refs

(defn a03 []
  (def amount (ref 10))
  (println @amount))

In the second line, we are creating reference. Name of this reference is amount. Current value is 10. In the third line, we are reading value of the reference called amount. Printed result is 10.

Modifying Refs without transaction

(defn a04 []
  (def amount (ref 10))
  (ref-set amount 100)
  (println @amount))

 

Using ref-set command, we modify the value of the reference amount to the value 100. But it won’t work. Instead of that we caught exception:

IllegalStateException No transaction running  clojure.lang.LockingTransaction.getEx (LockingTransaction.java:208)

Using transaction

(defn a05 []
  (def amount (ref 10))
  (dosync (ref-set amount 100))
  (println @amount))

To modify the code we have to use dosync operation. By using it, we create transaction and only then the referenced value will be changed.

Complete example

The aim of the previous examples was to get familiar with the new operators and basic behavior. Below, I’ve prepared an example to illustrate bolts and nuts of STM, transactions and rollbacks.

The problem

Imagine we have two references for holding data: * source-vector containing three elements: “A”, “B” and “C”. * empty destination-vector.

Our goal is to copy the whole source vector to destination vector. Unfortunately, we can only use function which can copy elements one by one – copy-vector.

Moreover, we have three threads that will do the copy. Threads are started by the future function.

Keep in mind that this is probably not the best way to copy vectors, but it illustrates how STM works.

(defn copy-vector [source destination]
  (dosync
    (let [head (take 1 @source)
          tail (drop 1 @source)
          conj (concat head @destination)]
      (do
        (println "Trying to write destination ... ")
        (ref-set destination conj)
        (println "Trying to write source ... ")
        (ref-set source tail)
        (println "Sucessful write " @destination)))))

(defn a06 []
  (let [source-vector (ref ["A" "B" "C"]) destination-vector (ref [])]
    (do
      (future (copy-vector source-vector destination-vector))
      (future (copy-vector source-vector destination-vector))
      (future (copy-vector source-vector destination-vector))
      (Thread/sleep 500)
      @destination-vector
      )))
Execution

Below is the output of this function. We can clearly see that the result is correct. Destination vector has three elements. Between Sucessful write messages we can see that there are a lot of messages starting with Trying to write. What does it mean? The rollback and retry occurred.

(l/a06)
Trying to write destination ...
Trying to write source ... 
Trying to write destination ...
Trying to write destination ... 
Sucessful write (A)
Trying to write destination ...
Trying to write destination ...
Trying to write source ...
Sucessful write  (B A)
Trying to write destination ...
Trying to write source ...
Sucessful write  (C B A)
=> ("C" "B" "A")
Rollback

Each thread started to copy this vector, but only one succeed. The remaining two threads had to rollback work and try again one more time.

stm-rollback

When Thread A (red one) wants to write variable, it notices that the value has been changed by someone else – conflict occurs. As a result, it stops the current work and tries again whole section of dosync. It will try until every write operation succeed.

Pros and cons of STM

Cons: * Everything that happens in dosync section has to be pure, without side effects. For example you can not send email to someone, because you might send 10 emails instead of one.
* From performance perspective, it makes sense when you are reading a lot from Refs, but rarely writing it.

Pros: * Written code is easy to read, understand, modify. * Refs and transactions are part of standard library, so you can use it in Vanilla Java. Take a look at this blog post for more examples.

Summary

There is a lot that Java developers can gain from Clojure. They can learn how to approach the code and how to express the problem in the code. Also they can discover tools like STM.

If you like to develop your skills, you should definitely experiment with Clojure.

Blog from Michał Lewandowski personal blog.

You May Also Like

33rd Degree day 1 review

33rd Degree is over. After the one last year, my expectations were very high, but Grzegorz Duda once again proved he's more than able to deliver. With up to five tracks (most of the time: four presentations + one workshop), and ~650 attendees,  there was a lot to see and a lot to do, thus everyone will probably have a little bit different story to tell. Here is mine.

Twitter: From Ruby on Rails to the JVM

Raffi Krikorian talking about Twitter and JVM
The conference started with  Raffi Krikorian from Twitter, talking about their use for JVM. Twitter was build with Ruby but with their performance management a lot of the backend was moved to Scala, Java and Closure. Raffi noted, that for Ruby programmers Scala was easier to grasp than Java, more natural, which is quite interesting considering how many PHP guys move to Ruby these days because of the same reasons. Perhaps the path of learning Jacek Laskowski once described (Java -> Groovy -> Scala/Closure) may be on par with PHP -> Ruby -> Scala. It definitely feels like Scala is the holy grail of languages these days.

Raffi also noted, that while JVM delivered speed and a concurrency model to Twitter stack, it wasn't enough, and they've build/customized their own Garbage Collector. My guess is that Scala/Closure could also be used because of a nice concurrency solutions (STM, immutables and so on).

Raffi pointed out, that with the scale of Twitter, you easily get 3 million hits per second, and that means you probably have 3 edge cases every second. I'd love to learn listen to lessons they've learned from this.

 

Complexity of Complexity


The second keynote of the first day, was Ken Sipe talking about complexity. He made a good point that there is a difference between complex and complicated, and that we often recognize things as complex only because we are less familiar with them. This goes more interesting the moment you realize that the shift in last 20 years of computer languages, from the "Less is more" paradigm (think Java, ASM) to "More is better" (Groovy/Scala/Closure), where you have more complex language, with more powerful and less verbose syntax, that is actually not more complicated, it just looks less familiar.

So while 10 years ago, I really liked Java as a general purpose language for it's small set of rules that could get you everywhere, it turned out that to do most of the real world stuff, a lot of code had to be written. The situation got better thanks to libraries/frameworks and so on, but it's just patching. New languages have a lot of stuff build into, which makes their set of rules and syntax much more complex, but once you get familiar, the real world usage is simple, faster, better, with less traps laying around, waiting for you to fall.

Ken also pointed out, that while Entity Service Bus looks really simple on diagrams, it's usually very difficult and complicated to use from the perspective of the programmer. And that's probably why it gets chosen so often - the guys selling/buying it, look no deeper than on the diagram.

 

Pointy haired bosses and pragmatic programmers: Facts and Fallacies of Software Development

Venkat Subramaniam with Dima
Dima got lucky. Or maybe not.

Venkat Subramaniam is the kind of a speaker that talk about very simple things in a way, which makes everyone either laugh or reflect. Yes, he is a showman, but hey, that's actually good, because even if you know the subject quite well, his talks are still very entertaining.
This talk was very generic (here's my thesis: the longer the title, the more generic the talk will be), interesting and fun, but at the end I'm unable to see anything new I'd have learned, apart from the distinction between Dynamic vs Static and Strong vs Weak typing, which I've seen the last year, but managed to forgot. This may be a very interesting argument for all those who are afraid of Groovy/Ruby, after bad experience with PHP or Perl.

Build Trust in Your Build to Deployment Flow!


Frederic Simon talked about DevOps and deployment, and that was a miss in my  schedule, because of two reasons. First, the talk was aimed at DevOps specifically, and while the subject is trendy lately, without big-scale problems, deployment is a process I usually set up and forget about. It just works, mostly because I only have to deal with one (current) project at a time. 
Not much love for Dart.
Second, while Frederic has a fabulous accent and a nice, loud voice, he tends to start each sentence loud and fade the sound at the end. This, together with mics failing him badly, made half of the presentation hard to grasp unless you were sitting in the first row.
I'm not saying the presentation was bad, far from it, it just clearly wasn't for me.
I've left a few minutes before the end, to see how many people came to Dart presentation by Mike West. I was kind of interested, since I'm following Warsaw Google Technology User Group and heard a few voices about why I should pay attentions to that new Google language. As you can see from the picture on the right, the majority tends to disagree with that opinion.

 

Non blocking, composable reactive web programming with Iteratees

Sadek Drobi's talk about Iteratees in Play 2.0 was very refreshing. Perhaps because I've never used Play before, but the presentation was flawless, with well explained problems, concepts and solutions.
Sadek started with a reflection on how much CPU we waste waiting for IO in web development, then moved to Play's Iteratees, to explain the concept and implementation, which while very different from the that overused Request/Servlet model, looked really nice and simple. I'm not sure though, how much the problem is present when you have a simple service, serving static content before your app server. Think apache (and faster) before tomcat. That won't fix the upload/download issue though, which is beautifully solved in Play 2.0

The Future of the Java Platform: Java SE 8 & Beyond


Simon Ritter is an intriguing fellow. If you take a glance at his work history (AT&T UNIX System Labs -> Novell -> Sun -> Oracle), you can easily see, he's a heavy weight player.
His presentation was rich in content, no corpo-bullshit. He started with a bit of history of JCP and how it looks like right now, then moved to the most interesting stuff, changes. Now I could give you a summary here, but there is really no point: you'd be much better taking look at the slides. There are only 48 of them, but everything is self-explanatory.
While I'm very disappointed with the speed of changes, especially when compared to the C# world, I'm glad with the direction and the fact that they finally want to BREAK the compatibility with the broken stuff (generics, etc.).  Moving to other languages I guess I won't be the one to scream "My god, finally!" somewhere in 2017, though. All the changes together look very promising, it's just that I'd like to have them like... now? Next year max, not near the heat death of the universe.

Simon also revealed one of the great mysteries of Java, to me:
The original idea behind JNI was to make it hard to write, to discourage people form using it.
On a side note, did you know Tegra3 has actually 5 cores? You use 4 of them, and then switch to the other one, when you battery gets low.

BOF: Spring and CloudFoundry


Having most of my folks moved to see "Typesafe stack 2.0" fabulously organized by Rafał Wasilewski and  Wojtek Erbetowski (with both of whom I had a pleasure to travel to the conference) and knowing it will be recorded, I've decided to see what Josh Long has to say about CloudFoundry, a subject I find very intriguing after the de facto fiasco of Google App Engine.

The audience was small but vibrant, mostly users of Amazon EC2, and while it turned out that Josh didn't have much, with pricing and details not yet public, the fact that Spring Source has already created their own competition (Could Foundry is both an Open Source app and a service), takes a lot from my anxiety.

For the review of the second day of the conference, go here.

Spring Security by example: securing methods

This is a part of a simple Spring Security tutorial:

1. Set up and form authentication
2. User in the backend (getting logged user, authentication, testing)
3. Securing web resources
4. Securing methods
5. OpenID (login via gmail)
6. OAuth2 (login via Facebook)
7. Writing on Facebook wall with Spring Social

Securing web resources is all nice and cool, but in a well designed application it's more natural to secure methods (for example on backend facade or even domain objects). While we may get away with role-based authorization in many intranet business applications, nobody will ever handle assigning roles to users in a public, free to use Internet service. We need authorization based on rules described in our domain.

For example: there is a service AlterStory, that allows cooperative writing of stories, where one user is a director (like a movie director), deciding which chapter proposed by other authors should make it to the final story.

The method for accepting chapters, looks like this:

Read more »