Getting started with Haskell, stack and spacemacs

It has been very long time since my last blog post. During this period I have become big enthusiast of functional programming, especially using Haskell language. In this and next posts I am going to show that Haskell can be very pleasant to use and with proper tools we are able to develop applications without unnecessary burden.

Recently, many useful tools and editors emerged and they are really easy and convenient to use. In this post I intend to present toolchain that I am using in my everyday Haskell programming.

This post is not an introduction to Haskell language. It is meant to describe how to setup Haskell with stack build tool and spacemacs as an editor. I am also planning to write a post about Haskell basics and its usage in my little project in series of the next posts.

The only necessary prerequisite is having the most recent version of Emacs installed on your system.

New project build/management tool – stack

Managing dependencies and build process is always a gruesome task and there are many tools to ease this work. In Haskell most popular dependency management tool is cabal. It is based on Hackage repository (https://hackage.haskell.org).
One of the desired features of build tools are reproducible builds. We would like to build project in new environment or on the new developer’s machine and have the same outcome in every situation. This would require same compiler version, same libraries etc.
Lately, new tool came out – stack (https://github.com/commercialhaskell/stack). It is aimed at reproducible builds and simple project management. Stack takes care of proper configuration of your project environment.
Stack achieves reproducible builds by using curated snapshot packages managed by special versioned resolvers. It uses cabal as a package manager. Packages are grouped into resolvers. There are two types of resolvers: LTS (long term support) and nightly. The latter contains packages in fresh version but there is also a drawback – potential instability. On the other hand, LTS resolvers contain fixed version of packages which are tested and should not cause any problems. If you are not in need of using latest packages version, LTS resolver should entirely satisfy your project needs.

What is more, stack can also download and setup locally Haskell compiler in version required by your project.

stack in action

Using stack to create new project is really easy. After installing it on our machine (description of installation is included in documentation on project’s GitHub page I linked above), all we need to do to create new project is execute below steps in our terminal:
stack new hello-haskell
cd hello-haskell
stack setup
stack build
stack exec hello-haskell
These commands create new project with name hello-haskell. stack setup initialises environment, install compiler (if it will be required) and necessary libraries for project. stack build builds and compiles project. stack exec … executes executable program built earlier.

If you would like to play around with your project’s code you should type stack ghci in your terminal – this will launch Haskell interactive console – ghci – in version specified in project configuration.

Another stack command worth mentioning is stack test which executes test suites declared in test/ directory.

Dependencies and project settings are placed in hello-haskell.cabal file. It is standard cabal configuration file where we can add desired libraries, set project version, licence, link to the repository and so on. I suggest reading some cabal documentation if you have any doubts but in my opinion this file is very easy and straightforward to edit.

Settings specific for stack are placed in stack.yaml file. Most important option is resolver – which influences version of GHC compiler and libraries your project will be using.

There is one thing you might encounter while setting up project dependencies. What if you need library that is not present in any of stack resolvers? Well, then we must go to stack.yaml file and edit or add section:

extra-deps:
- Vec-1.0.5

With this information stack will download and build desired package from hackage repositories. In my case I needed Vec library so I added it on a list with full name containing version number.

All details and gotchas are described in stack’s Wiki on GitHub. Be sure to check it out frequently as stack is still very young tool and it can change quite often. Documentation is very strong point of stack as it describes very well many aspects of its usages.

 

Powerful editor in new edition – spacemacs

I have spent a lot of time searching for editor that is easy to use with Haskell and that integrates well with its tools like REPL. I’ve been working with Sublime Text for some time as it is integrated quite well with Haskell when using SublimeHaskell package. However, recently I’ve discovered spacemacs project.
spacemacs (https://github.com/syl20bnr/spacemacs) is a easy-to-use kit for Emacs focused on ergonomics. What is great about it is that it embraces Evil mode of Emacs which mimics Vim-style editing and document navigation. With this feature spacemacs is really straightforward for users which know Vim. It is also possible to mix Vim and Emacs style in the same time.

In my opinion, it is really great feature as we can use this editor in the way we like more or is more convenient to us. Whether we are Vim-lovers or Emacs-fans or we want to mix them both – spacemacs allows to work in whatever style we like. I personally use mostly Vim-like mode with only few of original Emacs commands and with spacemacs shortcuts for many actions.

spacemacs is based on layers which add additional functionalities to editor. It can enrich our development environment with syntax completion, git integration, code completion and integration with build tools for many languages.

One of these layers is haskell layer. It supports this language quite well with syntax checking, code suggestions, built-in REPL and code templates for common patterns.

I refer to the official documentation for detailed installation instruction on various platform. After we are ready and spacemacs is on our disk, we can proceed.

Entire spacemacs configuration is placed in .spacemacs file in your home directory. This file is written in Lisp-like language and contains many options to change or add. Here is my current .spacemacs file on what this post section is based: 
https://gist.github.com/rafalnowak/202aba0ee7986515345b

In dotspacemacs-configuration-layers we need to add haskell layer (I also recommend setting auto-completion and syntax-checking layers as well). In order to get layer to work properly, we need to install some additional packages:
stack install stylish-haskell hlint hasktags
Next step is adding these two settings to .spacemacs just after text ;; User initialization goes here:
(add-hook 'haskell-mode-hook 'turn-on-haskell-indentation)
(add-to-list 'exec-path "~/.local/bin/")

It makes spacemacs aware of Haskell indentation style and adds binaries installed by stack to path. It is important as we want to make our editor able to run Haskell tools. 

Full description, as well as platform specific problems, are listed in Haskell layer documentation: https://github.com/syl20bnr/spacemacs/tree/master/layers/%2Blang/haskell There is also a list of useful shortcuts used by this layer.

One essential note: if you wish to use spacemacs with ghc-mod integration, you will need to install ghc-mod at least in version 5.4.0.0. Previous versions do not work properly with Haskell layer and stack. To install ghc-mod in this version you must add cabal-helper-0.6.1.0 to your extra-deps section in stack.yaml and run 

stack install ghc-mod

Which should proceed now without problems.  

After this configuration we are ready to use all power of Haskell and stack in our projects. We will also have solid support from editor. If you have followed steps above, you will see that spacemacs is colouring Haskell syntax, checking its correctness and giving you code completion tips. There is also interactive console for Haskell available under SPC m s s keys combination which makes quick testing of new functions possible. 

Unfortunately, there are some disadvantages of spacemacs. For me, the biggest drawdown is its responsivity. Sometimes during code completion or syntax checking it can hang application for a second or less.

 

Summary

As we could see, Haskell with stack and spacemacs is really powerful yet still simple to use. With stack we can achieve reproducible builds with specific compiler and libraries versions as well as easy project management. spacemacs allows us to create code quickly with support for Haskell syntax, build tools and code completion.

In my next post I am going to describe my experiences with my first bigger Haskell project – functional ray tracer I have been working on recently – https://github.com/rafalnowak/RaytracaH

 

You May Also Like

Zookeeper + Curator = Distributed sync

An application developed for one of my recent projects at TouK involved multiple servers. There was a requirement to ensure failover for the system’s components. Since I had already a few separate components I didn’t want to add more of that, and since there already was a Zookeeper ensemble running - required by one of the services, I’ve decided to go that way with my solution.

What is Zookeeper?

Just a crude distributed synchronization framework. However, it implements Paxos-style algorithms (http://en.wikipedia.org/wiki/Paxos_(computer_science)) to ensure no split-brain scenarios would occur. This is quite an important feature, since I don’t have to care about that kind of problems while using this app. You just need to create an ensemble of a couple of its instances - to ensure high availability. It is basically a virtual filesystem, with files, directories and stuff. One could ask why another filesystem? Well this one is a rather special one, especially for distributed systems. The reason why creating all the locking algorithms on top of Zookeeper is easy is its Ephemeral Nodes - which are just files that exist as long as connection for them exists. After it disconnects - such file disappears.

With such paradigms in place it’s fairly easy to create some high level algorithms for synchronization.

Having that in place, it can safely integrate multiple services ensuring loose coupling in a distributed way.

Zookeeper from developer’s POV

With all the base services for Zookeeper started, it seems there is nothing else, than just connect to it and start implementing necessary algorithms. Unfortunately, the API is quite basic and offers files and directories abstractions with the addition of different node type (file types) - ephemeral and sequence. It is also possible to watch a node for changes.

Using bare Zookeeper is hard!

Creating connections is tedious - and there is lots of things to take care of. Handling an established connection is hard - when establishing connection to ensemble, it’s necessary to negotiate a session also. During the whole process a number of exceptions can occur - these are “recoverable” exceptions, that can be gracefully handled and not break the connection.

    class="c8"><span>So, Zookeeper API is hard.</span></p><p class="c1"><span></span></p><p class="c8"><span>Even if one is proficient with that API, then there come recipes. The reason for using Zookeeper is to be able to implement some more sophisticated algorithms on top of it. Unfortunately those aren&rsquo;t trivial and it is again quite hard to implement them without bugs.</span>

And since distributed systems are hard, why would anyone want another difficult to handle tool?

Enter Curator

<p
    class="c8"><span>Happily, guys from Netflix implemented a nice abstraction for dealing with Zookeeper internals. They called it Curator and use it extensively in the company&rsquo;s environment. Curator offers consistent API for Zookeeper&rsquo;s functionality. It even implements a couple of recipes for distributed systems.</span>

File read/write

<p
    class="c8"><span>The basic use of Zookeeper is as a distributed configuration repository. For this scenario I only need read/write capabilities, to be able to write and read files from the Zookeeper filesystem. This code snippet writes a sample json to a file on ZK filesystem.</span>

<a href="#"
                                                                                                  name="0"></a>

EnsurePath ensurePath = new EnsurePath(markerPath);
ensurePath.ensure(client.getZookeeperClient());
String json = “...”;
if (client.checkExists().forPath(statusFile(core)) != null)
     client.setData().forPath(statusFile(core), json.getBytes());
else
     client.create().forPath(statusFile(core), json.getBytes());


Distributed locking

Having multiple systems there may be a need of using an exclusive lock for some resource, or perhaps some big system requires it’s components to synchronize based on locks. This “recipe” is an ideal match for those situations.

ref="#"
                                                                                    name="b0329bbbf14b79ffaba1139881914aea887ef6a3"></a>



lock = new InterProcessSemaphoreMutex(client, lockPath);
lock.acquire(5, TimeUnit.MINUTES);
… do sth …
lock.release();


 (from https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/LockingRemotely.java)

Sevice Advertisement

<p

    class="c8"><span>This is quite an interesting use case. With many small services on different servers it is not wise to exchange ip addresses and ports between them. When some of those services may go down, while other will try to replace them - the task gets even harder. </span>

That’s why, with Zookeeper in place, it can be utilised as a registry of existing services.

If a service starts, it registers into the ServiceRegistry, offering basic information, like it’s purpose, role, address, and port.

Services that want to use a specific kind of service request an access to some instance. This way of configuring easily decouples services from their configuration.

Basically this scenario needs ? steps:

<span>1. Service starts and registers its presence (</span><span class="c5"><a class="c0"
                                                                               href="https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerAdvertiser.java#L44">https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerAdvertiser.java#L44</a></span><span>)</span><span>:</span>



ServiceDiscovery discovery = getDiscovery();
            discovery.start();
            ServiceInstance si = getInstance();
            log.info(si);
            discovery.registerService(si);



2. Another service - on another host or in another JVM on the same machine tries to discover who is implementing the service (https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerFinder.java#L50):

<a href="#"

                                                                                                  name="3"></a>

instances = discovery.queryForInstances(serviceName);

The whole concept here is ridiculously simple - the service advertising its presence just stores a file with its whereabouts. The service that is looking for service providers just look into specific directory and read stored definitions.

In my example, the structure advertised by services looks like this (+ some getters and constructor - the rest is here: https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/model/WorkerMetadata.java):



public final class WorkerMetadata {
    private final UUID workerId;
    private final String listenAddress;
    private final int listenPort;
}


Source code

<p

    class="c8"><span>The above recipes are available in Curator library (</span><span class="c5"><a class="c0"
                                                                                                    href="http://curator.incubator.apache.org/">http://curator.incubator.apache.org/</a></span><span>). Recipes&rsquo;
usage examples are in my github repo at </span><span class="c5"><a class="c0"
                                                                   href="https://github.com/zygm0nt/curator-playground">https://github.com/zygm0nt/curator-playground</a></span>

Conclusion

<p
    class="c8"><span>If you&rsquo;re in need of a reliable platform for exchanging data and managing synchronization, and you need to do it in a distributed fashion - just choose Zookeeper. Then add Curator for the ease of using it. Enjoy!</span>


  1. image comes from: http://www.flickr.com/photos/jfgallery/2993361148
  2. all source code fragments taken from this repo: https://github.com/zygm0nt/curator-playground

An application developed for one of my recent projects at TouK involved multiple servers. There was a requirement to ensure failover for the system’s components. Since I had already a few separate components I didn’t want to add more of that, and since there already was a Zookeeper ensemble running - required by one of the services, I’ve decided to go that way with my solution.

What is Zookeeper?

Just a crude distributed synchronization framework. However, it implements Paxos-style algorithms (http://en.wikipedia.org/wiki/Paxos_(computer_science)) to ensure no split-brain scenarios would occur. This is quite an important feature, since I don’t have to care about that kind of problems while using this app. You just need to create an ensemble of a couple of its instances - to ensure high availability. It is basically a virtual filesystem, with files, directories and stuff. One could ask why another filesystem? Well this one is a rather special one, especially for distributed systems. The reason why creating all the locking algorithms on top of Zookeeper is easy is its Ephemeral Nodes - which are just files that exist as long as connection for them exists. After it disconnects - such file disappears.

With such paradigms in place it’s fairly easy to create some high level algorithms for synchronization.

Having that in place, it can safely integrate multiple services ensuring loose coupling in a distributed way.

Zookeeper from developer’s POV

With all the base services for Zookeeper started, it seems there is nothing else, than just connect to it and start implementing necessary algorithms. Unfortunately, the API is quite basic and offers files and directories abstractions with the addition of different node type (file types) - ephemeral and sequence. It is also possible to watch a node for changes.

Using bare Zookeeper is hard!

Creating connections is tedious - and there is lots of things to take care of. Handling an established connection is hard - when establishing connection to ensemble, it’s necessary to negotiate a session also. During the whole process a number of exceptions can occur - these are “recoverable” exceptions, that can be gracefully handled and not break the connection.

    class="c8"><span>So, Zookeeper API is hard.</span></p><p class="c1"><span></span></p><p class="c8"><span>Even if one is proficient with that API, then there come recipes. The reason for using Zookeeper is to be able to implement some more sophisticated algorithms on top of it. Unfortunately those aren&rsquo;t trivial and it is again quite hard to implement them without bugs.</span>

And since distributed systems are hard, why would anyone want another difficult to handle tool?

Enter Curator

<p
    class="c8"><span>Happily, guys from Netflix implemented a nice abstraction for dealing with Zookeeper internals. They called it Curator and use it extensively in the company&rsquo;s environment. Curator offers consistent API for Zookeeper&rsquo;s functionality. It even implements a couple of recipes for distributed systems.</span>

File read/write

<p
    class="c8"><span>The basic use of Zookeeper is as a distributed configuration repository. For this scenario I only need read/write capabilities, to be able to write and read files from the Zookeeper filesystem. This code snippet writes a sample json to a file on ZK filesystem.</span>

<a href="#"
                                                                                                  name="0"></a>

EnsurePath ensurePath = new EnsurePath(markerPath);
ensurePath.ensure(client.getZookeeperClient());
String json = “...”;
if (client.checkExists().forPath(statusFile(core)) != null)
     client.setData().forPath(statusFile(core), json.getBytes());
else
     client.create().forPath(statusFile(core), json.getBytes());


Distributed locking

Having multiple systems there may be a need of using an exclusive lock for some resource, or perhaps some big system requires it’s components to synchronize based on locks. This “recipe” is an ideal match for those situations.

ref="#"
                                                                                    name="b0329bbbf14b79ffaba1139881914aea887ef6a3"></a>



lock = new InterProcessSemaphoreMutex(client, lockPath);
lock.acquire(5, TimeUnit.MINUTES);
… do sth …
lock.release();


 (from https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/LockingRemotely.java)

Sevice Advertisement

<p

    class="c8"><span>This is quite an interesting use case. With many small services on different servers it is not wise to exchange ip addresses and ports between them. When some of those services may go down, while other will try to replace them - the task gets even harder. </span>

That’s why, with Zookeeper in place, it can be utilised as a registry of existing services.

If a service starts, it registers into the ServiceRegistry, offering basic information, like it’s purpose, role, address, and port.

Services that want to use a specific kind of service request an access to some instance. This way of configuring easily decouples services from their configuration.

Basically this scenario needs ? steps:

<span>1. Service starts and registers its presence (</span><span class="c5"><a class="c0"
                                                                               href="https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerAdvertiser.java#L44">https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerAdvertiser.java#L44</a></span><span>)</span><span>:</span>



ServiceDiscovery discovery = getDiscovery();
            discovery.start();
            ServiceInstance si = getInstance();
            log.info(si);
            discovery.registerService(si);



2. Another service - on another host or in another JVM on the same machine tries to discover who is implementing the service (https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerFinder.java#L50):

<a href="#"

                                                                                                  name="3"></a>

instances = discovery.queryForInstances(serviceName);

The whole concept here is ridiculously simple - the service advertising its presence just stores a file with its whereabouts. The service that is looking for service providers just look into specific directory and read stored definitions.

In my example, the structure advertised by services looks like this (+ some getters and constructor - the rest is here: https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/model/WorkerMetadata.java):



public final class WorkerMetadata {
    private final UUID workerId;
    private final String listenAddress;
    private final int listenPort;
}


Source code

<p

    class="c8"><span>The above recipes are available in Curator library (</span><span class="c5"><a class="c0"
                                                                                                    href="http://curator.incubator.apache.org/">http://curator.incubator.apache.org/</a></span><span>). Recipes&rsquo;
usage examples are in my github repo at </span><span class="c5"><a class="c0"
                                                                   href="https://github.com/zygm0nt/curator-playground">https://github.com/zygm0nt/curator-playground</a></span>

Conclusion

<p
    class="c8"><span>If you&rsquo;re in need of a reliable platform for exchanging data and managing synchronization, and you need to do it in a distributed fashion - just choose Zookeeper. Then add Curator for the ease of using it. Enjoy!</span>


  1. image comes from: http://www.flickr.com/photos/jfgallery/2993361148
  2. all source code fragments taken from this repo: https://github.com/zygm0nt/curator-playground