Log4j and MDC in Grails

Log4j provides very useful feature: MDC – mapped diagnostic context. It can be used to store data in context of current thread. It may sound scary a bit but idea is simple. My post is based on post http://burtbeckwith.com/blog/?p=521 from Burt Beckwith’s excellent blog, it’s definitely worth checking if you are interested in Grails. Short background story… Suppose we want to do logging our brand new shopping system and we want to have in each log customer’s shopping basket number. And our system can be used at once by many users who can perform many transactions, actions like adding items and so on. How can we achieve that? Of course we can add basket number in every place where we do some logging but this task would be boring and error-prone.  Instead of this we can use MDC to store variable with basket number in map.  In fact MDC can be treated as map of custom values for current thread that can be used by logger.  How to do that with Grails? Using MDC with Grails is quite simple. All we need to do is to create our own custom filter which works for given urls and puts our data in MDC. Filters in Grails are classes in directory grails-app/conf/* which names end with *Filters.groovy postfix. We can create this class manually or use Grails command:  grails create-filters info.rnowak.App.Basket In result class named BasketFilters will be created in grails-app/conf/info/rnowak/UberApp. Initially filter class looks a little bit empty: class BasketFilters { def filters = { all(controller:'*', action:'*') { before = { } after = { Map model -> } afterView = { Exception e -> } } }} All we need to do is fill empty closures, modify filter properties and put some data into MDC. all is the general name of our filter, as class BasketFilters (plural!) can contain many various filters. You can name it whatever you want, for this post let assume it will be named basketFilter. Another thing is change of filter parameters. According to official documentation (link) we can customize our filter in many ways. You can specify controller to be filtered, its actions, filtered urls and so on. In our example you can stay with default option where filter is applied to every action of every controller. If you are interested in filtering only some urls, use uri parameter with expression describing desired urls to be filtered.Three closures that are already defined in template have their function and they are started in these conditions: before – as name says, it is executed before filtered action takes place after – similarly, it is called after the action afterView – called after rendering of the actions view Ok, so now we know what are these mysterious methods and when they are called. But what can be done within them? In official Grails docs (link again) under section 7.6.3 there is a list of properties that are available to use in filter. With that knowledge, we can proceed to implementing filter. Putting something into MDC in filter What we want to do is quite easy: we want to retrieve basket number from parameters and put it into MDC in our filter: class BasketFilters { def filters = { basketFilter(controller:'*', action:'*') { before = { MDC.put("basketNumber", params.basketNumber ?: "") } after = { Map model -> MDC.remove("basketNumber") } } }} We retrieve basket number from Grails params map and then we put in map under specified key (“basketNumber” in this case), which will be later used in logger conversion pattern. It is important to remove custom value after processing of action to avoid leaks. So we are putting something into MDC. But how make use of it in logs? We can refer to custom data in MDC in conversion patter using syntax: %X{key}, where key is our key we used in filter to put data, like: def conversionPattern = "%d{yyyy-MM-dd HH:mm:ss} %-5p %t [%c{1}] %X{basketNumber} - %m%n" And that’s it :) We’ve put custom data in log4j MDC and successfully used it in logs to display interesting values.
Log4j provides very useful feature: MDC – mapped diagnostic context. It can be used to store data in context of current thread. It may sound scary a bit but idea is simple. My post is based on post http://burtbeckwith.com/blog/?p=521 from Burt Beckwith’s excellent blog, it’s definitely worth checking if you are interested in Grails.

Short background story…

Suppose we want to do logging our brand new shopping system and we want to have in each log customer’s shopping basket number. And our system can be used at once by many users who can perform many transactions, actions like adding items and so on. How can we achieve that? Of course we can add basket number in every place where we do some logging but this task would be boring and error-prone.
Instead of this we can use MDC to store variable with basket number in map.
In fact MDC can be treated as map of custom values for current thread that can be used by logger.

How to do that with Grails?

Using MDC with Grails is quite simple. All we need to do is to create our own custom filter which works for given urls and puts our data in MDC.
Filters in Grails are classes in directory grails-app/conf/* which names end with *Filters.groovy postfix. We can create this class manually or use Grails command:
grails create-filters info.rnowak.App.Basket
   
In result class named BasketFilters will be created in grails-app/conf/info/rnowak/UberApp. Initially filter class looks a little bit empty:

class BasketFilters {
    def filters = {
        all(controller:'*', action:'*') {
            before = {

            }
            after = { Map model ->

            }
            afterView = { Exception e ->

            }
        }
    }
}

All we need to do is fill empty closures, modify filter properties and put some data into MDC.

all is the general name of our filter, as class BasketFilters (plural!) can contain many various filters. You can name it whatever you want, for this post let assume it will be named basketFilter Another thing is change of filter parameters. According to official documentation (link) we can customize our filter in many ways. You can specify controller to be filtered, its actions, filtered urls and so on. In our example you can stay with default option where filter is applied to every action of every controller. If you are interested in filtering only some urls, use uri parameter with expression describing desired urls to be filtered. Three closures that are already defined in template have their function and they are started in these conditions:

  • before – as name says, it is executed before filtered action takes place
  • after – similarly, it is called after the action
  • afterView – called after rendering of the actions view
Ok, so now we know what are these mysterious methods and when they are called. But what can be done within them? In official Grails docs (link again) under section 7.6.3 there is a list of properties that are available to use in filter.
With that knowledge, we can proceed to implementing filter.

Putting something into MDC in filter

What we want to do is quite easy: we want to retrieve basket number from parameters and put it into MDC in our filter:
class BasketFilters {
    def filters = {
        basketFilter(controller:'*', action:'*') {
            before = {
                MDC.put("basketNumber", params.basketNumber ?: "")
            }
            after = { Map model ->
                MDC.remove("basketNumber")
            }
        }
    }
}
We retrieve basket number from Grails params map and then we put in map under specified key (“basketNumber” in this case), which will be later used in logger conversion pattern. It is important to remove custom value after processing of action to avoid leaks.

So we are putting something into MDC. But how make use of it in logs?

We can refer to custom data in MDC in conversion patter using syntax: %X{key}, where key is our key we used in filter to put data, like:
def conversionPattern = "%d{yyyy-MM-dd HH:mm:ss} %-5p %t [%c{1}] %X{basketNumber} - %m%n"

And that’s it :) We’ve put custom data in log4j MDC and successfully used it in logs to display interesting values.

You May Also Like

Zookeeper + Curator = Distributed sync

An application developed for one of my recent projects at TouK involved multiple servers. There was a requirement to ensure failover for the system’s components. Since I had already a few separate components I didn’t want to add more of that, and since there already was a Zookeeper ensemble running - required by one of the services, I’ve decided to go that way with my solution.

What is Zookeeper?

Just a crude distributed synchronization framework. However, it implements Paxos-style algorithms (http://en.wikipedia.org/wiki/Paxos_(computer_science)) to ensure no split-brain scenarios would occur. This is quite an important feature, since I don’t have to care about that kind of problems while using this app. You just need to create an ensemble of a couple of its instances - to ensure high availability. It is basically a virtual filesystem, with files, directories and stuff. One could ask why another filesystem? Well this one is a rather special one, especially for distributed systems. The reason why creating all the locking algorithms on top of Zookeeper is easy is its Ephemeral Nodes - which are just files that exist as long as connection for them exists. After it disconnects - such file disappears.

With such paradigms in place it’s fairly easy to create some high level algorithms for synchronization.

Having that in place, it can safely integrate multiple services ensuring loose coupling in a distributed way.

Zookeeper from developer’s POV

With all the base services for Zookeeper started, it seems there is nothing else, than just connect to it and start implementing necessary algorithms. Unfortunately, the API is quite basic and offers files and directories abstractions with the addition of different node type (file types) - ephemeral and sequence. It is also possible to watch a node for changes.

Using bare Zookeeper is hard!

Creating connections is tedious - and there is lots of things to take care of. Handling an established connection is hard - when establishing connection to ensemble, it’s necessary to negotiate a session also. During the whole process a number of exceptions can occur - these are “recoverable” exceptions, that can be gracefully handled and not break the connection.

    class="c8"><span>So, Zookeeper API is hard.</span></p><p class="c1"><span></span></p><p class="c8"><span>Even if one is proficient with that API, then there come recipes. The reason for using Zookeeper is to be able to implement some more sophisticated algorithms on top of it. Unfortunately those aren&rsquo;t trivial and it is again quite hard to implement them without bugs.</span>

And since distributed systems are hard, why would anyone want another difficult to handle tool?

Enter Curator

<p
    class="c8"><span>Happily, guys from Netflix implemented a nice abstraction for dealing with Zookeeper internals. They called it Curator and use it extensively in the company&rsquo;s environment. Curator offers consistent API for Zookeeper&rsquo;s functionality. It even implements a couple of recipes for distributed systems.</span>

File read/write

<p
    class="c8"><span>The basic use of Zookeeper is as a distributed configuration repository. For this scenario I only need read/write capabilities, to be able to write and read files from the Zookeeper filesystem. This code snippet writes a sample json to a file on ZK filesystem.</span>

<a href="#"
                                                                                                  name="0"></a>

EnsurePath ensurePath = new EnsurePath(markerPath);
ensurePath.ensure(client.getZookeeperClient());
String json = “...”;
if (client.checkExists().forPath(statusFile(core)) != null)
     client.setData().forPath(statusFile(core), json.getBytes());
else
     client.create().forPath(statusFile(core), json.getBytes());


Distributed locking

Having multiple systems there may be a need of using an exclusive lock for some resource, or perhaps some big system requires it’s components to synchronize based on locks. This “recipe” is an ideal match for those situations.

ref="#"
                                                                                    name="b0329bbbf14b79ffaba1139881914aea887ef6a3"></a>



lock = new InterProcessSemaphoreMutex(client, lockPath);
lock.acquire(5, TimeUnit.MINUTES);
… do sth …
lock.release();


 (from https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/LockingRemotely.java)

Sevice Advertisement

<p

    class="c8"><span>This is quite an interesting use case. With many small services on different servers it is not wise to exchange ip addresses and ports between them. When some of those services may go down, while other will try to replace them - the task gets even harder. </span>

That’s why, with Zookeeper in place, it can be utilised as a registry of existing services.

If a service starts, it registers into the ServiceRegistry, offering basic information, like it’s purpose, role, address, and port.

Services that want to use a specific kind of service request an access to some instance. This way of configuring easily decouples services from their configuration.

Basically this scenario needs ? steps:

<span>1. Service starts and registers its presence (</span><span class="c5"><a class="c0"
                                                                               href="https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerAdvertiser.java#L44">https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerAdvertiser.java#L44</a></span><span>)</span><span>:</span>



ServiceDiscovery discovery = getDiscovery();
            discovery.start();
            ServiceInstance si = getInstance();
            log.info(si);
            discovery.registerService(si);



2. Another service - on another host or in another JVM on the same machine tries to discover who is implementing the service (https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerFinder.java#L50):

<a href="#"

                                                                                                  name="3"></a>

instances = discovery.queryForInstances(serviceName);

The whole concept here is ridiculously simple - the service advertising its presence just stores a file with its whereabouts. The service that is looking for service providers just look into specific directory and read stored definitions.

In my example, the structure advertised by services looks like this (+ some getters and constructor - the rest is here: https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/model/WorkerMetadata.java):



public final class WorkerMetadata {
    private final UUID workerId;
    private final String listenAddress;
    private final int listenPort;
}


Source code

<p

    class="c8"><span>The above recipes are available in Curator library (</span><span class="c5"><a class="c0"
                                                                                                    href="http://curator.incubator.apache.org/">http://curator.incubator.apache.org/</a></span><span>). Recipes&rsquo;
usage examples are in my github repo at </span><span class="c5"><a class="c0"
                                                                   href="https://github.com/zygm0nt/curator-playground">https://github.com/zygm0nt/curator-playground</a></span>

Conclusion

<p
    class="c8"><span>If you&rsquo;re in need of a reliable platform for exchanging data and managing synchronization, and you need to do it in a distributed fashion - just choose Zookeeper. Then add Curator for the ease of using it. Enjoy!</span>


  1. image comes from: http://www.flickr.com/photos/jfgallery/2993361148
  2. all source code fragments taken from this repo: https://github.com/zygm0nt/curator-playground

An application developed for one of my recent projects at TouK involved multiple servers. There was a requirement to ensure failover for the system’s components. Since I had already a few separate components I didn’t want to add more of that, and since there already was a Zookeeper ensemble running - required by one of the services, I’ve decided to go that way with my solution.

What is Zookeeper?

Just a crude distributed synchronization framework. However, it implements Paxos-style algorithms (http://en.wikipedia.org/wiki/Paxos_(computer_science)) to ensure no split-brain scenarios would occur. This is quite an important feature, since I don’t have to care about that kind of problems while using this app. You just need to create an ensemble of a couple of its instances - to ensure high availability. It is basically a virtual filesystem, with files, directories and stuff. One could ask why another filesystem? Well this one is a rather special one, especially for distributed systems. The reason why creating all the locking algorithms on top of Zookeeper is easy is its Ephemeral Nodes - which are just files that exist as long as connection for them exists. After it disconnects - such file disappears.

With such paradigms in place it’s fairly easy to create some high level algorithms for synchronization.

Having that in place, it can safely integrate multiple services ensuring loose coupling in a distributed way.

Zookeeper from developer’s POV

With all the base services for Zookeeper started, it seems there is nothing else, than just connect to it and start implementing necessary algorithms. Unfortunately, the API is quite basic and offers files and directories abstractions with the addition of different node type (file types) - ephemeral and sequence. It is also possible to watch a node for changes.

Using bare Zookeeper is hard!

Creating connections is tedious - and there is lots of things to take care of. Handling an established connection is hard - when establishing connection to ensemble, it’s necessary to negotiate a session also. During the whole process a number of exceptions can occur - these are “recoverable” exceptions, that can be gracefully handled and not break the connection.

    class="c8"><span>So, Zookeeper API is hard.</span></p><p class="c1"><span></span></p><p class="c8"><span>Even if one is proficient with that API, then there come recipes. The reason for using Zookeeper is to be able to implement some more sophisticated algorithms on top of it. Unfortunately those aren&rsquo;t trivial and it is again quite hard to implement them without bugs.</span>

And since distributed systems are hard, why would anyone want another difficult to handle tool?

Enter Curator

<p
    class="c8"><span>Happily, guys from Netflix implemented a nice abstraction for dealing with Zookeeper internals. They called it Curator and use it extensively in the company&rsquo;s environment. Curator offers consistent API for Zookeeper&rsquo;s functionality. It even implements a couple of recipes for distributed systems.</span>

File read/write

<p
    class="c8"><span>The basic use of Zookeeper is as a distributed configuration repository. For this scenario I only need read/write capabilities, to be able to write and read files from the Zookeeper filesystem. This code snippet writes a sample json to a file on ZK filesystem.</span>

<a href="#"
                                                                                                  name="0"></a>

EnsurePath ensurePath = new EnsurePath(markerPath);
ensurePath.ensure(client.getZookeeperClient());
String json = “...”;
if (client.checkExists().forPath(statusFile(core)) != null)
     client.setData().forPath(statusFile(core), json.getBytes());
else
     client.create().forPath(statusFile(core), json.getBytes());


Distributed locking

Having multiple systems there may be a need of using an exclusive lock for some resource, or perhaps some big system requires it’s components to synchronize based on locks. This “recipe” is an ideal match for those situations.

ref="#"
                                                                                    name="b0329bbbf14b79ffaba1139881914aea887ef6a3"></a>



lock = new InterProcessSemaphoreMutex(client, lockPath);
lock.acquire(5, TimeUnit.MINUTES);
… do sth …
lock.release();


 (from https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/LockingRemotely.java)

Sevice Advertisement

<p

    class="c8"><span>This is quite an interesting use case. With many small services on different servers it is not wise to exchange ip addresses and ports between them. When some of those services may go down, while other will try to replace them - the task gets even harder. </span>

That’s why, with Zookeeper in place, it can be utilised as a registry of existing services.

If a service starts, it registers into the ServiceRegistry, offering basic information, like it’s purpose, role, address, and port.

Services that want to use a specific kind of service request an access to some instance. This way of configuring easily decouples services from their configuration.

Basically this scenario needs ? steps:

<span>1. Service starts and registers its presence (</span><span class="c5"><a class="c0"
                                                                               href="https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerAdvertiser.java#L44">https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerAdvertiser.java#L44</a></span><span>)</span><span>:</span>



ServiceDiscovery discovery = getDiscovery();
            discovery.start();
            ServiceInstance si = getInstance();
            log.info(si);
            discovery.registerService(si);



2. Another service - on another host or in another JVM on the same machine tries to discover who is implementing the service (https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerFinder.java#L50):

<a href="#"

                                                                                                  name="3"></a>

instances = discovery.queryForInstances(serviceName);

The whole concept here is ridiculously simple - the service advertising its presence just stores a file with its whereabouts. The service that is looking for service providers just look into specific directory and read stored definitions.

In my example, the structure advertised by services looks like this (+ some getters and constructor - the rest is here: https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/model/WorkerMetadata.java):



public final class WorkerMetadata {
    private final UUID workerId;
    private final String listenAddress;
    private final int listenPort;
}


Source code

<p

    class="c8"><span>The above recipes are available in Curator library (</span><span class="c5"><a class="c0"
                                                                                                    href="http://curator.incubator.apache.org/">http://curator.incubator.apache.org/</a></span><span>). Recipes&rsquo;
usage examples are in my github repo at </span><span class="c5"><a class="c0"
                                                                   href="https://github.com/zygm0nt/curator-playground">https://github.com/zygm0nt/curator-playground</a></span>

Conclusion

<p
    class="c8"><span>If you&rsquo;re in need of a reliable platform for exchanging data and managing synchronization, and you need to do it in a distributed fashion - just choose Zookeeper. Then add Curator for the ease of using it. Enjoy!</span>


  1. image comes from: http://www.flickr.com/photos/jfgallery/2993361148
  2. all source code fragments taken from this repo: https://github.com/zygm0nt/curator-playground