Confitura 2013 afterthoughts

Confitura, the biggest free-of-charge Java conference in Europe, took place on the 6th of July in Warsaw. TouK’s presence was heavy, with 5 separate talks, all chosen in call for papers, no sponsored bullshit. We were sponsoring deck chairs during the conference, and beer on the after-party, though. Both were quite popular. Oh, and we had Bartek Zdanowski with TouK Lab, doing funny things with lights, Jenkins, and raspberry-pi.

Last year, together with Tomasz Przybysz, I gave a talk about Groovy and Grails. This year I had the pleasure of presenting a revised Test Driven Traps.

Now, this wasn't the same talk I gave in Tallinn. In fact, I rewrote most of it, with three major changes.

While at first I was focused on three goals of TDD (security, design, feedback), I've since noticed that I had been underestimating the value of communication. The funny thing is, I do use TDD in that manner frequently, it just didn't strike me as very important. Which, in hindsight, was a big mistake.

Perhaps the reason for it is how people distinguish between TDD, BDD (Behaviour Driven Development) and ATDD (Acceptance Test-Driven Development). I use Spock most of the time, and Spock mixes all three approaches very well. To be quite honest, I've been mixing all three of them in every testing tool I've ever had. Whether my tests look more like BDD or ATDD (that is, put more emphasis on communication and specification) depends on the nature of the project and the nature of my client. Technology is not a problem here. It just used to be more awkward to write down user stories as tests in languages that were not as flexible as Groovy.

I had a lot of success using tests as a communication tool. Both internally, with developers (including myself when returning to an old code base), and with external clients. But I also had some failures.

A few months ago, while doing a mobile banking project, I had a session with a domain expert. We were trying to get to the bottom of fees and pricing in a specific payment scenario. While we were sitting and talking, I wrote down a simple specification:

static Closure fixed = {BigDecimal value -> new FixedFeeAddedCalculator(value: value)}
static Closure percentage = {BigDecimal value -> new PercentageFeeCalculator(value: value)}

def "should calculate fees"() {
    given:
        biller.feeCalculator = billerFeeType.call(billerFee)
        biller.feesForChannels = [new FeePerChannel(channel: Channel.USSD, feeCalculator: channelFeeType.call(feeForChannel))]
        biller.feesForPrefixes = [new FeePerPrefix(msisdnPrefix: "123", feeCalculator: prefixFeeType.call(feeForPrefix))]
    when:
        BillerFees billerFees = biller.calculateFees(productPrice, "1234", Channel.USSD)
    then:
        billerFees.total + productPrice == totalMoneyPaid
        billerFees.feeForChannel == feeForChannelPaid
        billerFees.feeForPrefix == feeForPrefixPaid
    where:
        productPrice | feeForChannel | channelFeeType | feeForPrefix | prefixFeeType | billerFee | billerFeeType || totalMoneyPaid | feeForChannelPaid | feeForPrefixPaid | feeForBillerPaid
        100          | 7             | fixed          | 3            | fixed         | 10        | fixed         || 120            | 7                 | 3                | 10
        100          | 7             | percentage     | 3            | percentage    | 10        | percentage    || 120            | 7                 | 3                | 10
        123.4        | 10            | percentage     | 10           | percentage    | 10        | percentage    || 160.42         | 12.34             | 12.34            | 12.34
        123.45       | 10            | percentage     | 10           | percentage    | 10        | percentage    || 160.47         | 12.34             | 12.34            | 12.34
        123.45       | 10.05         | percentage     | 10.05        | percentage    | 10.05     | percentage    || 160.68         | 12.41             | 12.41            | 12.41
        100.99       | 0.99          | percentage     | 9.99         | percentage    | 0.09      | percentage    || 112.17         | 1                 | 10.09            | 1
        100.99       | 0             | percentage     | 0            | fixed         | 0.01      | percentage    || 101.00         | 0                 | 0                | 0.01
        10           | 0             | fixed          | 0            | fixed         | 100       | fixed         || 110.00         | 0                 | 0                | 100
}

Those static fields/closures are a little hack, so that I'm able to use something more readable in the 'where' clause than a 'new' with the name of the class. After we confirmed that it looked OK, but that we didn't really know whether it was accurate, I just copied the 'where' part with a little description and sent it to the bank for confirmation. It looked pretty much like what the bank likes to send with their requirements. The expert was happy with it, I was happy with it, and we'd had that approach work for us before.

No word about fees and pricing has been heard from them since. And that’s about six months now.
Which shows that no matter how good your tools are, and no matter how smart your domain experts are, if you have a fundamental communication problem with your client, you are doomed anyway. And working through a proxy with clients from another continent, all I have are communication problems. The technology is simple and easy compared to that.

But I digress. I do encourage you to think of tests as a communication tool. It's a blast when it works. And it does help a lot with keeping your domain knowledge up to date. Something requirement documents fail miserably at.

What I do not know is whether it is better to keep specifications and user stories separate from the rest of your tests or not. I've tried both approaches, but I don't have any conclusion yet.

I also added a few simple thoughts on domain setup to my presentation. Most systems need some domain objects before even unit tests can run. You can create those step by step, calling each constructor in every test, but this soon becomes a DRY violation, and a pain in the ass as the domain gets more complex. Not to mention the refactoring problems. A simple factory helps, but doesn't change that much, so in my experience the best way is to have your aggregate roots built in a single class, preferably using the same builders you use in production code (if you use any). This kind of setup definitely simplifies writing new tests, and is trivial to refactor.
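A minimal sketch of what I mean, reusing the Biller domain from the example above (the default values are made up):

class TestDomain {
    // all aggregate roots used by unit tests are built in this one place,
    // so a domain refactoring touches a single class instead of every test
    static Biller defaultBiller() {
        new Biller(
                feeCalculator: new FixedFeeAddedCalculator(value: 10),
                feesForChannels: [new FeePerChannel(channel: Channel.USSD,
                        feeCalculator: new FixedFeeAddedCalculator(value: 7))],
                feesForPrefixes: [new FeePerPrefix(msisdnPrefix: "123",
                        feeCalculator: new FixedFeeAddedCalculator(value: 3))])
    }
}

A test calls TestDomain.defaultBiller() in its setup and overrides only the fields it actually cares about.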

But that's most useful for unit tests. For integration tests, there are three approaches I'm aware of. The first one is 'go crazy, copy data from production'. Fortunately, this is possible only if you join an existing project that had no tests before, so I'll just skip it, as I am inherently incompatible with such situations. The other two are more practical.

You can have an empty database (you can even create it on the fly), where every test has to provide the data (domain objects) it requires, or you can have a prepopulated database with a precise set of data, representing some sample domain objects, and have that reflected in your test code in the form of static finals. In both approaches, a rollback (for transactional DBs) or a manual clean-up (for everything else) keeps the database in its initial state.
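To give you an idea of the prepopulated variant, here is a sketch of what I mean by static finals mirroring well-known rows (SampleBillers and the repository are hypothetical names, not from the actual project):

import spock.lang.Specification

class SampleBillers {
    // mirrors a row inserted by the database setup script
    static final long EXISTING_BILLER_ID = 1L
    static final String EXISTING_PREFIX = "123"
}

class BillerRepositorySpec extends Specification {
    BillerRepository billerRepository // injected by the integration test context

    def "should find the prepopulated biller"() {
        expect:
            billerRepository.findById(SampleBillers.EXISTING_BILLER_ID) != null
        // the surrounding transaction is rolled back afterwards,
        // so the database stays in its initial state
    }
}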

Everybody seems to prefer an empty database. But a prepopulated database has two great advantages: it usually simplifies test setups, and it may have a big impact on test performance. Just a few weeks ago I switched one of our projects to a prepopulated database, and saved 15 seconds of test time.
15 seconds may not look like much, but our integration tests now run in 1m30s. I've saved another 2 seconds by removing an ugly sleep in one test, and I can save a few more seconds just by not being wasteful.

And that brings me to the next change in my presentation. I've pointed out the importance of performance in TDD. You have 30 seconds for unit tests before you lose the patience of a developer. Make your tests run for more than 3 minutes, and devs will only run them before pushing to the repo. Make them even longer, and nobody's gonna care about tests anymore.

This alone is reason enough why I think GORM (Grails object-relational mapping) is not good enough for bigger projects. Grails uses the Active Record pattern, with a nifty mocking of database access methods via the @Mock annotation for unit tests. But with a complex domain and large aggregate roots, you take a heavy performance hit from those annotations, and you end up with unit tests running at half the speed of integration tests.

And that's pretty bad. I have ~900 unit tests running in about 2.5 minutes, on an i7. And most of that is due to @Mock annotations. I could bring it down to less than 30 seconds if I could mock those bastard database calls the normal way, but with active record I would either have to do meta-programming, or go into mixins and categories. The first option is error-prone, the second very cumbersome. Both are probably inefficient. Comparing the amount of work required to make those unit tests fast with simply mocking a repository interface puts all the other benefits of Grails active record into question. My conclusion is not to use GORM if I'm going for anything bigger than 300 man-days. Or not to use Grails altogether.
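For comparison, this is what mocking "the normal way" looks like in Spock, with a repository interface instead of active record (BillerRepository and BillerService are hypothetical names, not GORM classes):

import spock.lang.Specification

class BillerServiceSpec extends Specification {
    def repository = Mock(BillerRepository)
    def service = new BillerService(billerRepository: repository)

    def "should find a biller by prefix without touching any database"() {
        given:
            repository.findByPrefix("123") >> new Biller()
        when:
            def biller = service.billerForPrefix("123")
        then:
            biller != null
    }
}

No database, no meta-programming, and the whole thing runs in milliseconds.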

Those conclusions are kind of obvious, I suppose. The reason why I don't see them applied too often is that people do not treat tests as normal object-oriented (or functional) code. That at least was my excuse for years and years, till I finally found out that this approach ends up in unmaintainable test code, which keeps your production code from ever being refactored as well.

And that’s pretty much all of my presentation changes.

Confitura 2013 was a fantastic experience. I am usually scared shitless when giving a talk at a conference. Somehow, maybe because I know a lot of people in the audience personally, I'm less scared here. During the first half of this year, I gave 6 talks, led two workshops, helped organize the Warsaw Java User Group, and wrote a book. Confitura was a great finish for that marathon. Now I need a summer break.

You May Also Like

33rd Degree day 1 review

33rd Degree is over. After the one last year, my expectations were very high, but Grzegorz Duda once again proved he's more than able to deliver. With up to five tracks (most of the time: four presentations + one workshop) and ~650 attendees, there was a lot to see and a lot to do, so everyone will probably have a slightly different story to tell. Here is mine.

Twitter: From Ruby on Rails to the JVM

Raffi Krikorian talking about Twitter and JVM
The conference started with Raffi Krikorian from Twitter, talking about their use of the JVM. Twitter was built with Ruby, but as performance demands grew, a lot of the backend was moved to Scala, Java and Clojure. Raffi noted that for Ruby programmers Scala was easier to grasp than Java, more natural, which is quite interesting considering how many PHP guys move to Ruby these days for the same reasons. Perhaps the learning path Jacek Laskowski once described (Java -> Groovy -> Scala/Clojure) may be on par with PHP -> Ruby -> Scala. It definitely feels like Scala is the holy grail of languages these days.

Raffi also noted that while the JVM delivered speed and a concurrency model to the Twitter stack, it wasn't enough, and they've built/customized their own Garbage Collector. My guess is that Scala/Clojure could also have been chosen for their nice concurrency solutions (STM, immutability and so on).

Raffi pointed out that with the scale of Twitter, you easily get 3 million hits per second, and that means you probably have 3 edge cases every second. I'd love to hear the lessons they've learned from this.

 

Complexity of Complexity


The second keynote of the first day was Ken Sipe talking about complexity. He made a good point that there is a difference between complex and complicated, and that we often recognize things as complex only because we are less familiar with them. This gets more interesting the moment you realize the shift in the last 20 years of computer languages, from the "Less is more" paradigm (think Java, ASM) to "More is better" (Groovy/Scala/Clojure), where you have a more complex language, with a more powerful and less verbose syntax, that is actually not more complicated, it just looks less familiar.

So while 10 years ago I really liked Java as a general purpose language for its small set of rules that could get you everywhere, it turned out that to do most of the real world stuff, a lot of code had to be written. The situation got better thanks to libraries/frameworks and so on, but that's just patching. New languages have a lot of stuff built in, which makes their set of rules and syntax much more complex, but once you get familiar with them, real world usage is simpler, faster and better, with fewer traps lying around, waiting for you to fall.

Ken also pointed out that while an Enterprise Service Bus looks really simple on diagrams, it's usually very difficult and complicated to use from the perspective of the programmer. And that's probably why it gets chosen so often - the guys selling/buying it look no deeper than the diagram.

 

Pointy haired bosses and pragmatic programmers: Facts and Fallacies of Software Development

Venkat Subramaniam with Dima
Dima got lucky. Or maybe not.

Venkat Subramaniam is the kind of speaker who talks about very simple things in a way that makes everyone either laugh or reflect. Yes, he is a showman, but hey, that's actually good, because even if you know the subject quite well, his talks are still very entertaining.
This talk was very generic (here's my thesis: the longer the title, the more generic the talk will be), interesting and fun, but at the end I was unable to see anything new I'd learned, apart from the distinction between dynamic vs static and strong vs weak typing, which I'd seen last year but managed to forget. This may be a very interesting argument for all those who are afraid of Groovy/Ruby after a bad experience with PHP or Perl.

Build Trust in Your Build to Deployment Flow!


Frederic Simon talked about DevOps and deployment, and that was a miss in my schedule, for two reasons. First, the talk was aimed at DevOps specifically, and while the subject is trendy lately, without big-scale problems, deployment is a process I usually set up and forget about. It just works, mostly because I only have to deal with one (current) project at a time.
Not much love for Dart.
Second, while Frederic has a fabulous accent and a nice, loud voice, he tends to start each sentence loud and fade out at the end. This, together with the mics failing him badly, made half of the presentation hard to grasp unless you were sitting in the first row.
I'm not saying the presentation was bad, far from it; it just clearly wasn't for me.
I left a few minutes before the end, to see how many people came to the Dart presentation by Mike West. I was kind of interested, since I follow the Warsaw Google Technology User Group and heard a few voices about why I should pay attention to that new Google language. As you can see from the picture on the right, the majority tends to disagree with that opinion.

 

Non blocking, composable reactive web programming with Iteratees

Sadek Drobi's talk about Iteratees in Play 2.0 was very refreshing. Perhaps because I've never used Play before, but the presentation felt flawless, with well explained problems, concepts and solutions.
Sadek started with a reflection on how much CPU we waste waiting for IO in web development, then moved to Play's Iteratees, to explain the concept and implementation, which, while very different from the overused Request/Servlet model, looked really nice and simple. I'm not sure though how much of the problem remains when you have a simple service serving static content in front of your app server. Think apache (and faster) in front of tomcat. That won't fix the upload/download issue though, which is beautifully solved in Play 2.0.

The Future of the Java Platform: Java SE 8 & Beyond


Simon Ritter is an intriguing fellow. If you take a glance at his work history (AT&T UNIX System Labs -> Novell -> Sun -> Oracle), you can easily see he's a heavyweight player.
His presentation was rich in content, no corpo-bullshit. He started with a bit of history of the JCP and what it looks like right now, then moved to the most interesting stuff, the changes. Now I could give you a summary here, but there is really no point: you'd be much better off taking a look at the slides. There are only 48 of them, and everything is self-explanatory.
While I'm very disappointed with the speed of changes, especially when compared to the C# world, I'm glad about the direction and the fact that they finally want to BREAK compatibility with the broken stuff (generics, etc.). Having moved to other languages, I guess I won't be the one to scream "My god, finally!" somewhere in 2017, though. All the changes together look very promising, it's just that I'd like to have them like... now? Next year max, not near the heat death of the universe.

Simon also revealed one of the great mysteries of Java to me:
The original idea behind JNI was to make it hard to write, to discourage people from using it.
On a side note, did you know the Tegra 3 actually has 5 cores? You use 4 of them, and then switch to the fifth one when your battery gets low.

BOF: Spring and CloudFoundry


With most of my folks having moved to see "Typesafe stack 2.0", fabulously organized by Rafał Wasilewski and Wojtek Erbetowski (with both of whom I had the pleasure of travelling to the conference), and knowing it would be recorded, I decided to see what Josh Long had to say about CloudFoundry, a subject I find very intriguing after the de facto fiasco of Google App Engine.

The audience was small but vibrant, mostly users of Amazon EC2, and while it turned out that Josh didn't have much to show yet, with pricing and details not yet public, the fact that SpringSource has already created their own competition (Cloud Foundry is both an open source app and a service) takes away a lot of my anxiety.

For the review of the second day of the conference, go here.

Zookeeper + Curator = Distributed sync

An application developed for one of my recent projects at TouK involved multiple servers. There was a requirement to ensure failover for the system's components. Since I already had a few separate components, I didn't want to add more of them, and since there was already a Zookeeper ensemble running - required by one of the services - I decided to go that way with my solution.

What is Zookeeper?

Just a crude distributed synchronization framework. However, it implements Paxos-style algorithms (http://en.wikipedia.org/wiki/Paxos_(computer_science)) to ensure no split-brain scenarios occur. This is quite an important feature, since I don't have to care about that kind of problem while using this app. You just need to create an ensemble of a couple of its instances, to ensure high availability. It is basically a virtual filesystem, with files, directories and stuff. One could ask: why another filesystem? Well, this one is rather special, especially for distributed systems. The reason why creating all the locking algorithms on top of Zookeeper is easy is its ephemeral nodes - files that exist only as long as the connection that created them exists. After it disconnects, such a file disappears.

With such paradigms in place it’s fairly easy to create some high level algorithms for synchronization.

With that in place, you can safely integrate multiple services, ensuring loose coupling in a distributed way.

Zookeeper from developer’s POV

With all the base services for Zookeeper started, it seems there is nothing left to do but connect to it and start implementing the necessary algorithms. Unfortunately, the API is quite basic, offering file and directory abstractions with the addition of different node types (file types) - ephemeral and sequential. It is also possible to watch a node for changes.
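For illustration, this is roughly what the bare API looks like (the connection string and paths are made up):

import org.apache.zookeeper.*;

// connecting also negotiates a session; the watcher receives connection events
ZooKeeper zk = new ZooKeeper("localhost:2181", 10000, null);

// an ephemeral node - it disappears as soon as this session ends
zk.create("/workers/worker-1", "payload".getBytes(),
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

// a one-shot watch - it fires once and has to be re-registered
zk.exists("/config", new Watcher() {
    public void process(WatchedEvent event) {
        System.out.println("node changed: " + event);
    }
});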

Using bare Zookeeper is hard!

Creating connections is tedious, and there are lots of things to take care of. Handling an established connection is hard - when establishing a connection to the ensemble, it's also necessary to negotiate a session. During the whole process a number of exceptions can occur - these are "recoverable" exceptions that can be handled gracefully without breaking the connection.

    class="c8"><span>So, Zookeeper API is hard.</span></p><p class="c1"><span></span></p><p class="c8"><span>Even if one is proficient with that API, then there come recipes. The reason for using Zookeeper is to be able to implement some more sophisticated algorithms on top of it. Unfortunately those aren&rsquo;t trivial and it is again quite hard to implement them without bugs.</span>

And since distributed systems are hard, why would anyone want another difficult-to-handle tool?

Enter Curator

Happily, the guys from Netflix implemented a nice abstraction for dealing with Zookeeper internals. They called it Curator and use it extensively in the company's environment. Curator offers a consistent API for Zookeeper's functionality. It even implements a couple of recipes for distributed systems.

File read/write

The basic use of Zookeeper is as a distributed configuration repository. For this scenario I only need read/write capabilities, to be able to write and read files from the Zookeeper filesystem. This code snippet writes a sample JSON string to a file on the ZK filesystem:

<a href="#"
                                                                                                  name="0"></a>

// make sure the parent path exists before writing
EnsurePath ensurePath = new EnsurePath(markerPath);
ensurePath.ensure(client.getZookeeperClient());

String json = "...";
// update the node if it already exists, create it otherwise
if (client.checkExists().forPath(statusFile(core)) != null)
     client.setData().forPath(statusFile(core), json.getBytes());
else
     client.create().forPath(statusFile(core), json.getBytes());


Distributed locking

With multiple systems there may be a need for an exclusive lock on some resource, or perhaps some big system requires its components to synchronize based on locks. This "recipe" is an ideal match for those situations.

ref="#"
                                                                                    name="b0329bbbf14b79ffaba1139881914aea887ef6a3"></a>



// acquire(timeout) returns false if the lock could not be obtained in time
InterProcessSemaphoreMutex lock = new InterProcessSemaphoreMutex(client, lockPath);
if (lock.acquire(5, TimeUnit.MINUTES)) {
    // … do sth …
    lock.release();
}


 (from https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/LockingRemotely.java)

Service Advertisement

This is quite an interesting use case. With many small services on different servers, it is not wise to exchange IP addresses and ports between them. When some of those services may go down, while others will try to replace them, the task gets even harder.

That's why Zookeeper, once in place, can be utilised as a registry of existing services.

When a service starts, it registers itself in the ServiceRegistry, offering basic information like its purpose, role, address, and port.

Services that want to use a specific kind of service request access to some instance. This way of configuring easily decouples services from their configuration.

Basically this scenario needs two steps:

1. The service starts and registers its presence (https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerAdvertiser.java#L44):



ServiceDiscovery<WorkerMetadata> discovery = getDiscovery();
discovery.start();
ServiceInstance<WorkerMetadata> si = getInstance();
log.info(si);
discovery.registerService(si);



2. Another service - on another host, or in another JVM on the same machine - tries to discover who is implementing the service (https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerFinder.java#L50):

<a href="#"

                                                                                                  name="3"></a>

Collection<ServiceInstance<WorkerMetadata>> instances = discovery.queryForInstances(serviceName);

The whole concept here is ridiculously simple - the service advertising its presence just stores a file with its whereabouts. The service that is looking for providers just looks into a specific directory and reads the stored definitions.
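Under the hood it is just a small subtree of znodes - roughly something like this, with the base path and ids made up, and the serialized metadata stored as the node's payload:

/services
    /worker
        /1a2b3c4d-...   (ephemeral node with JSON-serialized WorkerMetadata)

Since the instance nodes are ephemeral, a service that crashes disappears from the registry automatically.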

In my example, the structure advertised by services looks like this (plus some getters and a constructor - the rest is here: https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/model/WorkerMetadata.java):



public final class WorkerMetadata {
    private final UUID workerId;
    private final String listenAddress;
    private final int listenPort;
}


Source code

The above recipes are available in the Curator library (http://curator.incubator.apache.org/). Recipes' usage examples are in my github repo at https://github.com/zygm0nt/curator-playground

Conclusion

If you're in need of a reliable platform for exchanging data and managing synchronization, and you need to do it in a distributed fashion, just choose Zookeeper. Then add Curator for the ease of using it. Enjoy!


  1. image comes from: http://www.flickr.com/photos/jfgallery/2993361148
  2. all source code fragments taken from this repo: https://github.com/zygm0nt/curator-playground
