Test Driven Traps, part 1

Have you ever been in a situation where a simple change of code broke a few hundred tests? Have you ever had the feeling that tests slow you down, inhibit your creativity, make you afraid to change the code? If you have, it means you’ve entered the Dungeon-of-very-bad-tests, the world of things that should not be.

I’ve been there. I’ve built one myself. And it collapsed, killing me in the process. I’ve learned my lesson. So here is the story of a dead man. Learn from my mistakes or be doomed to repeat them.

The story

Test Driven Development, like all good games in the world, is simple to learn, hard to master. I started in 2005, when a brilliant guy named Piotr Szarwas gave me the book “Test Driven Development: By Example” (Kent Beck), and one task: creating a framework.

These were the old times, when the technology we were using had no frameworks at all, and we wanted a cool one, like Spring, with Inversion-of-Control, Object-Relational Mapping, Model-View-Controller and all the good things we knew about. And so we created a framework. Then we built a Content Management System on top of it. Then we created a bunch of dedicated applications for different clients, Internet shops and what-not, on top of those two. We were doing good. We had 3000+ tests for the framework, 3000+ tests for the CMS, and another few thousand for every dedicated application. We were looking at our work, and we were happy, safe, secure. These were good times.

And then, as our code base grew, we came to the point where the simple anemic model we had was not good enough anymore. I had not yet read the other important book of that time, “Domain Driven Design”, you see. I didn’t know that you can only get so far with an anemic model.

But we were safe. We had tons of tests. We could change anything.

Or so I thought.

I spent a week trying to introduce some changes in the architecture. Simple things, really: moving methods around, switching collaborators, things like that. Only to be overwhelmed by the number of tests I had to fix. That was TDD: I started my change by writing a test, and when I was finally done with the code under the test, I’d find another few hundred tests completely broken by my change. And when I got them fixed, introducing some more changes in the process, I’d find another few thousand broken. It was a butterfly effect, a chain reaction caused by a very small change.

It took me a week to figure out that I was not even half done. The refactoring had no visible end. And at no point was my code base stable, deployment-ready. I had my branch in the repository, one I renamed “Lasciate ogne speranza, voi ch’intrate”.

We had tons and tons of tests. Of very bad tests. Tests that would pour concrete over our code, so that we could do nothing.

The only real options were: either leave it be, or delete all the tests and write everything from scratch again. I didn’t want to work with the code if we went for the first option, and the management would not find a financial rationale for the second. So I quit.

That was the Dungeon I built, only to find myself defeated by its monsters.

I went back to the book, and found everything I did wrong in there. Outlined. Marked out. How could I have skipped that? How could I not have noticed? Turns out, sometimes you need to be of a certain age and experience to truly understand the things you learn.

Even the best of tools, when used poorly, can turn against you. And the easier the tool, the easier it seems to use it, the easier it is to fall into the trap of I-know-how-it-works thinking. And then BAM! You’re gone.

The truth

Test Driven Development and tests are two completely different things. Tests are only a byproduct of TDD, nothing more. What is the point of TDD? What does TDD bring? Why do we do TDD?

For three, and only three, reasons.

1. To find the best design, by putting ourselves into the user’s shoes.

By starting with “how do I want to use it” thinking, we discover the most useful and friendly design. Always a good one, and quite often the best design out there. Otherwise, what we get is this:

And you don’t want that.

2. To manage our fear.

It takes balls to make a sweeping change in a large code-base without tests, and to say “it’s done” without introducing bugs in the process, doesn’t it? Well, the truth is, if you say “it’s done”, most of the time you are either ignorant, reckless, or just plain stupid. It’s like with concurrency: everybody knows it, nobody can do it well.

Smart people are scared of such changes. Unless they have good tests, with high code coverage.

TDD allows us to manage our fears by giving us proof that things work as they should. TDD gives us safety.

3. To have fast feedback.

How long can you code, without running the app? How long can you code without knowing whether your code works as you think it should?

Feedback in tests is important. Less so for frontend programming, where you can just run the shit up and see for yourself. More so for coding in the backend. Even more so if your technology stack requires compilation, deployment, and starting up.

Time is money, and I’d rather earn it, than wait for the deployment and click through my changes each time I make them.

And that’s it. There are no more reasons for TDD whatsoever. We want Good Design, Safety, and Feedback. Good tests are those, which give us that.

Bad tests? All the other tests are bad.

The bad practice

So what does a typical bad test look like? The one I see over and over, in close to every project, created by somebody who has yet to learn how NOT to build an ugly dungeon, how NOT to pour concrete over a code base. The one I would have written myself in 2005.

This will be a Spock sample, written in Groovy, testing a Grails controller. But don’t worry if you don’t know those technologies. I bet you’ll understand what’s going on there without problems. Yes, it’s that simple. I’ll explain all the not-so-obvious parts.

def "should show outlet"() {
  given:
    def outlet = OutletFactory.createAndSaveOutlet(merchant: merchant)
    injectParamsToController(id: outlet.id)
  when:
    controller.show()
  then:
    response.redirectedUrl == null
}

So we have a controller. It’s an outlet controller. And we have a test. What’s wrong with this test?

The name of the test is “should show outlet”. What should a test with such a name check? Whether we show the outlet, right? And what does it check? Whether we are redirected. Brilliant? Useless.

It’s simple, but I see it all around. People forget that we need to:

VERIFY THE RIGHT THING

I bet that test was written after the code. Not in test-first fashion.
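Here is how that test could look if it verified what its name promises. A minimal sketch: I’m assuming the show action returns the outlet in the model under the outlet key (in Grails unit tests, the action’s return value is the model map), which may differ from the real controller:

def "should show outlet"() {
  given:
    def outlet = OutletFactory.createAndSaveOutlet(merchant: merchant)
    injectParamsToController(id: outlet.id)
  when:
    def model = controller.show()
  then:
    model.outlet == outlet
}

Now the name and the assertion say the same thing.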

But verifying the right thing is not enough. Let’s have another example. Same controller, different expectation. The name is: “should create outlet insert command with valid params with new account”

Quite complex, isn’t it? If you need an explanation, the name is wrong. But you don’t know the domain, so let me shed some light on it: when we give the controller good parameters, we want it to create a new OutletInsertCommand, and the account of that one should be new.

The name doesn’t say what ‘new’ is, but we should be able to see it in the code.

Have a look at the test:

def "should create outlet insert command with valid params with new account"() {
  given:
    def defaultParams = OutletFactory.validOutletParams
    defaultParams.remove('mobileMoneyAccountNumber')
    defaultParams.remove('accountType')
    defaultParams.put('merchant.id', merchant.id)
    controller.params.putAll(defaultParams)
  when:
    controller.save()
  then:
    1 * securityServiceMock.getCurrentlyLoggedUser() >> user
    1 * commandNotificationServiceMock.notifyAccepters(_)
    0 * _._
    Outlet.count() == 0
    OutletInsertCommand.count() == 1
    def savedCommand = OutletInsertCommand.get(1)
    savedCommand.mobileMoneyAccountNumber == '1000000000000'
    savedCommand.accountType == CyclosAccountType.NOT_AGENT
    controller.flash.message != null
    response.redirectedUrl == '/outlet/list'
}

If you are new to Spock: n * mock.whatever() means that the method “whatever” of the mock object should be called exactly n times. No more, no less. The underscore “_” means “everything” or “anything”. And the >> sign instructs the test framework to return the right-side argument when the method is called.
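To make the syntax concrete, a line like this (mailerMock is a made-up mock, just for illustration) says: expect exactly two calls to send(), with any argument, and return true from each:

2 * mailerMock.send(_) >> true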

So what’s wrong with this test? Pretty much everything. Let’s go through it from the start of the “then” part, mercifully skipping the overly verbose set-up in the “given”.

1 * securityServiceMock.getCurrentlyLoggedUser() >> user

The first line verifies whether some security service was asked for a logged user, and returns the user. And it was asked EXACTLY one time. No more, no less.

Wait, what? How come we have a security service in here? The name of the test doesn’t say anything about security or users, why do we check it?

Well, that’s the first mistake. This part is not what we want to verify. It is probably required by the controller, but that only means it belongs in the “given” part. And it should not verify that it’s called “exactly once”. It’s a stub, for God’s sake. The user is either logged in or not. There is no sense in making him “logged in, but you can ask only once”.
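Moved to the “given” block, it becomes a plain stub, with no cardinality attached:

given:
  securityServiceMock.getCurrentlyLoggedUser() >> user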

Then, there is the second line.

1 * commandNotificationServiceMock.notifyAccepters(_)

It verifies that some notification service is called exactly once. And that may be OK - the business logic may require it - but then… why is it not stated clearly in the name of the test? Ah, I know, the name would be too long. Well, that’s also a hint. You need to make another test, something like “should notify about newly created outlet insert command”.
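A sketch of such a test, reusing the names from the example above (the exact “given” part depends on what the controller requires):

def "should notify about newly created outlet insert command"() {
  given:
    securityServiceMock.getCurrentlyLoggedUser() >> user
    controller.params.putAll(OutletFactory.validOutletParams)
  when:
    controller.save()
  then:
    1 * commandNotificationServiceMock.notifyAccepters(_)
}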

And then, it’s the third line.

0 * _._

My favorite one. If the code is Han Solo, this line is Jabba the Hutt. It wants Han Solo frozen in solid concrete. Or dead. Or both.

This line, if you haven’t deduced it yet, means “You shall not make any other interactions with any mock, or stubs, or anything, Amen!”.

That’s the most stupid thing I’ve seen in a while. Why would a sane programmer ever put it here? That’s beyond my imagination.

No, it isn’t. Been there, done that. The reason a programmer would use such a thing is to make sure that he has covered all the interactions. That he hasn’t forgotten about anything. Tests are good, so what’s wrong with having more of a good thing?

He forgot about sanity. That line is stupid, and it will have its vengeance. It will bite you in the ass some day. And while each bite may be small, because there are hundreds of lines like this, some day you’re gonna get bitten pretty badly. You may not survive.

And then, another line.

Outlet.count() == 0

This verifies that we don’t have any outlets in the database. Do you know why? You don’t. I do. I do, because I know the business logic of this domain. You don’t, because this test sucks at informing you what it should.

Then there is the part, that actually makes sense.

OutletInsertCommand.count() == 1
def savedCommand = OutletInsertCommand.get(1)
savedCommand.mobileMoneyAccountNumber == '1000000000000'
savedCommand.accountType == CyclosAccountType.NOT_AGENT

We expect the object we’ve created to be in the database, and then we verify whether its account is “new”. And we know that “new” means a specific account number and type. Though it screams to be extracted into another method.
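Extracted, it could read like this (hasNewAccount is a name I just made up):

private boolean hasNewAccount(OutletInsertCommand command) {
  command.mobileMoneyAccountNumber == '1000000000000' &&
    command.accountType == CyclosAccountType.NOT_AGENT
}

And the “then” part shrinks to hasNewAccount(savedCommand), which reads exactly like the test name.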

And then…

controller.flash.message != null
response.redirectedUrl == '/outlet/list'

Then we have some flash message expected to be set. And a redirection. And I ask God, why the hell are we testing this? Not because the name of the test says so, that’s for sure. The truth is, looking at this test, I can recreate the method under test line by line.

Isn’t it brilliant? This test mirrors every single line of a not-so-simple method. But try to change the method, try to change a single line, and you have a big chance of blowing this thing up. And when those kinds of tests come in hundreds, you have concrete all over your code. You’ll be able to refactor nothing.

So here’s another lesson. It’s not enough to verify the right thing. You need to

VERIFY ONLY THE RIGHT THING.

Never ever verify the algorithm of the method step by step. Verify the outcomes of the algorithm. You should be free to change the method, as long as the outcome, the real thing you expect, is not changed.

Imagine a sorting problem. Would you verify its internal algorithm? What for? It’s got to work and it’s got to work well. Remember, you want good design and safety. Apart from that, it should be free to change. Your tests should not stand in the way.
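In Spock, an outcome-only test for sorting could be as short as this (mySort is hypothetical; swap quicksort for mergesort inside it and the test still passes):

def "should sort numbers ascending"() {
  expect:
    mySort([3, 1, 2]) == [1, 2, 3]
}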

Now for another horrible example.

@Unroll("test merchant constraints field #field for #error")
def "test merchant all constraints"() {
  when:
    def obj = new Merchant((field): val)

  then:
    validateConstraints(obj, field, error)

  where:
    field                     | val                                    | error
    'name'                    | null                                   | 'nullable'
    'name'                    | ''                                     | 'blank'
    'name'                    | 'ABC'                                  | 'valid'
    'contactInfo'             | null                                   | 'nullable'
    'contactInfo'             | new ContactInfo()                      | 'validator'
    'contactInfo'             | ContactInfoFactory.createContactInfo() | 'valid'
    'businessSegment'         | null                                   | 'nullable'
    'businessSegment'         | new MerchantBusinessSegment()          | 'valid'
    'finacleAccountNumber'    | null                                   | 'nullable'
    'finacleAccountNumber'    | ''                                     | 'blank'
    'finacleAccountNumber'    | 'ABC'                                  | 'valid'
    'principalContactPerson'  | null                                   | 'nullable'
    'principalContactPerson'  | ''                                     | 'blank'
    'principalContactPerson'  | 'ABC'                                  | 'valid'
    'principalContactInfo'    | null                                   | 'nullable'
    'principalContactInfo'    | new ContactInfo()                      | 'validator'
    'principalContactInfo'    | ContactInfoFactory.createContactInfo() | 'valid'
    'feeCalculator'           | null                                   | 'nullable'
    'feeCalculator'           | new FixedFeeCalculator(value: 0)       | 'valid'
    'chain'                   | null                                   | 'nullable'
    'chain'                   | new Chain()                            | 'valid'
    'customerWhiteListEnable' | null                                   | 'nullable'
    'customerWhiteListEnable' | true                                   | 'valid'
    'enabled'                 | null                                   | 'nullable'
    'enabled'                 | true                                   | 'valid'
}

Do you understand what’s going on? If you haven’t seen it before, you may very well not. The “where” part is a beautiful Spock solution for parametrized tests. The headers of those columns are the names of variables used BEFORE, in the first line. It’s sort of a declaration after the usage. The test is going to be fired many times, once for each line of the “where” part. And it’s all possible thanks to Groovy’s Abstract Syntax Tree transformations. We are talking about interpreting and changing the code during compilation. Cool stuff.

So what is this test doing?

Nothing.

Let me show you the code under test.

static constraints = {
  name(blank: false)
  contactInfo(nullable: false, validator: { it?.validate() })
  businessSegment(nullable: false)
  finacleAccountNumber(blank: false)
  principalContactPerson(blank: false)
  principalContactInfo(nullable: false, validator: { it?.validate() })
  feeCalculator(nullable: false)
  customerWhiteListEnable(nullable: false)
}

This static closure tells Grails what kind of validation we expect at the object and database level. In Java, these would most probably be annotations.
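Something along these lines, using Bean Validation (a rough sketch; the exact mapping of Grails constraints to annotations is my approximation):

import javax.validation.constraints.NotNull

class Merchant {
    @NotNull String name                  // roughly: blank/nullable false
    @NotNull ContactInfo contactInfo      // nullable: false
    @NotNull FeeCalculator feeCalculator  // nullable: false
}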

And you do not test annotations. You also do not test static fields. Or closures without any sensible code, without any behavior. And you don’t test whether the framework underneath (Grails/GORM in this case) works the way it works.

Oh, you may test that the first time you use it. Just because you want to know how it works, and whether it works. You want to be safe, after all. But then you should probably delete such a test, and for sure not repeat it for every single domain class out there.

This test doesn’t even verify that, by the way. Because it’s a unit test, working on a mock of a database. It’s not testing the real GORM (Grails Object Relational Mapping, an adapter on top of Hibernate). It’s testing a mock of the real GORM.

Yeah, it’s that stupid.

So if TDD gives us safety, design and feedback, what does this test provide? Absolutely nothing. So why did the programmer put it here? Because his brain says: tests are good. More tests are better.

Well, I’ve got news for you. Every single test which does not provide safety or good design is bad. Period. Those which provide only feedback should be thrown away the moment you stop refactoring the code under test.

So here’s my lesson number three:

PROVIDE SAFETY AND GOOD DESIGN, OR BE GONE.

That was the example of things gone wrong. What should we do about it?

The answer: delete it.

But I have yet to see a programmer who removes his tests. Even ones as shitty as this. We feel very personal about our code, I guess. So in case you are hesitating, let me remind you what Kent Beck wrote in his book about TDD:

The first criterion for your tests is confidence. Never delete a test if it reduces your confidence in the behavior of the system.
The second criterion is communication. If you have two tests that exercise the same path through the code, but they speak to different scenarios for readers, leave them alone.
[Kent Beck, Test Driven Development: By Example]

Now you know, it’s safe to delete it.

So much for today. I have some good examples to show, some more stories to tell, so stay tuned for part 2.

You May Also Like

Zookeeper + Curator = Distributed sync

An application developed for one of my recent projects at TouK involved multiple servers. There was a requirement to ensure failover for the system’s components. Since I already had a few separate components, I didn’t want to add more, and since there was already a Zookeeper ensemble running - required by one of the services - I decided to go that way with my solution.

What is Zookeeper?

Just a crude distributed synchronization framework. However, it implements Paxos-style algorithms (http://en.wikipedia.org/wiki/Paxos_(computer_science)) to ensure that no split-brain scenarios occur. This is quite an important feature, since I don’t have to care about that kind of problem while using this app. You just need to create an ensemble of a couple of its instances, to ensure high availability. It is basically a virtual filesystem, with files, directories and stuff. One could ask: why another filesystem? Well, this one is a rather special one, especially for distributed systems. The reason why building locking algorithms on top of Zookeeper is easy is its Ephemeral Nodes - files that exist as long as the connection that created them exists. After it disconnects, such a file disappears.

With such paradigms in place it’s fairly easy to create some high level algorithms for synchronization.

With that in place, one can safely integrate multiple services, ensuring loose coupling, in a distributed way.

Zookeeper from developer’s POV

With all the base services for Zookeeper started, it seems there is nothing left but to connect to it and start implementing the necessary algorithms. Unfortunately, the API is quite basic and offers file and directory abstractions, with the addition of different node types (file types) - ephemeral and sequential. It is also possible to watch a node for changes.

Using bare Zookeeper is hard!

Creating connections is tedious - and there are lots of things to take care of. Handling an established connection is hard - when establishing a connection to the ensemble, it is also necessary to negotiate a session. During the whole process a number of exceptions can occur - these are “recoverable” exceptions, which can be gracefully handled without breaking the connection.

    class="c8"><span>So, Zookeeper API is hard.</span></p><p class="c1"><span></span></p><p class="c8"><span>Even if one is proficient with that API, then there come recipes. The reason for using Zookeeper is to be able to implement some more sophisticated algorithms on top of it. Unfortunately those aren&rsquo;t trivial and it is again quite hard to implement them without bugs.</span>

And since distributed systems are hard, why would anyone want another difficult to handle tool?

Enter Curator

Happily, the guys from Netflix implemented a nice abstraction for dealing with Zookeeper internals. They called it Curator and use it extensively in the company’s environment. Curator offers a consistent API for Zookeeper’s functionality. It even implements a couple of recipes for distributed systems.

File read/write

The basic use of Zookeeper is as a distributed configuration repository. For this scenario I only need read/write capabilities, to be able to write and read files from the Zookeeper filesystem. This code snippet writes a sample JSON string to a file on the ZK filesystem.

// make sure the parent path exists before writing
EnsurePath ensurePath = new EnsurePath(markerPath);
ensurePath.ensure(client.getZookeeperClient());
String json = "...";
// update the status file if it already exists, create it otherwise
if (client.checkExists().forPath(statusFile(core)) != null)
     client.setData().forPath(statusFile(core), json.getBytes());
else
     client.create().forPath(statusFile(core), json.getBytes());
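Reading the data back is symmetrical; statusFile(core) is the same path helper used above:

byte[] data = client.getData().forPath(statusFile(core))
String json = new String(data)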


Distributed locking

With multiple systems there may be a need for an exclusive lock on some resource, or perhaps some big system requires its components to synchronize based on locks. This “recipe” is an ideal match for those situations.

ref="#"
                                                                                    name="b0329bbbf14b79ffaba1139881914aea887ef6a3"></a>



lock = new InterProcessSemaphoreMutex(client, lockPath);
lock.acquire(5, TimeUnit.MINUTES);
// ... do sth ...
lock.release();


 (from https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/LockingRemotely.java)

Service Advertisement

This is quite an interesting use case. With many small services on different servers, it is not wise to exchange IP addresses and ports between them. When some of those services may go down while others try to replace them, the task gets even harder.

That’s why, with Zookeeper in place, it can be utilised as a registry of existing services.

When a service starts, it registers itself in the ServiceRegistry, offering basic information like its purpose, role, address, and port.

Services that want to use a specific kind of service request access to some instance. This way of configuring easily decouples services from their configuration.

Basically this scenario needs two steps:

1. A service starts and registers its presence (https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerAdvertiser.java#L44):



ServiceDiscovery discovery = getDiscovery();
discovery.start();
ServiceInstance si = getInstance();
log.info(si);
discovery.registerService(si);



2. Another service - on another host, or in another JVM on the same machine - tries to discover who is implementing the service (https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerFinder.java#L50):

<a href="#"

                                                                                                  name="3"></a>

instances = discovery.queryForInstances(serviceName);
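Each returned instance carries the details the worker advertised, for example (a sketch; address and port come from Curator’s ServiceInstance getters):

instances.each { instance ->
    log.info "found worker at ${instance.address}:${instance.port}"
}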

The whole concept here is ridiculously simple - the service advertising its presence just stores a file with its whereabouts. The service that is looking for providers just looks into a specific directory and reads the stored definitions.

In my example, the structure advertised by services looks like this (+ some getters and constructor - the rest is here: https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/model/WorkerMetadata.java):



public final class WorkerMetadata {
    private final UUID workerId;
    private final String listenAddress;
    private final int listenPort;
}


Source code

The above recipes are available in the Curator library (http://curator.incubator.apache.org/). Recipes’ usage examples are in my github repo at https://github.com/zygm0nt/curator-playground

Conclusion

If you’re in need of a reliable platform for exchanging data and managing synchronization, and you need to do it in a distributed fashion, just choose Zookeeper. Then add Curator for the ease of using it. Enjoy!


  1. image comes from: http://www.flickr.com/photos/jfgallery/2993361148
  2. all source code fragments taken from this repo: https://github.com/zygm0nt/curator-playground


Multi module Gradle project with IDE support

This article is a short how-to about setting up a multi-module project with the Gradle build automation tool.

Here's how Rich Seller, a StackOverflow user, describes Gradle:
Gradle promises to hit the sweet spot between Ant and Maven. It uses Ivy's approach for dependency resolution. It allows for convention over configuration but also includes Ant tasks as first class citizens. It also wisely allows you to use existing Maven/Ivy repositories.
So why would one use yet another JVM build tool such as Gradle? The answer is simple: to avoid the frustration involved in using Ant or Maven.

Short story

I was fooling around with some fresh proof of concept and needed a build tool. I'm pretty familiar with Maven, so I created a project from an archetype and opened the build file, pom.xml, for further tuning.
I had been using Grails, with its own build system (similar to Gradle, btw), for some time by then, so after quite a while away from Maven I looked at the pom.xml and found it really repulsive.

Once again I felt clearly: XML is not for humans.

After quick googling I found Gradle. It was still in beta (0.8 version) back then, but it's configured with Groovy DSL and that's what a human likes :)

Where are we

These days Ant can be met only among IT guerrillas, Maven is still on top, and a couple of others, like Ivy, compete for the best position, while Gradle has smoothly entered its mature age. It's now available in version 1.3, released on the 20th of November 2012. I'm glad to recommend it to anyone looking for relief from XML-configured tools, or to anyone just looking for a simple, elastic and powerful build tool.

Let's build

I have already written about the basic project structure, so I'll only give a reminder of it here:
<project root>
├── build.gradle
└── src
    ├── main
    │   ├── java
    │   └── groovy
    └── test
        ├── java
        └── groovy
Have I just referenced myself for the first time? Achievement unlocked! ;)

Gradle, like most build tools, is run from the command line with parameters. The main parameter for Gradle is a 'task name'; for example, we can run the command: gradle build.
There is no 'create project' task, so the directory structure has to be created by hand. This isn't a hassle, though.
The java and groovy sub-folders aren't always mandatory. They depend on which compile plugin is used.

Parent project

Consider an example project, 'the-app', made of three modules, let's say:
  1. database communication layer
  2. domain model and services layer
  3. web presentation layer
Our project directory tree will look like:
the-app
├── dao-layer
│   └── src
├── domain-model
│   └── src
├── web-frontend
│   └── src
├── build.gradle
└── settings.gradle
the-app itself has no src sub-folder, as its purpose is only to contain sub-projects and the build configuration. If needed, it could have been provided with its own src, though.

To glue the modules together, we need to fill the settings.gradle file under the the-app directory with a single line of content specifying the module names:
include 'dao-layer', 'domain-model', 'web-frontend'
Now the gradle projects command can be executed to obtain such a result:
:projects

------------------------------------------------------------
Root project
------------------------------------------------------------

Root project 'the-app'
+--- Project ':dao-layer'
+--- Project ':domain-model'
\--- Project ':web-frontend'
...so we know that Gradle noticed the modules. However, the gradle build command won't run successfully yet, because the build.gradle file is still empty.

Sub project

As in Maven, we can create a separate build config file per module. Let's say we start with the DAO layer.
Thus we create a new file, the-app/dao-layer/build.gradle, with a line of basic build info (notice that the new build.gradle was created under the sub-project directory):
apply plugin: 'java'
This single line of config in any of the modules is enough to execute the gradle build command under the the-app directory, with the following result:
:dao-layer:compileJava
:dao-layer:processResources UP-TO-DATE
:dao-layer:classes
:dao-layer:jar
:dao-layer:assemble
:dao-layer:compileTestJava UP-TO-DATE
:dao-layer:processTestResources UP-TO-DATE
:dao-layer:testClasses UP-TO-DATE
:dao-layer:test
:dao-layer:check
:dao-layer:build

BUILD SUCCESSFUL

Total time: 3.256 secs
To use the Groovy plugin, slightly more configuration is needed:
apply plugin: 'groovy'

repositories {
    mavenLocal()
    mavenCentral()
}

dependencies {
    groovy 'org.codehaus.groovy:groovy-all:2.0.5'
}
At lines 3 to 6, Maven repositories are set. At line 9, the dependency on the Groovy library version is specified. Of course plugins such as 'java', 'groovy' and many more can be mixed with each other.

If we have a settings.gradle file and a build.gradle file for each module, there is no need for a parent the-app/build.gradle file at all. Sure, that's true, but we can go another, better way.

One file to rule them all

Instead of creating many build.gradle config files, one per module, we can use only the parent's one and make it a bit more juicy. So let us move the-app/dao-layer/build.gradle a level up to the-app/build.gradle and fill it with new statements to achieve a full project configuration:
def langLevel = 1.7

allprojects {

    apply plugin: 'idea'

    group = 'com.tamashumi'
    version = '0.1'
}

subprojects {

    apply plugin: 'groovy'

    sourceCompatibility = langLevel
    targetCompatibility = langLevel

    repositories {
        mavenLocal()
        mavenCentral()
    }

    dependencies {
        groovy 'org.codehaus.groovy:groovy-all:2.0.5'
        testCompile 'org.spockframework:spock-core:0.7-groovy-2.0'
    }
}

project(':dao-layer') {

    dependencies {
        compile 'org.hibernate:hibernate-core:4.1.7.Final'
    }
}

project(':domain-model') {

    dependencies {
        compile project(':dao-layer')
    }
}

project(':web-frontend') {

    apply plugin: 'war'

    dependencies {
        compile project(':domain-model')
        compile 'org.springframework:spring-webmvc:3.1.2.RELEASE'
    }
}

idea {
    project {
        jdkName = langLevel
        languageLevel = langLevel
    }
}
At the beginning, a simple variable, langLevel, is declared. It's worth knowing that we can use almost any Groovy code inside a build.gradle file: statements like if conditions, for/while loops, closures, switch-case, etc. Quite an advantage over inflexible XML, isn't it?
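For example (a made-up fragment, only to show plain Groovy at work in a build script):

// compute the version with ordinary Groovy code
def buildDate = new Date().format('yyyyMMdd')
version = "0.1-$buildDate"

// loops work too: apply a common set of plugins to every sub-project
subprojects {
    ['groovy', 'idea'].each { apply plugin: it }
}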

Next comes the allprojects block. Any configuration placed in it will influence - what a surprise - all projects, so the parent itself and the sub-projects (modules). Inside the block we have the IDE (IntelliJ IDEA) plugin applied, which I wrote more about in a previous article (look under the "IDE Integration" heading). Enough to say that with this plugin applied here, the command gradle idea will generate IDEA project files with the module structure and dependencies. This works really well, and plugins for other IDEs are available too.
The remaining two lines in this block define the group and version for the project, similar to how it is done in Maven.

After that, the subprojects block appears. It relates to all modules, but not the parent project. So here the Groovy language plugin is applied, as all modules are assumed to be written in Groovy.
Below that, the source and target language levels are set.
After that come references to the standard Maven repositories.
At the end of the block are the dependencies on the Groovy version and on the test library - the Spock framework.

The following blocks, project(':module-name'), are responsible for each module's configuration. They may be omitted if allprojects or subprojects already configure what's necessary for a specific module. In the example, the per-module configuration goes as follows:
  • The dao-layer module has a dependency on an ORM library - Hibernate.
  • The domain-model module relies on dao-layer as a dependency. The keyword project is used here again, as a reference to another module.
  • Web-frontend applies the 'war' plugin, which builds this module into a Java web archive. Besides that, it refers to the domain-model module and also uses a Spring MVC framework dependency.

At the end, the idea block holds basic info for the IDE plugin. These are parameters corresponding to IDEA's general project settings, shown on the screenshot below.

[screenshot: IntelliJ IDEA project general settings]

jdkName should match the IDE's SDK name, otherwise it has to be set manually in the IDE after each (re)generation of the IDEA project files with the gradle idea command.

Is that it?

In the matter of simplicity - yes. That's enough to automate a modular application build with custom configuration per module. Not rocket science, huh? Think about Maven's XML. It would take more effort to set up the same thing, and you would still end up with a less expressive configuration, quite far from user-friendly.

Check the online user guide for a lot of configuration possibilities or better download Gradle and see the sample projects.
As a tasty bait, take a look at this short selection of available plugins:
  • java
  • groovy
  • scala
  • cpp
  • eclipse
  • netbeans
  • idea
  • maven
  • osgi
  • war
  • ear
  • sonar
  • project-report
  • signing
and more, 3rd party plugins...