Hadoop HA setup

With the advent of Hadoop 2.x, there is finally a working
High-Availability solution. Two of them, in fact. Both are easy to
configure and use, and neither requires external components such as
DRBD. Everything comes neatly packaged in the Cloudera Hadoop
distribution, which was the first to ship this solution.

Read on to find out how to use it.

The most important weakness of previous Hadoop releases was the single
point of failure, which happened to be the NameNode. The NameNode, as a
key component of every Hadoop cluster, is responsible for managing
filesystem namespace information and block locations. Losing its data
means losing all the data stored on the DataNodes: HDFS can no longer
locate specific files or their blocks. This renders your cluster
inoperable.

So it is crucial to be able to detect and counter problems with the
NameNode. The most desirable behavior is to have a hot backup that
ensures no-downtime cluster operation. To achieve this, the second
NameNode needs to have up-to-date filesystem metadata and also needs to
be up and running already, because starting a NameNode from an existing
set of data may easily take many minutes while it reconstructs the
actual filesystem state.

The previously used solution, deploying a SecondaryNameNode, was
somewhat flawed. It took a long time to recover after a failure, and it
was not a hot-backup solution, which added to the problem. Some other
solution was required.

So, two things needed to be made redundant: the contents of the edits
directory, and the block location reports, which in an HA deployment
each DataNode sends to both NameNodes. This was accomplished in two
steps. The first came with the CDH 4 beta release: a solution based on
a shared edits directory. Then, with CDH 4.1, came the quorum-based
solution.

Find out how to configure those on your cluster.

Shared edits directory solution

This kind of setup assumes that the cluster has a shared storage
directory available to both NameNodes. It should be deployed on some
kind of network-based filesystem; NFS or GlusterFS are possible choices.

<property>
  <name>fs.default.name</name>
  <value>hdfs://example-cluster</value>
</property>
<!-- common server name -->
<property>
  <name>dfs.nameservices</name>
  <value>example-cluster</value>
</property>

<!-- HA configuration -->
<property>
  <name>dfs.ha.namenodes.example-cluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.example-cluster.nn1</name>
  <value>master1:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.example-cluster.nn2</name>
  <value>master2:8020</value>
</property>
<property>
  <name>dfs.namenode.http-address.example-cluster.nn1</name>
  <value>0.0.0.0:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.example-cluster.nn2</name>
  <value>0.0.0.0:50070</value>
</property>

<!-- Storage for edits' files -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>file:///mnt/filer1/dfs/ha-name-dir-shared</value>
</property>

<!-- Client failover -->
<property>
  <name>dfs.client.failover.proxy.provider.example-cluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

<!-- Fencing configuration -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/user/.ssh/id_dsa</value>
</property>


<!-- Automatic failover configuration -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>zk1:2181,zk2:2181,zk3:2181</value>
</property>

This setup is quite OK, as long as you’re comfortable with maintaining a
separate service (network storage) for handling the HA state. It seems
error prone to me, because it adds another service whose high
availability must itself be ensured. NFS seems to be a bad choice here,
because AFAIK it does not offer HA out of the box.

On the other hand, we have GlusterFS, which is a distributed filesystem
that you can deploy on multiple bricks with an increased replication
level.

Nevertheless, it still brings the additional burden of another service
to maintain.

Quorum based solution

With the release of CDH 4.1.0 we can use a much better integrated
solution built on JournalNodes. All updates are now synchronized
through a group of JournalNode daemons. Each JournalNode holds the same
data, and all the NameNodes are able to receive filesystem state updates
from those daemons.

This solution is much more consistent with the rest of the Hadoop ecosystem.

Please note that the config is almost identical to the one needed for
the shared edits directory solution. The only difference is the value of
dfs.namenode.shared.edits.dir, which now points to all the journal
nodes deployed in our cluster.
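
To make the structure of that value concrete, here is a minimal Java sketch that splits a qjournal URI into its JournalNode addresses and works out how many of them must acknowledge a write. The class and method names (QJournalUri, journalNodes, majority, toleratedFailures) are made up for this post and are not part of Hadoop. The one thing taken from Hadoop itself is the assumption that the quorum journal treats an edit as committed once a majority of JournalNodes has accepted it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not a Hadoop class.
class QJournalUri {

    // Extracts the "host:port" entries from qjournal://h1:p;h2:p;h3:p/journalId
    static List<String> journalNodes(String uri) {
        String prefix = "qjournal://";
        if (!uri.startsWith(prefix)) {
            throw new IllegalArgumentException("Not a qjournal URI: " + uri);
        }
        String rest = uri.substring(prefix.length());
        int slash = rest.indexOf('/');
        String hosts = slash >= 0 ? rest.substring(0, slash) : rest;

        List<String> nodes = new ArrayList<>();
        for (String host : hosts.split(";")) {
            if (!host.isEmpty()) {
                nodes.add(host);
            }
        }
        return nodes;
    }

    // Smallest number of nodes that forms a majority of n.
    static int majority(int n) {
        return n / 2 + 1;
    }

    // How many nodes may be down while a majority is still reachable.
    static int toleratedFailures(int n) {
        return (n - 1) / 2;
    }

    public static void main(String[] args) {
        String uri = "qjournal://node1:8485;node2:8485;node3:8485/example-cluster";
        List<String> nodes = journalNodes(uri);
        System.out.println("JournalNodes: " + nodes);
        System.out.println("Majority needed: " + majority(nodes.size()));
        System.out.println("Failures tolerated: " + toleratedFailures(nodes.size()));
    }
}
```

With the three nodes from the config above, a majority is two, so one JournalNode can be down without blocking writes. An even number of nodes adds cost without tolerating more failures, which is why odd counts are the usual choice.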

<property>
  <name>fs.default.name</name>
  <value>hdfs://example-cluster</value>
</property>
<!-- common server name -->
<property>
  <name>dfs.nameservices</name>
  <value>example-cluster</value>
</property>

<!-- HA configuration -->
<property>
  <name>dfs.ha.namenodes.example-cluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.example-cluster.nn1</name>
  <value>master1:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.example-cluster.nn2</name>
  <value>master2:8020</value>
</property>
<property>
  <name>dfs.namenode.http-address.example-cluster.nn1</name>
  <value>0.0.0.0:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.example-cluster.nn2</name>
  <value>0.0.0.0:50070</value>
</property>

<!-- Storage for edits' files -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://node1:8485;node2:8485;node3:8485/example-cluster</value>
</property>

<!-- Client failover -->
<property>
  <name>dfs.client.failover.proxy.provider.example-cluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

<!-- Fencing configuration -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/user/.ssh/id_dsa</value>
</property>


<!-- Automatic failover configuration -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>zk1:2181,zk2:2181,zk3:2181</value>
</property>

Infrastructure

In both cases you need to run the ZooKeeper-based Failover Controller
(hadoop-hdfs-zkfc). This daemon negotiates which NameNode should
become active and which should remain on standby.
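
The negotiation can be pictured as a race for a single lock. The following Java sketch is a toy, in-memory model only: the real failover controller keeps its lock in ZooKeeper, while here an AtomicReference stands in for it. ToyElection and its methods are hypothetical names invented for illustration, and the sketch ignores health checks and fencing entirely.

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy model of active/standby negotiation; not the actual ZKFC logic.
class ToyElection {

    // Stands in for the lock that the real controllers keep in ZooKeeper.
    private final AtomicReference<String> lockHolder = new AtomicReference<>(null);

    // A NameNode that grabs the free lock (or already holds it) is active;
    // every other NameNode stays on standby.
    String tryBecomeActive(String nameNode) {
        boolean acquired = lockHolder.compareAndSet(null, nameNode);
        boolean alreadyHeld = nameNode.equals(lockHolder.get());
        return (acquired || alreadyHeld) ? "active" : "standby";
    }

    // Models the lock being released when the holder fails or disconnects.
    void sessionLost(String nameNode) {
        lockHolder.compareAndSet(nameNode, null);
    }

    public static void main(String[] args) {
        ToyElection election = new ToyElection();
        System.out.println("nn1: " + election.tryBecomeActive("nn1"));
        System.out.println("nn2: " + election.tryBecomeActive("nn2"));
        election.sessionLost("nn1");
        System.out.println("nn2 after nn1 failure: " + election.tryBecomeActive("nn2"));
    }
}
```

The point of the model is that only one holder can exist at a time, so the two NameNodes never both consider themselves active as long as the lock service itself is consistent.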

But that’s not all. Depending on the way you’ve chosen to deploy HA, you
need to do some other things:

Shared edits dir

With a shared edits dir you need to deploy a networked filesystem and
mount it on both of your NameNodes. After that you can run your cluster
and be happy with your new HA.

Quorum based

For QJournal to operate you need to install one new package called
hadoop-hdfs-journalnode. This provides startup scripts for the
JournalNode daemons. Choose at least three nodes that will be
responsible for handling edits state and deploy journal nodes on them.

Conclusion

Thanks to the guys from Cloudera, we can now use enterprise-grade High
Availability features for Hadoop. Eliminating the single point of
failure in your cluster is essential for easy maintainability of your
infrastructure.

Given the above choices, I’d suggest using the QJournal setup, because
of its relatively small impact on the overall cluster architecture. Its
good performance and fairly simple setup enable users to easily start
using Hadoop in an HA setup.

Are you using Hadoop with HA? What are your impressions?

You May Also Like

Log4j and MDC in Grails

Log4j provides a very useful feature: MDC, the mapped diagnostic context. It can be used to store data in the context of the current thread. It may sound a bit scary, but the idea is simple.

This post is based on http://burtbeckwith.com/blog/?p=521 from Burt Beckwith's excellent blog, which is definitely worth checking out if you are interested in Grails.

Short background story...


Suppose we want to add logging to our brand new shopping system, and we want every log entry to contain the customer's shopping basket number. The system can be used by many users at once, each performing many transactions and actions such as adding items. How can we achieve that? Of course we could add the basket number in every place where we do some logging, but this would be boring and error-prone.

Instead, we can use MDC to store a variable with the basket number in a map.

In fact, MDC can be treated as a map of custom values for the current thread that can be used by the logger.
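
To illustrate the idea of a per-thread map without pulling in log4j, here is a minimal Java sketch built on ThreadLocal. ToyMdc and its methods are invented for this example and only mimic the put/get/remove shape of the MDC calls used later in the post; log4j's own implementation differs in details (for example in whether child threads inherit values), which this toy does not model.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a mapped diagnostic context; not the log4j class.
class ToyMdc {

    // Every thread gets its own private map on first access.
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    static void put(String key, String value) {
        CONTEXT.get().put(key, value);
    }

    static String get(String key) {
        return CONTEXT.get().get(key);
    }

    static void remove(String key) {
        CONTEXT.get().remove(key);
    }

    // Reads the key from a freshly started thread to show the isolation.
    static String getFromNewThread(String key) {
        String[] seen = new String[1];
        Thread reader = new Thread(() -> seen[0] = get(key));
        reader.start();
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        put("basketNumber", "42");
        System.out.println("this thread:  " + get("basketNumber"));
        System.out.println("other thread: " + getFromNewThread("basketNumber"));
        remove("basketNumber");
        System.out.println("after remove: " + get("basketNumber"));
    }
}
```

Because each request is typically handled by one thread, a value stored at the start of the request is visible to every log call made while handling it, and invisible to requests running on other threads.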


How to do that with Grails?


Using MDC with Grails is quite simple. All we need to do is create our own custom filter which is applied to the given URLs and puts our data into MDC.

Filters in Grails are classes in the grails-app/conf directory whose file names end with the Filters.groovy suffix. We can create such a class manually or use the Grails command:
grails create-filters info.rnowak.App.Basket

As a result, a class named BasketFilters will be created in grails-app/conf/info/rnowak/App.

Initially filter class looks a little bit empty:
class BasketFilters {
    def filters = {
        all(controller:'*', action:'*') {
            before = {

            }
            after = { Map model ->

            }
            afterView = { Exception e ->

            }
        }
    }
}
All we need to do is fill in the empty closures, modify the filter properties and put some data into MDC.

all is the general name of our filter, as the class BasketFilters (plural!) can contain many different filters. You can name it whatever you want; for this post let's assume it will be named basketFilter.

The next thing to change is the filter parameters. According to the official documentation, we can customize our filter in many ways. You can specify the controller to be filtered, its actions, the filtered URLs and so on. In our example you can stay with the default, where the filter is applied to every action of every controller. If you are interested in filtering only some URLs, use the uri parameter with an expression describing the URLs to be filtered.

The three closures already defined in the template each have their own purpose and are invoked under these conditions:

  • before - as the name says, it is executed before the filtered action takes place
  • after - similarly, it is called after the action
  • afterView - called after the action's view has been rendered

OK, so now we know what these mysterious closures are and when they are called. But what can be done within them? In the official Grails docs, under section 7.6.3, there is a list of properties that are available for use in a filter.

With that knowledge, we can proceed to implementing filter.

Putting something into MDC in filter


What we want to do is quite easy: we want to retrieve the basket number from the request parameters and put it into MDC in our filter:
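import org.apache.log4j.MDC

class BasketFilters {
    def filters = {
        basketFilter(controller:'*', action:'*') {
            before = {
                // store the basket number for the current request's thread
                MDC.put("basketNumber", params.basketNumber ?: "")
            }
            after = { Map model ->
                // clean up so the value does not leak into other requests
                MDC.remove("basketNumber")
            }
        }
    }
}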

We retrieve the basket number from the Grails params map and put it into MDC under the specified key ("basketNumber" in this case), which will later be used in the logger's conversion pattern. It is important to remove the custom value after the action has been processed, to avoid leaks.

So we are putting something into MDC. But how do we make use of it in logs?


We can refer to custom data in MDC in the conversion pattern using the syntax %X{key}, where key is the key we used in the filter to put the data, like:
def conversionPattern = "%d{yyyy-MM-dd HH:mm:ss} %-5p %t [%c{1}] %X{basketNumber} - %m%n"
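
To show what the %X{key} placeholder amounts to, here is a small Java sketch that replaces such placeholders in a pattern string with values from a map. It is a simplified imitation written for this post, not log4j's pattern parser: ToyPattern and expandMdc are hypothetical names, only %X{...} is handled, and in this sketch a missing key simply renders as an empty string.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified imitation of %X{key} expansion; not log4j's real parser.
class ToyPattern {

    // Matches %X{someKey} and captures the key name.
    private static final Pattern MDC_REF = Pattern.compile("%X\\{([^}]+)\\}");

    static String expandMdc(String pattern, Map<String, String> mdc) {
        Matcher matcher = MDC_REF.matcher(pattern);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = mdc.getOrDefault(matcher.group(1), "");
            // quoteReplacement keeps '$' and '\' in values literal
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> mdc = Collections.singletonMap("basketNumber", "42");
        System.out.println(expandMdc("%-5p [%c{1}] %X{basketNumber} - %m", mdc));
    }
}
```

The other conversion characters (%d, %p, %c and so on) are left untouched here; in a real logger they are filled in from the logging event itself.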


And that's it :) We've put custom data in log4j MDC and successfully used it in logs to display interesting values.

How to automate tests with Groovy 2.0, Spock and Gradle

This is the launch of the 1st blog in my life, so cheers and have a nice reading!

y u no test?

A couple of years ago I wasn't a big fan of unit testing. It was obvious to me that well prepared unit tests are crucial, though I didn't yet know exactly why. I just felt they were important. My dislike of writing automated tests was mostly related to the effort necessary to prepare them. Spaghetti code was also easy to spot in test sources.

Some goodies at hand

Now I know! Tests are crucial for getting a better design and confidence: the confidence to improve code without hesitation. Moreover, now I have a tool that makes test automation easy as Sunday morning... I'm talking about the Spock Framework. If you got here, you probably already know what Spock is, so I won't introduce it. Suffice it to say that Spock is an awesome unit testing tool which, thanks to Groovy AST transformations, greatly simplifies the creation of tests.

An obstacle

The point is, since a new major version of Groovy has been released (2.0), there is no matching version of Spock available yet.

What now?

Well, as a matter of fact there is such a version. It's still under development, though. It can be obtained from the Sonatype snapshots Maven repository. We could of course use Maven to build the project and run the tests. But why not go an even more "groovy" way? XML is not for humans, is it? Let's use Gradle.

The build file

Update: there is an updated version of the build file at the end of the post.
apply plugin: 'groovy'
apply plugin: 'idea'

def langLevel = 1.7

sourceCompatibility = langLevel
targetCompatibility = langLevel

group = 'com.tamashumi.example.testwithspock'
version = '0.1'

repositories {
    mavenLocal()
    mavenCentral()
    maven { url 'http://oss.sonatype.org/content/repositories/snapshots/' }
}

dependencies {
    groovy 'org.codehaus.groovy:groovy-all:2.0.1'
    testCompile 'org.spockframework:spock-core:0.7-groovy-2.0-SNAPSHOT'
}

idea {
    project {
        jdkName = langLevel
        languageLevel = langLevel
    }
}
As you can see, the build.gradle file is almost self-explanatory. The Groovy plugin is applied to compile Groovy code. It needs groovy-all.jar, declared in version 2.0.1 in the dependencies block, right next to Spock in version 0.7-groovy-2.0-SNAPSHOT. Most importantly, the URL of the snapshots Maven repository is added in the repositories block.

Project structure and execution

Gradle's default project directory structure is similar to Maven's. Unfortunately there is no 'create project' task, so you have to create it by hand. It's not a big obstacle though. The structure you create will look more or less as follows:
<project root>
├── build.gradle
└── src
    ├── main
    │   └── groovy
    └── test
        └── groovy
To build the project you can now type the command gradle build, or gradle test to only run the tests.

How about Java?

You can test native Java code with Spock. Just add a src/main/java directory and the following line to build.gradle:
apply plugin: 'java'
This way, if you don't want to or just can't deploy Groovy-compiled code into your production JVM for any reason, the whole goodness of testing with Spock and Groovy is still at hand.

A silly-simple example

Just to show that it works, here you go with a basic example.

Java simple example class:

public class SimpleJavaClass {

    public int sumAll(int... args) {
        int sum = 0;

        for (int arg : args) {
            sum += arg;
        }

        return sum;
    }
}

Groovy simple example class:

class SimpleGroovyClass {

    String concatenateAll(char separator, String... args) {
        args.join(separator as String)
    }
}

The test, uhm... I mean the Specification:

import spock.lang.Specification
import spock.lang.Unroll

class JustASpecification extends Specification {

    @Unroll('Sums integers #integers into: #expectedResult')
    def "Can sum different amount of integers"() {

        given:
        def instance = new SimpleJavaClass()

        when:
        def result = instance.sumAll(*integers)

        then:
        result == expectedResult

        where:
        expectedResult | integers
        11             | [3, 3, 5]
        8              | [3, 5]
        254            | [2, 4, 8, 16, 32, 64, 128]
        22             | [7, 5, 6, 2, 2]
    }

    @Unroll('Concatenates strings #strings with separator "#separator" into: #expectedResult')
    def "Can concatenate different amount of strings with a specified separator"() {

        given:
        def instance = new SimpleGroovyClass()

        when:
        def result = instance.concatenateAll(separator, *strings)

        then:
        result == expectedResult

        where:
        expectedResult     | separator   | strings
        'Whasup dude?'     | ' ' as char | ['Whasup', 'dude?']
        '2012/09/15'       | '/' as char | ['2012', '09', '15']
        'nice-to-meet-you' | '-' as char | ['nice', 'to', 'meet', 'you']
    }
}
To run the tests with Gradle, simply execute the command gradle test. Test reports can be found at <project root>/build/reports/tests/index.html.


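Please note that the test is executed once for each row of parameters in the 'table' in the specification's where: block, and that, thanks to the @Unroll annotation, each of those runs is reported separately under its own name. Although where: looks like a plain Java label, it gets its meaning from Spock's AST transformation magic.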
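
For readers coming from plain Java, the data table is roughly comparable to looping over rows of inputs and expected values by hand. The sketch below is only that analogy written out, using the same rows as the first feature method; it is not how Spock is implemented, and ToyDataTable with its fields and methods is a name invented for this illustration.

```java
// Hand-rolled analogy of a data-driven test; not Spock's implementation.
class ToyDataTable {

    // Same logic as SimpleJavaClass.sumAll, repeated to keep the sketch self-contained.
    static int sumAll(int... args) {
        int sum = 0;
        for (int arg : args) {
            sum += arg;
        }
        return sum;
    }

    // One entry per table row: expected result and the integers to sum.
    static final int[] EXPECTED = {11, 8, 254, 22};
    static final int[][] INTEGERS = {
        {3, 3, 5},
        {3, 5},
        {2, 4, 8, 16, 32, 64, 128},
        {7, 5, 6, 2, 2}
    };

    // Runs the check once per row and counts the rows that fail.
    static int failingRows() {
        int failures = 0;
        for (int row = 0; row < EXPECTED.length; row++) {
            if (sumAll(INTEGERS[row]) != EXPECTED[row]) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("Failing rows: " + failingRows());
    }
}
```

Compared with this loop, the Spock version needs no bookkeeping code at all, and a failing row is reported by name instead of as a bare counter.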

IDE integration

Gradle's plugin for IntelliJ IDEA

I've also added the IntelliJ IDEA plugin for IDE project generation, along with some configuration for it (the IDE's JDK name). To generate IDEA's project files just run the command gradle idea. There are Eclipse and NetBeans plugins available too, but I haven't tested them. The IDEA one works well.

IntelliJ IDEA's plugins for Gradle

IDEA itself has light Gradle support built in. To avoid confusion: Gradle has a plugin for IDEA, and IDEA has a plugin for Gradle. To get even more 'pluginated', there is also the JetGradle plugin within IDEA. However, I haven't found a good reason for its existence, except maybe one: it shows the dependency tree. There is a bug though: JetGradle works fine only for lang level 1.6. Strangely, all the plugins together do not conflict with each other. They even form a complementary, quite useful tool set.

Running tests under IDE

Just to add something sweet: a Specification can also be run with the JUnit runner under IntelliJ IDEA. Right-click the JustASpecification class, or a whole folder of classes extending Specification, and select "Run ..." to get a nice view of the results.

Building web application

If you need to build a Java web application and bundle it as a WAR archive, just add the plugin by typing the line
apply plugin: 'war'
in the build.gradle file and create a directory src/main/webapp.

Want to know more?

If you haven't heard about Spock or Gradle before, or are just curious, check the following links:

What next?

The last thing left is to write the real production code you are about to test. Whether it will be Groovy or Java, I leave to your needs and invention. Of course, you are welcome to post comments here. I'll answer, or even write some more posts on the subject.

Important update

Spock version 0.7 has been released, so the above build file doesn't work anymore. It's easy to fix though. Just remove the last dash and the word SNAPSHOT from the Spock dependency declaration. The other important thing is that spock-core now depends on groovy-all-2.0.5, so to avoid a dependency conflict the groovy dependency should be changed from version 2.0.1 to 2.0.5.
Besides that, the oss.sonatype.org snapshots Maven repository can be removed. No obstacles any more, and the build file now looks as follows:
apply plugin: 'groovy'
apply plugin: 'idea'

def langLevel = 1.7

sourceCompatibility = langLevel
targetCompatibility = langLevel

group = 'com.tamashumi.example.testwithspock'
version = '0.1'

repositories {
    mavenLocal()
    mavenCentral()
}

dependencies {
    groovy 'org.codehaus.groovy:groovy-all:2.0.5'
    testCompile 'org.spockframework:spock-core:0.7-groovy-2.0'
}

idea {
    project {
        jdkName = langLevel
        languageLevel = langLevel
    }
}