Operational problems with Zookeeper

This post is a summary of Kathleen Ting's presentation at the StrangeLoop
conference. You can watch the original here:
http://www.infoq.com/presentations/Misconfiguration-ZooKeeper

I've put this selection here for quick reference.

Connection mismanagement

  • too many connections
    WARN [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@247] - Too many connections from /xx.x.xx.xxx - max is 60
    
  • running out of ZK connections?
    • set maxClientCnxns=200 in zoo.cfg
  • HBase client leaking connections?
    • fixed in HBASE-3777, HBASE-4773, HBASE-5466
    • manually close connections
  • connection closes prematurely
    ERROR: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately.
    • in hbase-site.xml set hbase.zookeeper.recoverable.waittime=30000 (ms)
  • pig hangs connecting to HBase

    CAUSE: location of the ZK quorum is not known to Pig

    • use Pig 0.10, which includes PIG-2115
    • if there is an overlap between TaskTrackers and ZK quorum nodes
      • mark hbase.zookeeper.quorum as final in hbase-site.xml
      • otherwise add hbase.zookeeper.quorum=hadoophbasemaster.lan:2181 to pig.properties (see the config sketch after this list)
    WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused
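
A minimal sketch of the connection-related settings mentioned in this section (the property names are real; the values and the hadoophbasemaster.lan host are just the examples used above and should be adjusted for your cluster):

    # zoo.cfg – raise the per-client connection limit
    maxClientCnxns=200

    <!-- hbase-site.xml -->
    <property>
      <name>hbase.zookeeper.recoverable.waittime</name>
      <value>30000</value> <!-- milliseconds -->
    </property>
    <property>
      <name>hbase.zookeeper.quorum</name>
      <value>hadoophbasemaster.lan</value>
      <final>true</final> <!-- keeps jobs from overriding the quorum location -->
    </property>

    # pig.properties – only needed if Pig cannot pick the quorum up from hbase-site.xml
    hbase.zookeeper.quorum=hadoophbasemaster.lan:2181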

Time mismanagement

  • client session timed out
    INFO org.apache.zookeeper.server.ZooKeeperServer: Expiring session <id>, timeout of 40000ms exceeded
    • ZK and HBase need the same session timeout values (see the config sketch after this list)
      • zoo.cfg: maxSessionTimeout=180000
      • hbase-site.xml: zookeeper.session.timeout=180000
    • don't co-locate ZK with an IO-intense DataNode or RegionServer
    • specify the right amount of heap and tune GC flags
      • turn on parallel/CMS/incremental GC
  • clients lose connections
    WARN org.apache.zookeeper.ClientCnxn - Session <id> for server <name>, unexpected error, closing socket connection and attempting reconnect java.io.IOException: Broken pipe
    
    • don’t use SSD drive for ZK transaction log
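
A hedged sketch of the session-timeout and GC tuning above (the 180000 ms value comes from the talk; the heap size and GC flags are examples only and need tuning per cluster):

    # zoo.cfg – upper bound on the session timeout a client may negotiate
    maxSessionTimeout=180000

    <!-- hbase-site.xml -->
    <property>
      <name>zookeeper.session.timeout</name>
      <value>180000</value>
    </property>

    # conf/java.env (picked up by the ZK start scripts) – example heap size and parallel/CMS GC
    export JVMFLAGS="-Xmx2g -XX:+UseParNewGC -XX:+UseConcMarkSweepGC"

The point of the GC tuning is to keep pauses well below the session timeout, so neither servers nor clients get expired while the JVM is collecting.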

Disk management

  • unable to load database – unable to run quorum server
    FATAL Unable to load database on disk java.io.IOException: Failed to process transaction type: 2 error: KeeperErrorCode = NoNode for <file> at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:152)
    
    • archive and wipe /var/zookeeper/version-2 if the other two ZK servers
      are running
  • unable to load database – unreasonable length exception
    FATAL Unable to load database on disk java.io.IOException: Unreasonable length = 1048583 at org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:100)
    • the server allows a client to write a znode larger than the server can later read back from disk
    • if a znode is not readable, increase jute.maxbuffer
      • look for "Packet len <xx> is out of range" in the client log
      • increase it by about 20%
      • set it via JVMFLAGS="-Djute.maxbuffer=yy" bin/zkCli.sh
      • fixed in ZOOKEEPER-151
  • failure to follow leader
    WARN org.apache.zookeeper.server.quorum.Learner: Exception when following the leader java.net.SocketTimeoutException: Read timed out

    CAUSE:

    • disk IO contention, network issues
    • ZK snapshot is too large (lots of ZK nodes)

    SOLVE:

    • reduce IO contention by putting dataDir on a dedicated spindle
    • increase initLimit on all ZK servers and restart, see
      ZOOKEEPER-1521 (a config sketch follows this list)
    • monitor network
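
A hedged sketch of the disk-related settings from this section (paths and numbers are examples only):

    # zoo.cfg on every server in the ensemble
    dataDir=/zk/snapshots          # snapshots on their own spindle
    dataLogDir=/zk/txnlog          # transaction log on a separate spindle
    initLimit=20                   # ticks a follower may take to sync with the leader; raise it if snapshots are large (ZOOKEEPER-1521)
    syncLimit=5

    # client side: reading back an over-sized znode with the CLI,
    # sized roughly 20% above the "Packet len" reported in the client log
    JVMFLAGS="-Djute.maxbuffer=1300000" bin/zkCli.sh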

Best Practices

DOs

  • separate spindles for dataDir & dataLogDir
  • allocate 3 or 5 ZK servers
  • tune garbage collection
  • run zkCleanup.sh script via cron

DON’Ts

  • don't co-locate ZK with an I/O-intense DataNode or RegionServer
  • don’t use SSD drive for ZK transaction log

You may run a ZooKeeper node as an observer – a non-voting member (a config sketch follows):

  • in zoo.cfg
    peerType=observer
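
A hedged sketch of an observer setup (hostnames are examples; peerType=observer goes only into the observer's own zoo.cfg, while the :observer suffix must appear in the server list of every node):

    # zoo.cfg on the observer node
    peerType=observer

    # server list, identical on all nodes (voters and the observer)
    server.1=zk1.lan:2888:3888
    server.2=zk2.lan:2888:3888
    server.3=zk3.lan:2888:3888
    server.4=zk4.lan:2888:3888:observer
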
You May Also Like

Log4j and MDC in Grails

Log4j provides a very useful feature: MDC – mapped diagnostic context. It can be used to store data in the context of the current thread. It may sound a bit scary, but the idea is simple.

My post is based on http://burtbeckwith.com/blog/?p=521 from Burt Beckwith's excellent blog, which is definitely worth checking out if you are interested in Grails.

Short background story...


Suppose we want to add logging to our brand new shopping system, and we want each log entry to contain the customer's shopping basket number. Our system can be used by many users at once, each performing many transactions and actions such as adding items. How can we achieve that? Of course we could add the basket number in every place where we do some logging, but that would be boring and error-prone.

Instead, we can use MDC to store a variable with the basket number in a map.

In fact, MDC can be treated as a map of custom values for the current thread that can be used by the logger.


How to do that with Grails?


Using MDC with Grails is quite simple. All we need to do is create our own custom filter which works for the given URLs and puts our data into MDC.

Filters in Grails are classes in the grails-app/conf/ directory whose names end with the Filters.groovy suffix. We can create this class manually or use the Grails command: 
grails create-filters info.rnowak.App.Basket

As a result, a class named BasketFilters will be created in grails-app/conf/info/rnowak/UberApp.

Initially the filter class looks a little bit empty:
class BasketFilters {
    def filters = {
        all(controller: '*', action: '*') {
            before = {

            }
            after = { Map model ->

            }
            afterView = { Exception e ->

            }
        }
    }
}
All we need to do is fill in the empty closures, modify the filter properties, and put some data into MDC.

all is the general name of our filter; the class BasketFilters (plural!) can contain many different filters. You can name it whatever you want; for this post let's assume it is named basketFilter.

Another thing is changing the filter parameters. According to the official documentation (link) we can customize our filter in many ways. You can specify the controller to be filtered, its actions, the filtered URLs, and so on. In our example you can stay with the default option, where the filter is applied to every action of every controller. If you are interested in filtering only some URLs, use the uri parameter with an expression describing the URLs to be filtered.

The three closures already defined in the template each have their purpose and are invoked under these conditions:

  • before - as the name says, it is executed before the filtered action takes place
  • after - similarly, it is called after the action
  • afterView - called after the rendering of the action's view
OK, so now we know what these mysterious methods are and when they are called. But what can be done within them? In the official Grails docs (link again), under section 7.6.3, there is a list of properties that are available for use in a filter.

With that knowledge, we can proceed to implementing the filter.

Putting something into MDC in filter


What we want to do is quite easy: we want to retrieve the basket number from the request parameters and put it into MDC in our filter:
import org.apache.log4j.MDC

class BasketFilters {
    def filters = {
        basketFilter(controller: '*', action: '*') {
            before = {
                MDC.put("basketNumber", params.basketNumber ?: "")
            }
            after = { Map model ->
                MDC.remove("basketNumber")
            }
        }
    }
}

We retrieve the basket number from the Grails params map and put it into MDC under the specified key ("basketNumber" in this case), which will later be used in the logger conversion pattern. It is important to remove the custom value after the action has been processed, to avoid leaks.

So we are putting something into MDC. But how do we make use of it in the logs?


We can refer to custom data in MDC in the conversion pattern using the syntax %X{key}, where key is the key we used in the filter to put the data, like:
def conversionPattern = "%d{yyyy-MM-dd HH:mm:ss} %-5p %t [%c{1}] %X{basketNumber} - %m%n"
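
For context, here is a minimal sketch of where such a pattern could live in a Grails 2.x application; the appender name and root log level are just examples:

// grails-app/conf/Config.groovy
log4j = {
    appenders {
        // console appender whose layout prints the %X{basketNumber} value from MDC
        console name: 'stdout',
                layout: pattern(conversionPattern: '%d{yyyy-MM-dd HH:mm:ss} %-5p %t [%c{1}] %X{basketNumber} - %m%n')
    }
    root {
        info 'stdout'
    }
}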


And that's it :) We've put custom data into Log4j's MDC and successfully used it in the logs to display interesting values.