[:en] Operational problems with Zookeeper

This post is a summary of what has been presented by Kathleen Ting on StrangeLoop conference. You can watch the original here: http://www.infoq.com/presentations/Misconfiguration-ZooKeeper I’ve decided to put this selection here for quick reference. …This post is a summary of what has been presented by Kathleen Ting on StrangeLoop conference. You can watch the original here: http://www.infoq.com/presentations/Misconfiguration-ZooKeeper I’ve decided to put this selection here for quick reference. …

This post is a summary of what has been presented by Kathleen Ting on
StrangeLoop conference. You can watch the original here:
http://www.infoq.com/presentations/Misconfiguration-ZooKeeper

I’ve decided to put this selection here for quick reference.

Connection mismanagement

  • too many connections
    WARN [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@247] - Too many connections from /xx.x.xx.xxx - max is 60
    
  • running out of ZK connections?
    • set maxClientCnxns=200 in zoo.cfg
  • HBase client leaking connections?
    • fixed in HBASE-3777, HBASE-4773, HBASE-5466
    • manually close connections
  • connection closes prematurely
    ERROR: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately.
  • in hbase-site.xml set hbase.zookeeper.recoverable.waittime=30000ms
  • pig hangs connecting to HBase

    CAUSE: location of ZK quorum is not known to Pig

    • use Pig 10, which includes PIG-2115
    • if there is an overlap between TaskTrackers and ZK quorum nodes
      • set hbase.zookeeper.quorum to final in hbase-site.xml
      • otherwise add hbaze.zoopeeker.quorum=hadoophbasemaster.lan:2181 in pig.properties
WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectionException: Connection refused!

Time mismanagement

  • client session timed out
INFO org.apache.zookeeper.server.ZooKeeperServer: Expiring session <id>, timeout of 40000ms exceeded
    • ZK and HBase need the same session timeout values
      • zoo.cfg: maxSession=Timeout=180000
      • hbase-site.xml: zookeeper.session.timeout=180000
    • don’t co-locate ZK with IO-intense DataNode or RegionServer
    • specify right amount of heap and tune GC flags
      • turn on parallel/CMS/incremental GC
  •  
  • clients lose connections
    WARN org.apache.zookeeper.ClientCnxn - Session <id> for server <name>, unexpected error, closing socket connection and attempting reconnect java.io.IOException: Broken pipe
    
    • don’t use SSD drive for ZK transaction log

Disk management

  • unable to load database – unable to run quorum server
    FATAL Unable to load database on disk !  java.io.IOException: Failed to process transaction type: 2 error: KeeperErrorCode = NoNode for <file> at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:152)!
    
    • archive and wipe /var/zookeeper/version-2 if other two ZK servers
      are running
  • unable to load database – unreasonable length exception
    FATAL Unable to load database on disk java.io.IOException: Unreasonable length = 1048583 at org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:100)
    • server allows a client to set data larger than the server can read from disk
    • if a znode is not readable, increase jute.maxbuffer
      • look for "Packet len <xx> is out of range" in the client log
      • increase it by 20%
      • set in JVMFLAGS="-Djute.maxbuffer=yy" bin/zkCli.sh
      • fixed in ZOOKEEPER-151
  • failure to follow leader
    WARN org.apache.zookeeper.server.quorum.Learner: Exception when following the leader java.net.SocketTimeoutException: Read timed out

    CAUSE:

    • disk IO contention, network issues
    • ZK snapshot is too large (lots of ZK nodes)

    SOLVE:

    • reduce IO contention by putting dataDir on dedicated spindle
    • increase initLimit on all ZK servers and restart, see
      ZOOKEEPER-1521
    • monitor network

Best Practices

DOs

  • separate spindles for dataDir & dataLogDir
  • allocate 3 or 5 ZK servers
  • tune garbage collection
  • run zkCleanup.sh script via cron

DON’Ts

  • dont’ co-locate ZK with I/O intense DataNode or RegionServer
  • don’t use SSD drive for ZK transaction log

You may use Zookeeper as an observer – a non-voting member:

  • in zoo.cfg
    peerType=observer
You May Also Like

New HTTP Logger Grails plugin

I've wrote a new Grails plugin - httplogger. It logs:

  • request information (url, headers, cookies, method, body),
  • grails dispatch information (controller, action, parameters),
  • response information (elapsed time and body).

It is mostly useful for logging your REST traffic. Full HTTP web pages can be huge to log and generally waste your space. I suggest to map all of your REST controllers with the same path in UrlMappings, e.g. /rest/ and configure this plugin with this path.

Here is some simple output just to give you a taste of it.

17:16:00,331 INFO  filters.LogRawRequestInfoFilter  - 17:16:00,340 INFO  filters.LogRawRequestInfoFilter  - 17:16:00,342 INFO  filters.LogGrailsUrlsInfoFilter  - 17:16:00,731 INFO  filters.LogOutputResponseFilter  - >> #1 returned 200, took 405 ms.
17:16:00,745 INFO filters.LogOutputResponseFilter - >> #1 responded with '{count:0}'
17:18:55,799 INFO  filters.LogRawRequestInfoFilter  - 17:18:55,799 INFO  filters.LogRawRequestInfoFilter  - 17:18:55,800 INFO  filters.LogRawRequestInfoFilter  - 17:18:55,801 INFO  filters.LogOutputResponseFilter  - >> #2 returned 404, took 3 ms.
17:18:55,802 INFO filters.LogOutputResponseFilter - >> #2 responded with ''

Official plugin information can be found on Grails plugins website here: http://grails.org/plugins/httplogger or you can browse code on github: TouK/grails-httplogger.

Touki na Confiturze 2013 – ciąg dalszy

Kolejna partia nagrań z konferencji została udostępniona. Wśród nich m.in. prezentacja Michała Trzaskowskiego o rozpoznawaniu oczekiwań klienta podczas prowadzenia projektu z wykorzystaniem ”zwinnych” metodyk projektowych oraz wystąpienie Macieja Próchniaka o możliwościach Slick, jednej z bibliotek w Scali.