In Part 1 of this blog series, I discussed how we are using ZooKeeper to handle distributed applications: storing application configuration and state, and managing workflows with queues. Here I'll present how we accomplish this with a command-line application called ZooKeeper-RW, created in-house at FM and now publicly available on GitHub.
First, let me describe the issues that prompted us to create our own command-line client, ZooKeeper-RW.
We have several distributed applications that need to share configuration, queues, and state across different servers in different data centers, using different programming languages (Java, Bash, Perl, Python).
We built our own ZooKeeper command-line utility because nothing existing covered our needs. ZooKeeper ships with an interactive shell client (zkCli), which is useful for simple interactions but is not well suited to automated, scripted use. The ZooKeeper API is supported in a few languages, such as C, Perl, and Python, but we wanted simple command-line access from anywhere, including shell scripts. And ZooKeeper has no built-in support for queues, although queue recipes are described in the documentation and some additions have come from Cloudera's contributions.
A colleague of mine at FM built the first version of the ZooKeeper-RW client back in early 2011. We have been using it ever since, and now we can finally offer it to the public for your open-source enjoyment! It supports get and set operations as well as a simple prioritized queue with commands to add and poll items. The code base also includes bash scripts that illustrate combining the basic commands into more complicated operations, such as inspecting queue items (peek) and adding an item to a queue only if it is new (enQueueIfNew).
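To give a feel for how the basic commands compose, here is a hedged sketch of an enQueueIfNew-style guard. This is my own illustration, not the repo's actual script: the `enqueue_if_new` name and the `_seen` subtree are assumptions, and it relies on `zk get` exiting non-zero when a node does not exist.

```shell
# Hypothetical sketch of an enQueueIfNew-style guard built from basic commands.
# Assumes "zk get" exits non-zero when the node does not exist.
enqueue_if_new() {
  local queue="$1" item="$2"
  # Track already-queued items under a hypothetical sibling "_seen" subtree.
  if ! zk get "${queue}_seen/$item" >/dev/null 2>&1; then
    zk createOrSet "${queue}_seen/$item" "queued"
    zk qAdd "$queue" "$item"
  fi
}
```

Calling it twice with the same item enqueues only once; the real script in the repo may implement the check differently.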
The main class, Zkrw, includes all the commands that can be called. For example, calling "zk get /some/zk/node" returns the value stored at that node. The "zk" command is a small wrapper script (see the GitHub repo for the real thing) that invokes the Java class and passes in the configuration for the ZooKeeper servers. The Zkrw class and the other classes in the project can also be used as an API from your own applications.
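For illustration, a wrapper in the spirit of that script might look like the following. This is a hedged sketch only: the `ZK_SERVERS` and `ZKRW_CLASSPATH` variable names, their defaults, and the exact way the server list is handed to Zkrw are my assumptions; consult the actual zk script on GitHub.

```shell
# Hypothetical "zk" wrapper function; the real script lives in the repo.
zk() {
  local servers="${ZK_SERVERS:-localhost:2181}"   # assumed env var for the server list
  local cp="${ZKRW_CLASSPATH:-zkrw.jar}"          # assumed classpath to the build
  java -cp "$cp" Zkrw "$servers" "$@"             # argument order is an assumption
}
```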
With this simple interface, our applications maintain their job state and hand off work via queues.
For example, to record the state of a job:
zk createOrSet /my-hadoop-app/jobs/current/2012/10/02/job_id_20121002_00 “done”
And to add an item to an ETL queue and later poll it off:
zk qAdd /my-etl-app/queue "20121002_00"
zk qPoll /my-etl-app/queue
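On the consuming side, the qPoll call above can be wrapped in a simple worker loop. This is a hedged sketch: `run_etl_worker` and `process_item` are hypothetical names, and it assumes qPoll prints nothing (or fails) once the queue is empty, which may differ from the client's real behavior.

```shell
# Hypothetical worker loop draining the ETL queue from the example above.
run_etl_worker() {
  local item
  while item="$(zk qPoll /my-etl-app/queue)" && [ -n "$item" ]; do
    process_item "$item"   # hypothetical hand-off to the ETL job
  done
}
```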
Using the client, items can be added to a queue manually for ad hoc tasks or via a cron entry for regularly scheduled ones. Through this mechanism we have created simple workflows implemented in Python and shell scripts.
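For the scheduled case, the cron entry can be a single line. This is a hedged example: the /usr/local/bin/zk install path and the date-based item naming are my assumptions, not a prescribed setup (note that % must be escaped in crontab command fields):

```shell
# Hypothetical crontab entry: enqueue the daily batch at 01:00.
0 1 * * * /usr/local/bin/zk qAdd /my-etl-app/queue "$(date +\%Y\%m\%d)_00"
```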
There are, of course, a few limitations to keep in mind. The client has only been tested on Linux; since it is Java it should work on any JVM, but other environments would require their own version of the "zk" script. The client takes care of some basic error handling, including retrying when the ZooKeeper connection fails. It does not yet report the details of a problem in all cases (for example, when a node does not exist), but it does return a standard exit code (accessible via $?), with non-zero indicating failure.
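In practice this means shell callers should branch on the exit code rather than on any error text. A minimal sketch, where `check_job_state` is a hypothetical helper of mine:

```shell
# Hypothetical helper: classify a job node purely by zk's exit code,
# since error details are not always reported.
check_job_state() {
  if zk get "$1" >/dev/null 2>&1; then
    echo "recorded"
  else
    echo "missing-or-error"   # a non-zero $? covers both cases today
  fi
}
```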
If you are interested in using or maintaining the project, feel free to contact me.