grun Job Scheduler moved to ZMQ
Posted 8:30 AM
As most people in cluster computing know, there really isn't a simple and "really open source" solution out there for job scheduling. Oracle SGE, LSF and others are massive, with over 100k lines of code and extraordinarily complex configuration for everything from MPI support to Kerberos. And yet they lack simple features, such as script plugins for configuration, that would make them more versatile.
grun was written to be an "extremely lightweight" yet big-featured job scheduler. The early version was not much more than "ssh to remote host, run job, wait for response", with logging and resource tracking on top. It has since evolved to use a TCP messaging system that lets the compute nodes, queue nodes, and clients communicate. By v 1.0 the plan is to have better support for arbitrary metrics and better handling of priorities.
Going from v 0.8 to v 0.9, I decided to try using the ZeroMQ (ZMQ) library instead of raw TCP sockets. At first it was hard to remember that you really don't need to worry about things like sending on a socket you just created, even with no one on the other end yet.
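For contrast, here is roughly what the plain-TCP side of that worry looks like. This is a minimal Python sketch using only the standard library, not grun's actual code; `connect_with_retry`, `late_listener` and `demo` are hypothetical names made up for illustration. With a raw socket, connecting before the peer is listening fails outright, so the caller has to loop and retry:

```python
import socket
import threading
import time


def connect_with_retry(host, port, attempts=50, delay=0.1):
    """Keep trying to connect until the peer starts listening (hypothetical helper)."""
    for _ in range(attempts):
        try:
            return socket.create_connection((host, port), timeout=1.0)
        except OSError:
            # Typically "connection refused": nobody is listening yet.
            time.sleep(delay)
    raise ConnectionError(f"no listener on {host}:{port}")


def late_listener(srv):
    """Start listening only after a short delay, then greet one client."""
    time.sleep(0.3)
    srv.listen(1)
    conn, _ = srv.accept()
    with conn:
        conn.sendall(b"ready")


def demo():
    # Bind first so the port is reserved, but do not listen yet.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    port = srv.getsockname()[1]
    listener = threading.Thread(target=late_listener, args=(srv,))
    listener.start()
    try:
        with connect_with_retry("127.0.0.1", port) as sock:
            # Read until the listener closes its end.
            return b"".join(iter(lambda: sock.recv(16), b""))
    finally:
        listener.join()
        srv.close()


print(demo())
```

A messaging library that queues outgoing messages and reconnects on its own makes this kind of loop unnecessary in application code.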
The net result of the ZMQ port:
- speed improved
- work that was implicitly done in Perl moved to optimized C
- built-in multithreading takes better advantage of the CPU
- ability to stop/restart any queue without losing messages
- improved reliability of message delivery
- improved code organization, oriented around messaging
- 20% smaller code base, because we removed:
  - all "double checks" to see if connections are there
  - code that "breaks up" large messages
  - all issues with blocking vs. nonblocking I/O
  - the whole "select" loop complexity
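To give a feel for the kind of code that goes away, here is a minimal Python sketch of length-prefixed message framing over a raw stream socket. It is an illustration only, not grun's removed code; the 4-byte header format and the names `send_msg`, `recv_exact`, `recv_msg` and `roundtrip` are assumptions. A TCP stream has no message boundaries, so the receiver has to reassemble each message from however many chunks arrive:

```python
import socket
import struct
import threading

HEADER = struct.Struct("!I")  # assumed framing: 4-byte big-endian length prefix


def send_msg(sock, payload):
    """Send one message as <length><payload>."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv() may return less."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock):
    """Receive one complete message framed by send_msg()."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


def roundtrip(payload):
    """Send payload across a local socket pair and return what arrives."""
    a, b = socket.socketpair()
    # Send from a thread so a large payload cannot fill the buffer and deadlock.
    sender = threading.Thread(target=send_msg, args=(a, payload))
    sender.start()
    try:
        return recv_msg(b)
    finally:
        sender.join()
        a.close()
        b.close()


big = b"x" * 1_000_000
print(roundtrip(big) == big)
```

ZMQ delivers whole messages, so none of this bookkeeping has to live in the scheduler.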
ZMQ is not perfect (yet), but it was an overall improvement over straight TCP. Because of the forking needed to launch jobs, I had to do some fiddling with dup'ed file descriptors to prevent ZMQ from acting wonky. The learning curve was worth it. I doubt I'll be using raw TCP again, especially since ZMQ is packaged for most Linux distributions.
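The post doesn't spell out what the descriptor fiddling was, so the following is only one common precaution, sketched in Python with the standard library: launch each job so that the child inherits nothing but stdin, stdout and stderr, and therefore cannot hold on to the parent's messaging sockets. The names `launch_job` and `parent_fd_leaks_into_job` are hypothetical, and a pipe stands in for a messaging socket's descriptor.

```python
import os
import subprocess
import sys


def launch_job(argv):
    """Run a job whose process inherits only stdin, stdout and stderr."""
    return subprocess.run(argv, close_fds=True, capture_output=True, text=True)


def parent_fd_leaks_into_job():
    """Check whether a job can write to a descriptor opened by the parent."""
    r, w = os.pipe()              # stand-in for a messaging socket's descriptor
    os.set_inheritable(w, True)   # mimic a descriptor that would survive fork/exec
    os.set_blocking(r, False)
    # The job tries to write to the parent's descriptor number.
    probe = (
        "import os\n"
        "try:\n"
        f"    os.write({w}, b'x')\n"
        "except OSError:\n"
        "    pass\n"
    )
    try:
        launch_job([sys.executable, "-c", probe])
        try:
            return os.read(r, 1) == b"x"
        except BlockingIOError:
            return False          # nothing arrived: the job never had the descriptor
    finally:
        os.close(r)
        os.close(w)


print("leaked" if parent_fd_leaks_into_job() else "not leaked")
```

Whether this matches what grun actually does is an open question; the point is only that a forked job should not share the scheduler's open descriptors.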