Progress on the OpenBSD Cluster
Tuesday, 2009-04-07 14:26, 1239114403 seconds since Unix epoch
As requested by some people, another update. The project is progressing nicely. There have been a few minor setbacks and a few more lie ahead. But things are going quite well nonetheless.
Since the last post, a lot of timing and connection tests have been thrown at the cluster. I’ve bombarded the poor collection of boxes with ICMP floods, gigabytes of random data and H.264 media while torturing it by pulling out random cables and hard disks. The setup withstood everything without real problems. I’d run a nuclear submarine on this system if I had one.
There are a few practical problems though. The firewalls use pfsync to share connection state between the machines. This enables them to keep every connection alive when a failover occurs. The entire IP state table is essentially duplicated constantly between the routing firewalls. If an application node fails though, its virtual MAC address (and associated IP addresses) will be transferred to another host, but the active connections cannot. So if you’re streaming some high definition pornography and the application node providing your audiovisual entertainment fails, you’re screwed. Well, that might be a bad choice of words, because the screwing would stop. But anyway, what happens is that the connection will survive from your home box to the cluster. Inside the cluster however, the connection won’t be re-established to the new node, because the application state behind it only lived on the node that died.
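For readers who haven’t seen this kind of setup, here is a minimal sketch of what the firewall side could look like on OpenBSD. It is not my actual configuration: the interface names (em0, em1), the vhid, the password and the address are all placeholders I made up for illustration.

```sh
# Run as root on each firewall. All names and numbers below are placeholders.
# em0 = interface facing the clients, em1 = dedicated link between the firewalls.

# Replicate the pf state table to the peer firewall over the sync link.
ifconfig pfsync0 syncdev em1 up

# Create the shared virtual address (CARP). Whichever firewall is master
# answers for it; the backup takes over when the master stops advertising.
ifconfig carp0 create
ifconfig carp0 vhid 1 pass changeme carpdev em0 192.0.2.1 netmask 255.255.255.0

# Allow the host with the better advertisement skew to preempt the current master.
sysctl net.inet.carp.preempt=1
```

The pf ruleset also has to pass the pfsync and carp protocols on the relevant interfaces, otherwise the firewalls never hear each other. Note that this only synchronises what pf knows about a connection; whatever the application on a node keeps in memory is not covered.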
Right now I’m trying to get VLC to stream over RTP instead of HTTP. RTP (usually) uses UDP, which is connectionless, so in theory a stream should survive a node switch. I’m still working on understanding the nature of RTP streams, especially now that I’m using RTSP. I haven’t got it running just yet; I guess it’s a firewall problem or something. Also, I suspect the RT(S)P state is stored in a node’s VLC process, which would render this entire idea useless once again. I sure hope I don’t have to write an RTSP sync protocol.
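To give an idea of what I’m experimenting with, this is roughly how VLC can be told to stream over RTP from the command line. Treat it as a sketch and not as a working recipe for this cluster: the file name, addresses, ports and the host name "server" are made up.

```sh
# Placeholders: movie.mp4, 192.0.2.50, ports 5004 and 8554, host "server".

# Plain RTP over UDP: push an MPEG-TS stream to one client, no RTSP session.
vlc -I dummy movie.mp4 --sout '#rtp{dst=192.0.2.50,port=5004,mux=ts}'

# RTP with RTSP: VLC serves the stream description on port 8554
# and the client sets up the session itself.
vlc -I dummy movie.mp4 --sout '#rtp{sdp=rtsp://:8554/stream}'

# Client side, one for each variant above.
vlc rtp://@:5004
vlc rtsp://server:8554/stream
```

The first variant has no session to lose, which is why it looks attractive for failover. The second one keeps RTSP session state inside the VLC process, which is exactly the part I’m worried about.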
As I’ve written in my previous post about this project, the hardware was kindly donated by my employer. Since the hardware can’t leave the building, I’ve been spending most of my time at my employer’s office instead of where I should be, at M2X. Luckily the hands-on part of building this cluster has been finished, so working remotely would suffice.
I’ve been playing around with NX technology lately, so I thought it would be a great idea to put that knowledge to good use. The collective.borg.local cluster has a box running X11, so that wouldn’t be that much of a problem, right? Since I’m an Open Source kinda guy, I’m using the FreeNX variant. It works, but it’s quite limited. The biggest problem is its lack of XDMCP support. With NoMachine’s NX server I could easily select the cluster’s XDM server and set up a compressed X11 session. FreeNX only wants to run local X sessions using XSession or KDE/Gnome scripts.
I’ve solved this problem with a nice hack, using OpenSSH’s X11 forwarding. First, I made sure the server running NX could log in on the cluster’s X11 box using public key authentication. After enabling X11 forwarding on that machine, an X session running a remote openbox would enable me to mimic XDMCP behaviour. Because the DISPLAY environment variable is set for every child process in the SSH session, every application I start using the window manager will be sent over the same SSH tunnel. The only thing the NX server’s .xsession file contains is exec ssh -X collective openbox. So when I log in on the NX server, it will request the cluster to start an openbox process. That process will send its X11 data to the NX server, which compresses it and sends it to my NX client. Anywhere on the internet. Yes, I could have just used SSH, but X11 is way cooler.
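The steps of the hack can be sketched like this. The host alias collective and the .xsession line are the ones I actually use; the key type, key path and the assumption that the key has no passphrase (or is held by an agent) are just one way of doing it.

```sh
# On the NX server: create a key pair and install the public key on the
# cluster's X11 box. Key type and path are an arbitrary choice; the key needs
# an empty passphrase (or an agent), since .xsession cannot type one in.
ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub | ssh collective 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'

# On the X11 box: /etc/ssh/sshd_config must contain the line
#   X11Forwarding yes
# and sshd has to be restarted (or reloaded) after changing it.

# Back on the NX server: the session script is a single line.
cat > ~/.xsession <<'EOF'
exec ssh -X collective openbox
EOF
chmod +x ~/.xsession

# Quick check: DISPLAY should be non-empty inside the forwarded session.
ssh -X collective 'echo $DISPLAY'
```

If the last command prints an empty line, X11 forwarding is not active on the X11 box and openbox will have no display to talk to.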
For the time I’ve got left, a few really nice projects lie ahead. I’ve got to make this cluster nicely manageable with a web interface and stuff. I think I’m going to use Puppet and Ruby for that purpose. After that’s done, some QA software has to be written to verify the cluster’s streaming quality.
I hope the links are useful enough for you to understand what I’m doing. It’s not that hard, really ;)
Jochem Says:
Nice work tÿp, I see one particular window I’m quite happy with ;-)
Keep up the good work, and see you Wednesday!
Jochem
geoff Says:
Interesting project.
Is your NX server running on OpenBSD? I’d like to know how to do that.