At work we’ve been trying to do more comprehensive release testing, on the basis that it’s a Good Idea to make sure our stuff works. Part of this is testing the deploy process.
Our software is installed onto Linux appliance servers that are used specifically for this one purpose. One of our products works well enough in a VM that we have lots of customers deploying to virtual hardware instead of a physical machine, but another product is well known to choke and die if run under virtualisation.
This is a huge pain for release testing, because what you really want to do is make a VM with a bare OS install, take a snapshot, and be able to roll back to that snapshot after each attempt at running the install tests.
Ideally you’d even want to be able to let the testers and developers do the rolling back without having to bother a sysadmin to go onto the VM host and do it for them.
Internally we use VMware exclusively for our VM hosting. We’ve got all sorts of different versions of it in use, but we’re a VMware-only shop. I’ve been happy with that – before I started working here I had only used Xen, so it’s been really good to get broader experience with another virtualisation platform. All of our various VMware set-ups have one thing in common: they’re all using file-backed virtual disks. I suspected that this might be what’s stopping our other big product from working in a VM. I also happen to know, from my experience with Xybur, that you can quite easily configure Xen to use physical disk partitions, and that you can use LVM to provide instant snapshots of those partitions.
It seemed simple at this point, but a couple of complications have left me wondering about the role of Xen for organisations that want virtualisation but aren’t setting up their own massive ‘cloud’.
First up, I couldn’t use Ubuntu for the dom0/host OS. Given the very specific use this system was being designed for I would have got around our blanket ban on using Ubuntu for non-desktop systems (mandated due to an SSH security vulnerability last year), but Ubuntu don’t provide a dom0 kernel any more anyway. They haven’t for some time, which is quite worrying – and may leave me having to use CentOS for my dom0 when I replace my own VM host.
CentOS 5 has support for Xen out of the box, and even ships with some nice tools to make it easy to manage. We use CentOS 4 for our application servers, so it seemed like a reasonable choice.
If you’ve been used to using Ubuntu and xen-tools then CentOS will be a bit weird at first. virt-manager is pretty good, but I found it difficult to create a paravirtualised VM from our netboot/kickstart setup. I quickly switched to creating an HVM (hardware virtual machine, i.e. fully virtualised) guest which I could netboot and install in our usual manner. At this point I was using the GUI virt-manager tool (with forwarded X11 over SSH from my hackintosh netbook, which I use as my main work machine now), which I found a bit glitchy: sometimes it wouldn’t release the keyboard/mouse and I had to use command-tab to switch out of X11.app on my Mac to Terminal so I could kill the SSH connection. I will say this: having access to the VGA console was invaluable and made using our existing netboot install process possible, which was a big win.
So I’ve got an LVM partition with a fully virtualised machine on it. So far, so good, right? Not quite. When you create an HVM guest using these tools, Xen treats the partition you pass as a whole block device (think /dev/sda) instead of a partition (like /dev/sda1). If you manually tweak the Xen config file to pass through specific partitions, the VM won’t see them. virt-manager also defaults to creating /dev/hdX devices instead of /dev/sdX, which made the install process’s disk access slow until I caught it. I don’t know why it does that. Is it something to placate crappy versions of Windows? Whatever the reason, it’s as annoying as HP’s servers defaulting to IDE mode instead of SATA. (Which they do. And it’s really annoying.)
You can’t mount the HVM partition as normal because it has a partition map, bootloader, and probably other stuff on it. This is less than awesome, since I was planning to turn it into a paravirtualised VM and pass it through as a partition with a separate LVM partition for swap.
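To make the difference concrete, here is a rough sketch of the two kinds of disk stanza in an xm-style domU config file (those files use Python syntax). The volume group and volume names (`vg0`, `testvm-disk`, `testvm-root`, `testvm-swap`) are invented for illustration, and the guest device names are assumptions rather than the values from the real config. In an actual config file the list would be assigned to `disk`; the `guest_devices` helper is only there to show what the guest ends up seeing.

```python
# Sketch of xm-style disk stanzas; Xen domU config files use Python syntax.
# All volume names below are hypothetical.

# HVM guest: the logical volume is presented as a whole disk (hda here),
# so the guest writes its own partition table and bootloader onto it.
hvm_disk = ["phy:/dev/vg0/testvm-disk,hda,w"]

# Paravirtualised guest: each logical volume is presented as a single
# partition, with a separate volume for swap.
pv_disk = [
    "phy:/dev/vg0/testvm-root,xvda1,w",
    "phy:/dev/vg0/testvm-swap,xvda2,w",
]


def guest_devices(disk_lines):
    """Return the device name the guest sees for each disk line."""
    return [line.split(",")[1] for line in disk_lines]
```

Each entry follows the `phy:<backend device>,<guest device>,<mode>` pattern, so the only difference between the two styles is whether the guest device is a whole disk or a numbered partition.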
In the end I found some documentation which helped and I got the partition mounted. To make it work as a standard partition I resorted to a brute-force cp -a to another, normal, ext3 LVM partition. I suspect I could have done something clever with dd, but this worked and was quick enough.
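For illustration, one common way to reach a filesystem that sits inside a whole-disk volume is to loop-mount it at the partition’s byte offset. This is a sketch of that general technique and not necessarily what the documentation mentioned above describes. The volume path, the mount point and the start sector of 63 are all assumptions; the real start sector would come from something like `fdisk -lu` on the volume.

```python
SECTOR_SIZE = 512  # bytes; assumed sector size for the virtual disk


def partition_offset(start_sector, sector_size=SECTOR_SIZE):
    """Byte offset of a partition, given its start sector."""
    return start_sector * sector_size


def mount_command(device, start_sector, mount_point):
    """Build a loop-mount command (argv list) that skips to the partition."""
    offset = partition_offset(start_sector)
    return ["mount", "-o", "loop,offset=%d" % offset, device, mount_point]


if __name__ == "__main__":
    # Hypothetical volume and mount point; 63 is a typical start sector for
    # the first partition on disks laid out by older installers.
    print(" ".join(mount_command("/dev/vg0/testvm-disk", 63, "/mnt/hvm")))
```

The script only prints the command, so the offset can be checked before anything is run as root.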
The CentOS wiki has some excellent documentation on turning a physical machine into a Xen VM. The same holds true for turning a fully virtualised VM into a paravirtualised one, and after following that guide I was able to boot up my test VM as a paravirtualised domU. I was also left quite impressed with pygrub, and I’m considering upgrading Xen on my own host so I can use it.
It was at this point I realised that I’d selected the wrong netboot option (I’d picked the one that does an unattended install of one of our products, instead of just installing the OS and nothing more) and had to do it all over again. Such is life.
Once I had the correct base OS image created it was really simple to write a quick shell script to automate turning off the VM, deleting its disk, creating a new disk as a snapshot of the base image, and turning the VM back on. A non-root user has been set up with sudo access just to that script, so our testers and developers can tear it down and rebuild it whenever they like.
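The real thing is a shell script, but the sequence can be sketched in a few lines of Python. Everything named here is hypothetical: the domain name, volume group, volume names, snapshot size and config path are placeholders, and a production version would need to cope with the domain already being stopped and with the volume still being busy for a moment after the domain is destroyed.

```python
import subprocess


def rebuild_plan(domain, vg, base_lv, disk_lv, snap_size, config):
    """Return the commands (as argv lists) for one tear-down/rebuild cycle."""
    return [
        # Hard power-off; the disk is about to be thrown away anyway.
        ["xm", "destroy", domain],
        # Delete the guest's current disk (itself a snapshot volume).
        ["lvremove", "-f", "%s/%s" % (vg, disk_lv)],
        # Recreate the disk as a fresh snapshot of the base OS image.
        ["lvcreate", "-s", "-L", snap_size, "-n", disk_lv,
         "/dev/%s/%s" % (vg, base_lv)],
        # Boot the guest again from its existing config file.
        ["xm", "create", config],
    ]


def run(plan, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for argv in plan:
        print(" ".join(argv))
        if not dry_run:
            subprocess.check_call(argv)


if __name__ == "__main__":
    # Placeholder names; dry_run=True means nothing is actually executed.
    run(rebuild_plan("testvm", "vg0", "base-os", "testvm-disk", "5G",
                     "/etc/xen/testvm"))
```

The snapshot size only has to be large enough to hold the blocks that change during a test run, which is why it can be much smaller than the base image.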
A quick test by one of our developers showed that the system was performing as well as it would on the bare hardware – which wasn’t actually all that well since I was using a scavenged under-powered machine to build a proof-of-concept, but it’s enough for install testing, and much better than we’d see on one of our VMware hosts.
All that’s left is proper documentation and to find some more RAM for the scavenged box, and it’s ready for our next cycle of release testing. If it works out there, we might even use it to replace the VMware VM we use for our other main product – that way the testers have control over rolling it back and don’t need to wait until a sysadmin has time for it.
As happy as I am with what I’ve ended up with, it wasn’t straightforward. Creating a paravirtualised VM was difficult, the fully virtualised one had awkward disk requirements, and the whole experience, even using the GUI tools, wasn’t as slick as VMware. I wouldn’t be comfortable asking our client installations team to set up VMs using virt-manager, whereas we already ask them to set up Windows VMs using VMware Workstation and VMware Server.
I really like Xen. It’s fast and it’s nicer to manage from the command-line than VMware. But this experience has left me wondering whether it’s really a viable option if you’re not building a custom virtualisation system. If someone just wanted to consolidate a bunch of servers, I’d tell them to use VMware, not Xen.