I run quite a number of virtual machines on a bare metal server in a data center, and most of those are almost identical copies of virtual machines running at home. So my strategy so far for a failure of that server has been to restore service with another set of copies of the VMs running at home. But in recent months, I have started to run a number of VMs there for which I do not have a master on my home server. So I needed a different backup/restore approach here.
I use QEMU/KVM for virtualization on my network servers, and creating a copy of a VM is as easy as copying its disk image file and transferring it to another place. The challenge: VM images tend to bloat quite a bit over their lifetime, particularly when snapshots are used, and transferring a 50 GB VM image over the Internet and a relatively slow VDSL downlink takes quite a bit of time. The solution: compacting and compressing the image before transfer. It turns out that this has an interesting effect on VM snapshots, which are also stored as part of the disk image, and one should be aware of it.
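To get a feel for the numbers, the transfer time can be estimated with a bit of shell arithmetic. This is only a rough sketch: the helper name `transfer_minutes` and the 50 Mbit/s downlink rate are assumptions made for illustration (the real line rate is not given here), and protocol overhead is ignored.

```shell
# Rough transfer-time estimate in minutes; ignores protocol overhead.
# Usage: transfer_minutes SIZE_GB RATE_MBIT_PER_S
# transfer_minutes is a hypothetical helper, not part of any tool.
transfer_minutes() {
    # 1 GB (decimal) = 8000 Mbit; integer division rounds down.
    echo $(( $1 * 8000 / $2 / 60 ))
}

# 50 GB image at an ASSUMED downlink rate of 50 Mbit/s.
transfer_minutes 50 50   # prints 133
```

At that assumed rate the 50 GB image would need a bit more than two hours, and the time scales linearly with the image size.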
Shrink and Compress
The libguestfs tools include a nice command line utility for QEMU/KVM disk images, virt-sparsify, which shrinks a VM disk image by detecting unused blocks on the virtual disk drive, overwriting them with zeros and then creating a new image without the unused blocks. This way I could reduce a 50 GB VM image down to 18 GB. Packing the image with tar and gzip compression afterwards brought a further reduction to around 7 GB, which is much easier to transfer over the Internet in a reasonable amount of time. Here are the commands for the whole process:
    # One-time install
    apt-get install libguestfs-tools

    # Optional: specify a temporary directory in case there isn't
    # enough space in the standard temp dir. virt-sparsify below
    # will tell you.
    export TMPDIR=/zfs-pool-1/data/tmpdir/

    # And now use virt-sparsify to shrink.
    virt-sparsify vm-NAME.qcow2 vm-NAME.qcow2.shrunk

    # And now tar + gzip the image
    tar cvzf vm-NAME.qcow2.shrunk.tar.gz vm-NAME.qcow2.shrunk
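Before starting a long transfer, it can be worth checking that the archive is actually readable. The following is a minimal sketch that relies only on the standard `gzip -t` and `tar tzf` options; the function name `verify_archive` is made up for this example.

```shell
# verify_archive is a hypothetical helper name.
# Returns 0 if the gzip stream is intact and tar can list the contents.
verify_archive() {
    gzip -t "$1" && tar tzf "$1" > /dev/null
}

# Example call, using the archive name from the commands above:
# verify_archive vm-NAME.qcow2.shrunk.tar.gz && echo "archive OK"
```

This only shows that the archive can be read end to end. It says nothing about whether the VM inside boots, which still requires a real restore test.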
Restore and Test
Having a backup is great, but it’s even better to know that it can actually be used to restore the service. So I uncompressed the VM image again and ran it on the local server. Worked like a charm! Then I wanted to go a step further and restore one of the snapshots that was part of the original image. To my surprise, neither the QEMU/KVM GUI nor the command line utility showed any of the snapshots I was expecting.
On second thought, I should have expected this. The VM shrink process ‘sparsifies’ the disk image, i.e. it zeroes the unused blocks and then writes a new image without them. That new image is built from the current state of the disk, so the snapshots stored in the original file are apparently not carried over. I’m glad I actually tested the restore procedure, because I might have relied on the snapshots still being there at some point. For my use case it’s not a problem that they are missing, but it shows that it pays to run a restore procedure before it really becomes necessary.
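A small pre-flight check could make this visible before shrinking. The sketch below counts the internal snapshots of an image by parsing the output of `qemu-img snapshot -l`. It assumes that the command prints two header lines followed by one line per snapshot, and nothing at all when there are none. `count_snapshots` and the `QEMU_IMG` override variable are hypothetical names introduced here.

```shell
# count_snapshots is a hypothetical helper; QEMU_IMG optionally names
# a different qemu-img command (also an assumption of this sketch).
count_snapshots() {
    # ASSUMED output format: "Snapshot list:" plus a column-title line,
    # then one line per internal snapshot; no output if there are none.
    "${QEMU_IMG:-qemu-img}" snapshot -l "$1" |
        awk 'NR > 2 && NF > 0 { n++ } END { print n + 0 }'
}

# Example: warn before shrinking an image that still has snapshots.
# n=$(count_snapshots vm-NAME.qcow2)
# [ "$n" -gt 0 ] && echo "warning: $n snapshot(s) will not be in the shrunk copy"
```

Running the same check on the restored image is a quick way to confirm what did and did not survive the round trip.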