Efficient Xen Backups Using LVM and Rsnapshot
Effectively backing up your virtual machines is a problem with a multitude of potential solutions. Many solutions are centered around making a copy of the full volume(s) upon which your virtual machine(s) reside. But what happens if you want to recover just a single file and not the entire VM? And is it possible to delegate restoration to end users? Not to mention the space that is wasted by storing many copies of the same data.
Here I will describe how to implement a backup scheme that I have found very robust, efficient and reliable. It enables automated backups of Xen virtual machines that each reside on individual LVM volumes using a utility called rsnapshot (which is based on rsync).
Note: This method can also be used to back up LVM volumes that aren’t housing Xen virtual machines.
LVM provides a convenient snapshot feature. This allows you to create an identical clone of a logical volume while storing only the blocks that differ from the original. Creating a snapshot is nearly instantaneous, and the result is a point-in-time image of your virtual machine’s disk that you can mount, read and even write data to.
Rsnapshot is a very flexible backup application which provides robust automation facilities. It supports LVM, ssh, rsync as well as custom scripts and will manage the process of rotating your backup files according to a defined schedule and retention policy. For our purposes we will be using the built-in LVM snapshot support.
How it works
Each of my Xen virtual machines has its own LVM volume for each file system used by the system. For many systems this results in a single, somewhat large root file system housed on LVM. When rsnapshot is configured to back up one of these volumes it runs through the following steps: it creates an LVM snapshot of the volume, mounts the snapshot, copies its contents into the backup directory with rsync, unmounts the snapshot, and finally removes it.
Note: LVM snapshots do impose a write performance penalty on the original volume for as long as the snapshot exists. I have found this to be an acceptable tradeoff, as most backup suites introduce some sort of performance effect while they are processing.
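The snapshot cycle that rsnapshot automates can be approximated by hand with the standard LVM tools. The sketch below only prints the commands instead of running them. The volume names, the 5G snapshot size, the snapshot name and the mount point are borrowed from the example configuration in this article, and the rsync options are an assumption rather than what rsnapshot necessarily passes internally.

```shell
#!/bin/sh
# Print (not run) the commands for one manual snapshot-backup cycle.
# Usage: snapshot_backup_plan VG LV DEST
snapshot_backup_plan() {
    vg=$1; lv=$2; dest=$3
    snap=rsnapshot-tmp        # temporary snapshot name (assumed)
    mnt=/mnt/rsnapshot-tmp    # temporary mount point (assumed)
    echo "lvcreate --snapshot --size 5G --name $snap /dev/$vg/$lv"
    echo "mount -o ro /dev/$vg/$snap $mnt"
    echo "rsync -a --delete $mnt/ $dest"
    echo "umount $mnt"
    echo "lvremove -f /dev/$vg/$snap"
}

snapshot_backup_plan vg0 vm_insomniac-root /.snapshot/daily.0/insomniac/
```

Printing the plan first makes it easy to review the commands before running any of them as root.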
How to set it up
Assuming you already have a working LVM configuration (read more about LVM here), you’ll just need to install rsnapshot. It is available through the dag repository for Red Hat based platforms and in the standard repositories for Ubuntu and Debian.
Once you have installed rsnapshot, defined your retention periods, and installed the corresponding cron jobs to call the backup process, you’re ready to add the LVM specifics to the rsnapshot configuration.
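As a rough illustration of the scheduling part, the sketch below prints a cron.d-style entry that runs the daily backup. The binary path, the 03:30 start time and the helper name rsnapshot_cron are assumptions for illustration, not values from my setup.

```shell
#!/bin/sh
# Print a cron.d-style schedule for rsnapshot.
# The path and the start time are assumptions; adjust them to your system.
rsnapshot_cron() {
    bin=${1:-/usr/bin/rsnapshot}
    cat <<EOF
# m h dom mon dow user command
30 3 * * *  root  $bin daily
EOF
}

rsnapshot_cron
# The matching retention line in /etc/rsnapshot.conf (tab-delimited) would be:
#   retain	daily	7
# Older rsnapshot releases spell the keyword "interval" instead of "retain".
```

Running "rsnapshot -t daily" prints the commands a daily run would execute without actually performing them, which is a cheap way to check the schedule before trusting cron with it.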
For example, I have a logical volume named “vm_insomniac-root” in a volume group named “vg0”. This represents the root filesystem of a virtual machine named “insomniac”.
[root@localhost .snapshot]# lvs | grep insomniac-root
  vm_insomniac-root vg0 -wi-ao 36.00G
To tell rsnapshot that I want to back up this LVM volume I add the following to /etc/rsnapshot.conf. Make sure the fields on each line are separated by tabs, not spaces. Rsnapshot is unreasonably insistent upon tabs in its config file.
#/etc/rsnapshot.conf
linux_lvm_cmd_lvcreate	/sbin/lvcreate
linux_lvm_cmd_lvremove	/sbin/lvremove
linux_lvm_snapshotsize	5G
linux_lvm_snapshotname	rsnapshot-tmp
linux_lvm_vgpath	/dev
linux_lvm_mountpath	/mnt/rsnapshot-tmp
linux_lvm_cmd_mount	/bin/mount
linux_lvm_cmd_umount	/bin/umount
backup	lvm://vg0/vm_insomniac-root/	insomniac/
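Because editors and copy-paste often turn tabs into spaces, one way to avoid the problem is to generate the lines instead of typing them. The helper below, conf_line, is a hypothetical convenience function and not part of rsnapshot; it simply joins its arguments with literal tab characters.

```shell
#!/bin/sh
# Join arguments with literal tab characters, one config line per call.
# conf_line is a hypothetical helper, not an rsnapshot command.
conf_line() {
    line=$1
    shift
    for field in "$@"; do
        line=$(printf '%s\t%s' "$line" "$field")
    done
    printf '%s\n' "$line"
}

conf_line linux_lvm_snapshotsize 5G
conf_line backup lvm://vg0/vm_insomniac-root/ insomniac/
```

The output can be appended to the config file, after which "rsnapshot configtest" reports whether the file parses cleanly.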
My snapshot_root is set as “/.snapshot”, and as such the backups look like this:
[root@localhost .snapshot]# cd /.snapshot
[root@localhost .snapshot]# ls
daily.0/ daily.1/ daily.2/ daily.3/ daily.4/ daily.5/ daily.6/ lost+found/
[root@localhost .snapshot]# ls daily.0/ | grep insomniac
insomniac/
[root@localhost .snapshot]# ls daily.0/insomniac/
bin/ boot/ etc/ home/ lib/ lost+found/ media/ mnt/ opt/ root/ sbin/ selinux/ srv/ tmp/ usr/ var/
As you can see, I’m only set up to keep a week’s worth of daily backups. If you have the disk space you could opt for a much longer retention period.
Delegation to end-users
To provide users with a simple interface to access their backups, I share each of their rsnapshot directories as a read-only NFS export, restricted of course to their own server. This allows users to browse their backups directly.
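A minimal sketch of what such exports could look like follows. It prints one read-only /etc/exports entry for every backup directory belonging to a host. The function name exports_for and the client name insomniac.example.com are hypothetical, and the layout of one export per daily directory is an assumption rather than a description of my exact configuration.

```shell
#!/bin/sh
# Print one read-only /etc/exports entry per backup directory of a host.
# Usage: exports_for SNAPSHOT_ROOT BACKUP_NAME CLIENT
exports_for() {
    root=$1; name=$2; client=$3
    for dir in "$root"/*/"$name"; do
        [ -d "$dir" ] || continue      # skip non-matching globs
        printf '%s %s(ro)\n' "$dir" "$client"
    done
}

exports_for /.snapshot insomniac insomniac.example.com
```

After the generated lines are added to /etc/exports, "exportfs -ra" makes the NFS server re-read the file.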
One significant benefit is that there is no backup client for them to install and learn how to use. Restoration is performed with standard unix commands like cp and rsync, saving much time and frustration.
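To make the restore step concrete, here is a small sketch of a helper a user might wrap around cp. The function name restore_path and the mount point in the example are hypothetical; the only assumption is that the backup export is already mounted somewhere on the user’s server.

```shell
#!/bin/sh
# Copy one path out of a mounted backup, preserving mode and timestamps.
# Usage: restore_path BACKUP_MOUNT RELATIVE_PATH TARGET_DIR
restore_path() {
    src=$1/$2
    [ -e "$src" ] || { echo "not in backup: $src" >&2; return 1; }
    mkdir -p "$3" && cp -Rp "$src" "$3"/
}

# Example (the mount point is hypothetical):
# restore_path /mnt/backups/daily.1 etc/hosts /tmp/restore
```

Restoring into a scratch directory first, rather than over the live file, lets the user compare the two versions before replacing anything.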
Potential Improvements and follow-up
Time permitting, I would like to test the performance, reliability and storage savings of using a deduplicating file system such as lessfs or SDFS. I think these have the potential to make this solution even more attractive, especially from a cost perspective.