Generally, I’m pretty happy with the backup strategy I have for my devices and my servers. I have several geo-redundant copies of my data and rotate backups frequently. I even use SSDs to clone my notebook installation and data so I can hit the ground running should the worst happen. But there is always more one can do. Two things my backup strategy has not addressed very well so far are accidental deletion of data that is only noticed days or weeks later, and encryption trojans that alter data which then replicates into at least the latest backup if not caught in time.
One way to protect myself against such scenarios is incremental backups, so I can go back to any state over weeks or months. Perhaps Borg Backup for Linux, which I had heard about in lightning talks at the past two Chaos Communication Congresses, could complement my backup strategy? When I finally gave this open source package a try recently, it totally blew me away!
Borg Backup is a shell-based incremental backup solution that works both locally and via SSH tunnels over the network. The introduction screencasts and documentation on their website are excellent, so I won’t go into the details here. Instead, I thought I’d write a bit about my experiences and how I found the software to work for me in practice.
First Steps
For a start, I used Borg to store incremental backups of a 30 GB Nextcloud instance over the local network. Throughput over Ethernet was about 80 MB/s, so the initial backup took a while. Once that was done, all subsequent incremental backups took just half a minute to complete. Borg works on chunks of file content rather than whole files, so renaming a file or changing part of it doesn’t transfer the complete file again. For a rename, nothing has to be transferred apart from the changed filename, and for partially changed files, only the modified parts are transferred.
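To make the rename and partial-change behaviour more concrete, here is a minimal Python sketch of chunk-level deduplication. It is not Borg's implementation: as far as I understand, Borg uses content-defined chunking with much larger chunks, while this sketch swaps in tiny fixed-size chunks to keep things short. The names chunk_ids, backup and store are made up for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # tiny, for illustration only; real chunks are far larger


def chunk_ids(data):
    """Split data into fixed-size chunks and return a SHA-256 ID per chunk."""
    return [
        hashlib.sha256(data[i:i + CHUNK_SIZE]).hexdigest()
        for i in range(0, len(data), CHUNK_SIZE)
    ]


def backup(files, store):
    """Add all chunks of `files` to `store`; return how many chunks were new."""
    transferred = 0
    for data in files.values():
        for cid in chunk_ids(data):
            if cid not in store:  # only unknown chunks need to be sent
                store.add(cid)
                transferred += 1
    return transferred


store = set()  # stands in for the repository (hypothetical)
first = backup({"report.txt": b"AAAABBBBCCCCDDDD"}, store)    # initial backup
renamed = backup({"final.txt": b"AAAABBBBCCCCDDDD"}, store)   # file renamed only
modified = backup({"final.txt": b"AAAAXXXXCCCCDDDD"}, store)  # one chunk changed
print(first, renamed, modified)  # prints: 4 0 1
```

The initial run stores all four chunks, the renamed file adds none because every chunk is already known, and the partially changed file adds only the one chunk that differs.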
Diffs and Mounting Backups
I particularly like the ‘diff’ option, which shows the difference between any two snapshots. Even better is the ability to mount any backup into the file system, so one can, for example, use rsync to find the differences between the current original and any snapshot made previously. This also makes it possible to restore individual files or revert to any snapshot state. Mounting a Borg snapshot, even over the network, is fast and feels just like a real filesystem, without any noticeable delay beyond that of a rotating hard disk. Brilliant! I played around quite a bit, restored files and compared checksums, and couldn’t break things at any point.
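Comparing checksums between a mounted snapshot and the live data can be done with a few lines of Python. This is only a sketch of the general idea, not the exact procedure I followed. The function names are made up, and it assumes the snapshot has already been mounted somewhere as an ordinary directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in 1 MiB blocks so large files don't have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def tree_checksums(root):
    """Map each file's path relative to `root` to its SHA-256 hex digest."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in root.rglob("*")
        if path.is_file()
    }


def compare_trees(original, snapshot):
    """Report files added, removed, or changed relative to the snapshot."""
    now, then = tree_checksums(original), tree_checksums(snapshot)
    return {
        "added": sorted(now.keys() - then.keys()),
        "removed": sorted(then.keys() - now.keys()),
        "changed": sorted(p for p in now.keys() & then.keys() if now[p] != then[p]),
    }

# Hypothetical usage, with made-up paths:
# compare_trees(Path("/srv/data"), Path("/mnt/borg-snapshot"))
```

An empty result in all three lists means the current data and the snapshot are byte-for-byte identical, at least as far as regular files are concerned.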
Going for Terabyte Backups
Then I grew more adventurous and used Borg Backup locally with around 1 TB of data. The first backup took a few hours to complete. Presumably due to compression and encryption, data was written to a USB3 hard disk at a rate of about 70-80 MB/s. That’s well below the 180 MB/s the drive is capable of in practice, but still all right for my purposes. Once the initial backup was done, subsequent snapshots took just a couple of minutes to complete. Restoring data from a backup runs at around 60-70 MB/s from the USB3 hard disk to an SSD, and it doesn’t really seem to matter where files are located in a backup.
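As a rough sanity check on "a few hours", the transfer time follows directly from size and throughput. The sketch below assumes exactly 1 TB in decimal units (1000 GB) and a constant rate, both of which are simplifications. The function name is made up for illustration.

```python
def transfer_hours(size_gb, rate_mb_per_s):
    """Hours needed to move size_gb (decimal GB) at a sustained rate in MB/s."""
    return size_gb * 1000 / rate_mb_per_s / 3600


for rate in (70, 80, 180):
    print(f"{rate} MB/s: {transfer_hours(1000, rate):.1f} h")
# prints:
# 70 MB/s: 4.0 h
# 80 MB/s: 3.5 h
# 180 MB/s: 1.5 h
```

So at the observed 70-80 MB/s the initial backup lands at roughly 3.5 to 4 hours, compared with about 1.5 hours if the drive's full 180 MB/s could be used.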
Securing a Central Backup Server
A final feature I would like to mention here is that Borg and the SSH daemon on a remote storage server can be configured to only allow adding snapshots and to limit all actions to one directory path. All other SSH and Borg features are then disabled. This way, a compromised Borg client machine can’t delete any backups or do other damage on the remote central backup server.
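On the server side, this kind of restriction is typically expressed as a forced command in the backup user's authorized_keys file. The Python sketch below only builds such a line. It assumes Borg 1.x, where borg serve accepts the --append-only and --restrict-to-path options, and an OpenSSH version that supports the restrict key option. The function name, key and repository path are placeholders, so please check the Borg and OpenSSH documentation for your versions before relying on it.

```python
import shlex


def authorized_keys_entry(public_key, repo_path):
    """Build an authorized_keys line that forces `borg serve` for this key.

    Assumption: repo_path contains no double quotes, since the forced
    command is wrapped in double quotes in the resulting line.
    """
    forced = f"borg serve --append-only --restrict-to-path {shlex.quote(repo_path)}"
    # `restrict` turns off port forwarding, PTY allocation and similar features
    return f'command="{forced}",restrict {public_key}'


# Placeholder key and path, for illustration only
entry = authorized_keys_entry("ssh-ed25519 AAAA... client@notebook", "/srv/borg/notebook")
print(entry)
```

With a line like this in place, logging in with that key can only ever start borg serve confined to the given path, no matter which command the client asks for.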
The guys developing Borg have really thought this through! Two thumbs up!