Remote backups with Borg and rsync
There’s a famous saying that “data that’s not backed up is data you’re prepared to lose.” I used Windows for a very long time and lost quite a bit of data back in the day, either because Windows Update bricked the system or because I wanted to reinstall the OS (Windows has a habit of losing performance over time - the easiest fix is a fresh install). I had been performing backups manually to an external hard disk, frequently forgot to back up something critical, and only had backups when I cared to make them (i.e. rarely). Fortunately, there’s a better way of doing things: automated backups to a remote server with set-up-and-forget tools like rsync and borg. I haven’t lost data since.
Before we start⌗
To use either of these tools, all you need is a UNIX system (Mac/Linux) and a server or storage device to back up to. There are no other requirements. If you don’t have a server of your own, I highly recommend Rsync.net. Rsync.net is a cheap and reliable backup provider that simply gives you an SSH endpoint to dump your files in. Though plans vary, the price is about $1 per GB stored per year, which is quite affordable, and the service comes with free snapshots and support. If you choose this option, be aware that there’s also a secret pricing tier for borg users which gives heavily discounted plans that do not include support or snapshots (since borg does that for you).
If you’ll be backing up to a remote system, you’ll want to set up passwordless SSH before you start. All of the defaults are fine here; do not enter a passphrase for your key. Key-based login is more secure than using a password to connect, and it means that backups over SSH can be performed non-interactively.
ssh-keygen -t rsa
# hit enter for all the prompts here, you typically do not want to set a passphrase
ssh-copy-id username@server.web.address
Before you do anything else, make sure that connecting over SSH to your backup server no longer prompts you for a password.
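One way to check this without getting stuck at a prompt is SSH's BatchMode option, which makes the client fail instead of asking for a password. Here is a minimal sketch; check_passwordless_ssh is a hypothetical helper name of my own, and the address is the same placeholder as above.

```bash
# Hypothetical helper: succeeds only if key-based login works without any prompt.
check_passwordless_ssh() {
    local target="$1"
    # BatchMode=yes disables password and passphrase prompts, so a missing key fails fast.
    if ssh -o BatchMode=yes -o ConnectTimeout=10 "$target" true; then
        echo "OK: $target accepts key-based login"
    else
        echo "FAILED: $target still needs a password (or is unreachable)" >&2
        return 1
    fi
}

# Example (placeholder address):
# check_passwordless_ssh username@server.web.address
```

If this fails, re-run ssh-copy-id and try again before moving on.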
Alternatively, if you are backing up to a network drive, make sure you know how to mount and unmount the drive via the command line (typically via mount and umount). Your backups should never be mounted to your computer unless you are backing things up! (This reduces the risk of attackers or accidents breaking your precious backups.)
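To make the "only mounted while backing up" rule hard to forget, the mount, backup, and unmount steps can be wrapped together. This is a rough sketch rather than a finished tool: with_backup_drive is a hypothetical name, and it assumes the mount point is listed in /etc/fstab so that mount only needs the directory.

```bash
# Hypothetical helper: mount a backup drive, run one command, then always unmount.
# Assumes the mount point is listed in /etc/fstab so "mount <dir>" is enough.
with_backup_drive() {
    local mountpoint="$1"
    shift
    mount "$mountpoint" || return 1
    # Run the backup command and remember its exit status,
    # so that a successful unmount cannot hide a failed backup.
    local rc=0
    "$@" || rc=$?
    umount "$mountpoint" || return 1
    return "$rc"
}

# Example (placeholder paths):
# with_backup_drive /mnt/backup rsync -az --delete /folder/to/back/up /mnt/backup/
```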
Once this is done, you should be set.
Simple backups with rsync⌗
rsync is a very handy file-copying tool that performs easy, straightforward backups. Backups are unencrypted and unauthenticated, but they are trivial to set up and restore from. If all you want is an up-to-date backup when things go bad, rsync is the tool for you.
To create a backup⌗
This performs a very simple backup to any storage device. Files are copied as-is, and all attributes (ownership, permissions, modification times, etc.) are preserved. No authentication or encryption is performed, meaning that anyone could get at your files if they can access your storage media. Files that you delete are deleted from your backup, and only files modified since the last backup are uploaded.
To back up to a local disk or network drive:
rsync -az --delete /folder/to/back/up /destination/folder
To back up to a remote server:
Note that this command connects to the remote server over SSH, meaning that information is encrypted while being transferred to the remote server. The actual backups themselves, however, are unencrypted.
rsync -az --delete -e ssh /folder/to/back/up username@remote.host.address:/destination/folder
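Because --delete removes files from the destination, it can be worth previewing a run first. The sketch below adds rsync's -n (--dry-run) and -i (--itemize-changes) flags to the same command; preview_backup is a hypothetical helper name. Also keep in mind that a source path without a trailing slash copies the folder itself into the destination, while a trailing slash copies only its contents.

```bash
# Hypothetical helper: show what a backup run would change without copying or deleting anything.
preview_backup() {
    local src="$1"
    local dest="$2"
    # -n (--dry-run) makes no changes; -i (--itemize-changes) prints one line per affected file.
    rsync -azni --delete -e ssh "$src" "$dest"
}

# Example (placeholder paths). Lines starting with "*deleting" show files that would be removed:
# preview_backup /folder/to/back/up username@remote.host.address:/destination/folder
```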
To restore from a backup⌗
To restore files from your remote backup, no special magic is required - just reverse the source and destination folders in the above command. Alternatively, you can use tools like scp or sftp to restore individual files.
rsync -az -e ssh username@remote.host.address:/destination/folder /folder/to/restore/in
Pros and cons of rsync⌗
rsync is useful for its simplicity. You get easy-to-perform backups for virtually no learning curve, and restoring files is a breeze (just copy them back!).
There are some big drawbacks to this method, however. rsync is very space inefficient - files are stored as-is, with no compression or deduplication (the -z flag only compresses data in transit). You only get access to a single backup as well. If you want multiple backups for multiple dates, you’ll need to manage these manually, and each extra backup will take up an equal amount of space (7 days of backups == 7 times the storage usage). Anyone with access to your storage media will also have access to your files.
In light of this info, rsync is a great tool for fast and dirty backups to local storage media, or when you are confident that your backup location is secure and cannot be accessed by anyone else. If you want multiple backups and access controls, you’ll need a different tool.
Secure backups with borg⌗
Borg is a fantastic tool that covers the weaknesses of rsync without sacrificing much in terms of usability. In particular, you’ll be able to keep multiple backups, save space through deduplication and compression, and secure your data with either passwords or a keyfile.
Setup⌗
Borg requires a little bit of additional setup before you can start using it.
Having borg installed on the remote server will speed things up. This is already done for you if you use Rsync.net, although you should set the environment variable BORG_REMOTE_PATH to use the most recent version of borg available:
# only for Rsync.net users
export BORG_REMOTE_PATH=/usr/local/bin/borg1/borg1
Additionally, you will need to initialize your repository before you can use it. To create a new repository that’s password-protected, use the following:
# see "borg init --help" for more options like storage quotas, encryption options, etc.
borg init -e repokey-blake2 username@remote.host.address:/destination/folder
Creating a backup⌗
For your first backup, you may wish to do it interactively so you can watch the progress and verify that things work. The following creates a backup titled backup-name:
borg create --progress --stats username@remote.host.address:/destination/folder::backup-name /folder/to/back/up
For subsequent backups, you will likely want to do things non-interactively. You can use the following to create an automatically-named backup (computer name + date). Note that further commands assume you’ve set the BORG_REPO environment variable (specifying a default repository to back up to).
# specify a password for non-interactive use
export BORG_PASSPHRASE='your repository password'
# specify the default repository to use for backups
export BORG_REPO='username@remote.host.address:/destination/folder'
borg create ::$(hostname)-$(date -I) /folder/to/back/up
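A password written directly into a script can be read by anyone who can read that script. As an alternative, borg 1.1 and later also understand BORG_PASSCOMMAND, which names a command whose output is used as the passphrase. The following is only a sketch: the passphrase file location is my own assumption, and the path must not contain spaces.

```bash
# Assumption: the passphrase lives in a file only you can read (hypothetical path).
passphrase_file="$HOME/.config/borg/passphrase"

# One-time setup, done by hand:
#   mkdir -p "$(dirname "$passphrase_file")"
#   editor "$passphrase_file"      # put the passphrase on a single line
#   chmod 600 "$passphrase_file"

# borg runs this command whenever it needs the passphrase, instead of reading BORG_PASSPHRASE.
export BORG_PASSCOMMAND="cat $passphrase_file"
export BORG_REPO='username@remote.host.address:/destination/folder'

# borg create "::$(hostname)-$(date -I)" /folder/to/back/up
```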
Cleaning up old backups⌗
Chances are, you will not want to keep every backup ever made. You might want to keep, say, only 7 days’ worth of daily backups, 8 weeks of weekly backups, and 12 months of monthly backups. To do so:
borg prune --keep-daily 7 --keep-weekly 8 --keep-monthly 12
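Since pruning deletes backups, I find it reassuring to preview the result first. borg prune accepts --dry-run and --list for this; the sketch below wraps the same retention policy in a hypothetical preview_prune function and assumes BORG_REPO is already set.

```bash
# Hypothetical helper: show which archives the policy would keep or prune, deleting nothing.
preview_prune() {
    # --dry-run changes nothing; --list prints the decision made for each archive.
    borg prune --dry-run --list --keep-daily 7 --keep-weekly 8 --keep-monthly 12
}

# Example:
# preview_prune
```

On borg 1.2 and later, pruned data only frees disk space after you also run borg compact.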
Inspecting backups⌗
To view a list of all backups:
borg list
# prints a list of backups, dates, and IDs
To view all files within a backup (EXTREMELY VERBOSE, so output has been piped to head):
borg list ::backup-name | head
Restoring from a backup⌗
Restoring files is quite easy, although they are extracted to the current working directory (so if you backed up /home/youruser/some-folder, expect borg to recreate that full directory structure under wherever you run the command; cd to the root directory first if you want files restored to their original locations).
To extract a single file:
cd /location/to/restore/to
borg extract ::archive file/to/restore
To extract all files:
cd /location/to/restore/to
borg extract ::archive
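If you are unsure where files will land, borg extract can be run with --dry-run and --list to print what it would write without touching the disk. A small sketch follows; preview_restore is a hypothetical name, and BORG_REPO is assumed to be set as above.

```bash
# Hypothetical helper: list what an extraction would write, without writing anything.
preview_restore() {
    local archive="$1"
    shift
    # Any remaining arguments are paths inside the archive (stored without the leading "/").
    borg extract --dry-run --list "::$archive" "$@"
}

# Examples:
# preview_restore backup-name
# preview_restore backup-name home/youruser/some-folder
```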
Automating your backups (and sample scripts!)⌗
To run your backups automatically, you’ll want to create a script and have cron run it for you. Though crontab is normally a great way to do this (run a task at a specified time and day), it is not very flexible - if you set it to perform backups at 3am, and your laptop is off or asleep at 3am, the backup won’t happen! Instead, we’ll create a script and put it in /etc/cron.daily. On most distributions, scripts here are run once a day by anacron, which catches up on a missed run shortly after the computer is next started (the exact delay depends on your distribution’s configuration).
Here are some sample scripts that you can use for either rsync or borg. Installation is the same - just copy these to /etc/cron.daily and make them executable. Scripts in this folder run as root, so the root user needs the passwordless SSH key set up earlier.
You’ll note I’ve fully specified the path to programs here - this is a “best practice” when working with scripts to be run under cron.
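Copying a script into place can be sketched as below. install_daily_script is a hypothetical helper of my own, not part of cron. It strips the file extension because Debian's run-parts skips file names containing a dot (other distributions may be more lenient), and it uses mode 700 because the script may contain your repository password.

```bash
# Hypothetical helper: copy a backup script into a cron.daily-style directory.
install_daily_script() {
    local script="$1"
    local target_dir="${2:-/etc/cron.daily}"
    local name
    name=$(basename "$script")
    # Drop the extension: Debian's run-parts ignores file names that contain a dot.
    name="${name%.*}"
    # Mode 700: executable by the owner, unreadable to other users.
    install -m 700 "$script" "$target_dir/$name"
    echo "$target_dir/$name"
}

# Example (run as root; the file name is a placeholder):
# install_daily_script backup-borg.sh
# run-parts --test /etc/cron.daily    # on Debian/Ubuntu, lists the scripts that would run
```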
rsync example script⌗
#!/bin/bash
# Backup a folder to a remote address using rsync.
# Usage: backup-rsync.sh
# To restore: rsync -az -e ssh username@remote.host.address:backups/$(hostname)/folder /restore/point
set -eu
# Make sure the per-machine destination folder exists, then mirror the folder into it.
/usr/bin/ssh username@remote.host.address mkdir -p "backups/$(hostname)"
/usr/bin/rsync -az --delete -e ssh /folder/to/back/up "username@remote.host.address:backups/$(hostname)/"
Borg example script⌗
#!/bin/bash
# Backup a folder to a remote address using borg.
# Usage: backup-borg.sh
# To restore: borg extract $BORG_REPO::computer-and-date
set -eu
export BORG_REPO='username@remote.host.address:borg/repo/path'
export BORG_PASSPHRASE='your password'
export BORG_REMOTE_PATH=/path/to/remote/borg
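# Use an ISO date (YYYY-MM-DD): plain $(date) output contains spaces and breaks the archive name.
/usr/bin/borg create "::$(hostname)-$(date -I)" /folder/to/back/up
# prune operates on the whole repository, not on a single archive.
/usr/bin/borg prune --keep-daily=14 --keep-monthly=6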
You’re set!⌗
Assuming you’ve added one of these scripts to your /etc/cron.daily/ folder, all you have to do is wait. If you’ve added BORG_REPO to your .bashrc, you can check in and verify that your backups are working properly with borg list (you should see a list of your current backups).
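For a quicker check than reading the whole list, borg list can print just the newest archive name with --last and --format. The sketch below assumes BORG_REPO is set and that archives follow the $(hostname)-$(date -I) naming from the non-interactive example; both function names are hypothetical.

```bash
# Hypothetical helper: print the name of the most recent archive in $BORG_REPO.
latest_archive() {
    borg list --last 1 --format '{archive}{NL}'
}

# Hypothetical check: succeeds only if the newest archive is today's,
# assuming the "$(hostname)-$(date -I)" naming scheme.
backed_up_today() {
    [ "$(latest_archive)" = "$(hostname)-$(date -I)" ]
}

# Example:
# backed_up_today && echo "backup is current" || echo "no backup yet today"
```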