Pivert's Blog

CephFS backup



Since CephFS is a snapshot-capable file system, it is very easy to back up.

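rdiff-backup is an efficient backup tool with «deduplication»: it keeps a full mirror of the latest backup and stores older versions as reverse diffs. It is easy to use, powerful, and its only dependency is Python 3.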

Here is an example script that you can schedule from any server with bash and local disk space.
For instance, I run this script from a Synology NAS: the Synology task scheduler runs it and sends an email in case of failure.

The script needs rdiff-backup installed on both the remote server (the PVE server with the Ceph file system to back up) and the local backup server (the Synology NAS).
The rdiff-backup versions on both sides must be similar; ideally they are the same release.
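
One way to verify this before the first run is to compare the version reported on each side. The following is only a sketch: the helper names major_minor and check_versions are made up for illustration, pve1 is the SSH host alias used in the script below, and treating an identical MAJOR.MINOR as "similar" is an assumption rather than an official compatibility rule.

```shell
#!/bin/bash
# Hypothetical helper: extract "MAJOR.MINOR" from a version banner
# such as "rdiff-backup 2.2.6".
major_minor() {
  # Keep the last whitespace-separated field, then its first two dot-separated parts
  echo "$1" | awk '{print $NF}' | cut -d. -f1,2
}

# Hypothetical helper: compare the local and the remote rdiff-backup versions.
# Assumption: "similar" means the same MAJOR.MINOR on both sides.
check_versions() {
  local local_v remote_v
  local_v=$(major_minor "$(rdiff-backup --version)")
  remote_v=$(major_minor "$(ssh pve1 rdiff-backup --version)")
  if [ "$local_v" != "$remote_v" ]; then
    echo "WARNING: local $local_v and remote $remote_v differ" >&2
    return 1
  fi
  echo "OK: both sides run rdiff-backup $local_v"
}
```

Call check_versions once from the backup server, with the venv activated if rdiff-backup was installed with pip.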

#!/bin/bash
# CephFS supports snapshots: creating a subfolder in the virtual .snap/ folder takes one.
# Let's take advantage of that for rdiff-backup.

REMOTE_SNAP_BASE=/mnt/pve/cephfs/.snap

SNAPTIME=$(date -Is)
RSNAP="$REMOTE_SNAP_BASE/RDB_$SNAPTIME"

# Activate the Python venv for rdiff-backup
. ~/venv/bin/activate

# Create a new snapshot and delete all old ones except the last 7.
ssh pve1 << EOF
mkdir "$RSNAP"

for directory in \$(ls -1d $REMOTE_SNAP_BASE/RDB_20[2-9][0-9]-* | head -n -7)
do
  echo rmdir "\$directory"
  rmdir "\$directory"
done

EOF

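# Build the command as an array so that each argument stays a single word
CMD=(rdiff-backup --exclude "$RSNAP/archive" "pve1::$RSNAP" /volumeUSB1/usbshare/rdiff-backup-cephfs)
echo "$SNAPTIME INFO: Running ${CMD[*]}"

# Run the backup and remember its exit status
"${CMD[@]}"
STATUS=$?

ENDTIME=$(date -Is)
echo "$ENDTIME INFO: rdiff-backup finished with exit status $STATUS"

# Return the rdiff-backup status so that the scheduler can detect a failure
exit $STATUS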

Explanation

  • The script generates a new name for the snapshot. The names must sort alphabetically in chronological order, so that the oldest snapshots can be purged.
  • It activates the Python virtual environment containing rdiff-backup for the local user.
  • Over SSH, it creates a new snapshot and purges the old ones, keeping the last 7.
  • It runs rdiff-backup from the backup server (the Synology NAS in this case).
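
The purge step relies on ISO 8601 names sorting alphabetically in chronological order. The sketch below replays that logic in a scratch directory standing in for .snap/, so it can be tried without a CephFS mount. The function name snapshots_to_purge and the sample dates are made up for illustration, and head -n with a negative count assumes GNU head, as in the script above.

```shell
#!/bin/bash
# Hypothetical helper: list the snapshot directories that would be purged,
# keeping the newest $2 entries (default 7) under the base directory $1.
snapshots_to_purge() {
  local base=$1 keep=${2:-7}
  # ls sorts its arguments alphabetically, which is also chronological
  # for ISO 8601 names; head -n -N prints everything except the last N lines.
  ls -1d "$base"/RDB_20[2-9][0-9]-* 2>/dev/null | head -n "-$keep"
}

# Demo in a scratch directory standing in for .snap/
demo=$(mktemp -d)
for day in 01 02 03 04 05 06 07 08 09; do
  mkdir "$demo/RDB_2024-01-${day}T02:00:00+00:00"
done

# Nine snapshots exist, seven are kept, so the two oldest are listed
snapshots_to_purge "$demo" 7
```

On a real CephFS mount, each listed directory would then be removed with rmdir, exactly as the loop in the script does.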

Installation

Most of the time, rdiff-backup is packaged by your Linux distribution, so a simple yum install rdiff-backup or apt-get install rdiff-backup is enough.

If your package manager does not provide a recent version, you can install it with pip, since it is a Python package. Example:

  • A recent python3 interpreter is installed by default on most devices. If it is missing on yours, install it.
  • Create a «backup» user with read-write access to the local storage used for backups.
  • In the home folder of the «backup» user, create a Python virtual environment:
    python3 -m venv venv
  • Install rdiff-backup. Activate the virtual environment first:
    . ~/venv/bin/activate
    pip install rdiff-backup
  • As long as the venv is activated, you can test rdiff-backup manually, for example with rdiff-backup --version.
  • Make sure the backup script activates the venv, as in the example script above.
  • Schedule the script with cron, or with a task scheduler such as the one in Synology DSM, to keep logs and be alerted in case of failure.
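
If the scheduler does not keep the output itself, a small wrapper can do it. This is only a sketch: the function name run_logged and the paths in the example call are hypothetical, and it assumes the scheduler treats a non-zero exit status as a failure.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command, append its output to a log file,
# and return the command's own exit status so the scheduler can see failures.
run_logged() {
  local logfile=$1
  shift
  echo "$(date -Is) INFO: starting $*" >> "$logfile"
  # Both stdout and stderr of the command go to the log
  "$@" >> "$logfile" 2>&1
  local status=$?
  echo "$(date -Is) INFO: finished with exit status $status" >> "$logfile"
  return "$status"
}

# Example call (both paths are placeholders):
# run_logged ~/cephfs-backup.log ~/bin/cephfs-backup.sh
```

The wrapper only helps if the backup script itself exits with a non-zero status when rdiff-backup fails.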

Encryption

rdiff-backup does not provide encryption.
If you need encryption, or if you want to back up to another target such as S3 or a public cloud, and you still want the benefits of deduplication, you must run the backup from one of the CephFS nodes or from a node where it is mounted, and either:
