Over the past year, I have set up a 4-node home lab Kubernetes cluster. I have moved my various household services over to it, including Wallabag, Calibre, Vaultwarden, and Immich. For someone experienced with Kubernetes, this feels like an obvious and very nice upgrade from simple Docker Compose hosting.

Unfortunately, I have yet to set up a NAS with RAID to keep my files safe. But even with a NAS, off-site backup is good practice. I’ve been using Restic for years as my personal backup solution. I thought to myself: hey, Restic works so nicely, and there is probably a stable Docker image, so it might not be too difficult to get it working in my Kubernetes cluster. Turns out I was right.

Quick overview

Here are the steps in a nutshell:

  1. Ensure Restic is working in a simpler environment.
  2. Add the Secrets to your namespace(s).
  3. Add the CronJob(s).
  4. Test the CronJob, check logs.
  5. Check Restic snapshots.

First, make sure you have Restic working in a simple use case

I have had Restic backing up my photos in Nextcloud for years, so I already knew it was working. If you haven’t set up Restic yet, I recommend getting it going in a simple environment first. For my photos, my backup command was something like this:

❯ restic backup --tag=nextcloud --tag=photos /path/to/photos

Of course, your setup will be your own. Refer to the Restic documentation for all the backup repository options.
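If you are starting from scratch, a minimal first run against an SFTP repository (which is what I use below) might look like this. The host, path, and password here are placeholders for your own values:

❯ export RESTIC_REPOSITORY=sftp:dave@192.168.1.50:/srv/restic/photos
❯ export RESTIC_PASSWORD='your-repository-password'
❯ restic init          # one-time initialization of the repository
❯ restic backup --tag=nextcloud --tag=photos /path/to/photos
❯ restic snapshots     # confirm the snapshot was written

Once this works interactively, moving it into Kubernetes is mostly a matter of wiring up the same environment variables and paths.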

Add the Restic Secrets to Kubernetes

I am using the SFTP option for my backups. The variables in the Secret will of course need to match whichever type of repository you are using. Here is an example of creating the secret.

❯ kubectl -n photos create secret generic restic \
  --from-literal=RESTIC_REPOSITORY=myrepo \
  --from-literal=RESTIC_PASSWORD=mypassword \
  --from-file=id_rsa=path/to/private_key \
  --from-file=ssh-config=path/to/ssh-config \
  --dry-run=client -o=yaml

Once the output looks correct, remove the --dry-run=client -o=yaml options to actually create the secret. Check the secret values after creating them to make sure they are correct. In my case with SFTP, I also needed an ssh-config file, which could have gone in a ConfigMap, but it makes sense to keep it together in this secret. If you are also using an SFTP repo, you will likely need StrictHostKeyChecking turned off and UserKnownHostsFile set to /dev/null.

❯ kubectl -n photos get secret restic --template='{{ index .data "ssh-config" | base64decode }}'
Host 192.168.1.*
   User dave
   StrictHostKeyChecking no
   UserKnownHostsFile=/dev/null
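The same template trick works for the other keys. For example, to confirm that the repository value decodes to the sftp: URL you stored:

❯ kubectl -n photos get secret restic --template='{{ index .data "RESTIC_REPOSITORY" | base64decode }}'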

CronJobs

For my Immich service, I have two CronJobs for backup: one for the files and one for the database. Each has its own PersistentVolumeClaim, so to back them up we simply need to mount the volume and run restic. Below is my CronJob for the photos, which handily uses the restic/restic image. See my comments for the values that need to be set.

apiVersion: batch/v1
kind: CronJob
metadata:
  name: immich-photo-backup
  namespace: photos                    # Set the namespace here.
spec:
  concurrencyPolicy: Forbid
  failedJobsHistoryLimit: 1
  jobTemplate:
    metadata:
      creationTimestamp: null
      name: immich-photo-backup
    spec:
      template:
        metadata:
          creationTimestamp: null
        spec:
          containers:
          - args:
            - backup
            - --host
            - cronjob-pod
            - --tag=k8s      # Set restic tags however you like
            - --tag=photos
            - /home/photos/library/dave  # Match with the mountPath below
            env:
            - name: RESTIC_PASSWORD      # Set the secrets according to
              valueFrom:                 # your situation
                secretKeyRef:
                  key: RESTIC_PASSWORD
                  name: restic
            - name: RESTIC_REPOSITORY
              valueFrom:
                secretKeyRef:
                  key: RESTIC_REPOSITORY
                  name: restic
            image: restic/restic
            imagePullPolicy: Always
            name: immich-backup
            resources: {}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
            volumeMounts:
            - mountPath: /root/.ssh/config   # For SFTP repo
              name: ssh-config
              subPath: config
            - mountPath: /root/.ssh/id_rsa
              name: ssh-key
              subPath: id_rsa
            - mountPath: /home/photos    # Match with command arguments above
              name: data
          dnsPolicy: ClusterFirst
          restartPolicy: OnFailure
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30
          volumes:
          - name: ssh-config    # For SFTP restic repos
            secret:
              defaultMode: 256
              items:
              - key: ssh-config
                path: config
              secretName: restic
          - name: ssh-key
            secret:
              defaultMode: 256
              items:
              - key: id_rsa
                path: id_rsa
              secretName: restic
          - name: data
            persistentVolumeClaim:
              claimName: photo-library    # Your PVC you want to backup
  schedule: 38 10 * * *                   # But when?
  successfulJobsHistoryLimit: 3
  suspend: false
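Applying the manifest is the usual kubectl apply; the file name is just whatever you saved it as. The namespace is already set inside the manifest, so no -n flag is needed for the apply itself:

❯ kubectl apply -f immich-photo-backup.yaml
❯ kubectl -n photos get cronjob immich-photo-backup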

Depending on your application, you may have multiple CronJobs, as I do for Immich. It makes sense to have one for each volume you are backing up, since you can schedule them independently. My second one backs up the PostgreSQL database. Here is the diff of the two CronJobs, which I think shows the important details more clearly than repeating the whole manifest:

4c4
<   name: immich-data-backup
---
>   name: immich-photo-backup
12c12
<       name: immich-data-backup
---
>       name: immich-photo-backup
26c26
<             - /immich/postgresql
---
>             - /home/photos/library/dave
40c40
<             name: immich-data-backup
---
>             name: immich-backup
51c51
<             - mountPath: /immich/postgresql
---
>             - mountPath: /home/photos
75,76c75,76
<               claimName: data-immich-postgresql-0
<   schedule: 36 10 * * *
---
>               claimName: photo-library
>   schedule: 38 10 * * *
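A diff like this is easy to reproduce if you keep both manifests as files (file names assumed). You can also diff the live objects, though kubectl adds extra metadata and status fields to those:

❯ diff immich-data-backup.yaml immich-photo-backup.yaml
❯ diff <(kubectl -n photos get cronjob immich-data-backup -o yaml) \
       <(kubectl -n photos get cronjob immich-photo-backup -o yaml)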

Always check your logs

To test the CronJob, I simply set the schedule to run in the next minute or two. You can watch for it to run using something like kubectl -n photos get cronjobs,jobs,pods. When you see that the pod has run, check the logs with a command similar to this:

❯ kubectl -n photos logs -f immich-photo-backup-28848458-7vhwp
using parent snapshot b5c709b6

Files:           5 new,     0 changed, 32828 unmodified
Dirs:            0 new,     7 changed,   244 unmodified
Added to the repository: 18.251 MiB (18.253 MiB stored)

processed 32833 files, 186.430 GiB in 0:28
snapshot 250f1da8 saved

If the pod doesn’t run, then kubectl describe ... might help debug the issue.
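Instead of editing the schedule, you can also kick off a one-off run directly from the CronJob; the job name here is arbitrary:

❯ kubectl -n photos create job --from=cronjob/immich-photo-backup backup-manual-test
❯ kubectl -n photos get pods --watch
❯ kubectl -n photos logs -f job/backup-manual-test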

And check your Restic snapshots

After the backup completes, you should be able to see the snapshots in the repository. In my case, restic snapshots --tag=photos reveals healthy snapshots. We could also test a restore of the data in another location to ensure it is correct. I’m not sure my home lab warrants a full disaster recovery exercise, but I am happy knowing that I have off-site backups of the files that are important to me.
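For a bit more reassurance, a quick verification pass from any machine with the repository environment variables set could look like this; the restore target path is just an example:

❯ restic snapshots --tag=photos
❯ restic restore latest --tag=photos --target /tmp/restore-test
❯ restic check

restic check verifies the integrity of the repository itself, which is a nice complement to spot-checking the restored files.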