Hello Self-Hosters,

What is the best practice for backing up data from Docker as a self-hoster looking for ease of maintenance and foolproof backups? (pick only one :D )

Assume directories with user data are mapped to a NAS share via NFS and backups are handled separately.

My bigger concern is how to handle all the other stuff that is stored locally on the server, like caches, databases, etc. The backup target will eventually be the NAS, and from there it’ll be double-backed up to external drives.

  1. Is it better to run cp -a /var/lib/docker/volumes/* /backupLocation (as root) every once in a while, or is it preferable to define mount points for everything under /home/user/Containers and then use a script to sync it to wherever you keep backups? What pros and cons have you seen or experienced with these approaches? (See the sketch after this list.)

  2. How do you test your backups? I’m thinking about digging up an old PC to use for testing backups. I assume I can just edit the IP addresses in the Docker Compose file, mount my NFS dirs, and fail over to see if it runs.

  3. I started documenting my system in my notes and making a checklist of what I need to back up and where it’s stored. Currently I’m trying to figure out if I want to move some directories for consistency. Can I just run docker-compose down, edit the mount points in docker-compose.yml (moving the data to the new paths), and run docker-compose up to get a working system?
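For question 1, a minimal sketch of the bind-mount approach, assuming a hypothetical layout under /home/user/Containers and an NFS-mounted backup target at /mnt/nas/backups (both paths are illustrative, not from the thread):

    #!/bin/sh
    # Hypothetical layout: each service keeps its persistent data under
    #   /home/user/Containers/<service>/
    # and docker-compose.yml bind-mounts those paths, e.g.
    #   volumes:
    #     - /home/user/Containers/nextcloud/data:/var/www/html
    SRC=/home/user/Containers
    DEST=/mnt/nas/backups/containers   # NFS share from the NAS

    # -a preserves ownership and permissions, --delete mirrors removals.
    # Stop the containers (or dump databases separately) first if you need
    # a consistent copy of anything that writes continuously.
    rsync -a --delete "$SRC/" "$DEST/"

The same layout also speaks to question 3: with bind mounts, moving a directory is just down, move the data, edit the path in docker-compose.yml, up.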

  • lazynooblet@lazysoci.al · 1 hour ago

I’m lucky enough to run a business that needs a datacenter presence, so most of my home lab (including Lemmy) is actually hosted on a Dell PowerEdge R740xd in the DC. I can then use the small rack I have at home for off-site backups and some local services.

I treat the entirety of /var/lib/docker as expendable. When creating containers, I make sure any persistent data is mounted from a directory created just to hold it. That means docker compose down --rmi all --volumes isn’t destructive.

When a container needs a database, I make sure to add an extra read-only user, and all database containers and their persistent volume directories are named consistently so scripts can identify them.

The backup strategy is then to back up all non-database persistent directories and dump all SQL databases, including permissions and user accounts. This runs four times a day, and the backup target is an NFS share elsewhere.
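    A rough sketch of what that dump step could look like, assuming Postgres containers whose names end in a recognizable suffix; the naming convention, credentials, and paths below are made up for illustration:

      #!/bin/sh
      # Illustrative only: dump every running container whose name ends in "-db",
      # assuming each one runs Postgres and allows pg_dumpall as the postgres user.
      BACKUP_DIR=/mnt/nfs/backups/sql          # hypothetical NFS target
      STAMP=$(date +%Y%m%d-%H%M)

      for c in $(docker ps --format '{{.Names}}' | grep -e '-db$'); do
          # pg_dumpall includes roles and permissions as well as every database
          docker exec "$c" pg_dumpall -U postgres > "$BACKUP_DIR/$c-$STAMP.sql"
      done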

This is on top of daily BackupPC backups of critical folders, automated Proxmox snapshots of the Docker hosts every 20 minutes, daily VM backups via Proxmox Backup Server, and replication to another PBS at home.

I also try to use S3-compatible storage where possible (Seafile and Lemmy are the two main uses), which is hosted in a container on a Synology RS2423RP+. Synology HyperBackup then performs a backup overnight to the Synology RS822+ I have at home.

Years ago I fucked up, didn’t have backups, and lost all the photos of my son’s early years. Backups are super important.

  • bdonvr@thelemmy.club · 1 hour ago

    BTRBK for easy BTRFS snapshots to an external drive.

Rsync to also upload those to B2-compatible storage (encrypted).
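    For context, a manual sketch of roughly what btrbk automates, assuming the container data lives on a BTRFS subvolume at /srv/containers and the external drive is mounted at /mnt/external (both hypothetical):

      # Read-only snapshot (btrfs send requires read-only sources)
      btrfs subvolume snapshot -r /srv/containers /srv/.snapshots/containers-$(date +%Y%m%d)

      # Ship the snapshot to the external BTRFS drive
      btrfs send /srv/.snapshots/containers-$(date +%Y%m%d) | \
          btrfs receive /mnt/external/backups

    btrbk layers retention policies and incremental sends on top of this.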

  • Sinirlan@lemmy.world · 4 hours ago

I just took the line of least effort: all my Docker containers are hosted on a dedicated VM in Proxmox, so I just back up the entire VM to my NAS on a weekly basis. I already had to restore it once when I was replacing the SSD in the Proxmox host, and it worked like a charm.
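    For reference, the equivalent one-off backup from the Proxmox command line, assuming the VM has ID 100 and the NAS is configured as a Proxmox storage named "nas" (both hypothetical); recurring jobs are normally scheduled under Datacenter → Backup in the web UI:

      # Snapshot-mode backup of VM 100 to the "nas" storage, zstd-compressed
      vzdump 100 --storage nas --mode snapshot --compress zstd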

    • Dataprolet@lemmy.dbzer0.com · 3 hours ago

While that’s an easy solution, it makes it difficult (or outright impossible) to restore single containers and/or files.

      • tofu@lemmy.nocturnal.garden · 27 minutes ago

        It’s not that difficult. You can create a second VM from the backup with a few clicks and move the necessary data with scp.

  • greatley@ani.social · 4 hours ago

I don’t know if it’s best practice, but I mount everything to a specific folder instead of using named volumes. I also just stop all the containers before a backup instead of dumping databases. Then I just do an encrypted B2 backup using Restic.

So far I’ve had to restore twice in six years of running this system, and assuming the folder path is the same, the only thing I had to do after downloading all the data was:

    docker-compose up -d
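
    A minimal sketch of that stop → back up → start cycle with restic and B2; the bucket name, paths, and credential handling are placeholders, not the commenter’s actual setup:

      #!/bin/sh
      # Placeholders throughout -- adjust bucket, paths, and credentials.
      export B2_ACCOUNT_ID=xxxx
      export B2_ACCOUNT_KEY=xxxx
      export RESTIC_REPOSITORY=b2:my-backup-bucket:docker
      export RESTIC_PASSWORD_FILE=/root/.restic-pass   # restic encrypts client-side

      cd /home/user/Containers || exit 1
      docker-compose down                   # stop containers so files are consistent
      restic backup /home/user/Containers   # run "restic init" once beforehand
      docker-compose up -d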
    
  • kewjo@lemmy.world · 3 hours ago

Caches are never really a concern to me; they will regenerate after the fact. From your description I would worry more about the DB, though this depends on what you’re using and what you’re storing. If the concern is bringing the same system back intact, then my primary concern would be backing up any config files you have. In case of failure you mainly want to protect against data loss; if it takes time to regenerate a cache or database, that’s time well spent for the simplicity of actively maintaining your system.

  • Dataprolet@lemmy.dbzer0.com · 3 hours ago

I use Borg to back up the default volume directory and my compose files. If you’re interested, I can share my backup script.
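    Not that script, but a minimal sketch of a Borg run over the default volume directory plus a compose file; the repo location, compose path, and retention numbers are placeholders:

      #!/bin/sh
      # Initialize once with: borg init --encryption=repokey /mnt/nas/borg-repo
      # (export BORG_PASSPHRASE or expect a prompt when encryption is enabled)
      REPO=/mnt/nas/borg-repo                # hypothetical repo on the NAS

      borg create --stats "$REPO::docker-{now:%Y-%m-%d}" \
          /var/lib/docker/volumes \
          /home/user/Containers/docker-compose.yml

      # Keep a bounded history
      borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 "$REPO"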

  • plumbercraic@lemmy.sdf.org · 4 hours ago

I wouldn’t back up the volumes directly. Better to use the mount points as you suggest, then back up those mounted directories. If it’s a database, it usually needs its records exported into a backup-friendly format. Typically I’ll do a DB dump from a cron job on the host system that invokes a script inside the container, which writes to a mounted dir, and that’s the thing I back up.
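
    A sketch of that pattern, assuming a hypothetical Postgres container named app-db whose /backups directory is a bind mount of a host folder that the regular file backup then picks up:

      #!/bin/sh
      # /usr/local/bin/dump-app-db.sh (illustrative), run from the host crontab:
      #   0 3 * * * /usr/local/bin/dump-app-db.sh
      # Container name, database, user, and paths are all placeholders.
      docker exec app-db sh -c 'pg_dump -U app appdb > /backups/appdb-$(date +%Y%m%d).sql'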