As part of the migration from OpenMediaVault to Ubuntu for my NAS, one of the "missing pieces" was Duplicati. I'd been using the OMV Duplicati plugin right along, and it was obviously not going to work moving forward.

Since we were moving everything to Docker anyway, the logical thing to do was move Duplicati to Docker too. I'd had no need (nor, to be honest, any real desire) to change it over before this, as it was working fine as it was... on OMV. (At least, I thought it was working fine. I was wrong.)

First though, let me explain the nature of the beast. I've got Duplicati running on most of the user systems in the house. (Not my laptop - I rarely use it at home.) All of these use SFTP to copy their user folders (Documents, Pictures, Desktop, etc.) to a Backups folder on the NAS, on a ZFS dataset that takes snapshots every 15 minutes. None of that involves Duplicati on the NAS directly. Duplicati on the NAS backs up to OneDrive. It backs up all of our files - photos, those client backups, a few system backups for various critical VMs, my TFTP folder (backups of all my networking configurations), my entire eBook collection, and a couple of other things. No bulk media, but it still comes out to around 500GB.

When I went into Duplicati to export the configs for the storage job, I discovered that - apparently - it hadn't done a successful backup since June 22nd. See, Duplicati defaults to taking all of your files, compressing and encrypting them into 50MB (yes, fifty megabyte) chunks, and uploading those. This reduces the amount of data to download for recovering just a single file or two. Which is absolutely GREAT... when you don't have over five hundred gigabytes of data.

Fun fact: OneDrive does NOT gracefully handle indexing folders with well in excess of 10,000 files. It gets downright cranky about it, in fact - it completely fails to finish indexing in a timely manner. That causes Duplicati to declare that files are missing from the remote storage, fail the backup (because it can't figure out what to increment from), and just sit there. Oh, and apparently I had never set up anything that would actually notify me of a backup failure.
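To put rough numbers on it: at the 50MB default, 500GB of data becomes over ten thousand dblock volumes - and since Duplicati also uploads a small dindex file alongside each dblock, the folder holds roughly twice that many files. The back-of-the-envelope math:

```shell
# Approximate dblock counts for a 500 GB backup at two volume sizes.
total_mb=$((500 * 1024))   # 500 GB expressed in MB

echo "50 MB chunks: $((total_mb / 50)) dblock files"     # plus a dindex for each
echo "1 GB chunks:  $((total_mb / 1024)) dblock files"
```

Either way you slice it, the default puts you well past the point where OneDrive stops indexing gracefully; 1GB chunks keep the file count in the hundreds.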

So, after spending a good hour kicking myself for NOT changing the default (mind you, I had changed it locally - the user systems back up in 1-gigabyte chunks, which the new setup now does as well), I used a different OneDrive login (you get, I think, 7 of them now with Office 365) and set up a separate backup with all the same source directories, just... with 1-gigabyte chunks. Completing the upload took several days. The problem: this was all still on the OMV install. Getting the backup working again was the absolute priority - we had no usable offsite backup while it was broken - so the Docker migration could wait.

I really wish it hadn't. Most of what comes next wouldn't have happened at all.

When I finally got that working, exported the working backup config with its much smaller file count, and got the NAS migrated over, I went to set up the Docker version - using the excellent LinuxServer.io image, of course. That image expects the backup destination (if locally mounted/mapped) at /backups, the configuration directory at /config, and the source files at /source. If you don't provide bind mappings, it automatically creates those as volumes when you deploy it. I had /config and /backup (note the lack of an "s") mapped, and mapped the base zpool folder as /source. That turned out to be fortunate: I wasn't actually using the local backup destination, but it showed me what happens when one of those mounts is missed. Next I went to import. Hoo-boy.
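For reference, the layout described above looks something like this in a compose file. The host paths on the left are placeholders for my setup - yours will differ - and the usual LinuxServer.io PUID/PGID/TZ variables apply:

```yaml
services:
  duplicati:
    image: lscr.io/linuxserver/duplicati:latest
    ports:
      - 8200:8200                        # Duplicati web UI
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=America/New_York
    volumes:
      - /srv/appdata/duplicati:/config   # databases and settings
      - /srv/tank/backups:/backups       # local backup destination (note the "s")
      - /srv/tank:/source                # the data to back up
```

Any of those three container paths you leave unmapped gets silently created as a Docker volume instead - which is exactly the trap I fell into with /backup vs. /backups.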

See, the paths for a lot of this stuff were /srv/<pool-name>/<dataset-name>/, which worked well on the old system. Changing them to /source/<pool-name>/<dataset-name>/ was... not a happy thing. Duplicati sort of choked on it: the database rebuild kept getting stuck and taking forever. (Oh, and to make matters worse, I hadn't grabbed an export of the config from the final day - I used one made three days before the old server went offline.) So I thought I was smart: I edited the export file (which is just JSON) to change srv to source. That didn't help. Pretty sure it didn't break anything, though.
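If you want to attempt the same path rewrite, it's a one-liner against the exported JSON (assuming you exported without encryption). The file below is a made-up, minimal stand-in - a real export has far more fields - but the sed pattern is the same:

```shell
# Stand-in for an unencrypted Duplicati job export (real ones have many more fields).
cat > export.json <<'EOF'
{"Schedule":null,"Backup":{"Name":"NAS to OneDrive","Sources":["/srv/tank/documents/","/srv/tank/photos/"]}}
EOF

# Rewrite every source path from /srv/... to /source/...
sed -i 's#"/srv/#"/source/#g' export.json
cat export.json
```

As noted, this didn't fix my rebuild problem - the paths baked into the local database still mattered - but it's a harmless edit to the export itself.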

It was about this time I noticed the extra volume that had been created (I don't use volumes, since I prefer bind mounts for any number of reasons) and realized I'd misconfigured the mount for the /backups folder. Then it dawned on me that all I really had to do was change where I was mounting the dataset. I hopped into Portainer, used it to get a console, confirmed there was nothing in /srv (as expected), and just mounted the zpool root at /srv, exactly like on the base system. Duplicati still didn't really like it, though, and the database rebuild went on endlessly.

So I cheated. I hooked up the old OS's drive, found the Duplicati configuration databases, and copied them over to the NAS. Note that in a normal Duplicati install outside of Docker, the configs are stored in /home/<username>/.config/Duplicati. The Docker container doesn't do this - it stores them in /config/Duplicati inside the container (so whatever you've mounted as /config will contain a Duplicati folder - and no, that's NOT a hidden .config subfolder). Copying the job's SQLite database file into that directory, along with Duplicati-server.sqlite into the directory one level up, meant that on the next container start everything that would otherwise need rebuilding was already there, and it ran like normal. I did have to go into the Duplicati backup job, under Advanced > Databases, and tell it where the .sqlite file was located (it was still referencing the old system's path), but once I did that everything worked fine. The only hiccup: my /home folder was set up - as Ubuntu usually does it - with a group matching my username, while the backups relied on the group being users. Changing that fixed the issue and we're all good to go.
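The copy itself is just two files in the right places. Everything in this sketch is a stand-in: the temp directories simulate the old drive's ~/.config/Duplicati and the host directory you bind-mount as /config, and CACNLRKOGM.sqlite stands in for the randomly-named per-job database:

```shell
# Stand-ins: OLD simulates the old drive's /home/<user>/.config/Duplicati,
# NEW simulates the host directory bind-mounted as /config in the container.
OLD=$(mktemp -d)
NEW=$(mktemp -d)
mkdir -p "$NEW/Duplicati"
touch "$OLD/CACNLRKOGM.sqlite" "$OLD/Duplicati-server.sqlite"   # fake databases

# The per-job database goes into the Duplicati subfolder...
cp "$OLD/CACNLRKOGM.sqlite" "$NEW/Duplicati/"
# ...and Duplicati-server.sqlite goes one level up, directly in /config.
cp "$OLD/Duplicati-server.sqlite" "$NEW/"
ls -R "$NEW"
```

After that it was just the group-ownership fix on the source side (something along the lines of chgrp -R users on my home directory) and pointing the job at the database's new path in the UI.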

See, I really hate it when people do stupid things. I hate it even MORE when I'm the one doing them. So I figured I'd write this up in the hope it helps someone down the road some day. Lessons learned:

  • Do NOT UNDER ANY CIRCUMSTANCES ignore your backup jobs for ages. Check them regularly. We were fortunate - we left on vacation the day after our last successful backup, and we were gone for two weeks, so we know almost nothing changed in there (certainly nothing was deleted), and ZFS snapshots would have let us recover the rest. Had we had multiple drive failures or a fire, though, those files would have been lost for good. And that is everything - photos of our wedding, our daughter's birth, and my best friend who passed away many years ago now; our tax documents; my wife's novels in progress - you get the point.
  • If you have time to screw it up, you have time to get it right instead. Yes, I needed to get my backup running ASAP. Pulling the Docker image and setting it up on the existing system right then and there would've taken a little longer, true. But then I wouldn't have spent hours fixing the oddities created by converting from local install to Docker - it would've been Dockerized from the start of the new backups.
  • Just because you're not going to use something anymore doesn't mean you should get rid of it right away. I used a spare SSD so I could set the new install up ahead of time. I could've nuked the old install as soon as I was done, or installed over it with a disk image. Because I didn't, I had everything. True, I kept it as a rollback method - new system doesn't work? Just toss the old drive back in! - but that doesn't change the fact that those small pieces were still available afterwards to fix things.
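On the notifications front: Duplicati can email you job results out of the box via per-job advanced options. Something along these lines (server, credentials, and addresses are all placeholders - check the advanced-options list in your version) would have flagged that June failure immediately:

```
--send-mail-url=smtps://smtp.example.com:465
--send-mail-username=nas-alerts@example.com
--send-mail-password=CHANGEME
--send-mail-from=nas-alerts@example.com
--send-mail-to=me@example.com
--send-mail-level=Warning,Error,Fatal
```

Setting the level to Warning,Error,Fatal keeps it quiet on success and loud on exactly the kind of silent failure I hit.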

Now that that's been jotted down for posterity, I'm going to go double-check that the last backup run worked okay, consider ways to send myself notifications, and then go kill some people in Overwatch before bed.