My Backup Infrastructure, 2025 Edition
38 points by runxiyu
I too have managed backups in a somewhat similar fashion. My issue with local backups is the yield on the work seems low.
I have been managing an ever-growing local/remote backup for 16 years, but the number of times I’ve thought “thank god I had that backup” in that entire period is three. It feels like the physical equivalent of storing every piece of junk mail I ever get.
I’m also stressed about the poster’s (and my own) post-death plan for this. It took a lot of thinking to come up with a strategy for how my spouse could actually get the stuff on B2 if I got hit by a truck. First, they’ll need to remember “B2 is where the stuff is backed up” when they see a monthly credit card statement.
Assuming they can get access to the B2 web panel with my post-death password-sharing process, they’ll then need someone to walk them through the encryption. The B2 web panel doesn’t handle any of this well, so they’ll need a nerdy person to pull down and decrypt each object through the API. This is on top of dealing with all the work that comes when someone dies, and their own emotions. Even if I write the script to do it, there’s nobody in my immediate family with the technical skills to even run a shell script.
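For what it’s worth, the script itself wouldn’t be long. A rough sketch, assuming the data sits in a restic repository on B2 as the article describes (every value below is a placeholder I’d have to write down for them):

    #!/bin/sh
    # Emergency restore: pull the latest snapshot out of the B2 repository.
    # Every capitalized value is a placeholder for this household's real one.
    export B2_ACCOUNT_ID="REPLACE_WITH_KEY_ID"
    export B2_ACCOUNT_KEY="REPLACE_WITH_APPLICATION_KEY"
    export RESTIC_REPOSITORY="b2:REPLACE_WITH_BUCKET:backups"
    export RESTIC_PASSWORD_FILE="/path/to/printed-repo-password.txt"

    restic snapshots                          # show what exists
    restic restore latest --target "$HOME/restored-backup"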
I often worry that this entire backup process is largely pointless, sort of a “data hoarding hobby” I gave myself that serves little practical purpose.
Thanks for sharing. I also maintain a daily backup for the last 15+ years and have been very glad I have it many times.
It sounds like you have a low rate of mistakes in your work, e.g. you don’t often delete files you actually still need and then have to restore them?
If that’s the case, maybe the backup can allow you to work a little less hard because you have the extra safety net?
(And if not, yeah, just be less diligent / strict about backups?)
I’m also stressed about the poster’s (and my own) post-death plan for this. It took a lot of thinking to come up with a strategy for how my spouse could actually get the stuff on B2 if I got hit by a truck. First, they’ll need to remember “B2 is where the stuff is backed up” when they see a monthly credit card statement.
This was a concern for me as well. I went the old-fashioned route and wrote a document that spells out each step on the way to recovery. This lives in a secure location that my wife knows about. In addition, I have a close, trusted friend who is far more technical than I am and who also knows of the existence of the list (but hasn’t seen it), and that he may need to help her recover things.
I haven’t been as good at exercising that recovery mechanism; we should probably practice it more often. I’m also not happy with how many account recoveries will likely require her to receive a text message at my phone number (which is problematic if I go out in a car accident or something similar that renders the phone unusable and forces her to first restore service). I’ve moved as many of the recovery codes as I can to TOTP-based techniques, and I have considered moving the phone verification to Google Voice numbers that just point to my phone (and could point to hers), but I’m leery of the service disappearing and being locked out of important accounts.
All that said, I am reasonably confident that if my wife survives me, she can get everything.
While I am not familiar with how these things work in your locality, in the US the way to do this correctly is to pass along instructions for accessing your digital assets through the lawyer you use to set up your will.
Restic is a great tool. It’s fast for my use cases. I tend to double my backups - one restic session to a local restic server and a second paired with rclone and sent off to a cloud provider.
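Roughly, the two passes look like this (the rest-server host, repository paths, and the rclone remote name are placeholders):

    # Pass 1: local repository served by rest-server on the LAN
    restic -r rest:http://backup-host:8000/laptop backup ~/

    # Pass 2: the same data again, through an rclone remote to the cloud provider
    restic -r rclone:cloud:restic-bucket/laptop backup ~/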
Instead of Backblaze I would use something like OVH Cold Archive, which is also S3-compatible and uses magnetic tape behind the scenes: https://www.ovhcloud.com/en/public-cloud/cold-archive/
At a previous workplace, there was a cron job running every working day at 5 PM that started a backup from the servers to a hard drive locked in a safe. The engineering team had a recurring event in the calendar as a reminder, and we would take turns taking the disk and plugging it into the rack a few rooms away. The backup job sent an email when it completed so someone would go fetch the disk. It became kind of a ritual, but it was always nice to know the backups were done and not only in the cloud.
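From memory, the cron side was nothing fancy; something along these lines (the user, script path, and address are invented here for illustration):

    # /etc/cron.d/safe-backup -- every working day at 17:00, then mail the team
    0 17 * * 1-5  backup  /usr/local/bin/backup-to-safe-disk.sh && echo "Please fetch the disk" | mail -s "Daily backup done" engineering@example.com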
How do you correctly combine something like Restic with cold storage?
Restic supports S3-compatible APIs, and so does the OVH Cold Archive offering I linked (https://help.ovhcloud.com/csm/en-public-cloud-storage-cold-archive-getting-started?id=kb_article_view&sysparm_article=KB0047338).
To be transparent, I haven’t tested it, but I would expect it to work as they present it, using the AWS CLI utility like a normal S3 bucket.
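If it does behave like a plain bucket, pointing restic at it should look roughly like this (the endpoint, bucket, and credentials are placeholders following the usual S3 conventions, not something I have verified against OVH):

    # Credentials for the S3-compatible endpoint (placeholders)
    export AWS_ACCESS_KEY_ID="REPLACE_ME"
    export AWS_SECRET_ACCESS_KEY="REPLACE_ME"

    # Initialise the repository once, then back up over restic's s3: backend
    restic -r s3:https://s3.REGION.example-endpoint.net/my-archive-bucket init
    restic -r s3:https://s3.REGION.example-endpoint.net/my-archive-bucket backup ~/documents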
By contrast, my setup is to have all public work duplicated across various independent git hosting providers, and all school/private/privkey/etc. files pushed around a few computers (at school and at home), including but not limited to /srv/git/school.git, which is 30 GB right now. Not the most elegant, but it works.
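The duplication itself is mostly just extra push URLs on one remote, so a single git push hits every host, e.g. (the hosts below are only examples):

    # One `git push` fans out to several independent hosting providers
    git remote set-url --add --push origin git@github.com:me/project.git
    git remote set-url --add --push origin git@codeberg.org:me/project.git
    git remote set-url --add --push origin git@git.example.net:me/project.git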
Similar for me. Work, dotfiles, passwords and my knowledge wiki are generally disseminated across various machines plus at least one source forge. Music, images, and the like are synced among different devices with Syncthing.
And similar to OP, the leftovers of the home folder get backed up with restic to Backblaze on a manual, not-entirely-periodic schedule. :D
[edit] fixed media syncing with syncthing not rsync.
I actually version control all large media files too
What with? I tried this with git long ago and it had some kind of n^2 memory usage issue with large files. Checking in a 1 GB binary file tried to use like 50 GB of memory, which I did not have at the time. Did this get fixed at some point?
I store everything as normal files in the git repo and routinely git gc them. git-annex and git-lfs allow files in the past to be missing/discarded if the annex/lfs backends don’t work correctly; most people would want the annex/lfs behavior, but I intentionally want to check each version of the large binary files into the repo itself.
Not sure what’s effective here, but I have a passing awareness of git-lfs for this: https://git-lfs.com/
Yeah at the time I tried this, git-lfs was relatively new and not supported well. I use it in production now and it does great. <3
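For anyone else reading along, the setup is only a couple of commands (the patterns are whatever you want treated as large binaries):

    git lfs install                      # once per machine
    git lfs track "*.psd" "*.mp4"        # patterns get recorded in .gitattributes
    git add .gitattributes
    git commit -m "Track large binaries with Git LFS"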
I love restic. For those who want to go a little deeper down this rabbit hole, I wrote up my Linux backup setup, which uses restic, rclone, and Backblaze B2, set up a little differently than in this post. I also compared and contrasted it with my proprietary backup setup on macOS using Time Machine and Arq 7, and with my Windows config using Arq 7 alone. Always with Backblaze B2 as the offsite cloud storage, and always with a local USB SSD for the local backup repos, to get 3-2-1 backups everywhere.
I could leave the drives plugged in all the time and run rsync automatically every day, but my MacBook Air doesn’t have enough ports for that, and it also risks propagating data loss to the backups, which defeats the purpose.
I definitely got bit by this one. It wasn’t external drives, but my NAS was the local part of my file backups. I had a cron job calling rsync on a (nightly? it has been a few years now) basis for a number of folders. For various reasons (not a commercial project, low lift, it involved someone else’s personal data because I was voluntarily helping a friend, etc.), these folders each had only a local git repository; I hadn’t pushed them to any forge. Well, the friend I was helping asked a question a few months after I had finished the code for him, and when I looked, I realized rsync had been dutifully copying an empty folder for the last ?? amount of time.
Had I been watching rsync execute, I would have had a better shot at realizing it was attempting to sync 0 bytes and been able to snag my local backup.
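A small guard around the call would have caught it; a rough sketch, with made-up paths:

    #!/bin/sh
    # Nightly sync with two safety rails: refuse an empty source, cap deletions.
    SRC=/home/me/projects/
    DST=nas:/backups/projects/

    # If the source has no regular files at all, something is wrong -- bail out.
    if [ -z "$(find "$SRC" -type f -print -quit)" ]; then
        echo "source $SRC looks empty, refusing to sync" >&2
        exit 1
    fi

    # --max-delete caps how many files rsync may remove on the target per run.
    rsync -a --delete --max-delete=100 --itemize-changes "$SRC" "$DST" >> /var/log/nightly-rsync.log 2>&1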
For self-hosting backups there are more steps involved than just the basic “rsync, restic, backblaze” combo. I follow a similar setup:
- I use Syncthing to keep pictures mirrored from my phone to my NAS.
- I use a central NAS to act as my local source of truth.
- I use a secondary RAID0 (2-disk) array in a PC to act as a local backup in case the NAS dies.
- I use systemd timers for backing up self-hosted applications, doing DB dumps, picture backups, etc. (sketch below).
- I use autorestic to back up various directories and targets.
- I use Backblaze with restic as my remote backup storage solution.
- I’ve written backup/restore scripts which I’ve exercised periodically to actually restore the data.
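One of the timer/service pairs looks roughly like this (unit names and paths are simplified stand-ins):

    # /etc/systemd/system/app-db-dump.service
    [Unit]
    Description=Dump the self-hosted app database ahead of the backup run

    [Service]
    Type=oneshot
    ExecStart=/usr/local/bin/dump-app-db.sh

    # /etc/systemd/system/app-db-dump.timer
    [Unit]
    Description=Nightly database dump

    [Timer]
    OnCalendar=*-*-* 02:00:00
    Persistent=true

    [Install]
    WantedBy=timers.target

    # enable with: systemctl enable --now app-db-dump.timer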
It’s been a lot of work that I really don’t enjoy, but the results give me confidence that I’ll never lose data. I’ve suffered various NAS failures and data losses before. What I didn’t care about losing or plan for ended up hurting when I lost it!
Hi, I just added compression + encryption support to s3m (https://github.com/s3m/s3m/) for streaming backups to S3, but it breaks resumable uploads when enabled. I’m still looking for a clean way to handle this. I found your post while researching, and would love it if you gave it a try and shared feedback.
It should work with Backblaze (https://s3m.stream/config.html).