Continuous Snapshotting Filesystem
18 points by natfu
I love the idea, but never got to actually use it. It is meant to be bulletproof, but from what I read there are a few quite rare conditions that can happen and break it. The problem is that there is no fsck. It is mentioned as a problem even in the man page for nilfs-tune.
But the idea of a relatively simple file system that is mature, already in the kernel, and implements continuous snapshotting is very alluring.
I remember reading years ago that NILFS2 fell in the middle order of magnitude for time to first bug in the 2016 presentation about filesystem fuzzing [0].
Time to first bug:
Hours:
Minutes:
Seconds:
I need to check if someone redid the experiment later. Though I would hope that most filesystems are more robust 10 years later.
The last time I tried using nilfs2 (precompiled into the kernel by the distribution, iirc) I would run into (different) panics left and right. I loved the idea, but the execution left a lot to be desired.
I used to have Zrepl make a snapshot of every ZFS dataset on my computer every single minute. Had to stop tho cause Zrepl always snapshots each dataset one at a time, instead of doing one big snapshot per pool, so sometimes it would take more than a minute to go thru all my datasets...
I'm curious, do you have any long-term backup/archival system set up in parallel to zrepl? If yes, then how? I was looking into using zrepl for a hybrid snapshot-backup setup (i.e. make snapshots every N, then turn every M-th snapshot into an actual backup), but I was told that zrepl does not support any kind of external hooks or integrations so I gave up on it and kept using my own homegrown solution.
The "one at a time problem" sounds fixable, but I don't know if I should spend any more time on zrepl.
zrepl supports replicating to another zrepl instance or to another pool. For VMs, I use it to pull snapshots to another machine.
For my NAS, I use a slightly more complicated setup. I have a remote machine that exposes a zvol over iSCSI over Wireguard. The remote machine uses zrepl to do decaying snapshots of that zvol.
On the NAS, I create a zpool over GELI (full-disk encryption). I then use zrepl to replicate snapshots to that pool. This (assuming I did it correctly) has the following properties:
The bit of this that’s relevant to your question: you can use zrepl to back up to anything that can look like a block device locally. Most cloud storage things have CUSE wrappers that can expose them as block devices (or as filesystems that you can do a loopback mount from), so the same model would work. If the storage can also do snapshotting (again, most cloud storage things can via external policy) then this also protects you from compromise of the client.
For my NAS, I use a slightly more complicated setup. I have a remote machine that exposes a zvol over iSCSI over Wireguard. The remote machine uses zrepl to do decaying snapshots of that zvol.
On the NAS, I create a zpool over GELI (full-disk encryption). I then use zrepl to replicate snapshots to that pool.
OK, so you expose remote storage as a network block device, then create ZFS over that block device, and replicate datasets onto that ZFS? And then the remote storage itself is a ZFS instance which is snapshotted independently. Is that correct?
you can use zrepl to back up to anything that can look like a block device locally. Most cloud storage things have CUSE wrappers that can expose them as block devices (or as filesystems that you can do a loopback mount from), so the same model would work
I use borg (which is a file-level deduplicating archiver) to store my long-term backups. I do not use cloud storage or anything block-device-shaped, nor would I want to: creating a ZFS over a remote block device (esp. an actual cloud storage exposed as a block device!) just to replicate snapshots onto it sounds horribly wasteful and inefficient, not to mention that it would never work fast enough to be practically useful.
Random 128K I/O against S3 or B2 object storage was what, 5-10 IOPS at most when I last tried it? That would never fly.
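To put that figure in perspective, here is a rough back-of-the-envelope calculation. It assumes every operation moves exactly one 128 KiB record and that nothing is batched or cached; both are simplifying assumptions, not measurements, and the function names are made up for illustration.

```python
RECORD_SIZE_BYTES = 128 * 1024  # one 128 KiB record per I/O (assumed)


def throughput_mib_per_s(iops, record_size=RECORD_SIZE_BYTES):
    """Sustained throughput in MiB/s for a given number of I/Os per second."""
    return iops * record_size / (1024 * 1024)


def hours_to_transfer(total_gib, iops):
    """Hours needed to move total_gib GiB at the given IOPS."""
    seconds = total_gib * 1024 / throughput_mib_per_s(iops)
    return seconds / 3600


if __name__ == "__main__":
    for iops in (5, 10):
        print(iops, "IOPS:",
              throughput_mib_per_s(iops), "MiB/s,",
              round(hours_to_transfer(100, iops), 1), "hours per 100 GiB")
```

Under these assumptions 5-10 IOPS works out to roughly 0.6-1.25 MiB/s, so replicating 100 GiB would take on the order of one to two days.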
I'm curious, do you have any long-term backup/archival system set up in parallel to zrepl?
Every snapshot got replicated to my NAS, and once a day my NAS would place a hold on the newest snapshots, mount them, then run Kopia to back up everything valuable enough to an object storage provider. Since my NAS exploded, I only use Zrepl for local snapshots, and have Kopia on my PC just take its own snapshots before backing up to the cloud.
but I was told that zrepl does not support any kind of external hooks
The "one at a time problem" sounds fixable,
Zrepl does support running commands before and after snapshots are taken. In fact, that's where the one at a time problem comes in. The hooks can filter down the list of datasets they run before and after, and for a long time the devs didn't know how to handle that while making multiple snapshots in one go (and they seemed a bit skeptical of the idea that anyone would actually want the most obvious behavior of "do all pre-hooks, do all snapshots, do all post-hooks").
Merged into master there is now an option to use zfs snapshot -r to make a recursive snapshot from a particular dataset and all its descendants, which of course doesn't work if you want to filter out some of those descendants. I don't know why they don't just pass multiple datasets to the command instead, since that's just as atomic.
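As a rough sketch of that alternative: `zfs snapshot` accepts several `dataset@snapname` arguments in one invocation and takes them atomically (as far as I know they all have to live in the same pool). The helper below only builds such a command for a filtered list of datasets. The dataset names, the `exclude` filter, and the function name are made up for illustration, and this is not how zrepl does it.

```python
import shlex


def build_snapshot_command(datasets, snap_name, exclude=()):
    """Return an argv list for ONE `zfs snapshot` call that covers every
    dataset not listed in `exclude`."""
    excluded = set(exclude)
    selected = [d for d in datasets if d not in excluded]
    if not selected:
        raise ValueError("no datasets left to snapshot")
    # The pool is the first path component of a dataset name.
    pools = {d.split("/", 1)[0] for d in selected}
    if len(pools) != 1:
        raise ValueError("datasets span several pools: %s" % sorted(pools))
    return ["zfs", "snapshot"] + ["%s@%s" % (d, snap_name) for d in selected]


if __name__ == "__main__":
    argv = build_snapshot_command(
        ["tank/home", "tank/home/cache", "tank/vm"],  # hypothetical datasets
        "minutely-0001",                              # hypothetical name
        exclude=["tank/home/cache"],
    )
    # Print instead of executing; pass argv to subprocess.run to run it.
    print(shlex.join(argv))
```

Unlike `-r`, this keeps the filtering: excluded descendants are simply left out of the argument list.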
and once a day my NAS would place a hold on the newest snapshots, mount them, then run Kopia to back up everything valuable enough to an object storage provider
Did you implement that by hand, or is that integrated into zrepl somehow?
What I'm looking for is something zrepl-shaped, but with one change: after it runs all relevant scheduling/snapshotting/thinning-out steps and comes up with the final list of snapshots to act upon, I want an option to just give these snapshots to an arbitrary external command in lieu of doing regular replication.
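A minimal sketch of that idea outside zrepl, assuming the final snapshot list is already available as plain `dataset@name` strings sorted oldest to newest. The `every_mth` and `hand_off` helpers and the `backup_cmd` argument are hypothetical names; nothing here is a zrepl feature.

```python
import subprocess


def every_mth(snapshots, m):
    """Pick every m-th snapshot (the m-th, 2m-th, ...) from an ordered list."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return snapshots[m - 1::m]


def hand_off(snapshots, m, backup_cmd, dry_run=True):
    """Build one external command per selected snapshot, appending the
    snapshot name as the last argument. Runs them only if dry_run is False."""
    commands = [list(backup_cmd) + [snap] for snap in every_mth(snapshots, m)]
    if not dry_run:
        for argv in commands:
            subprocess.run(argv, check=True)
    return commands


if __name__ == "__main__":
    snaps = ["tank/home@%04d" % i for i in range(1, 8)]  # made-up names
    for argv in hand_off(snaps, 3, ["echo", "would back up"]):
        print(argv)
```

Selecting by position only works if the list is the complete history; a real setup would have to remember which snapshots were already backed up.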
The Kopia thing was completely separate from Zrepl. I just had Kopia scheduled to run once a day with cron, and Kopia was set up to act like that.
I want an option to just give these snapshots to an arbitrary external command in lieu of doing regular replication.
If you just make a snapshotting job without replication, shouldn't the snapshot post-hook in Zrepl be usable for this? https://zrepl.github.io/v0.6.1/configuration/snapshotting.html#command-hooks
In GEFS, I'm not quite as aggressive, but I do take a snapshot once a minute by default.
It's saved my ass over and over again.