rsync's defaults are not always enough
21 points by l0b0
May I plug two of my tools? ;)
https://github.com/laktak/chkbit allows you to make hashes and test them with one source (rsync needs two)
https://github.com/laktak/rsyncy gives you a progress bar for rsync
The venerable mtree should be better known! It’s used in the BSD system builds to construct the standard directory hierarchy, but it can also generate and verify checksums. It’s mildly annoying to generate checksums because it isn’t the default, so you have to use a command like,
mtree -c -k mode,uname,gname,time,size,sha256
Sadly the version in Mac OS is antediluvian and only supports dodgy hash functions. (Fortunately when comparing existing files the relevant attack is second preimage so sub-2⁶⁴ collision attacks aren’t catastrophic.)
(The slightly weird security discussion in the mtree man page is basically saying, you can use mtree instead of tripwire – tripwire was created in response to a malicious intrusion, so it’s very 1990s cypherpunk threat modelling.)
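The single-source idea behind chkbit and mtree's checksum mode (record hashes once, re-check the same tree later) can be sketched in a few lines of Python. This is only an illustration of the idea, not how either tool is implemented; the function names `sha256_of`, `build_manifest` and `verify_manifest` are made up for this example.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each regular file's path (relative to root) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only hash regular files.
            if os.path.isfile(full) and not os.path.islink(full):
                manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def verify_manifest(root, manifest):
    """Return sorted relative paths that are missing or whose digest changed."""
    current = build_manifest(root)
    return sorted(
        path for path, digest in manifest.items()
        if current.get(path) != digest
    )
```

Calling `build_manifest` once and storing the result, then running `verify_manifest` against the same directory later, flags files whose content changed or that disappeared. It does not report newly added files.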
rsyncy’s README mentions --info=progress2 but doesn’t have a comparison with it. Why would I want rsyncy if I use that option already? Is it essentially different formatting?
It really is a progress bar for rsync
[########################::::::] 80% | 19.17G | 86.65MB/s | 0:03:18 | #306 | scan 46% (2410)
rsync will output the progress in between the list of files - rsyncy is just a wrapper for rsync, it will hide those progress lines and render a status/progress bar instead.
Ah right, progress2 doesn’t do a progress bar but a percentage of the (currently known) total.
This post is about how rsync trusts the mtime and size of files to decide whether they have changed, which, as the author notes, can easily be wrong. The usual culprit is someone being clever and backdating files with touch.
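To make the failure mode concrete, here is a small Python sketch of a size-plus-mtime comparison next to a content comparison. It only approximates the check described above (rsync's real logic is in C and has knobs such as --modify-window, and its --checksum option compares content instead); `quick_check_same` and `content_same` are hypothetical names for this example.

```python
import hashlib
import os


def quick_check_same(path_a, path_b):
    """Treat two files as unchanged if size and mtime (whole seconds) match."""
    a, b = os.stat(path_a), os.stat(path_b)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def file_digest(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def content_same(path_a, path_b):
    """Compare two files by content hash, ignoring all metadata."""
    return file_digest(path_a) == file_digest(path_b)
```

Two files of equal length whose mtime has been reset to the same value (for example with touch) pass `quick_check_same` even though `content_same` reports a difference.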
But -a doesn’t do lots of things you may need. The big one is hard links: you have to use -H if you want hard links at the destination as well. --sparse is sometimes essential if you’re dealing with large files with lots of empty parts. There’s also -X (for “extended attributes”) and -A (for ACLs) for unusual filesystems.
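One way to find out whether the hard-link caveat applies to a given tree is to look for regular files that share an inode. The Python sketch below does that; it is a rough check on a POSIX filesystem, not anything rsync itself runs, and `find_hard_link_groups` is a name invented for this example.

```python
import os
import stat
from collections import defaultdict


def find_hard_link_groups(root):
    """Return lists of paths under root that are hard links to the same file."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            info = os.lstat(full)  # lstat: do not follow symlinks
            # Only regular files with more than one link are candidates.
            if stat.S_ISREG(info.st_mode) and info.st_nlink > 1:
                groups[(info.st_dev, info.st_ino)].append(full)
    # Keep only groups with at least two names inside this tree.
    return sorted(sorted(paths) for paths in groups.values() if len(paths) > 1)
```

If this returns a non-empty list, copying the tree without -H would turn each group into independent copies at the destination.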
And this solves a different problem, but --partial --append is helpful if you’re copying big files over a slow medium.
I lost a lot of faith in rsync when I found out that if you control the path someone passes to it to run on the remote server (i.e. rsync username@server:'user_supplied_string' ...), you get instant RCE. The user_supplied_string is evaluated, unquoted, in username’s default shell.
This interacts particularly badly with fish-shell and filenames with (...) in them, because if you do rsync user@server:'file (yes)' it will run the yes binary on the server.
Oh, and I’ve just discovered that in the 13 hours since I ran that command to check the syntax, I’ve had yes running in the background on my server because apparently killing rsync with ctrl-c doesn’t kill the ssh children properly.
It’s been eating half a core of CPU this whole time :(
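For scripts that have to pass an untrusted path to rsync, one possible mitigation is rsync's -s option (documented as --protect-args, and as --secluded-args in newer releases), which sends filenames to the remote rsync without letting the remote shell interpret them. The Python sketch below builds such a command as an argument list; it assumes an rsync build that supports -s, and `build_rsync_pull_argv`, `alice` and `example.org` are placeholders for this example, not anything from the thread.

```python
def build_rsync_pull_argv(user, host, remote_path, local_dest):
    """Build an rsync argv that keeps remote_path away from the remote shell.

    Returned as a list so it can be run without a local shell either,
    e.g. subprocess.run(argv, check=True).
    """
    if not remote_path:
        raise ValueError("remote_path must not be empty")
    return [
        "rsync",
        "-s",  # assumption: this rsync supports -s / --protect-args
        "--",  # end of options, so a dest starting with "-" is not a flag
        f"{user}@{host}:{remote_path}",
        local_dest,
    ]


# Hypothetical usage: the awkward filename stays one literal argument.
argv = build_rsync_pull_argv("alice", "example.org", "file (yes)", "/tmp/dest")
```

Note that with -s the remote rsync still expands wildcards in the path itself, so this addresses shell evaluation rather than every surprise a hostile filename can cause.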
I’ve been bitten by this one amongst others (I used to use rsync --link-dest… as a backup tool), and it’s why rsync isn’t a nice tool to do backups with. It’s just too easy to shoot yourself in the foot with it.
I now reach for a dedicated backup tool instead :) - my personal favourite is borgbackup, but restic and others also exist.