Security advisory for Cargo
41 points by freddyb
I was recently working on a program that writes out a bunch of directories and files, and learned how tricky it is to avoid following symlinks on files that already exist in the destination (which would make you write to an arbitrary location). You want to follow symlinks for the initial path the user gives, but never after that, using O_NOFOLLOW (or RESOLVE_NO_SYMLINKS if you’re passing an entire subpath instead of just one more component). But there is no way to replace an existing symlink in one step, so to do that atomically you need to create a temporary file and then rename it.
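For anyone who wants to see the shape of that, here is a minimal sketch of the "temporary file, then rename" approach, assuming a Unix-like system. The function name `write_file_no_follow` and the temporary naming scheme are made up for illustration. The destination directory is passed as an already-open file descriptor, so only the final path component is resolved here.

```python
import os


def write_file_no_follow(dir_fd: int, name: str, data: bytes) -> None:
    """Write `data` to `name` inside the directory open at `dir_fd`.

    If `name` already exists as a symlink, the link itself is replaced;
    its target is never opened or written to.
    """
    # Only accept a single path component, so no intermediate directory
    # (which could itself be a symlink) is ever traversed.
    if "/" in name or name in ("", ".", ".."):
        raise ValueError(f"not a single path component: {name!r}")

    # Hypothetical temp-name scheme; a real tool would want a random suffix.
    tmp_name = f".{name}.{os.getpid()}.tmp"

    # O_EXCL makes creation fail if the temp name already exists (including
    # as a symlink); O_NOFOLLOW is an extra guard on the final component.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL | os.O_NOFOLLOW
    fd = os.open(tmp_name, flags, 0o644, dir_fd=dir_fd)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # rename() replaces the destination entry atomically and does not
        # follow a symlink sitting at the destination name.
        os.rename(tmp_name, name, src_dir_fd=dir_fd, dst_dir_fd=dir_fd)
    except BaseException:
        # Clean up the temp file if the write or the rename failed.
        os.unlink(tmp_name, dir_fd=dir_fd)
        raise
```

The caller would open the destination with something like `os.open(path, os.O_RDONLY | os.O_DIRECTORY)` and pass that descriptor in. This sketch does not cover walking nested subdirectories safely, which needs the same care at every component.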
Just checked the CVE now and, yup, it’s symlink related. I sometimes wonder if symlinks were a mistake.
Tar is kind of notorious for this. I remember the Julia tar package was written carefully to avoid these surprising behaviours (and it deliberately does not support all the features of the format)
I suspected problems like these. For https://lib.rs I need to process the tarballs, but I don't write any of these files to disk. Everything from crate tarballs is processed streamed in memory.
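To illustrate the general idea (this is not the lib.rs code, just a rough sketch in Python with a made-up function name), a gzip-compressed tarball can be read as a stream and its regular files collected in memory, so link entries never reach the filesystem:

```python
import io
import tarfile


def read_members_in_memory(tarball: bytes) -> dict[str, bytes]:
    """Return {member name: contents} for the regular files in a .tar.gz.

    Nothing is written to disk. Symlinks, hard links, directories and
    device entries are skipped rather than resolved.
    """
    contents: dict[str, bytes] = {}
    # "r|gz" opens the archive in pure streaming mode: members are read
    # strictly in order, with no seeking in the underlying stream.
    with tarfile.open(fileobj=io.BytesIO(tarball), mode="r|gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # ignore anything that is not a regular file
            extracted = archive.extractfile(member)
            if extracted is not None:
                contents[member.name] = extracted.read()
    return contents
```

Member names are kept as opaque dictionary keys here, so a name like `../../etc/passwd` is just a string and is never interpreted as a path.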
This is the big advantage of zip over tar as a container format. Tar is the tape archive format. It is designed for streaming access and, in the worst case, you have to process the entire archive to read any given file. This is compounded by compression, which is layered on top using stream compressors that are not aware of the archive's structure.
In contrast, zip files are designed to allow individual files to be extracted individually. The down side of this is that compression is per-file and doesn’t take advantage of any redundancy between files (which would be common in source code, where you have identifiers exported from one file and consumed in another). Some newer zip-like formats (and maybe retrofitted to zip?) have the option of a global dictionary that can be shared between all files to reduce space.
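As a small illustration of that random access, here is a sketch using Python's standard `zipfile` module (the function name is made up for the example). One member is looked up through the central directory and decompressed on its own, without touching the other members or the filesystem:

```python
import io
import zipfile


def read_one_member(archive_bytes: bytes, name: str) -> bytes:
    """Return the contents of a single member of an in-memory zip archive."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        # The central directory maps names to offsets, and each member is
        # compressed independently, so only this one entry is decompressed.
        return archive.read(name)
```

`archive.read` raises `KeyError` if the name is not in the archive.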
The secondary benefit of something like zip, which doesn’t require extracting to the filesystem to use, is that you can take advantage of filesystem features that don’t exist on the host, most importantly case sensitivity, but also potentially symlinks (Windows has them, but you must enable developer mode to use them. Zip doesn’t have them, but some newer zip-inspired formats do).
This was one of the things I liked about Java. Your source code had a directory layout that mirrored the namespace and class structure. The compiled form could use the same layout on disk, but could also be a zip file (with the .jar extension) containing the same structure. The zip file was easy to distribute and meant that you could use long file names even on Windows 3.x because the zip file’s contents were never copied out to the filesystem.
It would be nice if more new languages could copy the nice bits of that design.
I'm surprised that no advisory was created on https://rustsec.org for tar.
There’s now a PR open to add them: https://github.com/rustsec/advisory-db/pull/2737
I thought that {tar folks, Rust security team} would have done so right after publishing, but better late than never :-)