Traversal-resistant file APIs
21 points by freddyb
I wonder… does something like that exist in Rust? I wouldn’t mind starting a crate like that as a side project :)
There are a couple of crates that do this. The most popular is cap_std. It has a basic comparison of some other crates that do this too. You can also call openat/openat2 directly if you only need to run on Linux.
I would definitely like to see this incorporated into the standard library though, it’s a pretty great extra security feature especially for things like web servers.
Could someone help draw up an example of an attack that this would (uniquely or best) prevent? I don’t think I really understand this.
This is what I can understand so far:
In my own limited personal experience, ToCToU errors are handled by eliminating the timing gap. A simple example in Python involves switching from “LBYL”-style to “EAFP”-style.
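For readers who know the LBYL/EAFP distinction from Python, here is a rough Go equivalent. It is only a sketch: the helper names (readLBYL, readEAFP, demo) are made up for illustration, and it says nothing about how os.Root itself works.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// readLBYL "looks before it leaps": it checks that the file exists, then
// reads it. Another process can remove or replace the file between the two
// calls, so the check guarantees nothing about the read.
func readLBYL(path string) ([]byte, error) {
	if _, err := os.Stat(path); err != nil {
		return nil, err
	}
	// Timing gap: the file system may change here.
	return os.ReadFile(path)
}

// readEAFP attempts the read directly and interprets the error afterwards,
// so there is no separate check that can go stale. found is false when the
// file does not exist.
func readEAFP(path string) (data []byte, found bool, err error) {
	data, err = os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return nil, false, nil
	}
	if err != nil {
		return nil, false, err
	}
	return data, true, nil
}

// demo runs readEAFP on one existing and one missing file in a temporary
// directory and reports whether each was found.
func demo() (foundExisting, foundMissing bool, err error) {
	dir, err := os.MkdirTemp("", "eafp-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	present := filepath.Join(dir, "present.txt")
	if err := os.WriteFile(present, []byte("hello"), 0o644); err != nil {
		return false, false, err
	}
	if _, foundExisting, err = readEAFP(present); err != nil {
		return false, false, err
	}
	_, foundMissing, err = readEAFP(filepath.Join(dir, "absent.txt"))
	return foundExisting, foundMissing, err
}

func main() {
	fmt.Println(demo())
}
```

The EAFP form removes the gap between "does it exist?" and "read it", but it does not help when the question is "where does this path actually end up?", which is the case the post is about.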
Therefore, it seems to me that this mitigation assumes that the timing gap cannot be eliminated. The usage must occur after the check, so it then becomes necessary to harden the check. Presumably, the mechanism for checking must exist (probably by resolving to an absolute path and checking the relative location of this path against an allowable location). The post mentions that this uses openat(int dirfd, const char *pathname, int flags) on Linux. The code shows that it uses O_NOFOLLOW to disallow symlinks in the pathname. Of course, openat does not require that the pathname be within the directory; it will still follow .. if provided.
So it seems like the use of this is that you create an os.Root with some validated root path, you are (still) responsible for checking that all paths passed to func (*Root) Open are relative and do not include .., and you are guaranteed that a malicious attacker cannot swap a symlink in the timing gap to effect a read or a write to a disallowed location.
This seems like a lot of work for a fairly minimal mitigation. I’m not familiar enough with these types of attacks to tell if the malicious user could use a hardlink instead of a symlink and bypass this mitigation. Additionally, the root path also has to be validated to avoid symlink attacks, so it seems like there is a ToCToU opportunity unless this is a daemon process that can create the os.Root prior to encountering the malicious user. (If we were willing to disallow symlinks on both the parent directory and target file, then it occurs to me that we would simply open the entire location with O_NOFOLLOW.) Presumably, the authors do not believe we can eliminate the ToCToU opportunity by first opening the path, then using a mechanism like readlink on /proc to see where the opened file descriptor is, and validating after the fact. Why can’t we just resolve the target path to an absolute path and then open the resulting absolute path with O_NOFOLLOW?
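One relevant detail for the "resolve, then open with O_NOFOLLOW" idea: per open(2), O_NOFOLLOW only refuses a symlink in the final path component, so a directory earlier in the path can still be a symlink (or be swapped for one after the path was resolved). A minimal sketch for Unix-like systems, with hypothetical helper names (openNoFollow, nofollowDemo):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// openNoFollow opens path read-only with O_NOFOLLOW and reports whether the
// open succeeded.
func openNoFollow(path string) bool {
	f, err := os.OpenFile(path, os.O_RDONLY|syscall.O_NOFOLLOW, 0)
	if err != nil {
		return false
	}
	f.Close()
	return true
}

// nofollowDemo builds a scratch tree with one symlink to a file and one
// symlink to a directory, then tries to open through each with O_NOFOLLOW.
func nofollowDemo() (finalLinkOpened, middleLinkOpened bool, err error) {
	base, err := os.MkdirTemp("", "nofollow-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(base)

	realDir := filepath.Join(base, "real")
	if err := os.Mkdir(realDir, 0o755); err != nil {
		return false, false, err
	}
	target := filepath.Join(realDir, "file")
	if err := os.WriteFile(target, []byte("data"), 0o644); err != nil {
		return false, false, err
	}
	// filelink -> real/file (symlink is the final component when opened).
	if err := os.Symlink(target, filepath.Join(base, "filelink")); err != nil {
		return false, false, err
	}
	// dirlink -> real (symlink is an intermediate component when opened).
	if err := os.Symlink(realDir, filepath.Join(base, "dirlink")); err != nil {
		return false, false, err
	}

	finalLinkOpened = openNoFollow(filepath.Join(base, "filelink"))
	middleLinkOpened = openNoFollow(filepath.Join(base, "dirlink", "file"))
	return finalLinkOpened, middleLinkOpened, nil
}

func main() {
	fmt.Println(nofollowDemo())
}
```

The open through filelink is refused, while the open through dirlink/file succeeds. Walking the path one component at a time relative to a directory file descriptor is the usual way to close that hole.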
On Linux, I would solve this problem using a mount namespace, perhaps with a tmpfs. This wouldn’t work for filesystem locations that the process should have access to but the user should not (e.g., configuration files), but only if those files are watched or opened in the course of running. (If they are opened at startup, then we exec into the mount namespace.)
These methods all accept filenames relative to the root, and disallow any operations that would escape from the root either using relative path components ("..") or symlinks.
So no, you don’t have to check for that yourself.
That’s what I thought, but I couldn’t spot it easily in the code-path.
I looked again and…
root.go:r.OpenFile leads to root_unix.go:rootOpenFileNoLog. It is root_openat.go:doInRoot that appears to do normalisation of the path. Honestly, I’m struggling trying to quickly skim how that code works, but it is clear that this is where it returns an error if the path escapes the parent. The sanitisation seems a bit rough, however, since root, _ := os.OpenRoot("./allowed") followed by root.Open("../allowed/some-file") will fail, claiming the path escapes from the parent directory.
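For anyone who wants to reproduce this, a small sketch follows. It assumes Go 1.24 or later (where os.OpenRoot exists); the tryOpen helper and the scratch directory layout are made up for the demo.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// tryOpen creates a scratch directory containing allowed/some-file, opens
// "allowed" as an os.Root, and returns the error (if any) from opening name
// through that root.
func tryOpen(name string) error {
	base, err := os.MkdirTemp("", "root-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(base)

	allowed := filepath.Join(base, "allowed")
	if err := os.Mkdir(allowed, 0o755); err != nil {
		return err
	}
	if err := os.WriteFile(filepath.Join(allowed, "some-file"), []byte("hi"), 0o644); err != nil {
		return err
	}

	root, err := os.OpenRoot(allowed)
	if err != nil {
		return err
	}
	defer root.Close()

	f, err := root.Open(name)
	if err != nil {
		return err
	}
	return f.Close()
}

func main() {
	// A plain relative name inside the root is accepted.
	fmt.Println("some-file:", tryOpen("some-file"))
	// Leaving the root and coming back in by name is rejected.
	fmt.Println("../allowed/some-file:", tryOpen("../allowed/some-file"))
}
```

The first call returns nil and the second returns an error, matching the behaviour described above.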
Overall, I’m still struggling to understand the motivation.
In case this helps: that path does escape from the parent directory, in cases where the parent directory was renamed.
The motivation given in the post is “[a]n unarchiving utility that extracts a tar or zip file may be induced to extract a symbolic link and then extract a file name that traverses that link”; such an unarchiving utility may reasonably trust the user provided Root.
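To make the archive case concrete, here is a sketch of that scenario without a real tar or zip file. The two "archive entries" are simulated by hand, all names (extractDemo, dest, outside, link, evil.txt) are invented for the example, and it assumes Go 1.24+ on a system where creating symlinks is permitted.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// exists reports whether path is present, without following a final symlink.
func exists(path string) bool {
	_, err := os.Lstat(path)
	return err == nil
}

// extractDemo simulates extracting two archive entries into dest:
//  1. a symlink "link" pointing at ../outside
//  2. a regular file "link/evil.txt"
//
// It tries entry 2 first with plain path joining, then through an os.Root.
func extractDemo() (naiveEscaped, rootBlocked bool, err error) {
	base, err := os.MkdirTemp("", "extract-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(base)

	dest := filepath.Join(base, "dest")
	outside := filepath.Join(base, "outside")
	for _, d := range []string{dest, outside} {
		if err := os.Mkdir(d, 0o755); err != nil {
			return false, false, err
		}
	}

	// Entry 1: a symlink inside dest that points out of it.
	if err := os.Symlink("../outside", filepath.Join(dest, "link")); err != nil {
		return false, false, err
	}
	victim := filepath.Join(outside, "evil.txt")

	// Entry 2, naive extraction: join the name onto dest and write.
	// The kernel follows "link", so the file lands in outside/.
	naivePath := filepath.Join(dest, "link", "evil.txt")
	if err := os.WriteFile(naivePath, []byte("x"), 0o644); err != nil {
		return false, false, err
	}
	naiveEscaped = exists(victim)
	if naiveEscaped {
		if err := os.Remove(victim); err != nil {
			return false, false, err
		}
	}

	// Entry 2 again, this time through an os.Root opened on dest.
	root, err := os.OpenRoot(dest)
	if err != nil {
		return false, false, err
	}
	defer root.Close()

	f, createErr := root.Create("link/evil.txt")
	if createErr == nil {
		f.Close()
	}
	rootBlocked = createErr != nil && !exists(victim)
	return naiveEscaped, rootBlocked, nil
}

func main() {
	fmt.Println(extractDemo())
}
```

The extractor never sees a ".." in the file name it is asked to write, so a lexical check on "link/evil.txt" passes; the escape only appears once the symlink is resolved, which is the step Root controls.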