Linux Kernel Rust Code Sees Its First CVE Vulnerability
69 points by weinzierl
I’ve found a useful side-effect of Rust in the kernel to be the resulting documentation changes that better define the behaviour of other kernel subsystems - kind of like how adding new architectures to the kernel helps strengthen abstractions/assumptions during the development of new bindings.
In this case, a subsequent documentation-update commit, "rust: list: add warning to List::remove docs about mem::take", helped clarify the linked list footgun that led to this bug.
Sidenote: I personally found @lina’s amazing work in the DRM subsystem fascinating to follow, as pushing invariants into the Rust type system unearthed many invariants that were not previously explicitly documented.
This is what the whole "Rust doesn't prevent logic bugs" thing misses -- it certainly doesn't prevent all logic bugs, but it gives you the tools to eliminate many classes of logic bugs through construction and encapsulation, and it also (for the most part) lets you scale up local reasoning to global correctness. These are all very valuable at preventing bugs in practice.
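As a minimal userspace sketch of "through construction and encapsulation" (not kernel code; `Percent` and `remaining` are hypothetical names chosen for illustration), a private field plus a validating constructor means an out-of-range value cannot reach downstream code at all:

```rust
mod percent {
    /// A value guaranteed to be in 0..=100. The field is private, so the
    /// only way to obtain a `Percent` outside this module is `Percent::new`.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct Percent(u8);

    impl Percent {
        /// Returns `None` for anything above 100.
        pub fn new(value: u8) -> Option<Percent> {
            if value <= 100 {
                Some(Percent(value))
            } else {
                None
            }
        }

        pub fn get(self) -> u8 {
            self.0
        }
    }
}

use percent::Percent;

/// No range check is needed here: the type already rules out bad input.
fn remaining(done: Percent) -> u8 {
    // Cannot underflow, because `done.get() <= 100` holds by construction.
    100 - done.get()
}

fn main() {
    let done = Percent::new(30).expect("30 is within range");
    println!("remaining: {}", remaining(done));
    assert!(Percent::new(101).is_none());
}
```

The range check lives in one place, and every function that takes a `Percent` can rely on it locally, which is the "local reasoning scales to global correctness" point in miniature.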
The article is a bit thin, so here is the mailing list thread.
A certain other site had an appeal mechanism to have the more authoritative link swapped in if one submitted an aggregator link that happened to get traction. If there is such an appeal mechanism here, I'd encourage you to use it, because that mailing list post contained everything I think a reasonable person would want to know about the vuln. I don't see a competing submission that could even be merged into this thread https://lobste.rs/domains/lore.kernel.org
Note the other 159 kernel CVEs issued today for fixes in the C portion of the codebase
Also:
the offending issue just causes a crash, not the ability to take advantage of the memory corruption, a much better thing overall
Note the other 159 kernel CVEs issued today for fixes in the C portion of the codebase
I'm not saying that Rust code isn't less likely to lead to vulnerabilities than C code, but if this note is intended to evidence that it is, it's flawed. What proportion of C code to Rust code exists in the kernel? I'm guessing it could well be above 159:1.
I don't know much about how CVEs are issued, but it sounds like that's only the number that happened to fall on the same day as this first Rust CVE. What about all the previous days that Rust has been in the kernel without a CVE? Surely you'd have to add those, and then the ratio might look quite different.
Indeed. My point was that taking the single-day figures and extrapolating anything about the safety of Rust code in comparison to C based on just the raw counts is already flawed. I did not say that Rust actually causes more CVEs than C after accounting for the LOC metric over any longer period of time. Again:
I'm not saying that Rust code isn't less likely to lead to vulnerabilities than C code
Hasn't Rust been marked experimental? I thought they didn't issue CVEs for experimental code.
I don't know honestly (about the CVEs) but you're probably right. Still, even if it's only been out of experimental for a week or so I just wanted to point out that the numbers for a single day probably aren't fair on Rust and I think the original comment was just trying to add a little perspective. Obviously it's very early days.
I'm aware. It's been a week. If they don't do CVEs for experimental stuff, we should start tracking from when they first backported Rust stuff after it stopped being experimental.
Why would you exempt experimental code from CVEs? Users are exposed to the code, and CVEs are important feedback about the experiment. Same thing with staging drivers.
Anyway: this CVE is in Binder, which just got merged in the latest release (6.18). Rust was still officially experimental in that release. And FWIW, the patch dropping Rust's experimental status hasn't been merged yet, it'll probably land in 6.20.
This CVE is an interesting counterexample: the fix is to disable a mitigation technique when Rust is enabled in the kernel. This CVE is technically not about a bug in Rust code, but it is clearly about a bug that only affects Rust code. It dates from June, long before the announcement to drop RfL's experimental status.
the offending issue just causes a crash, not the ability to take advantage of the memory corruption, a much better thing overall
I'm not sure how this conclusion has been reached - the information in the post is somewhat sparse. It seems to be saying that it causes a memory corruption that then results in a crash. But that's not what defines "not exploitable"; the current symptom may just be a crash. If the memory corruption is "we erroneously zero a pointer" then it probably isn't exploitable; if the corruption is "we leave a broken pointer" or similar, then all you know is that the current symptom is a crash.
A memory corruption error should be presumed exploitable unless you can prove otherwise, not the other way around, so this glib comment really needed to be explained in more detail.
Somewhat separately, the article about Android Binder IPC is interesting ... I didn't even know that was upstreamed
https://www.phoronix.com/news/Rust-Binder-For-Linux-6.18
Binder has been evolving over the past 15+ years to meet the evolving needs of Android. Its responsibilities, expectations, and complexity have grown considerably during that time. While we expect Binder to continue to evolve along with Android, there are a number of factors that currently constrain our ability to develop/maintain it.
Better to wait for the LWN.net article; phoronix.com is known for sloppy, low-quality articles.
Yeah. Remember the days of early Ruby adoption. People brought so much love to make that place home. Or the Perl days with incredibly well written books.
The main thing to note here is that, yes, an unsafe block was involved here, but the actual bug was outside of the unsafe block.
The fix patch is here: https://lore.kernel.org/all/20251111-binder-fix-list-remove-v1-1-8ed14a0da63d@google.com/.
And the ultimate unsafe block that gets called is here: https://github.com/torvalds/linux/blob/ea1013c1539270e372fc99854bc6e4d94eaeff66/drivers/android/binder/node.rs#L998.
This is important to understand, because it's often touted that "all you have to do is inspect unsafe blocks to ensure that Rust code will be safe," but this is untrue since unsafe blocks can "infect" callers.
Yes, this is a really important point. An unsafe block breaks the safety seal. It needs to be resealed, in the sense of having some boundary around it that guarantees safety, by ensuring that the internal invariants the unsafe block relies on hold. That boundary could be the function the unsafe block is in, or the module it's in.
To ensure that Rust code is safe, you need to audit each unsafe block. To audit an unsafe block, you check that safety is guaranteed at its boundary.
That's why unsafe code should be documented with a comment explaining why the code is safe. To audit it, you need to confirm that the comment is, in fact, accurate (which means validating that any prerequisites it claims do, in fact, hold under all conditions at some boundary).
Essentially, the claim isn't "all you have to do is only inspect unsafe blocks to ensure that Rust code will be safe", it's "all you have to do is start inspecting at unsafe blocks to ensure that Rust code will be safe". The unsafe blocks are the "roots" of the inspection if you will. Any code that those roots don't lead to does not have to be inspected.
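To make the "boundary" idea concrete, here is a minimal userspace sketch (not the Binder code; `SmallBuf` and its methods are hypothetical). The only `unsafe` block is in `last`, but its SAFETY comment depends on `len`, which the safe method `push` writes. An audit that starts at the `unsafe` block therefore has to follow that dependency to every piece of code that can modify `len`, and the private field limits that set to this module:

```rust
mod fixed {
    /// Hypothetical fixed-capacity buffer. Invariant: `len <= 4`.
    pub struct SmallBuf {
        data: [u8; 4],
        // Private: only code inside this module can change it, so this
        // module is the boundary that has to uphold the invariant.
        len: usize,
    }

    impl SmallBuf {
        pub fn new() -> Self {
            SmallBuf { data: [0; 4], len: 0 }
        }

        /// Entirely safe code, yet part of the audit because it writes `len`.
        /// Returns `false` when the buffer is full.
        pub fn push(&mut self, byte: u8) -> bool {
            if self.len == self.data.len() {
                return false;
            }
            self.data[self.len] = byte;
            self.len += 1;
            true
        }

        pub fn last(&self) -> Option<u8> {
            if self.len == 0 {
                return None;
            }
            // SAFETY: `len <= 4` is upheld by every method in this module,
            // and `len != 0` was checked above, so `len - 1` is in bounds.
            Some(unsafe { *self.data.get_unchecked(self.len - 1) })
        }
    }
}

fn main() {
    let mut buf = fixed::SmallBuf::new();
    assert_eq!(buf.last(), None);
    buf.push(7);
    buf.push(9);
    assert_eq!(buf.last(), Some(9));
}
```

If this module also exposed a safe setter that assigned `len` without a bounds check, `last` could read out of bounds even though the setter contains no `unsafe` at all. That is the sense in which a bug can sit outside the unsafe block that ultimately misbehaves.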