Asterinas: A Linux ABI-compatible, Rust-based framekernel OS
57 points by jmillikin
I’ve always wondered how good starnix is on fuchsia: https://fuchsia.dev/fuchsia-src/concepts/starnix
I guess the project is fine, but I really don’t like how they frame it: “memory unsafety was the root cause of the massive Crowdstrike outage, and Rust fixes that!!”. No, it wasn’t, and it doesn’t. If the kernel had been written in safe Rust, a misplaced unwrap() would have had the same effect of a correlated failure, for example. The real root cause was the lack of testing and of a staged rollout on Crowdstrike’s side (and their postmortem eventually said the same). I did elaborate on this back in the day in https://blogsystem5.substack.com/p/crowdstrike-and-rust.
If the kernel had been written in safe Rust, a misplaced unwrap() would have had the same effect of a correlated failure, for example.
Not necessarily: through library design, you can ensure that all calls into code outside the TCB are protected by catch_unwind, and that all state changes are transactional.
Not sure if that’s what the Asterinas folks are doing; it looks like “not quite yet” from a cursory look at the code.
But, yeah, I don’t see how anything similar can work for infinite loops.
catch_unwind doesn’t really work properly with shared state multithreading. You end up with mutexes stuck locked or poisoned, and shared data in an inconsistent state, which then most likely still takes down a significant chunk of the system.
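Here’s a small illustration of that failure mode using std’s Mutex poisoning (just a sketch; a kernel would have its own lock types):

    use std::panic::{catch_unwind, AssertUnwindSafe};
    use std::sync::Mutex;

    fn main() {
        let shared = Mutex::new(vec![1, 2, 3]);

        // catch_unwind stops the panic from propagating, but the panic happened
        // while the lock was held, so the mutex is now poisoned and the Vec may
        // have been left half-updated.
        let _ = catch_unwind(AssertUnwindSafe(|| {
            let mut guard = shared.lock().unwrap();
            guard.push(4);
            panic!("driver bug mid-update");
        }));

        // Every later user of the shared state now has to cope with poisoning;
        // callers that just unwrap() the lock result will panic in turn.
        match shared.lock() {
            Ok(v) => println!("ok: {:?}", *v),
            Err(poisoned) => println!("poisoned, data may be inconsistent: {:?}", poisoned.get_ref()),
        }
    }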
The Linux kernel effectively has a catch_unwind around every thread, but a kernel oops (not panic) is still very likely to effectively bring down the system or at least deadlock critical functionality, for this reason.
If you truly have transactional shared state I guess you could avoid that, but it seems like that would impose some serious design constraints on everything…
Yeah, the example is really not great, although Rust still helps with that kind of failure mode. But you need more than just no unsafe code, you need to ban unhandled panics (which is possible, but different), or at least treat them with the same rigor as you do unsafe code.
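As a sketch of what “banning unhandled panics” can look like in practice, here is the Clippy-lint flavour (it catches unwrap/expect/panic! call sites, not every possible panic such as arithmetic overflow or out-of-bounds indexing; link-time checkers like the no_panic crate are another option):

    // Deny the panic-prone APIs crate-wide and force explicit error handling.
    #![deny(clippy::unwrap_used, clippy::expect_used, clippy::panic)]

    fn parse_len(input: &[u8]) -> Result<u32, &'static str> {
        let bytes: [u8; 4] = input
            .get(..4)                       // not input[..4], which could panic
            .ok_or("input too short")?
            .try_into()
            .map_err(|_| "bad slice length")?;
        Ok(u32::from_le_bytes(bytes))
    }

    fn main() {
        println!("{:?}", parse_len(&[7, 0, 0, 0, 99]));
        println!("{:?}", parse_len(&[7]));
    }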
OK, so then you have the possibility of an infinite loop stalling the kernel. Or a recursive algorithm exhausting the stack. Or a deadlock. Or the kernel looking at the hardware funnily because of a subtle coding bug. Etc. There are many ways to make the kernel break, and a programming language isn’t going to protect against them all. (I’m sure you can think of more ways than I do given your experience.)
Still, don’t get me wrong, I’m all in for a Rust kernel for many, many reasons (security, expressing and enforcing invariants in APIs, etc.) – but the Crowdstrike example is misplaced.
The Crowdstrike example was specifically a logic error that panic-less Rust would have prevented though. Of course, there are other kinds of bugs that it would not.
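A toy illustration of that distinction, with a made-up parser rather than anything resembling the actual sensor code:

    // Hypothetical content parser, just to show the two failure modes.
    fn read_field(fields: &[&str], index: usize) -> Result<String, String> {
        // fields[index] would panic on malformed input (a crash, correlated
        // across every machine fed the same bad file); .get() instead forces
        // the caller to decide what a missing field means.
        fields
            .get(index)
            .map(|s| s.to_string())
            .ok_or_else(|| format!("malformed input: field {} missing", index))
    }

    fn main() {
        let malformed = ["a", "b"]; // fewer fields than the code expects
        match read_field(&malformed, 20) {
            Ok(v) => println!("field: {}", v),
            Err(e) => eprintln!("rejecting input: {}", e), // degrade, don't crash
        }
    }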
If I understood right, a “framekernel” is a usual monolithic kernel, but with two separate privilege levels: one gets Rust’s unsafe ability, the other gets none? I must admit, presenting it as lying between a monolithic and a microkernel architecture seems quite misleading to me.
Happy to discover the project tho, a Rust-based, Linux ABI-compatible kernel is a nice goal!
I think it’s an evolution of the idea that a userspace application written in a completely memory-safe language can be turned into a unikernel with no loss of security. The security primitive of address space separation becomes a mechanical property of the code, checked by the compiler.
If you start with an seL4-style microkernel (audited C kernel + userspace drivers) and then rewrite the userspace in safe Rust, then you can fuse them back together to reduce the syscall boundary into normal function calls. Then you take the C “inner kernel”, rewrite it in full-power Rust, and you’ve got a monoglot memory-safe monolithic binary that is structurally a microkernel but can run a Linux userland.
You’re selling seL4 short.
seL4 is not an audited C kernel. seL4 is formally verified from the specification to the machine code. The specification itself is proven to enforce integrity and confidentiality, and the machine code artefact (at least on ARM and RISCV64) is proven to implement the specification precisely (it does not assume a correct C compiler).
It’s not so much that “seL4 is implemented in C with a bunch of proofs about the implementation”; it’s that the Isabelle/HOL specification defines exactly what seL4 is and isn’t allowed to do, as well as proving that it has the desired properties. The C code is a refinement of the Isabelle/HOL specification, which is necessary because C code can be compiled to machine code, while Isabelle/HOL can’t be, directly. The C code is proven to be a refinement of the Isabelle/HOL spec, but for all practical purposes seL4 is written in Isabelle/HOL, not in C.
One not familiar with Isabelle/HOL, or with formal verification in general, might get confused about this specification vs. implementation distinction. In colloquial usage, for run-of-the-mill software, a specification is something vague, small, usually written in prose, and usually without proofs about the spec itself. The implementation is usually big, complex, messy, and at best informally follows the spec.
Not so in this case: there is more Isabelle/HOL code than C code (by orders of magnitude), and the spec is not only mathematical but precise; it defines exactly what seL4 is: it is seL4. The behavior allowed by the refinement of the spec (written in C) is less, not more, than the behavior allowed by the spec.
As for Asterinas, as they themselves acknowledge in their paper, the idea of running provably safe modules in the same address space (in this case, proved safe by the Rust type system) is not new; Singularity, for example, did exactly this. And of course there is also MirageOS, which does this with OCaml.
Not to detract from their achievements, what they are doing is great and necessary.
seL4 is not an audited C kernel. seL4 is formally verified from the specification to the machine code.
This might just be my unfamiliarity with the preferred terminology. I don’t consider a C codebase generated from a machine-checked specification to be materially different from a C codebase with its own accompanying proof of correctness, and I’m using “audit” in the general sense of humans having satisfied themselves that the proof applies to the actual machine code – I don’t mean “audit” in the sense of an ISO 9001 or SOC2 checklist exercise.
The seL4 codebase is written in C, and to the best of my knowledge that C is hand-written, not generated in the sense that Isabelle can be used to generate Haskell from a proof. That it’s proven to correctly implement the spec written in Isabelle is of course valuable, but it seems strange to claim that the presence of a machine-checked spec makes the proof qualitatively different from other proof systems.
One not familiar with Isabelle/HOL, or with formal verification in general, might get confused about this specification vs. implementation distinction. In colloquial usage, for run-of-the-mill software, a specification is something vague, small, usually written in prose, and usually without proofs about the spec itself. The implementation is usually big, complex, messy, and at best informally follows the spec.
I’m not sure this is true? A specification would typically be something like WebAssembly or JSCert. There’s no point to a specification that is vague and imprecise, that’s what RFCs are for.
If I recall correctly, seL4’s C is not generated. They write everything in Isabelle/HOL and then hand-translate it to C.
with no loss of security.
I think a clarification is warranted. Abstractly, yes, a memory-safe language can give you a security boundary (arbitrary, malicious code will have only bounded effects). I am 0.9 sure that this is not true of Rust, where you syntactically check for the absence of the unsafe keyword.
On the meta level, Rust doesn’t try to give any guarantees about malicious code, so there should be no expectations that it can achieve them.
On the object level, there are known soundness bugs in the compiler implementation, so it is possible to circumvent safe Rust’s guarantees if you try hard enough. And even if you fix the specific known holes, there is still more work to do to arrive at a convincing demonstration that no more holes are left.
More concretely, suppose that you have a kernel which relies on the Rust type system to enforce isolation. Now someone gives you, say, a compression codec you can plug into your driver, and you check that it contains no unsafe code. If you use this code, it can still read arbitrary memory by making use of some soundness hole!
As far as I can tell from quickly skimming, the post/paper don’t claim general security in the face of malicious code, and only talk about memory safety (non-malicious code handling malicious inputs), so it’s all good! But it’s worth emphasizing the gap between the two claims, as it might not be obvious that there are, in fact, two different possible claims here.
Note that in this context we’re not worried about malicious code, which would be caught by code review when contributing to the outer kernel. There is a worry about implementation errors in the Rust compiler/stdlib, but those would be “non-adversarial code does the wrong thing” instead of something like rust#25860.
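(For reference, rust#25860 is the long-standing lifetime-variance hole; its canonical reproduction is entirely safe code along these lines, shown here only to illustrate what “a soundness hole” means in practice:)

    // rust#25860 in a nutshell: the implied lifetime bound on `launder` is
    // lost when it is coerced to a function pointer, so entirely safe code
    // can "extend" any reference to 'static. No unsafe anywhere, yet rustc
    // accepts it; this is the canonical reproduction (cve-rs builds on it).
    static UNIT: &'static &'static () = &&();

    fn launder<'a, 'b, T>(_: &'a &'b (), v: &'b T) -> &'a T {
        v
    }

    fn make_static<'a, T>(v: &'a T) -> &'static T {
        let f: fn(&'static &'static (), &'a T) -> &'static T = launder;
        f(UNIT, v)
    }

    fn main() {
        let dangling;
        {
            let secret = String::from("supposedly protected data");
            dangling = make_static(&secret);
        } // `secret` is dropped here...
        println!("{}", dangling); // ...yet safe code still reads through it (UB)
    }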
Similarly, you’re not going to be running arbitrary userspace-provided code in the kernel. There’s still a boundary between the kernel and userspace, it’s just closer to a monolithic kernel (like Linux) than to a microkernel (like seL4).
The limitation compared to a traditional monolithic kernel would be in performance, not security, I think. You can’t implement a machine code JIT in safe Rust, so something like BPF either needs a slow non-JIT interpreter or you’d need the entire BPF JIT to be part of the audited TCB. Important low-level optimizations such as MaybeUninit and the unchecked slice methods would be unavailable.
To me, it’s hard to pretend you can audit a whole kernel + drivers, even with the guarantees given by Rust.
Mistakes are still made, and can still be exploited; putting bounds and capability checking behind IPC/syscalls is still the way to let a driver completely fuck up while still having a sound system.
Say a driver has a non-malicious flaw that gets exploited and wasn’t caught in review: you simply don’t have the guarantee that an seL4-style microkernel provides, namely that only the driver plus the capabilities it was given will be impacted?
(Feel free to correct me, as I’m in no way into kernel dev)
I think there are two somewhat distinct threat models to reason about here: honest components that have accidental bugs (or are fed malicious inputs), and components that are themselves actively malicious (e.g. a supply-chain attack).
For the first threat model, I would expect the Rust type system and a sensible set of APIs (cap-std is great for userspace, for example) to be able to do a lot. You can provide a Rust component with more rights than it should have, but you can do that with a microkernel too and a lot of bugs come from this kind of thing.
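A sketch of the kind of API-level rights-scoping meant here (hypothetical trait and file path, not cap-std’s or Asterinas’s actual interfaces):

    use std::fs::File;
    use std::io::{self, Read, Seek, SeekFrom};

    /// The only authority the driver is handed: reading blocks of one device.
    trait BlockDevice {
        fn read_block(&mut self, lba: u64, buf: &mut [u8]) -> io::Result<()>;
    }

    /// A loopback "device" backed by a file, constructed by more-trusted code.
    struct LoopbackDevice {
        backing: File,
    }

    impl BlockDevice for LoopbackDevice {
        fn read_block(&mut self, lba: u64, buf: &mut [u8]) -> io::Result<()> {
            self.backing.seek(SeekFrom::Start(lba * 512))?;
            self.backing.read_exact(buf)
        }
    }

    /// The driver can only name what it was given; no ambient "open any file /
    /// map any memory" function is in scope for it to misuse.
    fn driver_read_superblock<D: BlockDevice>(dev: &mut D) -> io::Result<[u8; 512]> {
        let mut block = [0u8; 512];
        dev.read_block(0, &mut block)?;
        Ok(block)
    }

    fn main() -> io::Result<()> {
        let backing = File::open("/tmp/disk.img")?; // made-up backing image path
        let mut dev = LoopbackDevice { backing };
        let superblock = driver_read_superblock(&mut dev)?;
        println!("first bytes: {:?}", &superblock[..8]);
        Ok(())
    }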
For the second, I don’t think Rust helps nearly as much. There are soundness bugs in the Rust compiler that are very hard to trigger accidentally, but I don’t think I’d want to trust supply-chain security on ‘can reviewers spot combinations of language features that exploit a bug in the compiler to do a malicious thing?’ There are some patterns (such as the ones used by cve-rs) that linting tools will now catch.
It’s possible that Java has made me too skeptical of language-based security. Java had a really nice model for running programs that contained different trust domains with the JVM providing the isolation. It never worked in practice and all of the infrastructure for doing it was ripped out of the language a couple of releases ago (and from Android-flavoured Java from day one).
The problem was that you needed one bug in the unsafe bits to be able to fully escape. Language-based protections start with an attacker having the ability to run code in a convenient place for attacking the system. A single bug in unsafe Rust code that’s callable from a malicious component can completely undermine the security. And all of this was before we knew about speculative side channels. If you can write arbitrary safe Rust code, it’s trivial to write spectre gadgets that can disclose all of memory from a kernel component. You’re reliant on code review to prevent these things and preventing bugs in code review is hard enough when they’re accidental.
Idk, it seems similar to a microkernel to me.
A microkernel has a small core (TCB: trusted computing base) that handles memory safety and an API that trusted processes (drivers) can call to extend the microkernel with extra features. The claimed benefits are usually that a crashing driver can be restarted rather than killing the whole kernel; and sometimes that the API is capabilities-based and this reduces the damage a malicious or buggy driver can do.
A “framekernel” looks pretty similar to me, but instead of using separate processes, it relies on the Rust type system and compiler to provide the boundary between the inner kernel (TCB) and the drivers.
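A minimal sketch of what that compiler-enforced boundary can look like (module names and addresses invented, not Asterinas’s real layout):

    // Only the `tcb` module may contain unsafe; driver modules forbid it, so a
    // driver can only do whatever the TCB's safe API allows.
    mod tcb {
        pub struct MmioRegion {
            base: usize,
            len: usize,
        }

        impl MmioRegion {
            /// Only the TCB, which knows the real hardware map, constructs these.
            pub(crate) fn new(base: usize, len: usize) -> Self {
                Self { base, len }
            }

            /// Checked accessor: the one place a volatile MMIO write would live.
            pub fn write_u32(&mut self, offset: usize, _value: u32) -> Result<(), ()> {
                if offset.checked_add(4).map_or(true, |end| end > self.len) {
                    return Err(());
                }
                let _addr = self.base + offset;
                // The single unsafe line would go here, e.g.
                // unsafe { core::ptr::write_volatile(_addr as *mut u32, _value) }
                Ok(())
            }
        }
    }

    mod driver {
        #![forbid(unsafe_code)] // the compiler rejects any `unsafe` in this module

        use super::tcb::MmioRegion;

        pub fn reset(dev: &mut MmioRegion) -> Result<(), ()> {
            dev.write_u32(0x0, 1) // can only touch the region it was handed
        }
    }

    fn main() {
        let mut region = tcb::MmioRegion::new(0xfee0_0000, 0x1000); // made-up MMIO window
        driver::reset(&mut region).unwrap();
        println!("driver ran entirely through the TCB's safe API");
    }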
With both systems, the implementation needs to be good if it is actually to avoid memory unsafety in drivers or to recover if a driver crashes.
It’ll be interesting to see whether the authors really are able to pull it off and keep the TCB small and the drivers safe, and whether Rust turns out to have few enough soundness issues for it all to work out.
I’m writing an OS from scratch - also in Rust - and my firm assertion is that no, this is not enough. To assert that no malicious activity can reach a protected resource, the “TCB”, as it were, needs to deny the existence of the resource entirely to anything except the applications that actually need it and have been given explicit permission to use it.