Proposed Rust kernel extensions in place of eBPF
18 points by ysun
Talk: https://m.youtube.com/watch?v=ivcLS4LFfKE
Slides: https://lpc.events/event/19/contributions/2190/attachments/1798/3878/rex-lpc.pdf
GitHub: https://github.com/rex-rs/rex
I couldn't glean this quickly from the slides: if part of the memory safety is based on the compiler, what is the gateway across the security boundary? A root-owned Rust compiler run as a helper program from a location configured as a boot parameter?
If I correctly interpreted what they claimed in the talk, they have the same model as a regular eBPF program, i.e. a compiler toolchain, a program loader (basically replacing bpf(2)), no verifier but a fixup step for identifying and loading symbols, and an in-kernel runtime
So safety is completely dependent on the Rust compiler (they only allow safe Rust code)
Also they claimed that since an eBPF program compiles to bytecode and the kernel JITs it, Rex has better performance (~5%, mentioned in the last portion of the talk), as it compiles to native code and they can utilize more optimization passes vs. the LLVM backend
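On the "only safe Rust code" point above: at the crate level that restriction is literally one compiler attribute. A generic illustration only, not necessarily how Rex actually enforces it (I haven't checked their toolchain):

    // Forbidding unsafe code in a crate: the compiler then rejects any
    // `unsafe` block or `unsafe fn` at build time.
    #![forbid(unsafe_code)]

    pub fn first_byte(buf: &[u8]) -> Option<u8> {
        buf.first().copied() // bounds-checked; no raw pointer access possible
    }

    // This would be a compile error under forbid(unsafe_code):
    // pub fn read_raw(p: *const u8) -> u8 { unsafe { *p } }

Of course that only rules out `unsafe` in the extension source itself; it says nothing about soundness bugs in the compiler, which is what the rest of this thread is about.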
As far as I understand, the point of BPF/eBPF as bytecode is that both the verifier and the JIT are in-kernel, and the compiler toolchain is an untrusted convenience provided to the programmer. Even malicious bytecode cannot crash the kernel (small sketch at the end of this comment).
So if safety depends on the compiler, it is immediately a different security model, isn't it? Either they just go the «better loadable kernel modules» way, where each module is trusted at load time and needs to be provenance-tracked to rule out runtime bypasses, or they have some interface to mark a specific Rust compiler trusted and accept source code at runtime (probably with caching of compiled code)
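To make the "untrusted toolchain, in-kernel verifier" point concrete, here is a hedged sketch (Rust with the libc crate; little-endian instruction encoding assumed, constants and field layout taken from include/uapi/linux/bpf.h, older prog_load fields only) of handing the kernel deliberately broken bytecode. No matter what the userspace compiler did, the verifier refuses to load it; the single `exit` instruction below is rejected because r0 is never initialised:

    use std::io::Error;

    // Subset of union bpf_attr used by BPF_PROG_LOAD.
    #[repr(C)]
    #[derive(Default)]
    struct BpfProgLoadAttr {
        prog_type: u32,
        insn_cnt: u32,
        insns: u64,   // pointer to struct bpf_insn array
        license: u64, // pointer to NUL-terminated license string
        log_level: u32,
        log_size: u32,
        log_buf: u64,
        kern_version: u32,
        prog_flags: u32,
    }

    fn main() {
        const BPF_PROG_LOAD: libc::c_long = 5;
        const BPF_PROG_TYPE_SOCKET_FILTER: u32 = 1;

        // One 8-byte instruction: opcode 0x95 = BPF_JMP | BPF_EXIT, rest zero.
        // Invalid on purpose: r0 (the return value) is never written.
        let insns: [u64; 1] = [0x95];
        let license = b"GPL\0";
        let mut log = vec![0u8; 4096];

        let attr = BpfProgLoadAttr {
            prog_type: BPF_PROG_TYPE_SOCKET_FILTER,
            insn_cnt: insns.len() as u32,
            insns: insns.as_ptr() as u64,
            license: license.as_ptr() as u64,
            log_level: 1,
            log_size: log.len() as u32,
            log_buf: log.as_mut_ptr() as u64,
            ..Default::default()
        };

        let fd = unsafe {
            libc::syscall(
                libc::SYS_bpf,
                BPF_PROG_LOAD,
                &attr as *const _ as usize,
                std::mem::size_of::<BpfProgLoadAttr>(),
            )
        };
        if fd < 0 {
            // Without the needed capability this fails with EPERM before the
            // verifier even runs; with it, the verifier log complains about
            // the uninitialised r0.
            eprintln!("load failed: {}", Error::last_os_error());
            eprintln!("{}", String::from_utf8_lossy(&log));
        }
    }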
So if safety depends on the compiler, it is immediately a different security model, isn't it?
Agreed, though slide pages 34 and 35 imply that program safety (in the sense of termination and exceptions) is handled by the in-kernel runtime. IMO in this sense it's similar to BPF
better loadable kernel modules
In their defense, kernel modules can cause panic but assuming Rex is implemented correctly, their “BPF” program can’t crash the kernel
or they have some interface to mark a specific Rust compiler trusted
Slide 34 “trusted compiler”
I am not sure I have the same page numbers in the slides PDF. Slide «№10» (page 33) says that memory safety is compile-time, and I did notice the point about a trusted compiler, so my question is basically at what point in time the trust is established and who manages it.
In their defense, kernel modules can cause panic but assuming Rex is implemented correctly, their “BPF” program can’t crash the kernel
This probably depends on how trust in the trusted compiler is managed. Like, is it designed so that it can be allowed in a container without making it too much easier for root inside the container to break out?
So safety is completely dependent on the Rust compiler (they only allow safe Rust code)
Hmm, last slide, last bullet point:
- Does the trust we put on the Rust toolchain make sense and how can we potentially make it more trustworthy?
Rust has some known soundness bugs which haven’t been fixed partly because they aren’t practical problems in normal trusted code. If the source code is not trusted then soundness bugs have a very different complexion. Compare, for example, the treatment of type confusion bugs as serious security vulnerabilities in JavaScript implementations. Different sides of the airtight hatchway.
*nod* This sounds like another proposal I remember seeing on the rust-lang.org forums years ago, where the answer was basically "Nope. We're not willing to make our checks and transformations a security boundary. Neither are LLVM or GCC people. No full-fat optimizing compiler is."
I tracked down the thread I remembered.
https://users.rust-lang.org/t/negative-views-on-rust-language-based-operating-systems/70449
Yes, Rust is not designed to be a language sandbox. Its safety guarantees have known holes in very exotic circumstances that no well intentioned programmer would accidentally walk into but that are trivial for an attacker to exploit on purpose.
But this is based on the assumption from the comment above that "safety is completely dependent on the Rust compiler"
So safety is completely dependent on the Rust compiler (they only allow safe Rust code)
I think I was wrong on this statement, since they do have runtime guards…
If there are some binary code blocks (marked as not needing any symbols or runtime-provided functionality) that are treated as «ready for direct execution», and they go and corrupt memory, there is not much a runtime can do. And I think trusting the compiler means exactly that…
What I understood from the slides is that the runtime support is mostly a special panic handler that does resource cleanup without requiring a userland-style stack unwinder; the other part is a small linker.
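For what it's worth, the "panic handler instead of stack unwinding" part is a standard no_std Rust pattern (crates like panic-halt do the same in embedded/userspace code). A minimal sketch of the shape, with the cleanup hook purely hypothetical and not Rex's actual runtime:

    #![no_std]

    use core::panic::PanicInfo;

    // Rust calls this on any panic in safe code: out-of-bounds indexing,
    // an explicit panic!, a failed overflow check, and so on.
    #[panic_handler]
    fn on_panic(_info: &PanicInfo) -> ! {
        // A Rex-like in-kernel runtime would release resources held by the
        // extension here (maps, locks, per-CPU buffers) and return control
        // to the kernel, instead of unwinding the stack.
        // release_extension_resources(); // hypothetical helper
        loop {}
    }

In this shape a fault in safe code becomes a controlled panic handled by the runtime, rather than something the verifier rules out up front.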
Their motivation at the start of the talk argued that it’s better to move the type safety checks out of the verifier into the compiler, so the verifier is ditched and the compiler becomes the security boundary. The reason was to get better error messages for type errors, especially when there’s a mismatch between the source language type system and the verifier’s type system. (Can’t have a mismatch if there’s no verifier!) It made me wonder if it might have been more helpful to get better source map output from the compiler so that verifier errors can be more easily related to the source code.
Their motivation at the start of the talk argued that it’s better to move the type safety checks out of the verifier into the compiler, so the verifier is ditched and the compiler becomes the security boundary
This is a bad idea. A compiler is very complex, a verifier can be designed to be simple. There is value in expressing more of the things that a verifier will wish to check in the source language’s type system.
NaCl was the trend setter here. They demonstrated that they could build a very simple verifier for a handful of properties and build sandboxing, but still benefit from a rich optimising compiler. The verifier was small enough to be formally verified. Verifying end-to-end full abstraction (or the subset necessary for sandboxing) in a compiler is incredibly hard. CakeML is probably the only implementation that has actually managed it (CompCert doesn’t count because it does not make any claims in the presence of source code containing undefined behaviour).
Rust, in particular, does not make any such claims. Rust’s type system is very powerful (and, with things like Verus, amazing) as a helper for the programmer. It provides a load of tools for the programmer to express intent and have the compiler reject their program if they have violated this intent. It does not assume that the programmer is malicious.
We have decades of experience with people trying to use language-level protections as security boundaries. Java applets were supposed to do this. Flash and JavaScript also both promised that nothing could violate the sandbox. All of them were routinely broken, because a single bug in the implementation is often enough for a sandbox escape. So you have reduced the problem to 'if we can write a few million lines of bug-free code, then we can have security'.
There is a reason that the major browsers no longer regard the JavaScript VM as a defensible security boundary and assume that JavaScript code will gain access to anything in the renderer process.
What's the threat model here? Rex might never be good enough against a malicious user, but I don't think BPF is either: loading a BPF program is still a privileged operation, isn't it? If the guarantee is just "a bug in a Rex program should not crash the kernel", then trusting the compiler should be good enough?
Even if Rex can't be used in as many security contexts as BPF, easier-to-write and easier-to-read programs plus improved performance make it attractive.
I think initially eBPF was allowed for non-root users. But due to many security bugs it got restricted to root only. And at that point the restrictions of eBPF could be cumbersome, and another system specifically designed to allow root users to safely run code inside the kernel on production systems might be useful. Still, a combination of compiler + verifier might be useful to retain. As said above, a compiler is very complex and could be buggy (although that would already apply to the existing kernel and modules if there are bugs in the C compiler, or if the code triggers undefined behaviour).
But we already have a system to let root run Rust code in the kernel. It's called init_module.
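For comparison, that existing path is just an ordinary module load. A hedged sketch using the libc crate and finit_module(2); "hello.ko" is a placeholder for a module built against the running kernel, and this has to run as root:

    use std::ffi::CString;
    use std::fs::File;
    use std::os::unix::io::AsRawFd;

    fn main() -> std::io::Result<()> {
        let module = File::open("hello.ko")?;   // placeholder path
        let params = CString::new("").unwrap(); // no module parameters

        // finit_module(fd, param_values, flags)
        let ret = unsafe {
            libc::syscall(
                libc::SYS_finit_module,
                module.as_raw_fd(),
                params.as_ptr(),
                0,
            )
        };
        if ret != 0 {
            return Err(std::io::Error::last_os_error());
        }
        Ok(())
    }

The difference is that a module's code is completely unconstrained once loaded, whereas a Rex program at least goes through the safe-Rust restriction and the in-kernel runtime.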
The cve-rs crate demonstrates exactly that kind of safe-Rust memory unsafety, mainly by chaining an old unsoundness bug (rust-lang/rust#25860) into various other interesting unsoundnesses (transmute, buffer overflow, etc.). At a glance, it looks like Rex doesn't guard against this at all.
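For anyone curious, the core trick behind cve-rs is the long-standing variance hole from rust-lang/rust#25860: entirely safe code that launders any reference into a &'static one. The classic minimal reproducer (still compiles on stable, since the issue remains open):

    // Coercing `foo` to the annotated fn-pointer type drops the implied
    // 'b: 'a bound, which lets safe code extend any lifetime to 'static.
    static UNIT: &'static &'static () = &&();

    fn foo<'a, 'b, T>(_: &'a &'b (), v: &'b T) -> &'a T {
        v
    }

    fn extend_lifetime<'a, T>(x: &'a T) -> &'static T {
        let f: fn(&'static &'a (), &'a T) -> &'static T = foo;
        f(UNIT, x)
    }

    fn main() {
        let dangling: &'static String;
        {
            let s = String::from("short-lived");
            dangling = extend_lifetime(&s);
        } // `s` is dropped here
        // `dangling` now points at freed memory; reading it is undefined
        // behaviour, reached with zero `unsafe` blocks.
        println!("{dangling}");
    }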
This is addressed at 9:25 to 11:33 in the presentation: Rex doesn't aim to defend against malicious programmers. And note that, despite a sturdier architecture, eBPF hasn't achieved that either.
But privileged users can already load arbitrary code into the kernel, including code written in Rust. Is Rex simply a quality of life improvement over kernel modules?
I take it that they are basing their efforts on a broken axiom (at least as of now and partially)?