Tier list of Linux security mechanisms (2024)

24 points by runxiyu

mousetail

Capabilities deserve to be in F tier. More than half of them allow privilege escalation to root. More gun than foot.

A quote from my blog about this:

In general, creating comprehensive SECCOMP profiles is very difficult since so many APIs allow privilege escalation, usually by manipulating mounts in some way. In 2002 Linux introduced the “capabilities” API, intended to block or allow entire APIs at once rather than listing individual syscalls. However, it completely failed at this. At the time of writing, the majority of capabilities include at least one method to escalate privileges. One capability, CAP_SYS_ADMIN, is extremely broad and encompasses a wide range of kernel subsystems. There has been a recent movement to split this capability into more reasonable sections, but the newly created segments on their own still allow privilege escalation in the majority of cases.
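
To check which capabilities a given process actually holds, note that the kernel reports each capability set as a hexadecimal bitmask in /proc/<pid>/status (the CapEff line is the effective set). Below is a minimal Python sketch that decodes such a mask. The helper names `decode_caps` and `effective_caps` are hypothetical, and the name table assumes the bit numbering of linux/capability.h with 41 defined capabilities.

```python
# Bit positions are assumed to follow linux/capability.h:
# CAP_CHOWN is bit 0 and CAP_CHECKPOINT_RESTORE is bit 40.
CAP_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID",
    "CAP_SETPCAP", "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE",
    "CAP_NET_BROADCAST", "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK",
    "CAP_IPC_OWNER", "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT",
    "CAP_SYS_PTRACE", "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT",
    "CAP_SYS_NICE", "CAP_SYS_RESOURCE", "CAP_SYS_TIME",
    "CAP_SYS_TTY_CONFIG", "CAP_MKNOD", "CAP_LEASE", "CAP_AUDIT_WRITE",
    "CAP_AUDIT_CONTROL", "CAP_SETFCAP", "CAP_MAC_OVERRIDE", "CAP_MAC_ADMIN",
    "CAP_SYSLOG", "CAP_WAKE_ALARM", "CAP_BLOCK_SUSPEND", "CAP_AUDIT_READ",
    "CAP_PERFMON", "CAP_BPF", "CAP_CHECKPOINT_RESTORE",
]


def decode_caps(hex_mask):
    """Return the capability names set in a hex bitmask such as a CapEff value."""
    mask = int(hex_mask, 16)
    names = [name for bit, name in enumerate(CAP_NAMES) if mask & (1 << bit)]
    # Bits beyond the table belong to capabilities newer than this sketch;
    # report them by number instead of silently dropping them.
    bit = len(CAP_NAMES)
    rest = mask >> bit
    while rest:
        if rest & 1:
            names.append("cap_%d" % bit)
        rest >>= 1
        bit += 1
    return names


def effective_caps(pid="self"):
    """Decode the CapEff line of /proc/<pid>/status (Linux only)."""
    with open("/proc/%s/status" % pid) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return decode_caps(line.split()[1])
    return []
```

Calling `effective_caps()` in an unprivileged shell typically returns an empty list, while a process that holds only the low-port binding flag would decode to a single entry, CAP_NET_BIND_SERVICE.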

strugee

Also, not that this is the biggest problem, but they're misnamed. Linux capabilities aren't "real" capabilities.
- Corbin
  
  Honestly, the wrong name might be the biggest issue. Linux capabilities are a bitset of flags which replace the traditional Unix kernel's superuser check; whereas Unix can only say whether a process's user is currently root or not, Linux can allow processes to be partially root. If we called it the "superuser bitset" or "root flags" or something else more descriptive then the author might see the appeal too. For what it's worth, I'd say that Linux capabilities are A tier; the biggest issue aside from the name is that there are currently forty-one flags defined.
  - runxiyu
    
    the biggest issue aside from the name is that there are currently forty-one flags defined
    
    How is this inherently an issue? Do you think it's too many or too few? (I'd suggest that it's too few)
    
    muvlon
    
    I think it's both. It's too few for the granularity that's often required, but it's already too many for a fixed set (as opposed to something user-extensible).
    
    It's similar to the popular saying about function parameters: "if you write a function taking 12 separate arguments, you probably forgot one".
  - david_chisnall
    
    I don’t entirely blame Linux for this. Symbian also had kernel ‘capabilities’ that were almost identical to the thing Linux called capabilities and similarly unlike capabilities.
- david_chisnall
  When I’ve tried to write compartmentalised software on Linux, I’ve ended up using seccomp-bpf to build a limited approximation of Capsicum. I really wish they’d just adopt it instead of building a growing list of worse things.
  
  It’s worth having different lists for the two categories of things here though:
  
  Things that you use while writing software to limit the damage it can do if compromised, and
  
  Things that you use running existing software to limit the damage it can do if compromised or actively malicious.
  
  There’s some overlap but the requirements are very different.
  
  In the first case, the programmer model is very important. You must be able to reason about the behaviour of the program at the source level. Capsicum is great for this because all of the security policy is expressed in things that the program does directly, not some additional look-aside policy.
  
  In the second case, you can’t require code changes and the policy should be extrinsic and auditable without reference to the source code.
kuijsten

Landlock was actually modelled after a BSD mechanism called pledge, as in "I pledge to not ever access those files".

Actually, Landlock is like unveil. pledge is more like seccomp (except pledge is easier to use and more powerful with privilege separation).
- matu3ba
  
  Can you be more specific on your statements, ideally with link(s)? I was hoping Landlock can be (in future) an AppArmor / SELinux replacement with more sane tooling and behavior.
  - kuijsten
    
    unveil is for restricting file access; pledge is for restricting syscalls. The reason I find pledge easier to use for restricting syscalls in privilege-separated software is that children don't inherit the restrictions from the parent process, which allows a very tight pledge on the parent.
    
    Note that I'm talking from the programmer's perspective: just like Landlock, both pledge and unveil are statements you write in your source code. AppArmor and SELinux are enforced outside of the software and don't require any changes to the program itself.
- mccd
  
  Strange that landlock got C tier, it's so easy to use for developers!
  - phaer
    
    Not as strange if you just consider this one person's personal preferences.
    
    E.g. AppArmor / SELinux are "C tier" because you need to adapt each application.
    
    But the very same is true for all of the B and A tier as well, and no objective criteria are given for those ratings.
- matu3ba
  To me this reads like a lot of redundant complexity:
  
  1 File permissions (file ACLs)
  
  2 capabilities (call permission ranges and inheritance rules)
  
  3 seccomp (call permissions with inheritance)
  
  4 new privileges (special prctl call somehow separate from seccomp?)
  
  5 AppArmor / SELinux (path based ACLs / arbitrary constrained ACLs)
  
  6 cgroups (system resource constraints including process group management)
  
  7 namespaces (global system resource isolation)
  
  8 landlock (network and file manipulation ACLs)
  
  9 polkit (group membership based capability delegation)
  
  10 xdg-dbus-proxy (polkit with filter for ACLs)
  
  2-4 feel like they should be grouped together, 9 and 10 like workarounds for missing privilege delegation. 5 feels like a special case of 8. 6 and 7 feel incoherent as they both deal with system resources. Ideally there would be something better for process group management.
  
  Very interesting that inheritance with complex (security) rules is somehow fine, if it is for process creation.
  - muvlon
    
    special prctl call somehow separate from seccomp?
    
    no_new_privs is actually amazing (one of the few things from this post I agree with), because it's so so much simpler and more obviously the right thing than seccomp (at least seccomp-bpf which is the thing the post is discussing).
    
    To implement no_new_privs using seccomp-bpf you'd have to maintain an up-to-date list of things that the kernel exposes that may potentially give the calling process new privileges. That list is growing all the time, for example you can now pass SCM_CREDENTIALS ancillary messages over sockets using no actual recv(|from|msg) syscall by using io_uring.
    
    seccomp-bpf, for many purposes, is simply at the wrong abstraction level. Rarely do I care about what specific syscalls a program can do, I'm much more interested in what it can do to which resources. Things that do operate on that abstraction level are namespaces and landlock. But in order to be able to rely on and reason about those mechanisms, we have to plug a ton of holes in the default security model that are left in basically for legacy reasons. Calling PR_SET_NO_NEW_PRIVS does that by opting you into the new model, where your process can never gain new "ambient" privileges through stuff like setuid/setgid. And it does so beautifully by requiring no complex configuration that would require continuous maintenance, it's a simple "please stop being silly" button.
    
    matu3ba
    
    Oh, indeed, that is an excellent reason. It feels object-capability-model-like, and I'm unaware of a simple technique to do it with ACLs.
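
The "please stop being silly" button described above really is a single prctl(2) call. Here is a minimal Python sketch using ctypes on Linux. The helper names are hypothetical, and the constant values 38 and 39 are assumed from linux/prctl.h. Once set, the flag cannot be cleared and is inherited across fork and execve.

```python
import ctypes
import os

# Values assumed from linux/prctl.h.
PR_SET_NO_NEW_PRIVS = 38
PR_GET_NO_NEW_PRIVS = 39


def _prctl(option, arg2=0):
    """Call prctl(2) through libc and raise OSError on failure (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    # The kernel rejects these options unless the unused arguments are zero,
    # so every argument is passed explicitly as an unsigned long.
    result = libc.prctl(
        ctypes.c_int(option),
        ctypes.c_ulong(arg2),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
    )
    if result < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return result


def set_no_new_privs():
    """Irreversibly opt the calling process out of gaining privileges via execve."""
    _prctl(PR_SET_NO_NEW_PRIVS, 1)


def get_no_new_privs():
    """Return 1 if no_new_privs is set for the calling process, else 0."""
    return _prctl(PR_GET_NO_NEW_PRIVS)
```

Because the flag cannot be unset, it is safest to try `set_no_new_privs()` in a throwaway child process; after the call, running a setuid binary from that process no longer grants the binary's owner's privileges.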