Rethinking sudo with object capabilities

42 points by runxiyu

tonyfinn

How does this differ from polkit? Is the main distinction that it replaces piles of XML with shell commands (and you write your own service files to make them persistent)?

lonjil

The primary difference seems to me to be that Polkit uses a permission model, where the XML files say who is allowed to do what, while capsudo uses sockets, and allows anyone with access to a socket to send commands to the corresponding capsudod process. One way to control such access would be standard Unix permissions and ACLs, but you could also imagine something like a program with access opening a socket and then handing the file descriptor to a child program. No need to define a permission set or a new account or group or anything; the permissions of the child would be defined exactly by the set of open sockets handed to it.

Edit: and, I suppose, it would also make forwarding permissions to other computers pretty easy, as Unix sockets can be forwarded via OpenSSH. You can imagine a remote server allowing a user access to a limited set of capsudod sockets, and absolutely nothing else. In essence, capsudod becomes a framework for doing things that traditionally are done with bespoke access daemons.
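
The descriptor-handing idea can be sketched in Python. This is only an illustration under stated assumptions: socket.socketpair() stands in for a connection to a capsudod socket, and the b"reboot" payload is made up rather than the real capsudo wire protocol. The point is that the child is given one open descriptor and nothing else: no path, no account, no group.

```python
import socket
import subprocess
import sys

# Child program: it is told only a descriptor number, never a socket path.
CHILD_CODE = r"""
import socket, sys
sock = socket.socket(fileno=int(sys.argv[1]))  # wrap the inherited descriptor
sock.sendall(b"reboot")                        # hypothetical request payload
sock.close()
"""


def run_child_with_capability():
    # socketpair() is a stand-in for "a socket connected to some daemon".
    daemon_end, capability_end = socket.socketpair()
    with daemon_end, capability_end:
        # pass_fds keeps exactly this descriptor open in the child; other
        # descriptors above stderr are closed before the child starts.
        subprocess.run(
            [sys.executable, "-c", CHILD_CODE, str(capability_end.fileno())],
            pass_fds=[capability_end.fileno()],
            check=True,
        )
        # What the daemon side would see arriving from the child.
        return daemon_end.recv(1024)


if __name__ == "__main__":
    print(run_child_with_capability())
```

The child's authority here is whatever the descriptors it inherited allow, which is the property described above.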

jfred

This is very cool! :D

Great demonstration of object-capability security in an environment people are familiar with and in a practical way. I hope projects like this get more people interested in ocaps. :)

freddyb

This is pretty neat. The main issue I can see is that third-party installers and upstream service scripts don’t come with instructions saying what privileges they need.

A useful addition would be a learning mode where you grant full root for one invocation and get back the permissions that were actually used. That saves you an audit, but it assumes the script is deterministic.

Corbin

I'm excited for this! Previously, on Mastodon, I suggested that capsudo is a sort of spellserver: a generic capability-safe remote-code-execution service. Previously, on Lobsters, we examined what I'm calling Warner-style spellservers. In a Warner-style spellserver, authority to evaluate code is delegated via cryptographic signatures; users may present signed scrolls of code for the spellserver to execute. By contrast, Ariadne's introducing what I'm calling Conill-style spellservers, where there is a static delegation mirroring an existing hierarchy and a local token — here, the Unix service account hierarchy and the ability to connect() to a socket — which passes argv and envp onwards to that statically-delegated endpoint. The difference is that Warner-style spellservers are inherently well-suited to networks with encrypted messaging and Conill-style spellservers are well-suited to whatever substrate they mirror, like Unix in the case of capsudo.
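
To make the Conill-style shape concrete, here is a toy sketch in Python. It is not capsudo's protocol: the JSON line format and the names serve_once and invoke are invented for illustration, and envp forwarding is left out for brevity. The "token" is simply the ability to connect() to the socket path, and the delegation is the fixed command the server was started with.

```python
import json
import os
import socket
import subprocess
import sys
import tempfile
import threading


def serve_once(listener, fixed_argv):
    """Accept one client, append its argv to the fixed command, reply with stdout."""
    conn, _ = listener.accept()
    with conn, conn.makefile("r") as reader:
        request = json.loads(reader.readline())  # hypothetical wire format
        result = subprocess.run(
            fixed_argv + request["argv"], capture_output=True, text=True
        )
        conn.sendall(result.stdout.encode())


def invoke(path, argv):
    """Client side: being able to connect() to the path is the whole credential."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(path)
        client.sendall((json.dumps({"argv": argv}) + "\n").encode())
        with client.makefile("r") as reader:
            return reader.read()  # returns once the server closes the connection


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "echo-capability")
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as listener:
            listener.bind(path)
            listener.listen(1)
            # The statically delegated endpoint: a command that echoes its arguments.
            fixed = [sys.executable, "-c",
                     "import sys; print(' '.join(sys.argv[1:]))", "provider-arg"]
            server = threading.Thread(target=serve_once, args=(listener, fixed))
            server.start()
            output = invoke(path, ["client-arg"])
            server.join()
            return output


if __name__ == "__main__":
    print(demo(), end="")
```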

As long as it's on my mind, I'm going to do a historical comparison with Ostiary (Wayback, GitHub mirror), which dates back to 2009 or perhaps earlier. Ostiary isn't quite a spellserver because it lacks the ability to delegate; a true spellserver will allow users to construct certain sorts of universal gadgets out of sheer delegation, but Ostiary's scripts are fixed and do not return references to output objects. I bring this up because Conill-style and Warner-style spellservers are like iterations on two different features of Ostiary, namely the ability to act as a Unix user and the ability to cryptographically ensure that unauthorized access is denied.

indolering

I strongly agree with using magical metaphors for all computer science topics and calling their practitioners wizards because it serves as a warning that they are NOT to be trusted.

Wizards are crazy and weird and often smell bad. Just handing ambient authority to do whatever they want is a bad idea.

We certainly wouldn't have allowed them to have as much power as they wield today had we treated them with such suspicion!

hauleth

That looks like the nncp-exec idea, where you can also specify commands that can be run in a similar vein. The goal there is to run commands mostly as a regular user over an unreliable and untrusted network, but the same idea may be useful here.

rau

This looks neat enough, and provides a useful capability (pun intended). However, one of the author’s motivations is sudo’s supposed ‘non-declarative and non-hierarchical configuration format.’ Whether or not one considers aliases and rules to be declarative, what could be more imperative than a forest of running, state-holding commands? As a system administrator, there’s no straightforward way for me to know who is allowed to do what with capsudo (while with sudo I can just read /etc/sudoers). I could run ps -ef | grep capsudod, but that misses any other program implementing the capsudo protocol with a different dæmon. Centralised policy seems like a good thing for an administrator.

It is neat that capsudod in theory permits users to expose capabilities to other users without administrator involvement. I think POSIX ACLs might enable this.

I am also concerned about the fact that anything which can access the socket can invoke the capability. That means any program running as the empowered user can, silently and without any prompting. That includes programs such as web browsers which invoke untrusted code all day long. At least sudo typically requires a password!
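
For what it's worth, a daemon on the receiving end of a Unix socket can at least learn which process connected, which would make invocations loggable rather than fully silent. The sketch below is Linux-only and assumes nothing about capsudod itself, since the thread does not say whether it does anything like this. It also does not help with the browser case, because the browser runs under the same uid as the user.

```python
import os
import socket
import struct


def peer_credentials(conn):
    """Return (pid, uid, gid) of the process at the other end (Linux only)."""
    # SO_PEERCRED yields a struct ucred: three C ints in pid, uid, gid order.
    raw = conn.getsockopt(
        socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize("3i")
    )
    return struct.unpack("3i", raw)


def demo():
    # Both ends live in this process, so the peer is ourselves.
    a, b = socket.socketpair()
    with a, b:
        return peer_credentials(a)


if __name__ == "__main__":
    pid, uid, gid = demo()
    print(f"peer pid={pid} uid={uid} gid={gid}")
```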

Then there are the ergonomics. It’s obvious what sudo reboot does; it is non-obvious what capsudo -s /home/user/reboot-capability does. capsudo might make a decent building block to be used in a larger system, but it’s not great on its own.

Unix sockets are sadly underused, and the idea of using them to hold capabilities is neat, but I am unconvinced that this sufficiently addresses the issues with sudo. In my experience, in 2025 sudo is basically used as what it looks like: a super-user do. doas looks like a pretty decent simple replacement.

fanf

> It is neat that capsudod in theory permits users to expose capabilities to other users without administrator involvement.

An old way to do that is userv.

indolering

Reading this was a reminder of just how much of the computing landscape is shaped by the random design choices made for the first MVP. We basically took the bare minimum needed to implement time sharing and pushed it as far as it can go.

ethoh

The real problem is ambient security, which is the security model of UNIX.

The solution is a system based on a microkernel that has native, first-class support for capabilities, such as provided by the excellent seL4.

Genode is such a system.

talex5

In the example capsudod -o mountd:mountd -s /run/user/mountd/cap/mount -- mount, it looks like the extra arguments come from the user of the capability. In capsudod -s /run/user/mountd/cap/mount-dev-sdb1 -- /usr/sbin/mount /dev/sdb1 the arguments come from the provider. Does that mean that the program being run can't tell the difference? It would probably be safer to be explicit about which arguments come from where.
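
The ambiguity can be shown with a small Python sketch. It assumes the daemon simply concatenates the provider's arguments with the client's, which is exactly the open question here and not something the thread confirms; the function names are made up for illustration.

```python
def merged_argv(provider_args, client_args):
    """Plain concatenation: provenance is lost once the list is built."""
    return list(provider_args) + list(client_args)


def tagged_argv(provider_args, client_args):
    """One way to be explicit: keep each argument paired with its source."""
    return ([("provider", a) for a in provider_args]
            + [("client", a) for a in client_args])


def checked_argv(provider_args, client_args):
    """Concatenate, but refuse client arguments that look like options."""
    for arg in client_args:
        if arg.startswith("-"):
            raise ValueError(f"client may not supply option {arg!r}")
    return merged_argv(provider_args, client_args)


if __name__ == "__main__":
    # Two different splits between provider and client give the same argv,
    # so the program being run cannot tell them apart.
    print(merged_argv(["mount"], ["/dev/sdb1", "/mnt"]))
    print(merged_argv(["mount", "/dev/sdb1"], ["/mnt"]))
    print(tagged_argv(["mount", "/dev/sdb1"], ["/mnt"]))
```

With plain concatenation, a client of the second capability could also append something like -o with extra mount options, which is why either tagging or validating client-supplied arguments seems worth considering.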