If you could redesign Linux userland from scratch, what would you do differently?
95 points by runxiyu
If we kept Linux the kernel exactly as it is today, but redesigned everything in userland from scratch (the init system, the filesystem hierarchy, the shell, libc, packaging, configuration, dbus, polkit, PAM, etc.), what would you do differently, and why?
Sandboxing by default and the norm would be that programs don't get universal access.
I was unpleasantly surprised when I test drove the latest Fedora Atomic desktop and then installed Firefox via a flatpak... and then I opened my ssh private key with the browser and it just opened it. The tools to prevent this are right there (in fact, I can do this using flatseal), but the default still remains universal access.
Most unix tools admittedly do need almost universal access, but almost no applications do.
I'd love sandboxing purely so apps can stop trashing my home directory with some dotdir or dotfile.
Yes, XDG Base Directory exists, but a lot of apps don't follow it or make you set their own application specific environment variable, so it just becomes endless treadmill work.
Part of the reason my $HOME is on tmpfs is to notice it almost immediately when things start to litter outside of XDG base (or even into unexpected directories there! my XDG is also on tmpfs, with things I want to persist symlinked to a persistent place). Every time I reboot, my $HOME is clean, and only contains what it should contain.
If anything tries to litter there, there's like 1MiB of space. They'll run out quick, and I can bonk them on the noggin'.
What's your process when you have to use a non-compliant software for your work?
I'll symlink its files and directories to persistent storage. Either ahead of time, if I know it will litter there, or once it explodes when the filesystem gets full.
I have a system where /persist/home/algernon/<app-name> will have all the directories underneath it symlinked into $HOME. For example, /persist/home/algernon/librewolf/.librewolf will be the target of ~/.librewolf, and /persist/home/algernon/librewolf/.cache/librewolf will be the target of the ~/.cache/librewolf symlink.
This has the added benefit that if I no longer need a particular software, I can rm -rf /persist/home/algernon/<app-name>, and remove all its stateful data.
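For illustration, a minimal shell sketch of that scheme (the paths mirror the example above; the exact automation is the commenter's own, so treat this as an approximation):

    # Persist LibreWolf's state; everything else in $HOME stays on tmpfs.
    persist=/persist/home/$USER/librewolf
    mkdir -p "$persist/.librewolf" "$persist/.cache/librewolf" "$HOME/.cache"
    ln -sfn "$persist/.librewolf"       "$HOME/.librewolf"
    ln -sfn "$persist/.cache/librewolf" "$HOME/.cache/librewolf"
    # Retiring the app later removes all of its state in one go:
    #   rm -rf /persist/home/$USER/librewolf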
I'm curious in what way this causes issues for you, and what you use homedir for if not to hold dotfiles.
This feels silly to write out, but my home directory stores the files relating to me as a user! Code projects, documents, images, videos, games...
On the daemon / system side, we've all collectively agreed that having separate directories for certain types of files, like /etc, /tmp and /var/lib, is useful and important (and that structure allows the system to provide creature comforts as well, like tmpfs on /tmp).
But on the single-user / desktop side, we decided that dumping it all into the user equivalent of / is fine.
Ignoring that, XDG_CACHE_HOME and XDG_CONFIG_HOME are also helpful for signaling which types of files are important for me to keep safe and which ones aren't, but some devs simply never learn.
Just thinking out loud but since homedir thrashing seems to be a lost cause, maybe we could instead create a ~/Home directory containing Documents, Downloads, Music, whatever, and update the XDG_ variables accordingly? Not sure what kind of issues would arise though, I assume many apps use hardcoded variants of ~/Downloads...
I know https://wiki.archlinux.org/title/XDG_user_directories exists, so you 100% can rename / relocate the usual user directories (for localization for example).
I also vaguely remember someone moving their home directory into /var instead.
So yeah, that all 100% works in the real world, ignoring the non-standard configuration.
But this is the "what would you like if userland was remade?" thread; I'd rather apps go into a box (either by properly separating directories, like multi-user apps do, or by giving each app its own isolated fs the way Android and Flatpak do) than hack around this faulty practice :^)
Not the OP, but I find it very irritating when some tool puts its configuration directly into my home directory instead of into $XDG_CONFIG_HOME. I like to keep that kind of stuff under version control—or if not, then at least ignored on purpose—and it seems like a terrible idea to turn $HOME into a Git repository.
You seem to be implying that the home directory is the appropriate place for dotfiles. Which it certainly was, historically, but I think that even the people who established that tradition knew it was suboptimal: why else would they have prefixed the names with dots? Surely it was because they knew that most people didn’t want to see those files there most of the time.
Did firefox just open it or were you prompted through xdg desktop portals?
It just opened it.
When you explicitly open a file the sandbox puts it in a special runtime directory (something like /run/user/1000/doc/hashhash/id_ed25519).
So while you can open your private SSH key from the file manager with Firefox, Firefox will be denied from accessing $HOME/.ssh/id_ed25519 itself.
Firefox's default permission set does allow it to freely access files from the Downloads folder, though.
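For reference, those defaults can be inspected and tightened with Flatpak's own CLI (Flatseal is a GUI over the same per-app overrides); a sketch using the Flathub app ID:

    # Show the permissions the app ships with
    flatpak info --show-permissions org.mozilla.firefox

    # Per-user override: drop broad filesystem access, keep only Downloads
    flatpak override --user --nofilesystem=host --nofilesystem=home org.mozilla.firefox
    flatpak override --user --filesystem=xdg-download org.mozilla.firefox

    # Review the resulting overrides
    flatpak override --user --show org.mozilla.firefox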
I can confirm this; I just tried this with Flatpak Firefox on Arch.
Opening file:///home/myuser/ in the browser shows only .local/share/flatpak (empty), .mozilla (not my actual ~/.mozilla), .var/app/org.mozilla.firefox (real) and Downloads (real).
Using Ctrl+O to open my ~/.ssh/authorized_keys copied (or mounted?) it to /run/user/1000/doc/b3bdf961/authorized_keys in the sandbox and opened that.
Attempting org.mozilla.firefox ~/.ssh/authorized_keys in a terminal opened a File not found page.
What does explicitly access it mean? Drag 'n drop?
Because I can just type file:///home/proc/.ssh/id_ed25519 into the URL bar. Don't see how that's different from what an exploit would do in terms of opening it.
I use the Flatpak, and accessing my SSH keys on my system through the address bar gives me a 404 error.
file:///home/pmc/.ssh/id_ed25519 -> "Firefox can’t find the file at /home/pmc/.ssh/id_ed25519."
$ ls -al /home/pmc/.ssh/id_ed25519
-rw------- 1 pmc pmc 387 May 26 20:37 /home/pmc/.ssh/id_ed25519
Would be great if that had happened. Somehow, Firefox from Flatpak happily opened my ~/.ssh until I used flatseal.
Unfortunately I cannot doublecheck this now, since I've moved on.
in fact, I can do this using flatseal
Wait. Did you install the Fedora Flatpak? Does it have full $HOME access?! The official Mozilla one on Flathub does not, I'm pretty sure.
I think it was the Fedora Flatpak since that should be the default? Cannot check anymore unfortunately, since I moved on from there. Whatever it was, it certainly had full $HOME access until I used flatseal on it.
I agree, good sandboxing is not that hard to retrofit into Linux. I've been a long-time advocate of better Firejail (declarative) support on NixOS, with limited success. The community is quite hostile because it's "inconvenient".
If you do it right, the inconvenience is minimal. Firefox has no business accessing your ~/.ssh. In fact, disk access should be limited to ~/.mozilla and ~/Downloads (or equivalent).
Mobile distributions like MeeGo and SailfishOS have used different sandboxing solutions to achieve Android-like isolation of applications.
Personally, putting Firefox and Emacs inside a Firejail is the first thing I do when I configure a computer. Bubblewrap is probably a better solution as it has a smaller surface of attack, but it's less automated.
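A sketch of that kind of confinement from the command line (Firejail already ships a firefox profile that does roughly this when you run plain firejail firefox; the explicit flags are just to show the idea):

    # Hide everything in $HOME except the profile and downloads directories
    firejail --whitelist=~/.mozilla --whitelist=~/Downloads firefox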
Using Downloads as the designated dump ground is good enough for most common usage. Except for tools such as text editors.
Most unix tools admittedly do need almost universal access, but almost no applications do.
Do they? If I run grep -r stuff src/, grep only requires read-only access to src/. In particular it doesn't need write access to anything, nor network access, so it couldn't really exfiltrate any data, side channels aside.
I think the same is true for most unix utilities.
I don't really know how you'd practically go about this sort of sandboxing? Trusting the programs to sandbox themselves misses the point. You could try having small trusted wrappers which you could quickly audit - or the shell syntax could be adapted to pass capabilities, but I'm also not yet sure how you'd do that well.
I do nsjail-based sandboxing with wrappers (I trust grep and even vim though), and I'd say really a lot of work can be meaningfully sandboxed with «OK, you get access to the current directory and below but not up».
Oh, and half of the time it can be RO. And I do have some wrapper-specific markup to say «this file should be copied into the sandbox, with the path replaced on the command line with the in-sandbox special path». And also «among all the following arguments, if there is an accessible file, copy-and-replace»
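A toy version of such a wrapper, using bubblewrap rather than the commenter's nsjail setup (assumes a usr-merged distro; display- or network-using tools would need more bind mounts):

    #!/bin/sh
    # ro-here: run a command with only the current directory visible, read-only,
    # plus a throwaway /tmp and no network.  Usage: ro-here grep -r pattern .
    exec bwrap \
        --ro-bind /usr /usr --ro-bind /etc /etc \
        --symlink usr/bin /bin --symlink usr/lib /lib --symlink usr/lib64 /lib64 \
        --proc /proc --dev /dev --tmpfs /tmp \
        --ro-bind "$PWD" "$PWD" --chdir "$PWD" \
        --unshare-all --die-with-parent \
        "$@"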
It would be hard to solve this in a user-friendly way, though. Having a popup saying "are you sure you want grep to be able to read from src/" would be the worst of all worlds. Perhaps an audit log listing all the files that a utility has accessed, together with an "undo" option per access, would be more user-friendly? (It would be hard to implement, though).
Yeah, this is something I feel really could use a lot of research. I have had some ideas in this space a few years ago and wanted to write some prototypes, but I've realized Linux doesn't really provide you the tools to experiment with novel interfaces for sandboxing.
I ended up writing my own kernel because of this. Sadly I've kinda abandoned that project because uni has been killing me for the past two years. I've been meaning to get back into that project, and to finally publish some articles about it, but uni has burned me out and I haven't quite found the motivation for this yet :/ (although this entire thread is a good push, lol)
I think the app sandboxing itself makes a big difference (for example landlock), because that can be reviewed by the package maintainers and yourself quite easily and lowers the burden too. The amount of people you'd need to trust is way less.
Which sort of app sandboxing? I really hate per-program (non-recursive) sandboxing, the sort of stuff I believe e.g. OpenBSD's unveil or AppArmor (?) are doing.
For example, back on Windows, I used TinyWall to limit which programs can access the internet. I used curl frequently, so I had to whitelist it - but that meant any program could just run curl and access the internet through it.
As another example, unveil("/", "rx"); looks like it prevents a program from writing to the filesystem, but it can just execute another program (e.g. sh) to write to the filesystem for it. It looks secure, but it's not. (despite this, some programs in base do exactly that :P)
I'm not an AppArmor wizard, but the AppArmor profiles with hardcoded binary paths sure make it look like it's in the same category. Landlock does specify that it's stackable though!
...sadly it's still built on top of the Linux kernel, which is a giant pile of unauditable code (as opposed to e.g. seL4, which you could actually read the entire source of in your lifetime!). So, sure, we solved the problem of putting blind trust in user applications, but we're still putting blind trust in the entire stack under that.
Hmm, well mainly I compare them to tools like firejail, android permissions or VMs. I agree with your complaints about the implementation details of unveil and Linux being hard to audit, but I still think the concept is sound. What I like with landlock and pledge is that they are very simple, intuitive and unobtrusive, they put less burden on the user of the system.
Subjectively, sandboxing is one of the worst things happening with Linux in the past years.
The very reason program issues were fixed so quickly in the past is exactly that programs had to coexist with each other, and issues with some programs helped in identifying issues with other programs.
Moreover, this "ssh key" example seems strange to me. Firefox is your program, ssh key is your file, of course your program should be able to go open your file, what's the issue here?
The problem is if Firefox gets compromised.
A browser’s very purpose is to handle untrusted data and remote code execution vulnerabilities are a real possibility (example).
Well, sort of, but the issue is that we, for some reason, are letting some unknown third parties execute untrusted code on our machines, that is, use our computers without paying us a rental fee.
This is extremely weird.
Not every vulnerability has to do with intentional code execution. There were zero-click RCEs in Chrome and Safari two years ago that were caused by a memory safety issue in image decoding done by libwebp.
That is not a reasonable world view. Computers are made for running programs but the owner of the computer has no reasonable way to make the programs trusted. And the point of sandboxing is that many of them do not need to be because their purpose doesn't need a lot of capabilities.
Trustworthiness of sandboxing is as illusory as the distinction between root and user.
If your user gets broken into, yeah, the malware cannot break the system, but it doesn't really matter, because it can steal what matters most: your data.
Yeah, your isolated browser cannot read your ssh key, but it can still read your passwords, bank card numbers and gmail web interface.
All software must be open source and publicly audited, everything else is a security theater.
Yeah, your isolated browser cannot read your ssh key, but it can still read your passwords, bank card numbers and gmail web interface.
That's why you need layered security and why modern browsers will also sandbox individual tabs. If one tab gets compromised, the attacker will not have access to things that happen in other tabs.
All software must be open source and publicly audited, everything else is a security theater.
Sorry, but this '90s The Cathedral & the Bazaar, "given enough eyeballs, all bugs are shallow" mindset has long been invalidated. Yes, you need auditing, but you also need multiple layers of security, everything from W^X, guard pages, address space randomization, MIE to sandboxing and blastdoor. All software, open source or not, will contain 0-days, and worst case these are found by actors that do not report them. Layered security is why exploiting OSes like iOS requires long exploit chains that have become very expensive.
If desktop Linux was an interesting target (had more users), it would be a security nightmare due to the lack of security layers. Most of the bits are there, but only very few people/projects are working on strongly integrating them and those who are often attacked from within (look at any thread about Flatpak, systemd sandboxing, secure boot, etc.).
Yeah, your isolated browser cannot read your ssh key, but it can still read your passwords, bank card numbers and gmail web interface.
How about all the other random pieces of software I run? Why should the software I use to catalogue my music be able to see my SSH key? Why should my chat clients see my medical documents? Why should this random npm package for a webapp I'm working on be able to access my email client?
How about all the other random pieces of software I run?
Well, I certainly want bash, awk, sed, and grep to be able to read my ssh key.
Why should the software I use to catalogue my music be able to see my SSH key?
Because presumably you have some central repository of that music metadata so that you don't want to re-create from scratch on all your devices? And it runs scp/rsync/whatever-over-ssh to sync it?
Why should my chat clients see my medical documents?
Because you're sending them to your doctor over that chat app, and the doctor uploads them to their off-shore image analysis supplier, which you have zero control over.
Why should this random npm package for a webapp I'm working on be able to access my email client?
It shouldn't, but in practice it is hard to avoid. Your IDE probably has your ssh key opened somewhere to access your deployment machine, and it probably cannot analyse that npm package without running bits of its code.
Like, let me give you a metaphor. Computers are digital brains. Assuming that isolation works is like assuming that at work and at home you are two different people with two different brains, and your home doesn't influence your work and your work doesn't influence your home life. Life does not work like this.
Well, we are discussing exactly the fact that software isolation tooling has too high a barrier to entry to use properly. Your examples are mainly examples of that (except where you genuinely do need the third party), not of this problem being there for logically inherent reasons.
Music catalogue touches media file formats. It should not handle its own syncing! It's a maybe whether it should even be able to issue an RPC call to trigger that syncing.
Same for IDE and git push, ideally. Although it is more complicated there because push/fetch should be outside IDE sandbox, while many people will reasonably want the actual merge/rebase to be inside the IDE for conflict resolution. But git supports such a split!
Well, I certainly want bash, awk, sed, and grep to be able to read my ssh key.
Do you want all bash scripts to be able to read your ssh key, or only the ones that need to do so?
Because presumably you have some central repository of that music metadata so that you don't want to re-create from scratch on all your devices? And it runs scp/rsync/whatever-over-ssh to sync it?
I sync them using a different program anyways - I was referring to programs such as Picard, which are used to tag music files. They handle a bunch of different formats, require internet access by design - there's a lot of attack surface. Also, why should I blindly trust the Picard devs in the first place?
And hey, who says I couldn't give it access to my ssh key if I did want it to connect to some server of mine for me? Even better - maybe I could let it connect to my NAS, but not to the Tor relays I run. Right now it just gets access to everything with no oversight.
Because you're sending them to your doctor over that chat app [...]
I'm yet to send any medical documents to my doctor over IRC or Signal. If I wanted to I could explicitly grant these clients access to them[1]. This is not a reason to give them access to it by default. I trust these clients for my private communication, but that doesn't mean I should trust them with everything on my PC.
[1] Well, I'm actually running Signal under Flatpak, which does limit its filesystem access (sort of, it can still easily escape the sandbox via X11). In theory I am able to give it access to individual files only when it needs them, via xdg-desktop-portal. In practice that broke on every Linux install I've been using Flatpak on. This isn't an issue with the overall idea, just with the implementation.
Your IDE probably has your ssh key opened somewhere to access your deployment machine, and it probably cannot analyse that npm package without running bits of its code.
It's not even about the LSP - I have to run the package itself. A server for a web app has no reason ever to access my email client. If it needs to access e.g. a database running locally, I'd be happy to have to manually specify that.
Maybe the isolated browser instance touching card numbers should not be the same one exposed to news websites.
Although any reduction in functionality-locking behind JS would be welcome.
This is just wrong. If somebody breaks into my MP3 player they can steal my mp3s, my music preferences, maybe when I was at home, or which device was playing; but certainly not my ssh keys...at least that would be true if we had decent sandboxing.
In principle you can use a different ssh key in your mp3 player for syncing its data.
But in practice I don't believe that the issue can be completely resolved no matter how much software sandboxing we invent.
The world consists of hardware, and software is just a configuration of that hardware. As long as we, humans, use hardware, we are the weakest link in the system. But also as long as software is made for humans, it will be vulnerable, because otherwise it will be inconvenient.
It is not "weird" for people to neither own nor benefit from the programs running on their computer. At least, not for the last 20+ years it isn't. In fact, most normal people seem to think running free or open source software that you control, such as Linux, weird. And while the majority can't decide what is good, it does get to decide what is normal.
In fact, most normal people seem to think running free or open source software that you control, such as Linux, weird.
I’m not necessarily disagreeing with you, but most normal people don’t conceptualize “running FOSS software” as “running software that you control.” Desktop Linux isn’t weird because it’s FOSS; it’s weird because most people are only familiar with some combination of Windows, macOS, Android, and iOS. The “control” aspect just doesn’t enter most people’s minds at all.
(And if your explanation of how to control some piece of software starts with “just clone the repo and run ./configure…” then I hope you can see why that benefit of control does not practically exist for most people.)
How much you control the software you run is directly proportional to the amount of work it takes to manage. So yeah, just because it isn't in those words, tell them you're self-hosting a media server or whatever, and see their eyes glaze over when they realize you didn't just pay for spotify.
I have some software I control where the effort was invested once and has lasted for decades, and there is software that doesn't let users control it and pushes workflow changes once per quarter, so over the long term there are different scenarios.
I don't think you're taking what I said seriously. You deployed software you have full control over and there was no time investment at all outside of deployment; so my grandmother could do it in a single evening with no learning required and she would be able to make changes without my help? Do you understand what I'm saying?
I think you misunderstand what I said about amortisation. You do need to make changes and you do need to learn it; it just doesn't put you on a treadmill, so you can change it once and use it for a very long time.
You said that effort is proportional to control.
My point is: some controllable software is indeed inaccessible to non-technical people, but for technical people it saves both effort and aggravation.
Then you are talking past what I am saying.
It is important to distinguish «we are making a choice based on preferences that are weird to them» and basically «we are taking a good deal that they would also consider if they could afford it» though.
I have upvoted your original point «it is not weird for things to be bad», but once the subthread is also about implications of familiarity with the tools, I think it is important to remember this distinction.
Especially important if we are ever asked «what do you use for X?» — for some things the answer is a version of «you don't really want to know», for some things we have spent the effort of learning and can very easily setup the same elsewhere for others and hope to solve the problem for a few years (although it was hard the first time), many things are in-between.
Except that Firefox is already a sandbox. So if they can punch a hole in the Firefox sandbox, why not any other box you wrap it with?
Because the layers use different approaches, so now they need to use more vulnerabilities for a single attack, and possibly also handle difference in layouts between different systems.
It's not a «can they at all», it's either «is it worth the effort to them», or «will their mistargeted attack on someone else take you out as collateral damage».
Sandboxing is incredibly important because it limits damage. It's not a theoretical situation. For example, Python packages have been compromised several times. In one of those attacks, the compromised package (torch) stole SSH private keys.
With adequate sandboxing, these attacks would have caused little damage. The tools to implement sandboxing are there, both at kernel (namespaces, BPF) and userland level (Firejail, bwrap). They just need a bit of polish and more adoption.
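As a blunt illustration of how little it can take (a sketch, not a real policy; the package name is a stand-in): hiding the key material behind empty tmpfs mounts for the duration of an install would have defeated that particular theft, while leaving the network available so the install still works.

    bwrap --bind / / \
          --tmpfs "$HOME/.ssh" --tmpfs "$HOME/.gnupg" \
          pip install --user somepackage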
With adequate sandboxing this issue would have remained undisclosed for an indefinite amount of time.
With sandboxing open source loses its main benefit compared to closed source, because the main benefit of open source is not that it's open, as reading even open code is hard, but that it's all interdependent, and modules are constantly testing each others' interfaces.
This is an extremely bizarre argument against security, code inspection, and a few other things.
And, by the way, the malicious torch package got discovered because a security researcher found a weird package with a name clearly intended to mislead people and inspected what it did.
because the main benefit of open source is not that it's open, as reading even open code is hard, but that it's all interdependent, and modules are constantly testing each others' interfaces.
Can you explain how sandboxing would change this?
I don't think we can get to a state of the world where all software is so good that we don't need these boundaries anymore. Even if we could, when you're writing the software, it is much more economic to put in such boundaries so you don't have to spend the effort to make all parts of the system perfect.
Also that world just doesn't exist, xkcd:2347 is true. E.g. look at libxml2, which is most likely the most-used XML parser, that many applications use for parsing untrusted data, in early-2000s style C. The project has recently lost its (I think only?) maintainer: https://discourse.gnome.org/t/stepping-down-as-libxml2-maintainer/31398
They also started treating security bugs as regular bugs slightly earlier, because dealing with them was unsustainable for a single, unpaid volunteer: https://lwn.net/Articles/1025971/
Sandboxing by default and the norm would be that programs don't get universal access.
I was unpleasantly surprised when I test drove the latest Fedora Atomic desktop and then installed Firefox via a flatpak... and then I opened my ssh private key with the browser and it just opened it.
Interesting.
What you want to see is the number one complaint about the Snap-packaged Firefox in Ubuntu. It is one of the primary reasons there is so much hatred for snaps and snapd.
I use a desktop with a global menu bar. Firefox doesn't work with this. So, I install my own browser, Waterfox, which does, and I use that. There is no snap version, so I don't have the issue -- although I don't much care either way.
But I think what we have here is a good strong example of the conflict between what security folks want, and what ordinary moderately-techie users want, and how the two badly contradict one another.
This is not a principled or exhaustive list but a grab bag of things that have bugged me, mostly in libc.
There are probably more things I would change if I could touch the kernel.
This comment is in entirely the right direction. I haven’t thought about all these things so I don’t know if I agree with everything, but the thrust is correct.
The problem is that libc is broken. It's broken in the sense that there are a number of pieces of essential functionality that have very serious errors (crashing, memory corruption) if you use them without controlling every line of code in the process. Environment variables are the relatively well-known one, but exit() is actually worse. The entire concept of dlclose() is shaky. fork() has issues. All these things need another pass.
I implemented 1, 5 and 6 on my old liblinux project! Getting rid of errno alone made everything so much nicer. In addition to the environment, I also passed the auxiliary vector to the main function. No global or thread-local variables anywhere at all!
My plan was to bypass point 2 by masking all signals on process start and making signalfd the default way to handle asynchronous signals.
I didn't get far enough to implement my own dynamic loader and threads but your ideas seem reasonable. Libraries should not do anything other than provide symbols for the program to reference.
Other things I wanted to do:
Get rid of .init and .fini sections
Experiment with the "standard" file descriptors
Come up with ways for programs to have arbitrary numbers of them. Instead of having logic to open files, the program could say that logs will be written to file descriptor 5. The shell opens the file and redirects it to file descriptor 5. Programs would be able to design their own file descriptor APIs.
Come up with ways for programs to have arbitrary numbers of them. Instead of having logic to open files, the program could say that logs will be written to file descriptor 5. The shell opens the file and redirects it to file descriptor 5.
That... already works fine in regular unices?
> python -c 'open(5, "w").write("some output")' 5>foo
> cat foo
some output%
Of course the program needs to decide what happens if the caller does not provide anything at fd 5, but that's no different from anything else. And arbitrary numbers are a pretty shit interface.
The standard streams work exactly like that. Programs just assume the file descriptors 0, 1 and 2 are already open and just use them directly without opening anything.
The idea is to replace the "standard streams" with per application conventions. The program says what each file descriptor is supposed to be instead of reading everything from 0 and writing everything to 1.
What if programs could read structured input from specific file descriptors instead of parsing data from a single source? What if instead of outputting multiple things to standard output the program outputs one thing on multiple different file descriptors? Could simplify or eliminate parsing. Could enable new ways of interconnecting programs. Maybe each of these outputs is supposed to go into a separate program.
Maybe what I'm describing has been done before. I don't know...
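For what it's worth, today's shells can already prototype the one-output-per-descriptor idea; a small sketch (the caller has to supply the descriptors, as noted above):

    #!/bin/sh
    # report.sh: results on fd 3, diagnostics on fd 4; the caller decides where each goes
    echo '{"files": 42}'   >&3
    echo 'scanned in 0.3s' >&4

    # caller:
    #   ./report.sh 3>results.json 4>timing.log
    #   ./report.sh 3> >(jq .) 4>/dev/null    # bash/zsh process substitution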
I'd ditch C being the defacto lingua franca.
Libc, which would no longer be called libc, would define its interface in a language-agnostic FFI description format. Figuring out how to call it properly from another language wouldn't involve puzzling your way through header files. There would be no concept of untagged unions in the interface language, int sizes would either be explicitly specified or "machine width". Constants wouldn't be defined as "whatever this string substituted into a C compiler gets evaluated as". Etc.
My new lingua franca would have both an intra-process-communication format (function calling convention) and a inter-process-communication format (RPC). The RPC variant would also have a self-descriptive subvariant where a schema is included. We'd use it as a uniform wire format for system level IPC things. The shell would speak it, and thus would have the full power of libc and not be purely stringly typed. Stderr would be a stream of self descriptive RPC messages instead of stringly typed nonsense and terminals would distinguish between stdout and err and not mix them at the character level like they currently do. It might also be called something like stdmsg instead of stderr to suggest that it can be used for non-error non-string messages.
Combining this with @laat's excellent suggestion: instead of executables being passed a list of strings as an argv that happens to sit above the top of the stack, they'd be called using the standard function calling convention, with the shape of the input defined by a schema in some standard section of the executable format. I.e., as suggested by laat, calls to executables would be typed, and the type information necessary to call them would be stored in the executable itself.
I don't think typing returns from executables makes as much sense, because executables could misbehave.
Caveat: because we can't modify the kernel, this only applies to executables loaded by the dynamic loader. Executables loaded directly by the kernel would have to use the existing convention.
I think a lot of the shell ideas here are covered by arcan-shell. https://arcan-fe.com/2022/04/02/the-day-of-a-new-command-line-interface-shell/ I believe Arcan goes further, transporting the RPC stream across machines to enable sharing processes/pipelines between machines.
Multi-architecture binaries ala Mach-O.
I truly believe this is one of the things that has allowed Apple to transition between architectures so seamlessly. Meanwhile on Linux (and Windows to be fair) we have these abominations of lib32/lib64, and that only got us through x86 -> x64. For something like Apple's Rosetta 2, where they emulate x64 software on ARM, does that mean we'd need libx64 and then lib (for ARM native)? Absolute nightmare.
We should have had fat binaries forever ago.
And we very nearly did… IIRC this was killed for the usual reason nice things in Linux are killed: politics, and convincing people there is a problem worth solving. https://icculus.org/fatelf/
Yeah, it's a bummer. I read through one of the LKML threads and, while there were a few points about worrying about patents, overall it just seemed to be that most people didn't see this as a problem that needed to be solved.
Ah well. The future we could have had.
I think having binaries for different architectures in different directories is cleaner and makes more sense than multi-architecture binaries. It means you can easily install only the binaries for the architectures you need. For example, Alpine Linux supports 9 different architectures. If it used multi-architecture binaries for every package just in case you wanted to run x86_64 software on your aarch64 machine or s390x software on your ppc64le machine, the resulting binaries would be huge and take ages to build locally (since you'd need to cross-compile for all 9 architectures). Whereas if you just have different directories (e.g. /lib/x86_64, /lib/aarch64, /lib/s390x, /lib/ppc64le, ...), it's easy to split each architecture into different packages which can be installed (and built) separately.
Fat binaries are kind-of convenient for distribution outside of a package manager because you only need to distribute one binary, but especially if you include every architecture supported by Linux, they're going to end up being huge. I don't think the minor convenience of not having to download a binary specific to your architecture is worth the extra size.
I'm sympathetic to your point of view – I would like to make clear that I am not supporting a maximalist "fat binary" ecosystem in which all packages are shipped with every possible architecture. I think it would still make sense to have separate images/ISOs for major architectures. Fat binaries would be fully opt-in for the distros to decide what to do with.
The thing is, I lived through the 32 to 64 bit transition and it was an absolute mess. We didn't end up with /lib/x86_64 and /lib/i386 – we had /lib32 and /lib64. Which one was /lib symlinked to? Couldn't tell you! Fat binaries would have solved this. macOS systems are so clean. Hell, distros could have even used something like Mac's lipo tool (for directly manipulating fat binaries) to pack multiple architectures into one lib file on the client.
Yes, it's not a huge imposition to ask people to know what architecture their system uses. But on Mac, people *barely even noticed* – even across two separate architecture migrations (PowerPC -> x64, x64 -> arm64). The Linux (and Windows) solutions are simpler, sure, but the Mac solution is just seamless.
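(For reference, this is the macOS tooling being alluded to; file names are illustrative:)

    # Build per-architecture binaries, then glue them into one fat binary
    clang -arch x86_64 -o app_x86_64 app.c
    clang -arch arm64  -o app_arm64  app.c
    lipo -create -output app app_x86_64 app_arm64
    lipo -info app    # lists the architectures contained in the fat file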
I would like to be able to do everything without ad hoc parsing and "Unix sludge" :-)
ls -l | sort -n -k 5 to sort by size is another form of ad hoc parsing - the sort tool does splitting
ls -l -t to sort by time is a non-orthogonal solution
I think there should be one tool / language that starts processes in parallel too
e.g. I think of make / ninja / xargs -P / GNU parallel as "a form of shell"
It would be nicer if you can do that all with a single language. YSH has objects and closures, so I believe it's already possible, but it needs to be tested (help is welcome)
find is also a little language that's like awk -- e.g. find and test: How To Read And Write Them
The idea behind YSH is basically to use internal/embedded DSLs (with YSH syntax) rather than external DSLs like awk, make, find, xargs
And as vegai mentioned, programs/processes should be configured with the principle of least privilege (PLOP)
The shell is a language for expressing those policies
putting every Unix tool in the same process doesn't scale. There should be no plugins; they should just be Unix processes.
Or to put it another way, Unix processes are the plugins.
Yes exactly! An additional argument I'd make is that it's a bit inefficient (from the POV of software size / cost) to have plugins for text editors, plugins for shells, plugins for browsers, etc.
Instead it's nicer if they can all use approximately the same plugins, which are just processes
This also reminds me a bit of programming systems "versus" operating systems -- https://lobste.rs/s/yjbsof/programming_system
i.e. I think the operating system IS naturally the programming system, or at least it's the most fundamental one. I don't want a separate system on top with a separate plugin system
A programming system and an operating system are technically very similar, but very different socially.
A programming system exists for running/launching programs, but an operating system exists to manage the user's attention, show them as many ads as possible, report to the boss when the user is slacking, etc.
So whereas they are superficially similar, in the principle of operation they are very different.
However, UNIX, unlike Windows, tried to make a programming system into an operating system.
Very much this. In /. terms, +1 Insightful.
Classic MacOS is an operating system: programming stuff is hidden away, and it's designed for end-users, and for them, it was superb. (Pace some unfortunate limitations that were necessary compromises from a historical POV.)
Unix is a programming system, and the few fairly successful end-user Unixes have succeeded mainly by almost totally hiding this -- to the ire of programmers and those who identify with them.
There is no real technological reason they need be the same OS, if our tools were a bit more grown-up.
an operating system exists to manage the user's attention, show them as many ads as possible, report to the boss when the user is slacking
Lack of "/s" almost gave me a heart attack :-P
But yes, governments and corporations figured out that software is a good way to control people's behavior ...
And it's not a coincidence that the largest companies in the world now are OS companies
nushell and PowerShell don't count because they require "plugins" - your stipulation was without rewriting the kernel, and putting every Unix tool in the same process doesn't scale. There should be no plugins; they should just be Unix processes.
Yuup. I kind of live in nushell these days, but the boundary between nushell and "plain ol' Unix processes" is.. harsh.
In theory, it'd be great if there was a way for the shell and command to signal to each other that "oh, actually, we both speak JSON, let's just use that instead". In practice.. "stable two-way out-of-band handshake that isn't implicitly inherited by subprocesses" seems to be just about anathema to the Unix process model.. :/
Here are some notes to answer the rest of the question ... Note that I'm more concerned with Unix machines participating in a distributed system / cluster -- but desktop machines obviously fit that description now.
the init system
To manage clusters, the init process should be controllable over the network
There is a project to sort of "unify" systemd and the kubelet - https://aurae.io/ (unclear on its status)
I generally agree with the philosophy -- the problem is that the two inits/process supervisors  are overlapping and complex pieces of software.  The kubelet makes gRPC calls to a container runtime, and then the container runtime uses runc or crun, and then there is even a process shim for every OCI container I believe.  (A running OCI container is of course a Unix process.)
And then they both have different ways of getting the logs, etc.
So I'd like to see at least one of these systems become smaller (in deployed systems)
the filesystem hierarchy
I wouldn't actually redesign this -- I think it can evolve, and be cleaned up
Though actually, to "evolve" the file system into an abstraction suitable for a cluster -- I think you clearly need git-like / Bittorrent-like checksumming of subtrees. i.e. NFS isn't a good abstraction
the shell
As mentioned, this is the center of user space in my mind, because it's both a UI and a programming language
libc
libc should remain stable -- I think the point of Unix is that you can have programs in newer languages with Go and Rust, composed with C programs
packaging
A middleground between OCI/Docker and Nix/Bazel :-) Without "sludge" again -- the Nix language + shell + YAML/TOML is a form of Unix sludge
configuration
YSH has "Hay Ain't YAML", which can generate JSON, etc.
dbus
IPC is more of a kernel issue ... e.g. I learned recently that systemd uses DBus for "rootless cgroups" - https://lobste.rs/s/zo4nto/snooping_on_slow_builds_using_syscalls#c_wfnshm
I don't think this requires DBus. I would try to use Unix sockets for most things
polkit, PAM
TBH I don't know much about these, probably because I'm more concerned with Unix as part of a distributed system (again which desktops are, because they receive updates, etc.)
Amazingly, this suggestion is not in the list yet:
Typed signatures for all executables and scripts, just like TypeScript adds them to JavaScript natively or with .d.ts files.
This way the shell or shellcheck can check the types of the arguments to each command. Why do executables get to be wildly overloaded untyped, thread-unsafe functions?
A simple example: shell scripts should resolve paths to the executables they need at the start of the script and load their 'function signatures'. Then the script can check if all arguments fit the calls. Is the 3rd argument a file that exists at that point in the script? Does it conform to the JSON syntax? It should because the signature of the previously run command says so.
I recommend trying to build it!
Then you might not find it "amazing" -- because it doesn't work at Unix scale
For some reasons explained here: https://lobste.rs/s/sqtnxf/shells_are_two_things#c_pa4wqo
It can work in nushell -- because nushell has an "interior" [1] design
But a consequence of that design is a limitation - https://lobste.rs/s/ko5i9y/if_you_could_redesign_linux_userland_from#c_os2n68
[1] Oils Is Exterior-First (Code, Text, and Structured Data) - https://www.oilshell.org/blog/2023/06/ysh-design.html
I don't see why this couldn't be done, as long as you're willing to give up static typing on the caller's side. Each executable could expose a statically typed interface, which a client dynamically binds to using runtime "reflection" on signature metadata provided by the executable.
This is quite natural when the client is a shell (e.g., PowerShell), less so when the client is another strongly-typed program, but that's where proper APIs with bindings should come in – unfortunately, on Linux, this is still very hard to do across languages; the Windows world at least has COM/WinRT, however arcane they are.
This is obviously much less powerful than the "ideal" of system-wide static typing would be, but still would be a huge improvement on today's status quo (outside of the mostly-closed universes of PowerShell/Nushell/...) in type safety, consistency and ease of use.
A bunch of the things mentioned in the question I think of as "this is okay/bad/terrible but it's what we have now".
I think that DJB figured out process supervision and logging correctly (see the run-script sketch below).
I think that init as pid 1 should be so tiny you can see there are no bugs in it, and probably should be integrated into the kernel.
I think that a fully integrated ZFS as the default and primary filesystem would alleviate a lot of stress. There are too many tools trying to compensate for not having ZFS around, or which inspired features of ZFS.
I think that we should remember that the filesystem is a database, and any time someone proposes a key-value store, a config file format, or anything which resembles a tree, a directory structure with filenames and contents is an appropriate alternative.
I think that the basic filesystem permission system is completely inadequate for systems larger than embedded or single-user, but most of the alternatives are hard to reason about. Groups should be easier and more popular; bringing them into per-user namespaces might be correct.
I think that everything is made up of dependencies and context along with whatever you were trying to do.
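The DJB-style supervision referenced above boils down to something like the following (daemontools/runit conventions; service and program names are illustrative):

    #!/bin/sh
    # /service/myapp/run -- the supervisor restarts this whenever it exits
    exec 2>&1                   # send stderr to the attached logger
    exec myapp --foreground     # stay in the foreground; never daemonize

    #!/bin/sh
    # /service/myapp/log/run -- a dedicated logger per service
    exec multilog t ./main      # timestamped, automatically rotated logs in ./main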
I think that init as pid 1 should be so tiny you can see there are no bugs in it, and probably should be integrated into the kernel.
There is a certain je-ne-sais-quoi that I love about the idea of pid 1 being so small that you could send it a command to restart all of userland without requiring an actual reboot. Even, like... other people have mentioned sandboxing and it doesn't seem implausible that you could have multiple boot profiles that you could switch between without needing an actual reboot (say one with a small encrypted filesystem for sensitive work and another for general purpose mucking around)
Edit: I am aware of runlevels. They're... almost there but not quite.
I'm not sure I've ever wanted to restart all of userland except when something was so borked that I wanted the kernel to be restarted, too.
But I certainly would not object to named init sessions.
directory structure with filenames and contents
I like this idea but wonder if there's a missing UI for it. When I have one big config file, I can open it and see many key/value pairs (and comments) on the same screen. I wonder if there's an editor plugin that takes a tree of filenames and contents, and presents it as one big buffer.
There could be, but in a well-designed config tree and a shell with basic autocomplete, grep, ls, cat and echo are all you need to make a single change.
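A quick illustration of that style, with a made-up app name:

    # one value per file, directories as sections
    cat ~/.config/myapp/network/proxy
    echo 'socks5://127.0.0.1:9050' > ~/.config/myapp/network/proxy
    grep -r '' ~/.config/myapp      # dump the whole config as path:value pairs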
I think that we should remember that the filesystem is a database
I disagree.
I had this argument often when I publicly argued that, in an era of widespread cheap persistent memory (PMEM, e.g. Intel Optane) -- yes, I know it was cancelled -- we could seize the opportunity to build a new generation of OSes that do not need the 1970s abstraction of a filesystem or files. That we should consider that as legacy baggage and eliminate it.
If you consider them as a Venn diagram, there is a big overlap between what databases do and what filesystems do, but it is not total. They are not the same things. Replacing one with the other is so very very hard that nobody has ever succeeded in doing it, and multi-billion-dollar efforts have failed.
I didn't say it's a relational database. If you have a need for a persistent key-value tree, that's a kind of database, and any filesystem works pretty well for that. If you need a single-access RDBMS, sqlite is obvious, and if you need concurrent access, we all know where to find Postgresql.
But keeping your config in sqlite is probably overkill, he said looking at Firefox.
Firefox config might plausibly run out of inodes on some systems!
Also: Git relies on filesystem data consistency in a way that most filesystems in use now do not promise, and I have had unfortunate hard-shutdowns for random reasons corrupt Git clones beyond repair… which is one of the reasons why I trust SQLite-backed DVCSes more than Git.
I think that a fully integrated ZFS as the default and primary filesystem would alleviate a lot of stress. There are too many tools trying to compensate for not having ZFS around, or which inspired features of ZFS.
This isn't possible without kernel changes, unless you want to use FUSE. (And isn't possible in the kernel either unless Oracle relicenses ZFS under a GPL-compatible license.)
This isn't possible without kernel changes, unless you want to use FUSE
Yes, it is, and multiple Linux distributions do it today. Off the top of my head: Ubuntu, Void Linux, NixOS, TrueNAS (based on Debian), and Proxmox (also based on Debian), all include OpenZFS by default. Some other distros can optionally use the Proxmox kernel in order to use ZFS; I think OMV is one.
It is possible to boot directly off ZFS.
GRUB has native ZFS support, and Sun released enough of ZFS under GPL 2 to permit this.
https://lwn.net/Articles/418869/
Or via ZFSBootMenu:
While I agree that TOML is an improvement over JSON and YAML, I'm still wondering why are we as an industry apparently incapable of designing and widely adopting a sensible, human-friendly config format.
Closest thing to an "ideal" config format for me that I know of are .psd1 (PowerShell data) files:
@{
	Name = "go-lang"
	Architecture = "x64"
	Version = "1.25.2"
	Install = @{
		Url = "https://go.dev/dl/go1.25.2.windows-amd64.zip"
		Hash = "..."
		Archive = $true
	}
	ListOfThings = @(
		"string"
		1
		@{A = 5}
	)
}
I could do without the top-level dictionary literal, replace @{} with {}, @() with [] and $true/$false with true/false, and a few similar nits, but overall, it gets most things right:
For handwritten files, I'd also somewhat like at least basic support for variables, to deduplicate parts of the config (which PowerShell already provides a compatible syntax for), but I'm still not 100% sure it's worth the downsides (mainly the significant difficulty of machine-editing such config files).
I kinda like roff. It should be modernized, but it is quite compact and the result is nice.
I agree but it gatekeeps people from writing manual pages.
Many projects just see it as a mandatory thing to get into Debian, often resulting in low-quality manual pages.
It should be modernized
mdoc? It still looks arcane but IMO it's justified - it's actually pretty pleasant to use, more powerful than typical markup languages without overwhelming syntax, and I like how it really encourages semantic newlines.
No memory-unsafe userlands.
I wonder if there’s a way to prove that a given binary was checked for memory safety…
I mean, a toolchain that produces reproducible binaries plus a memory safe language and/or a language with a proof system that can (and was used to) prove memory safety. You could even embed the source code (and proofs if done with a proof system) in the elf file. In principle, yes. In practice, no one does this and there's no tooling built for this.
I'm not convinced this is solving a relevant problem though. The distro generally builds all the software itself anyways... you just make it a requirement to be packaged that the program is memory safe.
I started several projects with the goal of reimagining Linux user space. My current goal is to rewrite it all inside my own lisp interpreter.
My idea was to leverage the stability of Linux system calls as well as their language agnostic nature to create freestanding, zero dependency Linux applications.
First I started a liblinux project whose aim was to provide lightweight stub functions for the Linux system calls with none of the heavy machinery found in C libraries. I also provided optional minimal process startup code whose only job was to collect all the process parameters that Linux puts on the stack and pass them as arguments to the main function.
It turned out one could get surprisingly far with just this. I managed to write some example applications rather easily, such as a program that prints the terminal size. The lack of libc functions was the biggest pain point but I just wrote my own versions.
I eventually abandoned liblinux because I discovered the kernel itself was developing an awesome nolibc.h file which did the same thing as my project. Competing with the kernel guys wouldn't have been smart. By now it's become a sprawling directory full of functionality, no doubt much better than what I came up with.
So I moved on by starting the lone lisp programming language. It's a completely freestanding lisp interpreter targeting Linux exclusively. My goal is to make it powerful enough to rewrite the entire Linux userspace inside it and then just start using it for my own utilities. It has a system-call primitive, so it should theoretically be able to do anything, from controlling terminals to mounting disks. It's still very much a work in progress though. I recently added delimited continuations and am now working on generators. I still need to implement some sort of record type so that it can work with kernel data structures.
One feat I'm extremely proud of is the code embedding mechanism I came up with. I made it so programmers can create applications by making a copy of the interpreter and adding code to it. The interpreter introspects into its own ELF image at runtime and runs the code.
Everybody told me I had to open and read /proc/self/exe but I found a way to make Linux itself mmap the data for me before the interpreter has even begun executing. This supports early boot environments where procfs hasn't even been mounted yet! Even if my project is a total failure in the end, I hope people will at least remember this!
I also worked on adding a linux_system_call builtin to GCC in order to eliminate the need for inline assembly. It'd simply move things into the correct registers and emit the system call instruction. Discussion can be found in the mailing lists. Sadly I lost that work due to a hard drive crash after my laptop fell down.
This is my lifetime project. I have no idea if I'll succeed. Hopefully I will at least inspire others to try similar things. The Rust folks for example. Yes, you can rewrite the entire Linux user space in Rust if you put enough effort in. No, you don't need libc.
Ban dynamic linking altogether. I think this would fix most pains I ran into when developing software that needs to run on Linux.
In many cases, I'd agree. However, for certain security-critical code such as cryptographic libraries, I'd prefer them to be implemented as dynamically linked libraries.
The last thing I want with my security sensitive libraries is for there to be bugs because I compiled an application with a different set of headers than the library was compiled with.
Just have a packaging system that tracks build inputs and knows how to rebuild the world.
Yeah this has been a real issue for me in the past. As soon as you start dynamically linking you partially lose control over the end-user experience. You are left to the mercy of “semantic” versioning, which has nothing to do with semantics and everything to do with humans naming things. It always inevitably goes wrong and you run into ABI breakage or features you depended on being classified as bugs and fixed.
Personally I prefer to be in control of every version of every library that’s part of my application. If something is broken in an open source library, I can deploy a hotfix without having to wait for the maintainer to merge it (+package managers to adopt it). More importantly if it works, it (usually) actually works.
I heard from multiple application maintainers that they do not use the latest library versions because their target distros do not ship them, and static linking is actually forbidden on many. This leaves everyone worse off, except for (nebulous) disk/memory savings and "security".
Obviously there are limitations to statically linking everything, but the Linux kernel has been an extremely stable ABI. Hopefully one day we can say the same for a graphics/input API and be able to just copy applications from one distro (version) to another and run them…
Rust and Go already do this and it seems to work just fine in practice 🤷♂️ Very often the vulnerabilities in massive crypto libraries are not exploitable for most users. Static linking allows you to aggressively remove unused code, mitigating that. Additionally if a CVE is found, the software vendor has to review if their users are affected and if everything has been updated correctly anyway. Couple that with the downsides of dynamic linking and the argument in favor starts to be awfully thin…
Relevant post: https://gavinhoward.com/2021/10/static-linking-considered-harmful-considered-harmful/
Hear hear.
The best-case scenario for dynamic linking might be macOS's singular full-featured SDK. The system frameworks folder holds 1.4 GB of shareable code. But even there, Foundation is 2.4 MB and AppKit is 12 MB; it would be tolerable if many processes included those statically. And in the Electron era the benefits are given up anyway.
Unfortunately the frameworks constantly break applications when upgrading macOS, though, which is exactly the thing static linking would solve when redesigning userland.
Windows is in a similar position. Dynamically linking a stable userland API is obviously fine, but Linux does not have this 🥲
Dynamic linking is essential for being able to compile libraries separately from each other. You don't want each library to contain a copy of all their dependencies.
That's not how static linking works. Statically compiled libraries only contain their own object files. A statically linked application includes all its dependencies however.
No it isn't. At best you could argue that's a limitation of current compilers and since the proposal is to rewrite the whole userland including compilers... but it isn't even that.
In C land, .a archives contain .o files corresponding to individual .c files. If you put only the library's own .o files into its .a (not its dependencies'), and then pass the dependencies' .a files to the linker as well when you link the final binary... everything just works.
when developing software that needs to run on Linux.
I don't understand this point. No one's forcing you to use dynamic linking when shipping to Linux. Why can't you just ship statically linked executables if that solves problems for you?
Part of the problem with that is that the default libc most Linux distributions use doesn't really let you do static linking. Even if you link every library, including glibc, statically into your binary, it will still try to dlopen stuff, and will still depend on that stuff being available. Making a portable static binary with glibc is challenging at best, and practically impossible in the general case.
Now there's musl, the go-to libc when one wants to statically link on Linux, but that comes with its own set of problems, like having to compile all your dependencies against it too. And there are software and libraries out there that do not work all that well (or even at all) with musl.
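To illustrate the glibc half of that: even a binary built with `gcc -static` can still end up dlopen-ing NSS modules at runtime for something as mundane as a user lookup, whereas the same code built against musl stays genuinely self-contained. A tiny example of the kind of call that triggers it:

```c
/* getpwuid() goes through glibc's NSS machinery, which loads libnss_*
 * plugins with dlopen even in a "statically linked" binary; musl instead
 * reads /etc/passwd directly, so e.g. `musl-gcc -static` stays truly static. */
#include <pwd.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    struct passwd *pw = getpwuid(getuid());
    if (pw)
        printf("current user: %s\n", pw->pw_name);
    else
        perror("getpwuid");
    return 0;
}
```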
No one's forcing you to use dynamic linking when shipping to Linux.
For desktop applications, you're forced to dynamically load the graphics driver/display server/sound server because those are specific to the user's desktop environment, which means you also need dlopen and therefore glibc's dynamic loader, which makes it challenging to ship software with better guarantees than "works on my machine".
Back in the pre-zig days, on multi-year projects you also had to deal with the extremely unfunny problem of getting repeatedly blocked by the system compiler (or entire OS if you're unlucky) exploding because it's a dynamically linked house of cards...
glibc imposes dynamic linking requirements, as described here: https://stackoverflow.com/questions/57476533/why-is-statically-linking-glibc-discouraged
I don't understand this point. No one's forcing you to use dynamic linking when shipping to Linux. Why can't you just ship statically linked executables if that solves problems for you?
That doesn't work (and neither does using an alternative libc) as soon as you want to use the GPU, as you need to dlopen the libGL.so that matches your user's GPU driver. Technically, if you want something that doesn't work on 70% of consumer GPUs, you can compile and link Mesa statically, but if you have users who require the NVIDIA driver then you have no choice but to link dynamically against glibc.
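A stripped-down sketch of what that looks like in practice; the point is that the driver library has to come from the user's system at runtime, and it drags its libc along with it:

```c
/* The GL dispatch library on the user's machine (which in turn loads the
 * vendor driver, Mesa or NVIDIA) can only be picked up at runtime, and it
 * is linked against the system glibc. Build with: cc gl_probe.c -ldl */
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    void *libgl = dlopen("libGL.so.1", RTLD_NOW | RTLD_GLOBAL);
    if (!libgl) {
        fprintf(stderr, "no usable GL driver: %s\n", dlerror());
        return 1;
    }
    /* Just prove we can resolve an entry point; a real program would set up
     * a GLX/EGL context before calling into GL proper. */
    void *sym = dlsym(libgl, "glXGetProcAddress");
    printf("glXGetProcAddress resolved at %p\n", sym);
    dlclose(libgl);
    return 0;
}
```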
After dealing with statically linking everything(ish) multiple times I am very comfortable doing it. My point is that it is a massive pain to do so, for (almost) no benefit whatsoever. See https://gavinhoward.com/2021/10/static-linking-considered-harmful-considered-harmful/
Go/Rust/Zig solved this problem and compiling software is an absolute joy compared to C/C++. Zig even went so far as to ship an amazing cross-compiler for C/C++, just because it was such a painful thing.
I feel a better solution would be something like WinSxS: keep the actual DLLs versioned, preferably signed, so we have a relatively stable ABI. Windows deals with DLL hell much better than Linux; think about the userspace infrastructure improvements between Windows 95 and Windows 10. Linux hasn't had that much change during the last few decades.
Most .so files have a version suffix, and this works fine for leaf dependencies. It is a disaster for diamond dependencies. If my app depends on libraries A and B, and they depend on C, bumping the version of B may change the version of C it needs. So now I have two copies of C and they have conflicting symbols. You can sometimes work around this by using linker namespaces, but now passing a type exposed from C between A and B will not work.
None of this is really specific to dynamic linking, except that in a static-linking world you get exactly one copy of C and have to fix the incompatibilities at link time.
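For completeness, the "linker namespaces" escape hatch mentioned above is glibc's dlmopen; a hedged sketch (library names invented) of loading two conflicting copies of C side by side:

```c
/* dlmopen() loads each copy of the conflicting dependency into its own
 * link-map namespace, so their symbols don't clash. The catch from above
 * still applies: objects from the two copies can't safely cross the boundary.
 * libfoo.so.1 / libfoo.so.2 are placeholder names. Build with -ldl. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    void *c_for_a = dlmopen(LM_ID_NEWLM, "libfoo.so.1", RTLD_NOW | RTLD_LOCAL);
    void *c_for_b = dlmopen(LM_ID_NEWLM, "libfoo.so.2", RTLD_NOW | RTLD_LOCAL);

    printf("copy for A: %p, copy for B: %p (NULL means the load failed)\n",
           c_for_a, c_for_b);
    return 0;
}
```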
Side-by-side installation of multiple versions
Windows Vista used this approach, where each lib can request a particular version; compare slotting in Gentoo.
What do you mean by "preferably signed"? I don't see how cryptographic signatures would help with this.
Sad times for anyone who relies on LGPL libs, of course.
You have quite a few options:
For me the issues all stem from the system being dynamically linked per default. Making dynamic linking the exception rather than the rule is what I tried to say.
Well that I could get behind. Obviously the problem would be more acute/annoying if (5) was unavailable, which is what I took from your comment.
Mine are mostly about simplification, if you could start from scratch. These aren't meant to bash what I'd get rid of. I get why all of these exist, but I think if one could start from the ground up, a lot of complexity could be removed. But yeah, with the condition of starting from scratch.
Get rid of containers and replace them with things like WASI, pledge, etc., as well as static binaries everywhere; also get rid of Flatpak, Snap, etc. Why not have that be the default, with software developers expected to package things into their own system?
Make something akin to OpenRC the default, but with rigid standards/linters, so things would simply refuse to start if not properly done.
logfmt (the basic one) + tooling for it. Not that "binary, but it's still text" mess we have now.
E.g. your OpenRC-alike might use something like daemon to take care of restarts, reloads, logging, etc. All these logs go to the classic text + gzip/lz4/... locations (but not written by the applications themselves! there is daemon for that). Then we have a utility similar to journalctl (but with quite a few more features) to handle those files and allow structured log filtering. Instead of specifying a unit, it could just filter on app or something, since it's logfmt anyway. That very utility could also do on-demand index building/expansion. I imagine a cron job building indices, and basically when the tool is run it just tops them up with newer log lines. Of course with a "don't use indices" switch and detection of errors in indices (e.g. version + checksum or something), etc.
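As a toy illustration of how little tooling plain logfmt needs, here is a small filter (purely a sketch, not any existing utility) that keeps only the lines carrying a given key=value pair:

```c
/* Filter logfmt lines on stdin by one key=value token given on the command
 * line, e.g. `./filter level=error < service.log`. No quoting support,
 * indices, time ranges, etc. -- just the core idea. */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    if (argc != 2) {
        fprintf(stderr, "usage: %s key=value\n", argv[0]);
        return 2;
    }
    char line[4096], scratch[4096];

    while (fgets(line, sizeof line, stdin)) {
        strcpy(scratch, line);   /* tokenize a copy, keep the original intact */
        for (char *tok = strtok(scratch, " \n"); tok; tok = strtok(NULL, " \n")) {
            if (strcmp(tok, argv[1]) == 0) {
                fputs(line, stdout);   /* emit the matching logfmt line */
                break;
            }
        }
    }
    return 0;
}
```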
Regular old cron jobs could be extended with timer features. So just another character indicating whether it should run at a later time if the system was off, and such. Again, the status and so on could be checked with something parsing the logfmt logs. Or it could keep status differently. Haven't thought this through.
Get rid of NetworkManager, etc., replacing it with something interacting with ifconfig and a config file like hostname.if(5). I think netctl is pretty close.
Switch to the most recent OSS, or maybe sndio, and extend upon it for sound. Keep it something simple and have extra use cases be extra. Sound got way too complicated trying to be everything. I know, it's harder than it sounds (hehe), but it's a bit weird how much of a hassle it has become, sometimes because of misunderstandings. It started with the lie that multiple applications couldn't talk to ALSA at the same time, then specific hardware came in, then networking, and now sadly a lot of it is way too messy for relatively basic use cases that were trivial in the late 90s/early 2000s. I know there are more use cases now. I know audio hardware became more complex. I just think the problem emerges from, in addition to all that, trying to be everything, so every screw you turn messes with everything else. I don't know for sure this could really be fixed in a better way, but it's a hope. What we would get rid of is layers upon layers of compatibility for applications all doing things differently. I think that alone would be a benefit of starting from scratch.
I am sure there is something like the above also existing for PAM, polkit, etc. Not sure what, though. PAM was a really cool idea, but I think there are lessons to be learned from the various ways of doing auth and permissions. Maybe using sudo/doas/... in a less "just get root" way would be the way to go? Could be totally wrong, but the fact that everyone seems to disagree makes it seem like, starting from scratch with today's learnings, one could come up with a simpler solution.
Since there are actually quite a few projects in that regard: I think a middle ground for X and wayland would be nice. I get why people hate on X, but completely disregarding some cool ideas just to be different doesn't seem to be the right approach. Nowadays there are learnings from both to be had.
Embrace one standard config format. I'd go for UCL, TOML or something akin to torrc. Just not YAML or JSON, which clearly aren't config file formats. They are not bad formats, they are just horribly abused for configuration, so much so that people think they are bad config file formats when they are not.
Don't reinvent inetd, but maybe use a good version of it. Have make be a good default way to make things. Have a lot of things that yell at you when you are doing things wrong. Use stdout, stderr, maybe even more, to make piping more flexible.
Maybe make DBUS simple, maybe make a lot more use of 9P, have an easy way to do NFS, also in terms of auth and encryption, so one can just share something over the network. It's a shame that this protocol became almost obscure in many places and all the newer ways have other shortcomings. I think there's certainly something that works for the vast majority of use cases and when starting from scratch this could be the standard.
What would also be cool is if one could agree on things like --version and -V, and have a clear distinction between -h for help and -h for human-readable. I so frequently do things along the lines of du -h | sort -h. These are things where I wish there were linters.
To be fair though, popular utilities like find taking single-dash long options currently make things very mixed.
Regexes! Agree on one way to do regexes. I have opinions on which ones I like and don't like, but at this point I wished people would just agree on one and be done. Maybe re2?
Short fun rant: force everyone to have a better file selector than what the last few GTK versions are up to. It's an absolute atrocity.
Maybe make things on the desktop a tiny bit more messy again? Something where I totally understand why it exists, and completely agree with, is application.desktop files, but I have to say I miss being able to make arbitrary "shortcuts". They are a bit like ad hoc shell scripts, which are nice. Maybe both could coexist?
Another "messy" thing that I've heard multiple people complain about on both Windows and Linux (after KDE 3) is that you cannot graphically select files in the sense of scrolling through your file list and simply selecting, say, all the files with a certain file name length by positioning the rectangle you drag. I think on the UI side, in making everything super simple and slick, we as an industry overshot and made more things more complicated by forcing people through dozens of screens just to have fewer icons and buttons. I think it could be an either/or: e.g. you could be the person who would never dream of opening Windows Explorer, or you could be the person who does.
Also make interacting with file systems nicer, especially across file systems. I've come across so much software that simply breaks when it either doesn't have permissions all the way up to / or when it traverses a file system boundary, sometimes even with symlinks and such. I don't know where this stems from, but a lot of software doesn't like it, which is extremely annoying when you buy storage to have more space for a certain directory and it still won't work. As mentioned, I have not investigated this too much yet, but it has happened a couple of times in the last year or so.
Switching from ALSA would be a major regression. ALSA got lots of driver fixes during PulseAudio development. The other kernel sound stacks won't support Bluetooth audio and other complicated output devices. Linux audio is in a very good place right now with ALSA + PipeWire (which has compatibility with all userspace APIs!).
I agree with almost all of this, except this:
Get rid of containers, replace with things like WASI, pledge, etc.
You talk a lot about simplification, which I too think is really important... but then you suggest removing something that's conceptually quite simple and clean, and replace it with multiple more complex things. Why?
Why not just go the Plan 9 route: every process is a container all the time, and there is no "real" level. Everything is virtualised, and if a process doesn't have permissions, it can't see outside of its box.
I know there are tools to do it, but I would love an easy and universal interface for per-process access (permissions, resources, etc.) and the same for delegating access, such that one user could easily and safely delegate some commands to another, and a user or a set of users could only run commands in certain folders with certain permissions, etc. Nowadays I feel like it's only possible with a lot of different tools used together, without a common interface across all types of access.
Love this line of inquiry. A lot of what folks have already said resonates with me, but a couple of thoughts:
Ever read the Debian packaging manual? Pretty fucking arcane.
That's only a self-imposed wound of DEBs and RPMs; Arch does it much more approachably.
People have thankfully already talked a lot about sandboxing (although I think the Linux kernel wouldn't be a good fit for a tightly sandboxed system - but that's a whole other topic), which is my main concern, but I'd also clean up the shell a bit.
Plan 9 allows commands to be in subdirectories, e.g. git/commit. I think that's much cleaner than subcommands, and allows their tab completion without special support in the shell.
On that note, it'd be nice if getopt syntax was unambiguous. I think one step in that direction would be to require = before non-positional arguments. For example, instead of df -hxtmpfs, I'd run df -hx=tmpfs, which can be unambiguously parsed as  df -h -x=tmpfs. I don't think there are any real benefits to this, but if we're designing the system from scratch, why not?
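Since it's a made-up convention anyway, here is a toy parser showing how the '=' rule removes the ambiguity (hypothetical syntax, sketched only for clarity):

```c
/* Proposed rule: short flags may be bundled, but anything taking a value
 * must use '=', so "-hx=tmpfs" parses unambiguously as -h plus -x=tmpfs. */
#include <stdio.h>

static void parse(const char *arg) {
    if (arg[0] != '-') {
        printf("positional: %s\n", arg);
        return;
    }
    for (const char *p = arg + 1; *p; p++) {
        if (p[1] == '=') {                 /* this flag takes a value */
            printf("option -%c = %s\n", *p, p + 2);
            return;                        /* the value consumes the rest */
        }
        printf("flag -%c\n", *p);          /* plain boolean flag */
    }
}

int main(void) {
    parse("-hx=tmpfs");   /* -> flag -h, option -x = tmpfs */
    return 0;
}
```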
Zsh with setopt path_dirs has allowed this git/commit idea for decades.  Ever since POSIX took it away in POSIX sh, really.  In earlier eras, whole systems like git were built around this.  Besides your Plan 9 example, another notable one is probably the MH mail handling system which would have fairly non-descript names like "inc" to incorporate email, but you could invoke as mh/inc.
Zsh is really popular which argues that this always remained in user space.  It is true, though, that exec[lv]p[e]* probably won't do this for you and definitely system which calls the shell won't.  Anyway, you sound like you might be a Zsh user in the making. :-)
although I think the Linux kernel wouldn't be a good fit for a tightly sandboxed system
Partly in jest, but Android begs to differ 😅 It would definitely be a change, but user-per-app + seccomp + namespaces can go pretty far when it comes to sandboxing.
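For a feel of how blunt but effective the kernel primitives already are, here is a minimal strict-mode seccomp sketch (real sandboxes use seccomp-BPF filters plus namespaces, but the entry point is the same):

```c
/* After this prctl call the process may only read, write, _exit and
 * sigreturn; any other syscall gets it killed. */
#include <stdio.h>
#include <sys/prctl.h>
#include <linux/seccomp.h>
#include <unistd.h>

int main(void) {
    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT, 0, 0, 0) != 0) {
        perror("prctl");
        return 1;
    }
    /* write(2) is still allowed... */
    const char msg[] = "sandboxed: only read/write/exit from here on\n";
    write(1, msg, sizeof msg - 1);
    /* ...but e.g. an open(2) here would get the process SIGKILLed. */
    _exit(0);
}
```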
Linux has the same sandboxing system as Plan 9. The difference is that in Plan 9 you talk to the kernel using the filesystem (9P) rather than syscalls, which gives you much more granular sandboxing.
Plan 9 actually barely has any sandboxing. IIRC (been a while) "upstream" Plan 9 only really has fork(RFNOMNT), which completely disables mounts (a core feature of the system), breaking most programs. 9front has auth/box which I believe is actually useful.
The former is actually an issue Linux sort of shares. Sandboxing often controls access to system features instead of the actual resources you care about. This means some programs might just not work with reduced privileges. In particular, you might not be able to sandbox the sandboxing tooling itself, which is unfortunate (but I've run into this with other sorts of programs too).
Not limited to Linux either… too many systems object more to the user lying to the programs (which would get full APIs, just simulated data) than to the programs lying to the user.
This question looks spectacularly like "If you could redesign Linux kernel from scratch..."
Well, systemd was a smart redesign of init because of those great kernel features (taking advantage of cgroups was reason enough to create systemd).
Libc, dbus and PAM will probably look the same, as there's no compelling reason to explore dramatically different ways or to split/join functionality. Maybe the kernel could implement features to simplify them (yet a couple of projects failed at moving dbus features into the kernel).
I guess sound and graphics could do better... if commercial hardware design was not a mess.
This question looks spectacularly like "If you could redesign Linux kernel from scratch..."
Really? It looks like the polar opposite to me.
Take the kinds of guarantees we seek within userland processes, like eliminating shared mutable state, and lift them up to interprocess concerns. A bit like the pure keyword that's always proposed for language functions—at how many levels of execution hierarchy can we guarantee-and-require that the subtree beneath has a certain category of safety?
Reconsider the filesystem hierarchy standard, or maybe look at plan 9's namespacing. A world of sandboxing apps would really like things to be organized by use, not by kind. However, I don't want to go so far as to stick user data in that app sandbox where the app begins to own it. I don't know the right answer but, by analogy, I have a much better time focusing on my goals when each of my project folders has a todo list and a repo in it, compared to when I used a todo database app with one mountain of tasks and a single repos folder with all my git clones.
Taking the loose definition of Linux userland: I'd also take a pass at user interfaces. Apply some of the principles we learned in the GUI era, like progressively disclosing details, having each program teach you how all programs work, etc. Iron out inconsistent names of common CLI options. Bless the idea of subcommands and orderings, gathering many commands into fewer. Don't hide help away in separate commands like man and info or in the browser.
While we're at it, of course, standardize the common keyboard shortcuts whose use we expect to last. Copy and paste and undo have got to work the same everywhere. Another thing to swipe from macOS is the customizable behavior of the keyboard in all GUI text entry fields, whose defaults provide a useful handful of Emacs keys.
It'd be nice to catch up to the 1980s and have SIGINFO, at least.
I think signals are quite an abomination and if it were me I'd design a custom IPC mechanism for everything like this instead of using signals, although in theory it's possible to define SIGUSR1 to be SIGINFO if we control all of userland.
Wouldn't the default action the kernel performs for it remain "kill the process", rather than ignore?
I think «control all of userland» includes «default it to inheritable ignore, and re-set to inheritable ignore before each exec»
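As a small aside, GNU dd on Linux already does exactly this repurposing: poke it with `kill -USR1 <pid>` and it prints I/O statistics. The pattern is easy to copy in any long-running tool; a sketch:

```c
/* Treat SIGUSR1 as a stand-in for BSD's SIGINFO: on `kill -USR1 <pid>`,
 * report progress. Runs until interrupted with Ctrl-C. */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t info_requested = 0;

static void on_info(int sig) {
    (void)sig;
    info_requested = 1;            /* async-signal-safe: just set a flag */
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_info;
    sigaction(SIGUSR1, &sa, NULL);

    for (unsigned long i = 0; ; i++) {
        if (info_requested) {
            fprintf(stderr, "progress: %lu iterations\n", i);
            info_requested = 0;
        }
        usleep(100000);            /* stand-in for real work */
    }
}
```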
A way to do GPU graphics without linking to the system libc. Honestly, I wish Mesa made their Vulkan implementation loadable by a simple ELF loader, with no dependency on libc. TLS might be a pain. And non-Mesa implementations would probably have the same problems, but at least it would be a way forward.
Video very related: https://youtube.com/watch?v=pq1XqP4-qOo
You can already use any libc you want (as long as the GPU driver library is also linked to it). Self-contained graphics drivers would be nice but I think the real solution is eliminating NVIDIA.
What I mean is, I want to do graphics from a static binary. It's not possible on Linux. And the libc dependency is the reason you can't have a binary that just works on every distro.
Why not ship libc?
Won't work; it has to be the same libc that Mesa (the system libs) is using.
The video covers the issue very well.
What I meant to suggest was shipping all the libraries you depend on, including libc and Mesa.
In theory you can do that. But then your thing won't work on new GPUs, for example, as the userland drivers need updating.
You likely will have to write your own GPU userland. This isn't impossible but takes a remarkable amount of effort per supported chipset.
One of the partial designs that I had:
I think I haven't fully appreciated what people want from containers and I need to investigate further what's possible with unshare.
Filesystem hierarchy:

| Place | Description |
|---|---|
| /boot | Bootloader, Linux, initramfs, PID 1 binary, and supporting files |
| /devices | Device files under a custom structure and naming scheme, by some eudev-like thing |
| /system/config | Base system configuration |
| /system/components | Base system components |
| /system/data | Base system mutable data |
| /system/logs | Base system logs |
| /system/run | Base system runtime files such as UNIX domain sockets |
| /system/kernel/sys | Typical Linux sysfs |
| /system/kernel/proc | Typical Linux procfs |
| /vendor/config | Third-party configuration |
| /vendor/components | Third-party components, e.g., each traditional "package" is one directory in here |
| /vendor/data | Third-party mutable data |
| /vendor/logs | Third-party logs |
| /vendor/run | Third-party runtime files such as UNIX domain sockets |
| /local/config | Local software configuration |
| /local/components | Local software components, e.g., each traditional "package" is one directory in here |
| /local/data | Local software mutable data |
| /local/logs | Local software logs |
| /local/run | Local software runtime files such as UNIX domain sockets |
| /magic | Reserved for per-process use via mount namespaces |
| /magic/self | Potentially an extended version of /proc/self... if feasible without changing Linux |
| /mounts | Global mounts |
| /mounts/removable | Global removable media |
| /mounts/network | Global network drive mounts |
| /root | Home directory for the root user |
| /users | Home directories for normal (non-system) users |
| /users/<name>/home | The part of the home that's actually manually managed by the user |
| /users/<name>/config | Individual users' program configs |
| /users/<name>/components | Individual users' software components if they want that for any reason |
| /users/<name>/data | Individual users' program data |
| /users/<name>/logs | Individual users' program logs |
| /users/<name>/run | Individual users' runtime state and sockets and such |
| /users/<name>/registries/executables | Individual users' executables symlink farm |
| /users/<name>/registries/libraries | Individual users' libraries symlink farm |
| /users/<name>/registries/sources | Individual users' sources (mostly for libraries) symlink farm |
| /users/<name>/registries/resources | Individual users' shared resources symlink farm |
| /users/<name>/registries/ipc | Individual users' IPC interface sockets symlink farm |
| /users/<name>/registries/services | Individual users' service interface definitions |
| /registries/executables | Whole-machine executables symlink farm |
| /registries/libraries | Whole-machine libraries symlink farm |
| /registries/sources | Whole-machine sources (mostly for libraries) symlink farm |
| /registries/ipc | Whole-machine IPC interface sockets symlink farm |
| /registries/resources | Whole-machine shared resources symlink farm |
| /registries/services | Whole-machine service interface definitions |
… gui_file_selector in a non-user context. So, for example, a user registry might have an executables/gui_web_browser which is actually a symlink to /vendor/components/firefox/executables/firefox_wrapper_for_this_operating_system_interface. Programs that need to call a web browser call the gui_web_browser with a pre-defined calling convention specific to gui_web_browsers, as defined by its schema. Similar goes for IPC: there are well-defined things that provide notifications_daemon, and how it is invoked. There may be helper utilities that manage registries. Maybe the package manager could tell the user "the package you just installed provides a notification daemon; you may use registries_config or just plain symlinks to point to it if you would like to use it." Users/admins are expected to use their service manager to start the correct services listening at the pointed-to sockets with the right protocol.

- … config/data directories.
- /devices will be structured like /devices/block/nvme/nvme0n1p2.
- /local means not package-managed at all. Admins get to do whatever they want. Kinda like /usr/local on most Linux systems.
- … /system and /vendor.
- … name@major@fullversion. Then each name symlinks to a name@major, and each name@major symlinks to a name@major@fullversion? So things that expect a particular (set of?) major versions could use those major versions while humans could use the unversioned ones. It's rare for things to want to use the completely-versioned one unless they hook into each other's private APIs. However, applications might want a name@major but also want a minor version of something or above... tbf this should probably be resolved through the package manager. The name@major should typically reference the latest real component of that major version; since semantic versioning is required here, things won't break, in theory.

Boot into a BASIC interpreter?
Or lisp interpreter!!
I wrote my own freestanding lisp, one of my long term goals is to boot Linux directly into it and bring up the system from inside the REPL.
I will ask for a standard common base of GUI development. Not a unified desktop, just a common API to do most stuff like Win32 or BeOS/Haiku API. It wouldn't need to be perfect, but at least a minimum common denominator for everything. Commercial UNIXes had it with Motif, but Motif was not open source. X11 or Wayland are too low level to target directly.
Abandon all old terminal emulator protocols and go only for a newly designed protocol aiming to be a better terminal (TUI, not GUI) than VT100, better than xterm-color and better than what was available in DOS. DOS terminals are old, but they never had the issue of text becoming "skewed", and seldom had the issue of terminal codes becoming a mess and needing a reset. The Kitty terminal emulator is on the right track, but I believe there could be room for a new and even better "ultimate TUI protocol".
I'm not going to come at this from some super level of expertise... but more like things that annoy me...
There are more, but now I'm sounding like a whiny user... so I think I'll stop. ;-)
Design goals/principles for a concept I call WikiOS (feel free to steal it):
Sorry I don't have the technical chops to specify what would have to be implemented at lower levels to enable these design goals. Maybe you could help? ;-)
What does "link" mean to you in the context of software?
Yeah, sorry, I saw that elsewhere in the thread people are talking about static vs. dynamic linking of binaries/libraries, but it was too late to edit my comment.
I'm talking about the user interface, linking in the sense of URLs on web sites and wikis, or perhaps even Project Xanadu. What I imagine is I am writing a document discussing a specific program on my computer, and I can link to it so that anyone reading the document can simply click on the link to launch it. Maybe it has a little preview card so they know what they're launching, and they have some indication that I'm not simply rickrolling them. Or perhaps I have a photograph, and I want documentation of its history connected to it, so in its annotations/metadata I link to an ebook explaining its origins.
One important goal is to enable users to document what everything is and how everything works, in their own words / images. My hope is for them to truly understand and be able to edit everything on their computer.
We can sort of do some of this linking online, but if you want a local-first design, it's pretty grim.
I would hope that for batch processing tools, it is a reference to some kind of a prompt where a part of the request is prefilled and you are prompted to fill the rest (so, like a link to a grep command line with the pattern already there and filenames left to specify); and for GUI predesigned-window-structured things, a reference to a state with specific loadable data loaded (if applicable) and a specific window open with a chosen set of controls having the emphasis.
I'm not sure but I wouldn't break userspace
I can't count how many times something broke on an upgrade, sometimes while using an out-of-the-box config.
For the shell, I'd support proper data structures for nested hashes and arrays so that any JSON / XML / YAML data can be represented, manipulated and searched, with the same globbing being able to apply to either the filesystem or data using the same syntax, which would be more like XPath or jq and not quite as terse as you get with zsh globs.

For pipelines, I wouldn't do the object API of PowerShell but use something like JSON, so whatever replaced sed, awk, cut etc. would be dealing with the data structures more directly while retaining flexibility and still allowing piping of unstructured text. Otherwise, shell globs would not match directories by default, would not apply in command position, and fewer characters would be special.

For the tty interface, I'd have a more flexible, separate, out-of-band channel so that you can query the terminal for characteristics without interfering with the keyboard input buffer (this currently exists in a very limited and inflexible form). But I would resist anything that could lead to the terminal getting anything like HTML with malware, adverts and crap filling it.
I would change text files so that the newline character is used to start a line rather than end one. This simplifies a lot of code when appending - similar to when generating comma-separated lists, you always need logic to detect the first/last item in the list. For Unicode text encodings, I'd put combining characters before the base character so you don't need to read ahead.
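A toy before/after for the newline idea: with the convention flipped, the append path never needs the "does the file already end in a newline?" check (a purely hypothetical file format, of course).

```c
/* Hypothetical "newline starts a record" convention: '\n' opens each record
 * instead of closing it, so appending never has to inspect what the previous
 * writer left behind. */
#include <stdio.h>

static void append_record(FILE *f, const char *record) {
    fputc('\n', f);        /* the newline introduces the new record */
    fputs(record, f);
}

int main(void) {
    FILE *f = fopen("demo.log", "a");
    if (!f) { perror("fopen"); return 1; }
    append_record(f, "first");
    append_record(f, "second");
    fclose(f);
    return 0;
}
```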
I would assign a benevolent designer responsible for simplifying the ecosystem and making it consistent. The GNU/Linux userland is a hodgepodge of accidents, tragically mashed together. Once you understand what a cohesive system "feels" like, you realize how much you're missing out on.
If you've only ever used Linux, consider giving SerenityOS a shot - spend a day or two fiddling with it - it's a 90s-feeling operating system, designed without constraint - everything feels fresh: the terminal, applications, and overall workflow.
Or, if you're a terminal lover, I suggest trying OpenBSD for a bit - learn about pf, the various ctl commands, unbound, the gui+ssh integration, and the extensive and extremely useful documentation.
Maybe this is something that can already be done but I would want to put all user data on a USB stick or network. Currently it seems to almost work but it runs into two problems:
Having recently watched the USENIX ATC '21 / OSDI '21 joint keynote address, "It's Time for Operating Systems to Rediscover Hardware" by Timothy Roscoe (as pointed out in another Lobste.rs thread), it would appear that the kernel is the problem, as it does not represent the computers we actually have and use...
Sandboxing for everything, but heavily using layers. If I log in to my computer, it should create a new namespace/container with …

- /sys (read-only)
- /ctl
- /apps
- /

When I want to launch an app, I'd use a command (maybe launch) to create a sub-namespace/-container with …

- /sys (still read-only)
- /ctl files/dirs, or even just files/dirs mocking them. This includes a control socket to request access to more resources. Such a request could trigger permission dialogs or file pickers.
- … /apps to /

(A rough sketch of this layering with existing kernel primitives follows at the end of this comment.)
Using this layering, I can use apps provided by my system but I can also override an app with another version or an alternative and install my own apps by modifying my view of /apps. I can also override parts of an app so this app cannot change a certain config by placing data inside of the overrides for that app.
Of course, there can be more layers. My user data can also be on a remote fs.
My desktop environment would also be an app but I'd mount some control sockets and maybe everything into it so it can launch other apps, do the permission dialogs and mount some data directories into apps.
Naturally, each namespace/container has its own init system. My user might use a more complex init system than a calculator app, which would probably only launch its main binary. A browser or desktop might have more complex needs again.
I might also launch new sub-namespaces/-containers for projects that need different dependencies. This is similar to dev containers but because I layer these dependencies on top of what I already have, I can still use my favourite tools.
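Here is the promised sketch of the layering idea using primitives Linux already has: a private user+mount namespace per login/app, with bind mounts building that process's view of the world. The /apps path and the override source are this comment's hypothetical layout, not anything standard.

```c
/* Minimal sketch: unprivileged user+mount namespace plus a bind-mounted
 * override layer. Build and run as a normal user; the bind mount is
 * expected to fail unless both paths actually exist. */
#define _GNU_SOURCE
#include <sched.h>      /* unshare, CLONE_* */
#include <stdio.h>
#include <stdlib.h>     /* system */
#include <sys/mount.h>

int main(void) {
    /* New user namespace so an unprivileged user may create the mount ns;
     * a real launcher would also write /proc/self/uid_map and setgroups. */
    if (unshare(CLONE_NEWUSER | CLONE_NEWNS) != 0) {
        perror("unshare");
        return 1;
    }
    /* Keep our mount changes from propagating back to the parent. */
    if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) != 0) {
        perror("mount MS_PRIVATE");
        return 1;
    }
    /* Example layer: bind a per-user override on top of the shared /apps. */
    if (mount("/tmp/my-apps-override", "/apps", NULL, MS_BIND, NULL) != 0)
        perror("bind mount /apps");

    system("tail -n 3 /proc/self/mounts");   /* namespace-local mount table */
    return 0;
}
```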
Probably errno for starters would be up there. At least on Windows, there are a much larger and more defined set of error numbers. POSIX reuses error numbers for far too many disparate things, and extending it is hard because of direct comparison of existing errno values.
libc kernel interface instead of syscalls.
Stable and language-agnostic system calls are one of the things that make Linux unique. Why would you sacrifice that?
People already treat libc as the de facto kernel interface.
Is "get rid of fork()" on the table?
https://www.microsoft.com/en-us/research/wp-content/uploads/2019/04/fork-hotos19.pdf
That's a kernel issue, not in the scope for this question. I have less I'd want to change for the userland, a bunch of changes for the kernel.
Yeah, same. A lot of what's under discussion here wouldn't be Unix-compatible, and without being a Unix, would GNU/Linux have gone anywhere? I kinda doubt it; the history of computers is written by systems that are compatible.
I believe that Plan 9-like process namespaces and fs binds would make both Docker and NixOS obsolete, or at least much simpler, so I'd port them to Linux. Also, get rid of SUID.
But Linux already has them, and namespaces are exactly how both NixOS build sandboxing and Docker work.
So basically just QoL and I'll leave the hard problems to the other people hee ;)
no more locales
How would you solve localisation then? I'm curious, I can't tell if you mean that the POSIX locale system is bad, or that localisation is somehow unneeded?
Here's a hot take: All software should be exclusively in English, with the only exceptions being user-generated content (for obvious reasons) and cases where exact phrasing matters, such as business, law, government-related use cases in a non-English speaking country.
Yes, and let's just go further and socially engineer people to remove all other languages altogether! Just so we could remove locales.