How I’d like my init system / service supervisor to be
14 points by runxiyu
I like the outline of design criteria to sketch a better overall system. Really helpful would be an assessment of existing solutions, e.g. runit and, of course, OpenRC and systemd.
Interesting. You don't control services, you control states. You can't turn something on or off, you can only create a state where it is on or off. I really like the intention of this, but it would need the right implementation.
Speaking hypothetically, and assuming my understanding is correct:
You can't stop something like a firewall unless you've created a state where the firewall is off. And if you created a rule that the network should be down if the firewall is off, you have no state where you can turn the firewall off and still have network access, unless you create a state with that dependency tree.
Effectively this system rigidly locks what state the system can be in. If you didn't create the state you need ahead of time, you have to create one, while accounting for the dependencies of all the other states.
This creates a lot of incentive to bypass the supervisor, either by using things like monit, running services without supervision, or creating a state for each individual service without specifying dependencies.
But, you can mitigate this by having an option on a service that creates an associated state without the user having to explicitly make one. (Call it independent control, which explains why other services have it off.) The services that come with the operating system wouldn't use this feature, because you know what states the user is likely to need. These defaults will nudge users in the correct direction.
Users are already going to try to work around the enforced structure of states. Instead of making them take extra steps, make it a feature. This doesn't require any compromises to the design of the system. Turning this option on for the system firewall won't be useful. If the network requires the firewall to be active, a firewall state cannot disable it.
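To illustrate, a purely hypothetical service definition; the syntax and the independent-control knob are invented for this comment, not taken from any real tool:

    # invented syntax, for illustration only
    service sshd {
        independent-control = yes   # supervisor auto-creates "sshd-on"/"sshd-off" states
    }
    service firewall {
        independent-control = no    # only reachable through explicitly defined states
    }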
You can't stop something like a firewall unless you've created a state where the firewall is off. And if you created a rule that the network should be down if the firewall is off, you have no state where you can turn the firewall off and still have network access, unless you create a state with that dependency tree.
I think that's probably a good idea, for the reasons you point out in the next sentence.
Effectively this system rigidly locks what state the system can be in.
Yes. As opposed to running systems where you don't know what state they're in.
"What's it doing? I dunno. Something. But it's in an unknown state, so I can't be sure"
There are issues (as you point out) with users trying to bypass a state driven approach. But IMHO that's a reason to talk about overall pros and cons of each solution. Just focusing on one issue will bias any decision.
Your summary reminded me of this blog post about apk-tools.
So the apk of init systems, perhaps.
I think that building an init system on the concept of dependencies is a mistake.
Services might, of course, require certain other services to function correctly, but a malfunction of one service must not bring the system down.
In other words, the init should not deal with dependencies; it should start all services at the same time, and the services that cannot start at the moment should report that and be restarted after a short delay, until they finally start successfully.
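A minimal sketch of that model as a daemontools/runit-style ./run script (the database host and the daemon path are placeholders): the service fails fast when its dependency is missing, and the supervisor restarts it after a short delay until it succeeds.

    #!/bin/sh
    # Runit-style ./run: if the dependency isn't there, report it and exit;
    # the supervisor restarts us after a short delay until we succeed.
    exec 2>&1                          # send stderr to the same place as stdout
    if ! pg_isready -q -h db.internal; then
        echo "database not ready yet, exiting so the supervisor retries"
        sleep 2
        exit 1
    fi
    exec /usr/bin/mywebserver --foreground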
I've found the "try and try again" method to be very problematic. It's almost always better to be state / event driven, and dependencies are event driven.
e.g. if your web server requires access to the database, and the database is down, it doesn't help to try restarting the web server over and over. Instead, wait until the DB is up, and then start the web server.
Bringing the web server "up" anyway has no benefit, and many downsides. Any external system monitoring the web server may think it's "up" because TCP connections can be made to it. But since the DB is required, the web server is actually "down" and unresponsive.
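As a rough sketch, this is what the dependency-driven version looks like as a systemd unit (service names and paths are placeholders; Requires=/After= is just one way to express it): the web server is only started once the database unit has been started.

    # webserver.service (names and paths are placeholders)
    [Unit]
    Requires=postgresql.service   # pull in the database when this unit starts
    After=postgresql.service      # and order this unit after it

    [Service]
    ExecStart=/usr/bin/mywebserver --foreground
    Restart=on-failure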
Rather than making an argument based on what the solution should look like, perhaps you could explain why the "try and try again" solution is better than a dependency-driven one.
It also doesn't generally hurt to try restarting the web server over and over.
Moreover, the database might be down due to some transient network issue which will resolve, letting it start up again.
edit: With the try-again approach, you can also fail early and retry in case your dependency is down, but fail without re-trying if the dependencies are there but your service itself fails. The possibilities are endless.
There is an argument to be made that logs will be filled up or whatever, which honestly these days isn't really that much of an argument. We have compression, and your logs of your service failing to restart once every N seconds are going to be less noisy than the logs during an actual attack. If your logs get filled up because your service is down for the N hours it takes you to realise, I have bad news for your incident responder.
Any external system monitoring the web server may think it's "up" because TCP connections can be made to it.
Any sane monitoring system should be designed to check if the database is up in some fail-safe way which doesn't rely on another component functioning correctly. Likewise, if your website monitoring thinks the website is healthy when it doesn't actually work, your monitoring is screwed.
Moreover, there is nothing stopping you from adding sv down webserver to your runit finish script, just like in systemd units you would add BindsTo (which is not common). Requires doesn't have the semantics you describe here (in case you didn't know).
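A sketch of that, assuming a database service whose ./run exiting should also take the web server down (service names are placeholders):

    #!/bin/sh
    # ./finish for the database service: runs whenever ./run exits,
    # so a dying database also takes the dependent web server down.
    sv down webserver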
Dependency driven init has the primary goal of speeding up startup by avoiding a slow, fail and try again cycle which may repeat a few times.
In practice, when using runit, there isn't any noticeable slowness for booting. Certainly nothing like the era of sysvinit. In my experience it's faster than most default systemd configurations for systemd distros. I don't believe this is the fault of systemd, I think it's the fault of the configuration, but who knows. Maybe I am giving systemd too much benefit of doubt.
Dependency handling requires a reliable way for a service to notify the supervisor that it is in fact running and healthy. If your service has this feature you can also use it as a weak monitoring signal. If your concern is monitoring, you don't really need dependency driven init, you need a reliable way of checking the health of a service.
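Under systemd that notification is the sd_notify mechanism; here's a rough sketch using the systemd-notify helper from a shell-based service (the setup and daemon commands are placeholders):

    #!/bin/sh
    # Run under Type=notify; NotifyAccess=all may be needed because the
    # readiness message is sent by a child of the main shell process.
    do-slow-setup-work               # placeholder for real initialisation
    systemd-notify --ready           # tell the supervisor we're actually up
    exec run-the-actual-daemon       # placeholder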
This is not to say that there aren't cases where you would want to do something special for the failure of a service, but that's kind of why runit added the finish script.
It also doesn't generally hurt to try restarting the web server over and over.
In my view, it's at minimum unnecessary work. In real life, you wouldn't drive to your friend's house 5x a day to see if he's there. You'd call. Or you'd ask him to call you when he's home. If dependencies are good enough for real life, why not for computers, too?
Except that in real life, the "try and try again" approach has a cost. With computers, you don't see that cost. But it's still there.
On top of that, the "try and try again" approach leads to cascading chains of failure. Multiple systems are up, or down, depending on complex and unpredictable interactions. That's a nightmare to debug.
Any sane monitoring system should be designed to check if the database is up in some fail-safe way which doesn't rely on another component functioning correctly.
That missed my point entirely.
Likewise, if your website monitoring thinks the website is healthy when it doesn't actually work, your monitoring is screwed.
I think you're switching topics here. Checks for the web site being healthy are functional tests. And functional tests go substantially beyond "is it up". So that's not really the same topic.
Functional tests can fail even if both the web site and DB are up, because some other configuration changed. I'm not aware of any init-style system which does functional tests. Running a "finish" script is nice, but it's not a good solution. Any functional tests for live services need to run continuously.
I think the disagreement here is that I see problems in the approach you advocate, while you see problems in the approach I advocate. I believe the problems in my approach are minimal and easily managed. In contrast, I think the problems with the "try and try again" approach have unknown, and cascading, chains of failure.
Except that in real life, the "try and try again" approach has a cost. With computers, you don't see that cost. But it's still there.
The cost can be trivially made microscopic compared to the computational cost of dependency resolution. If cost is all you care about, there's that to consider.
On top of that, the "try and try again" approach leads to cascading chains of failure. Multiple systems are up, or down, depending on complex and unpredictable interactions. That's a nightmare to debug.
You need to provide a concrete example, because there is no failure situation where try-and-try-again doesn't behave identically to dependency chains: on failure of a true dependency, dependants will fail regardless of how you started them.
That missed my point entirely.
Then make it again, using different words.
I think you're switching topics here. Checks for the web site being healthy are functional tests. And functional tests go substantially beyond "is it up". So that's not really the same topic.
"Is it up" isn't a well defined concept which is why this topic is entirely relevant.
If you use a dependency driven init and a service depends on a specific API being up, and you are hosting this API behind nginx using fastcgi or whatever, systemd will be informed by nginx that nginx is up regardless of whether the actual API is working. You could make nginx depend on the fastcgi server, but then you're delaying startup, and even then the fastcgi server being able to accept requests doesn't necessarily mean the API is working.
In real cases the thing you are talking about is a hard problem to solve universally, and systemd doesn't really offer any broad solutions for it. The "try again" approach is strictly more reliable and easier to implement. You can literally have your service's run script make a request to the API's health-check endpoint before it lets your dependant service start if you want.
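For example, a rough sketch of the dependant service's ./run script (the URL, timeout, and daemon path are placeholders):

    #!/bin/sh
    # Don't start until the API actually answers its health check;
    # otherwise exit and let the supervisor retry shortly.
    if ! curl -fsS --max-time 5 http://127.0.0.1/api/health >/dev/null; then
        echo "API health check failed, retrying soon"
        sleep 2
        exit 1
    fi
    exec /usr/bin/dependant-service --foreground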
Without retries all you're going to end up with is transient situations where your dependant service fails to start because it started too soon, crashed, systemd decided it wasn't working, and decided to stop restarting it. This is why systemd still defaults to some retries (although a finite number which I strictly think is a dumb default).
I think the disagreement here is that I see problems in the approach you advocate, while you see problems in the approach I advocate. I believe the problems in my approach are minimal and easily managed. In contrast, I think the problems with the "try and try again" approach have unknown, and cascading, chains of failure.
I recommend you actually read the documentation for how systemd determines "is it up" because it's not some magical process which reads your mind to determine what your idea of "up" is before coming up with a reliable mechanism for checking for that idea.
The chains of failure you talk about will happen in a dependency driven init. Dependency driven init is strictly designed for speeding up startup, and will not help you avoid cascades of failure or whatever without extra work from you. At which point the extra work is easier to do with runit than it is with systemd.
"Is it up" isn't a well defined concept which is why this topic is entirely relevant
I agree. Which is why I distinguished "up" from "functional". The counter-argument seems to confuse the two.
If you see the "functional" status as being the result of a series of complex state changes, then your view of "functional" changes. There's no need to "try and try and then suddenly decide that 1000 tests pass, so it's finally UP".
Instead, there is a sequence of states that the system has to move through before it's fully functional. Each state has pre-requisites, and each state has validation steps to ensure that the system is in that state. The final "fully functional" status is determined when all necessary state transitions have been made, with all validation steps passing.
Now, this isn't a perfect approach, as you point out. Things may change over time, systems may go up/down, etc.
But if you don't have a clear view of "A needs B, and I can check that A is running with test T", then you're left throwing spaghetti at the wall to see if it sticks. That's why a dependency / state-driven approach works for me, and why a "try a bunch of things" approach worries me.
But if you don't have a clear view of "A needs B, and I can check that A is running with test T", then you're left throwing spaghetti at the wall to see if it sticks. That's why a dependency / state-driven approach works for me, and why a "try a bunch of things" approach worries me.
You're describing a polling based, or retry based, approach.
To make this work in a dependency driven init, you need an event based approach where you don't "run a test", you "get told this thing is ready" for some definition of ready. That's a much harder problem to solve, and it's why the retry based approach solves your problem much more simply.
The problems you ascribe to a retry based approach are not problems with the retry based approach, they are problems with the ambiguity of what "up" means. And those ambiguities exist in both startup approaches. The solution just becomes easier and simpler in the "retry" approach compared to the dependency driven approach.
The "retry" approach still supports dependencies, it just doesn't support events like the dependency driven approach that systemd uses.
Some programs start with different features based on start-time feature checks. I believe dbus dependency is a common one, as is avahi/zeroconf.
Though if we wanted to build an init system around dependencies, we should seriously trial make as a service supervisor. Imagine different init scripts, one that's systemd compatible, one that's OpenRC compatible... all based on the file extension and inter-dependent in your /etc/init/Makefile.run
You can handle this explicitly in the run script for a daemontools style service by ... explicitly asking the supervisor if dbus is up and failing early if not.
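A sketch, assuming dbus is supervised alongside the service (names are placeholders):

    #!/bin/sh
    # Fail early if the supervisor says dbus isn't up yet;
    # the supervisor will retry us after its usual delay.
    sv check dbus >/dev/null || { sleep 2; exit 1; }
    exec /usr/bin/the-actual-daemon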
This falls firmly into the category of "know your service".
There's also a TOCTOU here, but guess what, there's no non-TOCTOU way of solving this problem even using systemd.
In this case I would say that if the service has no "--expect-dbus-or-fail" option then it's broken.
The target based approach is a clean conceptual model, but as a system administrator I think it would create a significant amount of operational complexity. At work we temporarily stop and start services on a regular basis; sometimes a service must be down for some minutes while something is updated, sometimes we need something running only briefly, etc. None of this fits in a deterministic target based model, and 'make a new target that depends on your current target' creates a bunch of fragile complexity in operation that I can easily imagine blowing your foot off (among other things, '... that depends on ...' is extremely load bearing; if you accidentally omit it, your system explodes). And even if it works, it gives you targets that have the vibe of 'document draft 2 edit 3 final 7 really'.
Since units may be in failed/refused to start states, the target doesn't really specify the current state, it describes an aspirational state. As with today's systems, the actual state of the system can only be determined by examining the set of currently active/started units. In practice units can fail at arbitrary times, including when reloaded or restarted (another operation that happens regularly, which is different from a reload; a restart is a re-exec, a reload is an internal operation in the service).
Thanks a lot for the feedback here. I'm wondering whether I could solve this by using transient targets. So I provide imperative commands: restart doesn't change whether a service state is enabled<->disabled (although it could turn into enabled but failed). But start/stop do change how the graph looks, so the new target is described in a transient target, and linitctl status (let's assume that status without arguments means checking whole-system status or something) could tell the user that they're in a dirty/transient state and perhaps show the diff from their last-configured fixed state.
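To illustrate, a hypothetical session (all commands and output here are invented; nothing is implemented yet):

    $ linitctl stop nginx
    $ linitctl status
    target: default (transient: diverges from last configured target)
      nginx    stopped    (target wants: running)
    $ linitctl start nginx
    $ linitctl status
    target: default (clean)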
You may be interested to take a look at how Sortix init does it. service foo start and service foo stop modify an in-memory version of a daemon (by default, the "local" virtual daemon) to add/remove dependencies on "foo". Enabling/disabling daemons with service foo enable / service foo disable modify the on-disk /etc/init/local too.
Frankly, I’d prefer an optionally multi-machine init and supervisor, controlled using Thrift or a similar RPC mechanism over a specialised protocol with Noise-style encryption. Something like this is genuinely missing for me right now.
I'm not sure Noise makes sense; mTLS is likely more deployable in most environments where this would be useful.
There are some good ideas in https://aurae.io -- which sadly hasn't seen much development, because of https://lobste.rs/s/8y4xnk/announcement_regarding_kris_nova :(
I made a modernised and stripped-down Noise at work; we're using it for a mobile/desktop app backend. I'm not a fan of TLS because of how bloated and tied to browsers it usually is.
Interesting. If I were to make this (unlikely), I'd probably base it on some variant of SCTP and Noise.