Ending 15 years of subprocess polling
24 points by sanxiyn
I wonder why they went for polling here? From what I know, the self-pipe trick would have worked (i.e., register a pipe endpoint with your event loop, and register a SIGCHLD signal handler that writes to the other end of the pipe). But presumably there's a good reason that nobody did that, so I'd love to hear what that is.
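For reference, here is a minimal sketch of the self-pipe trick in Python, assuming a Unix system and that the call happens on the main thread. The function name `wait_with_self_pipe` and the choice to kill the child on timeout are my own assumptions for illustration; this is not how the subprocess module does it.

```python
import os
import select
import signal
import subprocess
import sys
import time


def wait_with_self_pipe(argv, timeout):
    """Run argv; return its exit code, or None if the timeout expires first."""
    r, w = os.pipe()
    os.set_blocking(r, False)
    os.set_blocking(w, False)

    def on_sigchld(signum, frame):
        # Make the read end of the pipe readable so select() wakes up.
        try:
            os.write(w, b"\0")
        except BlockingIOError:
            pass  # pipe is full, so a wakeup is already pending

    # Install the handler before spawning so an early exit is not missed.
    old_handler = signal.signal(signal.SIGCHLD, on_sigchld)
    try:
        proc = subprocess.Popen(argv)
        deadline = time.monotonic() + timeout
        while True:
            code = proc.poll()
            if code is not None:
                return code
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                # Assumption for this sketch: kill and reap on timeout.
                proc.kill()
                proc.wait()
                return None
            # Sleep until the pipe is readable or the deadline passes.
            ready, _, _ = select.select([r], [], [], remaining)
            if ready:
                try:
                    os.read(r, 4096)  # drain the wakeup bytes
                except BlockingIOError:
                    pass
    finally:
        if old_handler is not None:
            signal.signal(signal.SIGCHLD, old_handler)
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print(wait_with_self_pipe([sys.executable, "-c", "raise SystemExit(3)"], 10))
```

Note that it replaces the process-wide SIGCHLD handler for the duration of the call, which is acceptable in a single program but awkward to do from inside a library.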
That would definitely work in a given program. As a library, however, it may want to allow the program to use its own SIGCHLD handler. This is possible, since you can get the prior handler and delegate to it, but this gets hairier if there are multiple threads. Even if you don’t want to allow the program to have its own handler (which you can’t really prevent), if you have multiple threads running processes concurrently you probably can’t use the same pipe for each process, because the wrong thread could wake up. You could spin up a thread to centralize the polling/waiting, but in an otherwise single-threaded process you probably don’t want that.
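A rough sketch of the "get the prior handler and delegate to it" part, in Python and for the single-threaded case only. `install_chained_sigchld` and `on_child` are hypothetical names for illustration, and nothing here addresses the multi-threaded problems described above.

```python
import signal
import subprocess
import sys
import time


def install_chained_sigchld(on_child):
    """Install a SIGCHLD handler that calls on_child, then the prior handler.

    Returns the prior handler so the caller can restore it later.
    """
    previous = signal.getsignal(signal.SIGCHLD)

    def handler(signum, frame):
        on_child()
        # SIG_DFL, SIG_IGN and None are not callable, so they are skipped.
        if callable(previous):
            previous(signum, frame)

    signal.signal(signal.SIGCHLD, handler)
    return previous


if __name__ == "__main__":
    calls = []
    # The "program" installs its own handler first.
    signal.signal(signal.SIGCHLD, lambda s, f: calls.append("program"))
    # The "library" then chains in front of it.
    prev = install_chained_sigchld(lambda: calls.append("library"))
    subprocess.run([sys.executable, "-c", "pass"])
    time.sleep(0.1)  # give the interpreter a chance to run the handlers
    signal.signal(signal.SIGCHLD, prev)
    print(calls)
```

This only works if the program installs its handler before the library does. If the program calls `signal.signal` afterwards, the library's handler is silently replaced, which is the "can't really prevent" issue.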
Also complicating the issue: Python always runs signal handlers in the main thread. So if you're waiting on the main thread (which would be the norm for most scripts), the entire thing has to be in C to bypass the runtime's behaviour, and even then I wouldn't be surprised if there were more footguns.
I think it's because the self-pipe trick needs an event loop
Python 3 asyncio indeed uses the self-pipe trick for async process completion notification
But this appears to be talking about the regular subprocess module, which uses the normal blocking model
And yeah also as mentioned in a sibling comment -- signal handlers are basically process-global variables, so registering SIGCHLD behind the back of every program that needs to run a process with a timeout is problematic
BTW a funny thing I found in https://oils.pub/ is that it doesn't need to register SIGCHLD, while other shells do?
Because Oils is basically a waitpid(-1) loop, which gives you the next process that terminated. It's like an event loop, but only for processes, not for file descriptor I/O.
I kept thinking we would eventually need to register SIGCHLD like some other shells, but we never did!
I sketched a blog post about that a while back, but didn't write about it in detail: https://www.oilshell.org/blog/2022/01/notes-themes.html#a-model-of-the-runtime
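To make the idea concrete, here is a minimal Python sketch of a waitpid(-1) loop, assuming a Unix system. It is not Oils code; `reap_in_completion_order` is a made-up name, and it uses `os.spawnvp` only to keep the example short.

```python
import os
import sys


def reap_in_completion_order(commands):
    """Start every command, then collect (argv, exit_code) as each one exits."""
    running = {}
    for argv in commands:
        # P_NOWAIT returns the child's pid immediately.
        pid = os.spawnvp(os.P_NOWAIT, argv[0], argv)
        running[pid] = argv

    finished = []
    while running:
        # Block until ANY child terminates; no SIGCHLD handler is involved.
        pid, status = os.waitpid(-1, 0)
        argv = running.pop(pid, None)
        if argv is None:
            continue  # a child this function did not start
        if os.WIFEXITED(status):
            code = os.WEXITSTATUS(status)
        else:
            code = -os.WTERMSIG(status)  # killed by a signal
        finished.append((argv, code))
    return finished


if __name__ == "__main__":
    cmds = [
        [sys.executable, "-c", "import time; time.sleep(0.5); raise SystemExit(1)"],
        [sys.executable, "-c", "raise SystemExit(2)"],
    ]
    for argv, code in reap_in_completion_order(cmds):
        print(code, argv[-1])
```

The loop blocks in the kernel until some child exits, so there is no polling and no handler, but it also cannot wait on file descriptors at the same time.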
BTW a funny thing I found in https://oils.pub/ is that it doesn't need to register SIGCHLD, while other shells do?
Hmm, curious. I had a quick look at FreeBSD’s version of ash, and it messes around with SIGCHLD so that its normal wait() job control logic isn’t broken when the user catches SIGCHLD with the trap command.