Binding port 0 to avoid port collisions
17 points by hwayne
I find it sad how little used Unix domain sockets are. Not only are they more efficient than TCP loopback, but you can name them whatever you want! And put permissions on them! It completely avoids the issue of remembering which port is for what, or picking an unused port. I try to use them wherever possible in my homelab, but lots of software doesn't support them or treats them like a weird edge case. I wish I could go back in time and advocate to (1) make them more portable (Windows, I guess), (2) raise the annoying ~100 char path limit, and (3) standardize some way of specifying them to servers like unix:///path/to/socket?mode=660.
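A minimal Python sketch of the idea: serve on a named Unix domain socket with group-only permissions instead of a numbered TCP port. The path here is illustrative (a real deployment would use something like /run/app/app.sock).

```python
import os
import socket
import tempfile

# Illustrative path; a real service would use a well-known location.
path = os.path.join(tempfile.mkdtemp(), "app.sock")

srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
srv.bind(path)          # the socket now exists as a filesystem object
os.chmod(path, 0o660)   # only the owner and group may connect
srv.listen()

# A client connects by name, not by remembering a port number:
cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
cli.connect(path)
```

Because the socket is a filesystem object, ordinary ownership and mode bits control who can connect to it.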
Another fun thing you can do with Unix domain sockets is pass Unix domain sockets over them (actually, any file descriptor)! This is called "ancillary data" in POSIX-speak (SCM_RIGHTS in the socket API) and is one of the more cursed BSD socket APIs, but it's very useful.
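A sketch of SCM_RIGHTS fd passing using Python's stdlib wrappers (socket.send_fds / socket.recv_fds, Python 3.9+, Unix only), which construct the ancillary-data cmsg for you:

```python
import os
import socket

parent, child = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# Parent: open a pipe and ship its read end across the socket.
r, w = os.pipe()
os.write(w, b"hello")
socket.send_fds(parent, [b"x"], [r])  # one byte of normal data + the fd

# Child: receives a *new* fd number referring to the same
# open file description as r.
msg, fds, flags, addr = socket.recv_fds(child, 1, 1)
received = fds[0]
```

The received descriptor behaves exactly like a local dup of the original, so anything with an fd — pipes, sockets, pidfds — can be handed between processes this way.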
Or any other FD – passing PIDFDs over them is a great way to do cross-process, race-free collaborative process management.
And you can have the kernel insert the connecting client's UID as ancillary data, so that you have a built-in way to authenticate clients without any additional credentials scheme beyond normal user accounts on the system.
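SCM_CREDENTIALS is the ancillary-data form of this; the simpler getsockopt form, SO_PEERCRED (Linux-specific), is easy to sketch. The kernel reports the pid/uid/gid of whoever is on the other end, so the server can trust it without any password exchange:

```python
import socket
import struct

a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# struct ucred is three native ints: pid, uid, gid.
creds = b.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                     struct.calcsize("3i"))
pid, uid, gid = struct.unpack("3i", creds)
```

For a socketpair both ends report the creating process; for a listening socket, each accepted connection reports the connecting client's credentials.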
And also make ACLs work on them so that your server process doesn't have to manipulate umask() or fchmod() the socket for it to be accessible to processes running as another user.
IMO, the cleanest way to get around this on a Linux system is systemd socket activation, but not everything supports it, and it's awkward in a test scenario.
Nowadays Windows does support Unix sockets to some degree, but software using that can be even more lacking. For example, I've been waiting for Rust to support this for a while. (There seems to be some progress recently.)
I use caddy as the universal unix socket translator to mux from poor clients like Firefox into dynamic services that listen on unix sockets. Works great!
Is there a similar trick that extends to more than one process/port? Let’s say I want to start three processes that need to talk to each other, and each process accepts ports of the peers on the CLI. Each process can listen on port zero, and learn its own port number which can be passed onto peers, but this is too late to pass it on the CLI argument.
The trick is called Bonjour. It's not simple, and it's not the only solution in that space, but it's a good place to start looking.
Otherwise, the trick is called "hacking something together by writing port numbers to files in well known locations".
In our team, we have a library used in testing for allocating ports ahead of time (including keeping track of reserved ones), and then a test runner sets env vars for system components to know which ports to use for what.
https://github.com/fedimint/fedimint/blob/master/utils/portalloc/src/lib.rs
Modify the programs to accept a listening fd as input; then an orchestrator can create three sockets and pass each program the fd for its own socket plus the ports of the other two.
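A sketch of that orchestrator idea: bind every listener up front on port 0, so all the port numbers are known before any worker starts, then hand each worker its own listening fd plus its peers' ports. Names and flags here are illustrative, not from any particular program.

```python
import socket

def make_listener():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))   # kernel picks a free port
    s.listen()
    s.set_inheritable(True)    # let child processes inherit the fd
    return s

listeners = {name: make_listener() for name in ("a", "b", "c")}
ports = {name: s.getsockname()[1] for name, s in listeners.items()}

# Each worker would then be launched roughly like (hypothetical flags):
#   subprocess.Popen([prog, "--listen-fd", str(listeners[name].fileno()),
#                     "--peer-ports", ",".join(map(str, ports.values()))],
#                    close_fds=False)
```

Because the orchestrator holds all three bound sockets before spawning anything, there is no window in which a port can be stolen, and every CLI argument is known in advance.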
Another alternative is to support Unix domain sockets, since they're named rather than numbered.
If you're already learning things in this area, I highly recommend investigating listenfd and systemfd Rust crates. They are very handy for working on network projects alongside cargo-watch, etc. Also listenfd can be used to implement systemd socket activation handover.
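The handover protocol those crates implement is small enough to sketch: systemd (or systemfd) passes listening sockets starting at fd 3 and sets LISTEN_FDS to how many there are, with LISTEN_PID naming the intended recipient. A minimal Python version of the fd discovery, under those documented conventions:

```python
import os

SD_LISTEN_FDS_START = 3  # first fd passed by the activation protocol

def activated_fds(environ):
    """Return the fds handed over by socket activation, or []."""
    if environ.get("LISTEN_PID") != str(os.getpid()):
        return []  # unset, or the fds were meant for another process
    try:
        n = int(environ.get("LISTEN_FDS", "0"))
    except ValueError:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + n))
```

A server that supports this can be restarted without ever closing its listening socket, since the socket outlives the process that serves it.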
Is this... uncommon knowledge? Random ports are used everywhere for everything. Do people just import get-port from NPM without knowing how it works?
You might want to take a closer look at the example, specifically these two lines:
launch_webserver("localhost", 0, app(&config));
let listener = TcpListener::bind(format!("{host}:{port}")).await?;
It’s indeed true that everyone is using random ports, or picks a port number by opening and then closing a socket. And this is the wrong (inelegant, incorrect in edge cases, promoting flakiness) approach. In particular, npm’s `get-port` is wrong.
What almost everyone should be doing instead is binding port 0 and using the resulting file descriptor. If you close the fd and return the port number, like get-port does, you create a race condition: another process can bind that port before you re-bind it.
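The two approaches side by side, sketched in Python. The safe version keeps the bound socket open and hands *it* to the server; the racy version (what get-port-style helpers do) closes the socket and returns a number that any other process may grab in the meantime:

```python
import socket

def reserve_listener():
    """Safe: bind port 0 and keep the fd; the port stays ours."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))   # kernel assigns a free ephemeral port
    s.listen()
    return s, s.getsockname()[1]

def get_free_port_racy():
    """Racy: the port is only free until someone else binds it
    between our close() and the eventual re-bind."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))
    port = s.getsockname()[1]
    s.close()
    return port

listener, port = reserve_listener()
```

The lines quoted from the example above do exactly this: bind with port 0, then keep using the returned listener rather than its number.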
I definitely knew about this at some point and then forgot about it, as just last week I wrote code that used a random port and hoped that the random number generator was lucky, which I will now update to use this :)
I think it's more that while this is a correct answer to "how do I launch something without a port collision", it's not always a sufficient answer.
Suppose your work involves two web services, which we'll call A and B, and there is at least some level of dependency between them, such that one of them needs to make HTTP requests to the other. It is true that you could run them both locally on ephemeral ports chosen for you by the system, but if, say, A needs to make a request to B, how does it find out which port B is running on?
You could keep going with the ephemeral port solution and invent some sort of signaling that they can use to communicate their ports to each other. But you could also be forgiven for thinking maybe if you have to invent such a protocol this might not be the right approach for the use case, and for turning to something else.
One approach I've used a couple times is to set up a reverse proxy like Traefik with automatic service discovery/registration. The proxy can run on a single standard predictable port, and do path-based routing to whichever services happen to be running. If you do this with, say, autodiscovery based on joining a Docker network, the services don't even have to have ports on the host machine. And since it's likely that deployed environments will route through some sort of centralized gateway/entrypoint, it also helps local dev better resemble deployment.
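A hedged sketch of that setup as a docker-compose fragment (image names and the /a path prefix are illustrative; the Traefik flags and labels follow its Docker provider conventions):

```yaml
services:
  traefik:
    image: traefik:v3.0
    command:
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false
      - --entrypoints.web.address=:80
    ports:
      - "80:80"                # the one predictable port
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro

  service-a:                   # hypothetical service; no host port at all
    image: my-service-a
    labels:
      - traefik.enable=true
      - traefik.http.routers.a.entrypoints=web
      - traefik.http.routers.a.rule=PathPrefix(`/a`)
```

Requests to localhost/a reach service-a over the Docker network, so no service ever competes for a host port.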