Dependency cooldowns turn you into a free-rider
18 points by carlana
I wrote the original (?) cooldown post that’s linked in this response.
I think this post is directionally accurate (cooldowns are a form of free-riding, which is the goal for mostly unpaid open source maintainers), but misses a key part of the original argument: you’re not free-riding on other maintainers, but instead on a number of “supply chain security” companies that are financially incentivized to find malware as quickly as possible.
The recent wave of open source malware demonstrates this (as originally speculated in my post): Trivy, LiteLLM, etc. were detected by scanning parties, not by users being victimized; victimization also happened, but wasn’t actually necessary for timely discovery at all. That’s the core premise behind cooldowns.
I agree with the points about configurability and variance, however. It’s not clear to me that different tools within an ecosystem (much less different ecosystems) will ever align on cooldowns beyond the high-level idea. I’m also not sure it’s a good use of anyone’s time to fight that battle; I’ve mostly thought of cooldowns as a layer atop lockfiles, so the “goal” is to lock the cooled-down dependencies once and practice discipline when updating them. Easier said than done!
Edit: I should also say, I essentially agree with the idea that an upload queue is more correct than a client side cooldown. But it’s also a harder political problem: it requires indices to become more active participants in overriding the queue for vulnerability releases, for example. This is, in my experience, a hard sell for the maintainers of these indices (insofar as it’s more work).
I guess centralising the decision to circumvent the delay for a security update is the main benefit over everyone tracking the security news and trying to work around their own cooldowns (and then falling for fake vulnerability news and installing the attacker's release anyway).
Variance itself might not even be that bad. For an attacker, it is certainly more convenient to know in advance whether your backdoor will go unnoticed and get widely deployed or be caught, than to have to investigate each target's cooldown policy.
Also, I wonder if some package manager decides that grabbing packages from the upload queue via for-review access is a feature, and then the variability is back…
Yep. I think there’s a strong argument that package age consistency via the index is best. The problem is that getting there requires solving a significantly harder coordination problem than I think the author realizes, which is why cooldowns have largely prioritized the client side. That makes them not the best technical solution, but a known workable one.
(Specifically, the coordination problem is that people will need to override/peek the queue for various legitimate reasons, and that process will require code, labor, and standardization work that ecosystems like Python aren’t really organizationally equipped to provide on current attacker timescales.)
Peeking itself is not that bad: if the queue is visible to anyone claiming to intend to do security research on it, people will peek via that channel. And if there is vetting, this will sooner or later have both leaking scandals and oligopoly scandals, and possibly revert to the previous option…
Forward overrides are horrible, of course: if the repository owns even just the default cooldowns, it already owns the decision on emergency security fixes for all packages, and indeed it seems doubtful that anything here will scale well.
I think I failed to explain that the most important (and undesirable) free riding, in my view, is e.g. commercial users of, say, the litellm package waiting two weeks to adopt a release when personal users do not. In that case, as you say, the victimisation is happening and serves pretty much no purpose. We just need to wait longer for scanners to run pre-distribution. Hence queues instead.
One of the examples I give in the article is Debian, which is effectively an upload queue for broken and buggy FOSS projects. Debian is old and runs on a much smaller budget than I bet NPM ever has.
I concede there would be a cost in switching to the Debian kind of orientation for them, but I think most of the work is in the switchover, not in running it once you've switched. Package indexes are necessarily headed for a future of social co-ordination ("yanking releases, maintaining embargoes, dealing with typosquatting and coordinating 0days"). I think managing a slight delay to package distribution is a small but very worthwhile addition to those responsibilities.
It's kind of unfortunate that all the language package managers have been "instant distribution" for so long as that is a mechanism which is so specifically vulnerable to supply chain attacks. It's worth at least some pause to think about why that didn't happen with linux distributions, but did with PyPI.
Debian is a good example, I think, of how much benefit you can get with a very different publisher/index relationship than PyPI, RubyGems, etc., have -- I agree with you that "instant distribution" is the underlying problem here, but it's also not something that's easy to take away now that it's been established as the norm for publishing language specific packages.
It's worth at least some pause to think about why that didn't happen with linux distributions, but did with PyPI.
I actually think there's a more quotidian reason for this: PyPI (and npm, etc.) has an "anybody can publish" model, which means that there are way more credentials (and vulnerable CI/CD workflows) floating around the internet than Debian or another distro has. Instant distribution/consumption made it worse, but the thing that makes PyPI a valuable target to begin with is the fact that anyone can upload to it, so securing uploads to PyPI is more of a traditional "protect users against themselves" problem than Debian's trusted-set topology.
Of course I'm a free rider, I can't conceivably be engaged with every package I use. Such is life. What I don't see is how waiting and seeing before using releases makes me any more of a free rider.
Package cooldowns approximate a gradual rollout, which is used by many apps and web services to limit the blast radius when releases have problems. I don't see it as a bug that users get to choose which group they're in, it's a nice feature that you can set your own risk profile. And the best part is that no coordination is needed at all, it requires no cooperation from package registries or other users. Now that the idea is out there I doubt that this genie can be put back into the bottle. No matter how many epithets for it you can come up with.
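Mechanically, the "no coordination needed" property holds because a client-side cooldown is just a filter over upload timestamps the registry already publishes. A minimal sketch, with an illustrative function name and made-up versions (not any real tool's API):

```python
from datetime import datetime, timedelta, timezone

def eligible_versions(releases, cooldown_days, now=None):
    """Client-side cooldown: keep only versions whose upload time is at
    least `cooldown_days` in the past. `releases` maps version strings
    to UTC upload datetimes. Illustrative only, not a real tool's API."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=cooldown_days)
    return sorted(v for v, uploaded in releases.items() if uploaded <= cutoff)

now = datetime(2025, 1, 20, tzinfo=timezone.utc)
releases = {
    "1.0.0": datetime(2024, 12, 1, tzinfo=timezone.utc),
    "1.1.0": datetime(2025, 1, 18, tzinfo=timezone.utc),  # 2 days old: too fresh
}
print(eligible_versions(releases, 14, now=now))  # ['1.0.0']
```

Each consumer picks their own `cooldown_days`, which is exactly the "choose your own risk profile" behaviour described above.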
I don't see it as a bug that users get to choose which group they're in
People who were unlucky enough to run pip install litellm at an unfortunate moment were not consciously selecting a higher risk profile for themselves. They were just naive and unlucky. I think adopting a security posture for the whole ecosystem that relies on such a person biting into the cherry before you is anti-social in the extreme.
Wouldn't wide adoption of cooldowns lead to pip install also eventually defaulting to a non-zero cooldown? Different tools will end up with different defaults, yes; but clock-manipulating honeypots from security-compliance companies trying to boost portfolios will run on cooldown 0 while pretending to be at a large cooldown.
It's not like anyone can rely on timeliness of the reports from naive users for anything anyway.
What I don't understand is how delaying dependency updates by default is a good policy. What happens if you have a dependency that has a security vulnerability or a bug affecting reliability? Would you delay installing the fix two weeks because of the cooldown?
Managing dependencies is hard and the industry has been ignoring the problem for a long time. The "cooldown" sounds a bit like one of those "life hacks" more than an actual strategy.
EDIT: let me clarify that my comment was not about high visibility issues with a known CVE, but to the issues that if you are lucky your tests will pick up before hitting production.
You can override the cooldown for a specific package if you need.
I work in academia. In my lab, everybody does conda install xxx or pip install xxx, where xxx is an obscure package with 18,149 transitive dependencies, every day. It is hard to quantify, but I am pretty sure a one-week minimum package age policy would accomplish a lot already. Definitely not a silver bullet. Definitely not addressing the OP's concerns here. But it would have prevented the recent llm-something package takeover.
But that means you are managing your dependencies, if you override your settings for a specific package because there's an issue. It means you know about the issue and you are paying attention.
Obviously you will know about serious and "famous" CVEs, but depending on your attack surface, I'm not sure people go beyond that.
It's not a problem. You can just update it explicitly.
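The "just update it explicitly" pattern amounts to a per-package override on top of the default cooldown. A minimal sketch, assuming an illustrative config shape (the package names and constant names are made up):

```python
from datetime import datetime, timedelta, timezone

DEFAULT_COOLDOWN = timedelta(days=14)

# Explicit per-package overrides, e.g. to pull in a known security fix
# immediately. Names here are purely illustrative.
OVERRIDES = {"openssl": timedelta(days=0)}

def passes_cooldown(package, uploaded, now):
    """True if `uploaded` is old enough under the package's cooldown."""
    cooldown = OVERRIDES.get(package, DEFAULT_COOLDOWN)
    return now - uploaded >= cooldown

now = datetime(2025, 1, 20, tzinfo=timezone.utc)
fresh = now - timedelta(days=1)
print(passes_cooldown("leftpadlike", fresh, now))  # False: default 14-day hold
print(passes_cooldown("openssl", fresh, now))      # True: explicitly overridden
```

As the comment above notes, maintaining that override list is itself dependency management: you only add an entry when you already know about the issue.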
In every thread about vulnerabilities in language package repositories, there is always someone claiming that we should go back to getting all the packages from distros. There is truth to the claim that it is more secure, but the reason is not the vetting. Distro maintainers can't vet packages more than cursorily (if you don't believe me, ask a maintainer), so there is no added security there. But they usually have an extreme cooldown period (and even rolling-release distros delay packages for a little while), and that helps them avoid a lot of issues with freshly baked software.
Do they backport fixes? Doesn't need to be a rolling release. RH support model means frozen ABI and API.
I wonder if getting packages from distros is that terrible. As things stand, each dependency has its own release process (some have a stable branch, some rolling release, some don't care). If you are using a dependency in production, updating too often means introducing new features; which means bugs and regressions. Yet the industry is happy about that.
Well, first of all there are many deep-deps that are nontrivial to exploit from network (through all the layers that might fail and cancel the operation on funny input…) but can access network if actively and overtly malicious code is injected into them. You'd probably prefer to keep those somewhat buggy and vulnerable to exotic attacks rather than a bit less buggy most of the time but sometimes fully taken over.
What I don't understand is how delaying dependency updates by default is a good policy. What happens if you have a dependency that has a security vulnerability or a bug affecting reliability? Would you delay installing the fix two weeks because of the cooldown?
It's a good default because allowing a little time prior to adopting a package allows people (and automated scanners, who are improving quite quickly) to notice security issues.
And yes, I agree: there will certainly be overrides that you want for that. Deliberately delaying the adoption, for example, of OpenSSL 1.0.1g (which fixed Heartbleed) would be counter-productive.
Part of the question here, imo, is how and where you achieve those two goals. Dependency cooldowns are an anti-social way to achieve the first goal and are totally counter-productive to the second.
This reminds me of cron jobs that aren't really about time. Unless I have a date based thing, let's say a birthday, then I don't want time in it. I know I'm over-fitting here but I'm working on a tool that replaces these cron fallback situations so it's kind of been a theme (likely not original or discovered) for me.
I mean, 2 days of cooldown is trying to proxy for "it's probably reviewed, probably vetted"? I don't know what the real signal is or could be but I don't think it is the 2 days part. So my hunch, and I understand that this is not easy, is that there's something else hidden in there. When the article talks about a queue and basically promotion, that's closer to what I mean. Then it's not about time or the cooldown, it's about a logical event: "publication -> distribution"
The idea of having an upload queue is an interesting one, though I think applying some kind of jitter to when you take up a dependency is probably still a good thing. Consumers probably don't want to shove their noses in the trough the moment a dependency update becomes available, regardless of the presence of an upload queue. Now, the question is whether you apply that randomly when the update is detected or use some kind of deterministic mechanism based on some project metadata, a secret of some kind, and dependency metadata.
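The deterministic variant mentioned above could be sketched as hashing a per-project secret together with the package identity, so each consumer's cooldown for a given release is stable and reproducible but spread out across the ecosystem. This is a hypothetical mechanism, not a feature of any real package manager, and all names are illustrative:

```python
import hashlib

def jittered_cooldown_days(project_secret, package, version,
                           base_days=7, spread_days=7):
    """Deterministic per-project jitter: derive extra cooldown days from
    a hash of a per-project secret and the package identity, so adoption
    times vary between consumers without any coordination. Illustrative
    sketch only; no real package manager exposes exactly this knob."""
    key = f"{project_secret}:{package}:{version}".encode()
    digest = hashlib.sha256(key).digest()
    # First digest byte modulo the spread gives 0..spread_days-1 extra days.
    return base_days + digest[0] % spread_days

# Same inputs always give the same cooldown; different projects spread out.
d = jittered_cooldown_days("per-project-secret", "somelib", "1.2.3")
print(d)  # some value in [7, 13], stable across runs
```

Keeping the secret private matters here: if the attacker can compute each target's jitter, the spread no longer complicates their timing.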
In terms of «free-riding» or not, if I am installing new releases quickly anyway, I might actually prefer that other people delay: economics of attacking everyone is better for the attackers than getting some installs on VPSes hosting personal blogs, so for the same amount of carelessness I should be attacked less not more. If everyone installs at once, any auto-updating setup is not getting any protection from the others being victimised!
I don't think dependency cooldowns are it. I don't have a great source, but my gut says that compromises take longer to find than most people are willing to cool down for. There are some sites which say that the average supply chain attack takes 267 days to find [1][2], so if you cool down for that long you'd skip on the order of half of the attacks (yeah, yeah, average isn't median), but I don't trust them -- they feel a bit sloppy, and don't cite properly.
What's the 95th percentile? Do you need to stick to dependencies older than a year?
[1] https://deepstrike.io/blog/supply-chain-attack-statistics-2025
[2] https://www.breachsense.com/blog/supply-chain-attack-examples/
Also, anyone actually have numbers I can trust?
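The average-vs-median caveat above is worth making concrete: with a heavy-tailed distribution of detection delays, the mean says little about where to set a cooldown. The numbers below are entirely made up for illustration, not real detection-time data:

```python
import math
import statistics

# Hypothetical detection delays in days (illustrative only, NOT real data):
delays = sorted([1, 2, 2, 3, 5, 8, 14, 30, 90, 400, 900])

mean = statistics.mean(delays)
median = statistics.median(delays)
# Nearest-rank 95th percentile:
p95 = delays[min(len(delays) - 1, math.ceil(0.95 * len(delays)) - 1)]

print(round(mean, 1), median, p95)  # 132.3 8 900
```

With a tail like this, a cooldown tuned to the median (about a week) catches most incidents cheaply, while a cooldown tuned to the mean or the 95th percentile stops being a cooldown and becomes a freeze.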
I think another issue with such reports is that they classify threats from paying customers' point of view, which is always a large enough setup to be worth targeting/exploiting one-by-one. The 4 million dollars median damages in one of the links seems a revealing scale-setting number. Cooldowns are relevant rather to lower-effort, wide-spectrum attacks.