How to deal with Rust dependencies
6 points by emschwartz
Looking at just the number of cargo crates gives an incomplete and misleading picture. The “unit” of dependency in the Rust/Cargo ecosystem is not the same as in many other package managers.
Many projects split themselves into multiple crates, but it’s effectively one and the same project, just delivered in multiple parts. If you buy a sofa from Ikea, and it comes in five boxes, you don’t moan you got five sofas.
Especially since splitting a project into smaller packages means you can sometimes avoid pulling in parts you don’t need, thus eliminating my least favorite type of dependency: transitive dependencies that are only pulled in to support features I don’t use.
That’s a good point, especially given how much splitting a project into smaller crates can help speed up compilation (at the very least while working on that project; I’m not sure if or how it affects dependents).
It will be a comfort for you to know that it’s the same benefit whether it’s a dependency or not!
But what gives a more complete picture then?
Just looked into a very small project of mine. 5 direct dependencies in Cargo.toml and yet that one-liner from the post spits out 73 and by going through the list with some goodwill I could pare it down to 50ish.
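To make the "pare it down" step a bit more concrete, here is a rough sketch of how one might count crates in Cargo.lock-style text and then lump them together by name prefix. The grouping rule (everything before the first '-' or '_' is treated as one project) is purely my assumption for illustration; it is a crude heuristic, not how Cargo or any existing tool defines a project, and the helper names package_names and group_by_prefix are made up.

```rust
use std::collections::BTreeMap;

/// Collect package names from Cargo.lock-style text: each `name = "..."`
/// line that appears inside a `[[package]]` table.
fn package_names(lock: &str) -> Vec<String> {
    let mut names = Vec::new();
    let mut in_package = false;
    for line in lock.lines() {
        let line = line.trim();
        if line == "[[package]]" {
            in_package = true;
        } else if line.starts_with('[') {
            // Some other table (e.g. [metadata]) has started.
            in_package = false;
        } else if in_package {
            if let Some(rest) = line.strip_prefix("name = \"") {
                if let Some(name) = rest.strip_suffix('"') {
                    names.push(name.to_string());
                }
            }
        }
    }
    names
}

/// Hypothetical heuristic: treat crates that share the text before the
/// first '-' or '_' as parts of one project (tokio, tokio-util, ...).
fn group_by_prefix(names: &[String]) -> BTreeMap<String, usize> {
    let mut groups = BTreeMap::new();
    for name in names {
        let prefix = name
            .split(|c: char| c == '-' || c == '_')
            .next()
            .unwrap_or(name.as_str());
        *groups.entry(prefix.to_string()).or_insert(0) += 1;
    }
    groups
}

fn main() {
    // Made-up sample input, not a real lockfile.
    let lock = r#"
[[package]]
name = "tokio"
version = "1.0.0"

[[package]]
name = "tokio-macros"
version = "2.0.0"

[[package]]
name = "tokio-util"
version = "0.7.0"

[[package]]
name = "serde"
version = "1.0.0"

[[package]]
name = "serde_derive"
version = "1.0.0"
"#;
    let names = package_names(lock);
    let groups = group_by_prefix(&names);
    println!("{} crates in {} groups", names.len(), groups.len());
    for (prefix, count) in &groups {
        println!("{prefix}: {count}");
    }
}
```

On the made-up sample, five crates collapse into two groups. A prefix match will both over-merge unrelated crates and miss related ones with different names, so treat the result as a rough lower bound on "projects", not a real audit.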
The units of the numbers are incomparable, so how do you compare them? Well, it’s complicated…
If a C program has a direct dependency on libcurl, how many dependencies is that?
curl has way more code than reqwest. Waay more. It has everything that reqwest has, all of the functionality of all of reqwest’s dependencies, and then, like ten times more. curl supports three different email protocols.
But in the C world, curl counts as 1 dependency, and in the Rust world reqwest will shock you with 250 crates.
From the perspective of the amount of code that is there, the 250-dependency reqwest is smaller and leaner! reqwest has crates for HTTP/1, H/2, and gzip, brotli, and URL parsing, punycode, base64, multipart, MIME, and interfaces to TLS libraries, CA cert stores, and hashmaps, loggers, buffering, DNS, socket handling… there’s a lot, and everything is delivered as a unit of a “crate”, but every HTTP library has all of that, whether it’s split into 250 crates, or 250 .c files delivered in a handful of .so blobs.
In a typical Linux distro, the build-time deps are not presented to you (Cargo shows them). You won’t see deps for other operating systems either, obviously (but Cargo.lock has them). Many transitive/shared packages are already installed with the OS, or were pulled in by another package, so you only see a few brand new ones installing (Cargo shows them all, separately for each project).
https://wiki.alopex.li/LetsBeRealAboutDependencies
Hypothetically, if Rust was shipped with an OS, and you’d have /usr/cargo like you have /usr/include, then probably most of the common crates would have been in the base OS, and your 50-crate project would look like a 0-crate project.
I asked a similar question in the past. I can’t find it, but I believe the answer was cargo-vet, cargo-crev, and possibly even some built-in cargo command to list contributors of all transitive dependencies. Not a comprehensive answer, but I hope that points you in the right direction.
While in theory I understand the split points of some of these libs, I do wonder what the dependency situation would look like if people were to just say “you want to use this neat regex-y data structure ripgrep created for regexes? You want to use ripgrep’s fast regex engine? You include ripgrep as a dependency”.
I also am once again grateful for the Python standard library offering me stuff like basic random number generators. The rand crate existing makes sense to me in principle but I would love to have a bit more of a usability bar with “just” the standard library. Unfortunately I think we’re going the other direction on this front.
Project/user value mismatch is tricky I guess!
A “standard library” as a concept bundles together multiple things:
The first two are great. The last two have consequences that are a major pain.
Since there’s one version of std for everyone, it can’t evolve its API. It’s hard to remove even mistakes and obsolete functionality, e.g. PEP 594. OTOH rand can improve and release new major versions, while your project can continue to use old versions of rand and doesn’t get broken when you update the compiler.
Since std is one unit, it has to work for everyone on every platform. But this one-size-fits-all is already failing in Rust: std is too big for embedded targets, so there’s a whole no-std flavor of the ecosystem. The whole std::fs is broken and useless on WASM targets. It doesn’t work even in browsers that have a filesystem API, because the web Filesystem API works differently than what the std::fs API expects. Since std is unversioned, it can’t update its std::fs to support browser WASM, nor capability-based OSes. At best it can add std::fs2. With rand, we’d have std::rand1, std::rand2, std::rand3, std::rand4, std::rand5, std::rand6, std::rand7, std::rand8, and std::rand9 already.
The first two benefits can be achieved without literally putting stuff in the standard library. Rust/Cargo can do more to designate official crates, and make them easier to discover and enable.
Since std is one unit, it has to work for everyone on every platform.
A choice
Since there’s one version of std for everyone, it can’t evolve its API.
A choice
With rand, we’d have std::rand1, std::rand2, std::rand3, std::rand4, std::rand5, std::rand6
A choice/downstream of Rust
Since std is unversioned
A choice!
I think that when you prioritize very high levels of stability in functionality you make these choices. I do not like it when people make it sound inevitable, as if there is no other choice.
Node.js has a standard library with API stability levels. When it’s in experimental all bets are off. At one point people figure out “ok this is good enough” and go with that, and lock things in.
Now node’s standard library isn’t exactly vast, but Python’s is fairly vast. I am happy with rand in Python. Python has a similar sort of “standard library woes” as Rust, though granted, Python is more comfortable with things like algorithm changes between Python versions.
I would be very happy with leaning way more into APIs that could be experimental, that could have less rigid stability constraints in the standard library. But having said that it’s definitely easier if there’s consistency across the board.