How to deal with Rust dependencies

6 points by emschwartz

kornel

Looking at just the number of cargo crates gives an incomplete and misleading picture. The “unit” of dependency in the Rust/Cargo ecosystem is not the same as in many other package managers.

Many projects split themselves into multiple crates, but it’s effectively one and the same project, just delivered in multiple parts. If you buy a sofa from Ikea, and it comes in five boxes, you don’t moan that you got five sofas.

jaredkrinke

Especially since splitting a project into smaller packages means you can sometimes avoid pulling in parts you don’t need, thus eliminating my least favorite type of dependency: transitive dependencies that are only pulled in to support features I don’t use.
emschwartz

That’s a good point, especially given how much splitting a project into smaller crates can help speed up compilation (at the very least while working on that project; I’m not sure if or how it affects dependents).
- dijit
  
  It will be a comfort for you to know that it’s the same benefit whether it’s a dependency or not!
wink

But what gives a more complete picture then?

Just looked into a very small project of mine. 5 direct dependencies in Cargo.toml and yet that one-liner from the post spits out 73 and by going through the list with some goodwill I could pare it down to 50ish.
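
For anyone who wants a similar count without the post’s one-liner (this is not that command, just a rough sketch), counting the `[[package]]` entries in Cargo.lock gets close. It assumes the lockfile’s usual layout of one `[[package]]` header per resolved package; the function name and the sample lockfile text are made up for illustration.

```rust
/// Counts the `[[package]]` entries in the text of a Cargo.lock file.
/// Note: the total includes the project's own crate(s), not only third-party ones.
fn count_lock_packages(lock_text: &str) -> usize {
    lock_text
        .lines()
        .filter(|line| line.trim() == "[[package]]")
        .count()
}

fn main() {
    // Hypothetical, heavily trimmed lockfile contents, for illustration only.
    let sample = r#"
version = 3

[[package]]
name = "my-app"
version = "0.1.0"
dependencies = ["libc"]

[[package]]
name = "libc"
version = "0.2.0"
"#;
    println!("{} packages in lockfile", count_lock_packages(sample));
}
```

On a real project you would read the file with `std::fs::read_to_string("Cargo.lock")` and pass the contents in. The number still lumps together build-time, platform-specific, and same-project crates, so it says little about how much code is actually involved.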
- kornel
  
  The units of the numbers are incomparable, so how do you compare them? Well, it’s complicated…
  
  If a C program has a direct dependency on libcurl, how many dependencies is that?
  
  curl has way more code than reqwest. Waay more. It has everything that reqwest has, all of the functionality of all of reqwest’s dependencies, and then, like ten times more. curl supports three different email protocols.
  
  But in the C world, curl counts as 1 dependency, and in the Rust world reqwest will shock you with 250 crates.
  
  From the perspective of the amount of code that is there, the 250-dependency reqwest is smaller and leaner! reqwest has crates for HTTP/1, H/2, and gzip, brotli, and URL parsing, punycode, base64, multipart, MIME, and interfaces to TLS libraries, CA cert stores, and hashmaps, loggers, buffering, DNS, socket handling… there’s a lot, and everything is delivered as a unit of a “crate”, but every HTTP library has all of that, whether it’s split into 250 crates, or 250 .c files delivered in a handful of .so blobs.
  
  In a typical Linux distro, the build-time deps are not presented to you (Cargo shows them). You won’t see deps for other operating systems either, obviously (but Cargo.lock has them). Many transitive/shared packages are already installed with the OS, or were pulled in by another package, so you only see a few brand new ones installing (Cargo shows them all, separately for each project).
  
  https://wiki.alopex.li/LetsBeRealAboutDependencies
  
  Hypothetically, if Rust were shipped with an OS, and you had /usr/cargo like you have /usr/include, then most of the common crates would probably be in the base OS, and your 50-crate project would look like a 0-crate project.
- jaredkrinke
  
  I asked a similar question in the past. I can’t find it, but I believe the answer was cargo-vet, cargo-crev, and possibly even some built-in cargo command to list contributors of all transitive dependencies. Not a comprehensive answer, but hope that points you in the right direction.
rtpg

While in theory I understand the split points of some of these libs, I do wonder what the dependency situation would look like if people were to just say “you want to use this neat regex-y data structure ripgrep created for regexes? You want to use ripgrep’s fast regex engine? You include ripgrep as a dependency”.

I also am once again grateful for the Python standard library offering me stuff like basic random number generators. The rand crate existing makes sense to me in principle but I would love to have a bit more of a usability bar with “just” the standard library. Unfortunately I think we’re going the other direction on this front.
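
To make the gap concrete, here is roughly what “just the standard library” looks like for random numbers in Rust today. This is a sketch, not a recommendation: it hand-rolls an xorshift64 generator and borrows a seed from `RandomState`, the randomized hasher state behind `HashMap`. The `XorShift64` type and its method names are made up for this example, and none of it is suitable for anything security-sensitive.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

/// Minimal xorshift64 generator. Fine for jitter or shuffling test data,
/// NOT for cryptography or anything an attacker could care about.
struct XorShift64 {
    state: u64,
}

impl XorShift64 {
    /// Builds a generator from an explicit seed. A zero seed is replaced,
    /// because an all-zero state would make xorshift emit zeros forever.
    fn from_seed(seed: u64) -> Self {
        let state = if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed };
        XorShift64 { state }
    }

    /// Seeds from the randomly keyed hasher that std uses for HashMap.
    fn from_std_entropy() -> Self {
        let seed = RandomState::new().build_hasher().finish();
        Self::from_seed(seed)
    }

    /// Advances the state with the 13/7/17 shift triple and returns it.
    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }
}

fn main() {
    let mut rng = XorShift64::from_std_entropy();
    // Rough dice roll; the modulo introduces a slight bias.
    let roll = rng.next_u64() % 6 + 1;
    println!("rolled {roll}");
}
```

It works, but it is a couple of dozen lines of boilerplate and a seeding trick for something that is a single import in Python.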

Project/user value mismatch is tricky I guess!
- kornel
  A “standard library” as a concept bundles together multiple things:
  
  Convenience of having the functionality always available by default
  
  Trust that it comes from the official source
  
  Not having separate versions of the library
  
  All the functions sharing the same namespace, and being linked/imported together
  
  The first two are great. The last two have consequences that are a major pain.
  
  Since there’s one version of std for everyone, it can’t evolve its API. It’s hard to remove even mistakes and obsolete functionality, e.g. PEP 594. OTOH rand can improve and release new major versions, while your project can continue to use old versions of rand and doesn’t get broken when you update the compiler.
  
  Since std is one unit, it has to work for everyone on every platform. But this one-size-fits-all is already failing in Rust: std is too big for embedded targets, so there’s a whole no-std flavor of the ecosystem. The whole std::fs is broken and useless on WASM targets. It doesn’t work even in browsers that have a filesystem API, because the web Filesystem API works differently from what the std::fs API expects. Since std is unversioned, it can’t update its std::fs to support browser WASM, nor capability-based OSes. At best it can add std::fs2. With rand, we’d have std::rand1, std::rand2, std::rand3, std::rand4, std::rand5, std::rand6, std::rand7, std::rand8, and std::rand9 already.
  
  The first two benefits can be achieved without literally putting stuff in the standard library. Rust/Cargo can do more to designate official crates, and make them easier to discover and enable.
  - rtpg
    
    “Since std is one unit, it has to work for everyone on every platform.”
    
    A choice
    
    “Since there’s one version of std for everyone, it can’t evolve its API.”
    
    A choice
    
    “With rand, we’d have std::rand1, std::rand2, std::rand3, std::rand4, std::rand5, std::rand6”
    
    A choice/downstream of Rust
    
    “Since std is unversioned”
    
    A choice!
    
    I think that when you prioritize very high levels of stability in functionality you make these choices. I do not like it when people make it sound inevitable, as if there is no other choice.
    
    Node.js has a standard library with API stability levels. When an API is experimental, all bets are off. At some point people figure out “ok this is good enough” and go with that, and lock things in.
    
    Now Node’s standard library isn’t exactly vast, but Python’s is fairly vast. I am happy with rand in Python. Python has a similar sort of “standard library woes” as Rust, though granted, Python is more comfortable with things like algorithm changes between Python versions.
    
    I would be very happy with leaning way more into APIs that could be experimental, that could have less rigid stability constraints in the standard library. Having said that, it’s definitely easier if there’s consistency across the board.