Rust Dependencies Scare Me
69 points by vaguelytagged
The solution to having too many dependencies is to get more comfortable writing your own code: https://lucumr.pocoo.org/2025/1/24/build-it-yourself/. You’ve already stumbled upon it with the dotenv crate; you’ll just have to go through the same decision tree before every cargo add in the future.
While I agree that Rust can easily lend itself to large dependency trees, I don’t find these rants about Rust dependencies particularly helpful. The problem of “I can’t possibly review all this third-party code” applies to all languages, more so to ones like Java, where you generally consume compiled artifacts rather than source code, or to Golang, where Google is effectively MITM’ing all dependency requests by default and the source you pull can differ from what you saw in the Git repo. I also rarely see mention of cargo-vet or cargo-crev, which are trying to work on this very difficult problem using distributed trust.
The problem of “I can’t possibly review all this third-party code” applies to all languages
If you are in the C# ecosystem, many libraries are provided by Microsoft. These libraries (hopefully) adhere to a certain standard of quality and security. So no, the problem is not the same in every language.
And in Rust many libraries are provided by the rust-lang organisation, how does that change anything about every other library that doesn’t come from the stewards of the language?
In theory this is true, but in practice I think the .NET framework is much larger than the rust standard lib + everything owned by rust-lang. Just as an example, the .NET framework has an equivalent for: serde_json, regex, syn, base64, chrono, uuid, tokio, reqwest, rayon, and axum.
I’d like to see the Rust equivalent of golang.org/x/
Nice page, but it literally says “unofficial”.
golang.org/x/ is ALSO unofficial – it is not operated or endorsed by the United States government or any other duly elected sovereign government.
I expect you will protest that governments are not the only organizations that can effectively run things, that having a corporation like Google perform the vetting rather than a national government does not mean that the vetting is meaningless. And you would be correct – in exactly the same way that blessed.rs is run by a different organization than the one that produces the Rust compiler but still provides a valuable service.
golang.org/x/ is ALSO unofficial – it is not operated or endorsed by the United States government or any other duly elected sovereign government.
Weird shift in the goalposts there.
The point here is that the vetting is provided by an organisation you’re already trusting. If you’re using Rust, official vetting would be done by the Rust team.
The whole discussion is about limiting the number of entities you have to trust beyond the ones you already have to trust just by using Rust (edit: couldn’t help myself and had to summarise this; must trust just Rust).
So, as the other commenter said, it’s a nice page, since at least it consolidates third party blessings to one organisation, which you can hopefully trust.
But it’s not equivalent to golang.org/x/, since it’s not by the same team that already made the language.
In cool terms, it improves the algorithm for trust of third party packages that are on the list from trusting O(group(n)) entities to O(1), but not zero-cost.
But it’s not equivalent to golang.org/x/, since it’s not by the same team that already made the language.
That statement assumes that “the same team that already made the language” still works on Rust, and that they sit above all other teams in the org chart.
Even Rust itself is split into different teams for the compiler, standard library, etc.
Isn’t there an authority that appoints people to those teams? You’re not individually trusting the compiler team and the stdlib team and the infra team, or however it’s split up, are you? You trust them all implicitly because you trust the umbrella entity.
My point is that your statement assumes that “the team that already made the language” is the “authority that appoints people to those teams”.
Change that to, “the entity that currently works on the language”, then. It doesn’t really change much.
In the same way you trust the compiler team appointed by the leadership through transitivity, you could choose to trust the creating entity to appoint their successors in various roles.
Or you could choose to reevaluate whether you should trust the umbrella entity responsible for Rust every time a complete transfer occurs.
But that’s a separate discussion. If you’ve decided not to trust the entity currently responsible for the language and stdlib and infra and so on, which the creators appointed as successors, then there’s no reason to even get to the question of whether you can trust the important packages of the ecosystem.
However, if you’ve decided you can trust them in all of that, you can probably also trust their blessing of vetted packages (or even extend stronger trust if they actually work on the codebases of those important packages themselves).
The people already on the teams appoint new people to the respective teams.
Yes, I talked about the risk of transitivity of trust in the appointment of new people. I don’t see how your comment changes anything.
to a certain standard of quality and security
Microsoft? The org that leaks their own root keys on the regular?
Interesting. I’m assuming that’s because Microsoft dogfoods a lot of their own stuff. I wonder if Google will do something similar given their recent aggressive adoption.
I guess it is just a little more apparent in Rust than even C/C++, since they leverage more system libraries… For things like Axum/Tokio I don’t see myself ever rewriting that, but then again that’s the point and the tradeoff I make. It just seems like a LOT of lines, but maybe I should investigate other languages and really try to see how many lines it takes there (in the Go std library itself). Just started using cargo vet, but it brought up more unchecked dependencies than I expected, and I’m using pretty popular ones, ones I know for sure Discord, Cloudflare, and even AWS used / are using in production… Maybe I’ll check out crev.
since they leverage more system libraries
I get what you mean. But last I checked, you don’t use a system lib for the missing batteries in C (HashMap & co). Instead you simply copy code - adding a dependency. Same with C++ and libs like re2 and ctre to get fast regex.
Copy-pasting has the issue that you can never get updates or see possible security advisories. I think a fairer future comparison would be to try to build a 1:1 server with best practices in C++ and C and count the lines, functions, crates, packages, headers, etc., even including things like libc.
This doesn’t match my experience.
Admittedly, it depends a LOT on the project and the developer’s opinions on project organization, but most of the C++ and C projects I’ve worked on do use the system libraries. In the open source world it’s the default way of doing it.
The exceptions are libraries we have to modify, and then we make it a submodule, or set up our own apt source and publish our version there for internal use.
But it’s nonsense to fetch and build your own version of every dependency. C++’s package management is primitive, but not that primitive.
Rust doesn’t involve relying on more code-you-didn’t-personally-write than other languages. It just makes the code-you-didn’t-personally-write more visible.
And counting only the lines of code in the Linux kernel isn’t a great comparison, because you rely on more than just the kernel – you rely on an entire running system, you rely on the Rust compiler, you rely on all the runtime and build-time dependencies of the full system and Rust compiler, etc. etc.
If you wrote your project in a language which makes the dependencies more opaque, it probably would not meaningfully change the quantity of code you’re relying on without personally reviewing line-by-line. It would only change whether you know you’re relying on all that code.
Or, more bluntly: no matter what, you are going to rely on at the very least millions of lines of code you have not personally reviewed, and there is no strategy you can adopt which will cause that number to shrink by any significant fraction. All you can do is adopt things which give you better or worse views into what you’re depending on.
This was definitely something I was thinking about after writing. I guess it’s just a little shocking that there’s so much that I rely on for something that I expected to be sorta trivial.
https://wiki.alopex.li/LetsBeRealAboutDependencies goes into more detail on that if you’re interested.
As for being safer, I have five suggestions:
Use https://lib.rs/ instead of crates.io for web-based repo browsing, because it’s focused on making it easier to do first-pass evaluation of potential dependencies. (https://lib.rs/about)
Use cargo-supply-chain to evaluate packages based on how many people you’re trusting rather than how many pieces they cleaved your dependencies into.
Keep an eye on tools like cargo-crev and cargo-vet. (lib.rs incorporates reviews from them via an Audit tab when available.)
Build your code in some kind of sandbox so you’re not the low-hanging fruit for build-time attacks (I really need to get back to working on my Firejail wrapper for that, which I got cold feet on because it would make attacking me to get at others more appealing if I publish it. I haven’t tried or audited any of them, but others have been working on similar concepts like cargo-green.)
Put as much of your code as possible into WebAssembly modules so runtime attacks are constrained by capability-based APIs and you can approach the Bytecode Alliance’s nanoprocess isolation concept.
(I’ve been working https://benw.is/posts/plugins-with-rust-and-wasi into one of my projects for the runtime-installable plugins side of things, but it does also feel good to be able to know that things like svgbob, with not-insignificant trees of transitive dependencies, can only operate on explicitly passed function arguments and act by returning data which will then be fed through Ammonia… things which will require Emscripten at best, like Graphviz or Mscgen, are much further down the TODO list… though I just discovered layout-rs, so dot diagrams may be joining svgbob soon.)
(It’s called render-wishlist… originally because it rendered Markdown into the Christmas wishlist my brother asked us all to make but, now, because it’s moving toward becoming an implementation of everything on my wishlist for a Markdown renderer and static site generator… including a rich grammar of shortcodes for depictions of game controller buttons and keyboard keys with symbols like ⌘, ⌥, ⎋, and ↹.)
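To make the WebAssembly-sandboxing suggestion above a bit more concrete, here is a minimal host-side sketch. It assumes the wasmtime and anyhow crates, and the plugin file name and exported add function are made up; the point is that a module instantiated with no imports gets no filesystem, network, or environment access, only the arguments the host passes in.

```rust
use wasmtime::{Engine, Instance, Module, Store};

fn main() -> anyhow::Result<()> {
    let engine = Engine::default();
    // "plugin.wasm" is a placeholder for whatever guest module you build.
    let module = Module::from_file(&engine, "plugin.wasm")?;
    let mut store = Store::new(&engine, ());

    // No imports are supplied, so the guest (and all of its transitive
    // dependencies) cannot reach the filesystem, network, clock, or
    // environment; it can only transform what the host hands it.
    let instance = Instance::new(&mut store, &module, &[])?;

    let add = instance.get_typed_func::<(i32, i32), i32>(&mut store, "add")?;
    let sum = add.call(&mut store, (2, 3))?;
    println!("plugin returned {sum}");
    Ok(())
}
```

Real plugin interfaces end up passing memory and richer types (WASI, the component model), but the capability story stays the same: the host decides exactly what the guest can see.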
Something to keep in mind. In C you depend on your libc (glibc is definitely large) but also the compiler, make, etc.
On top of it, you probably end up needing a lot of dependencies or end up doing a lot of this stuff yourself, but badly.
See dependencies as a way to get experts to do part of your code. For free. And to get access to the hive mind of contributors.
Seen like that, these dependency trees look liberating. I do not have to be an expert in every domain to get safe, efficient, and fast stuff. I share bug fixing with everyone. If anything, I would trust it more than what I produce.
I maintain a HyperLogLog reference implementation for Erlang and the float-to-string implementation. It taught me a lot about the amount of work and expertise needed to implement this stuff at a good-enough level. Revel in it.
See dependencies as a way to get experts to do part of your code
This is a nice theory. In practice, dependencies are a way to get other people to do part of your code. Sometimes they’re experts, sometimes they’re muppets. It’s often hard to tell which they are in advance. I’ve used libraries that seem to have three users that are clearly written by incredible programmers and others that are the most popular in their domain and are full of the kinds of mistakes I’d fail a first-year undergrad for making in coursework.
I think people also tend to underestimate how much of the complexity of making a solution comes from making it generalizable. Often you can make a specialized version of exactly what you need instead of pulling in a general-purpose library and have not only less code, but faster and simpler code.
In my experience, that varies a lot across languages. If you have help from the type system, either with generics or dynamic dispatch, the specialised and generic versions end up being almost the same. In a language like C, the generic version is much harder to write.
Oh, that’s true, but I was thinking not necessarily “generic” as in parametric polymorphism, but more along the lines of “in this application, we know there will never be more than $N elements, so we can use a ring buffer instead of a resizable tree” or “we only care about this particular feature, so we don’t need a generic tree traversal + “visitor” but can use a simple recursive descent algorithm” - more specific on the higher-level algorithm choice itself, not just the implementation
in this application, we know there will never be more than $N elements, so we can use a ring buffer instead of a resizable tree
Right, but a language can make it easy to write a ring buffer of N items of type T, or it can make it easy to write one that is 32 items of a specific concrete type. If the language favours the former, you can put that in a library and the next time you need a fixed-size ring buffer, you don’t need to reimplement it, you just instantiate it with the desired size and type.
I’m biased in this example because I’ve implemented the lockless ring buffer design that Tim Harris and Keir Fraser created for Xen in a few languages, but I’ve never needed to do it more than once in anything other than C. Their implementation in C is macro hell and was much harder to write than a version specialised for a single type. Xen actually has a nice demonstration of this: the early-boot console uses the same core data structure but specialised for characters and not using the same implementation as other PV devices and it’s much easier to read (or, at least, was in the Xen 3 days, I haven’t looked for a long time) and was probably much easier to write. Yet in C++, the complexity of the two is almost the same and the templated C++ version is about as complex as the specialised C version (both to read and write).
If you’re in a language like Haskell or Lisp, it’s very easy to create the generic version of a data structure and often as easy as creating a specialised version. This makes it easy to build up a big library of off-the-shelf data structures so you never say ‘I need to implement this data structure’, you just say ‘I need this, specialised for X, Y, Z, what is it called?’.
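For concreteness, here is a minimal sketch (my own toy, not the lockless Xen design mentioned above) of the kind of generic fixed-capacity ring buffer that Rust’s const generics let you write once and then instantiate per use site:

```rust
/// A toy fixed-capacity ring buffer: capacity and element type are chosen
/// at the use site, so "32 bytes" is just RingBuffer::<u8, 32>::new().
pub struct RingBuffer<T, const N: usize> {
    items: [Option<T>; N],
    head: usize, // index of the oldest element
    len: usize,
}

impl<T, const N: usize> RingBuffer<T, N> {
    pub fn new() -> Self {
        Self { items: std::array::from_fn(|_| None), head: 0, len: 0 }
    }

    /// Push a value, overwriting the oldest element if the buffer is full.
    pub fn push(&mut self, value: T) {
        let tail = (self.head + self.len) % N;
        self.items[tail] = Some(value);
        if self.len == N {
            self.head = (self.head + 1) % N; // we just overwrote the oldest slot
        } else {
            self.len += 1;
        }
    }

    /// Pop the oldest value, if any.
    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        let value = self.items[self.head].take();
        self.head = (self.head + 1) % N;
        self.len -= 1;
        value
    }
}

fn main() {
    // "Specialising" is just picking the parameters at the call site.
    let mut recent: RingBuffer<char, 4> = RingBuffer::new();
    for c in "hello".chars() {
        recent.push(c);
    }
    assert_eq!(recent.pop(), Some('e')); // 'h' was overwritten by 'o'
}
```

The C version of the same thing either gets duplicated per element type or turns into the macro hell described above; here the generic and the specialised versions are literally the same code.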
Sure. But the thing is, there is nearly no other way to get experts into your code.
So they are always the way to do it. That doesn’t mean they are all experts. But you have to start from the point of view that dependencies are assets. If you start from the point of view that they are problems, then you would nearly never use them and so lose the benefit.
There is nearly no other way to get experts into your code.
Well, except hiring experts and paying them to work on your code. Which is what we do with our first-party code.
I highly doubt you are hiring the expert on everything, from CPU microcode to GUI, through data structures, compilers, linkers, drivers, cryptography, string and font rendering, etc.
This whole fiasco led me to think… do I even need this crate at all? 35 lines later I had the parts of dotenv I needed. Packages become unmaintained in every language, and it was my choice to pull in an arguably trivial dependency.
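For a sense of scale, a rough sketch of what such a hand-rolled replacement might look like (not the author’s actual 35 lines): the happy path really is short, and the caveats live in everything it deliberately ignores.

```rust
use std::collections::HashMap;
use std::fs;

/// Read KEY=VALUE pairs from a .env-style file, skipping blank lines and
/// `#` comments. Deliberately ignored: quoting, escapes, multi-line values,
/// `export` prefixes, variable interpolation -- exactly the edge cases a
/// maintained crate accumulates fixes for.
fn load_dotenv(path: &str) -> std::io::Result<HashMap<String, String>> {
    let mut vars = HashMap::new();
    for line in fs::read_to_string(path)?.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        if let Some((key, value)) = line.split_once('=') {
            vars.insert(key.trim().to_owned(), value.trim().to_owned());
        }
    }
    Ok(vars)
}

fn main() -> std::io::Result<()> {
    for (key, value) in load_dotenv(".env")? {
        println!("{key}={value}");
    }
    Ok(())
}
```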
Now you’ve re-invented the worst case (header-only libraries or bespoke implementations) of the reason Linux distro maintainers try so hard to un-vendor dependencies when building distro packages, and of why containers are a problem.
What if a bug is discovered? Now being aware that you might be subject to it, identifying if you are, and fixing it are all completely manual.
Among other things, you’ve broken the ability for cargo audit to surface RUSTSEC advisories and for GitHub Dependabot to offer up PRs to fix things. (I’d be so far behind on this sort of thing without Dependabot.)
(In practical terms, given real-world realities, you decided “I’m scared of this whole package becoming unmaintained, so, to jettison the code I never invoke anyway and hedge against the risk of new vulnerabilities being introduced later, I’m going to immediately force-unmaintain the ‘discover exploitable logic bugs’ side of things for the code I do invoke.”)
Human-value-calculus-wise, you’re essentially enacting what I call the C programmer’s fallacy: that oh-so-human tendency to operate on the principle that “I don’t trust other people’s code, but the code that I wrote/reviewed is flawless”, because it’s so damn easy to fool oneself. (In essence, the whole reason the scientific method is so focused on falsifiability and peer review to minimize the chances of another cold fusion or N-rays or polywater.)
Out of curiosity I ran toeki a tool for counting lines of code, and found a staggering 3.6 million lines of rust. Removing the vendored packages reduces this to 11136 lines of rust.
The link is correct, but you typo’d “tokei” in the visible text.
Also, while I haven’t used cargo vendor myself, my understanding is that, like Cargo.lock, it pins down the version of every package you might need across every conditional-compilation switch and platform-specific dependency.
(eg. Never building something for Windows? Tokei’s still gonna see and count the giant mass of Windows platform API bindings that got vendored so that, if someone on your team does audit things, surprise un-audited dependencies won’t creep into a build later just from adding --target or --features to your build command. Project won’t build for WebAssembly because WASI doesn’t have a required API yet? You’ll still vendor the WebAssembly support stuff for any portions of your tree that do support it. Using tokio but some of your dependencies have feature flags for async-std and smol? I’m honestly not sure what cargo vendor does for feature flags not routed to the top-level package.)
I think you can see how that would cause the number of lines tokei sees to shoot into the stratosphere.
How could I ever audit all of that code?
That’s where projects like cargo-crev and cargo-vet come in to help decentralize the work. For example, if Google has audited an earlier version of one of your big dependencies, and you trust Google to do their job on packages they use, then you can just audit the changelog since that version.
https://lib.rs/crates/tokio/audit
https://lib.rs/crates/tower/audit
https://lib.rs/crates/axum/audit
…
I have no idea… Many call for adding more to the Rust standard library, much like Go; however, this causes its own set of issues. Rust is positioned as a high-performance, safe, and modular language meant to compete with C++ and C. That means it targets things like embedded devices. Every new thing that gets added to the std library is one more thing for the Rust team to have to manage and work on. Just Tokio itself has one of the most active GitHub repos and programming Discords I’ve seen.
…plus, Rust’s current approach was designed based on experience with Python and Java… especially Python, where the rich standard library is treated by the developer community as a graveyard of obsolete and/or flawed designs that you should prefer out-of-stdlib replacements for. (eg. Don’t use urllib (or, before Python 3 merged them, urllib2), use Requests and its internal never-to-be-in-stdlib urllib3.)
What I will say that Rust needs to do better is making it clear which crates are de facto “Part of the stdlib which is packaged externally so it and the toolchain can be versioned independently”.
Reading key value pairs from a text file is not rocket surgery.
Honestly, that strikes me as exactly the attitude that has produced so many of the papercut bugs I’ve seen and some security exploits too.
A few examples:
- argv is just an abstraction fiction over the specific, not-universal quoting semantics that MSVC libc wraps around the underlying “the command-line is an un-parsed string” behaviour of the Win32 process spawning API.
- QMimeData gets wrapped in a bunch of compatibility hacks to work around other people’s non-compliant implementations of DnD/copy-paste of files, and a big test corpus has to be compiled to verify them.
- Argument parsers that accept --long=option instead of --long option, or only implement --long=option, or implement single-dash long args, or assume that the first non-option argument implies --, or don’t implement --, or only accept -? for help, or don’t implement any kind of help, or…
It’s the same attitude that makes me want to chain webtech app devs to a school desk and force them to read the entirety of the HIGs for all the platforms they intend to target before they’re allowed to omit or reinvent things like drag-and-drop or lists which should have multi-select. (Spoiler: The Windows 98/2000/ME HIG is a 594-page dead-tree book or MSDN Library CHM file, both of which I own, and the Windows XP and Vista/7 HIGs are HTML or PDF-format addendums to it… don’t get me started on how Windows 8 and beyond have been decaying.)
…and then force them to write drag-and-drop stuff in the browser a few hundred times until they crack and rip out their implementation based on native HTML5 drag-and-drop (i.e. native OS-level inter-window drag-and-drop) and do as Qt or GTK did and implement their own intra-window DnD, which isn’t prone to cancelling the operation you spent 10 or 20 seconds scrolling for if you forget to wait a second or two for the source and destination widgets to negotiate before releasing the button… on a Zen 4 CPU from 2023 with 64GiB of RAM in Firefox or Chrome.
No, it’s not that simple. You’re just externalizing the costs!
https://www.joelonsoftware.com/2000/04/06/things-you-should-never-do-part-i/
In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, “I don’t see the use of this; let us clear it away.” To which the more intelligent type of reformer will do well to answer: “If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.”
– G.K. Chesterton, The Thing (1929)
The way to deal with this is shared code reviews. See cargo-vet.
Some people feel safer by vendoring their dependencies, but that’s typically security theater, and at best a wasted, duplicated effort. If you just “LGTM” and merge vendored code, you’re just making an expensive backup (that protects against crates-io downtime, but not against malicious or vulnerable code; a crates-io backup is better done via a caching HTTP proxy). If you’re actually reviewing the code you’re vendoring, that’s great, but we don’t need people re-reviewing the same deps over and over again. No single project can realistically review millions of lines of code, but the community as a whole can.
Personally, vendoring was to avoid having to call out to crates.io every time, and it gives me a peek into my dependencies (more for interest than an actual audit). I’m starting to use cargo-vet now, but I found it doesn’t have audits for many of the crates I use, even when importing Google’s and Mozilla’s lists. Maybe there’s a better tool somewhere? It would be nice if cargo included some sort of health metric.
Personally, vendoring was to avoid having to call out to crates.io every time
What do you mean? You only call out to crates.io when you update a dependency, and vendoring doesn’t save you from that
As someone who works on a codebase that vendors dependencies, I think your comment is a bit too dismissive of vendoring as security theater. I think there is value in having the actual code of the dependencies in your version control so that you can investigate what state things were in by browsing monorepo history instead of having the indirection of having to download dependencies based on just a lock file in version control history.
But, indeed, if you vendor stuff, you shouldn’t just merge stuff without looking at the diff.
For added concreteness regarding cargo-vet: The most practical way of addressing the OP’s concern is using cargo-vet with at least the five imports seen at the top of https://github.com/mozilla-firefox/firefox/blob/main/supply-chain/config.toml . In principle/theory, it may feel deeply unsatisfactory to concede that people who are committers for the 5 orgs get to self-certify what they wrote, but in practice, you aren’t going to audit everything, so focusing your own audit efforts on what’s not already covered by these imports makes the remaining problem tractable.
browsing monorepo history instead of having the indirection of having to download dependencies based on just a lock file in version control history.
This seems like merely a UI/convenience issue, not a security aspect? The lock file has checksums, the checksums are verified, so you’re able to get the same code with the same consistency guarantees in both cases (apart from edge cases like crates.io disappearing completely without anyone having a backup, but I don’t think you have that in mind).
It is merely that, but that’s quite a load-bearing “merely” when git repo browsing tooling exists but integrating the display of crates.io crate content would require developing more tooling features.
I’m already using such tooling (cargo crev open $name), so to me the git solution is completely inferior — worse UI (GitLab web at work, instead of my local editor), worse performance (CI clones make vendoring cost exponential over time), and worse security (it doesn’t prevent code execution in a checked-out copy).
I’ll add one caveat to the line-counting strategy: Rust has more lines that don’t express anything really useful than most languages do. This is mostly because of the syntax, but also the common formatting style.
In an impl Trait with functions that have lots of trait bounds, you’ll likely get 4-5 lines before the body of the first function even starts, simply due to the where clause being indented and the heavy use of generics.
Likewise, because of the granularity of dependencies, most files will start with 10-15 use ... lines. Yet another compounding issue is the zero-abstraction approach to iterators with method chaining, where each method is put on its own line. Furthermore, cfg! and similar macros for OS-specific code add even more lines.
However, the sheer number of lines does make code harder to deal with, so it’s not altogether a bad strategy for getting an overview of complexity. Syntax matters more than people give it credit for.
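As a rough illustration (made-up names, rustfmt-style layout): a function whose logic is essentially one expression still spreads over a dozen-plus lines once the bounds, the indented where clause, and one-method-per-line chaining are counted.

```rust
fn collect_names<I, T>(items: I) -> Vec<String>
where
    I: IntoIterator<Item = T>,
    T: AsRef<str>,
{
    items
        .into_iter()
        .map(|item| item.as_ref().trim().to_owned())
        .filter(|name| !name.is_empty())
        .collect()
}

fn main() {
    let names = collect_names(["  alice ", "", "bob"]);
    println!("{names:?}");
}
```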
You make a good point; Rust is definitely verbose at times, especially when expanding out the macros. However, the sheer numbers make me shiver. I’d count crates instead, but this is also flawed, since packages like Tokio are split into 20-30 packages, all by the Tokio team, just split up. It makes it hard to track transitive dependencies.
I think your conception of tokio is pretty outdated; the code that makes up the tokio crate hasn’t been split up into many crates for years now (since 2019, actually). Look at the repo yourself: https://github.com/tokio-rs/tokio
It used to be split up because of compilation benefits, but that made managing the codebase a lot more painful, and people often complained about needing to add more crates for functionality, etc.
Hm, I’ll need to revisit this, thanks for pointing that out! I wonder if it hurt their compile time at all.
I’m pretty firmly in the “stdlib should be batteries included” camp. I like writing rust a lot more than writing Go, but generally end up using Go for my own projects because I know I can get 90% of the way there without ever worrying about choosing from a mishmash of 3rd party dependencies.
Obviously this is very specific to my usage - I’m generally writing small, lightweight web backends, where Go’s stdlib really shines - and it’s not really a core use case for Rust. But still, it would be nice to have the option to pull in a chunkier std somehow.
On the other hand, Python offers an opposite example. It used to be the flagship of “batteries included” languages, but being bundled with the language made it very hard to evolve. Every library is pretty much stuck in its 2010 state. So you have the worst of both worlds: you need to depend on lots of third-party libraries because they are far superior to the standard offerings, but lots of things depend on compatibility with the standard version anyway, so the ecosystem can never really move on (prime example: datetimes).
One good thing about Python’s stdlib is that third-party libs can wrap the standard lib to iterate on API design.
In a way the standard library can just provide very low-level functionality that third-party libs can rely on. And sometimes the low-level functionality (things like basic JSON parsing) is good enough… and people who need “better” implementations can special-case on their end.
The Rust world means that just for parsing a configuration file I now need to make a decision about what to use and look at a bunch of comparisons between libs, when at the end of the day anything could work and I should really just take the “lightest” solution.
I’m not buying that argument even after all those years. I’ve built dozens of small things in Python and I have usually used requests as an external dependency and nothing else, and only if I needed some advanced HTTP stuff. It very much depends on what you are building, and sometimes you need a better argv parser, but not that often.
Well, let’s look at requests then.
It depends on five projects:
All but charset_normalizer have alternatives in the standard library which requests does not use because they’re bad. Two are even actively maintained forks of standard library modules. Everything but pytest would be expected functionality for a modern batteries included language. Perhaps requests, even.
So here we have the arguably #1 most popular python library and it avoids using the standard library at every place. The stuff in there is kind of usable, in the “we have food at home” sense, that is true. But practice shows that in python, everyone just ditches the standard library as soon as they can. That is not the mark of a successful standard library to me.
aiui the big reason to keep things out of std is so APIs and implementations can evolve without being tied to Rust versions/editions.
That and no one can really agree which batteries to include.
It depends. The more batteries, the greater the burden on the language designers. Go is a language designed specifically for building web services, so it makes sense for it to include those particular batteries. If the language is funded by a FAANG and the batteries are aligned with the business model, it’s easier to justify the costs of building and maintaining them.
On the other hand, as a developer, I know what you mean. I spent most of my career in JavaScript. About midway through found myself in a project where I needed features and bug fixes in a .NET service my Vue code was relying on. I could either learn C# and write a bunch of it myself or wait for someone to do it for me. I was never a Microsoft fan and even less a fan of inheritance-crazy OOP languages, but I decided to suck it up and become a full stack .NET dev for that project. It was refreshingly easy. It helped that the lead .NET dev kept things clean, but just as important was the fact that there was almost always one idiomatic way to build something and often times it was already a feature built into the framework. If you took the time to read the docs, you could build a lot of features in a short time and usually leave code other developers could easily read and understand. It was only after this experience that I realized that communities with fragmented ecosystems tend to suffer from a lot more confusion, reinvented wheels, and churn. Not everything Microsoft makes is brilliant, but there are definitely days when I’m building something fairly conventional in JavaScript and there are still a thousand ways to do it. I generally like my own code because I take a lot of care to critically read my commits before I push them, but I’m still never totally sure that the choices I’m making will be as clear to other developers who come in behind me as they were to me.
Not sure whether this overlaps with Rust’s core use case (I’d be curious to hear what that is for you @strongoose), but just an anecdote that we’ve included as many batteries as possible in Raku. It’s such a pleasure to be able to do so many things without needing any dependencies. I guess it’s pretty common to have a robust standard library in “scripting” languages, though some take the Perl approach and pull in frequently used/deemed-important third-party modules.
This is why I’m a bit confused by so many people claiming here that “it’s the same in all languages” when it is easily verifiable that this is not, in fact, remotely true. Zig is a good contrast here too, demonstrating that it’s not just dynamic “scripting” languages that disprove the claim.
Yeah, the most common “core use” that I see cited as a reason for the thin stdlib is embedded systems - but I don’t know very much about that kind of development so I’m not well placed to comment on it tbh.
Now count the number of lines of code your average non-trivial C program has inserted into its address space when the dynamic linker gets involved.
I think a much more interesting metric would be ‘LoRC’ (Lines of Reachable Code). Strip the dependency tree at the function level, aggressively remove dead code, and now tell us how many lines you’re pulling in. That’s a more useful number to work with for the sake of security, performance, binary size, etc.
An exploitable bug in your program could very well make previously unreachable code now reachable.
No. Unlike shared libraries imported into C programs by the dynamic linker, Rust will not include any dead code in the compiled artifact. Besides, Rust has much better protections against those kinds of bugs than e.g. C.
After Rust does dead code removal, there is a resulting minimal binary. Working in reverse, would it be possible for the Rust compiler to do a transitive closure over the source code that is necessary to generate the minimum binary? If so, that subset of source code could be output and thus evaluated more easily.
In theory yes, but in practice the mapping from source code to object code is too complex.
The closest real-world equivalents are debug symbols (such as DWARF) and code coverage instrumentation. I think Rust currently emits coverage data at the line level, but if it had Haskell-style per-expression coverage data then you could work backwards to find all the reachable expressions, and then do some sort of parse-to-AST-and-diff operation to emit the subset of source code that ended up in the final binary.
The challenge would be implementing such a design without accidentally building a full-blown decompilation framework ala Ghidra.
Interesting idea! From what I can read though, dead code elimination is done primarily by LLVM. You would also need a way to go from whatever IR you’re in back to the frontend language, which I’m not sure is generally possible.
At this point this is a cultural/generational divide.
Some people were raised reimplementing lists and strings from scratch in every new program and are terrified of a library that does anything they deem trivial.
Some people were raised with a library ecosystem that had the kitchen sink and don’t see a reason to implement something twice.
As someone who pretty much straddled that timeline I came to abhor the implement data structures every time experience and have mostly joined the other side.
Rust compiles slowly, so the ecosystem encourages small crates as a way to reduce build times.
That may be, but 3.6M lines of code is still 3.6M lines regardless of whether it’s in one or 3,000 packages.
Given what I remember about cargo vendor, I’m fairly certain that 3.6M lines includes every possible dependency for every possible build configuration on every possible target platform. (i.e. I believe cargo vendor downloads everything that Cargo.lock pins.)
Hell, it wouldn’t surprise me if you told me that most of those 3.6M lines were inside the machine-generated windows-sys crate that Tokio depends on if building for a Windows target. The size of “the Windows API”’s surface makes Qt’s stable of bundled functionality look quaint.
(Because getting people to depend on built-in functionality instead of portable libraries was their original vendor lock-in strategy, so, before they conceded the API war, they needed to have a lot of built-in functionality, and the camp used to doing that still exists within Microsoft.)
The IDL files windows-sys was generated from total 30.87MB in size! Even assuming they’re using something as verbose as XML, that’s still just interface definitions!
Hell, it wouldn’t surprise me if you told me that most of those 3.6M lines were inside the machine-generated windows-sys crate that Tokio depends on if building for a Windows target. The size of “the Windows API”‘s surface makes Qt’s stable of bundled functionality look quaint.
I was wondering how big of a difference it would actually be, so I threw together a quick project with the dependencies they listed and compared the regular vendor result with cargo vendor-filterer --platform=x86_64-unknown-linux-gnu. The result is 1,746,560 fewer lines of Rust as reported by tokei (3,671,590 → 1,925,030), which seems like a pretty decent reduction. Filtering to only normal dependencies brings it down another 96,676 (1,925,030 → 1,828,354).
The remaining top 20 crates by lines of Rust as reported by tokei are:
linux-raw-sys 367306
encoding_rs 134587
libc 126917
tokio 85543
syn 58915
rustix 56410
regex-syntax 53901
rustls 40387
regex-automata 40064
openssl 30132
ring 25974
portable-atomic 24512
hyper-0.14.32 23877
rayon 23797
unicode-width 22489
h2 22427
h2-0.3.26 21943
reqwest 21776
tracing-subscriber 20243
serde_json 20159
linux-raw-sys makes up a full 20% of the remaining lines of code. After that there is a pretty long tail of dependencies, and a lot of the big ones could probably be removed by disabling features one doesn’t need. E.g. encoding_rs, which is pulled in by reqwest’s default feature chardet for supporting browser-like encoding detection and decoding (which they quite likely don’t need for their use case), is 7% (134,587) of what remains.
Looking into the dependency tree in a bit more detail, it seems quite a lot of dependencies are pulled in by their choice of unzipping tool, because it doesn’t split the library and binary up, so you end up pulling in things like clap, env_logger, and indicatif. Replacing it with just a dependency on the zip crate itself, which is what ripunzip is built on top of, gets rid of 895,803 lines. That’s almost half of the remaining lines of Rust!
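For reference, depending on the zip crate directly might look something like this (a sketch assuming its ZipArchive::extract API; the paths are placeholders), with none of the CLI-oriented dependencies mentioned above:

```rust
use std::fs::File;
use zip::ZipArchive;

/// Extract every entry of `archive_path` under `dest`, creating
/// directories as needed.
fn unzip_to(archive_path: &str, dest: &str) -> zip::result::ZipResult<()> {
    let file = File::open(archive_path)?;
    let mut archive = ZipArchive::new(file)?;
    archive.extract(dest)
}

fn main() -> zip::result::ZipResult<()> {
    unzip_to("bundle.zip", "unpacked")
}
```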
It’s in every language.
Write more code yourself is the only solution.
Except even if you replace 1000 dependency LoC with 10 written LoC… that’s still +10 LoC that you have to maintain.
Packaged code can break but so can your own code.
Yeah, that’s what most have been saying; at least with my own LoC, I wrote it, so it’s easier for me to debug and reason about.
You are looking for Sane dependencies. However there is no silver bullet. It is about finding a balance between NIH syndrome and dependency hell.
What helps is modular design: 1) of your dependencies, which allows you to pick only the small pieces you want to depend on instead of bulky packages; 2) of your own code, which allows your users to pick only the parts of your software they need, and thus only the related third-party dependencies.
And it is a bit more complex, because it is not only about LoC and the number of dependencies but also about their quality. Depending on a package from a random unknown author is riskier than depending on a package from a well-known company or organization (a package used by many others, where there are chances that somebody else did the audit or somebody else will fix the bug when found). On the other hand, the library from a well-known author can be a bulky package that does much more than you need and carries a historical burden with it. Meanwhile, the library from an unknown author may do just the one thing you need and have so little code that you can do the audit yourself and fix bugs yourself if necessary. This is quite a multidimensional problem.
Great link, I’ve never seen this one before! I’m hoping I get better at picking out dependencies over time and that the larger companies have an established ecosystem of crates they use so I feel a little less hesitant to use them (like Tokio has become)
In general I considered the project to be trivial, a webserver that handles requests, unzips files, and has logs
People have very different ideas of what trivial means. What are some existing points of comparison? What are some languages/libraries that demonstrate a smaller (or even minimal) set of dependencies to solve the author’s need? How do they achieve this?
The main thing that concerns me about the rust ecosystem is centralisation.
Let’s say the Rust Software foundation suddenly decided that they did not want an open source developer participating in their community, for non-technical reasons. Do they have the power to make crates authored by them unavailable to the wider rust community, by dint of them controlling crates.io?
Cargo supports alternate registries as well as git dependencies; not being on crates.io is an inconvenience, not a damnation.
I don’t see what the problem is.
Nobody’s forcing you to use any of those crates, and if you want to “audit” them, you can. (What does that even mean, btw?) It will be a big task, but there’s just no getting around that nowadays. Your Rust application is running on an operating system with millions of lines of code, so to fully trust the system you’ll be reading through all of that at a minimum.
Using popular and published libraries means there are other people working with them and finding problems. A lot of people are looking at tokio, nobody is looking at your home grown alternative.
There’s no getting around the fact that you have to research your dependencies and decide which ones to trust.
To chime in: I’d rather depend on a well tested, well used comprehensive library that does more than I need if it covers all the edge cases. There are numerous situations where the edge cases seem rare — until you do fuzz testing.
There’s a relevant RFC just opened today, trying to address the issue of so much “table-stakes functionality” being sourced from different places.
I do feel like the Rust crate ecosystem is getting a little leftpaddy.
Do you have an example you would like to share? I keep hearing this sentiment but I rarely come across any crate that is heavily depended on and also very trivial, like the original left-pad was.
https://crates.io/crates/cfg-if is a classic of the genre.
This is maintained by the Rust team, used in the Rust compiler and standard library… it’s code you’re already depending on by the time you’re compiling Rust at all. This doesn’t seem to fall into the genre at all.
The issue of left-pad isn’t about who maintains it or who uses it; it’s a matter of whether very small dependencies are encouraged or discouraged.
NPM is an example of a dependency culture that encourages very small packages, with left-pad being the cited example because (1) it’s only ~30 lines of code, and (2) it’s famous for that time its removal from NPM caused mass build failures due to how widespread it is in transitive dependency graphs.
Conversely, programmers in C/C++ or Go do not typically use very small dependencies – either they depend on a few large dependencies that each contain lots of functionality, or they just write the code themselves. So you end up with libraries like GLib that are absolutely huge – GLib contains an event loop, a test framework, an XML parser, and more.
The crates.io dependency culture is caught in the middle between people used to working in JavaScript (NPM), who write small libraries like cfg-if or …, well, like left-pad… and people used to working in C/C++, who write large all-in-one libraries like rustix or winapi.
I agree that cfg-if is an easily avoidable dependency, but I also doubt a 70 line macro is something the average Rust developer can bang out on their own. I sure can’t.
First, a 70-line macro should be well within the capabilities of every Rust developer. It’s not even a proc-macro, it’s just plain old macro_rules!() with some trivial adjustment of the if-else expression tree.
Second, the syntactic sugar it enables is IMO so minor that it doesn’t justify adding a dependency in the first place. If cfg_if!() were part of the Rust standard library then that would be one thing, but pulling in a third-party dependency just for slightly terser #[cfg(...)] attributes is very reminiscent of left-pad.
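To make that comparison concrete, here is roughly what the crate buys you versus writing the attributes out by hand (a sketch; the functions are placeholders, and the first form needs cfg-if as a dependency):

```rust
// With the crate: one if/else chain; the macro makes the arms mutually
// exclusive for you.
cfg_if::cfg_if! {
    if #[cfg(unix)] {
        fn platform() -> &'static str { "unix" }
    } else if #[cfg(windows)] {
        fn platform() -> &'static str { "windows" }
    } else {
        fn platform() -> &'static str { "other" }
    }
}

// By hand: the same thing, spelling out the negations yourself.
#[cfg(unix)]
fn platform_by_hand() -> &'static str { "unix" }
#[cfg(all(windows, not(unix)))]
fn platform_by_hand() -> &'static str { "windows" }
#[cfg(not(any(unix, windows)))]
fn platform_by_hand() -> &'static str { "other" }

fn main() {
    println!("{} / {}", platform(), platform_by_hand());
}
```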
For starters, I don’t think anyone should be using version 0 of crates they find interesting. Here be dragons.
That doesn’t work in practice. There are crates that got to a very mature state with 0.x numbers, and releasing 1.0 would be unnecessarily disruptive to the ecosystem when there is no pressing need for an actual breaking change.
That vendoring feature sounds nice, I’d love to do the same count with my projects – is there a way to do this with stack/cabal in Haskell? (Surely there’s a way with nix?)
Seems like there was some movement about vendoring in stack here (https://github.com/commercialhaskell/stack/issues/3813). Definitely one of my favorite features of cargo. If any Rust Foundation people are here, it would be amazing if cargo could filter dependencies based on the platform you’re currently on (i.e. only Linux).
I understand the concern regarding dotenv vs dotenvy. Tokio is a pretty mature library though. Is the author concerned that libc is a dependency?
More that Rust has a lot of fan-out dependencies, and the sheer magnitude of some of them. IMO Tokio is basically part of Rust for the sake of what I do (mostly servers).
This is such a trade-off in any language. I don’t use rust but even in C++ any project of sufficient size will probably pull in some non-system source repos. And the situation for javascript and python is famously chaotic, with outright malicious packages masquerading as well-known packages with similar names.
The only real solution is to write the code yourself but that is more work both initially and for maintenance, let alone that I would find it very difficult to replace things like libcurl or libssl.
I don’t particularly care for Go’s modules, I like Ruby’s Bundler packaging but I’ll take either over Rust and JS’s “million little packages” approach to dependencies. Dep problems seem to grow like O(n^2) (quadratically?) so apps with more, smaller dependencies quickly grow unmanageable.
It seems every popular language has to go through the same stage of maturity. It’s too late to cling to the existing ecosystem but a bit too early to have a battle-tested one of its own. Just give Rust some time to gain enough trust. In a blink you’re gonna see articles like “Why does X have so many LoCs to maintain, why not just use Rust?”.
Just to name a few that I remember:
And so on and on. Until we invent The Language (C++ tried to be the one) that everyone agrees on using for everything, there’s gonna be change, and with change comes uncertainty, be it in speed, safety, or adoption. Just keep managing the risks with proper tooling and processes and keep going!