"Vendoring" is a vile anti-pattern (2014)
5 points by marchuk
Use git submodules
Absolutely not. They don't even get cloned by default. With vendoring you have the files. With git submodules you may or may not have the files, and your users will create bug reports asking where the files are. And they'll be right to do that, because git submodules have such shitty UX that nobody should be expected to waste brain cycles using them or even thinking about using them.
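To make the "may or may not have the files" point concrete, here's a throwaway-repo sketch (all names and paths are made up for the demo): a plain `git clone` leaves the submodule directory empty until the user opts in with `--recurse-submodules` or `git submodule update --init`.

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"
# Identity for the throwaway commits below.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A "library" repo that will be used as a submodule.
git init -q -b main lib
( cd lib && echo "lib code" > lib.txt && git add . && git commit -q -m lib )

# An "app" repo that references lib as a submodule.
# (protocol.file.allow=always is needed for local-path submodules in git >= 2.38.)
git init -q -b main app
( cd app \
  && git -c protocol.file.allow=always submodule --quiet add "$tmp/lib" vendor/lib \
  && git commit -q -m "add submodule" )

# A plain clone does NOT fetch the submodule's contents.
git clone -q app clone1
files_before=$(ls -A "$tmp/clone1/vendor/lib" | wc -l)   # directory is empty

# Users must either clone with --recurse-submodules ...
git -c protocol.file.allow=always clone -q --recurse-submodules app clone2

# ... or remember to initialize after the fact:
( cd clone1 && git -c protocol.file.allow=always submodule --quiet update --init )
```

With vendored files, by contrast, the first `git clone` is the whole story.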
Yeah, never use git submodules. I’ve linked to the Lobsters discussion because it’s informative, though I agree with the article – especially the long list of common git features that no longer work properly in the presence of submodules. Infuriating.
We really should fix submodules. Honestly, they also violate git's distributed-VCS paradigm.
Are subtrees not that? Or by fixing submodules, do you mean transparently improving the behaviour of existing submodules that many repositories have already defined?
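For comparison, `git subtree` merges the dependency's files into the parent repo's own history, so a plain clone just works. A throwaway-repo sketch (names are made up; `git subtree` ships in git's contrib directory and is included in most standard installs):

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Upstream library to be vendored.
git init -q -b main lib
( cd lib && echo "lib code" > lib.txt && git add . && git commit -q -m lib )

# Application repo: pull lib in under vendor/lib as a subtree.
git init -q -b main app
cd app
git commit -q --allow-empty -m init      # subtree needs an existing HEAD
git subtree add -q --prefix=vendor/lib "$tmp/lib" main --squash

# Unlike a submodule, vendor/lib/lib.txt is an ordinary tracked file,
# so any plain `git clone` of app includes it.
git clone -q "$tmp/app" "$tmp/clone"
```

Later, upstream updates can be pulled with `git subtree pull --prefix=vendor/lib <url> <branch> --squash`; users cloning the repo never need to know a subtree is involved.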
My take:
Those are probably covered by the comments in the Gist, but I lost interest by then.
I'm personally positive about the practice of vendoring source code, since it gives more control and options in the long term.
However, I'm also curious how dependency vendoring has progressed over the past ten years, and what the current solutions for transitive dependencies are.
Honestly? I love vendoring. Like with anything else, there are good and bad ways to do it. But in my experience, it can greatly simplify builds and dev. If you pull your repo and have a decent compiler, it'll build -- none of this "semver ~3.1.1 pulling in 3.1.8 which isn't compatible," no download surprises at deploy time. I've gone back to repos where we've checked in generated artifacts (Thrift classes) or vendored deps, and they still worked on a clean clone a decade later.
We've done a lot in the 12 years since this was published to improve reproducible builds (and build systems generally), but after my 5th company with JS/Python codebases whose "build" step was more opaque and unpredictable than the Flash Runtime, you know something's gone horribly wrong.
Just like it's possible to over-DRY a codebase, I think it's possible to configure your builds and CI too cleverly. Hating on vendoring is one such example; others:
Insisting on immutable infrastructure, all the time (great post here). Especially if your team is small and pre-PMF, you can save yourself a lot of headaches by letting your build system use concrete, mutable infrastructure when appropriate.
Demanding "cattle, not pets" too early. This goes hand-in-hand with the above: you can manage a few pets. "Your pre-PMF company now has to build a crappy clone of Heroku in addition to whatever the company actually is."
Infrastructure-as-code too early. I'm the least attached to this one, but IME unless your company is doing heavy, heavy lifting, using Terraform or Pulumi introduces extra steps, and those codebases grow big and tangled, fast. I'm reminded of that brief period when people were insisting that using Cucumber/BDD was an improvement over normal unit testing, and feeling like they were confusing "typing" with value.
I agree that vendoring is often overused. "Why is vendoring bad" reasons 1 and 2 are good reasons. 3 is a good reason most of the time, but sometimes it is the point (and 5 is basically 3). And 4 is IMO irrelevant except for very large dependencies.
There are also some practical considerations. In the C and C++ ecosystems, many libraries actively encourage vendoring, and vendoring is often the easiest integration path (e.g., dear imgui, glad). Further, in an ecosystem where most people do not use a package manager, libraries vendoring their transitive dependencies makes much more sense. And since many developers dislike git submodules, it’s understandable that libraries prefer not to force them onto their users.
Also, there is another, more evil variant of vendoring (checking in binaries) that this post didn't cover. There are many valid reasons to do that too, though.