I Don't Like Imports
33 points by jo3_l
33 points by jo3_l
Personally, I’ve seen significantly more code where the problem of understandability comes from being overly verbose than code which I can’t understand due to unknown external imports. I’d advise the opposite approach if you want easy to maintain software.
The “it’s too long” argument is a reasonable stopping point that makes me disagree with this advice. The purpose of all abstractions in programming is to make it quick to easily read and understand code. An import is one of these ways. Adding repeated / unnecessary namespaces to every place that names are used makes code more difficult to read easily at scale (as a general rule). As a general rule, verbosity often harms intelligibility.
Some rust specific things I personally like:
- group imports by module, not by crate (this is a good balance that makes lines of code short, and diffs easy to grok)
Alternatively, but also either way really, difftastic is as good its name suggests.
While the tool may be great, that only solves one use case for where diffs appear, but there are many (github PRs, emailed patches, command line, various IDE diff views, …). I’d generally solve the problem of noisy diffs by avoiding it rather than impose tooling that only works for one of those use cases.
Personally, I’ve seen significantly more code where the problem of understandability comes from being overly verbose than code which I can’t understand due to unknown external imports. I’d advise the opposite approach if you want easy to maintain software.
Verbosity impacts the lexical parsing of code (by human eyeballs) for sure, but understandability is a higher-order property, and maintainability higher-order still. There are of course levels and limits to the game, unnecessarily verbose code will be harder to understand than terse code that’s smartly written. But, to crib the article’s example: mentally de-referencing the implicit identifier get
to reqwest::get
according to file- and/or context-dependent import declarations, will surely incur a higher cognitive burden than reading/parsing the explicit identifier reqwest::get
in the first place.
I would almost certainly lump reqwest::get()
in the “use full names where there’s ambiguity (e.g. custom Result / Error types)” point of my comment.
I’d advise the opposite approach if you want easy to maintain software.
A quick clarification on this: I’m opposite of the author’s advice to expand all refs. But I agree with the author’s view of flat APIs. I’d phrase it as “shallow and wide” rather than “flat”. There are use cases where it’s more useful to group certain parts of a system as vertical slices. E.g. a foo module with FooController, FooView, FooModel, FooQueries rather than putting these into horizontal controller/view/model modules. This is one of those “Code that fits in your head” type heuristics.
The real problem is that there is just a single way of grouping. That simply doesn’t make any sense. Databases have secondary unique indexes for a reason.
Another way of phrasing what you just wrote is that code generally has a single identifier which has a single namespace. Technically that’s actually wrong in many languages due to aliasing. E.g. Rust has aliasing and re-exports
// in mycrate/src/lib.rs
pub use foo::Bar as FooBar;
pub use baz;
// in some other place
use mycrate::FooBar;
use mycrate::baz;
The hassle with this is that this multiple identifiers that mean the same thing can make life more confusing. There are good use cases for this sort of thing (e.g. re-exporting libraries that appear in your public interface to make it easier to import the correct library versions even when there are multiple semver incompatible libraries in use). This should be used sparingly though IMO.
That’s not what I meant. You are talking about aliases. That would be doing “as” in the sql query. I’m talking about the join/filter part and on which column it works on. So that would be the “use x::y” thing. It’s basically filtering by primary key and that’s all it can do.
This post advocates for fully qualifying external symbols by the crate/module name, instead of pulling them into the local namespace. The code examples are in Rust, but the point is general.
Personally, I largely agree with this stylistic preference, but don’t care about it to the same extent as the author: I’ll follow standard ecosystem style in languages where pulling in external package names is idiomatic.
I do like the approach that Go takes here: there is no way to import specific symbols within packages, so you are forced to use the package name at all times[^1]. Moreover, there is no nesting of package names; subpackages must choose their own name. This design choice, along with a strong convention in the Go ecosystem to avoid repetition when choosing names for package contents, helps avoid long names. From the Go dev blog on naming package contents:
Avoid repetition. Since client code uses the package name as a prefix when referring to the package contents, the names for those contents need not repeat the package name. The HTTP server provided by the
http
package is calledServer
, notHTTPServer
. Client code refers to this type ashttp.Server
, so there is no ambiguity.
[^1]: unless a dot import is used, which is heavily discouraged and a rare occurrence
The Go guideline for picking names for prefixed-use works well if you’re reading code which largely uses third-party libraries (or more generally, the code outside the package, not inside it). It doesn’t work so well if you’re yourself jumping between internals across code with similarly named types.
At work, we have a ~1M SLOC Go monorepo. At some point, we had dozens of types all called store/Store (for the DB layer) and service/Service (for the business logic layer). This made reading code more painful; you’d have to constantly reorient yourself when switching between packages (“I’m in this folder, so Store here is referring to this kind of store”).
Direct text search for type names becomes useless because of this; you need to either use semantic symbol search, or do text search for the correct prefixed use, then do Go to definition.
So for certain parts of the code, we’ve actually started going against the official Go guideline and started following the pattern in the article (FmtWrite vs DebugWrite).
At work, we have a ~1M SLOC Go monorepo. At some point, we had dozens of types all called store/Store (for the DB layer) and service/Service (for the business logic layer). This made reading code more painful; you’d have to constantly reorient yourself when switching between packages (“I’m in this folder, so Store here is referring to this kind of store”).
Usually for this you use import aliases dbstore "myrepo/db/store"
That helps with fixing collisions in downstream packages. My point is about reading the implementation code in the ‘store’ package itself.
One thing that annoys me about the Go style is package names (and type names) collide with variable names. So you end up with awkward names to work around it. And capitalization to export makes it worse, e.g. if you decide to unexport Foo you have to rename lots of foo variables.
At work, we have a ~1M SLOC Go monorepo. At some point, we had dozens of types all called store/Store (for the DB layer) and service/Service (for the business logic layer). This made reading code more painful; you’d have to constantly reorient yourself when switching between packages (“I’m in this folder, so Store here is referring to this kind of store”).
This is one of the most baffling cases of “I don’t understand how you’re using your tools” I’ve encountered. I constantly use global lookup for classes (I do the bulk of my work in Java), and having each class name be unique is such a speedup. Having multiple classes with the same name wastes a lot of my time.
I do think this way of working presupposes a lot of familiarity with the code base. If you’re not navigating quickly in the first place, then not being able to immediately jump to FooService is probably less important. But if you can achieve that familiarity, it can speed up finding code so much.
P.S. Though, I just realize I’ve broken this rule. One of my libraries has a com.justinblank.strings.Pattern
class that can often replace the java.util.regex.Pattern
class in code. Now I’m considering renaming it before the API gets used and fossilizes.
I mean a 1M SLOC code base is going to be a disaster more or less by definition, but for my puny 20K LOC Go project, I would do things like imagestore, filestore, etc. instead of just naming a package store by itself. Nesting directories doesn’t do much in Go (it does control the /internal/ mechanism but not much else), so typically you wouldn’t want to do ourorg/project/image/store, just ourorg/project/imagestore.
My experience is based on a java codebase that’s > 3 million raw lines (just wc -l, not semantically aware) that largely held to the “unique name per class” rule, until turnover led to more work by developers who didn’t know the codebase, broke the rule, and made things a lot less convenient.
Interesting perspective, thanks! Your example is pretty compelling; it’s true that the Go style somewhat degrades readability in package implementation code.
Thinking about this problem a bit more, I think it’s actually easy enough to observe these kinds of readability issues even in projects that don’t have packages exporting similar things as in your example. Many packages export a New()
constructor, which reads nicely from the outside when prefixed but is just invoked as New()
internally. Figuring out what that call does is mentally taxing (“new what?”) if you haven’t context-switched into the package.
On balance, I still like the Go style because it reads so nicely from a package client perspective, but there is a real problem from the package implementer perspective. I don’t really know how to fix it in a satisfying way either. Language design is hard :-)
This would be a great editor feature. If I am copying from one file to another in the same editor it should be able to also pull in the required imports.
JetBrains IDEs have been doing this (at least for Java and C#) for many years now.
Generally agree with the sentiment, a big offender in Rust specifically is std
vs tokio
– authors of Tokio decided to use the same type and method names as std
, so now you see, e.g., a Command
and have no idea whether it’s sync or async until you check usages or see the full type using intellisense.
While I personally prefer languages that nudge you to specify the namespace, I feel like even type imports can work reasonably well, as long as the ecosystem follows a convention of naming the types in a “unique-enough” way, as e.g. the .NET ecosystem tends to do (and as long as I never have to manage the imports manually and rely on the IDE to do it for me).
std::sync::Mutex
vs tokio::sync::Mutex
is the bane of my existence 😛. And of course the name used in each file is dependant on which one was first used in the file. Most of the time the new import gets a local alias rather than touching all of the code that uses the other one already.
An argument for explicit imports on the top is that you actually have an easier way to get an overview of the dependencies of your module. Now Rust doesn’t offer that guarantee (as fully-qualified and local use
statements exist), but thinking about dependencies is often useful when you think about abstractions and loose coupling. Being able to eliminate an import in many places might be a signal your new abstraction is working out, for instance.
Besides importing the name or fully qualifying things, a nice compromise can be to import the module use std::collections
and then refer to collections::HashMap
. Maybe I missed it, but if not, I’m a bit surprised the original post doesn’t mention this. Importing a module uses an import but retains a some local qualification and thus can help resolve ambiguity while still being short.
On nesting: I tend to start with a flat hierarchy and then build up some nesting as the separations become more clear. But nesting is fine with me if it’s an emergent grouping rather than some kind of premature organization.
I wonder how much tooling makes a different in this preference. If a tool makes it easy to trace something back to its original location, perhaps that can affect preferences?
use std::collections and then refer to collections::HashMap
(Original author here.) I would maybe consider this for std
as you can sort of view it as a collection of libraries. But to me this adds back collisions for 5 characters, which doesn’t seem worth it to me. Sure, the collections
crate doesn’t exist but many other std modules do such as num
. Honestly if I was typing out std::collections::HashMap
a lot my preferred approach would be to just import it directly rather than use std::collections
. Basically if I am going to take all of the downsides of a local name I may as well make it short.
I’ve worked on codebases (in Rust) with an annoying mix of legacy code written by “I refuse to actually import the types I’m going to use” folk, it never, not for a single moment made the code more understandable. Just made it verbose.
I’ve had the opposite experience. Having bunch of imports with similar names just makes it very confusing where they actually come from.
In my editor when I hover over these types it tells me immediately where they are imported from. It might be useful to you to have such a feature available. There are multiple options you should check them out.
I don’t feel as strongly about importing vs. qualifying, but I agree with the author – I don’t like the Rust convention of exposing an API with many nested modules. There are very few crates that need more than one flat namespace. It’s easy to present a flat API with pub use
in the root, but the path of least resistance is to skip that and conflate implementation modularity with public API namespacing.
The copy/paste feature they ask for works pretty well in the move semantics in intellij/rustrover, with regards to respecting imports when moving a function.
Otherwise, I don’t get the point. Naming things is literally the entire project of programming.
I do think if more libs offered top level re-exporting of the important bits that this would not feel like the worst pattern for many libraries.
I do agree with the sentiment that code shouldn’t be “too long” (subjective) because that does affect legibility. But having the single name (reqwest::get
) really helps highlight the domain separation and what subsystems you’re poking at clearly.
Overnesting is a plague that permeates programming everywhere. It’s a special case of premature abstraction.
Typescript has a hand of it as well putting everything into tiny files in subfolders of subfolders. It’s never necessary but a certain kind of person doesn’t deal well with seeing a root folder with more than 4-5 files in it.
I use a similar pattern in Java in select circumstances.
Specifically, when using Guice (a dependency-injection framework). Guice has “modules”, units of composition of dependency-injection bindings. A module can “install” other modules, creating a hierarchy.
A best practice is to keep your hierarchy shallow, and to have a top-level module for your binary that installs as many of your cross-cutting, independent modules as possible in one giant list.
If you import all of the modules you install at your top level, then two things can happen:
install(new net.foocompany.bar.ClientModule());
install(new com.quxcompany.quux.ClientModule());
install(new ClientModule());
install(new ApiClientModule());
So my preference is to use fully qualified install lines for all modules.
(I haven’t found teammates who agree with me yet)
As a disclaimer, I haven’t done anything with Rust. I’m reading this as a general take on imports.
This, to me, only seems to be an issue when you are not using an IDE to begin with. I know there are entire tribes swearing by TUI editors and all that, so I suppose that is a real possibility. But, I rarely ever get confused about where things originate from. The same with moving code around, a good IDE will also help you with that.
To be fair, the author does acknowledge this at the start, though from a slightly different perspective.
I find when I am working on codebases where the convention is to import things I am very frequently tracking down the definition of some type or function as I don’t know whether it is local or from which library. This is especially painful when reading or reviewing code in a web interface.
I can see the argument. But, personally, I am not comfortable prioritizing my code based on the notion that someone might view it in a web interface. I’d also argue that in a diff it is actually better to have your imports at the top as it makes it extremely clear when you introduce or remove these dependencies.
And also here, a good IDE will help. At the bare minimum, you can check out the branch under review and diff it with main. Many code forges like GitLab and GitHub also offer extensions for IDEs allowing you to open the entire merge request complete with working imports.
The only thing worse than bikeshedding is bikeshedding by people that produce unnecessarily verbose and difficult to read code
Elixir is the worst offender I’ve encountered when it comes to imports as there are at least 3 different ways to do so and when you’re in the middle of a large file it’s nearly impossible to know whether the function is
import
use
require
without reading the top of the file. It could even be coming from a module defined within the same file!
I don’t have that much experience importing names because an unqualified name is very different in a C++ context vs a Python context. In Python, the most complicated behavior you’re likely to see is that your code consistently calls the wrong function. As a ridiculous example:
from math import log
from logging import getLogger as log
log(2)
But, the behavior here is quick enough to debug compared to the experience of unqualified name lookup from C++. In C++, I would rather type a lot in order to avoid name lookup rules that interact with overload resolution if it isn’t intended as a customization point.
I just read the article complaining about how Go is a garbage language for garbage people, and lol this problem is completely impossible in Go. It’s almost like different languages have different pros and cons.