Compiler reminders
37 points by jfmengels
Even though we don’t call it that, Rust compiler errors are written in this way. If we notice or get a report of a common incorrect code transformation or subtle mistake (like mistyping :: or ; as :, or writing an enum variant where a type was expected), we look at what steps are necessary to get to the intended effect (when one or two are obvious) and provide suggestions in that direction. When the compiler stage architecture is amenable, we skip intermediary code transformation steps and suggest the target directly. But if the compiler doesn’t yet have enough information (like needing type information during parsing), and changing the error to be emitted at a later stage isn’t worth the cost (the turbofish recovery, for example, is kept purely syntactical to avoid implementation complexity), we make sure that the suggested code results in an error that does suggest the right code, in no more than 3 steps. You end up having to implement the same or similar suggestion in 2 or 3 places, but in the end it means that no matter what the user tried to write, we have several chances of “catching” them straying from the right path and nudging them in the right direction.

The other benefit is that when a transformation is high confidence, the compiler can recover and continue emitting errors from later stages that are actually useful. When it isn’t, we try to poison the smallest possible element and mark it as “don’t emit more errors from this”. So errors should always be relevant. Periodic reminder that if they aren’t accurate, relevant or useful, tickets are welcome.
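For illustration, here are two snippets of the kind that trigger this multi-step recovery; the diagnostics described in the comments are paraphrased from memory, not exact rustc output:

    fn main() {
        // Mistyping :: as : in a path: the parser recovers and suggests
        // using a double colon (::) as the path separator.
        // let v = Vec:new();
        let mut v: Vec<i32> = Vec::new(); // the intended code
        v.extend([1, 2, 3]);

        // Missing turbofish: collect<Vec<i32>>() parses as chained comparison
        // operators, and the purely syntactical recovery suggests writing
        // ::<...> instead of <...>.
        // let evens = v.iter().copied().filter(|n| n % 2 == 0).collect<Vec<i32>>();
        let evens = v.iter().copied().filter(|n| n % 2 == 0).collect::<Vec<i32>>();
        println!("{evens:?}");
    }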
This is why we often advise Elm developers to list out all the branches in case expressions rather than using a wildcard. Even if it sometimes feels tedious, it increases the number of cases where making a change leads to getting compiler reminders.
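The same principle, sketched in Rust rather than Elm (the enum below is invented for illustration): with every variant listed, adding a new one turns each match site into a compile error, i.e. a reminder; with a wildcard arm, the new variant is silently absorbed.

    enum Status {
        Active,
        Suspended,
        // Adding a new variant here, say Archived, makes the match below
        // fail with a non-exhaustive-patterns error: that error is the reminder.
    }

    fn label(s: Status) -> &'static str {
        match s {
            Status::Active => "active",
            Status::Suspended => "suspended",
            // A `_ => "unknown"` arm instead would keep compiling after the
            // new variant is added, and the compiler would stay silent.
        }
    }

    fn main() {
        println!("{}", label(Status::Active));
    }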
I find it unfortunate that this kind of recommendation keeps going around – this is really a tooling failure that ought to be fixable.
E.g. imagine if you could annotate the new case with something (be it a magic comment or something else) indicating that it is new. Then a compiler or linter with access to pattern-matching information could tell you all places where the new case is being matched against using a wildcard pattern. You could then go audit them one-by-one.
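A rough sketch of what that could look like, using Rust syntax and a made-up magic comment; the linter that would read it is hypothetical, not an existing rustc or clippy feature:

    enum Event {
        Click,
        Scroll,
        // @new-case  <- hypothetical marker for the tooling to pick up
        Hover,
    }

    fn handle(e: Event) {
        match e {
            Event::Click => println!("click"),
            // The hypothetical tool would report this wildcard, because it
            // silently covers the freshly annotated Hover case.
            _ => println!("ignored"),
        }
    }

    fn main() {
        handle(Event::Hover);
    }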
At the same time, it’s understandable that this recommendation is made to users, because most users don’t write or contribute directly to tools…
That would only work if you have control of the type definition. It doesn’t work for libraries, which is where you care about it the most.
I’ve worked in ecosystems that have the same convention as Elm and those that have the inverse, and overall I found the Elm one to result in fewer bugs. The extra keyboard-typing is solved through or patterns and the language server doing it for you.
Types having version numbers (and every use site declaring which version they meant) might solve the issue, though I can’t imagine them being too practical to use in traditional text-based languages.
> That would only work if you have control of the type definition
The idea generalizes; the “problem” is just that the annotation/information for the case needs to live somewhere else.
> I’ve worked in ecosystems that have the same convention as Elm and those that have the inverse, and overall I found the Elm one to result in fewer bugs
My broader point is not about convention; it’s about usability. If a language supports wildcard patterns, I think it’s valuable to make tooling work well with them, because some people are going to use wildcard patterns (especially in nested positions).
> The extra keyboard-typing is solved through or patterns and the language server doing it for you.
It’s not really about typing. If you have lots of cases all of which are handled identically, having a wildcard pattern can make the code easier to read.
Often a catch-all/wildcard pattern gets used because the RHS is annoying to repeat, or because there are a lot of constructors and listing them all consumes a lot of vertical space. “Or-patterns”, introduced in GHC 9.12.1, give you a way to deal with this. They probably wouldn’t line up with Elm’s design philosophies, though.
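For what it’s worth, Rust’s | patterns express the same idea as GHC’s or-patterns (different syntax, same effect); a small sketch with an invented enum:

    enum Key {
        Up,
        Down,
        Left,
        Right,
        Escape,
    }

    fn is_arrow(k: Key) -> bool {
        match k {
            // One arm covers several constructors without resorting to a
            // wildcard, so exhaustiveness checking still catches new variants.
            Key::Up | Key::Down | Key::Left | Key::Right => true,
            Key::Escape => false,
        }
    }

    fn main() {
        println!("{}", is_arrow(Key::Left));
    }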
Defining “new” to a compiler sounds like a major headache. Compilers like to be stateless, incremental recompilation/etc aside.
IMO a better tooling improvement would be just letting your LSP generate the branches for you, with placeholders or such. rust-analyzer can already autocomplete match arms for you.
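From memory, the rust-analyzer assist fills in the missing arms with todo!() placeholders, roughly like this (the exact generated output may differ):

    enum Shape {
        Circle,
        Square,
        Triangle,
    }

    fn area_name(s: Shape) -> &'static str {
        // Invoking the assist on an empty `match s {}` produces roughly:
        match s {
            Shape::Circle => todo!(),
            Shape::Square => todo!(),
            Shape::Triangle => todo!(),
        }
    }

    fn main() {
        // Not calling area_name here, since the generated arms still contain todo!().
        let _ = Shape::Circle;
    }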
The idea I’m suggesting is purely static; the point is to identify all places in the code where a wildcard pattern “covers” a particular case which has some special annotation.
I admit the framing as “new” is a tad confusing though; something like “warn_on_wildcard_match” would be clearer.
I don’t know if it currently supports it, but this kind of structured change management is what Unison is great at.
That is an interesting idea for tooling. In the case of Elm, the linter is sufficiently powerful to support such a rule. Though as lpil said in another comment, it would only work for types you have control over (that said, it’s doable by configuring the linter). And I definitely see the value in finding where a type is pattern matched.
But what you’re describing is a proactive action by the developer. Some developers will do it most of the time, some will never do it. And even then, one might review only some of the usages (and then remove the annotation), or only one of the types that were modified.
The value of a compiler reminder is that it is passive: you can’t forget to do it, and you have to do it consistently. I agree that it is more tedious than even I would like, but overall it’s still the better approach for maintainable code.
very nice post! small problem though: the link in the last paragraph leads to a non-existent page because it links to https://jfmengels.net/compiler-reminders/constraints-and-guarantees instead of https://jfmengels.net/constraints-and-guarantees