Can jank beat Clojure's error reporting?
47 points by jeaye
This is really good stuff, and I love to see it. Beating Clojure's error reporting is admittedly a very low bar, but jank does a good job here.
That said, there's one mistake here that is very common: using the ^^^^ markers to highlight the code elements that caused the problem does not work reliably: https://reedmullanix.com/posts/unicode-source-spans.html (and fundamentally cannot work reliably)
This technique is based on the mistaken assumption that characters are always rendered on a monospace grid. As soon as you have code that incorporates wider characters you can see where it breaks down, but ironically the blog post demonstrates this breakdown in an even more obvious way, by displaying its code samples in a proportional-width font. This causes the marker characters to be wildly out of place.
Thanks for the tip, dude! Makes sense to me.
The blog is using a dedicated font for the error output blocks which makes everything look very pretty. Perhaps your browser disallowed it from being used.
Perhaps your browser disallowed it from being used.
Yeah, I don't have any rules for blocking fonts, but it looks like the fonts aren't being used directly but rather somehow injected using 3rd-party JS that is being blocked, and the fallback fonts are not set up correctly.
Just FYI my ad blocker is also blocking those fonts and the code blocks look bad. I didn’t realise that this was why until I read your comment :)
It might be worth adding a fallback that doesn’t rely on JS.
What would work better? Color? Emoji inserted as delimiters?
Ah shoot; I was in a hurry and relied on my memory instead of reading through the article I linked. I thought it mentioned this, but I must be confusing it with another. Anyway, the solution is that your markers have to go inline with the text being marked.
I believe the easiest way to do this is using colors; we ended up using reverse-video ANSI escape codes in the Fennel compiler (https://technomancy.us/198). As a nice bonus, this also works for code written in right-to-left languages like Arabic, as demonstrated at the bottom of the post.
If you can't use colors, then some kind of character that is unlikely to appear in the text should be used. For code that could be something like «this» or 「this」.
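A minimal sketch of inline marking, assuming a 0-based, half-open character span. The helper name `mark_span` is hypothetical and this is not code from Fennel or jank. With colors it wraps the span in SGR 7 (reverse video) and closes it with SGR 27; without colors it falls back to « » delimiters.

```python
REVERSE_ON = "\x1b[7m"    # SGR 7: reverse video on
REVERSE_OFF = "\x1b[27m"  # SGR 27: reverse video off


def mark_span(line: str, start: int, end: int, use_color: bool = True) -> str:
    """Mark line[start:end] inline, so there is no separate marker row to align.

    `start` and `end` are character indices (hypothetical convention:
    0-based, end exclusive).
    """
    if not 0 <= start <= end <= len(line):
        raise ValueError("span out of range")
    if use_color:
        open_mark, close_mark = REVERSE_ON, REVERSE_OFF
    else:
        open_mark, close_mark = "«", "»"
    return line[:start] + open_mark + line[start:end] + close_mark + line[end:]
```

For example, `mark_span('(def 名前 (+ 1 "two"))', 13, 18, use_color=False)` marks the string literal in place, and the wide characters earlier in the line don't matter because nothing has to line up underneath them.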
This is what the linked article has to say:
Instead, the best option is to take the advice of the Unicode Consortium themselves, and rely on markup languages or platform-specific features to handle styling of unicode text. For terminal display, this is best handled by SGR codes for underlining, and other platforms/editors have their own mechanisms.
(and fundamentally cannot work reliably)
Isn’t this saying you fundamentally can’t align arbitrary Unicode text in terminals? CLIs with tabular output do it all the time, e.g. SQL clients, git blame. If you use a library like unicode_width it’s 99% solved. If you want to go the extra mile you can have an environment variable to configure whether you interpret ambiguous width characters in a CJK context or not. Maybe there are still unsolvable edge cases but I don’t see why that means it’s a mistake to use caret markers.
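As a rough sketch of what width-aware caret placement looks like, here is a version using only Python's standard `unicodedata` module rather than the unicode_width library mentioned above. The function names and the `ambiguous_wide` flag are assumptions for illustration; the flag stands in for the CJK-context setting that an environment variable could control. It ignores tabs, emoji ZWJ sequences, and whatever the font actually renders, which is where the remaining edge cases live.

```python
import unicodedata


def display_width(text: str, ambiguous_wide: bool = False) -> int:
    """Estimate how many terminal columns `text` occupies."""
    width = 0
    for ch in text:
        # Combining marks and format characters take no column of their own.
        if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue
        eaw = unicodedata.east_asian_width(ch)
        if eaw in ("W", "F") or (eaw == "A" and ambiguous_wide):
            width += 2  # wide, fullwidth, or ambiguous in a CJK context
        else:
            width += 1
    return width


def caret_line(line: str, start: int, end: int, ambiguous_wide: bool = False) -> str:
    """Build the ^^^^ row for line[start:end], padded by display width."""
    pad = display_width(line[:start], ambiguous_wide)
    span = display_width(line[start:end], ambiguous_wide)
    return " " * pad + "^" * max(span, 1)
```

Padding by display width rather than by character count is the whole trick: two wide characters before the span shift the carets right by four columns, not two.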
These are some nice looking error messages, congrats.
One disadvantage I can see is that these kinds of error messages are not easily parsed by other tools. In my compiler, I try to stick to GNU's suggestions about error messages. The error messages start with the source file and the line number, like
sourcefile:lineno: message.
In Emacs's compilation buffer, those errors are automatically parsed and I can jump to the source location with one button. It's also quite easy to parse the source location for something like flycheck to indicate errors inside the editor, without resorting to heavy-weight solutions like LSP.
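To illustrate how little a tool needs to consume that format, here is a minimal sketch of a parser for `sourcefile:lineno: message`, with an optional column after the line number. The `Diagnostic` type and `parse_diagnostic` name are hypothetical, and this is not how Emacs or flycheck implement it. It assumes file names contain no colons, so Windows drive letters would need extra handling.

```python
import re
from typing import NamedTuple, Optional

# file:line: message, or file:line:column: message
LOCATION = re.compile(
    r"^(?P<file>[^:\s][^:]*):(?P<line>\d+):(?:(?P<col>\d+):)?\s*(?P<msg>.*)$"
)


class Diagnostic(NamedTuple):
    file: str
    line: int
    column: Optional[int]
    message: str


def parse_diagnostic(text: str) -> Optional[Diagnostic]:
    """Return the parsed location and message, or None if the line has no location."""
    m = LOCATION.match(text)
    if m is None:
        return None
    col = m.group("col")
    return Diagnostic(
        file=m.group("file"),
        line=int(m.group("line")),
        column=int(col) if col is not None else None,
        message=m.group("msg"),
    )
```

An editor integration would run this over each line of compiler output and skip the lines that return `None`.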
Thanks for taking a look and for leaving this note. I considered this, too.
I think that yours will be a popular opinion, but I think it's misguided. This error output is designed to be read from a terminal. jank will support different output modes, to aid in editor/IDE integration. However, we shouldn't impact the UX of our terminal output just because Emacs users want to use a different workflow. Our error reporting shouldn't be stuck in the 90s because GNU standardized on how to report messages then. I think that appealing to people's workflows is important, which is why jank will support different error output modes, but the errors shown in this post are meant to be consumed by a human at a terminal. That's also an incredibly popular use case.
I explained, in the post, that this approach to error reporting may feel different. I'm gathering all of the relevant bits into code snippets and showing it inline. This is optimizing a particular workflow for someone compiling from a terminal. If I instead followed GCC, as GNU would have me do, I would instead be optimizing for a different workflow, leaving the better terminal experience on the table. So, sure, your Emacs workflow hasn't been addressed yet. I dig that; I actually use a similar thing, sometimes, in Vim. But I think that trying to have every error message be both easily human and machine readable is trying to do too much. They should be separate systems.
Error reporting and diagnostics are also a deep interest of mine, and I wrote some articles about them. Having said that, this one really goes in depth. Loved the read.
https://xnacly.me/posts/2025/syntax-highlight-for-errors/ https://xnacly.me/posts/2024/rust-pldev/#fancy-error-display