Steering Zig Fmt
75 points by vi_mi
IIRC gofmt has a similar steering behavior, and I agree that I prefer those kinds of formatters over rustfmt. But any kind of formatting beats no formatter at all.
But any kind of formatting beats no formatter at all.
I don’t want to let this pass without remark.
Autoformatters enforce mediocrity. They drag up a lot of bad writers, but they also drag down a lot of good writers.
Me, I’m willing to use them when collaborating with others, if said others wish (or if my opinion of their personal formatting discipline is poor enough). But when working solo, I never use an autoformatter: I have opinions; so do they; but they differ irreconcilably.
Can’t say I’m not impressed with what this article shows, though.
(Your line reminded me of one from Fiddler on the Roof: Yente the matchmaker says: “But even a bad husband—God forbid—is better than no husband at all!”)
From what I remember Elm's formatter does something similar, and I found it really nice in comparison to formatters that don't take the original formatting into account.
I use clang-format for some C++ projects. It's horrible. It's incredibly unstable across versions so clang-format upgrades mean formatting commits which touch every line of code. I'm seriously unsure whether it's better than not having a formatter.
There is git-clang-format that can be used to only format a particular change. That's how every formatter should work IMHO.
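To make "only format a particular change" concrete, here is a rough sketch of the first step such a tool needs: turning a unified diff into the line ranges that were touched. This is an illustration of the idea, not a description of how git-clang-format is actually implemented; the function name `changed_line_ranges` is made up for this example. The ranges are exact only when the diff has no context lines (e.g. `git diff -U0`); with context, each hunk range also covers the surrounding unchanged lines.

```python
import re

# Matches a unified-diff hunk header such as "@@ -10,2 +10,3 @@ optional text".
# Group 1 is the start line in the new file, group 2 the (optional) line count.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")


def changed_line_ranges(diff_text: str) -> list[tuple[int, int]]:
    """Return inclusive (first, last) line ranges of the new file touched by each hunk."""
    ranges: list[tuple[int, int]] = []
    for line in diff_text.splitlines():
        match = HUNK_HEADER.match(line)
        if not match:
            continue
        start = int(match.group(1))
        # A missing count means the hunk covers exactly one line.
        count = int(match.group(2)) if match.group(2) is not None else 1
        if count == 0:
            continue  # pure deletion: nothing left in the new file to format
        ranges.append((start, start + count - 1))
    return ranges
```

Each resulting range could then be handed to a formatter that accepts line ranges (clang-format has a `--lines=<start>:<end>` option for this), so the rest of the file is left alone.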
Yes! It's so bad. I feel the need to version-pin it in CI or with pre-commit or it's just unusable. There are so many options for controlling how you want your code to look, but the instability is so frustrating.
My solution was to grab a versioned clang-format from PyPI using virtualenv and pip in a make target, hehe.
Regarding options: yeah there are a ton of options, but I never quite managed to get it to be how I want. I had to fight hard to make it use indentation rather than alignment (I especially hate aligning function parameters to the opening parenthesis). I never managed to get it to output tabs, so I relented and used spaces (yes there's an option to use tabs, but it'll still indent using spaces in some places even with that setting).
But any kind of formatting beats no formatter at all.
That was my thinking for many years, and I have completely come around on it recently. Auto formatters mostly solve a people problem in that they get rid of bikeshedding on pull requests. But now that we're moving on to agentic engineering, that becomes less and less of an issue.
I have multiple projects now where machines do most of the work, and once that happens it turns out to be preferable (in my experience at the moment) to not have formatters run.
In my agent-managed projects I prefer to enforce as much static checking as possible and I include formatting in that - I want to have it readable for me and also as regular as possible for the agent and other people.
Have you found the agent struggling with the formatting?
For command line arguments in Python I like splatting tuples into a list, so the final example from the article would be:
[
    "aws",
    "s3",
    "sync",
    path,
    url,
    *("--include", "*.html"),
    *("--include", "*.xml"),
    *("--metadata-directive", "REPLACE"),
    *("--cache-control", "max-age=0"),
]
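A quick sketch to show that the splatted form builds exactly the same flat argument list, just with each flag kept next to its value. The values of `path` and `url` below are hypothetical placeholders, not the ones from the article.

```python
# Hypothetical placeholder values, used only so the example runs.
path = "./public"
url = "s3://example-bucket"

# Each tuple is unpacked in place, so a flag and its value share one line.
grouped = [
    "aws",
    "s3",
    "sync",
    path,
    url,
    *("--include", "*.html"),
    *("--include", "*.xml"),
    *("--metadata-directive", "REPLACE"),
    *("--cache-control", "max-age=0"),
]

# The same command written as a plain flat list.
flat = [
    "aws", "s3", "sync", path, url,
    "--include", "*.html",
    "--include", "*.xml",
    "--metadata-directive", "REPLACE",
    "--cache-control", "max-age=0",
]

assert grouped == flat
```

Since the tuples disappear at runtime, the list can be passed to something like `subprocess.run` unchanged.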
For people who like formatters that don't enforce a line length limit, what do you do when you're viewing code in a narrow space where long lines get soft wrapped? I find having a line of code overflow onto the next line to be extremely ugly, distracting, and hard to read. So much so that many details about different formatting conventions seem insignificant by comparison.
Do you (and your teammates) always view code in wide spaces? Do you manually ensure that lines aren't too long? Does soft wrapping seem OK to you?
Last time I looked, there was no way to configure zig fmt to use an 80 col limit rather than a 100 col limit. Is that still the case?
I bump up the font size in my terminal because I find it strains my eyes less if I’m working for many hours a day. Eighty vs 100 columns is the difference between whether I can fit two vim splits (plus nerd tree) next to each other or not.
As the guy who introduced a rigid formatter into a team that was using none, I do sometimes miss the ability to influence formatting manually. Zig being flexible in that regard is very awesome!
That is splendid!
Is there a TS/JS formatter like this? I have a project using maplibre-gl and style spec expressions sometimes get too formatted and I can't see anything. For now I stopped using a formatter, but the code is getting dirty, because I'm debugging and copying around and commenting out.
Maybe zig formatter could also be made to format other languages. :)
Prettier has this, but specifically for object literals, and specifically only to allow you to choose between "all on one line" and "every element on a different line". There's no way of telling the formatter to have, say, four elements per line.
Also if the object literal gets too long, then Prettier formats it into "every element on a different line" anyway, regardless of how the input text is formatted.
Interesting design, but not sure if I like it. In my formatter I do it differently: the formatter decides on a formatting ignoring trailing commas. Then, if it splits into multiple lines, it always adds a trailing comma. If it formats on one line, it always drops the trailing comma. So you can't "steer" it, but it consistently formats f(1, 2, 3) regardless of how you wrote it (with or without a trailing comma, with different amount/type of whitespace between tokens etc.).
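A toy sketch of that rule, to make it concrete. It is not the actual formatter: `parse_args` and `format_call` are names invented for this example, the parser only handles flat argument lists (no nested calls, no commas inside strings), and the 40-column default is arbitrary.

```python
def parse_args(source: str) -> tuple[str, list[str]]:
    """Toy parser for 'name(a, b, c)'; whitespace and any trailing comma are discarded."""
    name, _, rest = source.partition("(")
    inner = rest.rsplit(")", 1)[0]
    args = [part.strip() for part in inner.split(",") if part.strip()]
    return name.strip(), args


def format_call(name: str, args: list[str], max_width: int = 40, indent: str = "    ") -> str:
    """Lay out a call from its parsed arguments only, never from the input layout."""
    one_line = f"{name}({', '.join(args)})"
    if len(one_line) <= max_width:
        return one_line  # fits on one line: no trailing comma
    # Does not fit: one argument per line, each followed by a comma.
    body = "".join(f"{indent}{arg},\n" for arg in args)
    return f"{name}(\n{body})"
```

Because the layout is computed from the parsed arguments alone, `f(1, 2, 3)`, `f(1,2,3,)`, and a version spread over several lines all come out the same way.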
Some amount of "steering" is necessary, e.g. if I have a long list literal [<expr1>, <expr2>, ..., <expr100>] most formatters will put each expr on a line, but you may want to fit as many as possible on each line. I think it would be strange to decide between the two based on a trailing comma, and in general you may have N formatting choices instead of 2. I think attributes work better for this purpose. E.g. we could have (maybe we already do?) #[rustfmt::list_layout(flow)] to add before a statement to influence formatting of a list literal in the statement, or similar.
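For illustration, the "fit as many as possible on each line" layout can be sketched as a greedy packer. This is only a guess at what a flow-style option could do; `flow_layout` and its parameters are hypothetical names for this example and are not taken from rustfmt or any other formatter.

```python
def flow_layout(items: list[str], max_width: int = 40, indent: str = "    ") -> str:
    """Greedily pack list elements onto lines, as many per line as fit in max_width."""
    if not items:
        return "[]"
    lines: list[str] = []
    current = indent  # the line being filled; equal to indent while still empty
    for item in items:
        piece = item + ","
        candidate = current + piece if current == indent else current + " " + piece
        if len(candidate) > max_width and current != indent:
            # The element does not fit: close this line and start a new one.
            lines.append(current)
            current = indent + piece
        else:
            # Fits, or is the first element on the line (kept even if too long).
            current = candidate
    lines.append(current)
    return "[\n" + "\n".join(lines) + "\n]"
```

An element that is wider than the limit on its own still gets a line to itself, since there is no shorter way to place it.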
Lots of "steering" also defeats the purpose of formatters (creating consistent formatting of all code across the whole ecosystem, making code reviews easier etc.) so it should be done only in limited cases. I think long list literals are one example where you really need it. I also have examples in a project of mine where formatting helps review the test expectations, e.g. here.
Edit: just remembered another "steering" behavior in another language: in Dart's formatter, you can add comment lines to group lines together in long list literals. E.g. if you have [1, 2, 3, ..., 1000], it'll put each element into one line. You can manually group by adding comments:
[1, 2, 3, 4, 5, //
6, 7, 8, 9, 10, //
...]
I don't know if this was a deliberate feature they added to allow this kind of thing or some artifact of its comment handling.
In my formatter I do it differently: the formatter decides on a formatting ignoring trailing commas. Then, if it splits into multiple lines, it always adds a trailing comma. If it formats on one line, it always drops the trailing comma.
This is precisely how rustfmt behaves, and it drives me nuts. Sometimes there are cases where going beyond the line limit and not breaking a function call yields more readable code, and I would prefer to be able to have an opinion on that.
One particular case I can think of is OpenGL, where you routinely do a bunch of gl.* calls in succession, modifying or using a single resource (e.g. initialising a texture) and rustfmt will bulldoze over them without any sense of purpose, other than robotic "LINE TOO LONG. LINE MUST BE BROKEN."
(This is a contrived example just to illustrate the behaviour, not actually how rustfmt behaves. The lines are not that long. I'm typing this on a phone though so I don't really have access to better tools to make this 100% correct)
gl.bind_texture(gl::TEXTURE_2D, tex);
gl.tex_parameteri(gl::TEXTURE_2D, gl::TEXTURE_MIN_FILTER, gl::NEAREST);
gl.tex_parameteri(gl::TEXTURE_2D, gl::TEXTURE_MAG_FILTER, gl::NEAREST);
// -->
gl.bind_texture(gl::TEXTURE_2D, tex);
gl.tex_parameteri(
    gl::TEXTURE_2D,
    gl::TEXTURE_MIN_FILTER,
    gl::NEAREST,
);
gl.tex_parameteri(
    gl::TEXTURE_2D,
    gl::TEXTURE_MAG_FILTER,
    gl::NEAREST,
);
Like, it'll do stuff like break apart these subsequent calls to gl.tex_parameteri across multiple lines, even though it would do each more justice to be laid out fully on a single line... because them aligning in columns lets you more easily spot the difference between the two lines.
The broken version has less optical locality and it's harder to read. Your eyes can't diff the two lines as easily anymore.
This also causes silly things such as it failing completely when it can't format a line to fit in the character limit. This happens to me routinely when writing compiler code, where I construct diagnostic messages from string literals, which can get pretty long. Rustfmt does not know how to break them, and so it'll give up and not format the statement at all. Oftentimes it's a case like
match something {
    // ... match arms above this one ...
    _ => emit_diagnostic(&mut state, "This is a very long message to try and illustrate the problem. Help: please consult a doctor.")
}
where due to the emit_diagnostic call only being an expression, it'll give up formatting the whole match statement—which is plain silly.
All could be avoided if it didn't try to bulldoze my code to be 100 columns at most.
I've been disappointed by formatters with a light touch (basically line-wrapping), and I appreciate with envy this notion of flexibility in a more rigorous example :p
I recently answered a question on fedi about writing and formatting Lisp with proportional fonts by highlighting significant-whitespace variants of s-exps (i.e. wisp, Readable/Sweet exps, and SRFIs 119 & 110), and made the relevant observation that this syntax family leverages optional infix syntax extensions to give back some control over line breaks.
On the comment at the end, for people who had to look it up like me, ++ is the array concatenation operator.
So by splitting your array into two arrays, you can format them differently.