Opinion piece: On Zig (and the design choices within)
42 points by clerno
I'd like to say generally, the utility of "language reviews" by anyone who hasn't built and maintained some real artifact in that language is minimal to nothing. The only utility is understanding what people latch onto as first experience takeaways and perhaps addressing those for adoptability concerns, but it doesn't really say anything about the reality of building or maintaining real world software in any language.
Disclaimer that I obviously like Zig, but I apply this framework to anything and it's not out of being defensive about the OP's dislike of Zig, which I have no problem with on an individual basis.
One thing I think is valuable is combining these articles with commentary.
A lot of the issues described in the post make me not want to use Zig. I appreciate having extra comments and context for expressing why something is the way it is (it's interesting to hear about the reasoning!), but it's a good sort of slice into the non-priorities of a language, from which you might be able to infer the priorities.
I think a lot of people end up thinking in terms of wanting to use a language where they don't have to worry about X/Y/Z (or have to deal with issue A/B/C as much or as tediously), so this can be helpful selection criteria.
I think a lot about Elm, a language that has all this stuff going for it but a lot of people being turned off by the "let's make FFI harder" wave. I think it made sense for many people to not engage with the language after seeing that; that critique didn't need to go deep into the language to make, and people who cared about FFI could avoid investing in something that would disappoint them.
(For this piece I find most of the stuff to be from "yeah I don't care about this either" to "huh that seems like an unfortunate consequence of other decisions" (the result location semantics thing) to "come on really" (no tabs in comments seems like a choice). The blanket ban on language proposals thing is pretty funny, didn't know about it and reminds me of Clojure's vibe)
The piece is an interesting cross section into issues one person has, and I can project a bit of myself into it. My opinion of Zig is like 2% lower than it was before, but all modulated by the idea that the language is still young. And mostly it's about the language having some things I care about but missing others.
The blanket ban on language proposals thing is pretty funny, didn't know about it and reminds me of Clojure's vibe)
That ban is mostly to keep the maintenance burden down; it doesn't stop people from writing up proposals that address defects in the language.
I do have a pretty good track record of language proposals in Zig, which usually address several issues I encountered, and I think the last "would be nice to have" issue from my side is over 3 years old.
Thing is that often inexperienced people who hadn't used Zig at all, or only worked through a tutorial, immediately made proposals. The blanket ban suppresses this by a good margin, and the Zig issue tracker has had much better bug reports and improvement proposals since.
Andrew is really open to addressing real-world issues with Zig, as leaving those unsolved directly contradicts the language's mission.
If a proposal describes a real-world problem it would solve, and the alternatives are bad enough that it's worth increasing the language surface and complexity, it's likely to be accepted.
I think a lot about Elm, a language that has all this stuff going for it but a lot of people being turned off by the "let's make FFI harder" wave.
The thing with this is that "all the things Elm have going for" and the lack of FFI are two sides of the same coin, you can't have one without the other. People rejecting Elm on the basis of missing FFI miss out on the opportunity to learn why that might be worth it.
I think the same applies to people coming to Zig from the perspective of Rust. To give an informed opinion about Zig, positive or negative, you at least have to understand what fans of the language like about it. I do think it's hard to get an understanding of a language from reading about it alone.
The blanket ban on language proposals thing is pretty funny
Just learned about this now, kudos to the Zig team for getting it right. You don't want language design proposals from users, ever.
The amount of users in any programming language who have the experience and language design expertise to be able to make proper proposals is close to zero. The only few people who are qualified to do it will find a way to reach out to the language team and I'm sure they'll be taken seriously by the team.
I've been working on language design and implementation for nearly a decade now (including my hobby work, projects that I just silently follow, etc.) and the signal-to-noise ratio of these proposals is always extremely low. There are much better ways of getting feedback from users and incorporating it into your language design.
Instead the users should just let the language team know about the pain points, issues, limitations, etc. and let them do their job.
Proposals are still interesting to read though. But Zig was swamped with proposals early on, so it's understandable.
Proposals are still interesting to read though
I agree, and I certainly learned a lot from some of the early public design discussions for Rust.
I think the discussions can be held publicly, but you shouldn't let random users influence them by adding comments, thumbs up/down, etc., and most of the back-and-forth is probably best done privately.
I don't have an opinion on what would be the best way to let the community know about the ongoing discussions, current status of the proposals/development efforts etc. but some regularly updated public design documents (no discussions, no thumbs ups/downs etc.) might be good enough.
Yes, no one can design the language for you. Ultimately a language is a curated set of features held together by a grammar. If people like the original curated set, they won't be happy if the continued evolution becomes "anything goes" without curation. Of course, choosing one feature over another will always alienate some people.
But at least personally, allowing discussions works for me. What might make it easier for me is that, for better or worse, I don't rely so much on hunches and instincts when I do language design. Instead I try to explore the entire design space and pick the least bad trade-off. This lets me preempt many arguments for features or syntax, and it makes me more interested in making sure I haven't overlooked anything. (I have, by the way, been ridiculed for doing such thorough exploration. With better instincts for language design I might have had the option to do it differently. But nonetheless I feel it works for me.)
For an example of this, take a look at issues filed against Go's GitHub repository with the LanguageProposal tag, or the fact that there have now been over 200 failed proposals around error handling.
Right, this post reads like someone who never heard of Lisp taking a few minutes to look and saying, "Ugh, parentheses!"
I mean, yeah, a lot of people do reject Lisp for that. But there's a whole lot more to see there, even if you do wind up choosing to reject the language.
I understand the author's perspective, but some of these points are a bit...incomplete, e.g.
Language Server
Here are some quotes I have heard from people who have used it far more than I:
"Deeply horrid"
"It is one of the worst LSPs I have ever used"
ZLS is certainly not perfect, but I would hardly describe it as horrid. I've definitely used plenty of language servers that I drastically prefer ZLS to, functionality-wise.
I particularly recently noticed that Zig cannot currently catch use-after-realloc errors [...] Address sanitizer with Clang can catch this, so theoretically Zig should be able to catch it (perhaps with a similar system) too.
I...don't think anyone directly uses the page allocator for typical allocations? The equivalent to ASAN would probably be the debug allocator, and indeed, swapping it in catches this:
var gpa = std.heap.DebugAllocator(.{}){};
defer _ = gpa.deinit(); // also reports leaks at exit
const allocator = gpa.allocator();
// rest of the code as-is
An error occurred:
Segmentation fault at address 0x7fc8015e0000
/tmp/playground3262542671/play.zig:14:11: 0x113d62a in main (play.zig)
buffer[0] = 99;
^
/home/play/.zvm/0.15.1/lib/std/start.zig:627:37: 0x113df69 in posixCallMainAndExit (std.zig)
const result = root.main() catch |err| {
^
/home/play/.zvm/0.15.1/lib/std/start.zig:232:5: 0x113d271 in _start (std.zig)
asm volatile (switch (native_arch) {
^
???:?:?: 0x0 in ??? (???)
A bit ugly, but it works, at the very least.
I particularly recently noticed that Zig cannot currently catch use-after-realloc errors [...] Address sanitizer with Clang can catch this, so theoretically Zig should be able to catch it (perhaps with a similar system) too.
I...don't think anyone directly uses the page allocator for typical allocations? The equivalent to ASAN would probably be the debug allocator, and indeed, swapping it in catches this: (...)
Right, the page allocator isn't used this way often AFAIK. Weird.
I understood that the author meant that the compiler could prevent these errors, not segfault because of incorrect memory access at runtime.
I think this depends on the allocator you chose. For instance, the arena allocator won't segfault here… and maybe it shouldn't if you want programs to access previously-valid memory addresses within the bounds of your allocated arena. That seems a bit wild to me, but YMMV. :)
If different allocators can modify memory access rules, then the compiler will have a hard time deciding whether or not accessing realloc-ed memory is valid. So, I guess you could say that Zig isn't a memory-safe language, even if it features some guardrails. (zig 0.15 anyway)
Small disclaimer: I do like zig. :)
Yeah I think it's certainly fair to say that this isn't statically detectable and thus Zig isn't entirely memory-safe. It's just a bit weird that they explicitly cited an explicitly opt-in ability that doesn't fully cover all allocations as an example of what Clang can do, even though another explicitly opt-in ability is also available in Zig.
This doesn’t feel like a terribly productive rant and could have been a diary entry rather than what feels like a takedown of a language (especially a young one) for no particular reason.
Regarding memory safety, I think Zig's stance on memory safety has been explained several times at this point. If you don't like it, don't use it.
I think the comments about ZLS being “bad” are almost entirely from people that don’t know that you have to configure ZLS to build-on-save in order to get a complete set of diagnostics. I acknowledge that this is a pain to have to know, but it’s a young language and a young ecosystem. It’s nowhere near something like rust-analyzer, but it’s had way less work put into it, and I’m sure the Zig compiler is a moving target to integrate with.
I think the community is pretty much the same as other programming communities that I engage with, with the caveat that it’s just a small community at this point. That means the “first responders” on forums tend to be just a couple of individuals, and if you don’t like their communication style, then it’s probably going to feel bad.
if you don’t like their communication style, then it’s probably going to feel bad.
Active first responders do tend to set a template for the community's communication style in general, so it's perhaps not an entirely unfair thing to point out.
Absolutely. The experience of asking two questions on the caddy forums (and getting two completely boneheaded people responding by not helping) made me swear off the project for a couple years.
This reads like someone who doesn't really understand how and why manually controlling memory allocation and deallocation is important for certain software.
The gap between CPU and RAM performance only grows, so it's arguably more important than ever for performance sensitive software to control your memory usage patterns, like Zig enables you to do.
Not only does it make allocations explicit, but it does so with an added safety net detecting leaks and goodies like SoA via MultiArrayList. It's a nice language.
17% of lifetime issues reported in the Zig compiler are crashes. [snip] 26.5% (!!!) of lifetime issues reported in Bun are crashes.
Crashes aren’t unequivocally bad. I’d rather have a crash than wrong behavior. My own Zig code has liberal use of @panics to detect conditions I had not anticipated. As far as I can tell, such panics would be counted too.
It’d be good to see the statistics for crashes caused by memory issues specifically.
As I've only briefly looked at Zig I don't have an opinion on most of these things, but the docs for the build system are surprisingly unhelpful for their size, and/or outdated. I have no problem with complicated things if they are explained well, but so far I've struggled every time I had to look for anything beyond the provided example, verbatim.
Much of Zig seems to me like "wishful thinking"; if every programmer was 150% smarter and more capable, perhaps it would work. Alas, they are not; myself included.
Opposite perspective: Zig is just not for the kind of programmer you are, and that's okay. Different tools solve different use cases. Not every program is going to benefit from the low-level control Zig gives you, and as a programmer you certainly don't have to prefer writing programs in this style. (At least from a human perspective; low level control can enable greater performance, but it may not give you as much joy.)
I wouldn't necessarily say you have to be 150% smarter than the average programmer to understand effective techniques for manual memory management, though. To me a lot of it is a problem of education funneling you into a mindset of managing singular resources instead of thinking of them in groups. You have to shift your thinking to something more data-oriented. Like you're working with a database, where you work with lots of records, not single objects. I don't believe that's necessarily harder to understand or reason about than RAII or GC, just different.
I'm also unconvinced by the statistics presented in the expandable section of the post.
- The Rust compiler has had a lifetime 59,780 issues reported. Of these, 4,158 contain one of "crash", "segmentation fault", or "segfault".
- The Zig compiler has had a lifetime 13,269 issues reported. Of these, 2,260 contain one of "crash", "segmentation fault", or "segfault".
This means that, roughly:
- 7% of lifetime issues reported in the Rust compiler are crashes.
- 17% of lifetime issues reported in the Zig compiler are crashes.
Zig is still a fairly young project, which naturally means it's had less "stable time" in its lifetime than rustc has. Therefore a larger proportion of issues is going to be crash-related.
Some thoughts on these thoughts as a beginner Zig user.
Memory / Debug mode
I found the DebugAllocator to be a really handy tool. Getting a stack trace isn't always trivially helpful and you occasionally just have to make logical leaps to figure out the core issue, but it's much better than "Segfault :)".
Sure, if you're a memory safety absolutist, then yeah, Zig is probably needlessly difficult. Otherwise, I think it's still a decently convenient process.
Comptime
Begrudging agree. comptime is very cool for what it is, but I have occasionally felt constrained by it in a way I wouldn't have with a proper macro system. One example is compile-time generating functions that only differ in some slight way. As functions aren't first-class values, your only real option is repetition.
Turns out I was merely ignorant. See this comment.
Casting
I feel this is wholly a stylistic thing. The at-sign syntax was unusual at first, but I grew to like it. It's an (easily greppable!) sign of "hey, something important is happening here, pay attention".
RLS
I lack the necessary knowledge to comment on this. My gut feeling is that the reason why this isn't a hard error is erring on the side of caution to not blanket ban some potentially desirable behavior, even if I cannot think of an example.
Pointer reference optimization
I feel it's kind of unfair to dedicate an entire section to a bug that's already been fixed by removing the feature, and to lament that removal, when the feature will be re-added in the future in a sound way.
Speed
IMO totally irrelevant.
Build system
Yeah, learning it was probably the biggest pain point I experienced as a beginner. The tutorials are not great and the inner workings are not trivial. Hope this will be iterated on in the future.
ZLS
The fact that it requires an extra step to work well, and that this step is hidden in blog posts and a random section of the LSP's website, is a great pain, and I hope in the future it'll just work out of the box, because the difference between build-on-save enabled and the default is night and day. Otherwise, it's fine; it catches a lot of stuff that would otherwise escape my attention. Could be much better, but it's a young project for a rapidly changing language.
Warnings
I hated unused variables being errors at first, but frankly, it's not nearly as big an issue. You can either comment them out if they're declared in the function or write _ = variable_to_discard and the compiler is satisfied. I do think Sloppy mode would be a good addition, but there doesn't seem to be much support behind the proposal.
My only note on the community
No firsthand experience. The support forum, hell, even the subreddit the author curses, seem like fine enough places, and I found a lot of helpful comments for my own issues.
As functions aren't first-class values, your only real option is repetition.
They actually are, and it's incredibly awesome, as no other native language I know has this:
fn () void is the type of a function and it must be comptime known, while *const fn() void is a function pointer, which can be runtime known.
Calling a function value is guaranteed to be the same as calling a regular function.
No repetition required:
fn makeArith(comptime op: enum { add, sub, mul, div }) fn ([]const u8, []const u8) f32 {
    const Impl = struct {
        fn exec(lhs_s: []const u8, rhs_s: []const u8) f32 {
            const lhs = std.fmt.parseFloat(f32, lhs_s) catch unreachable; // error handling elided
            const rhs = std.fmt.parseFloat(f32, rhs_s) catch unreachable;
            return switch (op) {
                .add => lhs + rhs,
                .sub => lhs - rhs,
                .mul => lhs * rhs,
                .div => lhs / rhs,
            };
        }
    };
    return Impl.exec;
}
const addStrings = makeArith(.add);
const subStrings = makeArith(.sub);
const mulStrings = makeArith(.mul);
The only thing Zig lacks is an anonymous function syntax (issue #1717)
Oh damn. I was actually looking for this, but couldn't find a good example anywhere. Thank you!
They actually are and it's incredibly awesome, as no other native language i know has this:
fn () void is the type of a function and it must be comptime known, while *const fn() void is a function pointer, which can be runtime known.
Huh, rust has a similar take on this. Each fn has its own unique type (which has size zero). Function pointers have their own different pointer sized type that function types implicitly convert to.
I've avoided zig (and this fine blog post basically sums up why pretty well) so I'm not sure but from your description it sounds like the difference here is that zig merged all of rust's "zero sized" function types (with the same signature) into the same type, but then demands the value is known at compile time so it can optimize it away?
That's a cool trick.
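For illustration, here's a small sketch of the Rust side of this comparison (the names here are just for the example): each fn item has its own unique zero-sized type, which coerces to a pointer-sized function pointer type.

```rust
fn add_one(x: i32) -> i32 {
    x + 1
}

fn main() {
    // The fn item `add_one` has its own unique, zero-sized type.
    let item = add_one;
    assert_eq!(std::mem::size_of_val(&item), 0);

    // It coerces to the pointer-sized function pointer type fn(i32) -> i32.
    let ptr: fn(i32) -> i32 = add_one;
    assert_eq!(std::mem::size_of_val(&ptr), std::mem::size_of::<usize>());

    // Both call the same code.
    assert_eq!(item(41), 42);
    assert_eq!(ptr(41), 42);
}
```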
Zig actually went even one step further and defines the size of function types as undefined so invoking @sizeOf is just illegal.
but then demands the value is known at compile time so it can optimize it away?
there's no optimization going on. calling a function value will just always call the function.
The way it was described in the initial proposal is that a function value is basically just an assembly label:
call fn_name
call %eax
But Zig can also optimize comptime known function pointers into direct calls
They actually are and it's incredibly awesome, as no other native language i know has this:
Calling a function value is guaranteed to be the same as calling a regular function.
so here you're saying that the comptime code here gets partially evaluated so you end up with the most precise version of this code, without any spurious branches in the result yeah? Neato
I would say that you can get things that are spiritually the same in Python (building up a string template and calling exec on the string, might not like it but...), and I feel like with Rust you can have something work similarly (but way less nicely than comptime! Though nicer than, say, the C++ preprocessor) with the macros
Hell, in Rust you could really just have a macro that takes the expression... I think? Maybe not given that you won't have access to the lhs and rhs names (very annoying thing in Rust macros IMO, but maybe there's a trick)
the comptime stuff is definitely the coolest version of this though
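To make the macro point concrete, here's a rough Rust analogue of the makeArith example using macro_rules! (names and error handling are my own, purely illustrative): the macro takes the operator as a token and stamps out a concrete function.

```rust
// Generate a function that parses two strings and applies the given operator.
macro_rules! make_arith {
    ($name:ident, $op:tt) => {
        fn $name(lhs: &str, rhs: &str) -> f32 {
            let l: f32 = lhs.parse().unwrap();
            let r: f32 = rhs.parse().unwrap();
            l $op r
        }
    };
}

make_arith!(add_strings, +);
make_arith!(mul_strings, *);

fn main() {
    assert_eq!(add_strings("1.5", "2.5"), 4.0);
    assert_eq!(mul_strings("2", "3"), 6.0);
}
```

Unlike comptime, though, this is textual expansion at the token level rather than partial evaluation of ordinary code.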
so here you're saying that the comptime code here gets partially evaluated so you end up with the most precise version of this code, without any spurious branches in the result yeah? Neato
Yep! Zig evaluates all code partially as long as branch conditions are comptime known.
This means you can switch on builtin.os.tag and just call your os specific code in a switch instead of using macros/defines/conditional code blocks.
the comptime stuff is definitely the coolest version of this though
yes, indeed. in zig, both call site and declaration site can decide over inlining functions, and you can use comptime unrolled loops. I've built a "comptime forth to zig compiler" once (doesn't work anymore nowadays) which took a string + parameter types and returned a function executing the code in the string.
with type safety, so it didn't compile when you passed bad types.
comptime in zig is amazing and an incredibly powerful tool, and similar to macros, you can go lengths and abuse it really hard. but in general, comptime code is way more readable than most macro code
They actually are and it's incredibly awesome, as no other native language i know has this:
Not to shill my own language but this is even easier in C3. So it does exist in other languages.
So C3 also has comptime only values you can pass around as "template params"?
Can you show a simple example how that works?
C3 macros have 4 types of parameters, they may be lazy - in which case they can grab pretty much any expression, constant, type, or regular. On top of vaargs.
C3 macros may then generate lambdas based on its parameters, since use of those are inlined into the code generated by inlining the macro. Lambdas that have the same external parameters passed in from a macro will be deduplicated. The only restriction for using the macro parameters for generating a lambda is that they are compile time resolved.
A simple example with const values
// Returning a lambda that multiplies with an int constant:
macro multiplier(int $some_constant)
{
return fn int(int value) => $some_constant * value;
}
// Returning a lambda which multiplies with some constant,
// where the type is inferred:
macro multiplier($some_constant)
{
var $Type = $typeof($some_constant); // For readability
return fn $Type($Type value) => $some_constant * value;
}
There is also generic templates, which might be more appropriate for the particular example you wanted:
module foo{ENUM_VAL};
fn void do_something(int x, int y)
{
$switch ENUM_VAL:
$case ADD: return x + y;
$case MUL: return x * y;
$default: $error "Unsupported op";
$endswitch;
}
module test;
import foo, std::io;
alias add = do_something{Op.ADD};
alias mul = do_something{Op.MUL};
fn void main()
{
io::printn(add(1, 2));
io::printn(mul(2, 3));
}
The ugly syntax of the compile time statements is deliberate to make it immediately clear when scanning the code where compile time folding occurs, and what the compile time variables are.
Thanks!
I don't necessarily agree on the design choices, but it looks semantically sane, so nothing to argue about.
I assume I can declare multiple modules per file? What does a module imply?
what types can ENUM_VAL have?
You can have multiple modules per file, or spread a module across multiple files. Unlike Zig, C3 is very "open", and this reflects in the modules. You can add to a module anywhere, even if you didn't create it, similar to how namespaces in C++ can be extended. But it's used in more ways than in C++. For example we can do this:
module foo;
fn void test() { } // Visibility is public by default
fn void test_private() @private {} // This is private
// Continue the module, but in this section,
// "@private" is the default
module foo @private;
fn void test_private2() {} // Visibility is private by default here!
fn void test2() @public {} // Override the default
Similarly we can do
module foo @if(env::WIN32);
... code here will only be compiled if env::WIN32 is true ...
module foo @if(env::POSIX);
... code for posix ...
A module is also a compilation unit when doing separate compilation. There is a lot to say about modules that can't be quickly summarized here. A starting point for reading about them is here: https://c3-lang.org/language-fundamentals/modules/
As for parameterized modules, currently only int, enum, and fault values, but this may be extended if it's deemed valuable to have them for more kinds of constants.
One handy tip I saw that I've adopted when I need to do more rapid break-fix cycles is tuples with pointers to the variables. This is something of a nuclear option but it means being able to spend less time worrying about what is or isn't commented out until ready to clean up the code. Similar strategies work for languages with similar enforced constraints like Go.
_ = .{ &var1, &var2, &var3 };
Ohhh, thanks for that tip. I used to just define an ignore function so I didn't have to think about it.
However, I like your tuple with pointers trick as it doesn't litter my top level.
It's now been a little while and I have not used much of the two latest Zig versions, but doesn't zig fmt do all the _ = foo; insertion for you? Or was that something ZLS did without my knowledge? I'm sure I did not manually "use" unused variables, something between vim and bytes on disk did that for me.
As a non-user of Zig, one thing where I do think comptime offers a significant benefit over generics is the ability to produce types dynamically from an input type. As a specific example, this seems to make Struct-of-Arrays transformations significantly easier to produce reasonably.
I'm currently trying to build a trait-based Rust SoAVec crate, and while it's not impossible, it is somewhat complicated and tends to easily leak internal details of the struct being stored in SoA form. E.g. if a SoAVec<T: SoAble> is exposed from your library's API, the SoAble trait and its implementation by T become public information. A part of that trait is (necessarily) a breakdown of T's fields and how to convert T into a collection of field values and back: even though the public API of SoAVec only exposes generated types like TRef and TMut (corresponding to &T and &mut T, but as a collection of field references) that don't leak any fields T itself didn't expose, the need for T and SoAVec to communicate the fields between one another leaks the fields as they are enumerated in a GAT tuple.
Maybe there's some way to avoid this, I haven't yet tried to find it. But there is definitely a reason why Zig can have a SoAVec in std while in Rust it is a complicated and fiddly thing to implement outside of the compiler itself.
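A minimal sketch of the kind of leak being described (this is hypothetical, not the actual crate's API; SoAble, Fields, etc. are invented names): the SoA container needs the trait to spell out the field breakdown, so implementing it publishes the struct's layout.

```rust
// A trait that lets a SoA container split a struct into per-field storage.
trait SoAble {
    // The field breakdown must be part of the public trait, so any
    // implementor necessarily exposes its field composition here.
    type Fields;
    fn into_fields(self) -> Self::Fields;
    fn from_fields(fields: Self::Fields) -> Self;
}

#[derive(Debug, PartialEq)]
struct Point {
    x: f32,
    y: f32,
}

impl SoAble for Point {
    // Even if x and y were private, this associated type reveals
    // that Point is made of two f32 fields.
    type Fields = (f32, f32);
    fn into_fields(self) -> (f32, f32) {
        (self.x, self.y)
    }
    fn from_fields((x, y): (f32, f32)) -> Self {
        Point { x, y }
    }
}

fn main() {
    let p = Point { x: 1.0, y: 2.0 };
    let fields = p.into_fields();
    assert_eq!(fields, (1.0, 2.0));
    assert_eq!(Point::from_fields(fields), Point { x: 1.0, y: 2.0 });
}
```

Zig's comptime sidesteps this because MultiArrayList can inspect the struct's fields via reflection without the struct having to implement any public interface.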
I've seen the RLS critique pop up quite often, but the only issue that is ever shown is swapping the fields within a struct. Are there more practical issues that arise? Because I don't know when the last time I wrote code like that was, and if that's it then it seems like a minor issue that could be solved with a lint or hard error when assigning a field to another field in its source struct.
As I understand it, the more tricky cases occur when you have an object that is assigned as the result of a function call, and also passed as an argument. For large objects the argument can be pass-by-immutable-reference and RLS means the result is mutated in place. The effect is that Zig turns something that looks like what would in other languages be call-by-value into hidden aliasing. How much RLS aliasing affects the initial argument value depends on the whim of the optimizer. This doesn’t occur for small objects and the threshold between by-reference and by-value can vary.