Bun's Rust rewrite has been merged
85 points by tuananh
I’m sad for multiple reasons:
What a shame.
The Rust rewrite apparently (though I haven’t confirmed) leans on a bunch of unsafe and unidiomatic stuff.
I'm in the middle of converting a few large C and C++ codebases to Rust, and this is always the initial state of the code. If the existing codebase already followed Rust principles then you could in theory skip the "unsafe everywhere" stage, but where do you find a non-Rust codebase that has single-mutable-borrow semantics, or properly annotates the nullability of every single pointer?
I've not done such a large conversion before, so genuine question: in my mind you'd make it so the bulk of unsafe is FFI'ing out to the as-yet-unconverted code, to allow you to do it piecemeal and without changing every variable at once, but are you saying there's a necessary intermediate step of transliterating to C-flavoured-unsafe?
I've got two approaches I've found work best depending on code complexity: line-by-line and function-by-function. Both of them involve a lot of unsafe.
Using some code from public-domain pikchr file lemon.c as an example:
int acttab_action_size(acttab *p){
  int n = p->nAction;
  while( n>0 && p->aAction[n-1].lookahead<0 ){ n--; }
  return n;
}
I would prep a Rust function like this:
// int acttab_action_size(acttab *p){
pub unsafe fn acttab_action_size(p: Option<NonNull<acttab>>) -> c_int {
    // int n = p->nAction;
    // while( n>0 && p->aAction[n-1].lookahead<0 ){ n--; }
    // return n;
    todo!()
}
A line-by-line conversion turns that into (approximate, this is hand-typed in the Lobsters comment field, not compiled):
// int acttab_action_size(acttab *p){
pub unsafe fn acttab_action_size(p: NonNull<acttab>) -> c_int {
    let p = p.as_ptr();
    // int n = p->nAction;
    let mut n: c_int = unsafe { (*p).nAction };
    // while( n>0 && p->aAction[n-1].lookahead<0 ){ n--; }
    while n > 0 && unsafe { (*((*p).aAction.add((n.wrapping_sub(1)) as usize))).lookahead } < 0 {
        n = n.wrapping_sub(1);
    }
    // return n;
    n
}
The goal is to have very small units of translation from the C/C++ (or if things are really bad, from the assembly) and then you just verify that each line does exactly what the comment above it says.
This is close to what c2rust produces, except c2rust operates on the compiler AST, so it is both more accurate (every pointer is a *mut, every loop is transcribed from a CFG) and overly accurate (macros decompose instead of being treated as pseudo-functions, types like FILE* get transcribed exactly).
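For comparison, c2rust-style output for the same function looks roughly like this (hand-approximated from memory of its conventions, not actual c2rust output):

pub unsafe extern "C" fn acttab_action_size(mut p: *mut acttab) -> libc::c_int {
    let mut n: libc::c_int = (*p).nAction;
    while n > 0 && (*(*p).aAction.offset((n - 1) as isize)).lookahead < 0 {
        n -= 1;
    }
    return n;
}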
LLMs don't do this style naturally but they can be coaxed into doing it if you use the right prompt and maybe an example of the expected style. Failure modes are generally mild and easy to detect (like unindenting the entire function). The result is just a straight-up better c2rust.
Then once you've got the whole program like this, you can start refactoring it from bad Rust to kinda-ok Rust, then add tests, then you keep refactoring until it is good Rust.
A whole-function conversion would start by adding a shim, so you can ensure your function is being called and you haven't missed any func pointers or macros or other sneaky business:
/* in C */
int acttab_action_size(acttab *p);

int acttab_action_size_OLD(acttab *p){
  int n = p->nAction;
  while( n>0 && p->aAction[n-1].lookahead<0 ){ n--; }
  return n;
}
// in Rust
unsafe extern "C" {
    fn acttab_action_size_OLD(p: NonNull<acttab>) -> c_int;
}

#[unsafe(no_mangle)]
pub unsafe extern "C" fn acttab_action_size(p: Option<NonNull<acttab>>) -> c_int {
    unsafe { acttab_action_size_OLD(p.unwrap()) }
    // int n = p->nAction;
    // while( n>0 && p->aAction[n-1].lookahead<0 ){ n--; }
    // return n;
}
Then you just translate the whole thing in one go (or have an LLM do it) and delete the C version:
impl acttab {
    pub fn action_size(&self) -> c_int {
        // you get the point, it's just normal Rust
    }
}
Saves time if the functions are simple. LLMs are great at this style; if there's nothing wild going on in the original, you can point even a low-powered local model at a file full of todo!() stubs and it'll plow through the whole thing automatically.
Either way you will have a period of time where the code is C/C++/whatever written in Rust syntax, with raw pointers and multiple mut to the same value and all sorts of things that make rustc very unhappy. The trick is compiling in debug mode will mostly let you get away with anything in terms of pointers, and you'll also get to find places where the original code relied on arithmetic overflow.
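To make the overflow point concrete, here's a minimal sketch (my example, not code from any real conversion) of the kind of reliance a debug build surfaces:

fn distance(a: u32, b: u32) -> u32 {
    // In C, `a - b` silently wraps when b > a. Transliterated as plain
    // subtraction, a Rust debug build panics right here with "attempt to
    // subtract with overflow", flagging the hidden dependence; a release
    // build would wrap silently, matching the C behavior.
    a - b
    // Once found, you make the reliance explicit with a.wrapping_sub(b).
}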
Oh okay, I think the whole-function version looks something like how I'd imagined the process. But couldn't you still do away with the unsafe shim wrapper if you essentially start from main() and work your way down? That's the bit I don't quite get: there still seems to be a final step of making it look like real Rust, after all the work of making Rust that looks like C from the C that looks like C that you could use from Rust.
(I suppose there's the functions right at the bottom of the call tree that could really do with conversion as a priority, but I'd personally rewrite those from scratch and make sure they work as expected with a test that literally calls both versions and compares them.)
Here's an experience-based talk on converting a C codebase to Rust: https://youtu.be/H0AUP2OgppE?is=VRC977IshsC4I6T4
They recommended starting at the leaves and working upward, producing more or less idiomatic Rust at the leaves with unsafe FFI wrappers for C to call into, and gradually replacing the unsafe FFI layers with safe APIs as you move up the call stack.
I saw a really interesting talk from Lambda Days about doing this in Haskell to rewrite a Rails application, too: https://www.youtube.com/watch?v=ip0UFQbiOkQ . It's quite a broadly applicable pattern, rather like a reverse "strangler fig" migration.
It depends on the shape of the original program, and where its own unsafe semantics are. Remember that the unsafe in Rust is merely documenting pre-existing behavior in the original.
If you do a bulk migration from unsafe C to unsafe Rust then the codebase is easier to work with and more amenable to analysis than a mixture of unsafe C and safe Rust.
I see! I'm starting to get it. Perhaps still one of those things I'd have to feel the pain of attempting otherwise to appreciate, but that's a me thing. Thanks! (Also thanks for taking the time to write that longer reply earlier.)
Also, it's not like they have their own JS engine: they're still using JavaScriptCore, and unsafe in bindings to C is basically inevitable. I've seen an unsafe count comparison with Deno, which I think missed that Deno has their V8 bindings in a separate crate and repository, while bun is more of a monorepo.
There is probably a lot of unnecessary, and possibly unsound, unsafe in the new codebase, but I imagine most regressions will not be related to that.
Having done a bit of work in the rusty_v8 repository (and some paid work for Deno in general, so you know I'm biased), it's actually remarkable how little unsafe there is despite the bindings. Of course all the FFI requires unsafe but after that layer the Deno folks have gone to great lengths to get the unsafe out of the public APIs that they rely on.
I'm curious to see how the project will evolve now that there's a million completely AI-generated lines of code and nobody understands any part of the project anymore. This is the kind of disastrous loss of expertise you typically only see when all core developers leave a project, done completely voluntarily.
Maybe it'll be fine, but it seems risky.
I'm looking at the source code. It is a 1:1 port of the Zig code, and both actually exist side by side. If the maintainers know what the Zig code does, they already know what the Rust equivalent is doing... which is whatever the Zig counterpart did.
Well that's the hope. Nobody knows how true it is. But you're probably right that it's true enough to be a useful starting point.
On the one hand, there are thousands of unsafe regions. On the other, Zig doesn't provide nearly as many memory safety features, so you can also see it as going from almost every function being unsafe to just a few thousand. So that part isn't that bad.
But will the maintainers take proper responsibility for understanding and working on the code that was mostly written by a LLM? I don't think so.
But will the maintainers take proper responsibility
But will the maintainers take proper responsibility for trusting gcc/clang etc.? We will get there eventually; fewer folks insist on writing ASM to avoid C in 2026 than in 1996.
You trust that if gcc/clang/<compiler of choice> generates incorrect assembly from your code, you can file a bug report and it'll be fixed, or you can track it down and fix it yourself. Anthropic/OpenAI/Google don't want those bug reports; they don't care, because they don't care whether their tools produce incorrect code at all. I can't go looking through an LLM's code to figure out why it produced incorrect code, because that's not how they work: they're a black box designed to be used in a non-deterministic fashion, with no way for the average dev to debug them.
Someone could correct me on this, but I'd wager most of the use cases for ASM in 1996 were performance related, not because the developer didn't trust the compiler to be correct. And there's no guarantee that LLMs will get any better; they could continue to produce incorrect code at the same rate, or worse, forever. Considering how LLMs work, I wouldn't count on them getting that rate down to zero, or on them failing in the same ways humans do.
There's pretty good reason to trust Clang and GCC. The lack of knowledge of why they are trustworthy doesn't make a good objection.
There are tools and tests that are engineering feats on the level of the compilers themselves. The people who work on GCC and Clang have a sense of responsibility for their work. And there are companies in regulated industries, such as Toyota, who want Clang to be certified as safety critical and therefore have a regulatory duty to work on stamping out miscompilation.
I spent most of my senior year in college on a project to find bugs in the ARM backend of LLVM. We ran (and still run) huge fuzzing campaigns generating random programs whose compilation is checked by Alive2. We found some bugs, sure, many in programs that were nearly impossible to generate anyway (due to the use of undef, for example, or casting 54-bit integers to floats).
However, the fact that we were generating billions of mutant programs and found fewer than 30 bugs is astonishing. GCC has a similar project currently ongoing, finding similar bugs. These tools are seriously hardened. Not bug free, but some of the closest humanity has got.
The output of an LLM can't even get through a sentence, let alone a billion, before it hallucinates or bullshits me.
These bugs got fixed within a day and with attention to detail, and correct citations of the spec by engineers who knew what they were doing. LLM vendors haven't been able to solve hallucinations for how many years now?
One week from a guy whose only developer experience is web frontends doesn't scream "formal methods, fuzzing, and careful Rust spec reading" that would lead me to trust the output of his LLMs
I'm reminded, though, of a previous post that argued that Zig is safer than unsafe Rust.
Yeah, the spectrum is likely C - some zig - unsafe rust - rest of zig - default rust. That doesn't really change the main argument though.
I'm looking at the source code. It is a 1:1 port of the Zig code, and both actually exist side by side. If the maintainers know what the Zig code does, they already know what the Rust equivalent is doing... which is whatever the Zig counterpart did.
I feel the move from unknown-unsafes to known-unsafes is not really a bad thing.
it's not just that there's loads of unsafe code, the unsafe code is also incorrectly and unsoundly wrapped in safe code, causing UB:
https://github.com/oven-sh/bun/issues/30719#issuecomment-4453771886
adding a lifetime parameter to that struct and tying it to the &[u8] in the constructor is something you'd learn on the first day of Unsafe Rust 101
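For the unfamiliar, a minimal sketch of that kind of fix (hypothetical struct and field names, not the actual Bun code from the linked issue):

use std::marker::PhantomData;

pub struct Decoder<'a> {
    ptr: *const u8,
    len: usize,
    // Zero-sized marker that makes the borrow checker treat Decoder as if
    // it held the &'a [u8] it was constructed from, so the buffer cannot
    // be dropped or mutated while any Decoder into it is alive.
    _borrow: PhantomData<&'a [u8]>,
}

impl<'a> Decoder<'a> {
    pub fn new(bytes: &'a [u8]) -> Self {
        Decoder { ptr: bytes.as_ptr(), len: bytes.len(), _borrow: PhantomData }
    }

    pub fn peek(&self) -> Option<u8> {
        // Sound only because the lifetime keeps the buffer alive.
        (self.len > 0).then(|| unsafe { *self.ptr })
    }
}

Without the lifetime parameter, safe code could drop the buffer and keep using the Decoder, which is exactly the use-after-free UB described in the issue.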
Given how much Zig has been bashed recently by the Bun team for being memory unsafe it’s probably good riddance.
yeah i find this news rather saddening for about the same reasons
just the second bullet point alone is enough to make me never want to touch bun again, which is a shame bc in the past i've had some very good experiences with it
[edit: was missing a newline my bad]
Meaning that TigerBeetle is not particularly well written?
I deleted my first comment cause yeah that’s kind of what I meant but I haven’t really looked at the source for either, I just remember Andrew Kelley calling one of them not great but can’t remember which :)
Apart from the hand-written async state machines, which are an unfortunate necessity, TigerBeetle is the highest quality codebase I have ever worked on. There are still plenty of things I would change (e.g. the intrusive data structures are a source of bugs and reading difficulty, and I don't think they are necessary for performance), but it's the only company I have worked at that really prioritized code quality and developer experience.
Makes sense! Yeah, that's why I deleted my first comment (not that it was actually saying TigerBeetle was bad either). I just remembered hearing some big Zig project wasn't idiomatic and couldn't remember if it was Bun or not, so I'd said I might have been confusing it.
To provide a different perspective on the correctness debate, Zig has two main projects that people know about:
TigerBeetle, "despite" being written in Zig, seems to be doing exceptionally well:
But ok, people will say that TB doesn't count because they do funky nasa stand-on-one-leg-and-do-a-backflip development that normal development teams don't do (although it's not like they couldn't, like for stuff that matters, but anyway). Fine.
Let's look at Ghostty then. Also written in Zig, no "ghost-style" that I know of, and Mitchell has been vocal about his usage of AI to develop the project (another hint that the project is run normally and not with the discipline of a Mongolian monastery). As far as I can tell as a user, Ghostty is fast and works well. Never crashed on me once. Now, Mitchell is known to be a good engineer, but I flat out reject any argument that claims that this only works "because it's him". Ghostty is an open source project, code comes from a variety of contributors, and most importantly, Mitchell is just a guy that puts effort in what he does, the same way that you can put effort in what you do.
For contrast, Bun got to the place it was in because of deliberate choices that often had very little to do with the language.
To bring just one example up, Bun had (has?) wrong asserts that would crash the executable in ReleaseSafe and potentially cause UB[1] in ReleaseFast. Since Bun shipped ReleaseFast executables (already a debatable choice for a node replacement), the result would be non-actionable bug reports that showed weird behavior and non-deterministic crashes.
The 'solution' to this problem was:
make std.debug.assert a no-op (allow_asserts is set to false for ReleaseFast).
This has the positive effect of removing potential UB caused by wrong asserts. Unfortunately, it does not help at all with the fact that your code is written with wrong assumptions in mind, with invariants and pre/post conditions that don't hold, meaning that you will still cause issues for your users, and that bug reports will still suck.
I find it a fairly objective statement that releasing ReleaseFast executables and then toggling off asserts instead of fixing them shows that Bun has deliberately chosen a cavalier approach to software correctness.
In my opinion software correctness is the outcome of a process, not of any technological choice (those might support or hinder your process, but you gotta have one), although I do hope that the switch to Rust will give Jarred and Anthropic an opportunity to rethink the way they approach correctness.
[1] For clarity for those who are not familiar with how this works, in ReleaseFast, if an assert guarantees that a number is positive, for example, then the compiler will be able to remove any subsequent if (num < 0) {...} for example, removing a useless runtime operation. Unfortunately, if the assert is wrong, then the executable will be missing a piece of logic, resulting in wrong behavior.
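In Rust terms, the same hazard looks like this minimal sketch, using std::hint::assert_unchecked (my example, not Bun's code):

use std::hint::assert_unchecked;

fn clamp_to_zero(n: i32) -> i32 {
    // Promise the optimizer that n is non-negative. If the promise is
    // wrong, the compiler may delete the branch below entirely, and the
    // function silently returns a negative number: the executable ends up
    // "missing a piece of logic", exactly as described above.
    unsafe { assert_unchecked(n >= 0) };
    if n < 0 { 0 } else { n }
}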
This seems to frame the problem as "Jarred doesn't know Zig well enough, that's why there was a problem". But then the story changes to that of Rust being a language that by construction is better at yielding correct programs than Zig even if the programmer isn't that skilled?
You misunderstood my post, turning asserts off in release builds has nothing to do with Zig, and all to do with the decision-making of the company and their development process.
It's a telling example that shows that correctness was never a priority for Bun.
Case in point, the same exact logic is also present in the Rust code:
https://github.com/oven-sh/bun/blob/175f62ab1574fe47df5ab5e6ffb3be878b607e4c/src/bun.rs#L1515-L1519
https://github.com/oven-sh/bun/blob/175f62ab1574fe47df5ab5e6ffb3be878b607e4c/src/bun_core/env.rs#L48
Presumably the Rust code has the same exact falsifiable asserts that get turned off via that flag.
I think "disabling asserts in prod" is a pretty common technique, yeah?
I do feel like if you've lost confidence in the asserts being "right", and that leads to code elision you would not expect... then at the very least disabling that optimization sounds like the right move. I don't know how possible that is, though.
Yeah, Zig turning asserts into assumptions in ReleaseFast is one of the most absurd things I have ever seen. Disabling that behavior to get back to "disable asserts in prod" is extremely reasonable.
It's a legitimate programming pattern. They basically use assert like Rust's assert_unchecked or LLVM's __builtin_assume. You better be sure about your assertions though.
The problem is that the same statement is assert!() in debug mode and unsafe { assert_unchecked() } in release mode, which is probably not ever what you want.
Either asserts are for debugging and simply get removed in release mode (similar to checks for arithmetic overflow), or they're part of the program and are allowed to affect control flow (like C's guarantee against NULL deref).
Turning off assertions in release mode is not what I would do given Bun's relatively low performance requirements, but turning them off is much better than Zig's behavior of silently transforming them into compiler-visible assumptions.
I would absolutely want my assert_unchecked to panic in debug. In fact, I would absolutely never want that thing not to check in debug in case the assertion is wrong!
Agreed, the question is mostly about what happens in release mode. Rust has assert!(), debug_assert!(), and std::hint::assert_unchecked() so you can be fine-grained about your goal; C has assert() so you can verify things look good in tests and then leak your TLS keys in prod (per tradition); Zig has ... discovered an even more dangerous option than C.
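For reference, a quick sketch of how those flavors differ (my summary, assuming current standard-library behavior):

fn get(v: &[u8], i: usize) -> u8 {
    assert!(i < v.len());       // checked in every build; panics if false
    debug_assert!(i < v.len()); // checked in debug builds, compiled out in release
    // Never checked at runtime; the optimizer may treat it as a fact, and
    // it is UB if false (the closest analogue of Zig's ReleaseFast asserts):
    // unsafe { std::hint::assert_unchecked(i < v.len()) };
    v[i]
}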
I guess I would want two flavors of this:
Can think of a lot of cases for the first one; the second one... seems a bit tougher, but I can see it being used for certain container types.
Disabling asserts in prod is perfectly fine in general (maybe, I would actually put that into question as well), but Bun had/has plenty of known, wrong assertions. By completely turning them off (the if(!allow_assert) return branch in fn assert which gets evaluated at comptime) you are both:
You gotta fix your asserts, there's no way around it. Then once your asserts are fine you can choose if you want to keep them crashy (ReleaseSafe), leverage them for optimizations (ReleaseFast, more dangerous), or do something else.
For a node replacement that is meant to be exposed to the internet it might also be a good idea to make ReleaseSafe builds, not ReleaseFast. ReleaseFast makes more sense for stuff like games.
Things seem to have changed since this comment 9 days ago.
I work on Bun and this is my branch. This whole thread is an overreaction. 302 comments about code that does not work. We haven't committed to rewriting. There's a very high chance all this code gets thrown out completely.
I'm curious to see what a working version of this looks like, what it feels like, how it performs, and if/how hard it'd be to get it to pass Bun's test suite and be maintainable. I'd like to be able to compare a viable Rust version and a Zig version side by side.
I came here looking for this.
How one goes from so many big, open questions about a PR to merging it a few days later is mysterious to me. I think those of us who doubted the professed lack of "commitment" to the work, and the "very high chance" of the work not being merged, will be forgiven for having had doubts about the author's intentions.
The open questions are not resolved by time. They're resolved by somebody finding the answer. Once you have the code, answering the questions about performance, passing the test suite, and the "how it looks/feels" is pretty easy.
You're just miscalibrated on the time needed for that "have the code" prerequisite these days.
There was a message from the author, I believe 3-4 days after that, where they said they had gotten more than 90% of tests passing IIRC and were very positively surprised.
This is really sad to see. Not because of the language, I do not care, but because of how careless this is.
There used to be some pride in crafting, even in our profession. Now it's just... this. More code, more fast, who cares about maintainability?
If I was using Bun, I would be scrambling to figure out where to move.
What makes you think Jarred isn't proud of this and doesn't care about maintainability?
I am sure he is proud of this.
+1,009,257 -4,024 says enough about maintainability, however.
If the Rust code is a direct port of the Zig code that he wrote himself, based on porting rules that he provided, I don't think it's safe to assume that it's not maintainable for him going forward.
Large parts of the Bun codebase in Zig were also AI assisted ports from other codebases, like the lexer/parser taken from esbuild for example.
The amount of eyes this story is getting is probably the real win here for Bun. A runtime that already focused on memory safety or stability wouldn't get the clout of Claude rewriting $thing in Rust, and a runtime that didn't need a 1M-line diff is of course not going to be blazing fast /s.
I'm not even sure what technical value things hold now, thanks to vibecoding and slop.
I do know that I'm forming a strong detachment from these types of projects; there is zero sense of coherence, of technical value over existing projects, or of community.
Completely agree. This feels to me like another well crafted media stunt to garner exactly this kind of attention. It's just like the "we vibe coded a C compiler" project that totally works and can compile Linux. Just never mind that they had to reuse and have access to a very well crafted test suite and needed direct access to GCC.
I really fail to see what this buys them other than publicity. Like, how has Bun actually changed as a result of this rewrite? If anything, all I've learned here is that network effects are less important now than common wisdom says. Want to gain safety by rewriting something from C++ to Haskell? Go for it. Want to generate a bunch of missing libraries for your favorite tech stack? Have at it. The old reason of "it has a library for everything under the sun" is a lot less meaningful when anyone can do these kinds of rewrites.
This feels to me like another well crafted media stunt to garner exactly this kind of attention.
Just for clarity I don't think the PR started as a marketing stunt, but it sure went viral.
The old reason of "it has a library for everything under the sun" is a lot less meaningful when anyone can do these kinds of rewrites.
Yeah, this! I guess somebody could spin this into "look what previously could not be done in such a timeframe! This is agentic engineering" but for me it reads as vibecoding. I don't care for such software and would rather not spend my time on it, OSS isn't about immediate value for me.
I don't think you see the sentiment. Most people have a very negative view of this everywhere I see it discussed.
I think the people that see this as a negative are the most likely to be vocal about it online (like me, ha), but even if negative, it's still attention. People are interacting with a project they might previously not have, and I'll take a gamble and say that a silent part wants to watch how this plays out so they can see how far LLMs can be pushed.
Another shame is that Github is failing to load the comments on this PR. What kind of product can't load 1000 comments on a page?
Not a user of bun in production but if I was I'd be pretty nervous. Passing the test suite is one thing, but who knows how many things this is going to silently break? Guess it depends how comprehensive their tests are and what they're willing to commit to.
I don't think you become a user of bun in production if this kind of thing makes you nervous. It's always been a fairly YOLO project.
Probably better to link to the actual pull request, no?
Also, look at that diff count: +1,009,257 -4,024
For the "remove zig" pr that just got closed:
+22 -639,678
So this is more of a part 1 of 2 situation. Still, impressive diffs.
Has got to be some kind of record for the ratio between PR description length and lines modified.
In the US, does that mean that Bun’s source code is now public domain?
I was wondering about this too. Curiously, while the repo claims that Bun's code is MIT licensed, I couldn't find any explicit copyright assertion anywhere, even looking at it before the rewrite. (There are some copyright claims, but they are on third party code which has been vendored into the codebase.)
Nor is the MIT license, which starts with a copyright notice, reproduced anywhere in the repository. How can you assign a license to code if you don't have copyright on it?
It's probably still too early to call it a success, but wow, really impressive stuff. Especially the speed of execution.
And also absolutely "a mess" of a workflow. Almost 7 thousand commits in less than 6 days. The GitHub UI can't even list them. Looking forward to the blog post to learn which review strategy they used and what gave them enough confidence to merge.
Looking forward to the blog post to learn which review strategy they used
I'm leaning towards "none" and they only checked the test suite then rammed it in.
Looking forward to the blog post to learn which review strategy they used and what gave them enough confidence to merge.
Given that this is about Bun… what review?
From the PR,
and most importantly, we now have compiler-assisted tools for catching & preventing memory bugs, which have costed the team an enormous amount of development & debugging time over the years.
At the risk of getting too speculative, the lead up to this point is something I don't fully understand.
According to Wikipedia, the initial release of Bun was in September of 2021, with the first stable 1.0 release being 2 years later in September of 2023. Then, we have a Rust rewrite roughly 2.5 years after that.
Is there any indication of the Bun team publicly mentioning the struggle to work through memory-related bugs prior to a few weeks ago when the rewrite was known to be a possibility? To the point where Zig was perhaps thought about as the wrong choice? I'd like to understand why they chose Zig in the first place, and how that was weighed against an increasing difficulty finding and resolving memory-related bugs.
From one angle, it kind of looks like there was a larger pressure to rewrite into a language that many have agreed is just "better" for LLM generation (presence in training data, borrow checking enforced by compiler), because that's the way they want to (and are likely being asked to) develop.
I don't doubt they struggled with memory-related bugs in the Zig implementation, just like I don't doubt the value of the borrow checker. I'm just hoping for more clarity around the decision-making timeline and if Rust was considered as an option at any point in the last 5 years. Was it something that was wanted for a long time and finally seen as a reasonable thing to do with the advent of LLM improvements?
(Of course, no explanation is required on their part. They can do whatever they want with their software! Just curious about how this is being presented and playing out.)
The entire point of memory safety and other guarantees from Rust feels like a post hoc justification for me. (Edit: especially considering the tremendous volume of unsafe in the merged PR)
Anthropic was likely embarrassed by depending on a project (Zig) that had a strong policy of rejecting LLM generated changes. The fact that Zig held strong on this policy despite Anthropic providing a PR to significantly improve compilation times likely felt like a slap in the face to them
From one angle, it kind of looks like there was a larger pressure to rewrite into a language that many have agreed is just "better" for LLM generation (presence in training data, borrow checking enforced by compiler), because that's the way they want to (and are likely being asked to) develop.
I'm only a spectator here, and I have no insider knowledge; I don't use Zig, only barely use bun, and only (so far) use Rust for experiments, not for code I rely on in production or need to maintain.
With that disclaimer, this looks to me a lot like the kind of experiment I might expect to happen when a team that likes LLM programming tools suddenly finds themselves with an unlimited budget for such tools.
Is there any indication of the Bun team publicly mentioning the struggle to work through memory-related bugs prior to a few weeks ago when the rewrite was known to be a possibility?
Bun was notorious for memory-safety bugs. The standard comparison here: of Bun's 16k issues on GitHub, more than 2,500 mention a segfault, compared to just over 400 of Node's 20k issues.
I don't know if they talked about the struggle publicly but it was obviously a problem, and has been for a long while.
Node, of course, is also primarily written in a memory-unsafe language (namely C++). I can't say why the difference is so large, although Node does make use of C++'s affordances for memory safety (RAII, shared_ptr, etc).
Might I also dare suggest that Node has a more robust development process? I haven't looked that much into either Node's or Bun's, but it's hard to imagine a more haphazard process than one where some guy can get Claude to spit out a +1,000,000/-600,000 diff which rewrites everything into another language and have it merged in a matter of days. Their recent stunt of "We used Claude to parallelize the Zig type checker and it got 4x faster", where the result turned out to make type checking nondeterministic, also doesn't inspire a ton of confidence.
I develop a lot of code in memory unsafe languages, and my experience is that you can absolutely manage it, but you need to be meticulous. You need to consider contracts between different parts of the system. You need to carefully consider things like, "does this function return a value which I own and am responsible for freeing, or does it return a value which I borrow and can only use for some specified amount of time until it becomes invalid?". Good C APIs carefully document the ownership semantics of every pointer passed to or returned from every function. And of course, in C++, stuffing things into RAII wrappers or unique/shared pointers helps a ton.
Haphazard code, where people just slap code together until it "works", tends to be rife with edge case segfaults caused by ownership confusion.
Sort of entirely unrelated, but by golly it seems to be hard to find thorough documentation of ownership semantics in C API libraries! Or rather, the one main time I remember is when I was making Deno bindings for libclang. For some reason libclang absolutely will not clearly document any of their ownership semantics basically anywhere! Maybe it's because it's just a C API exposed from a C++ library, and hence not very important? But it was annoying nonetheless! :)
I agree, it's terrible how uncommon it is! The best example I've seen is from GStreamer, where most pointer return values and many pointer parameters are tagged with either [transfer: full] or [transfer: none], and nullable pointers are tagged [nullable]. It's not perfectly consistent but it's miles better than most other libraries.
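As an illustration of why those annotations matter, here's a minimal Rust sketch (hypothetical Frame/Pipeline types, not GStreamer's actual API) of the same distinction expressed in types instead of comments:

struct Frame { data: Vec<u8> }

struct Pipeline { current: Frame }

impl Pipeline {
    // [transfer: full]: the caller receives ownership, and dropping the
    // returned Frame frees it exactly once.
    fn take_frame(&mut self) -> Frame {
        std::mem::replace(&mut self.current, Frame { data: Vec::new() })
    }

    // [transfer: none]: the caller borrows, and the borrow checker enforces
    // "valid only while the pipeline is alive and not mutated".
    fn peek_frame(&self) -> &Frame {
        &self.current
    }
}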
Google code is some of the worst. I've accidentally introduced memory leaks into WebRTC code because ownership was transferred by raw pointer in C++. I naïvely assumed that, because it's C++, ownership would be transferred via unique_ptr or shared_ptr, or in the very rare cases that it'd be transferred by raw pointer for historical reasons, that would be documented at least in some comment in a header file. But no, no mention in any documentation or comments, just a raw pointer passed from WebRTC code to user code as a function parameter which will leak memory unless your user code frees it.
i know bun wasn't ever really at the "production-ready" stage even before this (and comments elsewhere in this thread and on the value-to-shareholders-enthusiast site demonstrate various rather amusing examples of why), but i thought this was a joke when the pr was created and i'm still not convinced it's serious now.
the idea that someone, in three weeks, produced, reviewed, and tested a one million line diff is preposterous, and miri (rust's official undefined behaviour checker tool 3000) seems to be screaming at the most trivially identifiable things. it's a pretty simple tool to hold, and claude as claimed should have had no issues using it, so was it absentee development practices, forgotten, ignored, or simply unknown to the llm's jockey? is there an actual rust developer (literally any, just one is enough for this question), who actually knows rust, working on the bun team?
someone did the math so that i don't have to:
architector4@AGOGUS:/tmp$ git clone --depth=1 'https://github.com/oven-sh/bun'
Cloning into 'bun'...
…
architector4@AGOGUS:/tmp$ cd bun/
architector4@AGOGUS:/tmp/bun$ find -type f -name '*.rs' -exec grep unsafe {} \; | grep -v '//' | wc -l
13255

Thirteen thousand two hundred and fifty-five lines, excluding comments, with the word "unsafe" in them across this rewrite's Rust code files.
Sure, there's some appearances of C API interop going on, and perhaps there's some really ultra performance sensitive things here and there that warrant use of unsafe. But something tells me that for a proper safe Rust rewrite without such egregious soundness bugs littered like candy, this codebase would need to get ship of Theseus'd a second time over.
i guess this might have been some sort of super weird reverse marketing stunt by anthropic, given the sheer level of "huh?" to the original pr, but they've managed to turn the opinion dial back down from "it's a bit of a yolo project but it's alright for fun"—a position at which it had only really recently arrived—to "tinker project", so i'm not quite sure that worked as planned. nobody with money on the line is going to use this when completely-solved problems like constructing an owned string from a str reference causes ub because they don't know how lifetimes work.
we're all adults here so playing pretend about capabilities when you were effectively just a proxy harness for machine-generated content isn't particularly valuable long-term unless you're specifically vying for a job with the title "vice president" or "senior manager" in it. would you hire this to work on your product or service? would you accept a million-line (or, for unfairness, let's reduce it by an order of magnitude to 100,000 lines) diff that has been demonstrably barely tested? i'd watch a documentary that explains the sort of organisational culture that would make one proud of this sort of thing, because i truly genuinely don't understand ("am i out of touch?"), because i personally would be ill with stress if i submitted something even 1/100 of this, with tests. i updated a package for nixpkgs the other day and checked four separate times in different ways that my changes—a total of four lines—were correct. i don't know if that's just over-perfectionism.
while reading through the carnage i did catch this message from claude in another pr from a couple hours ago, that made me laugh (emphasis mine):
I didn't find correctness issues, but this touches 45 files including core syscall dispatch, spawn, and the crash handler, with a few non-mechanical bits (raw-syscall statx/memfd_create shims, hardcoded POSIX_SPAWN_SETSID) — and the Android build legs are still red in CI — so worth a human pass.
the commit was merged four minutes later.
i look forward to watching bun's progress over the next six months. it will either be an incredible redemption arc for the team, a revert to the zig version with a very quiet "we'll try it again in a year" sorta beat, or a solid double down where bun's development process largely pivots to "fix all the things we had once figured out, but then broke, now in a new language that nobody on the team knows, without using that language's most prominent paradigms" instead of fixing genuine discrepancies in the runtime for about 12 months.
but hey, the tokens were free right? that cost nobody, anywhere, anything, so no harm no foul... right?
Interesting messaging with this whole saga. It's just an experiment, but also it's been shipped in short order? Like, it's their prerogative, but why is the posting about it so cagey?
It's an interesting experiment, but it seems like you'd want to take a more careful approach and have both versions exist in parallel for a while. So early adopters can use the rust version and more conservative adopters can stick with what they already have that already works.
Without any commentary or context from the authors, I don’t see the point in sharing this story. All we will get is speculation.
Even with commentary I'm not sure of the point. Only time will tell if this is an improvement or not. I'll check in in a year or two.
Jarred has been promising a blog post about this for a week or two.
Excited to run it through Claude and get a bullet point summary I can read over coffee.
I am not heavily versed in any of JavaScript, Rust, or Zig, though I have dabbled in and read snippets of each.
JavaScript is not a "safe" language. Depending on how you want to wield the definition of "safe", [Visual]Basic may be the only other thing I can think of in that same realm of "we'll see if it runs the way we expect or not".
Am I the only one that feels a massive sense of irony here? That Anthropic will have a "safe" VM that executes code that is anything but safe, to run the tools that interact with their LLM. And that their LLM, which is essentially a giant probability net purposefully seeded with "noise", will write the less-than-safe code to interact with itself on the otherwise "safe" VM.
JavaScript isn't being discussed at all here and is largely irrelevant. All of the JavaScript bits are done in javascriptcore which bun wraps (like node wraps V8).
Depends what meaning of "safe" you mean. The ECMAScript spec doesn't include undefined behaviour, not even for data races on SharedArrayBuffers. Every operation has defined behaviour. Some of them (like {} + {}) are defined as having behaviour which is not very good, but they still are defined.
ECMAScript mostly tries to make the language semantics deterministic, such as the iteration order for Map and Set objects being deterministic and WeakMap being defined so that you can't iterate the contents of a WeakMap so that you can't observe whether or not a particular object has been garbage collected yet.
In my experience it's fairly easy to stick to a subset of ECMAScript in which you don't routinely run into heinous problems caused by the language itself. Use TypeScript (only for checking, not for compilation), turn on its strict mode, lean on the type system at least a little bit and primarily use async/await for concurrency (wrap any callback based API in a Promise as soon as you can), use Map<string, Foo> rather than Record<string, Foo>, refrain from abusing getFoo() as any or getFoo()! syntax. In my experience code written this way tends to throw null pointer exceptions much less often than Java or C# code in the wild because TypeScript doesn't have pervasive nullability and it tends to have an easier time with domain modelling because TypeScript does have the ability to encode sum types.
The way this is going, the Bun code is also turning into a giant probability net. Instead of training the weights of a neural network against expected outputs, we're training lines of code against human feedback. In both cases, the artefact is an unreviewed black box. Someday Bun will start hallucinating the existence of JS promises, and we'll be told that it's an unfortunate but reasonable aspect of how modern software works.
I admit having exaggerated the previous paragraph, for dramatic effect. But the parallels between that vibe-coded Bun and an actual LLM are scary.