Bootstrapping Rust Considered Harmful
12 points by calvin
For Rust, there currently is no bootstrap compiler written in any other language than Rust
This isn't exactly true: there is mrustc, which is written in C++. It lags behind in which Rust version it supports, though, so you still have to go through multiple compiles of Rust.
If you write "minimalist" software - anything that runs on the terminal - please be aware of this "bloat" and consider the possibility of using a less dependency-rich and resource-hungry language - or wait for a more lightweight "gcc" version of Rust. Thank you!
You say your focus is DragonflyBSD. Are you going to pay me for the extra time and stress I incur by using a tool and ecosystem with weaker compile-time guarantees, less of a focus on "fearless upgrades" for dependencies, higher runtime startup times to work around for CLI tools, and/or more difficult deployment on platforms I do care about? ...just to make it easier to support a "weird" platform that I never use?
(TL;DR: That link is to a blog post named "Weird architectures weren't supported to begin with" about how it doesn't change what is reasonable to expect from upstream when you port their project to your platform that they never intended to provide that level of support for.)
...and, for the record, I do use FreeBSD via my OPNsense router and my NearlyFreeSpeech.NET shared web hosting... just not DragonflyBSD... and yes, I admittedly would prefer if a VM managed by Vagrant didn't appear to be the lowest-hassle way to cross-compile Rust stuff from Linux for them.
OCaml is comparably complex
OCaml is not a self-hosting, compile-to-machine-code toolchain with an optimizer stack competitive with GCC or LLVM. You're comparing apples to oranges.
In fact, of your entire chart of compile times, the only thing even close to an apples-to-apples comparison is GCC which isn't fully apples-to-apples because of its privileged role in the stack.
What the hell is "Pest" and "SnakeMake"??? Haha, at least there are 10 lines of Brainfuck!
https://pest.rs/ is Rust's second-oldest and second-best-known parser generator. (Think Yacc/Bison. Nom is probably #1, with LALRPOP probably taking #3)
Not giving at least the impression of trying to answer that question is likely to push Rust developers more in the direction of "Well, they clearly aren't trying to see things from my perspective at all. Why should I care what they think?"
It has its own DSL because it's based on PEG parsing, not LR(1) parsing like Yacc/Bison. (Nom does parser combinators defined in Rust source code, Pest does PEG, and LALRPOP does LR(1) like Yacc/Bison.)
Honestly, I'd never heard of SnakeMake and I don't know why Rust would be using it, so the presence of Brainfuck suggests that both entries may be cloc mis-identifying something else. If it does any kind of non-filename-based detection, I could see it misdetecting some YAML files as Snakemake files.
How other languages bootstrap themselves
Your argument is regressing into "C and C++ hold special privileged status. They are the only self-hosted languages allowed to be the root of a bootstrapping tree".
This argument was made multiple times during the decade when Reddit had an RSS feed for /r/rust/. People found it unconvincing then and it remains unconvincing now.
How do you bootstrap GCC, given that it requires a C++ compiler and C++ is now so complex that Intel decided to pull an MSIE→Edge and redo ICC as an LLVM-based thing, leaving us with basically just three C++ compilers viable for real-world use-cases? (GCC, LLVM Clang, MSVC)
Rust also isn't alone in its approach. GHC (Haskell) is one example I know off the top of my head for a self-hosted language which used to have a Zig/OCaml-style approach to bootstrapping but they've now sunsetted the compile-to-C option and expect you to cross-compile from an existing platform to bootstrap a new one.
Zig and Go are the outliers here. (Philosophically so... to the point where zig cc exists for easy cross-compilation and cargo zigbuild exists to wrap it, while "cgo is not Go" and Go actively embraces prioritizing fast compiler runtimes over highly optimized output and wants to reinvent the world for easier cross-building so much that they had to backstep from bypassing libSystem.dylib... which is the official macOS kernel ABI stability boundary.)
Bootstrapping Rust without a binary
First, you're again giving GCC (which, remember, contains C++ these days) special privileged status.
Second, Rust has https://github.com/thepowersgang/mrustc so you can achieve a byte-identical result by bootstrapping 1.74.0 from C++ (Currently. It does occasionally bump the version it can re-bootstrap) via Diverse Double Compilation and then walk up to the current version to prove the mainline binaries are free from Trusting Trust attacks and match the published source.
(That's the primary purpose of mrustc. It's an auditing tool for proving that the published SHA256 hashes for the existing binaries signify a trustworthy base for future development and for bootstrapping new platforms via cross-compilation.)
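For anyone unfamiliar with the idea, here is a toy sketch of the Diverse Double Compilation check in Rust. Everything in it is a made-up model for illustration (the `Compiler` struct, `compile`, and `ddc_matches` are hypothetical, and it assumes fully deterministic builds); it is not how mrustc or rustc are actually driven.

```rust
// Toy model of Diverse Double Compilation (DDC). All names are hypothetical;
// real compilers are not structs and real binaries are not strings.

#[derive(Clone, Debug, PartialEq, Eq)]
struct Compiler {
    /// How this binary behaves when it compiles something.
    behaviour: String,
    /// Stand-in for the exact bytes of the binary on disk.
    bytes: String,
}

/// Compile `source` with `with`. Assumes a deterministic build: the output
/// bytes depend only on the source and on how the compiling binary behaves.
fn compile(with: &Compiler, source: &str) -> Compiler {
    // In this toy model, a Trusting Trust backdoor copies itself into
    // whatever the backdoored compiler builds.
    let behaviour = if with.behaviour.ends_with("+backdoor") {
        format!("{}+backdoor", source)
    } else {
        source.to_string()
    };
    let bytes = format!("{} built by {}", behaviour, with.behaviour);
    Compiler { behaviour, bytes }
}

/// Returns true if the official binary is byte-identical to what the
/// independent (trusted) compiler produces after two compilation stages.
fn ddc_matches(trusted: &Compiler, official: &Compiler, source: &str) -> bool {
    let stage1 = compile(trusted, source); // e.g. mrustc builds rustc
    let stage2 = compile(&stage1, source); // that rustc rebuilds itself
    stage2.bytes == official.bytes
}
```

The point of the second stage is that stage 1 and the official binary were produced by different compilers and so will not match byte-for-byte, while stage 2 and an honest official binary were both produced by something that behaves like the same source.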
Personally, I am not exactly convinced to download a 699 MB compressed tarball, extract it to become 1.9 GB, to then compile Rust for the next three and a half hours just to be able to use:
Sorry to break it to you but, for most people, your dedication to re-bootstrapping your entire package tree comes across as an XY Problem where your actual problem is your philosophical rejection of caching expensive intermediate artifacts (eg. the Rust toolchain) instead of addressing whatever problems with the current implementation of caching make you averse to depending on it.
Honestly, when I saw "Bootstrapping Rust Considered Harmful", that's the angle I expected to see being addressed.
My advice to you: Don't blindly use Rust for everything just because it's currently popular. Think carefully about whether Rust is really the right tool for the task at hand. There are many good alternatives:
Go is "simple" and you get world-class cross-compilation and statically-compiled binaries out of the box (it once saved my day)
Go's type system is primitive, it took far too long to accept that it should have generics, it's written by people who believe the answer to "Things should be as simple as possible but no simpler" is to dispute what is possible and brush edge cases under the rug, and it's fundamentally designed around the idea that it's wrong to want in-process FFI.
Honestly, back in the mid to late 2010s, having also seen Rust's ? (or, actually, I think it would have been try!() at the time), I took one look at it and was disgusted by all the if err != nil.
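To make the contrast concrete, here is a minimal sketch in Rust. The function names `sum_explicit` and `sum_question_mark` are made up for illustration: the first spells out the early return by hand, roughly the shape of Go's `if err != nil { return err }`, and the second does the same thing with `?`.

```rust
use std::num::ParseIntError;

/// Explicit propagation: check each result and return early on failure.
fn sum_explicit(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = match a.parse::<i32>() {
        Ok(v) => v,
        Err(e) => return Err(e),
    };
    let y = match b.parse::<i32>() {
        Ok(v) => v,
        Err(e) => return Err(e),
    };
    Ok(x + y)
}

/// The same logic with `?`, which returns early from the function on `Err`.
fn sum_question_mark(a: &str, b: &str) -> Result<i32, ParseIntError> {
    Ok(a.parse::<i32>()? + b.parse::<i32>()?)
}
```

Both functions return the same `Result` for the same inputs; the difference is purely how much of the function body is spent on plumbing.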
OCaml is an excellent systems programming language. It largely adheres to the UNIX philosophy and functional programming paradigm and is easy to set up. Some people use it to build type-safe unikernels
OCaml has a garbage collector. GCs are solitary creatures. With Rust, I can write code once and then share it between a CLI tool and compiled extensions for languages like Python. It even has helpers for doing so.
Being able to write once and reuse is core to one of the original four points of the UNIX philosophy.
Really cool kids use Zig
As far as memory-safety is concerned, Zig is C with dipping mustard.
Aside from a couple of combined C/C++ semesters in university that more or less stuck to the basics and game dev in Allegro 4, respectively, I didn't find any reason for C to be worth the effort until I started doing MS-DOS retro-hobby programming.
If it's big enough to be worth doing more than banging out a tiny little "shell script" (with a .py extension for access to less pathological primitives), it's big enough to benefit from the things Rust does much better than Zig. (including the ecosystem of useful-for-CLI-tools packages like ignore, Serde, Rayon, Clap or gumdrop, etc. and, most importantly, the Rust Stability Promise.)
For small tasks, why not use the Lingua franca C?
Given how barren the standard library is and how absent support for easily adding dependencies to supplement it is, even for retro-hobby programming, I rely on Free Pascal's DPMI target and its batteries-included standard library when the project isn't my on-hold one to write a BASIC interpreter and a real-mode x86 .zip SFX stub with a standard library geared toward Inno Setup/NSIS-style uses and fit it into 10KiB or less so it won't crowd the actual content off a floppy disk any more than actual installers from the period.
Among other problems...
What's special about UB is that it attacks your ability to find bugs, like a disease that attacks the immune system. Undefined behavior can have arbitrary, non-local and even non-causal effects that undermine the deterministic nature of programs. That's intolerable, and that's why it's so important that safe Rust rules out undefined behavior even if there are still classes of bugs that it doesn't eliminate.
-- trentj
Since Free Pascal and Lazarus (its Delphi-alike) dropped support for Win9x (specifically, for ANSI/non-Unicode Win32) out of a lack of maintainer interest, I'm keeping my eye on Rust9x in case I can use it for when I move on to Windows 9x retro-hobby projects that I can't do in the ancient version of Python 2.x + wxPython + py2exe that I first learned Python on after Visual Basic 6 and before my move to Linux.
(Or just use one of the copies of Delphi or Visual Basic that I eBay'd for my collecting hobby, if I care more about getting things done than living up to my standards for having my projects not be Free but Shackled.)
Hare is a simple, stable, and robust systems programming language
Hoo boy. Where to start...
Rather than going into detail on Drew DeVault's interesting stances on various design decisions in Hare (here's one example), I'll just quote this:
It really has no legitimate reason for our attention. It addresses no actual problem not better addressed elsewhere. Even Zig is better justified, but isn't.
Hare is meant to replace C in the creator's affections. But any language meant to be adopted by C coders is doomed: remaining C coders are defined by having seen a thousand languages go by, and passed on all of them.
Both C++ and Rust address C shortcomings with powerful advances in productivity. Weak-sauce C updates are at best a distraction, at worst compound its problems by siphoning off coders from alternatives. The people they are for don't want them, and new programmers are much better off with actually-better languages.
-- ncm @ https://lwn.net/Articles/893392/
I bootstrap Go from source at least once a week on the 5 platforms I rebuild it on along with my entire dependency stack (Alpine, CachyOS, macOS, Ubuntu/x64 and Ubuntu/arm64). Even though it is a 6-stage bootstrap (1.4 with C →1.19.5→1.21.6→1.23.6→1.25.7→1.26.4), it happens in 4 minutes, or 1/4 the time it takes that monstrosity that is CMake to build, which is 2/3 of the time it takes freaking LLVM to build.
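Spelling out the arithmetic in those ratios, as a rough sketch: the helper below is hypothetical, treats "4 minutes" as exact, and assumes the 2/3 figure means CMake's build time relative to LLVM's.

```rust
/// Build times implied by the stated ratios, in minutes.
/// Returns (cmake_minutes, llvm_minutes) for a given Go bootstrap time.
/// Assumption: Go bootstrap = 1/4 of CMake, and CMake = 2/3 of LLVM.
fn implied_build_minutes(go_bootstrap: u32) -> (u32, u32) {
    let cmake = go_bootstrap * 4; // Go takes 1/4 of CMake's time
    let llvm = cmake * 3 / 2; // CMake takes 2/3 of LLVM's time
    (cmake, llvm)
}
```

Under those assumptions, `implied_build_minutes(4)` gives roughly 16 minutes for CMake and 24 minutes for LLVM.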
I have no problems whatsoever bootstrapping OCaml, despite my hatred for ML (Xavier Leroy, author of OCaml, was my CS TA in University).
I'd love to bootstrap Rust, but today it's just not viable, and neither is GHC.
Here's a treemap of compilation times:
I'm not sure what the relevance is of replying to my post specifically.
As I mentioned, Go made a philosophical decision to prioritize fast compilation times at the expense of almost everything else and that apparently extends to bootstrapping as well, and OCaml is neither self-hosting nor the bearer of a big, complex optimizer stack.
Here's a treemap of compilation times:
And, as I also mentioned, the closest apples-to-apples comparison is between Rust and GCC, with GCC being able to rely on not needing to be fully bootstrapped because it can rely on the platform already having a C++ compiler.
How would you bootstrap gcc? That aside...
I'm not sure why bootstrapping rust from gcc was dismissed out of hand? People can and do do that. Yes there's not a full implementation of the rust compiler that runs on gcc but it is enough to build an older rust toolchain and bootstrap from there.
And yeah, most people should cross-compile rather than bootstrap. Or even if they really do need to bootstrap they can reuse one they made earlier on one platform and cross-compile to the new platform.
How would you bootstrap gcc?
hex0 to Mes to an older version of gcc; see e.g. https://guix.gnu.org/en/blog/2023/the-full-source-bootstrap-building-from-source-all-the-way-down/
If anyone is wondering about the removed comment, either Firefox or Lobsters re-filled my comment in the form and I accidentally fixed a typo in the new comment field and clicked Post instead of fixing it in the existing comment editor and clicking Update.