How (and why) we rewrote our production C++ frontend infrastructure in Rust

41 points by abareplace

But, come on, it’s 2026, and “lowercase a string” is still too much to ask from the language’s standard library?

The advantage, in theory, of the C++ approach is that it's agnostic to the string representation. That is a big deal: optimising the string representation in different ways is something I've seen give a 2x or more end-to-end performance improvement, in codebases that wanted completely different string representations.

The problem is that the C++ approach is staggeringly bad for this. Unless you do a lot of inlining in exactly the right order, the result may not be vectorisable. And the tolower function almost certainly isn't anyway (it is probably the libc one, not the ICU one, so it may or may not do the right thing for the correct locale). Oh, you did remember that this is a domain name, right? So you probably don't want the default locale behaviour (you did remember that in the Rust version too, I hope?).

Making strings both efficient across different use cases and ergonomic is really hard. C++ fails on both counts.

refi64

> C++ fails on both counts.

I often feel like this is a recurring issue with C++: it tries to solve some problems but utterly fails at the most basic parts. So you end up with stuff like std::random having lots of footguns, and people often end up relying on external libraries anyway.

fazalmajid

Indeed, Unicode is loaded with footguns. It's not as if Rust strings don't have their own issues, like the fact String::truncate() can panic...
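
For illustration, a minimal sketch of how that panic can be avoided. It assumes a hypothetical helper, `truncate_at_boundary`, which is not part of std; it backs the byte index off to the nearest char boundary using the real `str::is_char_boundary` before calling `String::truncate`.

```rust
/// Truncate `s` to at most `max_bytes` bytes without panicking.
/// Hypothetical helper for illustration; not part of std.
fn truncate_at_boundary(s: &mut String, max_bytes: usize) {
    // Nothing to cut: `String::truncate` is a no-op in this case anyway.
    if max_bytes >= s.len() {
        return;
    }
    // Walk backwards until the index sits on a char boundary.
    // Index 0 is always a boundary, so the loop terminates.
    let mut idx = max_bytes;
    while !s.is_char_boundary(idx) {
        idx -= 1;
    }
    // Safe now: `truncate` only panics when the index is not a char boundary.
    s.truncate(idx);
}
```

With `"héllo"`, byte index 2 falls inside the two-byte `é`, so calling `truncate(2)` directly would panic, while the helper cuts at index 1 instead. Note that this snaps to codepoints only, not to grapheme clusters.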
- masklinn
  
  That's less Unicode and more the combination of Rust requiring strings to be valid UTF-8 and using byte indices. There are good reasons to do that, but Rust could have done Unicode some other way, with different tradeoffs.
  
  And so string manipulations which take an index (or several e.g. slicing) will panic if you index in the middle of a codepoint instead of at a codepoint boundary.
  
  TBF the easiest solution, in every programming language, is to do strings properly: treat string indexes as opaque, and only use indexes you got from searches / iterations on that same string.
  
  For the cases where that's not an option (e.g. you specifically need to fit a string to a hard-coded storage location), 1.91 did stabilise str::floor_char_boundary and str::ceil_char_boundary to snap arbitrary indexes to codepoint boundaries. Still not sufficient for general purpose text manipulation (you probably want to snap to the grapheme cluster boundary), but useful for some edge cases.
  - mikedorf
    
    It's also worth pointing out that Rust did have utilities for grapheme clusters in std but removed them, based on very reasonable tradeoffs.
- zipy124
  
  At the end of the day I feel like it's not a good argument anyway. If you're using C++ you're generally concerned with performance, not ease of use; otherwise you'd be using some higher-level language. It's fairly trivial to have the performance-critical parts of your code live in C++ and call them from some higher-level language for the boilerplate, like games have been doing with Lua for eons.
kornel

BTW, if the domains are still in punycode, there's str::make_ascii_lowercase(). It's a basic in-place modification, not far from a naive loop you'd write in C. It takes &mut str, a rare beast, which technically makes it usable by any string-like type that has contiguous storage.
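
A minimal sketch of that, assuming a hypothetical wrapper named `lowercase_domain` (the name is made up for illustration; `make_ascii_lowercase` is the real std method). Because the parameter is `&mut str`, both a `String` and a `Box<str>` can be passed through deref coercion.

```rust
/// Lowercase an ASCII (e.g. punycode) domain name in place.
/// Hypothetical wrapper for illustration; only the std call inside is real.
fn lowercase_domain(domain: &mut str) {
    // Only ASCII 'A'..='Z' are changed; non-ASCII bytes are left untouched,
    // so the string stays valid UTF-8 and no locale is consulted.
    domain.make_ascii_lowercase();
}

/// Example of calling it on two different owned string types.
fn demo() -> (String, Box<str>) {
    let mut owned = String::from("XN--BCHER-KVA.Example.COM");
    lowercase_domain(&mut owned);

    let mut boxed: Box<str> = "ExAmPlE.org".into();
    lowercase_domain(&mut boxed);

    (owned, boxed)
}
```

No allocation happens in `lowercase_domain` itself; the bytes are rewritten where they already live.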

There are many small nice things in Rust that aren't even hard, which C++ lacks.

It's a "modern art gallery problem". You can see a blank canvas with a splatter of paint in a gallery, and laugh "that's so simple! I could have painted that!" but you didn't. It's not your canvas hanging in the gallery. So many things that C++ could have done, but it didn't.
scanner_brightly

Rust is cool and all but IMO rewrites are never really worth it unless you're dealing with excruciatingly bad baked in decisions, or fundamental performance problems or something.

It would probably have been less expensive to just refactor the existing C++ all else being equal.

E.g. if you think the transform looks ugly why not implement your own to_lower?
```
#include <string>

// ASCII-only lowercase, in place; bytes outside 'A'..'Z' are left untouched.
void to_lower(std::string& s) {
  for (auto& c : s)
    if (c >= 'A' && c <= 'Z') c += 'a' - 'A';
}
```
You don't have to limit yourself to the std library and suffer if it's missing.
- junon
  
  "never" is quite absolute. Fish did a RiiR and it seemed worth it. To date I've never seen anyone complain and AFAIU it's helped development efforts. Along with all of the memory safety improvements, I think it was a win.
zaphar

I love Rust, and for new projects it is my go-to. But the article didn't really explain what motivated the rewrite that well. I think it was because they were finding it harder over time to safely land new features. But they didn't really go into any detail on that, so you had to sort of infer it. A casual reading of the article might lead one to conclude they did it just because they wanted to say they did.

I don't think that was the intent though, so it's unfortunate.
- bitshift
  
  Ergonomics seems to be the main thing, given the central digression into lower-casing strings, which they say is emblematic of a thousand-cuts situation:
  
  > The bottom line is that C++ has caused more than a few situations where we wanted to do something or add a feature and it’s just like… that’s a cool idea, but it’s just not worth the uphill battle against the language.
  
  I didn't get "…because Rust is popular" vibes from their article. I think there's a big difference between that versus, "Rust is popular because…" which sounds more like their situation: here's all these small things that are important to us, that Rust makes easier, and that add up.
  
  You're right though that they are somewhat light on details about what the rewrite is enabling them to do. They briefly mention some internal features they built. I got the sense that they have some user-facing features that are coming down the pipe, but that they're not ready to announce them just yet?
  - zaphar
    
    Yeah, I get the feeling that there were probably some real drivers behind the rewrite around actual developer pain but they shared one common C++ example without really tying it to their specific problems very well. Which unfortunately leaves the article weaker than it could have been.