Mitigating the Billion Dollar Mistake

12 points by rodef


This article is a continuation of: Was it really a Billion Dollar Mistake?

snej

Unfortunately existing languages like Java cannot have these problems solved, but newer languages that want to stylize themselves similar to that could solve them.

TypeScript and Kotlin implement all three of the mitigation options he lists, and IMO it works very well.

As I said, I think this works very well, and it’s rare that I have to use an escape hatch like “!!”.

typesanitizer

If I understand correctly, the underlying thesis may be re-stated as:

In languages which distinguish values and pointers:

  1. There is a performance cost to mandatory explicit initialization which makes it unsuitable for certain kinds of situations

  2. The additional code required to initialize everything explicitly obscures the underlying logic. On the other hand, one can get used to default zero initialization (with null pointers being a special case).

  3. The bugs introduced by null pointer dereferences are not sufficiently numerous or serious to warrant paying the costs in 1 and 2.

If my understanding is correct, then (1) should have examples of programs in a language like Rust (which requires mandatory initialization), and those should be observably slower than the equivalent Odin programs. (1) probably also needs some evidence that such programs are representative of the ~typical program in the language of interest. Understandably, this is hard to supply/prove.

If you look at sources, zero initialization itself has ~zero overhead for stack variables (see: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2723r0.html). Zero initialization overall has an overhead of maybe 1-2% at most (https://users.elis.ugent.be/~jsartor/researchDocs/OOPSLA2011Zero-submit.pdf), if you implement some optimizations (e.g. batching). This is relative to no initialization. IIRC, JF Bastien measured the overhead of pattern-initialization and that was <5% for most cases, but I don't have a source for it.
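To make the comparison in (1) concrete, here is a minimal Rust sketch of the two things being weighed against each other: a buffer that is zero-initialized before being overwritten, and one that opts out of initialization through `MaybeUninit`. The function names and the buffer size `N` are made up for illustration. This is not a benchmark, and an optimizer may well remove the redundant zeroing in the first version.

```rust
use std::mem::MaybeUninit;

// Hypothetical buffer size, chosen only for illustration.
const N: usize = 4096;

/// Zero-initialize the buffer first, then overwrite every byte.
fn fill_zeroed(value: u8) -> [u8; N] {
    let mut buf = [0u8; N]; // explicit initialization required by the language
    for b in buf.iter_mut() {
        *b = value;
    }
    buf
}

/// Skip the initial zeroing by explicitly opting out of initialization.
fn fill_uninit(value: u8) -> [u8; N] {
    let mut buf: MaybeUninit<[u8; N]> = MaybeUninit::uninit();
    let ptr = buf.as_mut_ptr() as *mut u8;
    for i in 0..N {
        // SAFETY: i < N, so the write stays inside the buffer.
        unsafe { ptr.add(i).write(value) };
    }
    // SAFETY: every byte was written in the loop above.
    unsafe { buf.assume_init() }
}
```

Both functions return the same array for the same `value`. The only difference is whether the bytes are written once or twice, which is the cost the numbers above are trying to bound.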

Re (2): This is a bit hard to refute modulo doing concrete measurements on the speed of code understanding. IMO, it seems plausible that this doesn't matter, or it's heavily dependent on the coding patterns in use.

Re (3): I think this is where the readers of the previous article disagreed with the author. However, the author seems to dismiss them:

you don’t actually understand the costs fully if you are answering the way that you do

For this, the author cites personal experience:

I’ve been in projects where a lot of the time in a program is spent in the destructors/Drop traits of individual elements, when all they are doing is trivial things which could have been trivially done in bulk.

Sure, citing personal experience is valid, but then if you're dismissing other people's personal experience as invalid, surely that's not reasonable, is it?
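For readers who haven't run into the per-element versus bulk cleanup pattern the quote describes, here is a rough Rust sketch. It is not the author's code; the type and function names are hypothetical. The first list gives every node its own heap allocation, so dropping it frees the nodes one by one. The second keeps all nodes in a single `Vec` and links them by index, so dropping it releases one allocation.

```rust
/// Each node owns a separate heap allocation; dropping the list
/// runs one deallocation per node.
struct BoxedNode {
    value: u32,
    next: Option<Box<BoxedNode>>,
}

/// Nodes live in one Vec and refer to each other by index;
/// dropping the Vec frees everything in a single deallocation.
struct ArenaNode {
    value: u32,
    next: Option<usize>,
}

/// Build the list 1, 2, ..., n with one Box per node.
fn build_boxed(n: u32) -> Option<Box<BoxedNode>> {
    let mut head = None;
    for value in (1..=n).rev() {
        head = Some(Box::new(BoxedNode { value, next: head }));
    }
    head
}

fn sum_boxed(mut cur: &Option<Box<BoxedNode>>) -> u32 {
    let mut total = 0;
    while let Some(node) = cur {
        total += node.value;
        cur = &node.next;
    }
    total
}

/// Build the same list inside a single Vec.
fn build_arena(n: u32) -> Vec<ArenaNode> {
    let mut nodes = Vec::with_capacity(n as usize);
    for i in 0..n {
        let next = if i + 1 < n { Some(i as usize + 1) } else { None };
        nodes.push(ArenaNode { value: i + 1, next });
    }
    nodes
}

fn sum_arena(nodes: &[ArenaNode]) -> u32 {
    let mut total = 0;
    let mut cur = if nodes.is_empty() { None } else { Some(0) };
    while let Some(i) = cur {
        total += nodes[i].value;
        cur = nodes[i].next;
    }
    total
}
```

Both versions compute the same sum. Whether the difference in teardown cost matters in practice depends on the workload, which is exactly the part that needs measurements rather than anecdotes.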

Then there's also the more obvious counter that Zig is a modern, performance-oriented, memory-unsafe language (similar to Odin), with native support for SoA and multiple talks by Andrew Kelley on data-oriented design. But even Zig doesn't allow pointers to be null by default; you have to opt in to that. IIUC Zig also requires explicit initialization, but you can opt out of it via undefined. This calls into question both claims (1) and (3).
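I'm less sure of Zig's exact syntax, so here is the same "non-null by default, nullable by opt-in" idea sketched in Rust instead. The helper names are hypothetical. A plain `&i32` can never be null; wrapping it in `Option` is the opt-in, and the compiler forces the caller to handle the `None` case before touching the value.

```rust
use std::mem::size_of;

/// Returns a reference to the first even element, or None if there is none.
/// The Option wrapper is the explicit opt-in to "may be absent".
fn first_even(xs: &[i32]) -> Option<&i32> {
    xs.iter().find(|x| **x % 2 == 0)
}

/// The caller has to handle both cases; there is no implicit null dereference.
fn describe(xs: &[i32]) -> String {
    match first_even(xs) {
        Some(x) => format!("first even: {x}"),
        None => String::from("no even number"),
    }
}

/// Option<&T> is documented to have the same size as &T, so the
/// opt-in costs no extra space compared to a raw nullable pointer.
fn option_ref_is_pointer_sized() -> bool {
    size_of::<Option<&i32>>() == size_of::<&i32>()
}
```

So the representation is the same as a nullable pointer; only the default is flipped.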

Corbin

I know a lot of people view the explicit individual initialization of every element everywhere approach as the “obvious solution”, as it seems like low-hanging fruit. As a kid, I was told to not pick low-hanging fruit, especially anything below my waist. Just because it looks easy to pick, a lot of it might not be unpicked for a reason. It does not mean that you should or should not pick that fruit, but rather you need to consider the trade-offs.

As an adult, I know that I can shape a fruit tree such that it primarily gives low-hanging fruit. I also know that I can let the tree grow tall, so that its branches are leafy and hold many bird nests and offer high-hanging fruit for the tree-dwellers, while still offering plenty of low-hanging fruit.