Newtype Index Pattern In Zig

45 points by Johz

weberc2

Note well that memory savings are evenly spread out. Using indexes makes every data structure slightly more compact, which improves performance across the board, regardless of hotspot distribution. It’s hard to notice a potential for such saving in a profiler, and even harder to test out.

This should be pretty easy to test out—if you have any end-to-end benchmarks or performance measurement support for your application, you should be able to just change your index types from 32 to 64 bit and see how much performance degrades.

peter-leonov

First, they save memory. Typically a 32-bit index is enough, a saving of four bytes per pointer on 64-bit architectures. I haven’t seen this measured, but my gut feeling is that this is much more impactful than it might initially seem.

v8 team at google posted on pointer compression (with some stats), which is specifically impactful for "everything is a reference" languages like JS.

aapoalas

I've recently found one reason to prefer index + count in storing a collection of fields that can either have constants or an AST: for the AST case I'd store a slice of the Nodes array as index and count, and I'd detect the constant case by a zero count: the index would then be an index to a separate constants array instead.

Ambroisie

Checking for start == end doesn't sound like a worse condition TBH.
- aapoalas
  
  Point isn't condition checking, it's that in the start + end case you need both values to check the "no children" case (although you could encode it as start == u32::MAX as well) whereas in the index + count case only the count is needed to detect that case, which means that in the zero count case the index value can carry some data that is only relevant in the "zero children" case.
  
  In my case I always either have non-zero children, or I have a constant value. Thus, I can always reuse the children index as a constant value index. This saves me 4 bytes per entry, as I don't need to store both a children index, children count, and a constant index
  - Ambroisie
    
    which means that in the zero count case the index value can carry some data that is only relevant in the "zero children" case.
    
    Which is equally doable when the zero children case is encoded by start == end?
    
    Thus, I can always reuse the children index as a constant value index
    
    As I said, you can just use start as said constant-index.
    
    aapoalas
    
    Right, I had a hard time making out your point from how terse the message was.
    
    I hadn't considered that but indeed, you can do the same that way too. If I had to look for a downside in that, it'd be that you always have to read both the start and end values to perform the comparison, which may be relevant if you have SoA storage (as I happen to have). Additionally, that logic is generally harder to express in a type system, if you want to try do that, whereas a null/non-null split is generally more doable.
    
    But these are superficial differences mostly; thank you, I honestly hadn't considered that the same is possible in the start/end formulation as well!
    
    Ambroisie
    
    If I had to look for a downside in that, it'd be that you always have to read both the start and end values to perform the comparison, which may be relevant if you have SoA storage
    
    It could be an issue, but I'd expect it to come off in the wash: you'd fetch both values anyway in a few more instructions? But would definitely need to be measured and compared rather than simply "vibed out".
    
    But these are superficial differences mostly; thank you
    
    No problem, sorry I wasn't very clear in my first message!
    
    aapoalas
    
    It could be an issue, but I'd expect it to come off in the wash: you'd fetch both values anyway in a few more instructions? But would definitely need to be measured and compared rather than simply "vibed out".
    
    Yup, only really obvious thing I can think that wouldn't be a total wash would be something like filtering all non-zero cases or zero cases in a SoA setting, especially using SIMD. But we're then so far from the ordinary that it's unlikely to be very commonly useful.
    
    My use case does actually have something like that: I can deal with constants much easier, so I like filtering them fast up front. But I'm on JavaScript so I doubt SIMD optimisation are present :)
    
    alurm
    
    Great post as always.
    
    Indexes are naturally relocatable, it doesn’t matter where in memory they are
    
    I never understood that point. Provided there's a way to do this without causing UB in the language of choice, seems like it should be possible to record (or deduct) a "master offset" before transmission (thus making pointers essentially relative) and then fixup every pointer by adding the current "matter offset" after transmission, right?
    
    I guess that would be an additional step and additional code.
    
    But still, if you wanna store/transmit pointers, that seems like the thing to do, right? Never saw anybody talk about that technique, that's why I'm asking.
    
    chc4
    
    capnproto serializes pointers as relative offsets. It's fairly standard in other encoding formats as well, and I think the point of the sentence you're quoting: you can turn non-relocatable pointers into relocatable relative offsets.
    
    alurm
    
    Thanks!
    
    mcherm
    
    Sure, one can do this. But the complexity is large compared to the index approach. During deserialization one can no longer just bulk copy bytes but must also have a second pass through the data adjusting each pointer. And you must know which fields are pointers (straightforward if you own the data structure 100%; more difficult if the ownership is diffused, eg. one module owns the outer data structure and another owns the inner).