CPU Cache-Friendly Data Structures in Go
46 points by runxiyu
46 points by runxiyu
I’ve read about data driven design in C and other low level languages. I always assumed that a higher level language would have abstractions that would prevent ddd. I guess that’s wrong?
Go gives you a lot more control over memory layout than other higher level languages.
Is it explicit or do you get for free by using the right data structures? I know the answer is probably in the article, but I'm not picking it out.
It’s explicit as it’s mostly done by laying out structs, including adding padding between fields.
Zig has amazing support for ddd
I’ve been meaning to try it out for that, I love fast software
It's much easier in higher-level languages than C, since you can hide the implementation (say you want to use structs of arrays instead of arrays of structs) behind an abstraction, and C is terrible at abstractions compared to just about any other language. And there are few languages that don't have the ability to mess with data in a precise way with primitive types and aggregates of them, even Java can do it, if uncomfortably. The abstraction capability coupled with the low-level knobs is pretty much what has made C++ the default choice for performant software for a long while. But I'd also rather do that stuff in C# than C, for example.
DDD usually makes people think domain-driven design vs. data-oriented design. I mix them up too because it doesn't really matter, but it's funny how opposite they are.
Why is this using [56]byte instead of uint64 to add 64-bit padding? One requires understanding the array representation, other is obviously 64 bits wide.
To pad the the cache line width of 64 bytes, you'd need to pad with [7]uint64, which is arguably more mental math than 56 + sizeof(uint64) to arrive at 64.
I must be missing something because I don't think 56 + sizeof(uint64) gives takes you to the next 64-bit boundary?
Is there a one byte header in the uint64 field?
Cache lines are usually 64 bytes, not 64 bits. If you're accessing memory in the same region of memory that goes into the cache line from multiple cores, it needs to be loaded from memory each time and can effectively not be cached. A header and alignment might change things a bit. I don't think that's relevant for the example. (And, although I don't know for sure, I don't think Go structs have headers. No idea if structs are allocated aligned though.)
The Branch Predition Friendly Code Branchless version seems a bit broken. It checks for x >= 128 instead of x > 128 and doesn't work for x > 255 albeit the type being int and not byte.