C++26 Shipped a SIMD Library Nobody Asked For
18 points by colejohnson66
Here is the SG6 Chair's response to the referenced "6 reasons to use std::simd" critique. https://www.reddit.com/r/cpp/s/QIC0FC0sGN
I can't speak to the criticism in detail, but the author tests <experimental/simd> which is the experimental technical specification from years ago and not what was actually standardized in C++26.
The post makes this point about 6 times:
No optimizer integration. The compiler sees template instantiations and function calls, not SIMD primitives.
Why? The assembly fragments shown contain no function calls, just vector instructions, so clearly at some point in the compilation process the calls got inlined and they became SIMD primitives. Why can't that happen early? Is this a result of compilers having a fixed order they perform optimisations in, and the algebraic ones are before inlining? That sounds weird to me. What am I missing?
C++ defines templates to expand in their specific textual way with SFINAE and other quirks.
In theory, a compiler could detect typical uses when there is nothing weird going on and emit special-case code for them. In practice it's rarely done, because it requires precisely detecting when edge cases can't happen, while still keeping the ability to handle weird edge cases properly. Compiler devs don't want to have two versions of the same code that both need to be fully tested and kept exactly in sync, so they'd prefer the spec to say they can ignore edge cases (and delete the second half of the implementation). Otherwise, if something is defined to be a template, then implementing it as an actual template is the most straightforward way.
In compiling C++, I'd expect the compiler to check all types before starting optimisation, and because of how templates work, checking types involves instantiating templates. (Surely the idea isn't to defer generating type errors to the optimisation phase.) Once you have all the templates instantiated, there is no more templating going on and the template calls are just function calls (assuming they aren't dynamic member function calls, which would be stupid here), which can be inlined if the optimiser desires.
None of this seems to have anything to do with detecting special cases; this seems the general strategy to me. SFINAE and all that stuff is just details in the "template instantiation" part. Am I missing something?
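To make the "instantiate first, inline later" picture concrete, here is a minimal sketch. `Vec` and `add3` are hypothetical names made up for illustration; this is not `std::simd` or `<experimental/simd>`, just a toy wrapper with the same shape. The assumption is that the front end instantiates `Vec<float, 4>` and its `operator+` during type checking, after which the optimiser only sees ordinary non-virtual functions that it may or may not inline.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a library SIMD type (NOT std::simd).
template <typename T, std::size_t N>
struct Vec {
    std::array<T, N> lanes{};

    // Non-virtual, defined in-class: once Vec<T, N> is instantiated this is
    // just a regular function the optimiser is free to inline.
    friend Vec operator+(const Vec& a, const Vec& b) {
        Vec r{};
        for (std::size_t i = 0; i < N; ++i) {
            r.lanes[i] = a.lanes[i] + b.lanes[i];
        }
        return r;
    }
};

// Type-checking this function requires instantiating Vec<float, 4> and its
// operator+, so by the time optimisation starts no templates are left.
inline Vec<float, 4> add3(const Vec<float, 4>& a,
                          const Vec<float, 4>& b,
                          const Vec<float, 4>& c) {
    return a + b + c;  // two calls to the instantiated operator+
}
```

Whether the two `operator+` calls end up as a couple of vector adds depends on the compiler and flags; the sketch only shows that nothing template-specific remains once the types have been checked.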
EDIT: from another comment, apparently compiler improvements were indeed necessary, but they have been done.