AVX-512: First impressions on performance and programmability

5 points by jmillikin


valpackett

SIMT exposes a more “scalar”-like interface. In this case, your code would define the work for just a single pixel, and hardware/compiler together does the job of parallelizing the for loop

Mmm, I wanna see the author try ISPC!

I can’t comment on ISPC because I don’t know it

welp.

fanf

What makes AVX-512 more attractive to me is that it scales down to small data better than AVX2. It has much more uniform support for masking, especially for loads and stores, so there’s much less need for scalar code to deal with the unaligned ends of arrays. Nice for string ops!
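
To make the masking point concrete, here is a rough sketch (not from the article) of a byte-counting loop where the final partial chunk goes through the same masked load as the full ones, so there is no separate scalar tail loop. The function names count_byte, count_byte_avx512 and count_byte_scalar are made up for illustration. It assumes x86-64 with GCC or Clang, uses the AVX-512BW byte-granular intrinsics, and falls back to plain scalar code when the CPU lacks them.

```c
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Portable scalar reference, also used as the fallback path. */
static size_t count_byte_scalar(const char *s, size_t n, char c) {
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        hits += (s[i] == c);
    return hits;
}

/* AVX-512BW path (hypothetical helper): every iteration, including the
   final partial one, uses the same masked load. */
static __attribute__((target("avx512f,avx512bw")))
size_t count_byte_avx512(const char *s, size_t n, char c) {
    const __m512i needle = _mm512_set1_epi8(c);
    size_t hits = 0;
    for (size_t i = 0; i < n; i += 64) {
        size_t left = n - i;
        /* One mask bit per valid byte. Masked-off lanes are not accessed
           (faults are suppressed) and come back zero-filled. */
        __mmask64 live = left >= 64 ? ~0ULL : ((1ULL << left) - 1);
        __m512i chunk = _mm512_maskz_loadu_epi8(live, s + i);
        /* Compare under the same mask, so zero-filled lanes cannot
           match even when c is '\0'. */
        __mmask64 eq = _mm512_mask_cmpeq_epi8_mask(live, chunk, needle);
        hits += (size_t)__builtin_popcountll(eq);
    }
    return hits;
}

/* Runtime dispatch: only take the AVX-512 path when the CPU reports it. */
size_t count_byte(const char *s, size_t n, char c) {
    if (__builtin_cpu_supports("avx512f") && __builtin_cpu_supports("avx512bw"))
        return count_byte_avx512(s, n, c);
    return count_byte_scalar(s, n, c);
}
```

Calling count_byte("hello world", 11, 'o') works the same for an 11-byte string as for a multi-kilobyte buffer. With AVX2 the equivalent loop would typically need a scalar cleanup for the last few bytes, or a padded buffer, because AVX2 has no byte-granular masked load.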