The radix 2^51 trick to 256-bit addition (2020)

33 points by val


isuffix

This is really neat! The discussion of carries in addition reminds me of this blog post about using “fatnums” to reduce carrying and simplify by-hand multiplication. Both techniques effectively amortize the cost of computing carries by splitting out a separate normalization step. Neat that in this case it also lets us exploit parallelism in the CPU.

https://parentheticallyspeaking.org/articles/fat-nums/
(Displays poorly on mobile :/ )

ibookstein

This reminds me of Carry-save adders, but in this case the carry vector is stored in the high bits of the words.

See also this submission, which apparently uses this trick as codified in AVX512 instructions vpmadd52luq and vpmadd52huq.

bitshift

When I first saw the title, I thought it had something to do with integer math in floating point, because doubles have almost that many bits of precision. You can add together four 51-bit ints without hitting 2^53-1, the threshold where integer math starts breaking down.

(Nope, it’s unrelated. The actual answer: 256 bits / 5 registers = 51.2 bits per register. This is the smallest number of registers that still gives you some carry-bit headroom, and it’s just a happy coincidence that this specific breakdown would also work in JS/Lua/etc.)

lproven

Fascinating and a cunning hack.

Garbi

I had the weirdest feeling reading this. Through most of it, until the five limbs part, at each stage I thought “what if they” and the next paragraph they sort of did that.