anubis_offload: userscript to offload Anubis PoW to native CPU or GPU code
72 points by thombles
On the one hand, this seems… inevitable. But escalating this cat-and-mouse game also means that difficulties are going to be ratcheted up for everyone, eventually making the web unusable for anyone without these accelerator setups.
…I wonder if this will be what finally kills the mobile web? (Not even necessarily the performance impact itself, but the battery cost it necessarily imposes as well.)
Proof-of-work hashing is nothing compared to what legacy JavaScript frameworks have forced mobile browsers to suffer. Anubis should replace PoW with Proof-of-React; churn through the latest React boilerplate. The mobile web has endured a decade of that already.
You know, I kinda hate this enough that I may go implement it just to see how it fails in practice.
I implemented it on a plane. I’ll push the branch when I can.
Here’s the PR: https://github.com/TecharoHQ/anubis/pull/1038
I doubt many attackers are solving the PoW, and once they are, it's game over for the approach. Anubis works because it's an unexpected obstacle to a dumb bot, not because of PoW.
There are signs at least some of the bots have already levelled up https://social.anoxinon.de/@Codeberg/115033790447125787
That seems to confirm what I'm saying. The moment the kids bothered to attempt the PoW, they could do it at scale, and it didn't slow them down enough, so IP range blocks were needed.
The bots are not run by kids, the attacks are organized by AI companies, they use botnets built by adware companies, and they are supported by billionaires who are ruining everything.
For what it’s worth, Codeberg runs Anubis on a very low difficulty compared to most other deployments. Basically only proving that you’re running something vaguely browser-shaped.
Accelerator-resistant proofs of work exist. Anubis will probably head down that route.
…and then we'll end up in a situation where the only software that can solve the challenge is the bots, because they have infinite amounts of money to throw at it, while the average human with a 10-year-old phone is not going to wait 5 minutes with a flaming brick in their hand.
We’re getting there already. I needed to use the “live chat” on a retail site a couple of months ago. Since it’s by necessity unauthenticated they’ve gone hard on the PoW. My 9yo laptop had to grind for anywhere up to a minute per sent message.
Yes and no. “CPU-shaped” and “GPU-shaped” proofs of work do exist. But distinguishing between JavaScript-shaped PoW and C-shaped PoW is likely going to be much harder; what’s fast in one is generally going to align with what’s fast in the other (with native code always having a baseline upper hand).
Couldn’t Anubis use WebGPU compute shaders to accelerate the computation as well? Wasm also has SIMD extensions, which browsers can accelerate on setups where WebGPU is not available.
EDIT: Seems like it already uses the WebCrypto API when supported, which should be pretty fast, I guess.
WebGPU could indeed go fast, but it is not available on my platform.
Even if you hooked the webcrypto API up to a GPU, it would still be slow in comparison. The API is the wrong “shape” to make use of GPU-level parallelism.
Anubis doesn’t use WebCrypto on Firefox, even though Firefox supports WebCrypto - because it ends up slower than pure-JS for this use case, for whatever reason: https://github.com/TecharoHQ/anubis/blob/fb8ce508eebb3e652704ffdc22240cdcf9193975/docs/docs/CHANGELOG.md?plain=1#L23
WebGL can be another alternative, but you kinda have to emulate compute with textures. WebCrypto being slower than pure JS on Firefox sounds really weird indeed.
Given what I’ve seen shader wizards do in vrchat, emulating compute with textures certainly doesn’t seem impossible!
What have you seen?
Probably referring to the RISC-V emulator implemented as a HLSL pixel shader for VRChat (lobste.rs discussion, where you even commented :p).
Multiple emulators, and impressive stage lighting setups that emulate DMX512 over Art-Net for control.
Using WebGPU would make it even faster on already fast machines with modern desktop GPUs which support WebGPU. It would do nothing to improve the situation where Anubis’s performance is actually an issue, which is older machines, especially phones, which probably don’t support WebGPU.
WebCrypto is still CPU, not GPU. That's the several-orders-of-magnitude difference that motivated the author. WebCrypto vs. JS hashing is likely much closer than the quoted kilohash vs. mega/gigahash difference.
Yeah, my point with WebCrypto was that since it's implemented by the browser, it can use more efficient native code than JS; weirdly enough, it seems that's not the case.
> since it’s implemented by browser, they can use more efficient native code than JS
My first guess as to why is that the WebCrypto APIs all return promises. So if you call them in a loop, you hit the browser's event loop every iteration. On my machine right here, `await Promise.resolve()` in a loop takes about 50 ns per iteration.
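That per-await cost is easy to measure directly. Here's a rough Node.js sketch (the ~50 ns figure above will vary by machine and engine; `process.hrtime` is a Node-specific assumption):

```javascript
// Rough measurement of the microtask-hop cost of awaiting an already
// resolved promise once per loop iteration, as a WebCrypto-based hash
// loop must do. Absolute numbers vary by machine and JS engine.
async function measureAwaitOverhead(iterations) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) {
    await Promise.resolve(); // one microtask hop per iteration
  }
  const elapsedNs = Number(process.hrtime.bigint() - start);
  return elapsedNs / iterations; // nanoseconds per awaited iteration
}

measureAwaitOverhead(1_000_000).then((ns) =>
  console.log(`~${ns.toFixed(1)} ns per awaited iteration`)
);
```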
Right. The overhead of bouncing between the web-page context and the built-in “worker” cancels out the efficiency of the native implementation. WebCrypto was designed with “normal” use in mind, so it doesn’t include “just iterate sha-256 many many times” as a unit of work.
Oh good, I didn’t consider the consecutive jumps out and into the event loop. Thank you both.
If the goal is to avoid GPU acceleration, maybe Anubis should do PoW using memory-hard hash functions like argon2id, so that GPUs gain little from their parallelism?
This is a really good suggestion! I would also propose using chained hash functions (i.e. one applied after another), especially newer hash functions that don't have dedicated hardware acceleration.
Spiffy, I was thinking about doing this as a Safari extension but didn’t know what JavaScript goo to put in the userscript!