anubis_offload: userscript to offload Anubis PoW to native CPU or GPU code
72 points by thombles
On the one hand, this seems… inevitable. But escalating this cat-and-mouse game also means that difficulties are going to be ratcheted up for everyone, eventually making the web unusable for anyone without these accelerator setups.
…I wonder if this will be what finally kills the mobile web? (Not even necessarily the performance impact itself, but the battery cost it necessarily imposes as well.)
Proof-of-work hashing is nothing compared to what legacy JavaScript frameworks have forced mobile browsers to suffer. Anubis should replace PoW with Proof-of-React; churn through the latest React boilerplate. The mobile web has endured a decade of that already.
You know, I kinda hate this enough that I may go implement it just to see how it fails in practice.
I implemented it on a plane. I’ll push the branch when I can.
Here’s the PR: https://github.com/TecharoHQ/anubis/pull/1038
I doubt many attackers are solving the PoW, and once they are, it's game over for the approach. Anubis works because it's an unexpected obstacle to a dumb bot, not because of PoW.
There are signs at least some of the bots have already levelled up https://social.anoxinon.de/@Codeberg/115033790447125787
That seems to confirm what I'm saying. The moment the kids bothered to attempt the PoW, they could do it at scale, and it didn't slow them down enough, so IP range blocks were needed.
The bots are not run by kids, the attacks are organized by AI companies, they use botnets built by adware companies, and they are supported by billionaires who are ruining everything.
For what it’s worth, Codeberg runs Anubis on a very low difficulty compared to most other deployments. Basically only proving that you’re running something vaguely browser-shaped.
Accelerator-resistant proofs of work exist. Anubis will probably head down that route.
…and then we'll end up in a situation where the only software that can solve the challenge is the bots, because they have infinite amounts of money to throw at it, while the average human with a 10-year-old phone is not going to wait 5 minutes with a flaming brick in their hand.
We’re getting there already. I needed to use the “live chat” on a retail site a couple of months ago. Since it’s by necessity unauthenticated they’ve gone hard on the PoW. My 9yo laptop had to grind for anywhere up to a minute per sent message.
Yes and no. “CPU-shaped” and “GPU-shaped” proofs of work do exist. But distinguishing between JavaScript-shaped PoW and C-shaped PoW is likely going to be much harder; what’s fast in one is generally going to align with what’s fast in the other (with native code always having a baseline upper hand).
Couldn’t Anubis use WebGPU compute shaders to accelerate the computation as well? Wasm also has SIMD extensions, which browsers can accelerate on setups where WebGPU is not available.
EDIT: Seems like it already uses the WebCrypto API when supported, which should be pretty fast, I guess.
WebGPU could indeed go fast, but it is not available on my platform.
Even if you hooked the webcrypto API up to a GPU, it would still be slow in comparison. The API is the wrong “shape” to make use of GPU-level parallelism.
Anubis doesn’t use WebCrypto on Firefox, even though Firefox supports WebCrypto - because it ends up slower than pure-JS for this use case, for whatever reason: https://github.com/TecharoHQ/anubis/blob/fb8ce508eebb3e652704ffdc22240cdcf9193975/docs/docs/CHANGELOG.md?plain=1#L23
WebGL can be another alternative, but you kinda have to emulate compute with textures. WebCrypto being slower than pure JS on Firefox sounds really weird indeed.
Given what I’ve seen shader wizards do in vrchat, emulating compute with textures certainly doesn’t seem impossible!
What have you seen?
Probably referring to the RISC-V emulator implemented as a HLSL pixel shader for VRChat (lobste.rs discussion, where you even commented :p).
Multiple emulators, and impressive stage lighting setups that emulate DMX512 over Art-Net for control.
Using WebGPU would make it even faster on already fast machines with modern desktop GPUs which support WebGPU. It would do nothing to improve the situation where Anubis’s performance is actually an issue, which is older machines, especially phones, which probably don’t support WebGPU.
WebCrypto is still CPU, not GPU. That's the several-orders-of-magnitude difference that motivated the author. WebCrypto vs. JS hashing is likely much closer than the quoted kilohash vs. mega/gigahash difference.
Yeah, my point with WebCrypto was that since it's implemented by the browser, it can use more efficient native code than JS; weirdly enough, it seems that's not the case.
> since it’s implemented by browser, they can use more efficient native code than JS
My first guess as to why is that the WebCrypto APIs all return promises. So if you call them in a loop, you hit the browser's event loop every iteration. On my machine right here, `await Promise.resolve()` in a loop takes about 50 ns per iteration.
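That per-await cost is easy to measure directly. Here's a rough Node.js sketch (the ~50 ns figure above will vary by machine and engine; `process.hrtime` is a Node-specific assumption):

```javascript
// Rough measurement of the microtask-hop cost of awaiting an already
// resolved promise once per loop iteration, as a WebCrypto-based hash
// loop must do. Absolute numbers vary by machine and JS engine.
async function measureAwaitOverhead(iterations) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) {
    await Promise.resolve(); // one microtask hop per iteration
  }
  const elapsedNs = Number(process.hrtime.bigint() - start);
  return elapsedNs / iterations; // nanoseconds per awaited iteration
}

measureAwaitOverhead(1_000_000).then((ns) =>
  console.log(`~${ns.toFixed(1)} ns per awaited iteration`)
);
```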
Right. The overhead of bouncing between the web-page context and the built-in “worker” cancels out the efficiency of the native implementation. WebCrypto was designed with “normal” use in mind, so it doesn’t include “just iterate sha-256 many many times” as a unit of work.
Oh good, I didn’t consider the consecutive jumps out and into the event loop. Thank you both.
If the goal is to avoid GPU acceleration, maybe Anubis should do PoW using memory-hard hash functions like argon2id, so that GPUs gain little from their parallelism?
This is a really good suggestion! I would also propose using chained hash functions (i.e. one applied after another), especially newer hash functions that don't have dedicated hardware acceleration.
Spiffy, I was thinking about doing this as a Safari extension but didn’t know what JavaScript goo to put in the userscript!