Why are anime catgirls blocking my access to the Linux kernel?

117 points by Helithumper

pushcx

Despite this post’s work to demonstrate Anubis cannot work in theory, there are many public reports that it works in practice.

And based on the name, I think it is an anime jackalgirl.

x64k

I think this post just misses two things: just how much bad crawling code there is out there, and just what a goldmine of information the Internet still is.

First, a lot of AI crawling deployment is bottom-feeding companies that don’t exactly deploy the best of crawling code. If it were just five or six Google-scale companies and a dozen or so startups doing the crawling, and at a decent rate with reasonably well-behaved crawlers, AI crawling would just add a little to the spambots baseline. But it’s not, it’s thousands entities that range from clueless corporate departments to academics and from techbro startups to former SEO agencies that are pivoting towards AI slop. If you somehow get a hold of a human at the other end, I bet 7 times out of 10 the problem is explaining them what rate limiting means in the first place.

Hell, based on some of the crawling code I’ve seen (edit: much of it basic, vibecoded crawling code that doesn’t do any kind of caching), I bet a good 10% if not more of Anubis’ impact comes just from the fact that the parser powering it chokes on the jackalgirl page. The inside-out captcha is trivial for computers to solve, as the post illustrates. But both the tech stack and the people working on it in many AI shops out there would have real trouble integrating it in their crawlers.

Second: even if they do integrate it in their crawlers, the Internet is big and Anubis isn’t that big yet. It’s still at the point where you don’t have to outrun the bear, you just need to outrun everyone else in your group. Anubis’ deployment is sufficiently low-scale that you can still crawl a lot of content without running into it, so it’s not really worth bothering with for the bottom feeders. And it’s still sufficiently niche-focused (largely tech) that competent crawler authors can do what taviso did and fly under the radar, while Anubis is happily blocking the worse, bulk offenders.
- gcupc
  
  I bet a good 10% if not more of Anubis’ impact comes just from the fact that the parser powering it chokes on the jackalgirl.
  
  My experience with captcha alternatives has been that “literally anything unexpected” will block the average spambot, and I would not be surprised if that’s true of the average crawler.
  - 0x2ba22e11
    
    I’ve seen comment sections cut their bot spam intake to zero by adding <label>What is two plus two? <input type=number name=antispam /></label> and rejecting POSTs that don’t come with “antispam=4”.
    
    kaimac
    
    The “captcha” on my guestbook is literally just an input field that says “please type hello”. Zero spam after I added that.
    
    kev009
    
    This kind of thing is well for one offs, but for instance if you made a wordpress plugin (for example of scale) where the admin can configure the string, it will probably be algorithmically defeated if it sees any significant uptake. Spam and AI crawling both seem like an arms race, and in both cases the victim has to expend more resources for no gain to tamp it down.
    
    val
    
    I had a PHPBB with a field like this, with a handful of questions it would randomly pick. Every time I changed the questions it would stop the spambots for a few days, then they’d come back. They were not bruteforcing them so there seemed to be a human somewhere recording the answers; but I don’t know why they bothered given that the forum was already effectively dead.
    
    spc476
    
    For you, it might have been dead. For SEO purposes, not so much. It probably had enough of a page rank [1] to leech some of it for their own sites/clients. I constant get emails for people wanting to advertise/guest post on my blog just because I have enough page rank to leech.
    
    [1] I’m using “page rank” in a generic sense of “how high your site is positioned on a web search” sense. I don’t think the official Page Rank of Google is a viable measure anymore.
    
    val
    
    former SEO agencies that are pivoting towards AI slop
    
    What benefit do they get from crawling and, presumably, training their own LLM?
    
    x64k
    
    Lots of what I’m generically calling “SEO agencies”, but broadly offered a wide set of packages that ultimately boiled down to “you’re going to rank everywhere” (so social media marketing and management, SEO of every hat color, content marketing etc.), relied extensively on human-written slop. LLMs killed a lot of the slop-generating part, but the infrastructure used to distribute, evaluate and manage the slop remained in place.
    
    So a few of these managed to creatively spin themselvess off into AI-enhanced marketing/advertising/whatever. They don’t specifically train their own LLMs – I mean, some do, I guess – but others just sell the training datasets that’s at least partially assembled through crawling, or provide “strategic insights” (i.e. crawl things and feed them through LLMs that extract said, erm, “strategic insights”). Or offer services centered around these, like custom GPT “development” services.
    
    I.e. they don’t crawl to get data to train their own LLMs. Some of them sell datasets from crawling, or feed it into AI tools for insights. For example – it’s not for SEO purposes but it’s a good illustration – I bet that’s why LKML gets crawled a lot, too: crawled data doesn’t go just into tomorrow’s vibecoding agent training data set, it goes into security and code quality “metrics” tools, recruitment agency AI tooling and so on.
    
    Much of it is snake oil but some companies that used to run large social media fleets are offering training data sets that are otherwise not easily available to AI shops per se, like DM data. It’s not the kind of traffic that usually hits Anubis, it’s just the route through which some of these companies managed to get into the game and now also generate some of the traffic that hits Anubis.
    
    (Edit: I guess my generic use of SEO is worth some explaining, it’s not a shorthand I should be using like that.
    
    I’m not just handwaving the SEO part. Back like 10-15 years ago, there were a lot of companies that were nominally not doing SEO and were just providing things like copywriting services or social media management. But if you crawled up the ownership hierarchy, you eventually ran into people who either owned several of these companies and a SEO business, or they were just outright running subsidiaries of SEO-oriented businesses. The weird org structure made some business sense, it kept the heavily outsourced component away from the main company and allowed them to easily “tap” into other markets that just happened to need slop as well.
    
    As people’s consumer habits began to move away from Google and shift more and more towards social media, lots of companies that had started as SEO marketing shops started to expand their offerings, and it wasn’t “just” SEO anymore. It was still a lot of slop writing, and still a lot of pushing, but it wasn’t Google search’s page ranking model they were targeting so you’d rank in Google searchis, it was Facebook, Twitter, Instagram and so on, and the point was to get you to the top of search lists, relevant hashtags, in users’ timeliens and so on.
    
    Since they already had the spinoff companies infrastructure in place, they would often do it by offloading new services to existing companies, so nominally they were still “just” a SEO shop, even though they really did a whole bunch of other things.
    
    Hence my lumping them under “former SEO agencies”. I don’t mean agencies that literally did nothing but SEO.)
    
    singpolyma
    
    I bet a good 10% if not more of Anubis’ impact comes just from the fact that the parser powering it chokes on the jackalgirl page.
    
    I expect it is close to 100%
  - jsnell
    
    Counter-abuse is not a static problem with a fixed correct solution. It is an iterated game played against an intelligent adversary. If you implement $PROTECTION to prevent $BAD_THING, people who want to do $BAD_THING will try to figure out how they’re being prevented from doing it and try to work around it.
    
    That’s really annoying, and it’s the main thing making the domain difficult. But there’s a really cool corollary, which is that if the attackers don’t react to what you’re doing, defense is really easy. If the attack is targeting a large population rather than targeting you specifically, any defense that basically nobody else has implemented will work fabulously.
    
    The anti-spam measure of my blog has been a “enter a password in this textbox, the password is …” form entry. Entirely trivial ot pass: the plaintext password right next to the form input hasn’t even ever changed! In the 20 years the blog existed, it’s blocked tens of thousands of spam messages, while letting through a couple of dozen. But that’s just because it’s a sui generis defense that’s not been worth anyone’s while to work around. If there were 100k Wordpress installations using that exact challenge it’d be a very different story.
    
    So, yes, a new custom proof of work challenge probably had an effect, but only because of some incidental second order effect that can quickly be resolved if the attackers are motivated. It’s just that they won’t be motivated unless usage of a specific system is high enough. So the actual defense of these proof of work systems[0] is almost certainly a form of security through obscurity, proof of work is just a substrate. But if you’re doing that, why not choose a substrate that’s going to be more effective or that causes less friction to real users?
    
    [0] It’s not just Anubis, there were half a dozen of them before Anubis happened to catch on.
  - cetera
    
    I seriously doubt the conclusions drawn by a lot of those success stories. That one only ran it for 1 month soon after the release of Anubis and used request rate as the only metric. Other stories came from open source projects on shoestring budgets, which generally don’t have the time, attention, or observability infrastructure to draw solid conclusions from HTTP logs either.
    
    I analyze HTTP traffic as part of my job, and even with multi-million dollar software for analytics and bot detection, it’s still a hard job to draw conclusions about which traffic is legitimate. And there’s literally no way to know what’s in the minds of the people running the scraper bots and whether they’ve been thwarted permanently or temporarily.
    
    Especially in the wake of a DDoS attack, it’s easy to think that a given captcha or firewall rule has “stopped” them, but inevitably they tweak some setting and come back. They may be bots, but there are people running the bots who may have incentives to get past your defenses. It could be that the incentives for circumventing Anubis aren’t compelling enough, but I think this blog post assures us that there’s nothing technically stopping them, since compute is cheap.
  - geocar
    
    Despite this post’s work to demonstrate Anubis cannot work in theory, there are many public reports that it works in practice.
    
    I have a few problems with the methodology on those reports, and I’m not so sure that’s the right conclusion either. When someone says “it works” what exactly do they mean? Their servers and users are generally happier after installation? Have they compared the Anubis-theory with other theories? Or did they just get the result they were expecting and move on?
    
    You know I have a lot of traffic, and since it’s material-revenue to me, I spend a fair bit of time trying to understand how I can handle it better, and I’ve seen JavaScript-simulators for adtech bot-detectors like the one described in the article over a decade ago, so every time I see one of these filters get popular, I copy the page/javascript and put a little setTimeout snuggled in there that does something different so I can detect the simulator (and distinguish it from browsers running the code-as-written) – and you know what? I get better results blocking the “valid” answers (which must come from a simulator for the popular js) than the invalid ones – and let me be clear: I am defining better results here as net-more money.
    
    Meanwhile, my experience is that people using browser-emulators (i.e. actually running the JavaScript) don’t actually tend to hammer my servers. I’ve exfiltrated a couple in my past life and so I can speculate on why that is, but if you’re just trying to minimise load, giving legitimate users work to do expecting illegitimate users to find the work more expensive than they’re willing to spend just doesn’t seem to be what’s going on.
    
    So I wonder if Anubis is actually helping those reporters, or if a JavaScript redirect that looks more complicated than it is would have done just-as-well.
  - hoistbypetard
    
    It’s the classic security vs safety question. You used to see it a lot in the late 90s/early 00s when people would argue the merits of Mac OS vs Windows for desktop computing. Some would claim that Mac OS was more secure because it saw less malware. Others would point out that malware was just as possible for Mac OS; it wasn’t intrinsically more secure.
    
    While you could quibble with either position, the fact was that even if Mac OS was just as bad (or only moderately better) in theory, it was still a much safer thing to use right at that moment, because the drive by spray-and-pray malware spotted in the wild wasn’t effective against it.
    
    This is the same thing; your expensive endpoints are going to get fewer hits from abusive bots because of Anubis. It’s like moving into a safer neighborhood; even if it’s not theoretically any more secure, especially if threat actors decide to specifically target it, you and your property are safer there, on average.
    
    It also reminds me of a question I heard in a meeting once (probably 15-ish years ago, now) when we were talking about mitigations to make exploits more difficult to write. “Does your threat model include Tavis Ormandy?” Given our goals ours did, and should have. I don’t think Anubis needs to, just now.
    
    Edit to add: A big part of how the abusive bots became a problem was because it was cheaper for them to keep crawling, over-and-over, than to save state and cache things. Despite Mr. Ormandy’s correct reasoning that defeating this challenge doesn’t cost a meaningful amount of compute, I think it does represent a targeted effort at individual sites that would be more expensive than just caching things and saving some state. And that is possibly enough of an advancement to fix the problem; if these badly behaved crawlers would cache and save state and only re-crawl when they needed to, Anubis wouldn’t be needed most of the time.
    
    fanf
    
    I commented briefly the other day about Windows boxes getting immediately pwned 20+ years ago as did ~david_chisnall.
    
    There were two things going on:
    
    Windows was more popular by orders of magnitude, so it was a more profitable target for malware. There were far more people working on malware for Windows than for other platforms.
    
    The unixes, including Mac OS X, had learned a hard lesson from the Morris Worm in the late 1980s and as a result were many years ahead of Windows in terms of not being a gratuitously soft target.
    
    Apple hired a bunch of BSD hackers who had been running ISPs and networking companies and suppliers of core internet backbone software for a decade, so at a time when Windows was losing its mind at the first sign of a TCP/IP packet, Mac OS was Not Shit as judged by unix / arpanet greybeards.
    
    Which is not to say that Mac OS was totally secure, but like your typical Linux or BSD installation at the time, it came preinstalled with a bunch of services that were turned off until you asked for them. Unlike Windows which was running about outside with its wobbly bits flapping around for all to see.
    
    The unixes were still riddled with memory-unsafe C vulnerabilities, but they were also not exposing those vulnerabilities to the network like Windows did. So they were a less attractive target for hacking, both because they were less popular, and because they were not so easily pwned.
    
    hoistbypetard
    
    The unixes, including Mac OS X, had learned a hard lesson from the Morris Worm in the late 1980s and as a result were many years ahead of Windows in terms of not being a gratuitously soft target.
    
    When I wrote my comment, I was actually thinking about the discussions of Mac OS 8/9 vs Windows 95/98… so things that were safer despite not having been informed so much by the lessons from the Morris worm. I think your comment is completely accurate starting somewhere between ’02 and ’04, but my frame of reference was just a little before Mac OS X was the norm for Mac users.
    
    david_chisnall
    
    Specifically in the case of slammer, as I recall, it was a trifecta of:
    
    SMB is on by default.
    
    The SMB protocol has vulnerabilities.
    
    The SMB protocol parser runs in the kernel.
    
    *NIX systems didn’t typically put these servers in the kernel, though they did run them as root, which was often equivalent. Samba had some wormable vulnerabilities at the same time, but most *NIX systems didn’t have Samba on and listening to the network by default. And that meant a worm struggled to spread because there would not usually be more than one or two Samba servers on a network, whereas there were dozens of Windows machines, and some were laptops that would move between networks.
    
    It was also common to restrict the Samba port to the local network. Until ISPs started blocking inbound SMB connections, most Windows machines were directly exposing the vulnerable port to the network. Back then, most home Internet connections didn’t use NAT because they had a single client and no LAN, so if Windows exposed the SMB port to the network, it exposed them to the Internet. Scanning the entire IPv4 address space is easy, so the worms would do that once they’d infected everything on the local network. Swansea at least blocked this port from the Internet.
    
    OS X and Linux at the time weren’t really doing much more (if any) privilege separation than Windows, they were mostly safer because they ran fewer things by default.
    
    mort
    
    The post, fairly convincingly IMO, demonstrates that Anubis doesn’t work by making it too computationally expensive to crawl sites. If the post is correct, the reason that Anubis works is due to some other mechanism – such as by tripping up crawler code.
    
    Irene
    
    I’m really happy I didn’t have to be the one to point out that she’s a jackalgirl. You can tell by the ears.
    
    cendyne
    
    Agreed with pushcx. There’s a reason it’s used across ffmpeg’s various properties. It stops the lazy abusers who crawl without respect.
    
    Just getting to a news aggregator like this site can take down a self hosted wordpress instance with a few plugins. The software people have out there today won’t suddenly become performant because a few hosts are saturated with the research equivalent to a script kiddy. So we have middleware like Anubis to slow things down.
    
    I encourage anyone here who cares about the success of the linux kernel and ffmpeg to fund Anubis’s development. This is a shared problem and now there’s a shared solution to it developed by someone who cares about their server not being ransacked by researchers using residential VPNs.
    
    mitsuhiko
    
    It’s not entirely surprising that works in practice because it’s such a novel kind of approach and it doesn’t protect a lot of really important pages yet. So it’s unlikely that someone who is mass crawling would custom handle the crawler for it. However, if someone were to do it, it would be very hard to bypass this, as demonstrated in the link.
  - icefox
    
    ermerghed. Once and for all, Anubis does not filter bots. It rate-limits browsers. This is good because it stops companies with more money than sense from poisoning their own damn well by desperately hammering on every single git repo diff in existence over and over again non-stop in the hope of extracting a few extra bytes of 🌈🦄content🦄🌈 to feed into their slop-machines.
    
    Do the bots still get through? Of course. But it forces the client to spend more time working on the request than the server does.
    
    kastor
    
    Once and for all, Anubis does not filter bots. It rate-limits browsers.
    
    Absolutely not, Anubis doesn’t rate limit anything: once you get past the one-time initial challenge (the “filter”), it does nothing to prevent hammering resources.
    
    But it forces the client to spend more time working on the request than the server does.
    
    Only for the very first request. After that, for the 7 following days, every call is free.
    
    Unreal
    
    Anubis is designed to rate limit bots/crawlers that change IP on every request, so it achieves its goal there. If you have a single IP, or a small batch of IPs hammering your git forge, it’s easy enough to ban those IPs. But if they’re changing IPs between tens of thousands residential IPs, from different ISPs and countries, then there isn’t much you can do.
    
    kastor
    
    Ah, yes, I forgot the threat model was bots with (nearly) one unique IP per call.
    
    An important point that the original blog post completely misses, too.
    
    icefox
    
    One that I missed as well, tbh.
    
    However at least on the version of Anubis I’m running it doesn’t save the IP’s that pass the test, it sets a cookie in the client browser. If the bot doesn’t save that cookie and return it on future requests (and why would it tbh) then it gets rate-limited every time. This is easy to verify by browsing to https://wiki.alopex.li/ in Firefox private mode; each new private tab invokes a new challenge. Or just dig into the browser inspector and delete the within.website-x-cmd-anubis-auth cookie and reload the page.
    
    This may vary with version; I haven’t dug deep into Anubis’s settings or functionality. You could easily imagine it having a fail2ban-like functionality where it invalidates the cookie every X hundred requests or something.
    
    dzwdz
    
    This is good because it stops companies with more money than sense from poisoning their own damn well […]
    
    I mean, AI companies poisoning their own well sounds like a good thing. I thought the point of Anubis was helping out with server load / blocking bad bots because you don’t want them, not doing favors for AI companies.
    
    icefox
    
    Yeah but it’s our well too. The well is us.
    
    mariusor
    
    Anubis recently started blocking how I access git.kernel.org and lore.kernel.org.
    
    More like, the kernel.org operators decided to use Anubis to try to diminish AI crawlers, it’s not like it just popped up into existence to just inconvenience OP.
    
    landon
    
    OP is talking about deed, not intent.
    
    krig
    
    Making life difficult for low-bandwidth users is definitely unfortunate, but 100% of the blame here should be placed with the LLM scraper companies like Perplexity[1].
    
    What are the potential outcomes of this arms race? It seems to me that either the AI bubble crashes so hard that constant scraping at massive scale becomes unviable, or the open internet as we know it dies. I guess for massive corporations, making it so that only massive corporations can provide website hosting looks like a good outcome?
    
    1: https://blog.cloudflare.com/perplexity-is-using-stealth-undeclared-crawlers-to-evade-website-no-crawl-directives/
    
    jzb
    
    Indeed. It really bugs me when, in situations like this, people’s ire is directed at the organizations / people trying to mitigate harm rather than the ones who are causing the harm in the first place. “Just let me browse the web without inconvenience” – we’d like to! Stop supporting companies that harm the web!
    
    Riolku
    
    I thought that the main idea is that each (source IP, anubis deployment) pair requires a new solution, and so if you combine this with IP rate limiting you get a solution that just works: crawlers now suddenly have to solve so many challenges that it’s infeasible.
    
    Or maybe that’s simply not true? Maybe the proof of work is so small that a smart crawler can still easily DDoS?
    
    cadey
    
    Kernel.org also seems to run an old version of Anubis before the challenge string was changed to be 64 bytes of random data. That would fix this problem entirely.
    
    0x2ba22e11
    
    I guess a side effect of Anubis working quite well is that there are going to be a lot of outdated copies of it in use. People won’t feel much impetus to keep it up to date if it’s just a difference between it being 99% effective and being 99.1% effective.
    
    Riolku
    
    Can you elaborate a bit on why that fixes this?
    
    cadey
    
    The PR explains it better than I can right now: https://github.com/TecharoHQ/anubis/pull/749
    
    proctrap
    
    It can be configured and IIRC anubis can tighten the work required when more access happens. Thus squeezing out the bots. Either way: I am not sure the author researched the actual deployment goal and project very well. PoW Cookies can’t even be shared between anubis restarts docs.
    
    johnklos
    
    Anubis is like the question on web forms that asks you to enter the answer to 11 + 4 before you can submit. The work is trivial. Coding a bot to recognize the question box, parse the text, calculate the answer, then enter it is much, much more work than just calculating the answer.
    
    The whole idea is that new tests will (should) come out more quickly than scraper bots can be coded to perform the trivial, but specific, work.
    
    st3fan
    
    Some modern scrapers are full browsers now so something like Anubis has no impact.
    
    mtset
    
    It has the effect of rate-limiting requests, which is what it is supposed to do.
    
    Halkcyon
    
    Only the first request?
    
    c12
    
    Yes, but the threat model is that most requests originate from a different IP address.
    
    That means the same bot can end up making a hundred requests within ten seconds, most if not all from a different IP. Anubis would therefore slow those down if they were running full browsers or a JavaScript engine for executing code, or otherwise trip them up completely if they are dumb scraping bots.
    
    AFAIK the ultimate goal isn’t to be impenetrable, but to reduce the load on the server to make it managable because so often is the case that the scraper bot traffic is no different in volume to a traditional DoS attack.
    
    BlackLotus89
    
    In my experience those are not the problem. (Disclaimer I don’t and won’t use Anubis. My own solution works better right now)
    
    Anyway 90%+ of the bot traffic ai can identify (yes I can) comes using stupid crawler that seem to be written stupidly using simple requests.
    
    fluent
    
    llm crawlers could start breaking PoW, they dont because even assuming anubis runs on 11508 sites(which is a wild overestimation) its probably just not worth the time and effort(which is kinda low, yes) to develop for the amount of training data that would be collected
    
    Riolku
    
    Err sorry I must be missing something. They claim they can mine a token for all sites instantly? But doesn’t each anubis deployment have a unique cookie that you can’t share between deployments?
    
    My understanding was your submitted solution to the challenge has to begin with the server cookie
    
    hwayne
    
    If you mean this claim:
    
    So (11508 websites * 2^16 sha256 operations) / 2^21, that’s about 6 minutes to mine enough tokens for every single Anubis deployment in the world. That means the cost of unrestricted crawler access to the internet for a week is approximately $0.
    
    They’re not saying they can mine a token for all sites instantly, they’re saying that they can solve an Anubis challenge in 0.03 seconds, which is fast enough to handle all existing deployments in six minutes.
    
    valpackett
    
    I really don’t think it’s that common to run the default difficulty. I regularly see challenges that take a few seconds on top-tier CPUs.
    
    mtlynch
    
    The author is using a SHA256 generator in C with compiler optimizations for performance. I’d expect that to have performance that’s an order of magnitude faster than what the browser can achieve in JavaScript.
    
    valpackett
    
    The browser exposes hash algorithms via the crypto.subtle.digest, which would use the native optimized version (on modern CPUs, that would involve calling the instructions that implement SHA-2 directly in hardware). Hopefully Anubis does use that!
    
    Also, JITs can emit code that is, in some cases, faster than AOT compilers. Heck, they could just detect whole SHA-2 implementations and replace them with the fast instructions. I don’t think they do that currently, but they could :)
    
    cadey
    
    Anubis does do that, but not on Firefox because the jump from JIT space to webcrypto space is very slow on firefox for some reason I can’t debug. Firefox uses a pure-JS SHA256 implementation which is approximately 0.8x as slow as webcrypto on chrome. I consider that good enough.
    
    valpackett
    
    oops! Is that reported on bugzilla? I might look into that issue. (not that I don’t have a million other things to do, but…)
    
    dbushell
    
    I was able to generate SHA-256 hashes in a Wasm implementation (compiled from Zig) at 10x the speed of Web Crypto. This was only a rudimentary test in Chromium/V8 (macbook m4). I’m tempted to do some proper benchmarks now… possible my test was bad. Web crypto is almost certainly passing off to native code in V8 not sure about a hardware instruction. There will be added cost getting that value back into JavaScript land. Wasm can use shared memory.
    
    cadey
    
    Kernel.org also seems to run an old version of Anubis before the challenge string was changed to be 64 bytes of random data. That would fix this problem entirely.
    
    icefox
    
    hmm. It would, but for some reason my brain finds it unsatisfying. This problem seems to basically be a rainbow-table attack, which is traditionally solved by a salt value added to passwords. The salt is not secret, but makes pre-computing the whole table impractical. I wonder if something like that could be done by Anubis?
    
    idk, I’m not very awake right now.
    
    cadey
    
    The fix for Anubis makes challenge strings random.
    
    gerikson
    
    Site seems unreachable due to heavy load. Bet the author wished he’d had something like Anubis now 😉
    
    friendlysock
    
    Were we for or against hugs-of-death–or is it okay as long as it happens to somebody we disagree with?
    
    gerikson
    
    I’d have thought that someone who works for Google Project Zero had a bit more nous when it comes to DDoS. If it had been someone more obscure I’d not have made the quip.
    
    I saw the site pop up on HN a few hours earlier than here and it was already struggling then.
    
    apg
    
    Author did say it was a VPS with 128MB of memory… so, pretty tiny… even if only static html.
    
    friendlysock
    
    Okay–I don’t love it, but I see where you’re coming from, having made similar arguments in the past.
    
    srtcd424
    
    I think we’re just mildly amused by the irony? (And wondering if it highlights a point.)
    
    Halkcyon
    
    I don’t think it’s ironic that a tiny 128MB host struggles with Internet-sized loads.
    
    toastal
    
    …I just can want til a noscript option exists so I don’t have to enable JavaScript for sites I don’t yet trust just to see what is linked behind the manga wall.
    
    jussi
    
    The latest Anubis version does have support for No-JS challenge.
    
    toastal
    
    Oh, I mean they have shown as much on the cream-colored wall page, but I have yet to see it deployed.
    
    xfbs
    
    On a personal note: a bit weird to see the “art”, I wish this project was more neutral. Some people like odd things, and that is fine. But doesn’t have to be the loading screen for a whole bunch of websites (including gnome.org and now the kernel pages). Makes us compsci folks look even weirder than we already are
    
    srtcd424
    
    At the end of the day the author is fully entitled to their aesthetics in their own code though? And the sites using it are chosing to stick to the defaults. Author /asks/ people not to change the art without paying, but it’s open source so presumably trivial if the sites using the code thought it was important.
    
    I’m a bit of a grumpy old man in lots of ways, but ‘politically’ have a lot in common with the younger generation of devs even if my habits and tastes differ. I’ve never ‘got’ anime myself, but I’m beginning to find it strangely reassuring to see the Anubis mascot when I hit a site. Makes me feel like I’m on my home territory :)
    
    nonagoninf
    
    ¯\_(ツ)_/¯, it’s fun/playful, I don’t see the harm. If you are a ‘serious’ company, you can always fund development and get different artwork:
    
    https://anubis.techaro.lol/docs/admin/botstopper
    
    hwayne
    
    I really hope this business model words for @cadey and that more open source projects try to adopt it if it does.
    
    jzb
    
    I like the mascot. It’s not even 10% as weird as many of my friends in tech, and frankly – people need to get a grip and learn to accept harmless weirdness. People who are getting upset about seeing an “unprofessional” character on open-source projects, in particular, really don’t have a clue about the people who’ve carried open source this far.
    
    Me? I wish people would spend less time complaining about art and more time opposing resource-gobbling, unethical businesses that have made Anubis necessary in the first place.
    
    dmbaturin
    
    It used to be celebrated even in the Silicon Valley space in the early days of the Internet and sterile designs seems to have come with the companies that eventually enshittified the web, now that I think of it…
    
    yshui
    
    We need to normalize the weird, not hiding from it.
    
    gcupc
    
    Indeed. Don’t kill the part of you that’s cringe, kill the part of you that cringes.
    
    Aks
    
    I am cringe, but I am free.
    
    mtset
    
    I truly do not understand this. Every company has a logo; many have mascots. GitHub had Octocat, Duolingo has the demon owl thing, Reddit has a weird alien, Duracell and Nesquik and Geico and Cheetos and Kellogg’s Frosted Flakes and MailChimp and Toys ‘R’ Us and Froot Loops and Borden Dairy and StarKist Tuna and freakin’ Disney have anthropomorphic animals as their logos or in core marketing material.
    
    What’s different about Xe’s anthropomorphic jackal?
    
    pgeorgi
    
    What’s different about Xe’s anthropomorphic jackal?
    
    It vaguely (or not so vaguely) points to anime style and to furdom, which some people (from sexually repressed subcultures) associate with sexuality in the public sphere, and to them, that mustn’t be. It’s their own personal problem but that type is used to making their problems everybody else’s problems, so here we are.
    
    Now I kinda want to run Anubis with a Kellogg’s Frosties tiger-styled picture set, just to see if that suddenly makes Anubis “good”. I mean, that fella is also basically a furry and (unlike Anubis’ mascot) wears nothing more than a scarf.
    
    jaculabilis
    
    In many circles, anime is viewed negatively, and the Anubis mascot is depicted in an anime style. Octocat, the Duolingo owl, the Reddit alien, the Nesquik bunny, cereal mascots, etc. are not “anime-coded”, so to speak, and consequently do not have that negative association.
    
    Probably, some people annoyed by Anubis are proximately annoyed by the inconvenience, and merely target the mascot as an outlet for that. They would be equally annoyed if it had a western animation-coded mascot, they’d just target the annoyance at something else.
    
    cadey
    
    TBH I’m pretty sure it’s because I’m transgender and people are trying to use anything they can to discredit me because of it.
    
    jaculabilis
    
    While I’m sure some people are doing it for that reason, that fact about you is not displayed on the Anubis landing page or the project website, so it’s probably not where most of it comes from.
    
    mtset
    
    This is, indeed, what I’m hinting towards. I try not to say it out loud because the moderation staff of this site is, historically, uninterested in allowing those conversations to happen.
    
    xfbs
    
    I would have the same opinion if you were regular gendered and the mascot were a furry, or a barbie doll. I just like neutrality, it means that nobody can be offended or annoyed because they dislike something.
    
    But this is your personal project, so you should theme it how you want. That is your art, your preference. Maybe it should be the operators of the sites who could pick something more neutral or theme it in auch a way to look, idk, boring? Boring is good.
    
    friendlysock
    
    Comment removed by moderator Irene: made a queermisic insinuation of pedophilia, and refused to apologize
    
    mtset
    
    What exactly are you implying, here? I think I get it, but I could be wrong, so please, clarify.
    
    friendlysock
    
    Exhibit A: sample anubis image
    
    Exhibit B: Wiki entry for moe
    
    I’m not implying anything: I’m saying that for many users having a surprise interjection of moe jackalgirls is at best a jarring surprise from what they’re expecting, and that (combined with @jaculabilis’ observation elsewhere of inconvenience) is more than sufficient as a mechanism of action without reaching for transphobia.
    
    mtset
    
    A jarring surprise, sure. (The artist talked a bit about that on HackerNews.) What I don’t understand is why you said “coded as underage.” Is the character a child? I’d believe it, but it doesn’t read that way off the dome to me. And “underage” for what? She’s not depicted with a alcohol or drugs in any of the images in the repo.
    
    swifthand
    
    There is a divide between people who are are familiar with anime-esque art, and people who are not, when it comes to the neotenic features that are common to the style (e.g. huge eyes, large head:body ratio). Especially the so-called “chibi” or “super deformed” variants of the style.
    
    For people who have absorbed variants of the style for years, they look at the Anubis character and think “this is a stylized drawing of a character who might be of any age”, whereas people unfamiliar with the style would think “only a child has a head that large” at first glance.
    
    I am sympathetic to both interpretations.
    
    cadey
    
    She’s in her mid twenties.
    
    Irene
    
    Hey all,
    
    The mod team is well aware that attacks on the art style of someone’s project are often proxies for attacks on their queerness. It’s difficult, in general, to prove intent about something like that, and so we err on the side of making mod decisions on the basis of things we can be confident of, but let nobody think that we don’t see what they’re doing. Discussion of the art style of a technical project are already off-topic per a highly similar situation four years ago.
    
    In that spirit, then, of focusing our enforcement on things where there’s clear intent: his “coded as underage” wording is a veiled insinuation of pedophilia. Because it’s ambiguous, it likely doesn’t rise to the level of being legally actionable defamation, and because it’s clear that the only basis of it is that you don’t like the art or the person who commissioned it, it likely also doesn’t rise to the level of requiring us to notify law enforcement about your supposed suspicion, but these are things we assess every time something like this happens. If mods had any reason at all to think you were basing your ambiguous almost-accusation on anything of substance, we would have no choice but to get authorities involved, because that’s what it means to run any sort of social venue, online or off. I think you may feel as if your words have no consequences, perhaps because of some false idea that the internet isn’t real life, but I can assure you, they do.
    
    For anyone who lacks context: Unfounded accusations of pedophilia have long been thrown at queer people. This comes in a context where certain alleged democracies have been attempting to cast any form of queerness as inherently sexual and inherently pedophilic, with the likely end-goal of treating all queer people as criminals. Spreading this rhetoric actively helps that agenda. Again, we don’t know what’s in anyone’s hearts, but there is really no other way to take the remark. Lobste.rs is not a place for promoting hatred.
    
    Additionally, @friendlysock, you seem to know that there is no evidence for your insinuation, since your attempted clarification dropped the claim without mentioning it, even after you were asked again. Even if it was said in the heat of the moment, and even if you were simply imitating a form of hate that you’ve seen other people express without fully thinking through its implications, this rises to the level of requiring a public apology. The mod team has talked this over and, while we have various perspectives, we are all agreed that you need to apologize to @cadey so that we can all move on from this.
    
    valpackett
    
    I think one part of it is that mascots haven’t been there in proxy interstitials before.
    
    Until now it’s been kind of unexpected that a service page delivered by networking software in place of the actual domain’s usual content would have a strong identity of its own.
    
    kyrias
    
    I wonder whether these same people would be equally upset if Cloudflare were to adopt a mascot and include it on their interstitials though.
    
    x64k
    
    I find it oddly satisfying to watch so many of the people who scoff at Anubis’ unprofessional mascot go and push code on Github, whose mascot is the goddamn octocat. Everything’s unprofessional right until it’s corporate branding.
    
    There’s an analogy to “the media is the message” to be drawn here, I’m just not sure which one yet.
    
    jjuran
    
    Github, whose mascot is the goddamn octocat. Everything’s unprofessional right until it’s corporate branding.
    
    Before GitHub was acquired by Microsoft, they had a mockup of the Oval Office at their HQ for no apparent reason, hosted “drinkup” events (encouraging alcohol consumption) on a regular basis, and described cloning a repo within the service as “hard-core forking action”. The octocat seems quite unremarkable by comparison, IMHO.
    
    Aks
    
    But we are weird. A lot of compsci folks, like me, are weird autistic queers. And I’m all for embracing that.
    
    xfbs
    
    For personal projects, definitely. Be weird. For the linux kernel pages, that “regulars” and people of various cultures and backgrounds access? Stay boring. Imho
    
    Aks
    
    Why?
    
    xfbs
    
    Because while we can embrace weirdness in personal project, we should be inclusive in group projects. Around the Linux kernel there is a hub of people of many different backgrounds. There are autistic people, queer people, gun nuts, people from different cultural backgrounds. People have different tastes, traumas and dislikes. By staying neutral, we are accepting of everyone and can focus on the kernel itself.
    
    alexandria
    
    Ok but… the only people who have a problem with openly autistic queer, are… bigots. I do not think bigots should be included, and I think not including bigots, is a positive step for the health of a project.
    
    xfbs
    
    I never said anyone has a problem with openly autistic or queer people. I basically said, inclusivity means neutrality. It means not picking sides. It means judging submissions based on their content, and not on the personal life of the person submitting it. Putting catgirls on your website is not inclusivity.
    
    It is also not definitely exclusionary, but it could be interpreted as a bias towards a specific subculture (anime lovers), to the exclusion of people who happen to dislike that subculture (for various rational or irrational reasons).
    
    alexandria
    
    I never said anyone has a problem with openly autistic or queer people.
    
    But uh, contextually you did. The subthread you’re replying to here is
    
    But we are weird. A lot of compsci folks, like me, are weird autistic queers. And I’m all for embracing that.
    
    it’s difficult to see what telling someone who is saying “I am a weird autistic queer and I want to embrace that and show it to other people” to uh, “stop being weird”, means other than “stop showing your queerness and your autism”, because “By staying neutral, we are accepting of everyone and can focus on the kernel itself.”
    
    So let’s step through this logically — who would not be accepting of queerness, that we would be including here, if we’re pushing visible queerness out of the linux kernel project because “we are accepting of everyone”. The answer to that is very clearly, “people who are uncomfortable with queerness”.
    
    xfbs
    
    I think you misrepresent my opinion.
    
    I am accepting of queerness — it is not something you choose. But just because I accept it, does not mean that it should be made visible in software projects unrelated to queerness.
    
    I am accepting of autism. There seems to be a lot of autistic people in computer science. I may be a part of that group. But still, we don’t need to come up with autistic logos and art and put them on software project websites. We can be happy that we are unique.
    
    I am accepting of other religions. But just because I accept them, does not mean that their religious symbols should be presented in software projects.
    
    I am accepting of vegans. But just because I accept them, does not mean that I need vegan memes in my inbox when I subscribe to mailing lists of public software projects.
    
    My point is that there are a lot of ways of life, minorities and such. But acceptance ≠ visibility. And the neutrality point is, if we make one marginalized group visible, then in some ways we are kind of excluding the rest (are hindus worth less than queer people? what about ukrainians? should furries also get a mascot on the kernel website?). So the only way to be truly accepting is to not take sides, be neutral. See a person for their knowledge and contributions, and not their personal life or ideologies.
    
    That, to me, is true acceptance. When you don’t make it a big deal. I get that in the current landscape, there is a lot of promotion of certain subcultures, but many people that I talk to in private say “we don’t really want this, we just want to live in peace and blend in”.
    
    Also, my core criticism of the art has little to do with autism or queerness. But yes, in regards to this “weirdness” (which may not be the right word, more like “nonconformity” or “uniqueness”) is that one should embrace it, on personal projects (as this author has done, which is fine) or communications. But for popular software projects, keep it neutral. Makes sense right?
    
    edk-
    
    I can’t follow your reasoning. “not picking sides” and “judging submissions based on their content” would apply to… submissions. They say nothing about having some art on your website. You seem to think displaying anime jackal girls is not inclusive because some people don’t want to see an anime jackal girl. Do you feel the same way about all other public art? kernel.org and many other Linux websites prominently feature a cartoon penguin, is that also not inclusive?
    
    xfbs
    
    You make a good point, I think I wasn’t quite clear. Sometimes it is difficult to narrow down what you are trying to say (but trying to do so is a good exercise, so I enjoy it anyways).
    
    I can’t follow your reasoning. “not picking sides” and “judging submissions based on their content” would apply to… submissions. They say nothing about having some art on your website.
    
    Correct, so one thing I am trying to say is: public software projects should center around the code itself, and reduce the amount of distractions. As in, reduce anything that could be controversial. Which is, generally, the case. I see a limited amount of GitHub projects that have statements about something unrelated to the code, for example expressing support for Ukraine. And, while I also support Ukraine, I would not use a large open-source software as a platform for that, for the reasons I have laid out before. That, in my opinion, is something that should live on a personal GitHub profile page.
    
    You seem to think displaying anime jackal girls is not inclusive because some people don’t want to see an anime jackal girl. Do you feel the same way about all other public art? kernel.org and many other Linux websites prominently feature a cartoon penguin, is that also not inclusive?
    
    What I am saying is this: many open-source software projects have some kind of logo or mascot. Generally, these fall into one of three categories:
    
    Typography-based logos (think nginx, curl)
    
    Abstract logos (can’t think of examples, but like, geometric shapes)
    
    Animal-based logots (think GitHub octocat)
    
    These are quite uncontroversial. The aim with these is recognizability, it is part of their branding. It is not impossible to offend someone with these (certain geometric shapes could have specific meanings in other cultures, or animals can have certain connotations) but it is less likely.
    
    But there are certain classes of mascots or art that (can, to some people) have other connotations. For example, anime culture also has very sexualised content (I believe it is called hentai, but I am not very knowledgeable about this subculture). Furry art, which I would roughly describe as humanistic animal cartoons, have a similar connotation (this is something I learned from furry friends, who have gone into great detail outlining how that works and how they practise it, but again I am not a subject matter expert).
    
    Even if not every anime character or furry art depicts this, or is intended to depict this, there is still the connotation of it. And I would consider that a distraction. You can easily imagine how some people might be put off because of that. It could even go so far as to people having trauma relating to that (for various reasons, that I don’t want to outline in detail, at least not in public here, but feel free to DM).
    
    Does that make more sense?
    
    edk-
    
    Thank you, I do understand you better now, although I don’t entirely accept everything you’re saying. In particular, you say that “public software projects should center around the code itself, and reduce the amount of distractions” but not why that should be so. I happen to think it’s fine if people add a bit of personal flair to their work. It won’t always be to everyone’s taste, it won’t always be to my taste, but… not everything has to be? I don’t mind giving people some room to be themselves.
    
    As for using a software project as a platform: I don’t know that this has much bearing on the anime jackal girl, which looks to me like simple self-expression rather than activism, but for what it’s worth I think some causes are absolutely worth doing this for. I don’t think there is such a thing as passive support. I’m not sure on what basis I could be convinced that software projects ought to be exempt from talking about things like invasions and genocides; you’re welcome to explain, but I suspect we’ll have to agree to disagree. (To be precise, in case you do want to reply to this, the arguments I think are missing would be that 1. software projects should reduce the amount of distractions, morally speaking; 2. this pursuit is unconditionally more important than protesting against crimes no matter how awful.)
    
    More generally I think calling crimes against humanity controversial and refusing to talk about them serves the interests of the crimes against humanity doers.
    
    Back to the jackal girl though, you make a distinction between art that is generally able to be controversial and art that is not. I don’t deny that this has a basis in reality. But I think that a lot of the controversy comes from people who are being, at best, mildly prejudiced, and if you pander to narrow-minded bigots you are picking a side.
    
    As for the sexual angle to this genre of art: pretty much every art form plays host to some porn. Might that make some people uncomfortable? Yes! But it doesn’t mean we should hide all art away from public view—nobody would even think of suggesting that—so why would it for this specific form? I think you would need to do quite a bit of work to prove anime furry PTSD specifically is a problem worth banning a genre of software mascots to avoid.
    
    pgeorgi
    
    By staying neutral, we are accepting of everyone and can focus on the kernel itself.
    
    This “neutral” you refer to doesn’t exist:
    
    Communicating in English on the list is a choice to the exclusion of everything else.
    
    Showing each contributor’s name is a choice - patch contributions could be treated as “blind auditions” to increase neutrality.
    
    Proudly wearing the corporate email domain in submissions is a choice, where folks (and their contributions) from CCP-controlled-huawei.com might be treated differently from I-have-a-10%-stake-in-them-intel.com folks (and their contributions).
    
    xfbs
    
    Using english is the language that excludes the fewest people (because it has the most speakers), which makes it a natural and neutral choice.
    
    Showing contributor’s names has little to do with neutrality and more with recognizing their work.
    
    Not sure about the email stuff, most people just use whatever email they happen to use. Neutrality means that you judge the submission and not the person (you do not take a bias from the person’s company, for example).
    
    I don’t quite understand what you are getting at
    
    Aks
    
    If someone can’t tolerate queer or autistic (traits one can’t change, nor should!) then they are not people who can work together well with anyone.
    
    xfbs
    
    This is not about intolerance. We are lucky to live in societies that are very tolerant and it should not matter who people are attracted to or if their brains work differently. It is fine if people are “weird” (I dont mean that in a derogitory way, I embrace my weirdness).
    
    The point I was making is that we are all weird, in different ways, and that is fine. But in order to be open for all kinds of other weird people, the projects we work on together should be neutral. So we dont “choose sides”. Or exclude people that are differently weird. Does that make more sense? It is a bit of a philosophical point.
    
    I don’t want some project to suddenly embrace 4chan memes just because they love it and “inclusivity”. I want neutrality. Anime, mangas, memes, furry drawings all those are fine but not something I wanna see in a project I contribute to (those are more for discord servers and such).
    
    ClashTheBunny
    
    What’s weird about the logo?
    
    I’ve always wondered if this is a classic “affirming the consequent”: “I’ve heard of a cat girl kink, I see a cat girl, therefore it’s a kinky thing”? Or “I’ve seen furry folks dress like a cat girl, I see a cat girl, so this is a furry thing”?
    
    Or is it that there is something inherent about cat girls that is weird? Like cat girls are sometimes dehumanized, being either anthropomorphic and treated like objects or enslaved because they’re pets, but because they’re often girls, it’s kinda treating women like slaves or objects?
    
    Or is it just “this is a subculture, people make fun of subcultures, I don’t want to be judged like that, especially when it isn’t my subculture”?
    
    fiatjaf
    
    Would you be ok with it if it was the Pepe the Frog?
    
    mtset
    
    Are you saying that people dislike the Anubis mascot because they think it’s associated with a hate group? Because that’s why people dislike Pepe the Frog.
    
    ClashTheBunny
    
    I think I’m explicitly asking for the reason people feel uncomfortable. Making another metaphor, like “this is like Pepe the Frog”, is just muddying the water. I can then argue against why I wouldn’t want Pepe to show up on my screen for 5 seconds, but that doesn’t argue against Anubis’s mascot. We’re specifically creating straw-person arguments and arguing about them.
    
    iggle
    
    Comment removed by author
    
    fiatjaf
    
    I started running my own work-in-progress git server some months ago and almost immediately started getting bombarded by queries on the same repos – not even all repos, just some. And I don’t think that was even linked in any obvious places like package manager directories or anything like that.
    
    One of the first projects to be discovered by the bots was a fork of a Flutter app that was on GitHub. My fork was unfinished and contained some small changes plus changed the app name to begin with. And yet it got crawled over and over.
    
    My questions are:
    
    What is anyone gaining from that?
    
    If the original project had already been crawled does that mean that my fork would be added as +1 with an almost exactly equal codebase to the LLM training set, making it look like there exists more Flutter apps with those exact patterns in the world than there really are?
    
    What if I had made not one fork, but 99999999 forks, all keeping the same codebase and just changing the name?
    
    srtcd424
    
    What if I had made not one fork, but 99999999 forks, all keeping the same codebase and just changing the name?
    
    Hmmm… a deliberate LLM tarpit targeted specifically at crawlers trying to steal code, generating nonsense repos in real-time in response to requests? Then feed info on the source into some sort of dynamic centralized block list, like the spam traps of yore?
    
    x64k
    
    I like how we went from “don’t run random code you find online” to “running random code you found online as a service” in like thirty years :).
    
    k749gtnc9l3w
    
    We have long been at «is it illegal not to run random code you found online» stage by now (see also: parallel thread on adblocking lawsuits)