combat LLM spam by building a web of trust
44 points by op
I have a better, simpler idea: enact strong anti-LLM policies, and enforce them. Also, move off of platforms that encourage LLM use or are pro-AI, such as GitHub.
Is it 100% effective? No. Some may try to hide their LLM use, but... it usually comes out, and then the banhammer is flung. They'll learn very quickly not to do that. Chances are, many will look at the policy and decide not to try, because their LLM-induced dopamine rush would be instantly killed by the stress of fearing they'll get caught and banned.
And if it's a corporation pushing the LLM spam? Ban the entire corporation. If you self-host your forge, ban their corporate network on the firewall.
Problem solved, and we don't need to introduce a silly system that penalizes first-time and drive-by contributors. As with crawler defenses, proof of work ultimately hurts the little people, and vouching is proof of work. Don't hurt the little people. Hurt the baddies, and you can do that more effectively with a strong and swiftly enforced anti-LLM policy.
The goal of Tangled sounds more like avoiding spam than avoiding LLMs altogether.
Specifically,
To ease this burden, maintainers from across the Tangled network can now vouch for or denounce contributors that misuse these tools and create a maintenance burden.
Emphasis mine.
Is there any other spam (when it comes to contributions to a project hosted on a forge) than LLM spam, though? :)
Ok, fine, there's the kind of spam that advertises scam services in issues and whatnot - closing open registration usually takes care of most of that.
There's also all the low-effort drive-by PRs and "doesn't work"-style issues we've been suffering with for years. I've never gotten LLM-related spam, but I do get these other kinds.
These don't benefit as much from trust signals, though. If it's a bad report from a human, you see it right away. If it's a bad report from an LLM, it can take much longer to understand that it's wrong because it appears detailed and insightful.
You could use these signals to outright ban trolls / malicious users, but this gives them an oracle; if they want to keep wasting your time, they will just create a new account.
But this thread underscores why the entire scheme is tricky. Trust isn't binary. Some projects will ban AI contributions. Others won't mind as long as the contributions are high quality. A vibecoded project might vouch for low quality AI submissions, and since vibecoded projects are easy to crank out, they might end up being the majority. Unless you have some centralized, uniform policy enforcement, how do you stop that?
You only see vouches made by you or by accounts you've vouched for, so this scheme already accounts for that. As the maintainer of a project that allows LLM contributions, you aren't gonna see the denouncement someone got for an LLM PR to another project that bans them, and vice versa.
You'd need a whole new platform for that, and I don't think it would be effective. Many projects are ok with LLM submissions. Many devs are ok with using LLMs or not depending on the project. And risking a ban on a platform just because someone wants to run an LLM witch-hunt sounds counterproductive. You can already see occasional incorrect accusations that an LLM wrote a post, both here and on the orange site. Having to deal with that in PRs would be really sad.
The stated goal here is to avoid "spam", not "LLM".
vouch for or denounce contributors that misuse these tools and create a maintenance burden
(emphasis mine)
This system could also be used to avoid contributors that create a maintenance burden the "old fashioned way". It seems like a more advanced version of the "commit bit".
I think this is the answer for when someone has broken policies, regardless of which specific policy they broke.
If you have an anti-LLM policy, you can enforce it with this.
If you have an anti-harassment policy, you can enforce it with this.
If it's not gated on someone submitting a PR to your project specifically, I predict this will become useless at best, toxic at worst. Some people will just mass-denounce users who ever used LLMs. Then the brigading for other reasons will start. This is the common path of moderation systems and I don't see anything counteracting this behaviour here.
Edit: To be clear, I'm not opposed to the idea itself - the web of trust is cool. But this project talks about the technical side only, not social. If someone tries to create a moderation system and doesn't have a huge "how will this scale without abuse" section with outcomes baked into the system, they're in for a surprise.
You only see vouches made by you or by accounts you vouched for, so this scheme already accounts for that: you won't see these mass denouncements from users you don't know.
I'm afraid of this happening too, but I guess we'll have to wait for the first high-profile cancellation to know for sure or not.
there's a concept of decay built in, which is one step in the right direction. and right now it doesn't control policy. this is starting with a social motivator to address a social issue, kinda neat experiment and clever design.
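for anyone curious what that decay could look like mechanically, here's a guess (not tangled's actual scheme): weight each vouch by its age so stale trust fades out.

```python
# Hypothetical exponential decay of a vouch's weight over time;
# the half-life is made up for illustration, not taken from the project.

HALF_LIFE_DAYS = 180  # assumed value

def vouch_weight(age_days: float) -> float:
    """Weight halves every HALF_LIFE_DAYS, so stale vouches fade out."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS)

print(round(vouch_weight(0), 2))    # 1.0  - fresh vouch
print(round(vouch_weight(180), 2))  # 0.5  - one half-life old
print(round(vouch_weight(720), 2))  # 0.06 - mostly faded
```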
The problem is that the system doesn't control which issue it is and crams them all under the same label. For one person the problem is going to be all low quality stuff, for another it's any LLM use, for another it's trans developers.
Additionally, there are no consequences for a denounced user. Only a hat.
what's the point then? I still have to process the PR
I'm guessing this is just a starting point to see how the system works, with additional functionality (e.g. blocking people based on their trust level) being added later.
this might be something i add in later! i think to begin with, i'd like to test out the thing @yorickpeterse says. down the line, i'd want to let users choose their "reactions" to denounced users: be it blocking, de-prioritizing, or whatever.
Does anything prevent me from spinning up a few different domains with a million users apiece to vouch for each other? Then other people can just buy difficult-to-disentangle reputations from me.
I'd prefer something more like lobste.rs' invite-tree model, 'cause if someone starts abusing it then it's easier to just snip off an entire subtree. It also grows more slowly, which IMO is a feature.
I like the human.json model (https://codeberg.org/robida/human.json), especially how it is visualized in the extension. It finds the shortest path from a site you trust to the one you're viewing, color-coding the distance and showing the route.
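My guess at the mechanics (not the actual human.json code) is that it amounts to a breadth-first search over the vouch graph, with the hop count driving the color:

```python
# Rough sketch of how a human.json-style extension might compute the
# shortest trust path from you to a site; all names are hypothetical.

from collections import deque

def shortest_trust_path(me: str, target: str, vouches: dict[str, list[str]]):
    """BFS over the vouch graph; returns the route or None if unreachable."""
    queue = deque([[me]])
    seen = {me}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path  # distance = len(path) - 1, used for color coding
        for nxt in vouches.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # out of network: "Not participating in human.json"

vouches = {"me": ["alice"], "alice": ["blog.example"]}
print(shortest_trust_path("me", "blog.example", vouches))
# -> ['me', 'alice', 'blog.example']
```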
For human.json, presumably no one would vouch for any nodes in your network, or you'd have so few inbound connections that the distance would be large. So it's not so much that you can't get sites into the network, but that untrusting and withdrawn vouches would bounce you out quickly. How it works in practice is TBD.
Oh thanks for pointing that out, I added it to my site.
Trying out the extension for the first time, on your site it says "Not participating in human.json", is that me or do you see that as well?
A hat appears over a user only if you have directly vouched/denounced them, or if somebody you have vouched for has vouched/denounced them.
The labeling is per-user, so as long as you only vouch for people you trust not to vouch millions of random people, your vouches will be totally unaffected by any sybil-type activity.
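To make that rule concrete, here's a minimal sketch of the filtering a client could do; the names and types are hypothetical, not Tangled's actual API.

```python
# Hypothetical sketch of the visibility rule described above: a signal is
# shown only if it was made by you, or by someone you directly vouched for.

from dataclasses import dataclass

@dataclass(frozen=True)
class Signal:
    author: str   # who vouched/denounced
    target: str   # who they vouched for / denounced
    kind: str     # "vouch" or "denounce"

def visible_signals(me: str, my_vouches: set[str],
                    signals: list[Signal]) -> list[Signal]:
    """Return only signals authored by me or by accounts I vouched for."""
    trusted_authors = {me} | my_vouches
    return [s for s in signals if s.author in trusted_authors]

signals = [
    Signal("alice", "spammer", "denounce"),  # alice is in my circle
    Signal("botfarm", "bot-1", "vouch"),     # sybil ring I never vouched for
]
print(visible_signals("me", {"alice"}, signals))
# -> only alice's denouncement; the botfarm's mutual vouches stay invisible
```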
It would be nice if there were a petnames-like UI layer, so you could see e.g. "vouched for by X, Y, Z" either inline, or on hover.
It would be really interesting to see a "vouch" model that scores vouches based on how distinct they are (using the lobsters-style invite tree).
For example, if someone gets 100 vouches that all share the same ancestor, versus someone who gets 5 very distinct vouches, those 5 would count for more?
I wonder how much that could combat "reputation farming".
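A minimal sketch of that scoring, assuming a lobsters-style invited_by map (all names hypothetical):

```python
# Hypothetical sketch: score vouches by how many distinct invite-tree
# subtrees they originate from, rather than by raw count.

def root_of(user: str, invited_by: dict[str, str]) -> str:
    """Walk up the invite tree to the user's top-level ancestor."""
    while user in invited_by:
        user = invited_by[user]
    return user

def distinct_vouch_score(vouchers: list[str],
                         invited_by: dict[str, str]) -> int:
    """Vouches sharing a top-level ancestor collapse into one."""
    return len({root_of(v, invited_by) for v in vouchers})

invited_by = {"b": "a", "c": "a", "d": "a", "e": "x"}
print(distinct_vouch_score(["b", "c", "d"], invited_by))  # 1: all under "a"
print(distinct_vouch_score(["b", "e"], invited_by))       # 2: distinct subtrees
```

With something like this, a farmed subtree of a hundred accounts is worth no more than a single vouch.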
you would only really be able to form your own "circle" of vouches among your bots. the rest of the network will not be able to see your circle's decisions, unless they start vouching for your bot accounts.
ultimately all data is public and somebody can create tangled2.org that creates a global graph but the vouches are intentionally attenuated past your circle in the UI.
I like the idea. I've been wondering, though: what if we just communicate naturally? It seems every minor communication is now so well formatted and consistent. Leave your typos in; the things you type create a genuine fingerprint.
I feel like we’re speed running the trust metric research done mostly contemporaneously with the birth of open source. I wonder what @raph thinks about all this…
Oh wow, there's a name I recognise from the good old days :) I also think Slashdot deserves some credit for its triple-layered meta-moderation system - it wasn't perfect, but it was definitely significantly better than nothing.
There are like six of these already. Why do another one rather than join forces with an existing one?