LLM generated submissions should be disallowed

129 points by orib

There's a lot of talk about what to do if an LLM generated article gets posted. I don't believe that there's a clear policy about whether they should be allowed.

I think users posting them regularly should be banned from the site.

A notification on the submission page reminding people that LLM generated articles are not allowed here would also be a good idea.

This should reduce the amount of debate on whether the article should be flagged or commented on.

Internet_Janitor

Sounds good to me. Even if slop occasionally slips through, an explicit policy against LLM-generated content- ideally without carveouts and exceptions to squabble over in the comments- should reduce junk on the front page and provide clear-cut grounds for banning regular offenders.

GavinAnderegg

I agree in general with this. If someone can't be bothered to write something themselves, I'm not interested in reading it. That said, I don't know of a foolproof way of identifying LLM-generated text. I don't love the idea of people (or sources) being banned because the articles they post might be generated. I've been accused of using LLMs in my writing because I sometimes use em-dashes… even though I've been using them for over 25 years now.

Internet_Janitor

The occasional false-positive shouldn't be a problem so long as it isn't a zero-tolerance policy, and I don't see why it would need to be. My reading of the OP is that bans would be at the discretion of mods and in response to a pattern of posting slop repeatedly.

I think it's very important to avoid letting imperfect detection of slop get in the way of having a policy against it. Mistakes will happen from time to time, but we must apply back-pressure against the onslaught of llm-generated garbage flooding the web and choking out human-authored articles.
hailey

The em-dash meme needs to go away. Sure it's a trope, but I don't actually care about it. There are far more obvious tells. Reading LLM generated text makes me feel concussed - there are a lot of words in front of me but for all the text I'm reading I am unable to actually pull much meaning out of it.

hailey

Agreed.

It's usually pretty obvious when something is LLM generated, and in many cases I've seen the author has posted about using LLMs elsewhere on their site, even if they haven't disclosed it in the article in question. That tends to make it pretty clear.

The community's slop radar seems pretty accurate too - I can't recall seeing any big comment threads accusing the author of using LLMs when in fact they have not. If nobody can tell then nobody can tell.

I'm happy to proceed assuming good faith in the truly ambiguous cases, because usually it's blatantly obvious and it's the blatantly obvious stuff that's causing problems. Nobody is trying to game lobsters by sneaking in as many undetected LLM written posts as they can.

addison

I really despise LLM-generated articles and want to see them gone. This extreme case is obvious and likely easy to identify, and I believe there are exceptionally few who would dislike seeing these gone.

Let's suppose someone now submits software where they have accepted some LLM-generated commits. Or, maybe they've generated it entirely with LLMs, but have documented the process as an analysis of doing so. These whataboutisms are me playing devil's advocate, but it's clear that there is a spectrum of tolerance in lobsters. I highly doubt that any content touched by LLMs being banned will be accepted. I think the most likely to be widely accepted answer is flagging without negative karma consequences, just as a way for people to drop a "hey, this is generated to my threshold of unacceptable, heads up" to subsequent viewers. That is largely what the big comment threads now do, and perhaps we can reduce the fighting in the comments + give people some signal about the content they are exposing themselves to.

orib

The other scenarios you provided are different categories, and fairly clearly deliniated ones. If you want to have them treated differently from how they are now, start a new thread.
- addison
  
  I suspect that were this policy implemented, such categories would be flagged the same. People tend to look for witches when you give them torches.
  
  ~~As an aside, I'm not certain if it's intended, but your message comes off a bit hostile~~ (since edited). I'm genuinely trying to engage with what you're proposing. I suspect that such a blunt instrument will not be effective, given that it will be used inappropriately and subjectively. I also want these things gone, but I do not feel that this is the way to do it. Spammers will already get cleared, and slop that only serves as engagement bait already gets pretty quickly marked as spam. This additional step will just invite people to debate even more under every post containing what they perceive or do not perceive to be intolerable slop.
  - orib
    
    I trust the mods to look and remove repeat violations of the policy they intend to enforce. I don't believe that it would make sense for flagging to do anything than alert a moderator of a potential policy violation.
    
    I don't think this is a real problem.
    
    I do think it would make sense to discuss if we should allow vibe coding here, but that's a different thread.
- apg
  
  It’s very easy to immediately yell “LLM slop!” on an article that you don’t like. And then, where are we? I want to see on-topic articles that I agree with, and also that I disagree with. That’s healthy.
  
  I’m not sure how to evaluate articles as being “slop.” There are some obvious examples, and there are some not so obvious examples. There’s also the real possibility of legitimate articles appearing to be sloppy because the author happens to use certain style that Lemons tend to copy.
  
  I think that a submitters “overall sloppiness” on “authored by” submissions might be a “fair” way to deal with this. Consistently posted, obvious slop, flags the author as sloppy. Maybe a mod reaches out says “stop” and if they don’t, they get banned.
  
  Not sure if “sloppy submissions” via someone should be counted the same. It seems that the software could cool down, but not ban someone’s ability to post if they’re constantly submitting what appears to be slop. But, forcing every submitter to defend an article’s provenance or else they get banned, wouldn’t make for a good time.
  - wareya
    
    Not every rule is a slippery slope. There are cases of actual obvious LLM slop and those are enough to be moderated. I think that you underappreciate just how antisocial LLM slop writing is. Right now those very obvious cases hover on the front page for multiple days because the AI bandwagoners are upvoting them.
    
    apg
    
    There are cases of actual obvious LLM slop and those are enough to be moderated
    
    Of course there are extremes, and if there’s a slop button, people will agree, in the same way “spam” is well filtered today.
    
    You conveniently dismissed my point, which ironically, is the point.
    
    I think that you underappreciate just how antisocial LLM slop writing is.
    
    I think you under appreciate just how toxic assuming someone else doesn’t understand is.
    
    wareya
    
    I didn't dismiss your point, I argued past it because I don't think it's well-supported. You're making a slippery slope argument. You might not have thought about it to yourself that way, but that's what your first and final paragraphs are doing.
    
    apg
    
    Tell me: what are the indicators of a slop post?
    
    If your answer is anything close to “I know it when I see it”—there’s the argument. Again, we can all agree that obvious slop is obvious slop. In the absence of objective evaluation, you’ll get subjective bias.
    
    wareya
    
    Extremely high frequency of LLMisms, which change every year-ish but really are distinct from human-written text when published unfiltered. Having way too many em-dashes and "it's not just X, it's Y" and 3-item bullet-point lists and breaking everything down into high school essay format are the tells of about 6 months ago. Human writers do those things, but not with anywhere near the density that LLMs do them.
    
    You don't need an objective evaluation. You only need a good-enough evaluation. There are existing rules that have subjective evaluations. In fact there are vibe-centric items in the posting guidelines. In fact, the very first item on the posting guidelines is vibe-centric: "Lobsters is more of a garden party than a debate club." It even specifically outlines that judgment is often necessary when moderators act: "There isn't a clear-cut line between this and discussing trends and advocating for improvements in the field, so expect frustrating judgement calls." This is normal for rules and guidelines written by mature people for other mature people.
    
    apg
    
    If you can enumerate “LLMisms” then I can tell an LLM to mimic a different style that doesn’t use them.
    
    Your argument is “I can tell you when I see it.” We’re cooked.
    
    And yes, moderation is always subjective in the absence of objective rules…
    
    wareya
    
    My argument is not "I can tell you when I see it." I gave you a list of specific things. When there are too many of them, then I know it's LLM slop. Really. Yes, it's a statistical argument, but it's not "I know it when I see it".
    
    If you make your LLM avoid emitting any LLMisms, then yeah, you're not going to be able to tell that its output came out of an LLM.
    
    Yes, we're probably going to be cooked eventually.
    
    I've seen actually obvious LLM slop articles that were like 15 pages long and probably seeded from like 1.5 pages of real human writing sit on the front page for like two days. If someone wants to get their thinking across to people, they should do it themselves and respect their readers' time, and also respect their own thoughts. When something gets mechanically expanded to 15 pages by an LLM, the arguments and bits of logic get confused and self-contradictory. Same with LLM-driven machine translation into languages that the author can't read or isn't sufficiently literate it.
    
    That's the kind of low-hanging fruit that an LLM content ban needs to address ASAP. It doesn't actually really matter that much that people who are more "careful" about their LLM writing use are going to slip through the filters. I'm not on a purity crusade here. I just want my time to be respected and for people to put forth their own ideas instead of putting them through a statistical blender for no good reason, or at least use the blender well enough that I can't tell that they did.
    
    And yes, moderation is always subjective in the absence of objective rules…
    
    Every good moderation system has a carveout for the mods to deal with people who are intentionally abusing the margins of overly-objective rules. Moderation is always subjective, even when it pretends to be objective. Rules aren't laws! They aren't a program!
    
    bitshift
    
    It doesn't actually really matter that much that people who are more "careful" about their LLM writing use are going to slip through the filters. I'm not on a purity crusade here. I just want my time to be respected and for people to put forth their own ideas instead of putting them through a statistical blender for no good reason, or at least use the blender well enough that I can't tell that they did.
    
    Well said! Thank you so much for writing this. I'm not anti-AI by a long stretch, but I am against people wasting my time, and I love rules that can be evaluated by readers without relying on suspicions. Banning all LLM content is overly broad, but I'm on board with banning "slop" in the pejorative sense.
    
    Spam is a great comparison. It's okay to post an article from your company's blog—even if technically you got paid for writing it! The rule is against spam, not against money. And it's okay to ban an advertisement that's all fluff—even if you don't know for a fact that the author received cash versus a fruit basket in compensation.
    
    apg
    
    So… you agree. Got it.
    
    mk12
    
    If it were that simple then the labs would bake “avoid LLMisms” into the system prompt.
    
    fleebee
    
    If you can enumerate “LLMisms” then I can tell an LLM to mimic a different style that doesn’t use them.
    
    I don't think it's very likely that the kind of "author" who types a 2-sentence prompt into ChatGPT and publishes the result verbatim in their blog will take the time cover their tracks.
    
    xyproto
    
    We could just fight fire with fire and use an LLM slop detector.
    
    apg
    
    Do you have an accurate slop detector? Cause I’ve never seen one, and since Lemon squeezers are ultimately optimizers, subtle changes until it’s defeated will just be done.
    
    wareya
    
    You don't need one. Just like you don't need absolute accurate detectors for every other rule violation that's currently on the rules. You're being disingenuous.
    
    apg
    
    disingenuous
    
    No. I am pointing out that you can’t “fight fire with fire” if you don’t have a slop detector. You can accept an imperfect detector if you want, but the goal is to continue to have a community that posts quality content to discuss. Too many false positives negatively reduces engagement. Too many false negatives, and all you’ve done is move the slop bar. Moving the slop bar just repeats the cycle.
    
    madhadron
    
    This seems reasonable to me. If someone can't take the time to lay out their thoughts, why should I take the time to read them? If they want to use a chatbot as a rubber duck for working on their argument or checking their grammar, fine. I don't think we even need particular detection, just the expectation of community members and, in blatant cases, removal.
    
    Helithumper
    
    Related: https://lobste.rs/s/wee21u/this_is_written_by_llm_comments_should_be
    
    Some Examples:
    
    https://lobste.rs/s/ojvhq9/coding_is_thinking_why_i_still_write_code
    
    https://lobste.rs/s/eaxtmb/claude_for_legal_suite_plugins_for_legal
    
    https://lobste.rs/s/fvqkke/new_claude_code_programmatic_usage
    
    I do agree that LLM Generated text should be filterable and/or flag-able.
    
    The issue with labelling the content as "off-topic" is the case where the post is LLM-Generated, but actually on-topic. This may result in a contradictory use of off-topic. I think in the past a new option for flagging was discussed as well (e.g. https://lobste.rs/s/po97lh/new_tag_suggestion_genai_assisted)
    
    I still think that a new flag option is a better option instead of abuse of the off-topic flag as discussed in https://lobste.rs/s/rkjpob/proposal_add_ai_generated_as_flag_reason.
    
    orib
    
    I accept your feedback, I phrased that poorly: It should be disallowed.
    
    I don't particularly care about filterable or flaggable. The users posting it should be removed from the site. Flagging or tagging it is a waste unless it leads to action being taken.
    
    k749gtnc9l3w
    
    You are talking about selected worthless examples, but for all the garbage quality of LLM-generated text in https://lobste.rs/s/hfnps5/osmand_s_faster_offline_navigation it actually has unique on-topic content too…
    
    orib
    
    No. I'm talking about not wanting any LLM generated text to be posted, to the best of our ability. Note the people voting it as spam. If I had seen it, I would have been one of them.
    
    k749gtnc9l3w
    
    More people upvoted it, though, because it does have real content. (As you can see, I did complain about garbage LLM writing in the comments, but also tried to guess-the-prompt with bullet points of the meaningful content from the post)
    
    pyj
    
    I'd rather prioritize seeing every blog post from every incredible person here than ever see an LLM generated article.
    
    Some don't hesitate to post their own content, but others do. It'd be nice to have a mass list of blogs from people on this site so I could add to my RSS reader list. "Homepage" is in our profile already so maybe there's a way to generate and make that info available.
    
    oceanhaiyang
    
    All forms of dashes are illegal now
    
    icefox
    
    I don't know why people seem to think em dashes are some kind of smoking gun. Pandoc's HTML output will generate them from markdown --, for example.
    
    bitshift
    
    First they came for the hyphens, and I did not speak out, because I used en-dashes for numeric ranges.
    
    Then they came for the en-dashes, and I did not speak out, because I used em-dashes for the cutting phrases in my haiku…
    
    Then they came for the horizontal rules, and there was no one left to speak out for me.
    
    cyberia
    
    Concerns:
    
    It is not easy to know with perfect accuracy when text is LLM-generated (although in a majority of cases it is obvious).
    
    Occasionally, a post which is somehow "important" or "notable" is LLM-generated. For example the CopyFail report.
    
    Proposal:
    
    LLM-generated content be disallowed except under exceptional circumstances such as high-impact security vulnerabilities.
    
    Judgement of whether text is LLM-generated should be conservative, giving benefit of the doubt in borderline cases.
    
    "Exceptional circumstances" should be at mod discretion, or a list of qualifying circumstances could be specified and iterated on as the policy evolves.
    
    atk
    
    Well, I'd say this is a good idea, but I also think it's going to turn into witch hunts.
    
    The reason LLMs write like they do is because someone out there writes like that.
    
    Or close enough to it.
    
    I don't know what to advise. Other than caution.
    
    orib
    
    Yes, there will be mistakes. The problem with rushing to build an unpleasant future is that things tend to get worse. People that shouldn't have to care about certain problems start needing to. We already opted out of the best outcome, now we're trying to find ways to minimize the tech industry's harm.
    
    It may be worth thinking about what the thngs that get built may be used do, and not merely what you hope they will do.
    
    Fixing things after they get broken is hard, and even a good job leaves scars.
    
    hongminhee
    
    As a non-native English speaker, I'm worried this proposal will hit translators first.
    
    I write in Korean first and use an LLM to translate into English. Sometimes that's Kagi Translate; sometimes Claude, when the subject needs more background. The thoughts are mine and I don't paste the output verbatim. Even after editing, I've been told my writing smells like slop.
    
    Native speakers are much better at noticing what sounds like slop. I can catch it fairly well in Korean, but not in English. I can revise for hours and still miss the tells. The weak point is my English ear, not the argument.
    
    If the test is “does this sound off to native speakers?”, non-native writers will lose. The rule may say “quality”, but the effect is: people like me post less. That pressure is already here on Lobsters. I feel it.
    
    If the question is who came up with the ideas, then translation should not count against the post. It's no different from a grammar checker or a fluent friend's edits. The style may change. The claim does not.
    
    This comment was also written with LLM assistance. To make that check possible, I'm sharing the Korean original here.
    
    wareya
    
    In most cases, "this was partly machine translated from <language X> into english" or something similar at the top of the post would go a long way toward convincing most people that it's not slop. It's not perfect but I think for most cases it would be enough. Maybe I'm just being optimistic though.
    
    Student
    
    Dvorak keyboard generated posts should be banned. I think users posting them regularly should be banned from the site.
    
    Edit: I have flagged this post as spam for being written on a Dvorak keyboard.
    
    hoistbypetard
    
    Can you explain why you think think this is a reasonable analogue? The topic here is about how posts that aren't generated by humans should be disallowed. There's a big difference between a data entry method and how the post was generated.
    
    apg
    
    This is a contentious comment (I saw it at -10, and now at -8), but it’s a fair criticism of the idea as far as I’m concerned.
    
    Rovanion
    
    Care to explain why you think that? I read it as both inflammatory and an apples to oranges comparison.
    
    apg
    
    I’m on the side of limited LLM generated submissions, and banning those who continue to submit them, fwiw.
    
    However, a part of me thinks that good content is good content, and I don’t necessarily care how it was written. If someone authors a blog post with a spell checker, or a grammar checker, or by speech-to-text, but the thought came from them, we probably can agree that it’s OK.
    
    If an author has a thought, uses that to prompt an LLM to build an argument for or against it in a manner that treats the LLM as an “assistant” … is that OK? Where does the line get drawn exactly?
    
    “Vibeblogging” — we can definitely agree is just slop and ban it. “Write a post about how a panda should have been the Linux mascot.” But, “Help me restructure this argument about why Object Oriented Programming blah blah blahs under the OOPS theorem of blah” … not sure?
    
    The OP, here, points at this, albeit, as you say in a potentially inflammatory way.