The Science of Detecting LLM-Generated Text

5 points by lr0

mullr

What if the big LLM vendors kept a database of the text they generated, or of some kind of (perceptual) hash of it? They should then be able to answer the question “did this particular text come out of your system?” with some degree of confidence.
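As a rough illustration of the hash idea, here is a minimal sketch in Python. It assumes SimHash over word 3-grams as the "perceptual" hash, which is only one possible choice. `GeneratedTextIndex`, `record`, `probably_generated`, and the distance threshold of 10 bits are hypothetical names and values picked for the example, not anything a vendor is known to use.

```python
import hashlib
import re

BITS = 64  # fingerprint width (assumption for this sketch)


def shingles(text, k=3):
    """Split text into overlapping word k-grams, ignoring case and punctuation."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return [" ".join(words)] if words else []
    return [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]


def simhash(text):
    """Return a 64-bit SimHash fingerprint: similar texts give nearby fingerprints."""
    counts = [0] * BITS
    for shingle in shingles(text):
        digest = hashlib.blake2b(shingle.encode("utf-8"), digest_size=BITS // 8).digest()
        h = int.from_bytes(digest, "big")
        for i in range(BITS):
            counts[i] += 1 if (h >> i) & 1 else -1
    # Each output bit is the majority vote of that bit across all shingle hashes.
    return sum(1 << i for i in range(BITS) if counts[i] > 0)


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


class GeneratedTextIndex:
    """Hypothetical vendor-side store of fingerprints of generated text."""

    def __init__(self, max_distance=10):
        self.max_distance = max_distance  # illustrative threshold, not tuned
        self.fingerprints = []

    def record(self, text):
        self.fingerprints.append(simhash(text))

    def probably_generated(self, text):
        fp = simhash(text)
        # Linear scan for clarity; a real index would need a faster lookup.
        return any(hamming(fp, known) <= self.max_distance
                   for known in self.fingerprints)


index = GeneratedTextIndex()
index.record("The quick brown fox jumps over the lazy dog near the river bank.")
print(index.probably_generated("the quick brown fox jumps over the lazy dog near the river bank"))
```

A fingerprint like this tolerates small edits such as changed case or punctuation, but a thorough paraphrase changes most of the 3-grams and would likely defeat it, which is why the answer can only ever come with "some degree of confidence".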

kantord

You would have to trust the vendors. Also, some local LLMs are surprisingly strong even on consumer-grade hardware, so I would not be surprised to see a lot of LLM-generated text coming from local LLMs in the future, and that text would never pass through any vendor's database.

And the potential of local LLMs has hardly been explored yet. Companies will put AI in everything, but as the ecosystem matures, the actual real-life use cases for LLMs might turn out different from what was previously imagined. (Coding agents are one example that is surely here to stay.)