Why open source may not survive the rise of generative AI

7 points by Foxboron


acatton

I'm among the first critics of the plagiarism machine they call "AI" or "LLMs". But I don't buy this article's argument.

From what I understand, the author's thesis is:

  1. Alice writes a coffee-brewing piece of code under GPLv3.
  2. OpenLLM drains the ocean to train their model on Alice's code, because they don't give a damn about other people's intellectual property; they only care about their own.
  3. Mallory uses TalkGPT to generate the coffee-brewing code for her commercial IoT coffee maker.
  4. Since TalkGPT was trained on Alice's code, it more-or-less spits out Alice's GPL code with some mistakes that Mallory fixes.
  5. Without her knowledge, Mallory is now selling an IoT product in violation of the GPL.

Will this happen? Yes, to my chagrin.

But this was happening before. Cheap device manufacturers on Alibaba would grab whatever code they could find online and just violate the license. There have been a few posts over the years on reddit (usually /r/opensource or /r/freesoftware) where random people asked for the source code of Linux-based devices and got no response, or aggressive responses.

What I'm trying to say is that shady businesses are always going to engage in shady business practices, especially when the consequences are low.

I'm unconvinced that LLMs are going to change the situation. To go back to my example: now Mallory has to maintain code that she doesn't understand. Also, when Bob comes along and sends a patch to Alice because he found a misbehavior in the code, Mallory won't benefit from it. Unless OpenLLM releases a new TalkGPT trained on Alice's updated code, and TalkGPT is able to detect that Mallory's code was ripped from Alice's and needs the same fix, which is very unlikely.

So I do think that F/LOSS is here to stay. It's easier for shady businesses to violate the GPL by simply stealing code outright than by laundering it through LLMs for deniability.