GNU and the AI reimplementations
19 points by amontalenti
Moreover, this time the imbalance of force is in the right direction: big corporations always had the ability to spend obscene amounts of money in order to copy systems,
But now they can do it for a negligible amount of money. Whereas previously buying a commercial license for otherwise copylefted software was a reasonable choice to avoid spending said "obscene amounts of money", now we've killed that business model. The FOSS funding situation was already horrible pre-AI; yet maintainers gotta eat nonetheless.
Now, small groups of individuals can do the same to big companies' software systems:
I assume most maintainers would prefer to be paid in money, which can be exchanged for food and shelter, rather than getting paid in the ability to do "clean-room" rewrites of proprietary software.
by the way, I got clickbaited :( when I saw this on the IRC channel I thought this would be an actual response from the GNU project. oh well
I think it is dangerous that people like Antirez, who we look up to, write this kind of article without a clear disclaimer at the top: "this is an opinion, I am not a copyright expert or lawyer".
Because, and this is a fact, most of this is opinions and wishful thinking about a topic with a largely unclear legal status right now. There is barely any jurisprudence to refer to. This story is also incomplete because copyright law is very complicated and it is not just copying: there are also derivative works, and there is context and ownership and a whole thing that happens on the side of the LLM providers, etc. Nothing about this is simple or remotely similar to what we did decades ago when GNU was born.
Hi st3fan. What you write is not how jurisprudence works. For things that never happened before, it's a grey area, like: is it fair use or not to train an LLM on XYZ? But this is different. Unless a new law is made, the old copyright law applies perfectly when you create code with LLMs. Does it violate copyright law? Then it is a problem. Otherwise it is not. If you invent a new gun, the old laws still apply if you kill somebody: this is a trivialization, but that's how it works. The grey areas are for new things. Copyright law perfectly describes whether some code is in violation of some other code or not.
We aren't lawyers, but I see no reason why one creative work, e.g. an image, is different from another creative work, e.g. code. SCOTUS declined to look at Thaler v. Perlmutter. LLM generated work is not copyrightable. Vibed code is in the public domain.
That's a big claim you're making there. Have you seen any commentary from genuine legal experts that agrees with your claim there?
Given that billions of dollars of software has been created by serious (brand name) companies using AI assisted programming tools over the past 24 months I would expect there to be way more credible commentary on this than I've seen so far.
A huge difference is that in Thaler v. Perlmutter, Thaler listed the AI itself as the work's sole author. He later tried to claim that he was the actual author because he made the AI, but the court said "no, that's not what you said on the application, you don't get to change your mind now".
I think the broader point is still that you aren't a lawyer (most of us here aren't), and whether it's well-trodden territory or novel, our legal opinions are less informed and less useful because of that.
Domain expertise doesn't generalize to other domains.
Given how poor a track record actual lawyers have at predicting the outcomes of copyright cases, maybe this is just gatekeeping.
Rewriting proprietary software to make a copyleft version is good. Rewriting a copyleft version to bypass rights of users is bad.
The article's whole gotcha is based on misunderstanding of GNU. They don't care about copyright per se, they care about people having freedom to control their software. Licenses are merely a tool, and also one that evidently has stopped working.
Tanenbaum protested about the architecture (in the famous exchange), not about copyright infringement. So we could reasonably assume Tanenbaum considered rewrites fair.
Only if you take as given Tanenbaum believed Linux to be a rewrite of minix. And I’m pretty sure he did not.
No mention of Thaler v. Perlmutter misses the most important part of the story: vibe coded source code has no copyright in USA.
The Thaler vs. Perlmutter opinion is very readable, I encourage people to read the first few pages: https://media.cadc.uscourts.gov/opinions/docs/2025/03/23-5233.pdf
It addresses a much narrower claim: whether AI can be legally considered the sole author of a work of art for the purpose of copyright. It explicitly does not opine on whether Thaler would have been granted the copyright if he listed himself.
I find it a bit difficult to point at that case because it is very different. It is a case where someone tried to copyright a completely new work of art (a picture). That copyright application was rejected by the copyright office, which basically said "AI cannot be an author under copyright law". (This is what Thaler tried to do: he tried to make his program, the Creativity Machine, the owner of that copyright.) And Perlmutter in this case is the person representing the copyright office, not the owner of the original work. AFAIK there is no original work for that specific case. (Their "A Recent Entrance to Paradise" visual artwork.)
It is of course relevant for GenAI in general, but I think that is where the similarities end. "Reimplementations of software" is about... software. Where there is an original and a reimplementation. And a reimplementation is usually not a completely new original work: there may be API similarities for compatibility, as with chardet, or pieces of code that map 1:1 to another work.
I'm sure Thaler v. Perlmutter is relevant in some way but we haven't had a real case about software yet .. so it is really unclear what would happen there.
I think focusing on Thaler v Perlmutter is a bit of a false goal to hunt down, because ultimately the appellate courts and the SC both basically said the guidance of the Copyright Office was correct. If we presume this extends to their entire guidance on AI works (found here: https://www.copyright.gov/ai/ai_policy_guidance.pdf ) then AI works are only copyrighted when the human behind the AI was the actual author of the work and used AI to give it form. If the AI is merely following an automatic process, there would be no copyright.
So if we simply tell an AI to implement a cleanroom rewrite (with two teams of AI for the two sides of the cleanroom), it would be a purely mechanical process that forms no basis for copyright.
Moreover, this time the imbalance of force is in the right direction: (...) Now, small groups of individuals can do the same to big companies' software systems: they can compete on ideas now that a synthetic workforce is cheaper for many.
Well, this omits the elephant in the room: no small group of individuals has created one of these frontier LLMs so far. Only big corporations with access to truckloads of data and hardware are currently creating and operating them. These corporations have total control over these new tools. The bleeding-edge programming tools are not open source anymore.
If you tried to build one of those yourself, let alone secure the hardware and energy, I'd bet you'd end up in jail long before collecting all the copyrighted corpus these big corporations collected.
I'd argue the power imbalance is now even worse than before.
The other thing that is relevant here, in my opinion, is the slow erosion in the popularity of the GPL in favor of less restrictive licenses like Apache/MIT. This is relevant because as tools are reimplemented (either the old-fashioned way or the new-fashioned way), they tend to not adopt the GPL. This is troubling for anyone who is a proponent of the GPL because the GPL is only relevant when there is a certain critical mass of GPL-licensed products.
Is the GPL relevant if any software can be generated on the spot? The point of the GPL is to democratise software, to take away exclusivity from Big Corpos. Do we still need the GPL if pretty much any software can be made at any time? And also under a permissive license. That is, while not GPL, the code is still available if you want it for some reason, and you can still use it.
I think software, specifically source code, is in general becoming less relevant. It is now an option to roll your own instead of using open source or closed source or a proprietary OS-provided library, business-licensed software, etc. You can ask an AI to build you software, either unique or following some specification.
Crazy story as an example:
I was looking at https://github.com/openai/symphony which is a project OpenAI did in Elixir. Their README literally says: "if you do not like our implementation in Elixir then run our 2100 line SPEC.md past your coding agent and ask it to implement this project in your preferred language"
This just blows my mind. We've gone very rapidly to a situation where software can now be a natural language Specification that you feed into a program and as a result a program rolls out of it. Do I now own that? Can I put any license on that? Is that now 100% mine without worries? (IANAL but the answer is most likely yes?)
Software as we know it is not dead yet, but you can see where things are heading.
I think the GPL is even more relevant, not less. The point of the GPL license is to protect the source code in such a way that users can continue to access it if the source code gets modified and distributed. In a world where AI code generators are prevalent and the barriers to making code changes plummet, I think a GPL proponent would be even more concerned that the code changes are accessible to everyone.
This was/is one of my first thoughts as well. I would like to use strong copyleft licenses more, because putting something out there for people to enjoy/experience and getting it ripped off from a LLM to serve to users for a subscription, without attribution, feels bad.
Do we still need GPL if pretty much any software can be made at any time?
It's easy to imagine a future where the purveyors of GenAI gate access to certain features behind higher prices.
Generate a cute flyer for a birthday party? Free.
Generate fan-art? $10/month for 100 pieces.
Generate software? $200/month, because you need to pay it to keep up with the competition.
Don't mistake the current all-you-can-eat buffet as anything other than VC-funded loss-leading to entrench GenAI in all parts of society, in expectation of collecting rent once they succeed.
GPL licenses started losing mindshare long before LLMs appeared. There's always been a robust counternarrative (chiefly from the BSD camp) against the views of Stallman/FSF on how to best organize non-restrictive software licensing.
I don't know the exact numbers, but I would wager that before the widespread adoption of LLMs, the ratio of non-GPL licenses on places like Github was maybe 80%. Of course, some projects are more "foundational" than others so just looking at raw project counts is misleading.
I personally think that taking the code that somebody wrote, feeding it to Claude code, and asking it to create a rewrite of it for purposes of avoiding a copyleft license is tempting, but a bad idea.