Building a 64-bit OS from Scratch with Claude Code
28 points by isene
This post was cool, but this disclaimer should have been at the beginning, not the end:
Transparency
This post was written by Claude Code based on our actual development session. The narrative document was also created by Claude. All code is public domain.
All code is public domain.
Actually that reminds me - LLMs can't be assigned copyright... So I guess a lot of code is going to be in the public domain going forward?
Has this already been discussed?
This is an ongoing topic of legal debate and court cases.
No one yet knows if making code with this tool means you own the code (like code you make with other tools) or if it's some new thing.
Can a compiler be assigned copyright? And yet a binary is copyright-protected, despite the fact that a compiler made it, not a human. It is a derived work from the source code. I'm sure you could make a similar argument in court that a piece of software made with an LLM is derived from the context and prompt you feed it, thus granting it similar protection. Although the training data's licenses become tricky, as the output might be seen as derived from that.
if it produces a piece of copyrighted code that it was trained on, there is a risk that you will be liable…
I guess this applies to humans as well, right?
Imagine accidentally adding code you think is a good idea, when you're actually copying code from a non-free open source project you read a while ago. Isn't that the same problem then?
I guess this applies to humans as well, right?
Yes, of course. It has always been a possibility. The difference is that we humans are able to reflect and communicate, while the LLM is not able to even grasp what copyright is, let alone determine if its output has copied another project's code.
Beyond that, we actively seek out people who haven’t seen the competing implementation to write a clean-room version.
The difference is that we humans are able to reflect and communicate,
Sure, but both a human and an LLM can either actively take code from somewhere, or accidentally take code from protected sources.
The first case isn't worth discussing, because it's on purpose (in the case of the LLM, on the prompter's part).
The latter case is the more interesting one, because you can't even prove you made all the code yourself.
This is a weird problem, because even a tool that recursively generates solutions to code problems might replicate copyrighted code, as code is also just data.
It is; for instance, Wine and ReactOS refuse contributions from anyone who has been anywhere near the Microsoft codebases.
This would be a problem if you were copying the code; that's why in reverse engineering you don't want people who have read the closed-source code you want to replace.
Ever looked at how many "hobby" Unix-like kernels are on GitHub? There are hundreds of them. A tiny kernel is a very common learning exercise, and most people seem to get it to boot, run a few tasks, find that moving on is hard and drivers for real hardware are hard, and give up soon after it boots and displays some text.
There are umpteen implementations of Forth, too. Recent discussions I've seen on HN, and I think here, ran on the theme that Forth is a wrapper around assembly: implementing it in higher and higher level languages is wrong and pointless; you learn by building it brick by brick in assembly, and though you'll never use it, you learn a lot by doing that.
So although I've not looked, I bet there are lots and lots of hobby implementations of Forth, too.
LLMs can't think, can't code, can't analyse, can't build... but they are very good at remixing what's in the training corpus, and all of GitHub, every public GitLab, and every other Git instance the crawlers can find is in there -- and crawlers don't care about licences. They ingest whatever they can get, no matter if there's warning text saying you're not allowed to. And that means that all the leaked Windows source code is in there, too.
Claude didn't write this. The slacker issuing prompts didn't write this. Claude is an automatic jigsaw-puzzle builder, and it can make arbitrary pictures from the jigsaw puzzle pieces it has. Don't look too close because there are a lot of overlapping tabs, unfilled blanks, tabs forced into blanks where they don't fit, and so on. But squint at it from a distance and it looks like a picture.
Lots of people like burgers and hotdogs. I wouldn't eat them if you paid me. Meat nauseates me anyway, and the idea of mechanically-recovered meat, of "lips, eyelids, and xrsxholes" as my father called the stuff, dyed pink and salted and mixed with MSG, then cut into tubes and patties and fried, is about as appetising as feeling hungry and nipping down to the graveyard with a shovel. It's vile beyond belief.
And yet, McDonald's is worth nearly $30Bn.
Of stuff that is beyond disgusting to me.
The "I don't care how well your AI works" post had people happily saying "I use LLMs a lot" and complaining about the negativity.
Yes I am complaining about your slop. I detest slop. I don't care who you are, what your slop is shaped into, how you moulded it... it's slop. You're covered in blood and urine and excrement. You smell of it. You are tainted by it, and you will remain tainted by it even after a dozen showers. I will see you in my mind, covered in blood and bodily waste, proudly holding up what you shaped from it.
That was mechanically recovered by grinding up humans' hard work and craft, fuelled on my daughter's future, fuelled on forests and pollution and death.
You are stained by it, and I will not stop judging you.
I think maybe you’d do your blood pressure a favor by taking a break from the vibecoding tag?
Sticking our heads in the sand provides only short term relief. That can be practical at times. But given the viewpoint you replied to, do you think your response here will have a constructive or dismissive effect?
Hobby kernels and Forth implementations have existed for decades; they're learning exercises, not industrial products, and that's fine. LLMs aren't magical, but they also aren't just 'remixing': they generalize patterns, which is why they can produce working code outside the exact training data. Critique is welcome, but exaggeration and disgust metaphors don't make the argument stronger.
they also aren’t just ‘remixing’: they generalize patterns, which is why they can produce working code outside the exact training data
[[citation needed]]
This is reproducible. The tools are available. The AI is accessible.
This is quite interesting - it would be nice to see how following along might produce slightly different implementations...
The definition of NEXT seems wrong? It loads the next word; it doesn't pop from the function return stack. The source code on GitHub has a section labeled "Core Forth Words (64-bit)" containing assembly labels for Forth words that use NEXT, but they are never called and also look wrong afaik: the comments say rbp is the function return stack, but lodsq loads from rsi instead, and none of the words ever manipulate those registers.
There is instead an entire separate "Forth" interpreter later in the source file, which has each word ret into the next. I don't really get that one either, because it likewise has a loop with lodsq despite not touching rsi, instead of a normal threaded interpreter where you set up a sled of words and each ret is treated like a tail call.
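For reference, the textbook indirect-threaded NEXT on x86-64 looks roughly like this, in jonesforth-style register conventions (rsi as the Forth instruction pointer, rbp as the return stack pointer). This is a sketch of the classic technique, not code from the repo:

    ; Sketch of a classic indirect-threaded NEXT (NASM syntax).
    ; Assumed conventions (jonesforth-style, not taken from this repo):
    ;   rsi = Forth instruction pointer into the current thread
    ;   rbp = return stack pointer (untouched by NEXT itself)
    %macro NEXT 0
        lodsq               ; rax = [rsi]; rsi += 8 -- fetch the next word's CFA
        jmp [rax]           ; jump through the code field
    %endmacro

    DOCOL:                  ; runtime for colon definitions
        sub rbp, 8
        mov [rbp], rsi      ; push the old IP on the return stack
        lea rsi, [rax + 8]  ; new IP = first cell of the word's body
        NEXT

Note that NEXT itself never touches rbp; the return stack only moves in DOCOL and EXIT. That matches the observation above that lodsq reads from rsi, not from the return stack.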
Speaking as an AI hater, I think examples like this are important for staying in the realm of reality. If folks say "AI is absolutely useless in every instance," I think that's incorrect: you could easily give a human without AI assistance this task or a similar one and, like the story of John Henry, the best of the best would only barely keep up with an AI, and their pace would not be sustainable. I don't know of many other devs who could produce an artifact like this in 6 hours on their own.
I still keep my overall hater opinions, however, because I think such a challenge is rarely actually that useful? Like, it's very entertaining to say things like "make a poem about putting a sandwich in the VCR in the style of King James bible," but how often do you need that to happen? How big is the market for "bespoke, extremely odd requests, on demand"?
Like, a food analogy would be a robot that could either produce prototypes of foods that have never really existed, because such a mix would obviously taste terrible ("make a recipe for ground beef and anise in a 5-layer sweet custard parfait" / "make me a steak with fresh grapes on it") or the robot would make extremely popular foods at an extremely mid level that will still contain occasional mistakes ("make me a recipe for a pot roast," and every now and then it'll give you the wrong advice on oven temperatures or mixing order). I think it's a marvel such a robot exists, and there's a lot about humans and culture and language we can explore for the following decades on the implications of it. But is it useful in the things that matter?
(to say nothing about power usage/environmental considerations, what it means to unleash these on society, the financial speculative bubble and its ramifications, killing the commons like Wikipedia and Stack Overflow and hosting via bot DDoS...)
There are times such a bot will be useful, and I love an example like this OS, which I'll point to if I find someone who's out of their mind about LLMs. But I think you can still be a hater who's "in your mind," as it were.
I suggest the concatenative tag as well. What's the timeout on being able to suggest tags? I guess something under 6 hours.
Though I didn't go look at the code for this, I don't think there's a specific timeout. For example, you can still suggest tags on this 9-day-old post. I think once one suggestion has been applied, the site no longer offers the "suggest" link. In this case, if you look at the moderation log, you can see that:
Action: changed tags from "ai osdev" to "osdev vibecoding"
one suggestion was already applied.
This turned out to be an interesting (albeit surprising) discussion about the current landscape of AI code and copyright. As a copyright abolitionist, my dream may accidentally come true when AI code floods the space to the point where no legal system can keep up and copyright in this area is abandoned.
I asked Claude Code to help me build “Simplicity OS” - an operating system where everything is a Forth word. The entire OS would be like Lego blocks: simple, composable words that directly control hardware. Want to know your screen resolution? SCREEN returns all parameters to the stack. Want to change brightness? SCREEN-SET takes parameters from the stack and applies them.
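In Forth terms, that description implies a session roughly like this. The stack effects below are invented for illustration; the post doesn't spell them out:

    \ Hypothetical session -- stack effects are guesses, not from the repo
    SCREEN                    \ ( -- w h depth )  push the screen parameters
    .S                        \ inspect what came back
    1920 1080 32 SCREEN-SET   \ ( w h depth -- )  apply new parameters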
It seems really odd to try so hard to write an OS controlled by code, while refusing to write code.
This was a pure experiment - I had a fever and was really not up to writing any code myself. And it was an interesting exercise.
I thought it was interesting, too. Thank you for sharing it. It wouldn't be something I'd enjoy replicating, but it's neat to see what happened with your experiment.
I'm curious: was the last sentence in your "Transparency" section written by Claude?
The point: Show what’s possible. No gatekeeping. No mystery. Just: “Here’s what we did, here’s how we did it, go build something.”
It seems like a very Claude voice, but I couldn't decide whether I thought that was Claude commenting on itself or you commenting on Claude.
I directed it to write something to the effect of: "Make this about showing off what CC can do and make sure I get no credit for any of it."
That's fair; I see it as my job to keep track of what LLMs are capable of, even if I fucking hate using them. So I try them out regularly. I just found the concept of the project very ironic when juxtaposed with doing it entirely with LLMs.