Principia Softwarica
88 points by runxiyu
The title is clearly a reference to ‘Principia Mathematica,’ but ‘softwarica’ is not Latin. It’s like someone writing a purportedly C function call as foo(bar; baz]. A better title might be ‘Principia Programmatum’ (The Principles of Programs) or ‘Principia Artis Programmatoriae’ (The Principles of the Art of Programming).
If we want to be pedantic, "Programmatum" does not exist either. The closest would be "Programmum" but it means "proclamation, edict", not program. There isn't any word in Latin to describe software or programming so it's fine to invent a new one.
It's true that "Programmum"/"Programmatum" sounds more correct because it's derived from the English "program" which is itself derived from French "programme", itself from Latin "programmum". But we can take one step closer to the original and use the French word "logiciel" ("programme" is also a valid word for software in French, but this meaning comes from English), which comes from "logique" (logic) + "iel" from "matériel" (hardware / made of matter).
So, as "logique" comes from "logica" and "matériel" from "materialis" and the noun form of "materialis" is "materia" (matter), we could translate "program" to "logicia" and "programs" to "logicium".
(disclaimer: I don't really know Latin, only studied it for two years in high school)
If we want to be pedantic,
Libenter :)
There is no word "programmum" in Latin: "programma" is attested in Late Latin as a loan from the Greek "πρόγραμμα", and its genitive plural is then indeed "programmatum". Its use with the meaning "computer program" is reportedly attested in the Vox Latina journal (and also on Vicipaedia, for what it's worth).
I thought butchering the Latin was kind of the point. It's a dead language, and a title referencing a well known failed project. Titling any English language work in proper Latin today would be mildly ridiculous, and that's coming from someone who used to read Latin quite fluently.
But Latin is not a dead language, though. Citing Wikipedia:
While often called a "dead language", Latin did not undergo language death. Between the 6th and 9th centuries, natural language change in the vernacular Latin of different regions evolved into distinct Romance languages. After the fall of the Western Roman Empire, Latin remained the common language of international communication, science, scholarship and academia in Europe into the early 19th century, by which time modern languages had supplanted it in common academic and political usage.
Also, a language being dead does not mean that there's no people who care about speaking it correctly or that it's free for all to butcher. Someone's gotta maintain that legacy grammar ;-)
Latin is not a dead language
My grandfather told me a story about hitchhiking in continental Europe between the wars. He was in a youth hostel one morning, and tried to strike up a conversation with another young man while they were shaving in the bathroom. They found the only language they had in common was Latin!
My grandfather read classics at Oxford, so he was more likely to reach for the language than most people. (And he wouldn’t have told the story if it had been a common occurrence!) Even so, basic Latin was a mandatory requirement for every student at Oxford and Cambridge and I expect many other universities right through the 1960s.
But since the 1960s the number of people in Britain who are routinely taught Latin to secondary school standard has declined dramatically.
It’s still used for ceremonial purposes in places like Oxford and Cambridge. (For instance Cambridge University has an annual ceremony for which a classics professor has to write an oration in Latin, which is apparently witty if you know the lingo.) And Latin is the official language of the Roman Catholic church, so it’s living enough to be the national language of a weird microstate.
But 90ish years after my grandfather’s unlikely story, there’s no chance that two gap year students meeting in a youth hostel might both know more Latin than “eheu! caecilius est in horto!”
Over half a million college students in Germany learn Latin, usually for 4+ years.
However, today it's more likely they find another common language to communicate with. :)
While often called a "dead language", Latin did not undergo language death.
I think linguists did the world a disservice by calling languages dead and alive. I wish they had used neutral words like crystallized and liquid or amorphous instead. Languages such as Latin are not dead. They continue to be used, though evolution has stopped.
Dipping into Wikipedia, although I was a bit confused by different articles on Neo-Latin and Contemporary Latin... I found what I was looking for: Lexicon Recentis Latinitatis. Because some guys still publish new documents about the current world in Latin, someone has to come up with new words for Latin.
So likely Latin is evolving so slowly that you could call it a dead language, but it does evolve. (Edit: to continue the analogies... it's likely in a very LTS state, with someone still pushing very small updates.)
Initially I balked at the mention of Plan 9, but the detailed explanation of why the author chose Plan 9 (TLDR: it's small) is a great piece of writing – in itself an excellent advertisement for the books!
Why did you balk at the plan 9 mention? Just curious
Only because I have no Plan 9 experience, and I'm so time-poor that I instinctively flinch at the prospect of having a Whole New Thing to learn about! I (sadly!) don't actually have time to read these books right now, irrespective of the OS in question, so it was just a reflexive reaction rather than something more thought out.
I love how the author ported OCaml to Plan 9 and then rewrote large parts of userspace in OCaml to better understand what happens in C.
In the process the code also became runnable on Windows of all things :)
Curious who funds this.
Interesting that a very old version of OCaml was used (1.07): https://aryx.github.io/xix/#about
This is an insane undertaking. 17 books over 6k+ pages. All seem to have been released last week?
Does anyone have any additional context or background on the author?
See this 9fans thread
https://www.mail-archive.com/9fans@9fans.net/msg45156.html
TL;DR: it started in 2014 and the author is going to present his work at IWP9 tomorrow 13:45 local (Victoria, BC, Canada) time: http://iwp9.org/#prg
Thank you, this is exactly what I was hoping for.
There is a bit of drama in the thread in regards to AI usage:
yes I started to use Claude Code in March this year to assist in writing. I started this project though in 2014 and spent almost 6 years full time on it since 2014 decomposing every programs in many parts and books.
Worth reading the thread before committing to reading the books, I think.
I can't help but think that a couple shorter books would've been more useful to readers. I'm sure the author has learned a ton during the process, however.
Principia Softwarica is a series of books explaining how things work in a computer by describing with full details all the source code of all the essential programs used by a programmer. Among those essential programs are the kernel, the shell, the windowing system, the compiler, the linker, the editor, or the debugger. Each program will be covered by a separate book.
The operating system they target appears to be Plan 9
The operating system they target appears to be Plan 9
This is a very poor, inaccurate summary; did you actually read the page?
« Learn Here, Apply Everywhere
You do not need to use Plan 9. Understanding one small elegant OS gives deep intuition about Linux, macOS, and even Windows. »
I definitely do understand that, although I should have rephrased it as s/target Plan 9/use Plan 9 as the primary example/.
It does mean the codebase is marginally more familiar to me than the other OSes though.
windows actually has a 9P implementation in it
it's used for the WSL2<->windows filesystem bridge. i wish it was exposed more generally as a FUSE-like thing (but from what i can tell, it's slower than SMB)
My impression is 9p is a bad fit for POSIX semantics and an even worse one for NT.
My impression is 9p is a bad fit for POSIX semantics and an even worse one for NT.
It does have to be said that there was an incredible amount of tension between the POSIX community that insisted that fork() was entirely adequate, and the NT one that demanded threads, not to mention the weirdos who wanted coroutines ("fibers").
Perhaps the culmination was in the migration from Delphi (NT, hence threads) to FPC/Lazarus (broadly POSIX, which ended up implementing threads at different times on different target architectures).
History shows that threads and coroutines/fibers are extremely useful, but "can be difficult to debug".
Which, in principle, has given us Rust as the answer to a maiden's prayer.
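For anyone who hasn't met coroutines ("fibers") outside of the name, here is a minimal sketch in Python (my choice of language, nothing from the thread), using generators as the coroutines. The names `worker` and `round_robin` are made up for illustration. The point it shows is only that control changes hands at explicit `yield` points, so the interleaving is fixed by the code rather than by a preemptive scheduler as it would be with threads.

```python
from collections import deque


def worker(name, steps):
    """A generator used as a coroutine: each `yield` hands control back."""
    for i in range(steps):
        yield f"{name}:{i}"


def round_robin(tasks):
    """Run generator-based tasks cooperatively, one step at a time."""
    queue = deque(tasks)
    trace = []
    while queue:
        task = queue.popleft()
        try:
            trace.append(next(task))  # run the task up to its next yield
        except StopIteration:
            continue                  # task finished; do not reschedule it
        queue.append(task)            # otherwise put it at the back of the queue
    return trace


if __name__ == "__main__":
    print(round_robin([worker("a", 2), worker("b", 3)]))
```

Because nothing is preempted, running this twice gives the same trace, which is a large part of why cooperative schemes are easier to reason about than threads sharing memory.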
This isn't about process model, but rather the filesystem semantics of 9p vs. NT or POSIX (it doesn't fit cleanly onto either)
but from what i can tell, it's slower than SMB
A few years ago I tried to use rsync inside of WSL to copy a lot of data (I don't really want to guess how much), and the ETA was well over a day. I snooped around, and saw some people say that the 9P bridge is slow - so I tried out the native rsync build. It finished in less than an hour.
Yet in the very first paragraph it states:
All the programs come from Plan 9 from Bell Labs,
So it appears that OP is correct. OP didn’t claim that it wasn’t transferable to other systems.
No, it is not correct. It specifically says, at some length, that the lessons are intended for users of all OSes but especially the C/Unix family of OSes, and spells out very clearly and explicitly that it uses Plan 9 because the code is exceptionally small, concise and clean.
It does not target Plan 9.
It explains Linux, FreeBSD, and all other Unix-like OSes using Plan 9 as its tutorial example. It is not "targeting" Plan 9. It is targeting people, and explaining Unix and C concepts using Plan 9 because the entire OS is half the size of Vim.
Saying it "targets Plan 9" is actively misleading.
The issue was mere word choice. Your opposite accusations (not having read it vs. being manipulative) are kinda over the top.
As you wish. I am used to being downvoted. I think it was a vital clarification, and I do not really care if lots of people disagree with that.
Shockingly, a lot of the world runs on incorrect and fallacious beliefs, and you get shouted at when you point out that they are wrong. They remain wrong, however angry they make people.
Project Oberon (Wirth & Gutknecht) is the closest in spirit to Principia: [...] However, Oberon can only run Oberon programs
While I understand what the author means here, this is IMO not a restriction of the Oberon system; in principle, any programming language can target its architecture and runtime.
I see that it was forked from Plan 9, but between Plan 9 and Inferno, which is better?
I see that it was forked from Plan 9
No, it was not "forked from", as far as I can see. It uses Plan 9 as an example, and it works through its examples.
It also clearly explains why Plan 9. The most relevant part of the answer to your question being that Plan 9 is written in C. A highly restrictive form of C, at that.
Inferno is not written in C. The point of Inferno is that it is written in an architecture-neutral language that runs on a VM that is part of the kernel.
That language is Limbo.
(In fact, since Inferno didn't catch on, that language is more or less the only one that targets that VM.)
Because Inferno didn't catch on, Limbo did not catch on. That means that few people know Limbo, whereas a lot of people know C. This also means that there are very few reference materials for Inferno, while there are absolutely loads for C, in more or less every human language from every society that uses computers.
So, for this pedagogical purpose, Plan 9 is better than Inferno: because of the world's familiarity with the language Plan 9 is written in.
I was puzzled about this until I found the fork in question. It's linked in a mailing list post mentioned by another comment on this page.
Oh, and that means yet another fork of Plan 9 here
https://github.com/aryx/principia-softwarica
I know ...
In its readme:
A fork of Plan 9 from Bell Labs, curated for education.
Agreed on what you say about Plan 9 and Inferno. However, the code in these books is a fork of Plan 9. The first slide says so: "A New Plan 9 Fork".
I am mystified. What slide? I see no slides or presentation. It's a book, or rather, a series of books.
« Principia Softwarica is a series of books explaining how things work in a computer by describing with full details all the source code of all the essential programs used by a programmer. »
These are literally the first lines on the page, excluding titles and links.
Sorry, forgot to post the link: https://aryx.github.io/assets/pdfs/slides.pdf
Search "presentation slides" in the page.
Wow, this looks very ambitious. I believe reading good code -- especially with explanations for design choices -- makes one a better programmer. Maybe particularly if that code is for a domain just outside one's own. This should be a great tool for learning.
I have seen an interesting post on the 9fans mailing list regarding their GenAI use: https://www.mail-archive.com/9fans@9fans.net/msg45167.html.
Also worth noting: https://git.9front.org/plan9front/9front/2e21c09e8cdbca26aa3b069699239467e1fabc40/commit.html.
So what are the advantages of basing this on Plan 9 rather than on (as a specific example) Minix, the architecture of which is validated by the /substantial/ number of systems on which it is installed?
I can think of several.
Also, a monolithic kernel like plan9 is perhaps more useful as a teaching tool in that it’s more like Linux, Darwin and NT. (Very large brush, don’t fight me please.)
That said, I would still be very interested in learning about a microkernel with the same pedagogic method as this.
Also a valid point, I think, although I'd be interested in @markmll's response.
I was assuming that an OS written by an experienced educator with the intention of its being a teaching tool would be a better explanatory resource than an OS written by an experienced programmer with the intention of its being used for production purposes. I'm also assuming that a very tight microkernel-based OS would be easier to "prove correct" when expanded for SMP operation, and could similarly be "proven correct" for non-standard caching schemes on NUMA systems ("proven" used cautiously, since I am aware of the practical problems).
Having said which, I would never propose either Amoeba (Tanenbaum) or Oberon (Wirth) for the purpose of exposition: they were both too far from being even minimally usable for day-to-day computing when worked on as research projects.
"Completeness" might or might not be an issue, after all GCC or Clang are arguably more "complete" than the Plan 9 compiler since they incorporate optimisation techniques which were unknown (or for which no feasible algorithm was available) in Plan 9's heyday.
I feel that a stronger argument in Plan 9's favour is the relative rarity of MINIX-style microkernels in production environments. As @iv has said, it would be interesting if somebody could expound upon one of the more recent microkernel families such as L4, /particularly/ if the textual structure were the same as for the monolithic kernel.
However with the increasing emphasis on systems which are either virtualised (IBM's VM, Linux's KVM etc.) or containerised (Docker etc.), and the sporadic attempts to run "only as much as is really needed" on the bare metal, perhaps we should be asking ourselves whether it would in fact be better to teach microkernel technology as being more representative of the "move difficult stuff into a VM or container" tendency.
Finally, I'd note that like many others I have misgivings about throwing the current generation of "AI" at every problem and expecting to get a coherent result. However a compiler which I use on occasion has recently been forked by somebody who is apparently using commercial/proprietary AI to add facilities to which the core developers are vehemently opposed: whether we really do need Pascal turned into yet-another-BASIC is a question I shall leave to the readers :-)
I was assuming that an OS written by an experienced educator with the intention of its being a teaching tool would be a better explanatory resource than an OS written by an experienced programmer with the intention of its being used for production purposes.
That's a good point.
The problem, I think, is that Minix 3 seemed to change direction later on, and shift from being a teaching tool to trying to be something pragmatic and useful in what I think was an attempt to attract attention and contributors. I would be happy to be wrong, mind you.
Intel should have donated, both money and manpower, in return for the vast deployments of Minix 3 inside the Management Engine (ME) of every Intel CPU for now approaching 2 decades, which saved it $millions. But it didn't, and now, arguably, it is reaping the rewards.
What was it: /* You are not expected to understand this. */ ?
I have enormous respect for the guys at Bell Labs, but even if that was expected to only be for internal use I feel that a comment of that nature really is dodgy: if the author couldn't be bothered to explain himself inline he should have explicitly referenced his design notes, if for no reason other than the fact that he might fall under a bus on the way back from lunch.
I'm not saying that Tanenbaum, Wirth et al. were always consistently better. But at least as they matured they realised that they were writing for an audience, and tried to live up to it.
However a compiler which I use on occasion has recently been forked ... whether we really do need Pascal turned into yet-another-BASIC is a question I shall leave to the readers :-)
Please note that I am not referring to https://lobste.rs/s/ot6g23/blaise_modern_self_hosting_object_pascal here but to a discussion (currently occupying 27 pages) on the Free Pascal users' forum which started off with an (entirely reasonable IMO) attempt to sort out a scoping irregularity and spiraled; the comment relating to AI is at https://forum.lazarus.freepascal.org/index.php/topic,73678.msg581369.html#msg581369
So it's not just the OS, it's a lot more. The slides show that basing it on Linux would take many, many, many times more code. I don't know how many lines it would be in comparison to base it on Minix, though.
Unix did not have networking as an initial part of the design
I interviewed for a job at Apricot where I was amused to see... what was it... "Swansea University Computer Society Net3" as part of their startup messages. They were in Birmingham, I was in Devon... I needed the cash but go figure :-)
The thing that everybody seems to raise is that "in *n?x, everything is a file". But broadly speaking network interface devices aren't files, individual listening sockets aren't files, USB endpoints aren't files, GUI windows (i.e. with title bar and other furniture) aren't files: there's a major problem here since even if an OS /demanded/ that everything appear in the global namespace based on / the basic file API is inadequate.
I suspect that a unified ioctl() API intended to handle all current devices would make the worst excesses of POSIX and IEEE look tame. And as for the devices that nobody's yet thought of...
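To make the point about listening sockets concrete, here is a small Python sketch, assuming a POSIX system such as Linux. The function name `describe_listening_socket` is made up for illustration. It shows that a listening socket does get a file descriptor, but it is created through `socket()`/`bind()`/`listen()` rather than by `open()` on a path, and `fstat` reports it as a socket rather than a regular file.

```python
import os
import socket
import stat


def describe_listening_socket():
    """Open a TCP listening socket and report how it relates to the file API."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        srv.listen(1)
        fd = srv.fileno()           # a descriptor exists, as for an open file
        mode = os.fstat(fd).st_mode
        return {
            "has_descriptor": fd >= 0,
            "is_socket": stat.S_ISSOCK(mode),
            "is_regular_file": stat.S_ISREG(mode),
        }


if __name__ == "__main__":
    print(describe_listening_socket())
```

So the descriptor half of "everything is a file" holds, but the naming half does not: there was never a path to pass to `open()`, and the setup needed calls that the basic file API does not have.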