A Vision for Future Low-Level Languages
23 points by zmitchell
Unless I missed something, this seems like a very common desire. Most people want to use 1 language rather than 2 or 3. It's less to remember
But the problem is actually doing it :-) And I had to skip to the end of the article to see:
Many of the things I’ve mentioned in this article are ideals. I don’t know how well some of them will turn out in practice
It feels like it should reference Ousterhout's Dichotomy
Recent related thread on Rust: https://lobste.rs/s/pwsnpd/powerletters_for_rust#c_irvu2i
Also related: Rust’s Ugly Syntax - i.e. it's not "just" the syntax; it's that low level code needs to express more semantics.
That is, more semantics are relevant to low-level code -- you don't want to make things that are relevant to a problem invisible
And https://lobste.rs/s/jsriyn/on_ousterhout_s_dichotomy :)
Some thoughts:
It might feel easier to start with low-level, but given the experience of Objective-C I think this is a misleading feeling.
Python + C is obviously not an example of "one language", as it is two languages, with exactly the kinds of barriers you'd expect. Well-written Objective-C code can do this, as you can mix and match much more freely. It does take some care to design your high-level, convenient APIs in such a way that they can also be both implemented and, importantly, used efficiently.
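As a concrete illustration of those Python/C barriers (my example, not from the thread): even calling a single C function from Python requires manually re-declaring types at the boundary, because neither language knows anything about the other's type system.

```python
import ctypes
import ctypes.util

# Load the C standard library (the lookup is platform-specific).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# The barrier: argument and return types must be declared by hand,
# since Python objects and C values share nothing by default.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# A Python str must also be explicitly encoded into C bytes first.
print(libc.strlen("hello".encode("utf-8")))  # 5
```

In a single-language design, none of this marshalling ceremony exists: the "low-level" call is just another call.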
A lot of my book, iOS and macOS Performance Tuning, deals with exactly that sort of design: convenient and fast, based on my almost 40 years of experience writing fast, convenient code with that tooling.
While focused on Objective-C, the principles are more generally applicable, IMHO.
That existing high-level languages may not be well-suited for this is, just like the operating system dichotomy I described above, mostly accidental. If you design a language to span from low- to high-level, I think it is much, much better to start with high-level, which is what I've done with Objective-S. You get more low-level, machine-level constructs by hardening.
You harden a late-bound message-send to become a procedure call.
You harden a number object to become a 64-bit integer.
You harden a polymorphic reference to become a pointer.
Often this can be accomplished simply by adding more restrictive types. My experience is that you don't actually need to do this a lot, though where exactly you need it is difficult to tell before you start, which is why having the flexibility of a single language is so important.
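That hardening-by-types idea can be sketched in Python (my illustration, not Objective-S syntax; CPython itself ignores the type hints, but a hardening compiler in the Mojo/Cython mold could exploit them):

```python
# Fully dynamic: every operation is a late-bound dispatch on whatever
# objects arrive; this works for ints, floats, Decimals, and so on.
def dot(xs, ys):
    return sum(x * y for x, y in zip(xs, ys))

# "Hardened" by adding more restrictive types: the algorithm is
# unchanged, but a compiler could now emit statically dispatched,
# machine-level integer arithmetic instead of dynamic message sends.
def dot_hardened(xs: list[int], ys: list[int]) -> int:
    return sum(x * y for x, y in zip(xs, ys))

print(dot([1, 2, 3], [4, 5, 6]))           # 32
print(dot_hardened([1, 2, 3], [4, 5, 6]))  # 32
```

The point is that the annotations are additive: the flexible version stays valid, and you only pay the annotation cost in the (usually few) places where you need machine-level performance.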
OPENSTEP even used Objective-C in the kernel, and replacing that with C++ was one of the things the Darwin engineer regretted.
I think it's not a dichotomy, but a 7-otomy :-)
Here was another person who wanted "Rust for all levels", with 4 levels:
From Languages to Language Sets
https://gist.github.com/xixixao/8e363dbd3663b6729cd5b6d74dbbf9d4
My response with 7 levels: https://news.ycombinator.com/item?id=43390774 copied at https://lobste.rs/s/1kxvjz/from_languages_language_sets#c_8duyxw
I think with a small desktop app, you can kinda use one language. Although Lua for games is a thing, and Verse is a new thing
But bigger systems are more heterogeneous, and software is getting bigger. Especially cloud systems.
Software systems are also getting older as time passes, so there is more heterogeneity that way too
On the other hand, I don't believe in unlimited diversity -- I still want to "get rid of" Unix sludge, at least in some kind of limited but realistic and useful world (a new distro). But the trends are against that, and even that's a more modest goal than trying to fit Rust and Python in the same language
In reality it kinda works already -- to a degree you can write "Python scripts" in Rust (I don't though)
And Python also spawned Numba and Mojo and a dozen others for fast code
But I think the split will always be there, and the trend is in the opposite direction. Insisting on homogeneity limits you to smaller projects / lower-level projects.
(Or you can move into PL design/implementation and never write any other kind of software for the rest of your life. That is one way to solve the problem. :-) )
I think Ousterhout's Dichotomy, although currently historically and empirically true, is mostly a false dichotomy.
Just like the dichotomy we had in the late 80s and early 90s between easy-to-use but crashy personal computer operating systems like MacOS and Windows 3/95 and rock-solid but difficult-to-use server operating systems.
At the time, it was both empirically true and also thought of like an iron law of nature, a tradeoff you just couldn't escape. Now my phone and my watch run the server operating system Unix. It turned out to be an accident of history.
Objective-C is a mostly unintentional proof-of-concept that you could get both low and high-level in one language, just as NeXTStep was an intentional proof-of-concept that you could get a beautiful, user-friendly server-class operating system.
Not sure whether or not I agree with you, but I feel like it's worth observing that in that case the dichotomy was broken by material abundance. Personal computers became fast enough that the overhead required to be rock-solid became relatively unimportant.
Objective-C is an interesting design point to bring up, because it is kinda two languages: One that is very static and compiles to fixed code as much as possible like C (naturally), C++, Rust and Zig, and one that is very dynamic where everything possible might go through a pointer and a dynamic dispatch like, well, Smalltalk, but also like Python or Ruby or most Lisps. The annoyance is that these languages kinda don't overlap very much, but maybe that's a good thing in disguise because it avoids the various hacky edge cases needed to handle, for instance, trait objects in Rust.
dichotomy was broken by material abundance
That would have been my guess as well, and there is some superficial truth to it, but it turns out the same machines could run both kinds of OSes. Heck, Unix started on a 16 bit PDP-11. Windows 95 required a 386, the same computer that Linux was developed on.
Regarding Objective-C: yes it is two languages, but smashed into one. I tend to liken it to a bit of a car crash. And they actually overlap a lot, but not really in a good way.
You have two kinds of string, two kinds of arrays. You have two ways of structuring data: structs and objects. You have functions and methods. You have two kinds of numbers, again with disparate syntax and no numeric tower for the objects (that might be fixable) and no syntax for arithmetic on the object numbers. So quite the mishmash, due to history and due to plopping the Smalltalk bits on top of the C bits.
I am quite convinced that going the other way is much, much simpler. You have numbers. By default they are objects, with a numeric tower. See Some thoughts on security after ten years of qmail 1.0 for reasons why that should be the default. All the arithmetic is quasi-normal message-sending like in Smalltalk, preferably with the SmallInteger optimization. Add an optional "int" type declaration and the "object" becomes a machine integer that gets all its operations statically dispatched etc.
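A rough Python illustration of that default-then-harden split for numbers (my sketch; `ctypes.c_int64` here stands in for what an `int` type declaration would harden a number object into):

```python
import ctypes

# Default: numbers are objects with a numeric tower; they never
# silently overflow (Python ints are arbitrary precision).
n = 2 ** 63
print(n + 1)  # exact: 9223372036854775809

# "Hardened" to a machine integer: fixed 64-bit two's-complement
# arithmetic that wraps instead of growing.
m = ctypes.c_int64(2 ** 63 - 1)
m.value += 1
print(m.value)  # wrapped: -9223372036854775808
```

The safe, overflow-free behavior is the default, and the fast, wrapping machine behavior is opt-in, which matches the qmail-paper argument about which way round the defaults should go.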
Objective-S also has a fairly fully-fledged meta-object protocol (likely its true innovation) that is fairly agnostic about whether things are evaluated at compile-time or run-time, so it is probably a reasonable vehicle for implementing this sort of successive hardening for low-level machine access.
I think the goal posts have moved though. I think that may have been true of Objective-C in the past, but I'm not sure how fast it has evolved in recent years
Python handles more problems than Perl, and Rust arguably covers more apps than C++
As an example: as far as I can see, Objective-C isn't very memory safe. But both Python and Rust are
So in that sense it's a lower-level language than both!
The goal posts have moved for both low level and high level languages
As I wrote, Objective-C was an unintentional proof of concept that this is possible, so yeah, it has lots of shortcomings. Interestingly though, it is remarkably effective despite all those shortcomings, IMHO exactly because it has a sorta-kinda solution to that problem. Even a sorta-kinda solution to this problem is often a lot better than a well-designed thing that is not a solution to this problem.
In terms of safety, the overall language combines the memory safety of C with the static type-safety of Smalltalk. So absolutely excellent safety properties. ;-)
And yet, in practice it is actually quite safe. Or to be more precise: it is both very straightforward and very easy to use it in safe manner. Unlike plain C or C++ where the default always seems to be "let's crash". For example, the id subset is safe.
And yes, Objective-C can go pretty low, because if you really want to, you can do almost anything you like, just like C. For obvious reasons.
But I keep coming back to "unintentional proof of concept". Objective-S is an intentional design and goes much, much higher. And also aims to make going lower/faster more straightforward and safer while reducing the current overlap. So the default is very high-level, think architectural description, which includes storage abstraction, streaming and dynamic messaging as subsets. When you harden that, mostly by just adding (primitive) type information, you get something Pascal-/Oberon-ish. So closer to the machine, but still not defaulting to foot-cannon. If you want that, if you want C-level access, you need to really want it. But you can get it if you want it.
We'll see how it works out.
Well I am objecting to this part, because both system and scripting languages are a moving target:
I think Ousterhout's Dichotomy, although currently historically and empirically true, is mostly a false dichotomy.
At the time, it was both empirically true and also thought of like an iron law of nature, a tradeoff you just couldn't escape.
I'm saying you COULD have escaped it at one point (but I'm not saying we did)
But then if either scripting languages or systems languages IMPROVE, then you're BACK in it.
I don't think it will ever be settled, and it's not a false dichotomy ... it may be a harder or easier tradeoff at different points in time
Original: https://web.stanford.edu/~ouster/cgi-bin/papers/scripting.pdf
For the past 15 years, a fundamental change has been occurring in the way people write computer programs. The change is a transition from system programming languages such as C or C++ to scripting languages such as Perl or Tcl.
Because Python and JavaScript are the top two languages, and they are not systems languages, what he said was undoubtedly true.
You obviously CAN write any Python or JavaScript program in something like Objective-C, but it's worth thinking about why we don't. The observation is about what is happening in the wild, not about what is theoretically possible.
And the funny thing is that I think he wrote this pre-Java and pre-C#.
I would say that this is a THIRD category of language -- they are neither scripting languages nor systems languages!
And plenty of programmers stay within one of these categories: systems, scripting, or "business productivity".
As mentioned, I think this is because computing as a whole has gotten drastically bigger and more diverse in the last 30 years. The trend is toward more heterogeneity in languages, not less.
Rather than there being a false dichotomy, I think it's more like a trichotomy or "worse"
But then if either scripting languages or systems languages IMPROVE, then you're BACK in it.
Why? Why do improvements to those languages automatically widen the gap? I don't see any reason whatsoever why that should be. So I'd be very curious about your reasoning as to why any improvements on either side automatically force the gap to widen.
I mean, couldn't improvements also narrow the gap? At least in principle?
But not just in principle: to me, that seems to be exactly what is happening. For example Go was designed and intended to be a systems language with fewer foot guns and faster compile times than C++. But instead most of the takeup outside Google was apparently from Python programmers who found it gave them the convenience of Python while being faster.
Similarly, many of the improvements to C++ in recent years have been touted as making it possible to use C++ like a scripting language. C++ has become more Pythonic.
"Mojo is an in-development proprietary programming language based on Python available for Linux and macOS. Mojo aims to combine the usability of a high-level programming language, specifically Python, with the performance of a system programming language such as C++, Rust, and Zig. " -- Wikipedia
The approach is very similar to the one I have outlined for Objective-S.
Julia was in fact specifically designed to address this two-language problem (pdf).
So to sum up this part: you claim that improvements automatically cause further divergence, for which I didn't see you give either a logical reason or empirical evidence. I would love to see either if you can provide it. The empirical evidence I have seen seems to actually point in the opposite direction, and I can also give some reasons.
The observation is about what is happening in the wild
Yes...but that is also the property a false dichotomy has: it is a contingent dichotomy that is falsely believed to be a logically necessary one. I am not claiming the distinction doesn't exist. I am claiming it is, like the one with our OSes, not a logically necessary one as it is believed to be, but rather a mostly accidental one. And no, "accidental" in this context does not mean there were or are no reasons. There are. But these are not necessary.
transition from system programming languages such as C or C++ to scripting languages such as Perl or Tcl.
Yes, this is a move that is happening. Just like people moved to use more of the "Objective" part of Objective-C rather than the C part. I am not sure what that has to do with the dichotomy. That more and more of the components are also being programmed in the "scripting" language would, in my mind, again show rather the opposite: a convergence, not a divergence.
You obviously CAN write any Python or JavaScript program in something like Objective-C, but it's worth thinking about why we don't.
Most JavaScript is written to run inside the browser, so you can't run Objective-C. Not even via Emscripten. But Objective-C is really not the point at all. Again, it is just a mostly unintentional proof of concept of a language overcoming the dichotomy. And compared to Python, Objective-C has exactly the problem I wrote: it is a dynamic language placed on top of a systems language. So all the C complexity and foot guns and ceremony are there by default. And it is AOT compiled by default.
WebScript was a pretty nice scripting version of Objective-C, as I point out in the article on the 4 stages of Objective-Smalltalk. But by solving a bunch of issues, it revealed other ones, all of which are rooted in starting with the complexity of a systems language. You have to come from the other side: start with something that looks like a scripting language but can be hardened into a systems language.
This is harder (but not impossible, see Mojo) if the scripting language you start with wasn't designed for later hardening. If it was, it appears to be fairly straightforward. At least that's been my experience with Objective-S.
After all, a lot of what both scripting and systems languages do is the same. You define structs/classes and functions/methods. You have variables, expressions, procedure calls/message sends, usually assignment.
My analysis in Beyond Procedure Calls as Component Glue: Connectors Deserve Metaclass Status shows that what we think of as "systems" programming is actually the special case, and "integration" or "scripting" languages are actually the more general case.
Your observation that more and more programming is moving towards "integration" / "scripting" is evidence in favor of this analysis.
Just like what we think of as "general purpose" programming languages are actually a special case of architecture-oriented programming, specifically DSL for the domain of "algorithms".
It's just that historically we came to it the other way, so integration/scripting/coordination languages are viewed as the DSLs.
You have to come from the other side: start with something that looks like a scripting language but can be hardened into a systems language.
I think D is an example of the main problem with that approach: a high-level language relies on a heavy run-time system (a garbage collector) and so do most of the libraries written in the language. If you want to use the low-level subset of the language you are on your own.
There are lots of high-level languages that have written substantial parts of their runtime in their low-level subset, but that’s a situation where they must rely on nothing so it’s ok that they can’t use normal libraries.
But I agree that if it’s acceptable to require a GC, then it’s reasonable to expect that a language should be easy to use without the 100x performance penalty that scripting languages often have.
Hmmm..
"D, also known as dlang, is a multi-paradigm system programming language created by Walter Bright at Digital Mars and released in 2001." -- Wikipedia. (my emphasis).
So it really shows more the problem of coming from the other side.
However, I absolutely agree with your point about (tracing) garbage collectors: the Go experience notwithstanding, I think tracing garbage collectors are generally unsuitable for systems programming or for scripting languages with aspirations to reach into systems programming.
Apple did this right by using reference counting for both Objective-C and later Swift. Their brief foray into tracing garbage collection (for which I was there) was yet another very convincing data point that tracing garbage collection, despite its obvious benefits, is not fit for purpose, as it is difficult to impossible to subset. It runs a proof of unreachability by determining global reachability.
Reference counting determines unreachability by locally keeping track of reachability. So it is contained.
But I agree that if it’s acceptable to require a GC,
Just use reference counting that you can elide if you can determine the references statically, either manually or using a borrow checker or similar automated techniques.
The default is reference counting, just like the default for numbers is number objects with a numeric tower. You can provide additional information to have the number harden into a machine integer, float, double, or whatever; and you can provide additional information to have the reference not be counted.
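CPython happens to work this way already, which makes for a cheap illustration of reference counting being a purely local bookkeeping scheme (my sketch, not Objective-S):

```python
import sys

# Every CPython object carries a reference count that is adjusted
# locally whenever a reference is created or destroyed -- no global
# reachability scan is ever needed.
xs = [1, 2, 3]
before = sys.getrefcount(xs)

ys = xs                              # new reference: count rises by one
assert sys.getrefcount(xs) == before + 1

del ys                               # reference dropped: count falls back
assert sys.getrefcount(xs) == before
```

Because each increment/decrement pair is visible at a single point in the code, a compiler (or a borrow checker) that can prove the reference's lifetime statically can simply elide the pair, which is exactly the optimization Swift's ARC performs.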
without the 100x performance penalty that scripting languages often have.
That's another one of those things that we take to be true, but that just is not. See WebScript/Objective-C.
Well, I gave an example: say you manage to make your unified "Python and C++" language. It can do everything C++ can.
But like C++, it's not memory safe. Then Rust comes along, and suddenly people have more reason to use Rust than your language.
The goal posts move
(And actually, even C++ 11 moved the goal posts versus C++ 98)
That's not an example for anything we talked about, at least not as far as I can tell. ¯\_(ツ)_/¯
As scripting languages tend to be memory safe, a memory-safe systems language closes the gap between systems programming languages and scripting languages, it doesn't widen it.
Your hypothetical of a Python + C++ that is memory unsafe is your hypothetical and doesn't correspond to anything I wrote. What I talk about is a language that is designed to be both, not a mashup of two existing ones, and certainly not one that includes all of C++ (shudder). That is pretty much the opposite of what I am talking about.
And C++ 11 moved the "goal posts" in the direction that I stated: closer, not farther away.
What happened after C++ 11 was released? Yes, C++ programmers can now write higher level code. That's good
Just like some people write "Python scripts" in Rust. Also good and fine
But in the same period, Python (and JavaScript) grew faster than C++ and Rust. And they also both gained a ton of features
That's why I say the goal posts moved. If you're designing a language that's meant to occupy the design space of Python in 2010, then Python already passed you by
Computing is getting more diverse; not less
asyncio is another great example. Rust has async/await, but it seems too "low level" for many people because of the interaction with ownership.
On the other hand, Python and JavaScript both gained async/await in the last ~decade, and these features are extremely widely used, on both the client and the server
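For reference, the Python side of that feature looks like this (a minimal sketch, with `asyncio.sleep` standing in for real network I/O):

```python
import asyncio

# Minimal async/await: two tasks make progress concurrently on a
# single thread, the model both Python and JavaScript adopted.
async def fetch(name: str, delay: float) -> str:
    await asyncio.sleep(delay)   # stands in for a network request
    return f"{name} done"

async def main() -> list[str]:
    # Run both "requests" concurrently; total time ~ max of the delays.
    return list(await asyncio.gather(fetch("a", 0.01), fetch("b", 0.02)))

print(asyncio.run(main()))  # ['a done', 'b done']
```

Notably, none of this involves ownership or lifetimes, which is part of why the feature felt lightweight enough for mass adoption in those languages.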
Likewise, C++ 20 got coroutines, which is higher level, but my impression is its design problems are more severe than Rust's
In any case, C++ coroutines are not going to "replace" Python and JS. I'd even go as far as to say Python and JS have an inherent advantage in that space
Every language is moving -- asyncio is a great example, with 4 languages getting it recently -- but there is no convergence.
It's more like an explosion of diversity. You can see that with Swift and Zig too
I truly don't know what (or why) you are arguing for or against.
Your original claim was that the gap between scripting and systems languages is growing again because languages are changing. All your examples were doing the opposite: showing a clear narrowing of that gap.
Same now:
C++ programmers can now write higher level code. That's good
Yes, and it narrows the gap. It doesn't widen it as you claimed.
Just like some people write "Python scripts" in Rust.
Once again: narrowing the gap. Not widening it.
Python (and JavaScript) grew faster than C++ and Rust
I first thought you meant they got faster than C++ (nope), but OK, they grow. What does this adoption of programming languages at a particular position have to do with the claimed "widening" of the gap between languages? Nothing.
That's why I say the goal posts moved.
Your claim was that the goal posts moved by widening the gap. I don't see any evidence for that, and all your examples have shown the opposite.
If you're designing a language that's meant to occupy the design space of Python in 2010,
Who talked about occupying the design space of Python? Absolutely nobody. Well, you did. But I don't understand why.
asyncio is another great example.
Of an increasing gap between scripting languages and systems languages? Hardly. All the low and high-level appear to be adopting it. So one more piece of evidence of the gap narrowing.
Last message ...
Python/CUDA as used in deep learning is a prominent example of this -- software systems got simultaneously higher level and lower level
None of this supports your claims.
All the evidence you yourself have presented contradicts your claims.
Not sure what to make of this.
I would rather use 3 languages: a memory-safe language for most code (C#), a language where I can optimize well, and a language in the browser. I dislike the browser so much that lately I've been writing less than 10 lines a year, so I stick to JS, which has full access to the DOM
I didn't like the article but I don't think people need to like what I like. I'm sure there's people who dislike my language and I'm fine with that
shared types which can be thought of as a wrapper over atomic reference-counting
That sounds a lot like Swift's classes.