A Vision for Future Low-Level Languages

23 points by zmitchell

andyc

Unless I missed something, this seems like a very common desire. Most people want to use 1 language rather than 2 or 3. It's less to remember

But the problem is actually doing it :-) And I had to skip to the end fo the article to see:

Many of the things I’ve mentioned in this article are ideals. I don’t know how well some of them will turn out in practice

It feels like it should reference Ousterhout's Dichotomy

Recent related thread on Rust: https://lobste.rs/s/pwsnpd/powerletters_for_rust#c_irvu2i

Also related: Rust’s Ugly Syntax - i.e. it's not "just" the syntax; it's that low level code needs to express more semantics.

That is, more semantics are relevant to low-level code -- you don't want to make things that are relevant to a problem invisible

https://lobste.rs/s/tn9gho/rust_s_ugly_syntax

matklad
And https://lobste.rs/s/jsriyn/on_ousterhout_s_dichotomy :)

Some thoughts:
- it feels like it’s easier to start with a low level language and make it acceptable for high level, than doing the opposite. I’d say Python is a negative example here, as “write hot spots in C” only works if your hotspot is sufficiently coarse-grained/batchable (https://lobste.rs/s/ilpqe5/vectorized_interpreters_mrt_for_pl)
- I think in “C, C++, Rust, Zig” one language is not like the others! Rust is at least decent for high-level stuff. And I think this is historically contingent, because we (still) lack a decent high level language. Only Go seems to have acceptable quality-of-implementation, but it lacks in the language department (null pointers, sum-types, late-arriving generics (note the same trajectory as with Java and C++!)). So there was a historical push to lift Rust, from “ a systems programming language that runs blazingly fast, prevents segfaults, and guarantees thread safety” to “a language empowering everyone to build reliable and efficient software”. Compare to Go’s “open source programming language that makes it easy to build simple, reliable, and efficient software”. It’s interesting how Rust overall trajectory is U-shaped: it’s started roughly at the same place as Go, then went pretty hard down to the basement, and now it is being lifted again.
- in this sense, Zig has different demand-side pressures, as Rust plays a role of “decent high level language” for Zig.
- mpweiher
  
  I might feel easier to start with low-level, but given the experience of Objective-C I think this is a misleading feeling.
  
  Python + C is obviously not an example of "one language", as it is two languages, with exactly the kinds of barriers you'd expect. Well-written Objective-C code can do this, as you can mix and match much more freely. It does take some care to design your high-level, convenient APIs in such a way that they can also be both implemented and, importantly, used efficiently.
  
  A lot of my book iOS and macOS Performance Tuning deals with exactly that sort of design: convenient and fast, based on my almost 40 years of experience writing fast, convenient code with that tooling.
  
  While focused on Objective-C, the principles are more generally applicable, IMHO.
  
  That given high-level languages may not be well-suited is, just with the operating system dichotomy I described above, mostly accidental. If you design a language to span from low- to high-level I think it is much, much better to start with high-level, which is what I've done with Objective-S. You get more low-level, machine-level constructs by hardening.
  
  You harden a late-bound message-send to become a procedure call.
  
  You harden a number object to be a 64 bit integer.
  
  You harden a polymorphic reference to a pointer.
  
  Often this can be accomplished simply by adding more restrictive types. My experience is that you don't actually need to do this a lot, though where exactly you need it is difficult to tell before you start, which is why having the flexibility of a single language is so important.
  
  OPENSTEP even used Objective-C in the kernel, and replacing that with C++ was one of the things the Darwin engineer regretted.
- andyc
  
  I think it's not a dichotomy, but a 7-otomy :-)
  
  Here was another person who wanted "Rust for all levels", with 4 levels:
  
  From Languages to Language Sets
  
  https://gist.github.com/xixixao/8e363dbd3663b6729cd5b6d74dbbf9d4
  
  My response with 7 levels: https://news.ycombinator.com/item?id=43390774 copied at https://lobste.rs/s/1kxvjz/from_languages_language_sets#c_8duyxw
  
  I think with a small desktop app, you can kinda use one language. Although Lua for games is a thing, and Verse is a new thing
  
  But bigger systems are more heterogeneous, and software is getting bigger. Especially cloud systems.
  
  Software systems are also getting older as time passes, so there is more heterogeneity that way too
  
  On the other hand, I don't believe in unlimited diversity -- I still want to "get rid of" Unix sludge, at least in some kind of limited but realistic and useful world (a new distro). But the trends are against that, and even that's a more modest goal than trying to fit Rust and Python in the same language
  
  In reality it kinda works already -- to a degree you can write "Python scripts" in Rust (I don't though)
  
  And Python also spawned Numba and Mojo and a dozen others for fast code
  
  But I think the split will always be there, and the trend is in the opposite direction. Insisting on homogeneity limits you to smaller projects / lower-level projects.
  
  (Or you can move into PL design/implementation and never write any other kind of software for the rest of your life. That is one way to solve the problem. :-) )
mpweiher

I think Ousterhout's Dichotomy, although currently historically and empirically true, is mostly a false dichotomy.

Just like the dichotomy we had in the late 80s and early 90s between easy-to-use but crashy personal computer operating systems like MacOS and Windows 3/95 and rock-solid but difficult-to-use server operating systems.

At the time, it was both empirically true and also thought of like an iron law of nature, a tradeoff you just couldn't escape. Now my phone and my watch run the server operating system Unix. It turned out to be an accident of history.

Objective-C is a mostly unintentional proof-of-concept that you could get both low and high-level in one language, just as NeXTStep was an intentional proof-of-concept that you could get a beautiful, user-friendly server-class operating system.
- icefox
  
  Not sure whether or not I agree with you, but I feel like it's worth observing that in that case the dichotomy was broken by material abundance. Personal computers became fast enough that the overhead required to be rock-solid became relatively unimportant.
  
  Objective-C is an interesting design point to bring up, because it is kinda two languages: One that is very static and compiles to fixed code as much as possible like C (naturally), C++, Rust and Zig, and one that is very dynamic where everything possible might go through a pointer and a dynamic dispatch like, well, Smalltalk, but also like Python or Ruby or most Lisps. The annoyance is that these languages kinda don't overlap very much, but maybe that's a good thing in disguise because it avoids the various hacky edge cases needed to handle, for instance, trait objects in Rust.
  - mpweiher
    
    dichotomy was broken by material abundance
    
    That would have been my guess as well, and there is some superficial truth to it, but it turns out the same machines could run both kinds of OSes. Heck, Unix started on a 16 bit PDP-11. Windows 95 required a 386, the same computer that Linux was developed on.
    
    Regarding Objective-C: yes it is two languages, but smashed into one. I tend to liken it to a bit of a car crash. And they actually overlap a lot, but not really in a good way.
    
    You have two kinds of string, two kinds of arrays. You have two kinds of structuring data: classes and objects. You have functions and methods. You have two kinds of numbers, again with disparate syntax and no numeric tower for the objects (that might be fixable) and no syntax for arithmetic on the object numbers. So quite the mishmash, due to history and due to plopping the Smalltalk bits on top of the C bits.
    
    I am quite convinced that going the other way is much, much simpler. You have numbers. By default they are objects, with a numeric tower. See Some thoughts on security after ten years of qmail 1.0 for reasons why that should be the default. All the arithmetic is quasi-normal message-sending like in Smalltalk, preferably with the SmallInteger optimization. Add an optional "int" type declaration and the "object" becomes a machine integer that gets all its operations statically dispatched etc.
    
    Objective-S also has a fairly fully-fledged meta-object protocol (likely its true innovation) that is fairly agnostic about whether things are evaluated at compile-time or run-time, so it is probably a reasonable vehicle for implementing this sort of successive hardening for low-level machine access.
- andyc
  
  I think the goal posts have moved though. I think that may have been true of Objective C in the past, but I'm not sure how fast it has evolved in recent years
  
  Python handles more problems than Perl, and Rust arguably covers more apps than C++
  
  As an example: as far as I can see, Objective C isn't very memory safe. But both Python and Rust are
  
  So in that sense it's a lower-level language than both!
  
  The goal posts have moved for both low level and high level languages
  - mpweiher
    
    As I wrote, Objective-C was an unintentional proof of concept that this is possible, so yeah, it has lots of shortcomings. Interestingly though, it is remarkably effective despite all those shortcomings, IMHO exactly because it has a sorta-kinda solution to that problem. Even a sorta-kinda solution to this problem is a often a lot better than a well-designed thing that is not a solution to this problem.
    
    In terms of safety, the overall language combines the memory safety of C with the static type-safety of Smalltalk. So absolutely excellent safety properties. ;-)
    
    And yet, in practice it is actually quite safe. Or to be more precise: it is both very straightforward and very easy to use it in safe manner. Unlike plain C or C++ where the default always seems to be "let's crash". For example, the id subset is safe.
    
    And yes, Objective-C can go pretty low, because if you really want to, you can do almost anything you like, just like C. For obvious reasons.
    
    But I keep coming back to "unintentional proof of concept". Objective-S is an intentional design and goes much, much higher. And also aims to make going lower/faster more straightforward and safer while reducing the current overlap. So the default is very high-level, think architectural description, which includes storage abstraction, streaming and dynamic messaging as subsets. When you harden that, mostly by just adding (primitive) type information, you get something Pascal-/Oberon-ish. So closer to the machine, but still not defaulting to foot-cannon. If you want that, if you want C-level access, you need to really want it. But you can get it if you want it.
    
    We'll see how it works out.
    
    andyc
    
    Well I am objecting to this part, because both system and scripting languages are a moving target:
    
    I think Ousterhout's Dichotomy, although currently historically and empirically true, is mostly a false dichotomy.
    
    At the time, it was both empirically true and also thought of like an iron law of nature, a tradeoff you just couldn't escape.
    
    I'm saying you COULD have escaped it at one point (but I'm not saying we did)
    
    But then if either scripting languages or systems languages IMPROVE, then you're BACK in it.
    
    I don't think it will ever be settled, and it's not a false dichotomy ... it may be a harder or easier tradeoff at different points in time
    
    Original: https://web.stanford.edu/~ouster/cgi-bin/papers/scripting.pdf
    
    For the past 15 years, a fundamental change has been occurring in the way people write computer programs. The change is a transition from system programming languages such as C or C++ to scripting languages such as Perl or Tcl.
    
    Because Python and JavaScript are the top two languages, and they are not systems languages, what he said was undoubtedly true.
    
    You obviously CAN write any Python or JavaScript program in something like Objective C, but it's worth thinking about why we don't. The observation is about what is happening in the wild, not about what is theoretically possible.
    
    And the funny thing is that I think he wrote this pre-Java and pre-C#.
    
    I would say that this is a THIRD category of language -- they are neither scripting languages or system languages!
    
    And plenty of programmers stay within one of these categories: systems, scripting, or "business productivity".
    
    As mentioned, I think this is because computing as a whole has gotten drastically bigger and more diverse in the last 30 years. The trend is toward more heterogeneity in languages, not less.
    
    Rather than there being a false dichotomy, I think it's more like a trichotomy or "worse"
    
    mpweiher
    
    But then if either scripting languages or systems languages IMPROVE, then you're BACK in it.
    
    Why? Why do improvements to those language automatically widen the gap? I don't see any reason whatsoever why that should be. So I'd be very curious about your reasoning as to why any improvements on either side automatically force the gap to widen.
    
    I mean, couldn't improvements also narrow the gap? At least in principle?
    
    But not just in principle: to me, that seems to be exactly what is happening. For example Go was designed and intended to be a systems language with fewer foot guns and faster compile times than C++. But instead most of the takeup outside Google was apparently from Python programmers who found it gave them the convenience of Python while being faster.
    
    Similarly, many of the improvements to C++ in recent years have been touted as making it possible to use C++ like a scripting language. C++ has become more Pthonic
    
    "Mojo is an in-development proprietary programming language based on Python available for Linux and macOS. Mojo aims to combine the usability of a high-level programming language, specifically Python, with the performance of a system programming language such as C++, Rust, and Zig. " -- Wikipedia
    
    The approach is very similar to the one I have outlined for Objective-S.
    
    Julia was in fact specifically designed to address this two language problem pdf.
    
    So to sum up this part: you claim that improvements automatically cause further divergence, for which I didn't see you give either a logical reason or empirical evidence. I would love to see either if you can provide it. The empirical evidence I have seen seems to actually point in the opposite direction, and I can also give some reasons.
    
    The observation is about what is happening in the wild
    
    Yes...but that is also the property a false dichotomy has: it is a contingent dichotomy that is falsely believed to be a logically necessary one. I am not claiming the distinction doesn't exist. I am claiming it is, like the one with our OSes, not a logically necessary one as it is believed to be, but rather a mostly accidental one. And no, "accidental" in this context does not mean there were or are no reasons. There are. But these are not necessary.
    
    transition from system programming languages such as C or C++ to scripting languages such as Perl or Tcl.
    
    Yes, this is a move that is happening. Just like people moved to use more of the "Objective" part of Objective-C rather than the C part. I am not sure what that has to do with the dichotomy. That more and more of the components are also being programmed in the "scipring" language would, in my mind, again show rather the opposite: a convergence, not a divergence.
    
    You obviously CAN write any Python or JavaScript program in something like Objective C, but it's worth thinking about why we don't.
    
    Most JavaScript is written to run inside the browser, so you can't run Objective-C. Not even via emscriptem. But Objective-C is really not the point at all. Again, it is just a mostly unintentional proof of concept of a language overcoming the dichotomy. And compared to Python, Objective-C has exactly the problem I wrote: it is a dynamic language placed on top of a systems language. So all the C complexity and foot guns and ceremony are there by default. And it is AOT compiled by default.
    
    WebScript was a pretty nice scripting version of Objective-C, as I point out in the article on the 4 stages of Objective-Smalltalk. But by solving a bunch of issues, it revealed other ones, all of which are rooted in starting with the complexity of a systems language. You have to come from the other side: start with something that looks like a scripting language but can be hardened into a systems language.
    
    This is harder (but not impossible, see Mojo) if the scripting language you start with wasn't designed for later hardening. If it was, it appears to be fairly straightforward. At least that's been my experience with Objective-S.
    
    After all, a lot of what both scripting and systems languages do is the same. You define structs/classes and functions/methods. You have variables, expressions, procedure calls/message sends, usually assignment.
    
    My analysis in Beyond Procedure Calls as Component Glue: Connectors Deserve Metaclass Status shows that what we think of as "systems" programming is actually the special case, and "integration" or "scripting" languages are actually the more general case.
    
    Your observation that more and more programming is moving towards "integration" / "scripting" is evidence in favor of this analysis.
    
    Just like what we think of as "general purpose" programming languages are actually a special case of architecture-oriented programming, specifically DSL for the domain of "algorithms".
    
    It's just that historically we came to it the other way, so integration/scripting/coordination languages are viewed as the DSLs.
    
    fanf
    
    You have to come from the other side: start with something that looks like a scripting language but can be hardened into a systems language.
    
    I think D is an example of the main problem with that approach: a high-level language relies on a heavy run-time system (a garbage collector) and so do most of the libraries written in the language. If you want to use the low-level subset of the language you are on your own.
    
    There are lots of high-level languages that have written substantial parts of their runtime in their low-level subset, but that’s a situation where they must rely on nothing so it’s ok that they can’t use normal libraries.
    
    But I agree that if it’s acceptable to require a GC, then it’s reasonable to expect that a language should be easy to use without the 100x performance penalty that scripting languages often have.
    
    mpweiher
    
    Hmmm..
    
    "D, also known as dlang, is a multi-paradigm system programming language created by Walter Bright at Digital Mars and released in 2001." -- Wikipedia. (my emphasis).
    
    So it really shows more the problem of coming from the other side.
    
    However, I absolutely agree with your point about (tracing) garbage collectors: the Go experience non-withstanding, I think tracing garbage collectors are generally unsuitable for a systems programming or for scripting languages with aspirations to reach into systems programming.
    
    Apple did this right by using reference counting for both Objective-C and later Swift. Their brief foray into tracing garbage collection (for which I was there) was yet another very convincing data point that tracing garbage collection, despite its obvious benefits, is not fit for purpose, as it is difficult to impossible to subset. It runs a proof of unreachability by determining global reachability.
    
    Reference counting determines unreachability by locally keeping track of reachability. So it is contained.
    
    But I agree that if it’s acceptable to require a GC,
    
    Just use reference counting that you can elide if you can determine the references statically, either manually or using a borrow checker or similar automated techniques.
    
    The default is reference counting, just like the default for numbers is number objects with a numeric tower. You can provide additional information to have the number harden into a machine integer, float double or whatever; and you can provide additional information to have the reference not be counted.
    
    without the 100x performance penalty that scripting languages often have.
    
    That's another one of those things that we take to be true, but that just is not. See WebScript/Objective-C.
    
    andyc
    
    Well, I gave an example: say you manage to make your unified "Python and C++" language. It can do everything C++ can.
    
    But like C++, it's not memory safe. Then Rust comes along, and suddenly people have more reason to use Rust than your language.
    
    The goal posts move
    
    (And actually, even C++ 11 moved the goal posts versus C++ 98)
    
    mpweiher
    
    That's not an example for anything we talked about, at least not as far as I can tell. ¯_(ツ)_/¯
    
    As scripting languages tend to be memory safe, a memory-safe systems language closes the gap between systems programming languages and scripting languages, it doesn't widen it.
    
    Your hypothetical of a Python + C++ that is memory unsafe is your hypothetical and doesn't correspond to anything I wrote. What I talk about is a language that is designed to be both, not a mashup of two existing ones, and certainly not one that includes all of C++ (shudder). That is pretty much the opposite of what I am talking about.
    
    And C++ 11 moved the "goal posts" in the direction that I stated: closer, not farther away.
    
    andyc
    
    What happened after C++ 11 was released? Yes, C++ programmers can now write higher level code. That's good
    
    Just like some people write "Python scripts" in Rust. Also good and fine
    
    But in the same period, Python (and JavaScript) grew faster than C++ and Rust. And they also both gained a ton of features
    
    That's why I say the goal posts moved. If you're designing a language that's meant to occupy the design space of Python in 2010, then Python already passed you by
    
    Computing is getting more diverse; not less
    
    asyncio is another great example. Rust has async/await, but it seems too "low level" for many people because of the interaction with ownership.
    
    On the other hand, Python and JavaScript both gained async/await in the last ~decade, and these features are extremely widely used, on both the client and the server
    
    Likewise, C++ 20 got coroutines, which is higher level, but my impression is its design problems are more severe than Rust's
    
    In any case, C++ coroutines are not going to "replace" Python and JS. I'd even go as far as to say Python and JS have an inherent advantage in that space
    
    Every language is moving -- asyncio is a great example, with 4 languages getting it recently -- but there is no convergence.
    
    It's more like an explosion of diversity. You can see that with Swift and Zig too
    
    mpweiher
    
    I truly don't know what (or why) you are arguing for or against.
    
    Your original claim was that the gap between scripting and systems languages is growing again because languages are changing. All your examples were doing the opposite: showing a clear narrowing of that gap.
    
    Same now:
    
    C++ programmers can now write higher level code. That's good
    
    Yes, and it narrows the gap. It doesn't widen it as you claimed.
    
    Just like some people write "Python scripts" in Rust.
    
    Once again: narrowing the gap. Not widening it.
    
    Python (and JavaScript) grew faster than C++ and Rust
    
    I first thought you meant they got faster than C++ (nope), but OK, they grow. What does this adoption of programming languages at a particular position have to do with the claimed "widening" of the gap between languages? Nothing.
    
    That's why I say the goal posts moved.
    
    Your claim was that the goal posts moved by widening the gap. I don't see any evidence for that, and all your examples have shown the opposite.
    
    If you're designing a language that's meant to occupy the design space of Python in 2010,
    
    Who talked about occupying the design space of Python? Absolutely nobody. Well, you did. But I don't understand why.
    
    asyncio is another great example.
    
    Of an increasing gap between scripting languages and systems languages? Hardly. All the low and high-level appear to be adopting it. So one more piece of evidence of the gap narrowing.
    
    andyc
    
    Last message ...
    
    Ousterhout's dichotomy is real, and will never be settled permanently
    
    Programming requirements (and Python and JavaScript) are getting higher-level at a faster rate than Rust or C++ are getting higher level
    
    Programming requirements (and C++/Rust/CUDA) are getting lower-level at a faster rate than Python and JavaScript are getting lower level
    
    Simultaneously, Python/JS are getting lower level, and C++/Rust are higher level, to a degree
    
    Python/CUDA as used in deep learning is a prominent example of this -- software systems got simultaneously higher level and lower level
    
    mpweiher
    
    None of this supports your claims.
    
    All the evidence you yourself have presented contradicts your claims.
    
    Not sure what to make of this.
    
    levodelellis
    
    I would rather use 3 languages. A memory safe language for most code (C#), A language where I can optimize well, and a language in the browser. I dislike the browser so much that lately I've been writing less than 10 lines a year so I stick to JS which has full access to the dom
    
    I didn't like the article but I don't think people need to like what I like. I'm sure there's people who dislike my language and I'm fine with that
    
    turbolent
    
    shared types which can be thought of as a wrapper over atomic reference-counting
    
    That sounds a lot like Swift's classes.