Linear types proposal for Hare
47 points by runxiyu
47 points by runxiyu
I'm not sure this is really a proposal? It reads more like a collection of rough notes on what linear types in Hare might look like, but with very few details on how this would actually achieved, whether it would be memory safe and so on.
memory safety is one utility, but I think you can get good API design utility from having affine/linear types.
Ownership is present as an issue in many PLs, just Rust exposes that to you and lets you annotate it
I find the destructor part interesting, a good example of how previous design choices can make new ideas very difficult. Mojo is the language I know with good Linear type support, the design looks so simple because of how it composed with existing design decisions. There is no explicit "linear" keyword, instead you can achieve the behaviour for certain types by telling the compiler to disable implicit dtor for said types:
It seems like though in Mojo you can't borrow a resource, you always have to move. At least I couldn't find anything about borrowing in the docs
It supports borrowing... see argument conventions; this example should work. I'll start a conversation in the Discord, looks like a bug
edit: this works in nightly: https://godbolt.org/z/1Yahxrooe. It wasn't working in 0.26.1 because def functions have a different semantics there which have been changed.
Since the article mentions Rust, it's worth saying that Rust has affine types, not linear types. Affine means they must be used zero or one times. This is the case in Rust because it's possible, via things like reference count cycles with Rc or Arc, to create types which are never cleaned up (put another way: Rust considers memory leaks safe, and doesn't guarantee freedom from them).
This is the case in Rust because it's possible, via things like reference count cycles with Rc or Arc, to create types which are never cleaned up (put another way: Rust considers memory leaks safe, and doesn't guarantee freedom from them).
Well no, rust has affine types because that's what its types are, you can just not use a value and you might get a few warnings but the compiler will not actually prevent you from doing that. That was the case long before leakpocalypse.
Before leaks even get into the ring, panics make linear types difficult, if a value must be used exactly once and something can panic between its creation and use, then it's not going to work. You'd either need to add panic safety and interleave it with linear types, or (I think more likely if something like linear rust ever appears) have "best effort" linear types: pseudo-linear types which the compiler requires you to consume, but which still have a drop in case of unwinding, and don't guarantee either is called (because both panic=abort and leaks).
This does depend on how you think about Drop, though. If you switch from looking at the surface syntax to something where drops have been elaborated in, then Rust is a lot closer to linear- both "just don't use a value" and "panic" still invoke Drop.
Well, I'm really unsure how fixed arrays with linear types would work, and I suspect this would be one of the hardest parts (since linear types obviously have no concept of a zero value). Though I suspect in these situations you'd want a fixed-capacity dynamic array / slice / whatever the new name will be, instead of a fixed array, anyway.
Uninitialized memory is always hard, and that's basically what a fixed array full of linear types is... you probably would have to cover it up and make it look like a fixed-capacity dynamic array.
Hmm, now that I think of it, in Rust you can often hide it by making initialization happen either all at once or not at all... let arr: [Thing; 8] = (0..8).map(|i| make_thing(i)).collect() or something like it would let you populate a fixed-length array without ever having to say "some of these values are uninitialized" outside of the collect() impl... Not sure whether or not rustc is smart enough to actually do that in practice though.
I would be tempted to just support this feature on top level types and not worry about nested types.
Maybe you only get like 60 percent of the value, but I feel like you could keep things really simple.
Since we're chatting all things affine....
is there a theoretical basis for something to be affine-generic? Like "if you can consume this, consume it... if you can't, copy it instead"?
I often find myself wanting to write a functional API but to have the API be "smart enough" to just consume inputs that will get dropped. Right now in Rust I think that we don't really have that ability, and it's up to the user of the API to make these decisions
Cow supports this fairly easily. Not monomorphised at compile time, and not elegant, but not bad. Take a Cow, or impl Into<Cow<…>> if you prefer (advantages and disadvantages, debated), and you can branch on whether it's Owned or Borrowed, and call .into_owned() at such time as you just wish to maybe-clone and consume it.
There's Perceus: reference counting with reuse with annotations to track when a function can consume its arguments.
Lots of systems also do this with reference counted objects in the form of copy-on-write, including Rust's Rc::make_mut.
This doesn't exist in Rust because of a couple of intersecting design decisions.
T, &T, &mut T can all be taken in or returned)Clone must be cloned explicitly: there is no implicit &T to T conversion.impl on a type you do not control.If any of these were not the case, you'd be able to write code this way, but their combination forces you to have ex. separately named Vec::iter(&self) and Vec::into_iter(self) methods (even though their signatures are different). And so it is up to the user of the API to manage, yes.
This is a design tradeoff that Rust very intentionally makes. It is verbosity (and expressiveness, c.f. that first design decision) in exchange for clarity around behaviour / performance. As the user, you have to keep track of ownership when you are using or writing APIs. In exchange, you get to be able to read code and immediately know 1) what's actually being called when a function is called (because no function overloading) or when a method is called on a type (because coherency i.e. orphan rules) and 2) vaguely what the performance characteristics of that call will be (due to calls to clone() being overt + the naming convention of foo() vs. foo_into() throughout the ecosystem).
(I too mope and cry at the lack of function overloading and the strictness of orphan rules every time I write Rust... I don't care about performance! Give me nice syntax! Of course, this is not what most people want out of Rust, so, you know. It's my own fault for using it like OCaml.)
But that is to say, drop any of these and you can do what you're describing! This is not incompatible with an ownership system. It is incompatible at a design level with Rust's ownership system, since the whole language is so geared towards performance + clarity + expressiveness.
Unfortunately, explicit ownership systems have not really taken off outside of Rust. This is... I dunno, it kind of makes sense. If you're using linear typing for performance then you're also probably going to want to be explicit about performance around memory and not have a garbage collector, and if you're having linear types and not having a garbage collector it seems kinda crazy to not use those linear types for managing memory as well. (This is my criticism of the submitted article, by the way.)
And then you pretty much get Rust.
Implicit ownership semantics, though... ~rpjohnst mentions the Perseus algorithm. That's a reference counting system that's been enhanced by ownership analysis: when the compiler can detect linear use of an object, it will perform a tonne of optimizations, i.e. eliding counts and deallocating early to let the compiler reuse memory directly. Which I believe is similar to what you're describing.
That Perseus system is implemented in Koka, and also Lean. It's been a while since I read the paper and I'm unfamiliar with the details of each implementation (or any innovations since 2023), but the primary optimization (count elision) can happen without user annotations, while I'm unsure if the memory reuse (they call this "functional-but-in-place") can occur without the user telling the compiler how to do so (via "resource tokens" etc), or if this part is implemented in Lean at all. It's a cool system though, and I think having a groundwork like that makes it easier to justify exposing a linear system for resources à la this article...