Move Expressions
22 points by robinhundt
In this case, there are two aspects I disagree with:
First, I think the magical keyword-that-looks-like-a-function syntax and the associated inconsistency that needs to be taught and remembered is going in the wrong direction. This is the reason that macro invocation requires ! to distinguish it from function calls.
Second, I think that existing move is already bad enough as a means of burying "this will be transferred to another thread/task/whatever and likely outlive this scope" inside the body of the closure rather than in its signature. This would double down on allowing that sort of thing.
In both cases, I think this proposal would go in the direction of less consistency with the design philosophy of the language's existing syntax and add more "outsized complexity".
The magical keyword-that-looks-like-a-struct-field .await worked great, despite endless complaints that it reuses an existing syntax. In practice, the fact that it's a reserved keyword and gets syntax-highlighted as a keyword in editors solved the issue. I suspect move will be fine too. Macros use !, because macro names are not keywords.
It is a very sweet syntax sugar, but doing nothing means there is { let foo = foo.clone() } required around closures. If you write the kind of code that needs this pattern, it gets really tiring. Doing something minimal either has special-case syntax sugar that works only with simple values or a single trait, or requires move lists with duplicated expressions, which is barely any better (you trade let x = for move(x)). This one generalizes well to arbitrary expressions, and doesn't need them written twice in the source code.
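For readers who haven't hit this pattern, here is a minimal sketch of the clone-in-a-block boilerplate described above, as it has to be written in today's Rust. The helper name spawn_len and the Arc<String> payload are made up for illustration, and the move(...) form appears only in a comment, since it is proposed syntax rather than something that compiles today.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical helper: the worker thread needs its own handle to `config`.
fn spawn_len(config: &Arc<String>) -> thread::JoinHandle<usize> {
    // Today's pattern: clone inside an enclosing block and shadow the name,
    // so the `move` closure takes the clone and the caller keeps the original.
    {
        let config = Arc::clone(config);
        thread::spawn(move || config.len())
    }
    // Under the proposal this would be written roughly as
    //     thread::spawn(|| move(config.clone()).len())
    // which is NOT valid Rust today and is shown only for comparison.
}

fn main() {
    let config = Arc::new(String::from("hello"));
    let len = spawn_len(&config).join().unwrap();
    // `config` is still usable here because only the clone was moved.
    println!("len = {len}, original = {config}");
}
```

The extra block and the shadowed binding are the part that gets tiring when every spawned closure needs two or three of them.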
As far as I remember, I never had an issue with .await... which makes sense because, while it IS a keyword, within the higher-level abstraction of "async lets you block without blocking", it still behaves like a method... just one that needs to be a compiler intrinsic. Thus, I don't consider it to qualify for the definition of "magical" that I intended here.
I dislike move() for the same reason I dislike proposed things like .super. It complicates wrapping one's mind around how data and effects flow around the syntax.
(I'm reminded of things like ON ERROR in BASIC and the ultimate caricature of all that's wrong with complicating flow... COME FROM in INTERCAL. For those who haven't read up on the parody language known as INTERCAL, it's the inverse of GOTO. Your code can be just puttering along and then be yoinked away without warning by a COME FROM anywhere else in the codebase.)
...or, for that matter, bad Python code where whether a function raises a NameError depends on where it's called from because it references variable bindings expected to be declared in earlier stack frames.
It's great to see all the different ideas to make ref counting more ergonomic, but so far I can't say I've seen a solution that fully resonated. Anyone else?
There are many proposals for actually ergonomic ref counting that appeal to me; I don't have a preference among them (e.g. trait based or binding based) as long as they actually solve the problem, which Niko's proposals do not.
I really can't take seriously the "muh systems language" crowd that demands everything be fully explicit, in a language that already has arbitrary implicit code running on scope exit. Rust doesn't even have a cultural stigma on expensive drops (e.g. recursive deallocation)! If you don't like automatic reference counting then don't use it.
For me, both explicit allocation and implicit deallocation on drop cut in the direction of encouraging memory frugality, so that asymmetry is actually a good thing.
It's not "muh systems language" but "I came from Python to Rust for the strong compile-time guarantees but, the more I use it, the more I become used to the memory efficiency".
I don't want Rust to wind up being de facto sloppy about memory consumption because of its ecosystem in the same way that D is de facto garbage collected because of how much of its ecosystem depends on the optional garbage collector.
I personally still prefer the block syntax over a function call syntax for consistency, and I guess I'm a little partial to super { } instead of move { } as it just feels a little more "logical" to me: the block isn't moving something but it is itself moving into the upper or super scope.
the block isn't moving something but it is itself moving into the upper or super scope.
It's kind of both. For || move(x.clone()):
1. x.clone() call is moved (hoisted) to outside the closure; as you're pointing out
2. x.clone() is being moved into the closure as it's captured by ownership
2 is Rust's usual meaning for move, hence the proposal using that.
If I really wanted to, I guess I could argue that a block like { x.clone() } returns a value by ownership, and this return value gets stored in a variable name, like let foo = { x.clone() };; now when you access foo from a closure you're merely referencing it, of course. But! Put that block in a closure and throw a super in there for good measure, and now the "return" from the block just "happens" inside the closure: so in a way, there's no move from outside of the closure into the closure, as the value never was really nameable outside the closure.
But of course, this is just semantics and indeed it's fully valid to argue that indeed the x.clone() is moved into the closure as well.
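The two senses of "move" in this sub-thread can be made concrete by writing out by hand what || move(x.clone()) is assumed to desugar to. This is a sketch of my reading of the proposal, not its official expansion; make_closure and tmp are illustrative names. Rc's strong count is used only to make the evaluation time observable.

```rust
use std::rc::Rc;

// Hand-written stand-in for the proposed `|| move(x.clone())`
// (assumed desugaring, written in today's Rust).
fn make_closure(x: &Rc<i32>) -> impl FnOnce() -> Rc<i32> {
    // Sense 1: the clone expression is hoisted, so it runs here, when the
    // closure is created, not later when the closure is called.
    let tmp = x.clone();
    // Sense 2: the resulting value is captured by ownership.
    move || tmp
}

fn main() {
    let x = Rc::new(7);
    let f = make_closure(&x);
    // The second reference already exists although `f` has not run yet.
    println!("strong count before call: {}", Rc::strong_count(&x));
    let y = f();
    println!("value from closure: {}", *y);
}
```

Note that tmp is never nameable by the caller, which matches the observation above that the value "never was really nameable outside the closure".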
I'd love the super {} idea if it didn't also imply move. That is, I'd prefer:
move || iter.inspect(|i| inspect_item(i, super { tx.clone() }.clone()))
- move keyword on the closure that we know and "love"
- super is a keyword with an associated block expression, ala if or for and friends
super without a move:
if (connect_thinga) super { ThingA::new() }.connect();
if (connect_thingb) super { ThingB::new() }.connect();
if (connect_thingc) super { ThingC::new() }.connect();
Haha wait a second this is horrible! What are the semantics of a move expression in a normal function? 😆
I think this is going in a good direction. I especially like that the desugaring of move expressions is very easy to explain. However I'm wondering what happens if you're referring to a place inside a closure once inside a move expression and once outside of it in the closure. Will the place be captured twice? Will the move expression influence the other place that is not inside the move expression? Would it just be a compile error? I'm not sure what the best option would be here.
Also, I remember at some point there were discussions about &move references ( I found this pre-RFC https://internals.rust-lang.org/t/pre-rfc-move-references/14511 ). I'm not sure what the state of that discussion is, but it seems that these move references and move expressions could be syntactically ambiguous, or potentially confusing when reading code that uses both.
There is pretty serious work proceeding on &own T and &uninit T for usage in in-place initialisation. This seems like a worthwhile consideration on the design space, but beyond that may be unlikely to be picked up.
This looks great. "Just call clone()" felt too magical, and the capture list was just.. weird (substituting a "plain" path is okay, but saying foo.bar().baz doesn't actually call bar() feels like a bridge too far). This seems like it should integrate much nicer than either of those.
One concern I do have is that move(blah) feels too much like a "regular" function call given what it does. I'd rather see it as a block (move { blah }).
I'm also not sure on the postfix question.. the reasoning makes sense, but dealing with C# recently really helped me appreciate how much comfier foo().await.bar().await is to write (or read) than await (await foo()).bar(). On the plus side, I guess you wouldn't really end up nesting move blocks.
I have the perfect syntax for this:
$example = function () use ($message) {
var_dump($message);
};
I found it a bit un-intuitive. For instance the first example:
tokio::task::spawn(async {
    do_something_else_with(
        move(self.some_a.clone()),
        move(self.some_b.clone()),
        move(self.some_c.clone()),
    )
});
I would expect the closure to still capture &self because using async move today would do that.
Yeah, this seems like a pretty fatal flaw, that the expression within the 'move' must be evaluated before the async block.
That's the whole point of this feature.
In this case you must call .clone() before the async block, but any syntax that keeps these expressions outside of the block needs some variable name or duplicated expression to refer to it inside the block.
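To illustrate why evaluating the clones before the block matters, here is a rough equivalent in today's Rust. It swaps tokio::task::spawn for std::thread::spawn so that it stays self-contained; both require the spawned work to be 'static, so neither can hold &self. The Service type and its fields are hypothetical stand-ins for the self in the quoted example.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical type standing in for `self` in the quoted example.
struct Service {
    some_a: Arc<String>,
    some_b: Arc<String>,
}

impl Service {
    fn spawn_total_len(&self) -> thread::JoinHandle<usize> {
        // The clones run here, before the closure exists, so the closure
        // owns two `Arc`s and never captures `&self`.
        let a = self.some_a.clone();
        let b = self.some_b.clone();
        thread::spawn(move || a.len() + b.len())
    }
}

fn main() {
    let service = Service {
        some_a: Arc::new(String::from("ab")),
        some_b: Arc::new(String::from("cde")),
    };
    let total = service.spawn_total_len().join().unwrap();
    println!("total length: {total}");
}
```

If the clones were instead evaluated inside the closure, it would have to capture self by reference and thread::spawn would reject it for not being 'static.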
Does anyone know why rust closures capture by reference implicitly? I understand closure authors usually want that because they are usually trying to reference a value that still needs to be used by the surrounding scope and often mutated by the surrounding scope. But with the default move rules in the rest of the language, it seems like it would be pretty comfortable to capture by reference within the closure by prefixing with an ampersand the same as non-closure code. It would be pretty difficult to accidentally move a value into a closure when you really didn't intend to since you would get an error due to the same value being referenced later in the surrounding scope.
It’s more complex than that: https://doc.rust-lang.org/reference/expressions/closure-expr.html#r-expr.closure.capture-inference
In general, going from least to most feels better to me than going from most to least. So capturing by &, then &mut, then by ownership, with the override to force all captures by owning, just feels right.
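A small sketch of that least-to-most ordering, with one closure per capture mode. The function and variable names are made up for illustration; the behaviour is what the capture-inference rules linked above describe.

```rust
fn capture_modes() -> (usize, Vec<i32>, usize) {
    let names = vec![String::from("a"), String::from("b")];
    let mut log: Vec<i32> = Vec::new();
    let owned = String::from("moved");

    // Only reads `names`, so it is captured by shared reference.
    let count = || names.len();
    let n = count();

    // Mutates `log`, so it is captured by unique (mutable) reference.
    let mut record = |v: i32| log.push(v);
    record(1);
    record(2);

    // Consumes `owned`, so it is captured by value and the closure is FnOnce.
    let consume = || drop(owned);
    consume();

    // `names` and `log` are usable again here; `owned` is gone.
    (n, log, names.len())
}

fn main() {
    let (n, log, still_there) = capture_modes();
    println!("{n} {log:?} {still_there}");
}
```

Putting move in front of any of these closures would force all three captures to be by value, which is the override mentioned above.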
I appreciate the nuance here, that each is subject to inference rules that determine how to capture each individually. But if I accept your intuition about going from least to most, I would reframe my question as: does anyone know why rust function call arguments are not subject to the same inference rules as closure captures? In other words this code could avoid the type error:
fn strlen(s: &String) -> usize {
    s.len()
}
let c = String::from("foo");
println!("{}", strlen(c));
// expected `&String`, found `String`
Is there a self-contained rationale for the asymmetry? Or is the asymmetry purely a byproduct of unrelated decisions?
Well, that wouldn't be inference: it would be a coercion. Rust does very few coercions, as a general rule. There's one or two big ones, and they don't apply in this case.
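As a sketch of the distinction: the snippet below shows the explicit borrow that the quoted strlen call needs, next to deref coercion, which I take to be one of the "big ones" meant here (that reading is my assumption). str_len is a made-up companion function.

```rust
fn strlen(s: &String) -> usize {
    s.len()
}

// Hypothetical companion taking the more idiomatic `&str`.
fn str_len(s: &str) -> usize {
    s.len()
}

fn main() {
    let c = String::from("foo");

    // `strlen(c)` is rejected: nothing coerces `String` to `&String`,
    // so the borrow has to be written out.
    let a = strlen(&c);

    // Deref coercion does apply here: `&String` is coerced to `&str`.
    let b = str_len(&c);

    println!("{a} {b}");
}
```

In both calls the & is still written by the caller; the coercion only changes what the reference points at, not whether a borrow happens.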
Captures were inferred because of the ergonomics of the way closures are usually used: as parameters to functions. Most of the time, the usage follows the pattern to infer captures, and so it felt redundant and noisy to include all the time. Furthermore, Rust tends not to infer function signatures, but does in closures, because closures written like this don't form an external API, and so the breakage and error message concerns don't apply.
Of course, sometimes people want explicit captures, for complex cases that feel not nice to work around, which is where some of the discussions around all of this have ended up.
I think the "problem" there isn't really about the closure syntax, but about the method call desugaring. For example, if you have:
impl String {
fn len(&self);
}
let s: String;
let f = || s.len();
Then f captures s by borrow because it desugars as || (&s).len() => String::len(&s), which is trivially a borrow.
The consequence is a bit less obvious, but it's the same desugaring that happens for all (borrowing) method calls.
Where it gets less obvious is that it will always (unless you use move explicitly) capture Copy types by borrow, since it doesn't need to invalidate the old object. It's understandable how we got there, but it also tends to end up causing the silliest lifetime errors (which can't be fixed with the usual std::convert::identity() trick that can otherwise force rustc to capture a particular variable).
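A minimal sketch of the Copy case: returning a closure that uses an i32 parameter. make_adder is an illustrative name, not something from the thread.

```rust
fn make_adder(n: i32) -> impl Fn(i32) -> i32 {
    // Without `move`, the closure would capture `n` by reference even though
    // `i32` is Copy, and the compiler would reject it because the closure
    // may outlive `n`. With `move`, `n` is copied into the closure.
    move |x| x + n
}

fn main() {
    let add_two = make_adder(2);
    println!("{}", add_two(3));
}
```

Dropping the move keyword here produces exactly the kind of lifetime error described above, for a value that costs nothing to copy.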
I may be missing your point but I find your example is an echo of mine. Given these two fn signatures:
impl String {
fn len(&self) -> usize;
}
fn strlen(s: &String) -> usize;
Both require a &String as their first argument but they require the caller pass it differently:
let title = String::from("...");
title.len(); // fn signature determines use
strlen(title); // error
You are showing that fn arguments are given similar use-based inference as closure captures when they are considered the subject of a method call. That seems in line with the least to most principle that Steve mentioned for closure inference. So we now have two cases where these inference rules are shown to be useful: method calls and closures. So my question is: given this proven usefulness, why aren't the same rules applied to all fn arguments?
This is also not an inference issue, though it's much more shaped like it than the other one: this is how method lookup is defined, specifically. It is willing to ref and deref to find an appropriate target type. It's defined this way for methods because we didn't see the value in making people write . vs ->. It's not for any other arguments in methods, only the receiver, and so is consistent with the rest of function syntax just the same.
why aren't the same rules applied to all fn arguments?
The rules are different only for methods because of that additional syntax to invoke them and because in practice without it code gets very messy. Rust doesn't auto-ref or auto-deref other things because of a general stance against such conversions: in a systems language, people enjoy control, and too much "magic" tends to upset people. There is a (relatively small) group of people who wish this rule were removed, rather than more inference being added to more places in Rust.
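The receiver-only rule can be seen by writing the same call three ways. strlen follows the earlier example in this thread; three_ways is a made-up name.

```rust
fn strlen(s: &String) -> usize {
    s.len()
}

fn three_ways(title: String) -> (usize, usize, usize) {
    // Method syntax: lookup auto-references the receiver, as if `(&title).len()`.
    let a = title.len();

    // The same method in plain function form: the `&` must be written out.
    let b = String::len(&title);

    // Ordinary argument: no auto-ref, so `strlen(title)` would not compile.
    let c = strlen(&title);

    (a, b, c)
}

fn main() {
    println!("{:?}", three_ways(String::from("title")));
}
```

Only the first form gets the automatic reference, and only for the value in receiver position.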
A lot of design details surrounding closures appear to have been optimized for use by the methods on the Iterator trait. I assume that's another one.
The moved value comes in via the explicit argument(s), and captures are for comparing against stuff adjacent to but not passing through the iterator chain.
What about nested closures? If move expressions only applied to individual variables then they could be hoisted through the appropriate number of nested closures to where the variable is bound. But with arbitrary expressions, I'm not sure there is any viable way but hoisting up a single level. This is not sufficiently general for the use cases I'm interested in.
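For context, this is roughly what nested closures require by hand in today's Rust: one clone per closure level, each placed just outside the closure that needs it. It is only a sketch of the manual pattern with illustrative names (nested, outer_tx, inner_tx), not a statement of how the proposal would treat nesting.

```rust
use std::rc::Rc;

fn nested(tx: &Rc<String>) -> impl Fn() -> Box<dyn Fn() -> usize> {
    // Level 1: clone hoisted out of the outer closure.
    let outer_tx = tx.clone();
    move || {
        // Level 2: clone hoisted out of the inner closure only, so it runs
        // every time the outer closure is called.
        let inner_tx = outer_tx.clone();
        Box::new(move || inner_tx.len())
    }
}

fn main() {
    let tx = Rc::new(String::from("abc"));
    let outer = nested(&tx);
    let inner = outer();
    println!("len = {}, handles = {}", inner(), Rc::strong_count(&tx));
}
```

A move expression that hoists a single level would correspond to only one of these two clones, which seems to be the limitation raised here.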