Never snooze a future
26 points by thunderseethe
"Never snooze a future" sounds like great advice, but the big problem I see with it is... I can barely tell what snoozing a future looks like. I consider myself pretty good at async Rust, having written it since it stabilized around 2019/2020, but all of the examples in the article look so different from one another, and yet so similar to otherwise-valid async Rust. So I'm afraid this advice is basically unactionable.
The larger takeaway I got from this article, like most articles about cancellation safety (/ snoozing futures), is that this kind of issue is baked into the design of async Rust and basically can't be fixed.
I agree with you! That's why the final section is about lints we could add that would catch these things automatically. Part of the difficulty is that those lints would fire for a lot of working code today, though, so we need viable replacements for the patterns that code today is using. New macros like join_me_maybe::join! are the other half of the story, and getting things like that into futures and tokio is probably a prerequisite.
I guess the advice is simply to keep in mind which async constructs will behave badly when acting on locks.
If you do not use async locks (and thus never hold a lock across an await) you should not run into this issue*. And frankly, that has worked pretty well for me.
* Note also that async locks are slower than the std ones, so avoid them if you can.
If you do not hold async locks (and thus locks across awaits) you should not run into this issue*. And frankly that has worked pretty well for me.
I 100% agree and I bring this up in code review when it's relevant. But the advice of "don't hold locks across await points" is different advice than "never snooze a future". Holding a lock across an await point is somewhat easy to notice because it has a relatively simple shape. (There's even a clippy lint for it!)
But for never snoozing a future, it's hard to catch that when it can happen *gestures broadly* almost any time you're driving more than one future on a single task. The author even seems to know that their advice isn't very actionable: "there probably isn't a simple, mechanical rule".
That being said, I agree with the author when they say "I think we can live with that". I see a lot of fear of async Rust online and in-person, but I still think Rust provides the best experience for writing low-level, high-performance async code. I can count on one hand the number of times I've encountered a futurelock in the wild, and every case was using fairly advanced async features and was caught before reaching production.
The takeaway I had from Oxide's recent futurelock incident is that you don't always know what locks something might take in the background (or how that might change). It's definitely a minefield.
Exactly. The original Futurelock bug was an internal semaphore in Tokio's bounded channel implementation. (Sidenote 3 in the article.) Telling folks not to use locks is totally impractical. They're buried in everything, and library authors don't consider it an incompatible change to start using them, nor should they. The point of the long digression about threads is that functions with documented warnings like "this is only safe as long as you don't use threads or locks" are radioactive.
I don't like the advice "do not hold async locks" unless you go back one step and point out that the problem is not lock().await; it's needing a &mut T across an .await in a context that requires Send + Sync (so, any tokio task involving a method that might be called concurrently).
Say you're sending data out on a socket and sharing that socket across many tasks: you need some kind of async lock, or something that resembles one, like channels that serialize mutable access to the resource. Sometimes it's cleaner just to use an async lock instead of spawning a dedicated task that has exclusive ownership of the resource.
I'm quite experienced in Rust and I have long ago decided to keep away from async locks. Maybe some day I'll have to use one, but so far I've managed without :)
As a Rust noob I’m genuinely scared of async. Every time I try to learn how it works or how to use it right, I’m flabbergasted by the complexity and the number of edge cases. I wonder if it’s really worth it. I can understand why JS needed it (because it’s single-threaded), but do we really need it in Rust? Do we really need to squeeze every single bit of performance in a world where JS/PHP/Python/Ruby/Lua collectively dominate?
Do we really need to squeeze every single bit of performance in a world where JS/PHP/Python/Ruby/Lua collectively dominate?
Perhaps JS/PHP/Python/Ruby/Lua collectively dominate when considering programming as a collective whole, but I think it's pretty uncontroversial that they don't uniformly dominate among all programming subfields. Different specific use cases will favor different tradeoffs, so I think it's a good thing that performance-oriented options exist for those use cases where performance is important.
That being said,
I can understand why JS needed it (because it’s single-threaded) but do we really need it in Rust?
"Need" is a pretty high bar to meet IMHO; I don't feel like there are many situations where a particular feature literally makes or breaks a language. Perhaps Rust would have survived without async, but since we don't have a time machine it's hard to say for certain to what extent Rust may or may not have thrived without async.
In any case, if you haven't seen it already you may find withoutboats' blog post Why async Rust? interesting. As the author (a primary designer of Rust's async feature) puts it, it's an "imperfectly organized and overly long explanation of how async Rust came to exist, what its purpose was, and why, in my opinion, for Rust there was no viable alternative", and I think it's well worth the read both for the historical and technical content.
It's worth bearing in mind that they needed a model that scaled down to systems that don't have dynamic memory allocation (or the space for it) and that worked well with the borrow checker. And there's nothing stopping you from effectively using a subset of async functionality that only ever deals in 'static lifetimes, e.g. using spawn everywhere.
It is possible to build a kind of intuition about how it works that'll help you avoid most of the sharp edges, but like the borrow checker, it'll take a minute.
It definitely still has its sharp edges. A lot of the ecosystem also depends on traits like Stream (a.k.a. "AsyncIterator") that eventually need to be in std but aren't yet. And part of my thesis in this article is that Streams need to be fixed in some incompatible ways.
All that said, because the whole apparatus is designed to be no_std-compatible, the underlying machinery isn't actually all that big. Well, Tokio is big, but you can build a toy version of it with a relatively small amount of code. I have a three-part series on this: https://jacko.io/async_intro.html. If you're interested in the topic, maybe take a look at that, and there's a chance it'll "click" enough that you feel comfortable using it?
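In that spirit, the core really is tiny. Here's a minimal single-future `block_on` using only std (a busy-polling sketch with a no-op waker, so it's only suitable for futures that make progress every poll; a real executor parks the thread and uses real wakers):

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Drive a single future to completion by polling it in a loop.
fn block_on<F: Future>(mut fut: F) -> F::Output {
    // A waker that does nothing: we just poll again immediately.
    fn raw_waker() -> RawWaker {
        fn clone(_: *const ()) -> RawWaker { raw_waker() }
        fn noop(_: *const ()) {}
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    let waker = unsafe { Waker::from_raw(raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    // Safety: `fut` is shadowed and never moved again after pinning.
    let mut fut = unsafe { Pin::new_unchecked(&mut fut) };
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => std::thread::yield_now(),
        }
    }
}

fn main() {
    let answer = block_on(async { 6 * 7 });
    assert_eq!(answer, 42);
}
```

Everything else (timers, IO, multiple tasks, real wakeups) is layering on top of this same poll loop, which is what the linked series walks through.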
Do we really need to squeeze every single bit of performance
That's part of the story, but there are other parts too. Embedded environments that don't support threads want a way to represent concurrency. Plus threads can't be cancelled, and futures can be. A lot of real applications end up wishing they could cancel threads, which is part of why Raymond Chen has so many good quotes about how you should absolutely never do that :)
I liked this article a lot! Took me longer than usual to finish because I kept getting distracted by the excellent CSS lol. I gotta adapt some for my own blog...