parking_lot: ffffffffffffffff
62 points by jparise
Love a technically juicy postmortem, even if the Rust nuance was kind of lost on me as I don’t use the language.
This is why I dogmatically obtain locks and explicitly `drop(...)` them more often than not anymore. I’ve been bitten by this a few times now.
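A minimal sketch of the habit described above, assuming a plain `std::sync::Mutex` (the function name `add_then_read` is made up for illustration). Dropping the guard by hand makes the end of the critical section visible in the code instead of leaving it to scope rules:

```rust
use std::sync::Mutex;

/// Adds `amount` to the counter, releases the lock explicitly,
/// then takes the lock a second time to read the value back.
pub fn add_then_read(counter: &Mutex<u64>, amount: u64) -> u64 {
    let mut guard = counter.lock().unwrap();
    *guard += amount;
    // Release the lock here rather than at the end of the function.
    drop(guard);

    // Taking the lock again is only safe because the first guard is gone:
    // std's Mutex is not reentrant, so locking it again on this thread
    // while `guard` was still alive would deadlock or panic.
    let value = *counter.lock().unwrap();
    value
}
```

Calling `add_then_read(&Mutex::new(1), 2)` returns 3. Without the `drop(guard)` line the second `lock()` call would never succeed.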
Awesome writeup!
Very cool art! A bit concerned that the linked commit fixing this in the upstream repo does not seem to contain any changes to tests, esp despite fly.io referencing a good repro test case themselves
I notice I’m confused. Rust’s atomics have an atomic `fetch_and`, so why is it necessary to use the dead-reckoned `fetch_sub` trick? Is `fetch_and` not supported on some relevant hardware? I’m not familiar with e.g. ARM’s atomics.
On x86, `lock xadd` (`fetch_add`) and `lock xchg` (`swap`) are the only `fetch_*` RMWs that return the previous value. The others, like `lock and` (`fetch_and`), do not. So to compensate, using the value of `fetch_and` requires compiling it down to a `cmpxchg` loop, which is less efficient under contention than an `xadd` and therefore undesirable.
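To make the trade-off concrete, here is a small sketch of the two ways to clear a flag bit. The `FLAG` constant and function names are hypothetical and not taken from parking_lot. Subtracting the flag only matches masking it out when the bit is known to be set. If it is not, the subtraction borrows from the neighbouring bits.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical flag bit, for illustration only.
pub const FLAG: usize = 0b10;

/// Clears FLAG by masking. Correct whether or not the bit is currently set.
pub fn clear_with_and(state: &AtomicUsize) -> usize {
    state.fetch_and(!FLAG, Ordering::Relaxed)
}

/// Clears FLAG by subtraction. Only equivalent to masking if FLAG is
/// known to be set; otherwise the wrapping subtraction corrupts other bits.
pub fn clear_with_sub(state: &AtomicUsize) -> usize {
    state.fetch_sub(FLAG, Ordering::Relaxed)
}
```

Starting from `0b111`, both functions leave `0b101` behind. Starting from `0b001`, where the flag is already clear, `clear_with_and` leaves the value alone while `clear_with_sub` wraps it around to `usize::MAX`, i.e. every bit set.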
I can’t pinpoint it exactly, but the way this is written really threw me off.
I agree. Part was the early statement of “you probably don’t use [Rust]”. It might need a month to stew and be re-written. However, I have some difficulty sometimes coming up from a deep dive under meters of water and making any kind of sense to the first person I see.
“There was an alien habitat and Ed Harris was there!” 🤿
What a fascinating write-up, I thoroughly enjoyed it.
Cool blog, but I’m a little concerned. I’ve never had to write anything this complex and multithreaded before, but I would have thought that Rust would guard against something like this better… isn’t “fearless concurrency” the tagline?
It’s only ‘fearless concurrency’ if you are not afraid of deadlocks. There’s no protection against that and you will run into them regularly if you aren’t careful. This is a logic bug, the borrow-checker cannot help here. Rust is not a panacea that prevents all bugs. Despite that, what safe Rust provides even in multithreaded programs is invaluable.
Deadlocks are a walk in the park compared to data races. If a program deadlocks, you can attach a debugger and see where each thread is deadlocking. OTOH you won’t be able to pause on a data race, and may never be able to witness it in a debugger.
The problem in this post was especially nasty because it was more than a typical deadlock. It was a bug in `parking_lot`’s locking logic. That is scary.
I do think Rust should try to help with deadlocks more than it does now. “It is a logic bug” is a cope. Rust does runtime bounds checking in addition to static borrow checking. It could do additional runtime checking to help with deadlocks, maybe something like lockdep in the Linux kernel.
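As a rough illustration of what a runtime check could look like, here is a sketch of a fixed lock-ranking wrapper. It is much simpler than lockdep, which learns lock ordering dynamically. Every name in it (`RankedMutex`, `RankedGuard`, `HELD`) is hypothetical and not an existing std or parking_lot API. Each thread records the ranks it holds, and acquiring a lock whose rank is not strictly higher than all held ranks is refused.

```rust
use std::cell::RefCell;
use std::ops::{Deref, DerefMut};
use std::sync::{Mutex, MutexGuard};

thread_local! {
    // Ranks of the locks currently held by this thread.
    static HELD: RefCell<Vec<u32>> = RefCell::new(Vec::new());
}

/// A mutex with a fixed rank; locks must be taken in increasing rank order.
pub struct RankedMutex<T> {
    rank: u32,
    inner: Mutex<T>,
}

pub struct RankedGuard<'a, T> {
    rank: u32,
    guard: MutexGuard<'a, T>,
}

impl<T> RankedMutex<T> {
    pub fn new(rank: u32, value: T) -> Self {
        RankedMutex { rank, inner: Mutex::new(value) }
    }

    /// Returns an error instead of blocking when the ordering is violated.
    pub fn lock(&self) -> Result<RankedGuard<'_, T>, String> {
        let violation = HELD.with(|h| h.borrow().iter().any(|&r| r >= self.rank));
        if violation {
            return Err(format!("lock order violation at rank {}", self.rank));
        }
        let guard = self.inner.lock().unwrap();
        HELD.with(|h| h.borrow_mut().push(self.rank));
        Ok(RankedGuard { rank: self.rank, guard })
    }
}

impl<T> Deref for RankedGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.guard
    }
}

impl<T> DerefMut for RankedGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.guard
    }
}

impl<T> Drop for RankedGuard<'_, T> {
    fn drop(&mut self) {
        // Forget this rank; the inner MutexGuard unlocks right after this runs.
        let _ = HELD.try_with(|h| {
            let mut held = h.borrow_mut();
            if let Some(pos) = held.iter().rposition(|&r| r == self.rank) {
                held.remove(pos);
            }
        });
    }
}
```

With a rank-1 lock `a` and a rank-2 lock `b`, taking `a` then `b` succeeds, while calling `a.lock()` with `b` held returns `Err`. The check costs a thread-local lookup on every acquisition, which is the kind of ergonomics and performance trade-off under debate here.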
“It is a logic bug” is a cope
No, it’s not. It’s an objectively true statement. Deadlocks are well-defined behavior that is regularly exhibited by memory-safe programs. People just constantly move goalposts, being surprised and exasperated that Rust does not prevent their favorite class of bugs. Could it do better? Sure it could, but let’s keep the discussion true & real.
Fwiw, we absolutely could have had standard locking primitives that make deadlocking impossible or at least significantly harder. Check out happylock. Whether it’s worth the loss in ergonomics is an interesting question, but Rust seems to have taken a ‘yes, it’s worth it’ position on a lot of similar issues anyway.
Whether it’s worth the loss in ergonomics is an interesting question, but Rust seems to have taken a ‘yes, it’s worth it’ position on a lot of similar issues anyway.
Which has no bearing on this one. That Rust tends to value correctness over convenience more than the average language does not mean it does so in every situation (and it very much does not).