In defense of lock poisoning in Rust

71 points by claytonwramsey


kornel

I think poisoning is a bad solution to a very real problem.

Poisoning is persistent. Unless you implement a specific mechanism to fix, restore, or reset the state, the lock remains useless forever. The application can easily get permanently stuck in a livelock-like state: it keeps tripping over poisoned locks and systematically aborting requests/operations/threads, while error handling that limits the "blast radius" prevents it from crashing hard enough to restart the whole process.
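To make the persistence concrete, here is a minimal sketch using only std::sync::Mutex. The helper names (poison, failed_attempts) are made up for illustration:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Poisons the mutex by panicking on another thread while the guard is held.
/// (Hypothetical helper, for illustration only.)
fn poison(counter: &Arc<Mutex<u32>>) {
    let c = Arc::clone(counter);
    let _ = thread::spawn(move || {
        let mut guard = c.lock().unwrap();
        *guard += 1;
        // The guard is dropped during unwinding, which marks the mutex poisoned.
        panic!("simulated failure inside the critical section");
    })
    .join(); // the Err from the panicked thread is deliberately ignored
}

/// Counts how many of `attempts` subsequent lock() calls return Err.
fn failed_attempts(counter: &Arc<Mutex<u32>>, attempts: usize) -> usize {
    (0..attempts).filter(|_| counter.lock().is_err()).count()
}
```

After one call to poison, every later lock() fails until some code opts in to recovery, for example by taking the guard out of the error with PoisonError::into_inner.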

Poisoning is detected too late, after the state has already been lost. An outside observer can't know what really happened because the stack has already been unwound, so recovery strategies are very limited.

Compare it to something like a Drop guard used in the critical section. The guard can fix/restore the state right away, before the lock is even unlocked, while the necessary data may still be available! The critical section can still unexpectedly panic, but it doesn't have to leave a bad state behind. You can also "pre-poop your pants": when entering the critical section, swap the state with some tolerable temporary placeholder. If the critical section gets interrupted you'll have some data loss, but not corruption.
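A rough sketch of both ideas, assuming the protected state is a Vec<u32>. Rollback and the two update functions are hypothetical names, not an existing API. Note that std's Mutex still gets poisoned when the panic unwinds through its guard, so the sketch ignores the flag with into_inner; the point is only that the data itself stays sane.

```rust
use std::mem;
use std::sync::Mutex;

/// Hypothetical Drop guard: puts a snapshot back unless it was committed.
struct Rollback<'a, T> {
    state: &'a mut T,
    snapshot: Option<T>,
}

impl<T> Drop for Rollback<'_, T> {
    fn drop(&mut self) {
        // Runs on normal exit and during unwinding.
        if let Some(old) = self.snapshot.take() {
            *self.state = old;
        }
    }
}

/// Drop-guard variant: a panic mid-update restores the previous contents.
fn update_with_rollback(shared: &Mutex<Vec<u32>>, fail: bool) {
    let mut guard = shared.lock().unwrap_or_else(|e| e.into_inner());
    let snapshot = (*guard).clone();
    // Declared after `guard`, so it is dropped first, before the unlock.
    let mut rollback = Rollback { state: &mut *guard, snapshot: Some(snapshot) };
    rollback.state.push(1);
    if fail {
        panic!("simulated failure halfway through the update");
    }
    rollback.state.push(2);
    rollback.snapshot = None; // commit: nothing to restore
}

/// Placeholder variant: take the value out, work on it, put it back on success.
fn update_with_placeholder(shared: &Mutex<Vec<u32>>, fail: bool) {
    let mut guard = shared.lock().unwrap_or_else(|e| e.into_inner());
    let mut owned = mem::take(&mut *guard); // leaves an empty Vec behind
    owned.push(1);
    if fail {
        panic!("simulated failure halfway through the update");
    }
    owned.push(2);
    *guard = owned;
}
```

With the rollback variant an interrupted update leaves the old contents in place; with the placeholder variant it leaves an empty Vec, which is data loss but never a half-applied update.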

Instead of poisoning that can cause stable failure states, I'd prefer locks that automatically reset potentially-corrupt state (replacing it with some default value). Or just abort the whole process instead of poisoning — at least a process watchdog or crash reporter will restart it and restore some working state, instead of leaving a live application in a vegetative state.
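Something like the following is what I have in mind for a self-resetting lock. It's only a sketch layered on top of std::sync::Mutex: ResettingMutex is a made-up type, it assumes T: Default is an acceptable "clean" state, and it relies on Mutex::clear_poison, which as far as I know needs Rust 1.77 or newer.

```rust
use std::sync::{Mutex, MutexGuard};

/// Hypothetical wrapper: a poisoned lock is reset to T::default()
/// instead of staying unusable.
struct ResettingMutex<T>(Mutex<T>);

impl<T: Default> ResettingMutex<T> {
    fn new(value: T) -> Self {
        Self(Mutex::new(value))
    }

    fn lock(&self) -> MutexGuard<'_, T> {
        match self.0.lock() {
            Ok(guard) => guard,
            Err(poisoned) => {
                // A previous holder panicked: discard whatever it left behind.
                let mut guard = poisoned.into_inner();
                *guard = T::default();
                // Assumption: Mutex::clear_poison is available (Rust 1.77+).
                self.0.clear_poison();
                guard
            }
        }
    }
}
```

The first lock() after a panic sees the default value and clears the flag, so later callers never observe the poisoned state at all.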

Poison<T> doesn't prevent corruption, only contains it a bit, and adds extra work for handling the poisoning and recovery. There are more specific patterns that can prevent corruption and reduce the faff (native defer, lock.with(critical_section_callback, reset_state_callback), temporarily getting owned values from &mut, etc.)
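For the lock.with(...) idea, a free-function sketch might look like this. The name with and its signature are my own invention, not anything in std, and it catches the panic itself rather than letting it propagate:

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::Mutex;
use std::thread;

/// Hypothetical helper: runs `critical` under the lock; if it panics,
/// runs `reset` on the state while the lock is still held.
fn with<T, R>(
    lock: &Mutex<T>,
    critical: impl FnOnce(&mut T) -> R,
    reset: impl FnOnce(&mut T),
) -> thread::Result<R> {
    let mut guard = lock.lock().unwrap_or_else(|e| e.into_inner());
    // The panic is caught before the guard is dropped, so the guard is
    // released on the normal path and the mutex is not poisoned.
    let result = panic::catch_unwind(AssertUnwindSafe(|| critical(&mut *guard)));
    if result.is_err() {
        // If `reset` itself panics, the mutex is poisoned as usual.
        reset(&mut *guard);
    }
    result
}
```

The caller gets the panic payload back as an Err and can log it, retry, or resume unwinding, while everyone else sees a lock that is neither poisoned nor holding half-updated state.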