Things that go wrong with disk IO

27 points by eatonphil


jjdh

This gets even weirder in hyperscaler clouds, where it’s like... what does it even mean to fsync? At some point we were on a call with firmware engineers at GCP, and they said essentially this: a successful fsync has nothing to do with whether anything was written to disk. It means a battery-backed piece of hardware has your write in RAM and will start working on getting it over the network (and if you read it back to confirm it was written, it will be served from that RAM). But if that hardware dies and they lose your data, they will make the whole disk unmountable. So after a successful fsync you will never lose a single block; you will lose the whole disk. That way you don’t have the problem of recovering from a crash and starting up on a valid but outdated disk state; you will know that data was lost.
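
For context, here is roughly what "a successful fsync" looks like from the application side. This is only a minimal sketch, assuming a POSIX system and Python's standard `os` module; `durable_write` and the `.tmp` suffix are names made up for illustration, not anything from the call. The point is that the application only ever sees `fsync` return or raise. Whether "returned" means "on the platter" or "in battery-backed RAM somewhere" is decided by the layers underneath.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write data to path; return only after fsync has reported success.

    Hypothetical helper for illustration. Any OSError (including one from
    fsync) propagates to the caller, which should treat it as "durability
    unknown" rather than retrying blindly.
    """
    tmp = path + ".tmp"  # assumed temp-file naming, not a convention from the thread
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the OS to flush the file; success is the storage stack's
        # acknowledgement, whatever that means on the underlying hardware.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Atomically move the fully written file into place.
    os.replace(tmp, path)

    # fsync the containing directory so the rename is acknowledged too
    # (works on Linux and most POSIX systems, not on Windows).
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Under the behavior described above, code like this would never come back up on a disk that silently lacks an acknowledged write; the failure would show up earlier, as the disk not mounting at all.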