Pierre Zemb from Clever Cloud on FoundationDB
11 points by eatonphil
Interesting shipped vs operated software distinction - definitely knew the problem, but not these particular terms to describe it.
Don't know enough about FoundationDB to say what makes it so resilient - does it have a simpler design? A clever trick somewhere? Both?
Author here 👋 I’d say fdb’s resilience comes from a clever design: each process handles one or more roles, and those roles can be scaled out independently as needed. On top of that, a deterministic simulation framework stress-tests the database against conditions worse than production to verify it behaves correctly.
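For anyone who hasn't seen deterministic simulation before, here's a toy sketch of the general idea in Python. It is not how FoundationDB's own simulator works; the `simulate` function, the primary/replica setup, and the `stall_rate` parameter are all made up for illustration. The point is only that every random decision, including injected faults, comes from one seeded generator, so any failing run can be replayed exactly from its seed.

```python
import random


def simulate(seed, steps=200, stall_rate=0.3):
    """Run one simulated workload. All randomness is derived from `seed`."""
    rng = random.Random(seed)  # the single source of nondeterminism
    primary, replica = {}, {}
    pending = []               # replication messages not yet delivered
    trace = []                 # record of what happened, for replay/debugging

    for step in range(steps):
        # Workload: write a value to a randomly chosen key on the primary.
        key = rng.randrange(5)
        primary[key] = step
        pending.append((key, step))

        # Injected fault: the simulated network delivers nothing this step.
        if rng.random() < stall_rate:
            trace.append((step, "stall"))
            continue

        # Otherwise deliver every queued message, in order.
        for k, v in pending:
            replica[k] = v
        pending.clear()
        trace.append((step, "deliver"))

    # Heal the network and drain whatever is still queued.
    for k, v in pending:
        replica[k] = v

    # Invariant: once everything is delivered, the replica matches the primary.
    return trace, primary == replica


if __name__ == "__main__":
    for seed in range(20):
        trace, ok = simulate(seed)
        # Re-running with the same seed reproduces the exact same history.
        assert (trace, ok) == simulate(seed)
        print(seed, "ok" if ok else "INVARIANT VIOLATED")
```

Cranking `stall_rate` up is the toy version of "conditions worse than production": faults can be made far more frequent than they would be in real life, and a seed that breaks the invariant becomes a reproducible bug report.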
I don't work on it but have followed the project for years. Like many highly resilient systems I think it's down to two things.
Getting something resilient requires both really careful planning and design, and a lot of upfront investment in testing tooling.
I was living in DC at the time of the FoundationDB beta, before Apple acquired them, and I remember them presenting the design and features at a meetup. Early 2014 maybe. I was impressed by the ambition of the project. When Apple open sourced it 5 years later, my first thought was "whoa, this is a lot of stuff to deploy." But it strikes me as all necessary stuff if you really want a distributed database that scales big.