Healthchecks.io Now Uses Self-hosted Object Storage
21 points by danlamanna
[on OVH:] Unfortunately, over time, I saw an increasing amount of performance and reliability issues.
[on UpCloud:] Unfortunately, over time, the performance of UpCloud object storage deteriorated as well.
Does anyone know why object storage performance deteriorated over time at two different providers?
I don't but I can throw in Hetzner as another European provider having stability issues with their object storage offering. Their subreddit is full of people choosing to migrate away. Which is sad because the general consensus seems to be they are an otherwise excellent cloud provider.
My guess is that it's down to the current political climate: the European alternatives are becoming more popular and are struggling under the load.
healthchecks.io is so great. I love the move to self-hosting, particularly for a small service like this. Too bad it's more expensive, but they don't have access to economies of scale.
A little surprised at the choice of btrfs. The post says it's because it's better than ext4fs, no argument here! I'd have gone with ZFS with two disks but maybe that's not an option? Even in 2026 it's still a PITA to add ZFS to arbitrary Linux environments. I self-host stuff with Proxmox now, which makes it easy.
At zeitkapsl we are also currently replacing Backblaze and Hetzner object storage with a similar solution, but a self-written one with built-in scrubbing on top of plain ext4/XFS, with xxh64 checksums and an SQLite DB for quick lookup. This allows us to use asymmetrical disk layouts.
All we needed was signed PUT/GET requests and scrubbing for bitrot detection.
ZFS does this, but it is very picky about disk layouts (you can't have mixed-size disks).
We also evaluated SeaweedFS, but it was just too complicated from an operational perspective.
Garage looked a lot more promising in this regard, but the metadata DB was absurdly big: for 50 million keys it grew to 700 GB.
The hand-rolled solution now takes 2.3 GB, including scrubbing indexes, for the same amount of data and keys in an SQLite DB. And if the DB gets corrupted for whatever reason, we simply rebuild it from the filesystem, because the checksums are written there as well.
For now it is just the third replica, but ultimately it will be the primary if tests continue to go this smoothly.
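The design described above (a checksum stored next to each object, an SQLite index for lookup, a scrub pass for bitrot, and an index that can be rebuilt from disk) can be sketched roughly as follows. Treat it as an illustration under several assumptions rather than the actual implementation: the `.sum` sidecar files, the `objects` table, and every function name are made up, keys are flat file names, and `hashlib.blake2b` with an 8-byte digest stands in for xxh64 so the example needs only the standard library.

```python
import hashlib
import os
import sqlite3
import tempfile


def checksum(path):
    # Stand-in for xxh64: a 64-bit BLAKE2b digest from the standard library.
    h = hashlib.blake2b(digest_size=8)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def open_index(db_path=":memory:"):
    db = sqlite3.connect(db_path)
    db.execute("CREATE TABLE IF NOT EXISTS objects (key TEXT PRIMARY KEY, sum TEXT)")
    return db


def put(root, db, key, data):
    """Write the object, a sidecar checksum file (assumed layout), and the index row."""
    path = os.path.join(root, key)
    with open(path, "wb") as f:
        f.write(data)
    digest = checksum(path)
    with open(path + ".sum", "w") as f:
        f.write(digest)
    db.execute("INSERT OR REPLACE INTO objects VALUES (?, ?)", (key, digest))
    db.commit()


def scrub(root, db):
    """Return keys whose bytes on disk no longer match the indexed checksum."""
    bad = []
    rows = db.execute("SELECT key, sum FROM objects ORDER BY key").fetchall()
    for key, expected in rows:
        try:
            actual = checksum(os.path.join(root, key))
        except OSError:
            actual = None  # a missing or unreadable file counts as damaged
        if actual != expected:
            bad.append(key)
    return bad


def rebuild_index(root, db_path=":memory:"):
    """Recreate the index purely from the sidecar checksum files."""
    db = open_index(db_path)
    for name in sorted(os.listdir(root)):
        if name.endswith(".sum"):
            with open(os.path.join(root, name)) as f:
                digest = f.read().strip()
            db.execute("INSERT OR REPLACE INTO objects VALUES (?, ?)",
                       (name[:-len(".sum")], digest))
    db.commit()
    return db


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    db = open_index()
    put(root, db, "a.bin", b"hello")
    print(scrub(root, db))
```

Since the checksums live on disk next to the data, the SQLite file is only a cache: losing it costs a directory walk, not the ability to detect corruption.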