A one-line Kubernetes fix that saved 600 hours a year
23 points by knl
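TL;DR: securityContext.fsGroupChangePolicy defaults to Always, so the kubelet recursively changes ownership and permissions on every file in the PV each time the volume is mounted. If you have many files, it's more efficient to use OnRootMismatch, which skips that recursive pass when the volume's root directory already has the expected ownership and permissions.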
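For anyone who wants to see where the field goes, here is a minimal sketch that builds a Pod manifest as a Python dict and prints it as JSON (which kubectl accepts alongside YAML). The pod name, image, claim name, mount path, and the fsGroup value of 2000 are all placeholders I made up, not values from the article. As far as I know the policy only has an effect when fsGroup is also set.

```python
import json


def pod_manifest(policy: str = "OnRootMismatch") -> dict:
    """Build a minimal Pod manifest. All names and IDs are hypothetical."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "example-app"},  # placeholder name
        "spec": {
            # Pod-level securityContext: this is where the one-line fix lives.
            "securityContext": {
                "fsGroup": 2000,  # placeholder group ID
                "fsGroupChangePolicy": policy,
            },
            "containers": [
                {
                    "name": "app",
                    "image": "example.invalid/app:latest",  # placeholder image
                    "volumeMounts": [{"name": "data", "mountPath": "/data"}],
                }
            ],
            "volumes": [
                {
                    "name": "data",
                    # placeholder PVC name
                    "persistentVolumeClaim": {"claimName": "example-data"},
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(pod_manifest(), indent=2))
```

Piping the output to `kubectl apply -f -` should work against a cluster that has a matching PVC, though I have not tried it with these placeholder names.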
shakes fist at clickbait
oh I don't know, I enjoyed it. I like knowing the debug steps used, it lets me check my own instincts for how I would have investigated the issue. I thought it was a good article.
why did they wait until it was taking 30 minutes for restarts to come back up to start investigating? waiting until it was that bad to investigate seems really amateur
Not to mention: why did they use a tool that needs to be bounced for such trivial actions?
Yeah, that stood out to me as well... To me, personally, something like that would've been cause for concern as soon as the total time started obviously increasing; going from 1 minute to 5 minutes would've been enough to warrant an investigation, but I guess Cloudflare has bigger fish to fry.
It would be interesting to see some benchmarks to see at what point the configuration change starts to have an impact.
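A rough way to get a feel for it locally, as a sketch rather than a real benchmark: the script below times a full recursive chown/chmod pass over a temp directory against a check of the root directory only. This is only my approximation of the two policies, not the kubelet's actual code, and the function names and file counts are made up. It assumes a POSIX system and a local filesystem, so the numbers will look nothing like a large network-backed PV.

```python
import os
import stat
import tempfile
import time

GROUP_RW = stat.S_IRGRP | stat.S_IWGRP


def make_tree(root: str, n_files: int) -> None:
    """Create n_files small files directly under root."""
    for i in range(n_files):
        with open(os.path.join(root, f"f{i}"), "w") as fh:
            fh.write("x")


def recursive_fix(root: str, gid: int) -> int:
    """Approximate 'Always': chown and chmod every directory and file."""
    touched = 0
    for dirpath, _dirs, files in os.walk(root):
        for path in [dirpath, *(os.path.join(dirpath, f) for f in files)]:
            os.chown(path, -1, gid)  # -1 leaves the owner unchanged
            mode = stat.S_IMODE(os.stat(path).st_mode)
            os.chmod(path, mode | GROUP_RW)
            touched += 1
    return touched


def root_matches(root: str, gid: int) -> bool:
    """Check only the top-level directory's group and group rw bits."""
    st = os.stat(root)
    return st.st_gid == gid and (st.st_mode & GROUP_RW) == GROUP_RW


def on_root_mismatch(root: str, gid: int) -> int:
    """Approximate 'OnRootMismatch': skip the walk if the root looks right."""
    if root_matches(root, gid):
        return 0
    return recursive_fix(root, gid)


if __name__ == "__main__":
    gid = os.getgid()
    for n in (1_000, 10_000, 100_000):
        with tempfile.TemporaryDirectory() as root:
            make_tree(root, n)
            recursive_fix(root, gid)  # first mount: everything gets fixed

            start = time.perf_counter()
            recursive_fix(root, gid)
            always = time.perf_counter() - start

            start = time.perf_counter()
            on_root_mismatch(root, gid)
            skipped = time.perf_counter() - start

            print(f"{n:>7} files: full pass {always:.3f}s, root check {skipped:.6f}s")
```

The full pass scales with the number of files while the root check stays constant, so the crossover point mostly depends on how slow each per-file metadata call is on the storage backend.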
Yes, exactly. Once it has gone up an order of magnitude (or possibly two in this case, based on the original reboot time) I'd want to know it's not going to become a bigger problem.
To be clear, I hope I'm not coming off as criticizing the article author, just wondering what's up with the company that this sort of thing happens in the first place. Bigger fish to fry, sure, but smells like a culture that creates tech-debt (which can cause something possibly bigger than this to happen).