Preventing outages with pkill's new --require-handler flag

59 points by cdown


jclulow

I feel like the log rotation example is a pretty egregious case of papering over one bug by creating a more insidious bug that will be much harder to diagnose. To wit: if you expect a process to reopen its log files on SIGHUP and an engineer removes the handler by mistake, pkill -H will silently skip the process and leave it writing to the now-rotated log file. Nothing will be written to the new log file, so it presumably won’t contain any evidence that this has happened, and the process might corrupt or otherwise screw up any subsequent attempt to compress or archive the rotated log that it still has open.

It seems far preferable to have the process terminate unexpectedly, have the service management system restart it and log that it has done so, and be able to detect this new bug ASAP and just get it fixed.
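
For illustration, here is a minimal sketch of one way to keep a handler check while making the failure loud instead of silent. It assumes Linux, where the SigCgt field of /proc/<pid>/status is a hex bitmask of caught signals (bit N-1 set means signal N has a handler). The function names are hypothetical, and this is not necessarily how pkill implements its flag.

```python
import os
import signal


def caught_signals_mask(status_text):
    """Parse the SigCgt hex bitmask out of /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("SigCgt:"):
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("no SigCgt field found")


def has_handler(mask, signo):
    """True if the bitmask says a handler is installed for signo."""
    return bool(mask & (1 << (signo - 1)))


def signal_or_fail(pid, signo=signal.SIGHUP):
    """Send signo only if it is caught; otherwise raise so the caller notices.

    Hypothetical helper: raising here, rather than skipping quietly, means
    the rotation job fails visibly when the handler has gone missing.
    There is an unavoidable race between the check and the kill.
    """
    with open(f"/proc/{pid}/status") as f:
        mask = caught_signals_mask(f.read())
    if not has_handler(mask, signo):
        raise RuntimeError(f"pid {pid} has no handler for signal {int(signo)}")
    os.kill(pid, signo)
```

A rotation script calling signal_or_fail() would exit non-zero when the handler is missing, so the regression shows up in the job's own status rather than as a quietly stale log file.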