Don't Trip[wire] Yourself: Testing Error Recovery in Zig

53 points by asb


mxey

FreeBSD calls this technique failpoints.

I built a small internal Go package for this using Context values. It can inject an error return, but also a context cancellation or a stall. In property tests, I break random code paths and check that the invariants still hold.

quasi_qua_quasi

Yeah, this is a really great technique and you can adapt it for any language. I've even done this in an RPC environment: there's a field that triggers a failure and lets you customize how it fails (returning an error vs. throwing an uncaught exception), as well as include a message so you can test that the error recovery propagates error messages properly. The teardown verifying that the tripwire was hit is a good trick for in-process stuff, though; I'll have to remember that!

I did wind up leaving it in the release build, though, because that way I don't have to worry about diverging code paths. (The checks weren't in hot loops, so the performance cost didn't matter.)
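
Roughly the shape of it, as a Go sketch rather than the real service. Every name here (`Request`, `FaultMode`, `FaultMessage`, `Handle`, `Serve`) is invented for illustration, and a Go panic stands in for the uncaught exception.

```go
package main

import (
	"errors"
	"fmt"
)

// FaultMode selects how the handler should fail (hypothetical field type).
type FaultMode int

const (
	FaultNone  FaultMode = iota // normal operation
	FaultError                  // return an ordinary error
	FaultPanic                  // panic, standing in for an uncaught exception
)

// Request is a toy RPC request that carries its own fault-injection fields.
type Request struct {
	Payload      string
	FaultMode    FaultMode
	FaultMessage string
}

// Handle is the business logic. It honors the injection fields before doing any work.
func Handle(req Request) (string, error) {
	switch req.FaultMode {
	case FaultError:
		return "", errors.New(req.FaultMessage)
	case FaultPanic:
		panic(req.FaultMessage)
	}
	return "ok: " + req.Payload, nil
}

// Serve plays the role of server-side recovery middleware:
// it converts a panic in the handler into an error for the caller.
func Serve(req Request) (resp string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("internal error: %v", r)
		}
	}()
	return Handle(req)
}
```

A caller-side test sets `FaultMode` and `FaultMessage` on the request and then asserts that the same message comes back out of `Serve`, which is how you check that error messages survive the recovery path.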

osa1

Interesting idea, I'd not seen this kind of thing before.

I wonder if this could be done at the process level, without any cooperation from the program being tested. E.g. you somehow declare which call should fail (using DWARF source locations or similar) and how (a return value to produce, or a function to call instead of the original one). A process wrapper would then use ptrace and/or related system calls to insert a breakpoint at the call site and either call the replacement function or return the specified value instead of calling the original function.

Can this be done? Is it already done?