Techniques for better software testing
31 points by amw-zero
Another technique is to enable sanitizers while running tests (e.g. TSan, ASan).
Go in particular makes this pretty easy: just do go test -race ./... to enable the data race detector while running unit tests.
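For instance, here's a minimal Go sketch of the kind of bug the race detector catches (the Counter type and test name are purely illustrative):

    package counter

    import (
        "sync"
        "testing"
    )

    // Counter is deliberately racy: Inc writes c.n with no synchronisation.
    type Counter struct{ n int }

    func (c *Counter) Inc() { c.n++ }

    func TestConcurrentInc(t *testing.T) {
        var c Counter
        var wg sync.WaitGroup
        for i := 0; i < 100; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                c.Inc() // unsynchronised concurrent writes
            }()
        }
        wg.Wait()
        // A plain `go test` usually passes; `go test -race` reports the
        // race on c.n and fails the run.
        t.Logf("final count: %d (lost updates possible)", c.n)
    }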
Totally, they're like constantly running assertions. You may as well check everything you can while traversing your application.
A well-known but underused instance of validation from within that I have found particularly useful is to have comprehensive internal consistency checks for custom data structures that only run in debug builds; for example, visiting an entire tree to make sure the ordering invariants hold or the colouring is consistent. This works quite well with any form of randomised testing, as it forcefully exposes internal faults early that might otherwise never become externally visible.
(These kinds of assertions should crash execution as forcefully as possible rather than, say, throwing an exception that can be innocently but disastrously caught by some overeager handler up the stack. Ideally the testing framework will run the tests in separate processes to catch these situations, which also helps parallelise the test suite.)
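A minimal Go sketch of how such checks can be gated to debug builds, assuming a hypothetical Set type backed by a strictly ascending []int slice (the type and its operations aren't shown):

    //go:build debug

    // checkrep_debug.go: compiled in only via `go test -tags debug ./...`.
    // A sibling file guarded by //go:build !debug supplies an empty checkRep()
    // so production builds pay nothing.
    package sortedset

    import "fmt"

    // checkRep traverses the whole structure and panics on any broken
    // invariant; panicking (rather than returning an error) makes it much
    // harder for a handler up the stack to swallow the failure.
    func (s *Set) checkRep() {
        for i := 1; i < len(s.items); i++ {
            if s.items[i-1] >= s.items[i] {
                panic(fmt.Sprintf("ordering invariant broken at index %d: %v >= %v",
                    i, s.items[i-1], s.items[i]))
            }
        }
    }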
And on the subject of debug builds: make sure you also run your tests against production builds. Sounds obvious, and it is, but you'd be surprised how easy it is to forget until you're caught out by some UB that never shows up without optimisations...
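In Go terms, assuming the build-tag setup sketched above, that means exercising the suite in each configuration:

    go test ./...              # production-like: consistency checks compiled out
    go test -tags debug ./...  # with the internal checkRep assertions enabled
    go test -race ./...        # and again under the race detector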
Also missing from the article are test oracles: parallel implementations, usually simpler or less performant, which likewise play particularly well with randomised testing. You apply the same sequence of operations to both implementations and compare the results at each step. This approach is particularly popular in hardware projects, where the test oracle (usually called a model) is implemented in software and is much easier to validate.
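A minimal sketch of the pattern in Go, assuming the hypothetical Set from above with Insert/Delete/Contains methods; the map-based model is trivially correct by inspection:

    package sortedset

    import (
        "math/rand"
        "testing"
    )

    // TestAgainstModel drives the production Set and a simple map-based model
    // with the same random operations, comparing behaviour at every step.
    func TestAgainstModel(t *testing.T) {
        s := NewSet() // hypothetical constructor
        model := map[int]bool{}
        rng := rand.New(rand.NewSource(1)) // fixed seed: failures reproduce
        for step := 0; step < 10000; step++ {
            k := rng.Intn(50) // small key space so operations collide often
            switch rng.Intn(3) {
            case 0:
                s.Insert(k)
                model[k] = true
            case 1:
                s.Delete(k)
                delete(model, k)
            case 2:
                if got, want := s.Contains(k), model[k]; got != want {
                    t.Fatalf("step %d: Contains(%d) = %v, model says %v",
                        step, k, got, want)
                }
            }
        }
    }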
I’m a huge fan of this approach; I’ve written about it extensively. It’s surprising how often it’s useful.
My explanation is that you are never doing the truly simplest thing in practice, because even things like using a database are mired in practical detail. Nothing is simpler than pure functions over plain data, so expressing your spec that way has a ton of benefits.
I wrote about this kind of thing last year. To emphasise a couple of points from that blog post:
I prefer not to run a comprehensive check_rep() after every operation because that makes the tests quadratically slow. Instead I like to do local consistency checks, e.g. that an element is present or absent as expected, or that it has the right neighbours. I run check_rep() occasionally for more thorough validation.
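As a rough sketch of what that looks like inside the kind of random-operation loop shown upthread (Set and checkRep are the hypothetical ones from earlier):

    // Cheap, targeted checks after every operation; the full O(n) traversal
    // only once in a while, so the suite doesn't go quadratic.
    const fullCheckEvery = 1000

    func applyAndCheck(t *testing.T, s *Set, step, k int, insert bool) {
        t.Helper()
        if insert {
            s.Insert(k)
            if !s.Contains(k) { // local check: key present right after insert
                t.Fatalf("step %d: %d absent after insert", step, k)
            }
        } else {
            s.Delete(k)
            if s.Contains(k) { // local check: key gone right after delete
                t.Fatalf("step %d: %d present after delete", step, k)
            }
        }
        if step%fullCheckEvery == 0 {
            s.checkRep() // thorough validation, amortised over many steps
        }
    }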
Also, unlike the way this idea is usually presented, I tend to find it helpful if the model implementation doesn't necessarily provide the same API as the production implementation. I like to tune it to support the kinds of local validity checks that are helpful in tests but would be unworkable in production (e.g. because the cross-references use too much memory, or because it's based on a preallocated array of all possible elements). This also helps me think of a completely different style of implementation, one less tainted by what I know about the production code.
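For illustration only, a model shaped nothing like the production API: a preallocated presence array over a small fixed universe of keys, which makes neighbour queries trivial to implement (all names here are made up):

    const universe = 50 // matches the key space used in the random tests

    // ArrayModel trades memory for checkability: one slot per possible element.
    type ArrayModel struct {
        present [universe]bool
    }

    func (m *ArrayModel) Insert(k int) { m.present[k] = true }
    func (m *ArrayModel) Delete(k int) { m.present[k] = false }

    // Neighbours scans outward from k; linear in the universe size, but simple
    // enough to be obviously correct, which is the whole point of a model.
    // Returns -1 when no neighbour exists on that side.
    func (m *ArrayModel) Neighbours(k int) (prev, next int) {
        prev, next = -1, -1
        for i := k - 1; i >= 0; i-- {
            if m.present[i] {
                prev = i
                break
            }
        }
        for i := k + 1; i < universe; i++ {
            if m.present[i] {
                next = i
                break
            }
        }
        return prev, next
    }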
I had actually read your post sometime last year, it's a really interesting approach. Thanks for linking to it!
Are you familiar with refinement mappings? In the TLA+ literature, that's how differences between spec and implementation are resolved, though it focuses on differences in states rather than APIs. A similar idea can be used to make them comparable.
Not sure if you're doing this or something else.
(OT, web design complaint: scrolling via page up/down and arrow keys only works after clicking on the page body in recent Chromium and Firefox on Linux.)