Property-Based Testing in Practice

26 points by natfu


Property-based testing (PBT) is a testing methodology where users write executable formal specifications of software components and an automated harness checks these specifications against many automatically generated inputs. From its roots in the QuickCheck library in Haskell, PBT has made significant inroads in mainstream languages and industrial practice at companies such as Amazon, Volvo, and Stripe. As PBT extends its reach, it is important to understand how developers are using it in practice, where they see its strengths and weaknesses, and what innovations are needed to make it more effective.

coxley

Joe Cutler (one of the authors) presented a PBT talk at NYC Systems over the summer. It references this paper a bit, and was highly entertaining in person. :)

https://youtu.be/ux49IvxKQR8?si=gO9SelYq5IwXlyIj

osa1

PBT is one of those things that can be really effective, but it can also be hard to use effectively, and can be misused. When misused it can give you a false sense of confidence with your testing.

Their research opportunity 6 ("Improve tools for evaluating testing effectiveness.") would help here; I'd love to work on it one day. It's very easy to write generators that produce essentially the same valid values, which then get used in the same way. For example, if you check the validity of a string argument, it might make sense to select between just two string values (one valid, one invalid) instead of using a generic string generator and wasting testing time/iterations on many different valid and invalid strings.

(Then test your string validity check separately with random strings, if needed.)
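To make that concrete, here is a minimal sketch in plain Python using only the standard library, not any particular PBT library. The names `is_valid_name`, `register`, `gen_name`, and `gen_age` are made up for illustration, and the validity rule is an assumption. The name generator picks between one valid and one invalid representative, so the iterations are spent varying the other argument:

```python
import random

def is_valid_name(s: str) -> bool:
    # Hypothetical validity rule: non-empty, ASCII letters only.
    return s.isascii() and s.isalpha()

def register(name: str, age: int) -> str:
    # Hypothetical function under test.
    if not is_valid_name(name):
        return "rejected"
    return "minor" if age < 18 else "adult"

def gen_name(rng: random.Random) -> str:
    # Two representatives only: one valid, one invalid.
    return rng.choice(["alice", ""])

def gen_age(rng: random.Random) -> int:
    # The interesting variation is spent on this argument instead.
    return rng.randint(0, 120)

def check_register(iterations: int = 200, seed: int = 0) -> set:
    rng = random.Random(seed)
    seen = set()
    for _ in range(iterations):
        name, age = gen_name(rng), gen_age(rng)
        result = register(name, age)
        # Property: a call is rejected exactly when the name is invalid.
        assert (result == "rejected") == (not is_valid_name(name))
        seen.add(result)
    return seen  # which outcomes were exercised
```

Returning the set of outcomes is a crude way to see which code paths the generators actually reached.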

The users here also seem to have side-stepped the issues with some of the older PBT libraries like Haskell's QuickCheck, which associates generators and shrinkers with types. That doesn't make sense: most types are simply too general for what they're representing. It makes sense to test with everything that e.g. a function can take, but you also want to test with e.g. only valid values, or only with some arguments valid and others truly random etc. for good test coverage. This is related to the previous point: it's too easy to spend a lot of test time/iterations trying the same code paths.
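A rough sketch of what I mean by not tying generators to types, again in plain standard-library Python. The `for_all` helper, `parse_port`, and both generators are hypothetical names invented for this example. The input type is `int` in both cases, but each property gets the generator that suits it:

```python
import random

def parse_port(n: int):
    # Hypothetical function under test: accept only valid TCP port numbers.
    return n if 1 <= n <= 65535 else None

def gen_any_int(rng: random.Random) -> int:
    # "Everything the function can take" (within an arbitrary range).
    return rng.randint(-10**6, 10**6)

def gen_valid_port(rng: random.Random) -> int:
    # Only valid values of the same type.
    return rng.randint(1, 65535)

def for_all(gen, prop, iterations: int = 500, seed: int = 0) -> None:
    # Tiny hand-rolled harness: the generator is an ordinary value
    # passed in per property, not something looked up from the type.
    rng = random.Random(seed)
    for _ in range(iterations):
        x = gen(rng)
        assert prop(x), f"property failed for {x!r}"

# Valid inputs are returned unchanged.
for_all(gen_valid_port, lambda n: parse_port(n) == n)
# Arbitrary inputs are rejected exactly when they are out of range.
for_all(gen_any_int, lambda n: (parse_port(n) is None) == (not 1 <= n <= 65535))
```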

I'd also love to see some code with properties checked, and complex generators etc. AFAIK there isn't a lot of open source code taking PBT seriously and applying it to large software that can't afford to go wrong or even stop working with an exception/panic/etc.

ahelwer

IMO a lot of the benefit of property-based testing comes from the process of harnessing your test target so it can be run in a PBT-supporting way. At that point you can just use the property-checking function to quickly but manually write a lot of test cases covering all the functionality you are interested in. Turning test cases into data instead of code is wonderful. Writing a generator for that data can be quite difficult; I am reminded of Knuth’s remarks in The Stanford GraphBase about the difficulty of generating interesting random graphs, leading him to painstakingly curate appropriate real-world datasets instead.
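As a small illustration of test cases as data, here is a sketch in plain Python. The run-length encoder and the names `rle_encode`, `rle_decode`, `prop_round_trip`, and `CASES` are all made up for the example. Once the property is a function of its input, hand-written cases are just a list, and a generator could later feed the same function:

```python
def rle_encode(s: str) -> list:
    # Hypothetical test target: run-length encode into (char, count) pairs.
    out = []
    for ch in s:
        if out and out[-1][0] == ch:
            out[-1] = (ch, out[-1][1] + 1)
        else:
            out.append((ch, 1))
    return out

def rle_decode(pairs: list) -> str:
    return "".join(ch * n for ch, n in pairs)

def prop_round_trip(s: str) -> bool:
    # The property-checking function: decoding an encoding gives back the input.
    return rle_decode(rle_encode(s)) == s

# Hand-written test cases are plain data, not separate test functions.
CASES = ["", "a", "aaabbb", "abab", "zzzzzzzzzz"]

failures = [c for c in CASES if not prop_round_trip(c)]
assert failures == [], failures
```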

cole-k

Not an expert myself, but I know of this related paper: https://dl.acm.org/doi/pdf/10.1145/3764068

(disclaimer: I know of it because I've talked to the authors before)