This post will largely concerned with retraversing property-testing basics using Hedgehog, such that a beginner or QuickCheck user can get started reasonably quickly and understand why things are the way they are. Eventually, I'll get to a more detailed post about using Hedgehog to test a typed programming language.
Hedgehog is ideal for complex generators. It has allowed me to fly by the seat of my pants, identifying capture bugs, colossal mistakes in the type checker, broken map merges, mistakes with code generation, and parser / pretty-printer ambiguities. In my opinion, it greatly improves the ergonomics of property testing, and is worth considering over existing alternatives.
If you're not familiar with property testing, now's a great time to learn. The idea is pretty simple: rather than writing unit tests, which apply some predicate to predefined points in the state space, we simply write a random generator for all good inputs, take a large sample, and apply the predicate to the entire sample.
Here's a Hedgehog version of the Hello World of property testing: reversing a list twice should produce the original list.
We generate a list of Int, where the list has anywhere from 0 to 100000 elements, and each Int lies somewhere between that type's minBound and maxBound as defined by the Enum class. So, INT_MIN and INT_MAX. We can run this property with the check function:
Although it takes some practice, property testing is an effective and cheap way to gain confidence that your program works.
When a property fails, the generated value becomes a counterexample, and it must be communicated to the operator. Since it came from a random generator, though, it is likely to be a colossal and inane example. Printing it to standard output and handing it off to the operator immediately is going to ruin their life for the next few hours.
For this reason, most property-testing libraries will include some notion of shrinking, a deterministic search that reduces the counterexample until the property stops failing. The user only sees the smallest counterexample.
In Hedgehog, a reasonably good1 shrink function is constructed for free alongside every generator. Shrink functions are coupled with every generator, and a lazy tree of all possible shrinks is constructed alongside each generated value. Shrinking is then little more than traversing the tree. This is Hedgehog's theoretical point of difference from QuickCheck, though it also differs in a few practical ways.
Let's play to my strengths here and write a piece of bad code, to demonstrate how genIntList shrinks. My bad code will be a subtly broken implementation of reverse. We can test it against an oracle function that we know to be correct, List.reverse.
Recall that genIntList has a linear Range from 0 to 100000 for its length, and the elements are generated using enumBounded, which uses a Range from minBound to maxBound. When this property inevitably fails, we'd expect to see this information used to inform the counterexample search, and indeed we do:
Generators made out of Hedgehog.Range are constructive; Range contains enough information to generate a correct value every time, and each Range implies a shrink step and gradient. Ranges are good!
We might like to write a more precise property to protect against dropped elements. A reverse function should preserve every element in the list!
While Range helps us construct generators in terms of bounds and steps, we may also wish to write generators in terms of negative invariants. It may be simpler to generate terms, apply some predicates to them, and discard the invalid ones. This gives us the flexibility to write much more complicated generators, though they may take longer to run, or diverge.
Generators that discard heavily are certain to be less productive than other constructive generators. They can generate an arbitrary number of invalid values for each valid one, at the expense of your CPU cycles and precious seconds you probably needed for your life. So, this is a tool to be used gently and infrequently.
In practice, many invariants can be expressed constructively in terms of ranges and bind. Regardless, negative generators that discard are necessary when a constructive generator is not feasible.
In QuickCheck (and Jack) we'd use the suchThat combinator for this.
Hedgehog's Gen implements Alternative and MonadPlus, and the user is free to use empty / mzero / (<|>) / asum to indicate failure. There are also a few helpful named wrappers and functions over these instances, like Gen.filter(which is similar to suchThat) and discard. Discarding is a first-class notion in Hedgehog, and invariants enforced this way are also enforced during shrinking.
I use discard all the time, usually during large choice / asum blocks. A good example is in Hedgehog's STLC example ; if no values of a certain type are currently bound in scope, the known-value generator fails, and the choice block higher up will generate a literal instead.
Here's a quite vacuous example: a generator for sized lists of unique elements. To provide a better demonstration, I'll use some fairly constraining invariants: elements must be valued between 0 and 10, and the list's length must be between 0 and 10:
Using Gen.sample to inspect some random values from this generator, we see that it works:
If you were to run the generator just once and inspect its shrinks using Gen.print, you would observe that this generator fails nearly all the time. We get slightly better results when rewritten with Gen.filter, which scales the ranges behind the scenes:
Hedgehog.Gen contains useful combinators for various containers, including maps and sets. These are pretty handy, and much more convenient than rolling it yourself:
In short, discarding in Hedgehog is easy, it still leads to meaningful and correct shrinking, and there are some helper functions in Hedgehog.Gen you should consult with before doing it yourself. As with QuickCheck, keep an eye on the values coming out of your generator, as discarding can negatively impact coverage.
Note that shrinking exists in QuickCheck, and is reasonably good. Every Arbitrary instance carries a shrink :: a -> [a] function. There's also forAllShrink if you wish to avoid Arbitrary and all its orphan works. All the Arbitraryinstances in popular use come with shrinking functions. However, since shrinking is not fundamental to Gen, we can't compose generators without also manually constructing a shrink function. If the generator has few invariants, we may get away with genericShrink, but in practice engineers often opt to forego shrinking altogether when working with complicated generators.
Hedgehog supports custom shrinking in the QuickCheck style via Hedgehog.Gen.shrink.
Hedgehog's predecessor, Jack, was built as a QuickCheck utility library. It reused QuickCheck's runner and notion of Property, and leaned on the wealth of generators out there on Hackage.
QuickCheck is beautiful, long-lived, and well-maintained, and I've used it a lot to great benefit. It's just a relatively old codebase, with lots of subtle niggles.
Hedgehog does not depend on QuickCheck. Starting from scratch with a new dependency base and no backwards-compatibility requirements allows a few fun improvements:
Hedgehog allows first-class effectful generators and effectful tests. The user must track their effects in the type, as you would in ordinary Haskell. There's no unsafePerformIO under the hood.
The type of generators, Gen, forms a monad transformer. It has instances for all the classes you'd expect, like MonadIO and MonadTrans. It also has MTL-style instances, MonadState and the like. When those are insufficient, it implements MFunctor, so you can hoist your way to victory.
Here's a pretty silly IO generator:
It will be interesting to see what kind of effectful generators people find useful. I've mostly been using this transformer redundantly, managing environments with ReaderT to clean up complicated code. I can't think of a use for an IO generator, but nor can I think of a reason they should be prohibited.
The type of tests, Test, also forms a monad transformer with the similar instances to Gen.
To write an IO test, we simply use liftIO to our heart's content:
To test functions in Either or ExceptT, liftEither and liftExceptT functions are provided. These use pretty-show on the error, and use GHC call stacks to isolate the line of code responsible for the failure:
The keen observer will notice that property :: Test IO () -> Property requires the user to finagle their monad stack into IO before it can be run via mmorph. This is one place the interface could be improved! Hedgehog is just getting started.
In addition to checkSequential, which behaves much like quickCheckAll, Hedgehog supplies checkConcurrent. It uses concurrent-output to wrangle the terminal.
Note that concurrently checking IO properties may lead to nondeterministic failure. The user must identify and sequentialise tests that cannot be run concurrently.
The API for this is in slight flux at the time of writing, so please check out the examples and documentation for more.
There is nothing worse than recreating a complex counterexample from stdout. Lots of libraries (including time) have Show instances that perform pretty-printing, rather than the derived version, and these can't be easily pasted into GHCi. If you're bad at writing manual shrink functions, a QuickCheck counterexample will often be huge, perhaps larger than your clipboard buffer! This is a bad place to be.
When a property fails, Hedgehog will print the seed used by the random number generator. This seed is portable across platforms, and can be pasted directly into GHCi:
Hedgehog uses the new call stack API in GHC 8 to great effect when reporting property failures. Most combinators carry a HasCallStack constraint. This enables the inline display of generated values:
Hedgehog depends on pretty-show, and uses it to print counterexamples where possible. The Show constraints on various combinators like (===) and liftExceptT are used in this way. You may still get a wall of text if your counterexample doesn't reduce well, but it'll be a heck of a lot neater.
This is a simple trick we can extend to regular ol' QuickCheck, by simply reimplementing (===). You should absolutely do this. Life changing.
This is a minor aesthetic difference, but Hedgehog has no equivalent to the Arbitrary typeclass. Since test frameworks typically live outside of a project's dependency tree, and are never in base, reliance on a typeclass to provide test coverage creates a slightly awkward situation. Users have to choose between orphan instances and spurious dependencies.
Arbitrary is a little bit questionable anyway, since it is a lawless class used mostly for overloading. Living without it is fairly easy; we just give all our generators names, and summon them by name using forAll. It does come at a price: we lose the pretty prop_abc x y z = ... shorthand.
© 2017-2019 Tim Humphries