24 Jan 2017
Haskell is an amazing language. With its extremely powerful type system and a pure functional paradigm it prevents programmers from introducing many kinds of bugs, that are notorious in other languages. Despite those powers, code is still written by humans, and bugs are inevitable, so writing quality test suites is just as important as writing an application itself.
Over the course of history buggy software has cost industry billions of dollars in damage and even lost human lives, so I cannot stress enough, how essential testing is for any project.
One of the ways to test software is through writing unit tests, but since it is not feasible to test all possible inputs exhaustively for most functions, we usually check some corner cases and occasionally test with other arbitrary values. Systematic generation of random input, that is biased towards corner cases, could be very helpful in that scenario, and that's where QuickCheck comes into play. This state of the art property testing library was originally invented in Haskell, and, because it turned out to be so powerful, it was later ported to other languages. However, the real power of random testing is unleashed when combined with purity of Haskell.
Properties
Let's start by looking at this exemplar properties of a reverse
function:
reverse (reverse xs) == xs reverse (xs ++ ys) == reverse ys ++ reverse xs
We know, that they will hold for all finite lists with total values. Naturally, there are ways to prove them manually and there are even tools for Haskell, such as LiquidHaskell, that can help you automate proving some properties. Formal proof of correctness of a program is not always possible: some properties are either too hard or impossible to prove. Regardless of ability to prove a property of a function, we at least need to check that it works correctly on some finite set of inputs.
import Test.QuickCheck prop_RevRev :: Eq a => [a] -> Bool prop_RevRev xs = reverse (reverse xs) == xs prop_RevApp :: [Int] -> [Int] -> Bool prop_RevApp xs ys = reverse (xs ++ ys) == reverse ys ++ reverse xs
We can load those properties into GHCi and run quickCheck
on them. Here is a
quick way on how to do it from a terminal, and a detailed guide
on how to get started with stack
.
$ stack --resolver lts-7.16 ghci --package QuickCheck
Configuring GHCi with the following packages:
GHCi, version 8.0.1: http://www.haskell.org/ghc/ :? for help
Loaded GHCi configuration from /tmp/ghci3260/ghci-script
Prelude> :load examples.hs
[1 of 1] Compiling Main ( examples.hs, interpreted )
Ok, modules loaded: Main.
*Main> quickCheck prop_RevRev
+++ OK, passed 100 tests.
*Main> quickCheck prop_RevApp
+++ OK, passed 100 tests.
*Main>
What just happened? QuickCheck called prop_RevRev
and prop_RevApp
100 times
each, with random lists as arguments and declared those tests as passing,
because all calls resulted in True
. Far beyond what a common unit test could have done.
Worth noting, that in reality, not only prop_RevRev
, but both of those
properties are polymorphic and quickCheck
will be happy to work with such
functions, even if type signatures were inferred, and it will run just fine in
GHCi. On the other hand, while writing a test suite, we have to restrict the
type signature for every property to concrete type, such as [Int]
or Char
,
otherwise type checker will get confused. For example, this program will not
compile:
import Test.QuickCheck main :: IO () main = quickCheck (const True)
For the sake of example let's write couple more self-explanatory properties:
prop_PrefixSuffix :: [Int] -> Int -> Bool prop_PrefixSuffix xs n = isPrefixOf prefix xs && isSuffixOf (reverse prefix) (reverse xs) where prefix = take n xs prop_Sqrt :: Double -> Bool prop_Sqrt x | x < 0 = isNaN sqrtX | x == 0 || x == 1 = sqrtX == x | x < 1 = sqrtX > x | x > 1 = sqrtX > 0 && sqrtX < x where sqrtX = sqrt x
Now, this is great, but how did we just pass various functions with different
number of arguments of different types to quickCheck
, and how did it know what
to do with them? Let's look at it's type signature:
λ> :t quickCheck quickCheck :: Testable prop => prop -> IO ()
Testable
So, it seems, that QuickCheck can test anything that is Testable
:
λ> :i Testable class Testable prop where property :: prop -> Property exhaustive :: prop -> Bool instance [safe] Testable Property instance [safe] Testable prop => Testable (Gen prop) instance [safe] Testable Discard instance [safe] Testable Bool instance [safe] (Arbitrary a, Show a, Testable prop) => Testable (a -> prop)
The last instance is for a function (a -> prop)
, that returns a prop
, which,
in turn, must also be an instance of Testable
. This magic trick of a recursive
constraint for an instance definition allows quickCheck
to test a function
with any number of arguments, as long as each one of them is an instance of
Arbitrary
and Show
. So here is a check list of requirements for writing a
testable property:
- Zero or more arguments, which have an instance of
Arbitrary
, that is used for generating random input. More on that later. - Arguments must also be an instance of
Show
, so if a test fails, offending value can be displayed back to a programmer. Return value is either:
True
/False
- to indicate pass/fail of a test case.Discard
- to skip the test case (eg. precondition fails).Result
- to customize pass/fail/discard test result behavior, collect extra information about the test outcome, provide callbacks and other advanced features.Property
for a much finer control of test logic. Such properties can be used as combinators to construct more complex test cases.Prop
used to implementProperty
- Start with
prop_
orprop
, followed by the usualcamelCase
, but that is just a convention, not a requirement. - Has no side effects. Also not a requirement, but strongly suggested, since
referential transparency is lost with
IO
and test results can be inconsistent between runs. At the same time there are capabilities for testing Monadic code, which we will not go into here.
Preconditions
Here is another very simple property of lists xs !! n == head (drop n xs)
,
so let's define it as is:
prop_Index_v1 :: [Integer] -> Int -> Bool prop_Index_v1 xs n = xs !! n == head (drop n xs)
Naturally, you can see a problem with that function, it cannot accept just any
random Int
to be used for indexing, and quickCheck
quickly finds that
problem for us and prints out violating input along with an error:
λ> quickCheck prop_Index_v1 *** Failed! Exception: 'Prelude.!!: index too large' (after 1 test): [] 0
Interestingly, if you try to run this example on any computer, there is a very
good chance that it will give exactly the same output, so, it seems that input to
properties is not completely random. In fact, thanks to the function sized
,
the first input to our property will always be an empty list and an integer 0
,
which tend to be really good corner cases to test for. In our case, though, !!
and head
are undefined for empty lists and negative numbers. We could add some
guards, but there are facilities provided for such common cases:
prop_Index_v2 :: (NonEmptyList Integer) -> NonNegative Int -> Bool prop_Index_v2 (NonEmpty xs) (NonNegative n) = xs !! n == head (drop n xs)
This version is still not quite right, since we do have another precondition
n < length xs
. However, it would be a bit complicated to describe this relation
through the type system, so we will specify this precondition at a runtime using
implication operator (⇒). Note, that return type has changed too:
prop_Index_v3 :: (NonEmptyList Integer) -> NonNegative Int -> Property prop_Index_v3 (NonEmpty xs) (NonNegative n) = n < length xs ==> xs !! n == head (drop n xs)
Test cases with values, that do not satisfy the precondition, will simply get discarded, but not to worry, it will still generate the 100 tests. In fact it will generate up to a 1000 before giving up. An alternate way to achieve similar effect would be to generate a valid index within a property itself:
prop_Index_v4 :: (NonEmptyList Integer) -> Property prop_Index_v4 (NonEmpty xs) = forAll (choose (0, length xs-1)) $ \n -> xs !! n == head (drop n xs)
λ> quickCheck prop_Index_v3 >> quickCheck prop_Index_v4 +++ OK, passed 100 tests. +++ OK, passed 100 tests.
Just in case, let's quickly dissect this for all (∀) business. It takes a random
value generator, which choose
happens to produce, a property that operates on it's
values and returns a property, i.e. applies values from a specific generator
to the supplied property.
λ> :t forAll forAll :: (Show a, Testable prop) => Gen a -> (a -> prop) -> Property λ> sample' $ choose (0, 3) [0,2,2,3,3,3,0,1,0,1,3]
There is a very subtle difference between the last two versions, namely _v3
will discard tests that do not satisfy a precondition, while _v4
will always
generate a value for n
that is safe for passing to index function. This is not
important for this example, which is good, but that is not always the
case. Whenever precondition is too strict, QuickCheck might give up early while
looking for valid values for a test, but more importantly, it can give a false
sence of validity, since most of the values that it will find could be trivial
ones.
Pitfalls
For this section we will use prime numbers in our examples, but rather than
reinventing the wheel and writing functions for prime numbers ourselves we will
use primes package.
Just for fun, let's write a property for primeFactors
, which is based on
Fundamental Theorem of Arithmetic:
prop_PrimeFactors :: (Positive Int) -> Bool prop_PrimeFactors (Positive n) = isPrime n || all isPrime (primeFactors n)
That was incredibly easy and is almost a direct translation of a theorem itself. Let's consider a fact that every prime number larger than 2 is odd, thus we can easily derive a property that sum of any two prime numbers greater than 2 is even. Here is a naive way to test that property:
prop_PrimeSum_v1 :: Int -> Int -> Property prop_PrimeSum_v1 p q = p > 2 && q > 2 && isPrime p && isPrime q ==> even (p + q)
As you can imagine it is not too often that a random number will be prime, this certainly will affect the quality of this test:
λ> quickCheck prop_PrimeSum_v1 *** Gave up! Passed only 26 tests.
It only found 26 satisfiable tests out of a 1000 generated, that's bad. There is even more to it, in order to convince ourselves, that we are testing functions with data that resembles what we expect in real life, we should always try to inspect the values being generated for a property. An easy way to do that is to classify them by some shared traits:
prop_PrimeSum_v1' :: Int -> Int -> Property prop_PrimeSum_v1' p q = p > 2 && q > 2 && isPrime p && isPrime q ==> classify (p < 20 && q < 20) "trivial" $ even (p + q)
λ> quickCheck prop_PrimeSum_v1' *** Gave up! Passed only 29 tests (96% trivial). λ> quickCheckWith stdArgs { maxSuccess = 500 } prop_PrimeSum_v1' *** Gave up! Passed only 94 tests (44% trivial).
Almost all values this property was tested on are in fact trivial ones. Increasing number of tests was not much of a help, because, by default, values generated for integers are pretty small. We could try to fix that with appropriate types, but this time we will also generate a histogram of unique pairs of discovered prime numbers:
prop_PrimeSum_v2 :: (Positive (Large Int)) -> (Positive (Large Int)) -> Property prop_PrimeSum_v2 (Positive (Large p)) (Positive (Large q)) = p > 2 && q > 2 && isPrime p && isPrime q ==> collect (if p < q then (p, q) else (q, p)) $ even (p + q)
λ> quickCheck prop_PrimeSum_v2 *** Gave up! Passed only 24 tests: 16% (3,3) 8% (11,41) 4% (9413,24019) 4% (93479,129917) ...
This is better, there are less trivial values, but still, number of tests is far from satisfactory. It is also extremely inefficient to look for prime values that way, and, for any really large value passed to the property, it will take forever to check its primality. Much better approach would be to choose from a list of prime values, which we have readily available for us:
prop_PrimeSum_v3 :: Property prop_PrimeSum_v3 = forAll (choose (1, 1000)) $ \ i -> forAll (choose (1, 1000)) $ \ j -> let (p, q) = (primes !! i, primes !! j) in collect (if p < q then (p, q) else (q, p)) $ even (p + q)
λ> quickCheck prop_PrimeSum_v3 +++ OK, passed 100 tests: 1% (983,6473) 1% (953,5059) 1% (911,5471) ...
Arbitrary
There could be a scenario where we needed prime values for many tests, then it
would be a burden to generate them this way for each property. In such cases
solution is always to write an instance for Arbitrary
:
newtype Prime a = Prime a deriving Show instance (Integral a, Arbitrary a) => Arbitrary (Prime a) where arbitrary = do x <- frequency [ (10, choose (0, 1000)) , (5, choose (1001, 10000)) , (1, choose (10001, 50000)) ] return $ Prime (primes !! x)
Calculating large prime numbers is pretty expensive, so we could simply use
something like choose (0, 1000)
, similarly to how it was done in
prop_PrimeSum_v3
, but there is no reason why we should exclude generating
large prime numbers completely, instead, we can reduce their chance by
describing a custom distribution with frequency
function.
Now writing prop_PrimeSum
is a piece of cake:
prop_PrimeSum_v4 :: Prime Int -> Prime Int -> Property prop_PrimeSum_v4 (Prime p) (Prime q) = p > 2 && q > 2 ==> classify (p < 1000 || q < 1000) "has small prime" $ even (p + q)
λ> quickCheck prop_PrimeSum_v4 +++ OK, passed 100 tests (21% has small prime).
CoArbitrary
There are quite a few instances of Arbitrary
, many common data types from
base
are, but the most peculiar one is a function:
λ> :i Arbitrary class Arbitrary a where arbitrary :: Gen a shrink :: a -> [a] ... instance [safe] (CoArbitrary a, Arbitrary b) => Arbitrary (a -> b) ...
That's right, QuickCheck can even generate functions for us! One of restrictions
is that an argument to the function is an instance of CoArbitrary
, which also
has instance for a function, consequently functions of any arity can be
generated. Another caveat is that we need an instance of Show
for functions,
which is not a standard practice in Haskell, and wrapping a function in a
newtype
would be more appropriate. For clarity we will opt out from this
suggestion and instead demonstrate this cool feature in action. One huge benefit
is that it allows us to easily write properties for higher order functions:
instance Show (Int -> Char) where show _ = "Function: (Int -> Char)" instance Show (Char -> Maybe Double) where show _ = "Function: (Char -> Maybe Double)" prop_MapMap :: (Int -> Char) -> (Char -> Maybe Double) -> [Int] -> Bool prop_MapMap f g xs = map g (map f xs) == map (g . f) xs
HSpec
One of the first concerns, that programmers usually raise when coming from other languages to Haskell, is that there are situations when unit tests are invaluable, but QuickCheck does not provide an easy way to do that. Bear in mind, QuickCheck's random testing is not a limitation, but rather is a priceless feature of testing paradigm in Haskell. Regular style unit tests and other QA functionality (code coverage, continuous integration, etc.) can be done just as easily as they are done in any other modern language using specialized libraries. In fact, those libraries play beautifully together and complement each other in many ways.
Here is an example of how we can use hspec to create a test suite containing all properties we have discussed so far, plus few extra unit tests for completeness of the picture.
#!/usr/bin/env stack -- stack --resolver lts-7.16 runghc --package QuickCheck --package hspec --package primes module Main where import Test.Hspec import Test.QuickCheck ... main :: IO () main = hspec $ do describe "Reverse Properties" $ do it "prop_RevRev" $ property prop_RevRev it "prop_RevApp" $ property prop_RevApp it "prop_PrefixSuffix" $ property prop_PrefixSuffix describe "Number Properties" $ do it "prop_Sqrt" $ property prop_Sqrt describe "Index Properties" $ do it "prop_Index_v3" $ property prop_Index_v3 it "prop_Index_v4" $ property prop_Index_v4 it "unit_negativeIndex" $ shouldThrow (return $! ([1,2,3] !! (-1))) anyException it "unit_emptyIndex" $ shouldThrow (return $! ([] !! 0)) anyException it "unit_properIndex" $ shouldBe (([1,2,3] !! 1)) 2 describe "Prime Numbers" $ do it "prop_PrimeFactors" $ property prop_PrimeFactors it "prop_PrimeSum_v3" $ property prop_PrimeSum_v3 it "prop_PrimeSum_v4" $ property prop_PrimeSum_v4 describe "High Order" $ do it "prop_MapMap" $ property prop_MapMap
Conclusion
Random testing can be mistakenly regarded as an inferior way of software testing, but many studies have certainly shown that it is not the case. To quote D. Hamlet:
By taking 20% more points in a random test, any advantage a partition test might have had is wiped out.
It is very easy to start using QuickCheck to test properties of pure functions. There is also a very similar toolbox included in the library for testing monadic functions, thus allowing for a straightforward way of testing properties of functions that do mutations, depend on state, run concurrently and even perform I/O. Most importantly, this library provides yet another technique for making Haskell programs even safer.
Writing tests doesn't have to be a chore, it can be fun. We certainly find it fun at FPComplete and will be happy to provide training, consulting or development work.