Typesafe modular arithmetic in Haskell
Let us say, we want to have a calculator where all integers are defined with reference to a modulus ‘n’. This arises in certain cryptography algorithm implementation, where we often define integers over a Group over a prime p.
The question is, is there a way to represent them in a type safe way?
> let a = 10 `mod` 7
> let b = 16 `mod` 13
Though the results of these two moduli look the same, they are different numbers. One is over a modulus 7 and other, over 13. Now, is there a way to carry around the modulus in the resulting number itself, so that if one compares a to b, it should return a False
.
> a == b
False
But that is not all. This is still a run-time check. We would like to get a type error (at compile time) if we do a operation involving two numbers with a different modulus. The role of a type system is to represent the invariants and so it will be great to encode as many invariants of a particular value into the type.
While I was trying to see how I can use modern GHC with some of the Dependent Typed programming extensions, I stumbled on Tikhon Jelvis’ modular-arithmetic package on Hackage and thought it would be a good idea to study the package. What follows is a simplified version of the Data.Modular
module provided by the above package. I am skipping a bunch of code from the package and showing only the essence of the module that shows the power of types.
I am using GHC 8.0.2 for my experiments.
Let us look at a way to represent i `mod` n
where the modulus n
purely lives during compile time and not as a value during run time. A way to do that is via the Phantom Types.
newtype Mod i (n :: Nat) = Mod i deriving (Eq, Ord)
As you can see, the parameter n
does not appear on the right hand side. Using a Phantom type, one no longer need to represent the modulus as a value. The value of the type parameter n
is not used.
For instance, we want to represent 10 `mod` 7
in this form:
> 10 :: Mod Integer 7
As you can see, we are carrying around the modulus in the type. We are using the type constructor, not the data constructor.
Well, we are not there yet. But we will, by the end of the post.
What is the type Nat
in the above newtype
declaration? We get this by importing GHC.TypeLits
.
import GHC.TypeLits
which gives us type-level natural number constants.
One way to think of it would be to think of peano numbers:
data Z
data S n
We omit the data constructors as we are mostly interested in the type constructors here. The number 6 would have a type representation:
type six = S (S (S (S (S (S Z)))))
This is a fine representation. However, it becomes a little involved to represent large numbers. Hence GHC.TypeLits
, which gives us type-level natural numbers and ways to link the type level number to value level.
To do arithmetic, we need to carry the modulus n at run time as well. For example, we want to take any integer i and interpret it as a number i `mod` n
, i.e. as a member of the set ℤ/nℤ
. So, we create a Num
instance for our Mod
type.
instance (Integral i, KnownNat n) => Num (Mod i n) where
fromInteger = toMod . fromInteger
.. and we define toMod
as:
toMod :: forall n i. (Integral i, KnownNat n) => i -> Mod i n
toMod i = Mod $ i `mod` (fromInteger $ natVal (Proxy :: Proxy n))
Let us first understand toMod
and then try to understand fromInteger
.
natVal
’s type signature is defined as:
natVal :: forall n proxy. KnownNat n => proxy n -> Integer
i.e. Given a value of type proxy n
to natVal
, we will get a value of type Integer
back. What the heck is proxy n
?
The proxy
is a type variable of kind “* -> *
” - just like the classic type constructors like Maybe
. It could as well be written as f a
but it is best to call it proxy
here. So, the above type signature says, natVal
takes a single value as input of kind * -> *
and gives out an Integer.
natVal
gives the value of the type level integer at run time. To pass the type level integer, we use the mysteriously looking Proxy
value type annotation with Proxy n
. Proxy is somewhat like the Unit type, () in that there is only one value inhabiting the type. Its use is purely in passing the type annotation into the function natVal so that the type level integer n representing the modulus can be extracted and used at value level to actually do the modulus operation.
Now, natVal
could have used the Proxy
type from Data.Proxy
. i.e.
natVal :: forall n. KnownNat n => Proxy n -> Integer
In the above type signature, the concrete Proxy
type replaces the proxy
type variable. But this requires users of natVal
to import Data.Proxy
, so the usage of the type variable is a common pattern to help the people use it without the need for importing the Data.Proxy
module.
With that in place, fromInteger
is easy to understand. Coming back to the example above:
> > 10 :: Mod Integer 7
We get the answer:
> 3
we are using the above Num
instance to read in the number 10 and convert it into the Mod
type. From the type annotations, we read the number 7 and link that into “value” 7, use it to do the actual mod and give the result 3.
The code with the necessary compiler extensions is below. As I stated above, the code is incomplete and the reader can refer to Data.Modular module to look at the entire code from Tikhon’s package. (For instance, I don’t even define the entire Num
instance)
Credits
I thank Tikhon Jelvis for the modular-arithmetic which educated me on these topics and also for reviewing this blog post.
{-# LANGUAGE KindSignatures #-} -- for (n :: Nat) declaration
{-# LANGUAGE DataKinds #-} -- for Nat (lifting integer to type)
{-# LANGUAGE RankNTypes #-} -- for forall
{-# LANGUAGE ScopedTypeVariables #-} -- when type signature is used inside a fn with a top level forall, we need this for scoping
module Mod where
import GHC.TypeLits
import Data.Proxy (Proxy(..))
newtype Mod i (n :: Nat) = Mod i
instance (Integral i, KnownNat n) => Num (Mod i n) where
fromInteger = toMod . fromInteger
Mod i1 + Mod i2 = toMod (i1 + i2)
Mod i1 * Mod i2 = toMod (i1 * i2)
toMod :: forall n i. (Integral i, KnownNat n) => i -> Mod i n
toMod i = Mod $ i `mod` (fromInteger $ natVal (Proxy :: Proxy n))
unMod :: Mod i n -> i
unMod (Mod i) = i
instance Show i => Show (Mod i n) where
show (Mod i) = show i