Tour of the Haskell Syntax

Author: Arjan van IJzendoorn ( afie@cs.uu.nl )

This document gives an informal overview of the Haskell syntax. A formal syntax can be found at the Haskell homepage. I've made this document because the book we use for teaching here (Haskell School of Expression, Paul Hudak) introduces language constructs by example and the reader may not get a feel for for the language as a whole. For example, to find out what things may serve as a pattern you would have to look through many chapters.

In the informal syntax rules I use a "..."-notation and subscripts. For example

	name pattern₁ pattern₂ ... pattern_n = expression		(n>=0)

This means that there can be zero or more parameters. Valid instantiations are:

	pi = 3.14159265
	nextChar c = chr (ord c + 1)
	simple x y z = x * (y + z)

Types are printed in italic.

Lines printed in bold are not valid Haskell, but show what the language construct looks like in general.

Most examples are Prelude functions which means that if you try to load them the Haskell interpreter or compiler will complain about functions being multiply defined. If you want to try them, you would have to rename them. I chose to use mostly Prelude functions because they are familiar.

Topics that are not discussed: monads, do-notation, class and instance declarations.

Index:

Functions

A simple function definition is formed by the name of the functions followed by zero or more parameters, an equals sign and then an expression. Example:

	simple x y z = x * (y + z)

In general the parameters can be arbitrary patterns:

	name pattern₁ pattern₂ ... pattern_n = expression		(n>=0)

Note that there are no parentheses around or commas between the parameters. It is good practice to write a type signature above the definition. In function definitions you can use guards and pattern-matching to make choices.

You can also bind a pattern to an expression.

Pattern bindings

Instead of simply binding an expression to a variable you can bind a pattern to it. It is wise to choose patterns that always succeed in this case:

	(xs, ys) = unzip listOfPairs

	(x:y:z:rest) = [1..]

In general,

	pattern = expression

For another example, see the let-expression.

Type signatures

It is a good idea to always write a type signature above a definition. It tells the reader what types of parameters to provide, serves as documentation and type errors are detected closer to their origin. A type signature is formed by the name of the function, two colons and then a type:

	simple :: Int -> Int -> Int

In general:

	name :: type

If the function is an operator, parentheses are used around the name:

	(++) :: [a] -> [a] -> [a]

The type signature does not have to placed directly above the definition but it is customary to do so. One signature can declare the type of several functions if you use commas to separate them:

	take, drop :: Int -> [a] -> [a]

Guards

Guards can be used to make choices in functions based on boolean expressions:

	-- max x y returns the largest of the two
	max :: Ord a => a -> a -> a
	max x y
	  | x > y = x
	  | otherwise = y

In general:

	name pattern₁ pattern₂ ... pattern_n 
	  | guard₁ = expression₁
	  | guard₂ = expression₂
	  ...
	  | guard_m = expression_n 		(n>=0, m>=1)

Instead of an equals sign directly after the parameters you write several guarded expressions. A guarded expression consists of a vertical bar followed by a boolean expression, an equals sign and then an expression. The guards are tried in textual order until one of them results True. The corresponding expression is then evaluated. otherwise is a guard that always results True. There is nothing special about it; in the Prelude it is defined as:

	otherwise :: Bool
	otherwise = True

If none of the guards succeeds a program error is raised and evaluation stops.

	sign :: Int -> String
	sign x 
	  | x > 0 = "+"
	  | x < 0 = "-"

In Hugs:

Main> sign 0
"
Program error: {sign 0}

Guards can be combined with pattern-matching.

Pattern matching

In a function definition you can match arguments against patterns:

	fst :: (a, b) -> a
	fst (x, y) = x

This pattern will always succeed, but there are patterns that may fail. In that case you write several clauses. Each clause looks like a function definition and together they form the complete function definition:

	length :: [a] -> Int
	length [ ] = 0
	length (x:xs) = 1 + length xs

The clauses are tried in textual order until one is found that matches. You can perform pattern-matching on several parameters at the same time:

	zip :: [a] -> [b] -> [(a, b)]
	zip [] ys = []
	zip xs [] = []
	zip (x:xs) (y:ys) = (x, y) : zip xs ys

If none of the clauses match, a program error is raised:

	head :: [a] -> a
	head (x:xs) = x

In Hugs:

	Main> head []

	Program error: {head []}

Pattern matching can be combined with guards.

Combining guards and pattern matching

In a function you can use a combination of guards and pattern matching:

	filter :: (a -> Bool) -> [a] -> [a]
	filter p [] = []
	filter p (x:xs)
	  | p x = x : filter p xs
	  | otherwise = filter p xs

The patterns are tried in textual order. If a match found there may be guards that have to be tested; the expression behind the first guard that evaluates to True is then evaluated. If no guard holds then the next clause is tried (if there is any). This fact is used in the following function:

	allToUpper :: [Char] -> [Char]
	allToUpper (x:xs) 
	  | isLower x = toUpper x : allToUpper xs
	allToUpper (x:xs) = x : allToUpper xs
	allToUpper [] = []

Even though it is allowed, I personally find this code confusing.

Patterns

The table below shows the things that can appear as patterns. Pattern is abbreviated to pat and constructor to Con. As you can see, patterns can be arbitrarely nested.

*Pattern in general*	*Examples*	*Description*
[ ]	[ ]	Matches an empty list.
pat₁:pat₂	x:xs	Matches a non-empty list if the head matches `pattern₁` and the tail matches `pattern₂`.
(pat₁,pat₂ ... , pat_n) (n>=2)	(x, y) (x:xs, 3, y)	Matches a tuple provided that the components match the corresponding patterns.
name	x ys list	Matches always. Binds the variable to the expression it is matched against.
_	_	Matches always. No bindings are made.
name@pat	l@(x:xs) database@(name, fields, records)	Matches when the pattern matches. Additionally binds the name to the whole expression.
"string"	"abc"	Matches the given string.
constant	0 'b' 3.14	Matches when the parameter evaluates to the constant
Con pat₁,pat₂ ... , pat_n (n>=0)	Branch value left right True Just 3	Matches if the expression evaluates to the constructor and the parameters match the corresponding patterns.
[pat₁,pat₂ ... , pat_n]	[1,2,3] [(x,y),(3,'a')]	Matches a list if the elements match the corresponding patterns.

Expressions

Expression is abbreviated to expr. In the table, some forms of expressions are listed. Other forms are discussed elsewhere: let, if-then-else, case, lists, tuples, list comprehensions, lambda.

In general	Example	Description
name	x xs map foldl'	variable, function
constant	0 'b' 3.14	constant
expr expr₁ expr₂ ... expr_n	map f xs foldr (:) [] xs plus 3 4 (fst funs) 5	function application
expr1 operator expr2	3+4 3 : list x ^^ 2	operator application

List comprehensions

List comprehensions can often be used instead of map, filter and concat to produce nicer results:

	[ x*x | x <- [1..3] ]
	[ (x, y) | x <- [1..3], y <- "ab"]

The expression to the left of the vertical bar is computed for each combination of values from the generators on the right. The right-most generator changes more quickly, i.e. in the 2nd example first all possibilities of y are tried before we look at the next x. The expressions above have the following results:

	[1,4,9]
	[(1,'a'),(1,'b'),(2,'a'),(2,'b'),(3,'a'),(3,'b')]

It is also possible to filter elements in a list comprehension. This is done by writing a expression of type Bool among the generators. The expression can only refer to earlier generators:

	[ (x,y) | x <- [1..4], even x, y <- "ab" ]

results in

	[(2,'a'),(2,'b'),(4,'a'),(4,'b')]

You can use list comprehensions to write elegant definitions for the Prelude functions map and filter:

	map :: (a -> b) -> [a] -> [b]
	map f xs = [ f x | x <- xs ]

	filter :: (a -> Bool) -> [a] -> [a]
	filter p xs = [ x | x <- xs, p x ]

Lambda expressions

Anonymous function can be made using lambda expressions. For example

	\x -> x + 1

is a function with one parameter which it will add one to. Lambda expressions prove useful as arguments to for example map:

	map (\xs -> zip xs [1..]) list

In general:

	\pattern₁ pattern₂ ... pattern_n -> expression		(n>=1)

The dot dot notation

You can use the .. notation as a short-hand to write down lists:

Expression	Result	Note
[1..10]	[1,2,3,4,5,6,7,8,9,10]
[1,4..20]	[1,4,7,10,13,16,19]	Specify second element to change distance between elements
[10, 9.. 1]	[10,9,8,7,6,5,4,3,2,1]	Step size may be negative
[2..]	[2,3,4,5...	You can build infinite lists, too
[1,3..]	[1,3,5,7...
['a'..'z']	"abcdefghijklmnopqrstuvwxyz"	It works for all types in the class Enum

Lists

Lists are ordered collections of elements of the same type:

	[3,1,4,1,59,265]	['a', 'b']	[(3,'a'), (4,'m')]

The type of the above expressions are:

	[Int]			[Char]		[(Int, Char)]

Lists are viewed by Haskell as being empty or having a head (the first element) and a tail (the rest of the elements). Using pattern-matching you can find out whether the list is empty or not and if not continue pattern-matching on the head and the tail:

	length :: [a] -> Int
	length [] = 0
	length (_:xs) = 1 + length xs

The constructors [] and : can not only be used as patterns but also to construct lists:

	map :: (a -> b) -> [a] -> [b]
	map f [] = []
	map f (x:xs) = f x : map f xs

Actually, [1,2,3] is a short-hand for 1:2:3:[].

There are several special notations for lists; see String, list comprehensions and the dot dot notation.

Tuples

With tuples you can combine values of (possibly) different types:

	(10, 20)	('a', 10)	([1..3], "a string", True, 'c')

The type of a tuple mentions the types of the components. Above expressions have the following types:

	(Int, Int)	(Char, Int)	([Int], String, Bool, Char)

Tuples can be used if you want to return multiple results from a function:

	headAndTail :: [a] -> (a, [a])
	headAndTail (x:xs) = (x, xs)

case

With a case expression you can perform pattern matching at the expression level (as opposed to at the function definition level):

	length :: [a] -> Int
	length xs =
	  case xs of
	    [] -> 0
	    (x:xs) -> 1 + length xs

You can match one expression against several patterns.

let

Using let you can define local functions:

	sum :: Num a => [a] -> a
	sum ys = 
	  let sum' [] total = total
	      sum' (x:xs) total = sum' xs (total+x)
	  in sum' ys 0

It also comes in handy when you define a recursive function that returns a tuple. You can unwrap the tuple immediately:

	unzip :: [(a, b)] -> ([a], [b])
	unzip [] = ([], [])
	unzip ((a, b):rest) = 
	  let (as, bs) = unzip rest 
	  in (a:as, b:bs)

In general a let-expression looks like this:

	let functionDefinition₁
	    functionDefinition₂
	    ...
	    functionDefinition_n
	in expression			(n>=1)

The scope of the local functions is the expression to the right of the keyword "in". The "where"-construct is another way to define local functions. The two differ only in scoping.

where

Using where you can define local functions:

	sum :: Num a => [a] -> a
	sum ys = sum' ys 0
	  where
	    sum' [] total = total
	    sum' (x:xs) total = sum' xs (total+x)

The functions defined in a where can be seen only in the clause it is next to:

	sum :: Num a => [a] -> a
	sum [] = sum' [] 0 -- wrong, sum' can not be seen here!
	sum ys = sum' ys 0
	  where
	    sum' [] total = total
	    sum' (x:xs) total = sum' xs (total+x)

In general, the where construct looks like this:

	clauseOfFunctionDefinition 
	  where 
	    functionDefinition₁
	    functionDefinition₂
	    ...
	    functionDefinition_n 		(n>=1)

The let-expression is another way to define local functions. The two differ only in scoping.

if

At the expression level you can make choices using the if-then-else construct. Here is an alternative definition of filter (see also Combining guards and pattern matching):

	filter :: (a -> Bool) -> [a] -> [a]
	filter p [] = []
	filter p (x:xs) = if p x then x : filter p xs else filter p xs

In general:

	if expression₁ then expression₂ else expression₃

The first expression must be of type Bool. The other two expressions must be of the same type. If the first expression evaluates to True the second expression is returned, otherwise the third.

When many choices have to made guards can come in handy. Instead of:

	kindOfChar :: Char -> String
	kindOfChar c = 
	  if isLower c 
	  then "lower" 
	  else if isUpper c 
	       then "upper" 
	       else if isDigit c 
	            then "digit" 
	            else "other"

you can write:

	kindOfChar :: Char -> String
	kindOfChar c
	  | isLower c = "lower"
	  | isUpper c = "upper"
	  | isDigit c = "digit"
	  | otherwise = "other"

Types

A type can be any of the following:

A type without parameters, e.g.Int, Float, Char, Piece
A tuple type, e.g. (Int, Char)
A list type, e.g. [Int]
A function type, e.g. Int -> Int
A type with parameters, e.g. Tree Char, Maybe
A type variable, e.g. a

The -> in types associates to the right.

A type can also be overloaded which means that a polymorphic type variable is restricted in which types are acceptable:

  max :: Ord a => a -> -> a -> a

Only types that are in the Ord class can be used when calling max. For more than one restriction the tuple notation is used:

  showLargest :: (Ord a, Show a) => a -> a -> String
  showLargest x y = show (max x y)

Data types

You can make new types in Haskell:

	data Piece = Pawn | Rook | Knight | Bishop | Queen | King

Pawn, Rook and so on are called constructors of the type Piece. Constructors are valid patterns in function definitions and valid expressions:

	promotesTo :: Piece -> [Piece]
	promotesTo Pawn = [ Rook, Knight, Bishop, Queen ]
	promotesTo piece = [ piece]

Constructors can have parameters:

	data Shape
	  = Ellipse Float Float
	  | Square Float
	  | Polygon [(Float, Float)]

Ellipse is now a constructor function of type Float -> Float -> Shape, so Ellipse 3.0 4.5 is a valid expression of type Shape.

Data types can be parametrised:

	data Maybe a = Just a | Nothing

Nothing is of type Maybe a, Just is of type a -> Maybe a, so Just 'a' is of type Maybe Char.

	safeDivision :: Float -> Float -> Maybe Float 
	safeDivision x y = if y == 0 then Nothing else Just (x/y)

Finally, data types may be recursive:

	data Tree a 
	  = Leaf a
	  | Node (Tree a) (Tree a)

And here is a function that computes the sum of all the elements of a tree containing Ints:

	sumTree :: Tree Int -> Int
	sumTree (Leaf value) = value
	sumTree (Node left right) = sumTree left + sumTree right

In general, a data type looks like this:

	data TypeName typeVariable₁ typeVariable₂ ... typeVariable_n 
	  = Constructor₁ type_1,1 type_1,2 ... type_1,m1
	  | Constructor₂ type_2,1 type_2,2 ... type_2,m2
	  ...
	  | Constructor_p type_p,1 type_p,2 ... type_p,mp(n>=0, p>=1, m₁..m_p >=0)

Type synonyms

Using type synonyms you can give a new name to an existing type. String is a good example of this:

	type String = [Char]

Type synonyms may also be parametrised with type variables:

	type Pair a b = (a, b)

Now Pair Int Char is the same as (Int, Char).

In general a type synonym looks like this:

	type Name parameter₁ parameter₂ ... parameter_n = type		(n>=0)

Char

Values of the built-in type Char are characters. You write them between single quotes:

	'a' '\n' (newline) ' ' '7'

Int

Values of the built-in type Int are whole numbers. They are limited in size (32 bits):

	Main> 100 * (100::Int)
	10000
	Main> 10000000 * (10000000 :: Int)
	276447232

The type annotation is necessary. Otherwise, Hugs will chose Integer as the type of the numbers. If you want to compute with really big numbers you should use the Integer type.

You can convert an Int to a floating-point number (Floator Double) using fromInt.

Integer

Values of the built-in type Integer are whole numbers. They are unlimited in size (apart from memory limitations):

	Main> 100 * (100::Integer)
	10000
	Main> 10000000 * (10000000 :: Integer)
	100000000000000

Compare this to the behaviour of Int.

You can convert an Integer to a floating-point number (Floator Double) using fromInteger.

Float

Values of the built-in type Float are floating-point numbers:

	Main> 10 / 2.4
	4.16667

You can convert a floating-point number to an Int or Integer using truncate and round.

Double

See Float. Doubles offer more precision.

Bool

Booleans are built-in but they can easily be defined using a data type definition:

	data Bool = True | False

Expressions of type Bool can be used as the condition in an if-then-else or as a guard.

String

The type String is simply a type synonym for a list of characters (Char):

	type String = [Char]

There is a convenient notation for strings. Instead of writing

	['h','e','l','l','o',' ','w','o','r','l','d']

you can write

	"hello world"

You can also use this notation in patterns.

Type annotations

You can annotate an expression with its type:

	length [] = 0 :: Int
	length (x:xs) = 1 + length xs

You may want to use this if you don't want a constant to be overloaded, like in the example.

Operators

Operators are functions with a special name. Their name may consist of the following symbols !#$%&*+./<=>?@\^|-~ . The name of a constructor operator must start with a colon (:). When you write down the type of an operator you have to put the name between parentheses:

	(+++) :: (a, b) -> (b, c) -> (a, b, c)
	(x, y) +++ (y, z) = (x, y, z)

Putting an operator between parentheses in expressions allows you to use it as a prefix function:

	(+) 3 4		(:) 3 [1,4]

On the other hand, you can write a function between back quotes and treat it as an operator:

	10 `mod` 3	4 `elem` listOfNumbers

Parentheses

In Haskell you don't need parenthesis around and commas between parameters. You can write them but it means something different:

	plus :: (Int, Int) -> Int
	plus (x, y) = x + y

This is a function with one parameter, namely a pair of values. Normally, you would define it like this:

 	plus :: Int -> Int -> Int
	plus x y = x + y

This allows you to partially parametrise the function:

	map (plus 3) [1..10]

In function application you need parentheses only if Haskell would misinterpret it. The rules are that application binds stronger than anything else and associates to the left:

 	map f filter g xs

This is seen as map receiving four arguments which would be a type error. Probably, this is meant:

 	map f (filter g xs)

A simple rule of thumb is to write parentheses when you pass something 'complex'. Simple things are constants, variables and things that have brackets of their own (tuples, lists, list comprehensions).

Quotes

Here is an overview of all the uses of quotes in Haskell and their meaning:

'a'	a value of type Char
"yada"	a value of type String
`mod`	function mod used as an operator, e.g. 12 `mod` 5
foldl'	a single quote as part of the name of a function

Comments

One-line comments start with two minus signs and end at the end of the line. Multi-line comments start with {- and end with -}.

	{- Filter selects those elements for
           which the given predicate returns True 
        -}
	filter :: (a -> Bool) -> [a] -> [a] -- filter is polymorphic!
	filter p [] = []
	filter p (x:xs) = if p x then x : filter p xs else filter p xs

Names

Normal names (as opposed to operator names) may contain letters, digits and the two symbols ' and _. Names of functions, parameters, variables and type variables must start with a small letter or a _. Names of types and constructors must start with an upper case letter.

Valid names:

	x foldl' long_name unzip3 concatMap _yada 
	Bool Cons TreeNode Int

See also operators for rules about their names.