I have the following sum type

import Data.Aeson
import Data.Aeson.Casing
import GHC.Generics

data OrderType = Confirmed | AwaitingShipping | Shipped
  deriving (Eq, Generic)

instance ToJSON OrderType where
  toJSON = genericToJSON $ (aesonPrefix snakeCase){constructorTagModifier=(camelTo2 '_')}

This results in the following conversion during JSON encoding:

Confirmed => confirmed
AwaitingShipping => awaiting_shipping
Shipped => shipped

How do I quickly generate a Show instance that has the exact same OrderType => String conversion?

Please note, I know I can do the following, but I'm looking for a way to avoid this boilerplate.

instance Show OrderType where
  show Confirmed = "confirmed"
  show AwaitingShipping = "awaiting_shipping"
  show Shipped = "shipped"
share|improve this question
    
Why does it have to be an instance of Show? How about deriving the standard Show, and then composing camelTo2 '_' . show? – Li-yao Xia May 26 at 5:12
    
Clever! Let me try... – Saurabh Nanda May 26 at 5:16
    
I redefined ToJSON via the standard Show instance and it works perfectly fine. instance ToJSON TxnSource where toJSON = toJSON.(camelTo2 '_').show So, this technique can be used at other places as well. Any idea how to do this for FromJSON? – Saurabh Nanda May 26 at 5:28
    
Taking the inverses I would look for something like read . toCamel . fromJSON, except the last bit is actually parseJSON and it is monadic: instance FromJSON TxnSource where parseJSON = fmap (read . toCamel) . parseJSON. You need to define toCamel doing the inverse of camelTo2 '_'. You probably want to use readMaybe too. – Li-yao Xia May 26 at 5:42
    
That's exactly what I'm wondering. How does FromJSON use the same camelTo function during parsing? Here's the FromJSON instance that does the camel<>Haskell conversion absolutely fine -- ` parseJSON = genericParseJSON $ (aesonPrefix snakeCase){constructorTagModifier=(camelTo2 '_')}` – Saurabh Nanda May 26 at 5:46

I assumed there was a reason it had to be an instance of Show, otherwise camelTo2 '_' . show seems to do the job.

In any case, here's how you can get the constructor name with GHC.Generics. Then you can write camelTo2 '_' . constructorName with no additional setup required; in particular, you are free to use this as an implementation of show.

import GHC.Generics

-- Constructor name of the value of an ADT.
-- Using 'Generic.from', we map it to a generic representation.
constructorName :: (Generic a, CName (Rep a)) => a -> String
constructorName = cname . from

-- Class of generic representations of ADTs, built using
-- types in GHC.Generics.
-- 'cname' extracts the constructor name from it.
class CName f where
  cname :: f p -> String

-- M1 is a newtype to attach metadata about the type
-- being represented at the type level.
-- The first parameter marks the kind of the data
-- in the second one. 'D' indicates general information
-- like the type name and whether it is a newtype.
-- Here we ignore it and look in the rest of the representation.
instance CName f => CName (M1 D c f) where
  cname (M1 f) = cname f

-- '(:+:)' represents sums.
instance (CName f, CName g) => CName (f :+: g) where
  cname (L1 f) = cname f
  cname (R1 g) = cname g

-- 'M1' again, but 'C' indicates information about a specific
-- constructor, we extract it using the 'GHC.Generics.Constructor'
-- type class.
instance Constructor c => CName (M1 C c f) where
  cname = conName
share|improve this answer
    
I'm trying to understand this code. Would it be possible to please add some comments to explain what is conceptually happening here? – Saurabh Nanda May 26 at 5:52
    
May I trouble you for the reverse, i.e. given a text/string, how to get a constructor name and value from it? – Saurabh Nanda Jun 4 at 7:02

(I've edited my answer to have significantly more explanations. If you just want a module containing the code, it's still available.

The question aims at changing the default Show and Read instances for enumeration types such as OrderType and providing custom ones. I'll show below how that can be done, although in principle I advise against doing so, because Show and Read are typically supposed to produce Haskell representations of the values. I'll also suggest a different solution, however, by going via new type classes.

My solution is similar to the one proposed by Li-yao Xia, but based on generics-sop rather than built-in GHC generics.

We are using the following module header.

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}
module CustomShowEnum where

import Data.Aeson
import Data.Aeson.Types
import Data.Maybe
import Generics.SOP
import Generics.SOP.NS
import Generics.SOP.TH
import Text.Read

Let's start with a function that computes a product (a list with a statically known number of elementS) of all the constructor names.

conNames ::
  forall a proxy .
  (Generic a, HasDatatypeInfo a)
  => proxy a -> NP (K String) (Code a)
conNames _ =
  hmap
    (K . constructorName)
    (constructorInfo (datatypeInfo (Proxy @a)))

The datatypeInfo function provides all meta-information about a given datatype, the constructorInfo function extracts from that a product with meta-information about each constructor. We are only interested in the names, nothing else, so we use hmap over the product to extract the constructor names in each position.

Let's see how we can use it:

GHCi> conNames (Proxy @Bool)
K "False" :* (K "True" :* Nil)

Read Nil as the empty product, and :* as "cons". Each element is wrapped in a K constructor, because it is a product containing a (constant) String for every constructor of the datatype.

The same works on other datatypes:

GHCi> conNames (Proxy @Ordering)
K "LT" :* (K "EQ" :* (K "GT" :* Nil))
GHCi> conNames (Proxy @(Maybe ()))
K "Nothing" :* (K "Just" :* Nil)

We can also make it work on the OrderType mentioned in the question:

data OrderType = Confirmed | AwaitingShipping | Shipped

But if we try this blindly, then we get an error that we have no instances for the Generic and HasDatatypeInfo classes. For generics-sop functions to work, types must be an instance of these classes. One way to achieve this is to use Template Haskell:

deriveGeneric ''OrderType

(Another way for people who dislike Template Haskell is mentioned in the library documentation.)

Now, we can use conNames:

GHCi> conNames (Proxy @OrderType)
K "Confirmed" :* (K "AwaitingShipping" :* (K "Shipped" :* Nil))

A variant of this is a function that takes a specific value, and computes the outermost constructor that built that value.

conName ::
  forall a .
  (Generic a, HasDatatypeInfo a)
  => a -> String
conName x =
  hcollapse
    (hzipWith
      const
      (conNames (Proxy @a))
      (unSOP (from x))
    )

Here, we use from to compute the generic representation of the given value, which is a sum of products. The sum encodes a choice between one of the constructors of the datatype. We can use hzipWith to combine a compatible product (a product of n values) and sum (a choice of option i out of n possible options), and it will select the i'th position of the product and combine the two. By using const to combine the two, the effect is that we'll just return the constructor name corresponding to the given constructor from our conNames product. The hcollapse application in the end extracts the single String value.

Let's look at a number of examples again:

GHCi> conName Confirmed
"Confirmed"
GHCi> conName (Just 3)
"Just"
GHCi> conName [1,2,3]
":"

Note that in the last example, the list is, on the top-level, just an application of cons.

Next, we define a function enum that computes a product of all the values of an enumeration type. This is similar to conNames, but rather than returning the constructor names (as strings), we return the actual constructors.

enum ::
  forall a .
  (Generic a, HasDatatypeInfo a, All ((~) '[]) (Code a))
  => NP (K a) (Code a)
enum =
  hmap
    (mapKK to)
    (apInjs'_POP (POP (hcpure (Proxy @((~) '[])) Nil)))

The apInks'_POP function produces a product of all the constructor functions in their generic representation. These still have to be applied to a representation of their arguments, and we need to provide these arguments as a product of products (a two-dimensional table with one row per constructor, each row containing the arguments to be applied to that particular constructors).

Fortunately, we restrict ourselves to enumeration types here. These are types without any constructor arguments. This is expressed by the constraint All ((~) '[]) (Code a). The code of a type is a list of list of types. The outer list contains an entry per constructor, the inner lists give the types of the constructor arguments. The constraint states that each of the inner lists must be empty, which is equivalent to each of the constructors having no arguments.

We can therefore produce a product of empty argument lists, which is what we do via POP (hcpure (Proxy (@((~) '[])) Nil)).

Finally, we use hmap with to to turn each of the constructed values back from their generic representation into their original shape.

Let's look at examples:

GHCi> enum @Bool
K False :* (K True :* Nil)

Compare this again with

GHCi> conNames (Proxy @Bool)
K "False" :* (K "True" :* Nil)

and note that in one case, we return strings, in the other, we return actual values.

GHCi> enum @Ordering
K LT :* (K EQ :* (K GT :* Nil))

If we try to apply enum to a type that is not an enumeration type, we get a type error.

If we try to apply enum to OrderType, we get an error that a Show instance for OrderType is lacking.

If we derive one via

deriving instance Show OrderType

we obtain:

GHCi> enum @OrderType
K Confirmed :* (K AwaitingShipping :* (K Shipped :* Nil))

If we use the custom Show instance that was desired in the question and that I show how to define below, we instead get

GHCi> enum @OrderType
K confirmed :* K awaiting_shipping :* K shipped :* Nil

This also demonstrates why it might not be such a good idea to change the instance, because we now see that the output of show used by GHCi to print back the result mixes standard Haskell notation with special notation intended to be used in a particular domain.

Before we go there, however, let's first implement one final utility function that we'll need for the parsing direction:

conTable ::
  forall a .
  (Generic a, HasDatatypeInfo a, All ((~) '[]) (Code a))
  => [(String, a)]
conTable =
  hcollapse
    (hzipWith
      (mapKKK (,))
      (conNames (Proxy @a))
      enum
    )

The conTable function computes a lookup table associating the string constructor names with the actual values. We have the functions to compute the two products, conNames and enum. We use hzipWith with (,) to pair them. The result is another product, but because the product contains the same type in every position, we can use hcollapse to turn it into a normal Haskell list.

GHCi> conTable @Bool
[("False", False), ("True", True)]
GHCi> conTable @Ordering
[("LT", LT), ("EQ", EQ), ("GT", GT)]
GHCi> conTable @OrderType
[("Confirmed", Confirmed), ("AwaitingShipping", AwaitingShipping), ("Shipped", Shipped)]

The final example is using the default / derived Show instance.

With these ingredients in place, we are now able to implement custom show and read replacements for enumeration types. The show direction quite easy:

customShowEnum ::
  forall a .
  (Generic a, HasDatatypeInfo a, All ((~) '[]) (Code a))
  => (String -> String)
  -> a -> String
customShowEnum f = f . conName

Given a value, we use conName to determine its constructor and then apply the given conversion function to the result.

This function would work for all types in Generic and HasDatatypeInfo, so the All ((~) '[]) (Code a) constraint that restricts it to enumeration types is optional.

Here are a few examples:

GHCi> customShowEnum id AwaitingShipping
"AwaitingShipping"
GHCi> customShowEnum reverse Confirmed
"demrifnoC"
GHCi> customShowEnum (camelTo2 '_') AwaitingShipping
"awaiting_shipping"

For the read replacement, we implement a function that can be used to defined the readPrec method of the Read class. This produces a parser of type ReadPrec a:

readPrec :: Read a => ReadPrec a

The basic strategy is as follows: We start from the lookup table given by conTable. We adjust the strings in this lookup table using the same conversion function that we also used in customShowEnum. Given an input string, we then try to find it in the adjusted lookup table, and if we find it, we return the associated value. The code looks as follows:

customReadEnum ::
  forall a .
  (Generic a, HasDatatypeInfo a, All ((~) '[]) (Code a))
  => (String -> String)
  -> ReadPrec a
customReadEnum f =
  let
    adjustedTable :: [(Lexeme, a)]
    adjustedTable = map (\ (n, x) -> (Ident (f n), x)) conTable
  in
    parens $ do
      n <- lexP
      maybe pfail return (lookup n adjustedTable)

This essentially follows the description above: parens additionally allows the constructor name to be wrapped in parentheses, as is generally allowed by read, and the use of lexP additionally handles whitespace. If the lookup in the table fails, we let the parser fail using pfail.

If we want to try this, we have to run the ReadPrec parser by applying readPrec_to_S, which then expects a precedence level (irrelevant in this case) and an input string and returns a list of pairs containing pairs of possible parses and remaining strings:

GHCi> readPrec_to_S (customReadEnum @OrderType id) 0 "AwaitingShipping"
[(AwaitingShipping, "")]
GHCi> readPrec_to_S (customReadEnum @OrderType (camelTo2 '_')) 0 "AwaitingShipping"
[]
GHCi> readPrec_to_S (customReadEnum @OrderType (camelTo2 '_')) 0 "awaiting_shipping"
[(AwaitingShipping, "")]
>>> readPrec_to_S (customReadEnum @OrderType (camelTo2 '_')) 0 "   ( awaiting_shipping)  "
[(AwaitingShipping, "  ")]

If we now wanted to use customReadShow and customReadEnum to define the Show and Read instance for OrderType, we could do this simply as follows:

instance Show OrderType where
  show = customShowEnum (camelTo2 '_')

instance Read OrderType where
  readPrec = customReadEnum (camelTo2 '_')

However, as I said above, I'd advise to just use the derived instances here to avoid confusion, and for the domain-specific textual representation, I would just introduce new classes, for example:

class ToString a where
  toString :: a -> String

class FromString a where
  fromString :: String -> Maybe a

instance ToString OrderType where
  toString = customShowEnum (camelTo2 '_')

customFromString ::
  forall a .
  (Generic a, HasDatatypeInfo a, All ((~) '[]) (Code a))
  => (String -> String)
  -> String -> Maybe a
customFromString f x =
  case readPrec_to_S (customReadEnum f) 0 x of
    [(r, "")] -> Just r
    _         -> Nothing

instance FromString OrderType where
  fromString = customFromString (camelTo2 '_')

We could also go a few steps further:

  • We could use default signatures that map toString and fromString either to the default Show / Read behaviour, or to our custom behaviour, so that we can provide empty instances or using deriving in the more common of the two cases.

  • We could use a third class to associate a particular conversion function with a given type and use this class in our generic definitions, to make it more obvious that the same function is used for both directions and thereby remove a source of potential errors.

share|improve this answer
    
This is a gem of an answer. I'm going through youtube.com/watch?v=sQxH349HOik to be able to understand and completely internalise Generics SOP. Is it alright if I ask follow-up questions via these comments? – Saurabh Nanda Jun 1 at 13:29
    
First question -- what's with @Foo syntax. It's the first time I'm seeing this. – Saurabh Nanda Jun 1 at 13:29
    
In the video, most examples deal with the I constructor. What is the K constructor for? What others are there and is there a document describing the differences? – Saurabh Nanda Jun 1 at 13:31
1  
The @Foo syntax is a type application, available since GHC 8. If you have any polymorphic function, you can use @<sometype> to instantiate it. So, for example id @Char :: Char -> Char. In practice, this can often be used in situation where you would otherwise need extra type annotations or even proxies in order to help the GHC type checker. – kosmikus Jun 1 at 17:47
    
The I constructor is identity, defined newtype I a = I a. The K constructor is for "constant, define newtype K c a = K c. In practice, if you parameterize one of the NP-like structures by I, you get a heterogeneous structure, containing in every position what the index list indicates. If you instantiate it to K Something, you get a homogeneous structure containing a Something in every position. – kosmikus Jun 1 at 17:50

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.