
Proposed conduit reskin

September 23, 2016


In a few different conversations I've had with people, the idea of reskinning some of the surface syntax of the conduit library has come up, and I wanted to share the idea here. I call this "reskinning" since all of the core functionality of conduit would remain unchanged in this proposal; we'd just be changing operators and functions a bit.

The idea here is: conduit borrowed the operator syntax of $$, =$ and $= from enumerator, and it made sense at the beginning of its lifecycle. However, conduit has for quite a while now had a unified type for Sources, Conduits, and Sinks, and the disparity of operators adds more confusion than it may be worth. So without further ado, let's compare a few examples of conduit usage between the current skin:

import Conduit
import qualified Data.Conduit.Binary as CB

main :: IO ()
main = do
    -- copy files
    runResourceT $ CB.sourceFile "source.txt" $$ sinkFile "dest.txt"

    -- sum some numbers
    print $ runIdentity $ enumFromToC 1 100 $$ sumC

    -- print a bunch of numbers
    enumFromToC 1 100 $$ mapC (* 2) =$ takeWhileC (< 100) =$ mapM_C print

With a proposed reskin:

import Conduit2
import qualified Data.Conduit.Binary as CB

main :: IO ()
main = do
    -- copy files
    runConduitRes $ CB.sourceFile "source.txt" .| sinkFile "dest.txt"

    -- sum some numbers
    print $ runConduitPure $ enumFromToC 1 100 .| sumC

    -- print a bunch of numbers
    runConduit $ enumFromToC 1 100 .| mapC (* 2) .| takeWhileC (< 100) .| mapM_C print

This reskin is easily defined with this module:

{-# LANGUAGE FlexibleContexts #-}
module Conduit2
    ( module Conduit
    , module Conduit2
    ) where

import Conduit hiding (($$), (=$), ($=), (=$=))
import Data.Void (Void)

infixr 2 .|
(.|) :: Monad m
     => ConduitM a b m ()
     -> ConduitM b c m r
     -> ConduitM a c m r
(.|) = fuse

runConduitPure :: ConduitM () Void Identity r -> r
runConduitPure = runIdentity . runConduit

runConduitRes :: MonadBaseControl IO m
              => ConduitM () Void (ResourceT m) r
              -> m r
runConduitRes = runResourceT . runConduit

To put this in words:

Replace the $=, =$, and =$= operators - which are all synonyms of each other - with the .| operator. This borrows intuition from the Unix shell, where the pipe operator denotes piping data from one process to another. The analogy holds really well for conduit, so why not borrow it? (We call all of these operators "fusion.")
Get rid of the $$ operator - also known as the "connect" or "fuse-and-run" operator - entirely. Instead of having this two-in-one action, separate it into .| and runConduit. The advantage is that no one needs to think about whether to use .| or $$, as happens today. (Note that runConduit is available in the conduit library today; it's just not very well promoted.)
Now that runConduit is a first-class citizen, add in some helper functions for two common use cases: running with ResourceT and running a pure conduit.
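
To make the first two points concrete, here is a minimal sketch that writes the same pipeline once with the current operators and once with the proposed ones. It assumes only the long-standing Data.Conduit and Data.Conduit.List modules; .| and runConduitPure are defined locally, mirroring the Conduit2 module above, and the names oldStyle and newStyle are made up for illustration.

```haskell
-- Explicit import list, so the locally defined (.|) cannot clash with
-- any operator of the same name that the library might export.
import Data.Conduit (ConduitM, fuse, runConduit, ($$), ($=), (=$))
import qualified Data.Conduit.List as CL
import Data.Functor.Identity (Identity, runIdentity)
import Data.Void (Void)

-- Local copies of the proposed names, as in the Conduit2 sketch.
infixr 2 .|
(.|) :: Monad m
     => ConduitM a b m ()
     -> ConduitM b c m r
     -> ConduitM a c m r
(.|) = fuse

runConduitPure :: ConduitM () Void Identity r -> r
runConduitPure = runIdentity . runConduit

-- Current skin: three different operators for what is conceptually
-- one pipeline (fuse on the left, connect, fuse on the right).
oldStyle :: Int
oldStyle = runIdentity $
    CL.sourceList [1 .. 10] $= CL.map (* 2) $$ CL.filter (> 5) =$ CL.fold (+) 0

-- Proposed skin: one operator for fusion, plus an explicit run function.
newStyle :: Int
newStyle = runConduitPure $
    CL.sourceList [1 .. 10] .| CL.map (* 2) .| CL.filter (> 5) .| CL.fold (+) 0

main :: IO ()
main = do
    print oldStyle              -- 104
    print newStyle              -- 104
    print (oldStyle == newStyle) -- True
```

Both versions double the numbers 1 through 10, keep those greater than 5, and sum them. The only difference is the surface syntax.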

The goals here are to improve consistency, readability, and intuition about the library. Of course, there are some downsides:

There's a slight performance advantage (not benchmarked recently, unfortunately) to foo $$ bar versus runConduit $ foo =$= bar, since the former combines both sets of actions into one. We may be able to gain some of this back with GHC rewrite rules, but in my experience rewrite rules in conduit have been less than reliable.
Inertia: there's a lot of code and material out there using the current set of operators. While we don't need to ever remove (or even deprecate) the current operators, having two ways of writing conduit code in the wild can be confusing.
Conflicting operator: doing a quick Hoogle search reveals that the parallel package already uses .|. We could choose a different operator instead (|. for instance seems unclaimed), but generally I get nervous any time I'm defining new operators.
For simple cases like source $$ sink, code is now quite a few keystrokes longer: runConduit $ source .| sink.
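
If the clash with the parallel package turns out to matter, swapping in a different name is mechanical. The following is only a sketch using the |. spelling floated above, not a decision on the final operator; the name bumped is invented for the example.

```haskell
import Data.Conduit (ConduitM, fuse, runConduit)
import qualified Data.Conduit.List as CL
import Data.Functor.Identity (runIdentity)

-- Hypothetical alternative spelling of the fusion operator, with the
-- same fixity and type as the .| proposed in the Conduit2 sketch.
infixr 2 |.
(|.) :: Monad m
     => ConduitM a b m ()
     -> ConduitM b c m r
     -> ConduitM a c m r
(|.) = fuse

-- Add one to each element and collect the results into a list.
bumped :: [Int]
bumped = runIdentity $ runConduit $
    CL.sourceList [1 .. 5] |. CL.map (+ 1) |. CL.consume

main :: IO ()
main = print bumped -- [2,3,4,5,6]
```

Since the operator is just fuse under another name, nothing else in the proposal depends on which spelling wins.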

Code-wise, this is a trivial change to implement. Updating docs to follow this new convention wouldn't be too difficult either. The question is: is this a good idea?
