Archive for October, 2009

Introduction to attempt error reporting library

October 25, 2009

I’ve just released the attempt package on hackage. It is meant to address the issue of error handling, which is currently rather ad-hoc in Haskell. It’s my hope that by putting it in its own package, we can start to standardize on one approach and get some nice composable error handling between packages.

The library is built on extensible exceptions to give users the ability to return more complex exception values than is afforded by either a Maybe or (Either String). It’s similar to an (Either SomeException), but provides many class instances, a monad transformer and helper functions.

Below is an HTML version of the literate Haskell example file that is in the attempt repository. Hopefully it will give you a running start on how to use it.

This library should be considered unstable, in that the API is still open to change. As such, I’d appreciate any feedback people have.


This file is an example of how to use the attempt library, as literate
Haskell. We’ll start off with some import statements.

> {-# LANGUAGE DeriveDataTypeable #-}
> {-# LANGUAGE ExistentialQuantification #-}
> import Data.Attempt
> import Control.Monad.Attempt
> import qualified Data.Attempt.Helper as A
> import System.Environment (getArgs)
> import Safe (readMay)
> import Data.Generics
> import qualified Control.Exception as E

We’re going to deal with a very simplistic example. Let’s say you have some
text files that need processing. The files are each three lines long. The
first and last line are integers; the second is a mathematical operator (one
of +, -, * and /). Your goal with each file is to simply perform the
mathematical operator on the two numbers. Let’s start with the Operator data
type.

> data Operator = Add | Sub | Mul | Div
> instance Read Operator where
>   readsPrec _ "+" = [(Add, "")]
>   readsPrec _ "-" = [(Sub, "")]
>   readsPrec _ "*" = [(Mul, "")]
>   readsPrec _ "/" = [(Div, "")]
>   readsPrec _ s = []
>
> toFunc :: Operator -> Int -> Int -> Int
> toFunc Add = (+)
> toFunc Sub = (-)
> toFunc Mul = (*)
> toFunc Div = div

Nothing special here (besides some sloppy programming). Let’s go ahead and
write the first version of our process function.

> process1 :: FilePath -> IO Int
> process1 filePath = do
>   contents <- readFile filePath -- IO may fail for some reason
>   let [num1S, opS, num2S] = lines contents -- maybe there aren't 3 lines?
>       num1 = read num1S -- read might fail
>       op   = read opS   -- read might fail
>       num2 = read num2S -- read might fail
>   return $ toFunc op num1 num2

If you test this function out on a valid file, it works just fine. But what
happens when you call it with invalid data? In fact, there are five things
which could go wrong that I’d be interested in dealing with in the above code.

So now we need some way to deal with these issues. There are a few standard ones
in the Haskell toolbelt:

  1. Wrap the response in Maybe. Disadvantage: can’t give any indication what the
    error was.
  2. Wrap the response in an Either String. Disadvantage: error type is simply a
    string, which isn’t necessarily very informative. Also, Either is not defined
    by the standard library to be a Monad, making this type of processing clumsy.
  3. Wrap in a more exotic Either SomeException or some such. Disadvantage:
    still not a Monad.
  4. Declare your own error type. Disadvantage: ad-hoc, and makes it very
    difficult to compose different libraries together.
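To make the trade-off between options 1 and 2 concrete, here is a minimal sketch that uses only base. The function names are hypothetical, and `Text.Read.readMaybe` is a later addition to base that stands in for `Safe.readMay`. It uses Applicative style, so it does not rely on a Monad instance for Either (recent versions of base do provide one).

```haskell
import Text.Read (readMaybe)

-- Option 1: Maybe. A failure carries no information at all.
parseIntMaybe :: String -> Maybe Int
parseIntMaybe = readMaybe

-- Option 2: Either String. A failure carries a message, but only as text.
parseIntEither :: String -> Either String Int
parseIntEither s =
  maybe (Left ("not an integer: " ++ show s)) Right (readMaybe s)

-- Add two numbers given as strings, in both styles.
addMaybe :: String -> String -> Maybe Int
addMaybe a b = (+) <$> parseIntMaybe a <*> parseIntMaybe b

addEither :: String -> String -> Either String Int
addEither a b = (+) <$> parseIntEither a <*> parseIntEither b
```

With Maybe, the caller only learns that something failed. With Either String, the caller gets a message but has to parse it to react programmatically.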

In steps the attempt library. It’s essentially option 4 wrapped in a library
for general consumption. Features include:

  1. Uses extensible exceptions so you can report whatever information you want.
  2. Exceptions are not explicitly typed, so you don’t need to wrap insanely
    long function signatures to explain what exceptions you might be throwing.
  3. Defines all the standard instances you want, including a monad
    transformer.

    Let’s transform the above example to use the attempt library in its most basic
    form:

    > data ProcessError = NotThreeLines String | NotInt String | NotOperator String
    >   deriving (Show, Typeable)
    > instance E.Exception ProcessError
    >
    > process2 :: FilePath -> IO (Attempt Int)
    > process2 filePath =
    >   E.handle (\e -> return $ Failure (e :: E.IOException)) $ do
    >       contents <- readFile filePath
    >       return $ case lines contents of
    >           [num1S, opS, num2S] ->
    >               case readMay num1S of
    >                   Just num1 ->
    >                       case readMay opS of
    >                           Just op ->
    >                               case readMay num2S of
    >                                   Just num2 -> Success $ toFunc op num1 num2
    >                                   Nothing -> Failure $ NotInt num2S
    >                           Nothing -> Failure $ NotOperator opS
    >                   Nothing -> Failure $ NotInt num1S
    >           _ -> Failure $ NotThreeLines contents

    If you run these on the sample files in the input directory, you’ll see that
    we’re getting the right result; the program is not erroring out, simply
    returning a failure message. However, all of those nested case statements are
    not very satisfactory. Let’s use two facts to our advantage:

      1. Attempt is a Monad.
      2. There is a Data.Attempt.Helper module which provides a special read
        function.

    > data ProcessErrorWrapper =
    >   forall e. E.Exception e => BadIntWrapper e
    >   | forall e. E.Exception e => BadOperatorWrapper e
    >   deriving (Typeable)
    > instance Show ProcessErrorWrapper where
    >   show (BadIntWrapper e) = "BadInt: " ++ show e
    >   show (BadOperatorWrapper e) = "BadOperator: " ++ show e
    > instance E.Exception ProcessErrorWrapper
    > process3 :: FilePath -> IO (Attempt Int)
    > process3 filePath =
    >   E.handle (\e -> return $ Failure (e :: E.IOException)) $ do
    >       contents <- readFile filePath
    >       return $ case lines contents of
    >           [num1S, opS, num2S] -> do
    >               num1 <- wrapFailure BadIntWrapper $ A.read num1S
    >               op   <- wrapFailure BadOperatorWrapper $ A.read opS
    >               num2 <- wrapFailure BadIntWrapper $ A.read num2S
    >               return $ toFunc op num1 num2
    >           _ -> Failure $ NotThreeLines contents

    That certainly cleaned stuff up. The special read function works just as you
    would expect: if the read succeeds, it returns a Success value. Otherwise,
    it returns a Failure.

    But what’s going on with that wrapFailure stuff? This is just to clean up the
    output. The read function will return an exception of type “CouldNotRead”,
    which lets you know that you failed a read attempt, but doesn’t let you know
    what you were trying to read. Wrapping that exception in BadIntWrapper or
    BadOperatorWrapper records which kind of value the failed read was for.
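The idea behind wrapping a failure can be sketched without the attempt package. The block below is an illustration only: `Result`, `UnreadableInput`, `BadInt`, `safeRead`, `wrapLeft`, `readInt` and `isBadInt` are all hypothetical names, with `Either SomeException` standing in for Attempt. It is not the library's implementation.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE RankNTypes #-}

import Control.Exception
  (Exception, SomeException (..), fromException, toException)
import Text.Read (readMaybe)

-- Hypothetical stand-in for Attempt: Left carries any exception.
type Result = Either SomeException

-- Hypothetical analogue of the CouldNotRead exception.
newtype UnreadableInput = UnreadableInput String deriving Show
instance Exception UnreadableInput

-- Wrapper in the style of BadIntWrapper: records what we were reading.
data BadInt = forall e. Exception e => BadInt e
instance Show BadInt where
  show (BadInt e) = "BadInt: " ++ show e
instance Exception BadInt

-- A read that reports failure as a value instead of crashing.
safeRead :: Read a => String -> Result a
safeRead s =
  maybe (Left (toException (UnreadableInput s))) Right (readMaybe s)

-- Re-wrap whatever exception a failure holds; successes pass through.
wrapLeft :: Exception eOut
         => (forall eIn. Exception eIn => eIn -> eOut)
         -> Result a -> Result a
wrapLeft f (Left (SomeException e)) = Left (toException (f e))
wrapLeft _ (Right a)                = Right a

readInt :: String -> Result Int
readInt = wrapLeft BadInt . safeRead

-- True when the failure was wrapped by BadInt.
isBadInt :: Result a -> Bool
isBadInt (Left e) | Just (BadInt _) <- fromException e = True
isBadInt _ = False
```

The inner exception is still available inside the wrapper, so no information is lost; the wrapper only adds context.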

    So far, so good. But that “case lines contents” bit is still a little
    annoying. Let’s get rid of it.

    > process4 :: FilePath -> IO (Attempt Int)
    > process4 filePath =
    >   E.handle (\e -> return $ Failure (e :: E.IOException)) $ do
    >       contents <- readFile filePath
    >       return $ do
    >           let contents' = lines contents
    >           [num1S, opS, num2S] <-
    >               A.assert (length contents' == 3)
    >                        contents'
    >                        (NotThreeLines contents)
    >           num1 <- wrapFailure BadIntWrapper $ A.read num1S
    >           op   <- wrapFailure BadOperatorWrapper $ A.read opS
    >           num2 <- wrapFailure BadIntWrapper $ A.read num2S
    >           return $ toFunc op num1 num2

    There’s unfortunately no simple way to catch pattern match fails, but an
    assertion works almost as well. The only thing which is still a bit irksome is
    the whole exception handling business. Let’s be rid of that next.
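Here is a rough sketch of the assertion idea, again using `Either SomeException` as a hypothetical stand-in for Attempt. `assertE`, `threeLines` and the standalone `NotThreeLines` type are illustrative names, and `assertE` only mirrors the argument order used with `A.assert` above. Note that on recent GHC versions a refutable pattern on the left of `<-` needs a MonadFail instance, which this stand-in lacks, so the sketch uses a lazy pattern after the assertion has established the length.

```haskell
import Control.Exception (Exception, SomeException, toException)

-- Hypothetical error type for this sketch.
newtype NotThreeLines = NotThreeLines String deriving Show
instance Exception NotThreeLines

-- Hypothetical assert-style helper: return the value when the condition
-- holds, otherwise fail with the given exception.
assertE :: Exception e => Bool -> a -> e -> Either SomeException a
assertE True  val _ = Right val
assertE False _   e = Left (toException e)

-- Split the input into exactly three lines, or report a failure.
threeLines :: String -> Either SomeException (String, String, String)
threeLines contents = do
  let ls = lines contents
  -- The lazy pattern is safe here: the assertion guarantees three elements.
  ~[a, b, c] <- assertE (length ls == 3) ls (NotThreeLines contents)
  return (a, b, c)
```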

    > process5 :: FilePath -> AttemptT IO Int
    > process5 filePath = do
    >   contents <- A.readFile filePath
    >   let contents' = lines contents
    >   [num1S, opS, num2S] <-
    >       A.assert (length contents' == 3)
    >                contents'
    >                (NotThreeLines contents)
    >   num1 <- wrapFailure BadIntWrapper $ A.read num1S
    >   op   <- wrapFailure BadOperatorWrapper $ A.read opS
    >   num2 <- wrapFailure BadIntWrapper $ A.read num2S
    >   return $ toFunc op num1 num2

    There’s a built-in readFile function that takes care of all that error
    handling garbage for you. If you compare this version of the function to the first, you
    should notice that it’s very similar. You can avoid a lot of the common
    sources of runtime errors by simply replacing unsafe functions (Prelude.read)
    with safe ones (Data.Attempt.Helper.read).

    However, there’s still one other difference between process5 and process2-4:
    the return type. process2-4 return (IO (Attempt Int)), while process5 returns
    an (AttemptT IO Int). This is the monad transformer version of Attempt; read
    the documentation for more details. To get back to the same old return type as
    before:

    > process6 :: FilePath -> IO (Attempt Int)
    > process6 = runAttemptT . process5

    Below is a simple main function for testing out these various functions. Try
    them out on the files in the input directory. Also, to simulate an IO error,
    call them on a nonexistent file.

    > main = do
    >   args <- getArgs
    >   if length args /= 2
    >       then error "Usage: Example.lhs <process> <file path>"
    >       else return ()
    >   let [processNum, filePath] = args
    >   case processNum of
    >       "1" -> process1 filePath >>= print
    >       "2" -> process2 filePath >>= print
    >       "3" -> process3 filePath >>= print
    >       "4" -> process4 filePath >>= print
    >       "5" -> runAttemptT (process5 filePath) >>= print
    >       "6" -> process6 filePath >>= print
    >       x -> error $ "Invalid process function: " ++ x

Monadic pairs and Kleisli arrows

October 19, 2009

While working on my data-object library, I needed to apply some monadic functions to a tuple, and get back a monadic tuple. In code:

f :: Monad m => (a -> m b) -> (c -> m d) -> (a, c) -> m (b, d)

The most obvious thing to do is just long-hand it with do notation:

test1 :: Monad m => (a -> m b) -> (c -> m d) -> (a, c) -> m (b, d)
test1 f g (a, c) = do
    b <- f a
    d <- g c
    return (b, d)

But who wants to write that? I got a recommendation instead to try out liftM2. After playing with it a bit, I came out with:

test2 :: Monad m => (a -> m b) -> (c -> m d) -> (a, c) -> m (b, d)
test2 f g (a, c) = uncurry (liftM2 (,)) $ (f *** g) (a, c)

Which is definitely more respectable (though arguably more line noise). After staring at that for a little bit, I realized that there was nothing particularly monadic about this, and that it could instead be expressed Applicatively:

test3 :: Applicative f => (a -> f b) -> (c -> f d) -> (a, c) -> f (b, d)
test3 f g (a, c) = uncurry (liftA2 (,)) $ (f *** g) (a, c)

Then of course comes the eta-reduction, so you get:

test4 :: Applicative f => (a -> f b) -> (c -> f d) -> (a, c) -> f (b, d)
test4 f g = uncurry (liftA2 (,)) . (f *** g)

Kleisli

I’m sure you noticed the use of *** in test2, test3 and test4. That’s not too surprising; we often want to use Data.Arrow functions when operating on tuples. As I was staring at the documentation for Data.Arrow, I decided to see what could be done with Kleisli. I came up with:

test5 :: Monad m => (a -> m b) -> (c -> m d) -> (a, c) -> m (b, d)
test5 f g = runKleisli $ Kleisli f *** Kleisli g

This, in my opinion, is much more readable than the above. For those who don’t know, Kleisli allows turning any monadic function into an arrow. Another advantage of this is I can now do things like:

test6 :: Monad m => (a -> m b) -> (a, c) -> m (b, c)
test6 f = runKleisli $ first $ Kleisli f
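The snippets in this post omit their imports. As a minimal self-contained sketch, the following gathers test4, test5 and test6 with the imports they need from base, plus a small `halve` helper (a hypothetical name, added only for trying them out with Maybe):

```haskell
import Control.Applicative (liftA2)
import Control.Arrow (Kleisli (..), first, (***))

-- Applicative version: run both functions, then pair the results.
test4 :: Applicative f => (a -> f b) -> (c -> f d) -> (a, c) -> f (b, d)
test4 f g = uncurry (liftA2 (,)) . (f *** g)

-- Kleisli version: treat the monadic functions as arrows.
test5 :: Monad m => (a -> m b) -> (c -> m d) -> (a, c) -> m (b, d)
test5 f g = runKleisli $ Kleisli f *** Kleisli g

-- Apply a monadic function to the first component only.
test6 :: Monad m => (a -> m b) -> (a, c) -> m (b, c)
test6 f = runKleisli $ first $ Kleisli f

-- Hypothetical helper: halve an even number, fail on an odd one.
halve :: Int -> Maybe Int
halve x
  | even x    = Just (x `div` 2)
  | otherwise = Nothing
```

For example, `test4 halve halve (4, 6)` and `test5 halve halve (4, 6)` both evaluate to `Just (2, 3)`, while an odd number in either position gives `Nothing`.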

The downside of test5 versus test4 is that it only works Monadically, not Applicatively. And in case you were wondering, you can’t define a KleisliA type value which will work on all Applicatives. This boils down to where the real extra power of Monads versus Applicatives lies.

All Arrows must also be instances of Category. One of the functions of a Category is (.), or essentially function composition. The definition for Kleisli arrows goes, after unwrapping:

f . g = \b -> g b >>= f

There is no equivalent to this in Applicatives. So sadly, if I want to make my tuple lifting functions work on applicatives, I’m stuck with liftA2.
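To make that composition concrete, here is a small sketch that writes it by hand and compares it against `(<=<)` from Control.Monad. `composeK`, `safeRecip` and `safeSqrt` are hypothetical names used only for illustration.

```haskell
import Control.Monad ((<=<))

-- Kleisli composition written out by hand, matching the definition above.
composeK :: Monad m => (b -> m c) -> (a -> m b) -> a -> m c
composeK f g = \b -> g b >>= f

-- Two partial operations expressed with Maybe.
safeRecip :: Double -> Maybe Double
safeRecip 0 = Nothing
safeRecip x = Just (1 / x)

safeSqrt :: Double -> Maybe Double
safeSqrt x
  | x < 0     = Nothing
  | otherwise = Just (sqrt x)

-- Square root of the reciprocal, failing if either step fails.
recipThenSqrt :: Double -> Maybe Double
recipThenSqrt = composeK safeSqrt safeRecip

-- The same pipeline using the library operator.
recipThenSqrt' :: Double -> Maybe Double
recipThenSqrt' = safeSqrt <=< safeRecip
```

The second step needs the result of the first, which is exactly what `>>=` supplies. With only an Applicative, `fmap safeSqrt (safeRecip x)` would have type `Maybe (Maybe Double)`, and there is no general way to flatten it.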