Working with IO

In previous chapters, we were able to build a parser from a text string to a Haskell representation of our markup language, and we built an EDSL for easy writing of HTML code. However, our program is still not useful to other users because we did not make this functionality accessible via some sort of user interface.

In our program, we'd like to take user input and then convert it to HTML. There are many ways to design this kind of interface, for example:

  • Get text input via the standard input and output HTML via the standard output
  • Receive two file names as command-line arguments, read the contents of the first one, and write the output to the second one
  • Ask for fancier command-line arguments parsing and prefix the file names with flags indicating what they are
  • Some fancy GUI interface
  • Combination of all of the above

To make this interesting, we will start with the following interface:

  1. If the user calls the program without arguments, we will read from the standard input, and write to the standard output
  2. If the user calls the program with two arguments, the first one will be the input file name, and the second one will be the output file name
  3. If the output file already exists, we'll ask the user if they want to overwrite the file
  4. On any other kind of input, we'll print a generic message explaining the proper usage

In a later chapter, we will add a fancier command-line interface using a library, and also read whole directories in addition to single files.

But first, we need to learn about I/O in Haskell, what makes it special, and why it's different from other programming languages.

Purely functional

Originally, Haskell was designed as an open standard functional programming language with non-strict semantics, to serve as a unifying language for future research in functional language design.

In GHC Haskell, we use a lazy evaluation strategy to implement non-strict semantics (We've talked about laziness before).

The requirement for non-strict semantics raises interesting challenges: How do we design a language that can do more than just evaluate expressions, how do we model interaction with the outside world, how do we do I/O?

The challenge with doing I/O operations in a language with a lazy evaluation strategy is that as programs grow larger, the order of evaluation becomes less trivial to figure out. Consider this hypothetical code example (which won't actually type-check in Haskell, we'll see why soon):

addWithInput :: Int -> Int
addWithInput n = readIntFromStdin + n

main =
  let
    result1 = addWithInput 1
    result2 = addWithInput 2
  in
    print (result2 - result1)

This hypothetical program will read 2 integers from the standard input, and then will subtract the second one (+2) from the first one (+1), or so we would expect if this was a strict language. In a strict language, we expect the order of operations to happen from the top down.

But in a lazy language, we don't evaluate an expression until it is needed, and so neither result1 nor result2 are evaluated until we wish to print the result of subtracting one from the other. And then, when we try to evaluate -, it evaluates the two arguments from left to right, so we first evaluate result2.

Evaluating result2 with substitution means replacing occurrences of n with the input 2, and then evaluate the top-level function (+), which is a primitive function. We then evaluate its arguments, readIntFromStdin and then n; at this point we are reading the first integer from the stdin.

After calculating the result, we can move to evaluate result1, which will read the second integer from stdin. This is the complete opposite of what we wanted!

Issues like these make lazy evaluation hard to work within the presence of side effects - when the evaluation of an expression can affect or be affected by the outside world, this includes reading/writing from mutable memory or performing I/O operations.

We call functions that have side-effects, such as addWithInput, impure functions. And an unfortunate consequence of impure functions is that they can return different results even when they take the same input.

The presence of impure functions makes it harder for us to reason about lazy evaluation, and also messes up our ability to use equational reasoning to understand programs.

Therefore, in Haskell, it was decided to only allow pure functions and expressions - ones that have no side effects. Pure functions will always return the same output (given the same input) and evaluating pure expressions is deterministic.

But now, how can we do I/O operations? There are many possible solutions.

For Haskell, it was decided to design an interface with an accompanied type called IO. IO's interface will force a distinction from non-I/O expressions and will also require that in order to combine multiple IO operations, we will have to specify the order of the operations.

IO

IO is an opaque type, like our Html type, in which we hide its internal representation from the user behind an interface. But in this case, IO is a built-in type and its internals are hidden by the Haskell language rather than a module.

Similar to Maybe, IO has a payload type that represents the result of an IO operation. When there isn't a meaningful result, we use the unit type, () (which only has one value: ()) to represent that.

Here are a few IO operations and functions that return IO operations:

putStrLn :: String -> IO ()

getLine :: IO String

getArgs :: IO [String]

lookupEnv :: String -> IO (Maybe String)

writeFile :: FilePath -> String -> IO ()

Notice that each function returns an IO <something>, but what does that mean?

The meaning behind IO a is that it is a description of a program (or subroutine) that, when executed, will produce some value of type a, and may do some I/O effects during execution.

Executing an IO a is different from evaluating it. Evaluating an IO a expression is pure - the evaluation will always reduce to the same description of a program. This helps us keep purity and equational reasoning!

The Haskell runtime will execute the entry point to the program (the expression main that must have the type IO ()). For our IO operation to also run - it has to be combined into the main expression - let's see what that means.

Combining IO expressions

Like our Html.Structure type, the IO interface provides combinators for composing small IO operations into bigger ones. This interface also ensures that the order of operations is well-defined!

Note that, just like the <> we've defined for Html.Structure, the combinators for IO are implemented as type-class instances rather than specialized variants (for example our append_ function was a specialized version of <> tailored only for Structure).

In this section, I will introduce specialized type signatures rather than generalized ones, because I think it'll be easier to digest, but we'll talk about the generalized versions later.

>>=

Our first combinator is >>= (pronounced bind), and is the most useful of the bunch:

(>>=) :: IO a -> (a -> IO b) -> IO b

This combinator takes two arguments, the first is an IO operation, and the second is a function that takes as input the result of the first IO operation and returns a new IO b, which is the final result.

Here are a few examples using the functions we described above:

  1. Echo

    getLine >>= (\line -> putStrLn line)
    

    We are reading a line from the standard input on the left of >>=, we receive the input to the right of >>= as an argument to the lambda function, and then write it to the standard output in the body of the lambda function. >>='s role here is to pass the result of the IO operation on the left to the function returning an IO operation on the right.

    Notice how >>= defines an order of operations - from left to right.

    The type of each sub-expression here is:

    getLine :: IO String
    
    putStrLn :: String -> IO ()
    
    (>>=) :: IO String -> (String -> IO ()) -> IO ()
    
    line :: String
    
    • Question: what is the type of the whole expression?
      AnswerIO ()

    Also, note that this example can be written in a more concise manner in point-free style getLine >>= putStrLn.

  2. Appending two inputs

    getLine >>= (\honorific -> getLine >>= (\name -> putStrLn ("Hello " ++ honorific ++ " " ++ name)))
    

    This subroutine combines multiple operations together; it reads two lines from the standard input and prints a greeting. Note that:

    • Using >>= defines the order of operation from left to right
    • Because of the scoping rules in Haskell, honorific will be in scope in the body of the function for which it is its input, including the most inner function

    This is a bit hard to read, but we can remove the parenthesis and add indentation to make it a bit easier to read:

    getLine >>= \honorific ->
      getLine >>= \name ->
        putStrLn ("Hello " ++ honorific ++ " " ++ name)
    

Let's see a few more combinators!

*> and >>

(*>) :: IO a -> IO b -> IO b
(>>) :: IO a -> IO b -> IO b

*> and >> have the same type signature for IO and mean the same thing. In fact, *> is a slightly more generalized version of >> and can always be used instead of >>, which only still exists to avoid breaking backward compatibility.

*> for IO means run the first IO operation, discard the result then run the second operation. It can be implemented using >>=:

a *> b = a >>= \_ -> b

This combinator is useful when we want to run several IO operations one after the other that might not return anything meaningful, such as putStrLn:

putStrLn "hello" *> putStrLn "world"

pure and return

pure :: a -> IO a

like *> and >>, pure is a more general version of return. pure also has the advantage of not having a resemblance to an unrelated keyword in other languages.

Remember that we said IO a is a description of a program that, when executed, will produce some value of type a and may do some I/O effects during execution?

With pure, we can build an IO a that does no I/O and will produce a specific value of type a, the one we supply to pure!

This function is useful when we want to do some uneffectful computation that depends on IO.

For example:

confirm :: IO Bool
confirm =
  putStrLn "Are you sure? (y/n)" *>
    getLine >>= \answer ->
      case answer of
        "y" -> pure True
        "n" -> pure False
        _ ->
          putStrLn "Invalid response. use y or n" *>
            confirm

Trying to return just True or False here wouldn't work because of the type of >>=:

(>>=) :: IO a -> (a -> IO b) -> IO b

The right side of >>= in our code example (\answer -> case ...) must be of type String -> IO Bool. This is because:

  1. getLine :: IO String, so the a in the type signature of >>= should be the same as String in this instance, and
  2. confirm :: IO Bool, so b should be Bool

fmap and <$>

fmap :: (a -> b) -> IO a -> IO b

<$> is the infix version of fmap. Use it at your discretion.

What if we want a function that reads a line from stdin and returns it with ! at the end? We could use a combination of >>= and pure:

getLine >>= \line -> pure (line ++ "!")

The pattern is unified to the fmap function:

fmap (\line -> line ++ "!") getLine

fmap applies a function to the value to be returned from the IO operation, also known as "mapping" over it.

(By the way, Have you noticed the similarities between fmap and map :: (a -> b) -> [a] -> [b]?)

Summary

Here's a list of IO combinators we ran into:

-- chaining IO operations: passing the *result* of the left IO operation
-- as an argument to the function on the right.
-- Pronounced "bind".
(>>=) :: IO a -> (a -> IO b) -> IO b

-- sequence two IO operations, discarding the payload of the first.
(*>) :: IO a -> IO b -> IO b

-- "lift" a value into IO context, does not add any I/O effects.
pure :: a -> IO a

-- "map" (or apply a function) over the payload value of an IO operation.
fmap :: (a -> b) -> IO a -> IO b

IO is first class

The beauty of IO is that it's a completely first-class construct in the language, and is not really different from Maybe, Either, or Structure. We can pass it to functions, put it in a container, etc. Remember that it represents a description of a program, and without combining it into main in some way won't actually do anything. It is just a value!

Here's an example of a function that takes IO actions as input:

whenIO :: IO Bool -> IO () -> IO ()
whenIO cond action =
  cond >>= \result ->
    if result
      then action
      else pure ()

And how it can be used:

main :: IO ()
main =
  putStrLn "This program will tell you a secret" *>
    whenIO confirm (putStrLn "IO is actually pretty awesome") *>
      putStrLn "Bye"

Notice how putStrLn "IO is actually pretty awesome" isn't executed right away, but only if it is what whenIO returns, and in turn, is combined with *> as part of the main expression.

Getting out of IO?

What we've seen above has great consequences for the Haskell language. In our Html type, we had a function render :: Html -> String that could turn an Html into a string value.

In Haskell, there is no way to implement a function such as execute :: IO a -> a that preserves purity and equational reasoning!

Also, IO is opaque. It does not let us examine it. So we are really bound to what the Haskell API for IO allows us to do.

This means that we need to think about using IO differently!

In Haskell, once we get into IO, there is no getting out. The only thing we can do is to build bigger IO computations by combining it with more IO computations.

We also can't use IO a in place of an a. For example, we can't write getLine ++ "!" because ++ expects both sides to be String, but getLine's type is IO String. The types do not match! We have to use fmap, and the return type must be IO String, as we've seen before.

In Haskell, we like to keep IO usage minimal, and we like to push it to the edges of the program. This pattern is often called Functional core, imperative shell.

Functional core, imperative shell

In our blog generator program, we want to read a file, parse it, convert it to HTML, and then print the result to the console.

In many programming languages, we might interleave reading from the file with parsing, and writing to the file with the HTML conversion. But we don't mix these here. Parsing operates on a String value rather than some file handle, and Html is being converted to a String rather than being written to the screen directly.

This approach of separating IO and pushing it to the edge of the program gives us a lot of flexibility. These functions without IO are easier to test and examine (because they are guaranteed to have deterministic evaluation!), and they are more modular and can work in many contexts (reading from stdin, reading from a network socket, writing to an HTTP connection, and more).

This pattern is often a good approach for building Haskell programs, especially batch programs.

Building a blog generator

We'd like to start building a blog generator, and we want to have the following interface:

  1. If the user calls the program without arguments, we will read from the standard input, and write to the standard output
  2. If the user calls the program with two arguments, the first one will be the input file name, and the second one will be the output file name
  3. If the output file already exists, we'll ask the user if they want to overwrite the file
  4. On any other kind of input, we'll print a generic message explaining the proper usage

We are going to need a few functions:

getArgs :: IO [String] -- Get the program arguments

getContents :: IO String -- Read all of the content from stdin

readFile :: FilePath -> IO String -- Read all of the content from a file

writeFile :: FilePath -> String -> IO () -- Write a string into a file

doesFileExist :: FilePath -> IO Bool -- Checks whether a file exists

And the following imports:

import System.Directory (doesFileExist)
import System.Environment (getArgs)

We don't need to add the following import because Prelude already imports these functions for us:

-- imported by Prelude
import System.IO (getContents, readFile, writeFile)

  1. Implement a function process :: Title -> String -> String which will parse a document to markup, convert it to HTML, and then render the HTML to a string.

    Answer
    process :: Html.Title -> String -> String
    process title = Html.render . convert title . Markup.parse
    
  2. Try implementing the "imperative shell" for our blog generator program. Start with main, pattern match on the result of getArgs, and decide what to do. Look back at the examples above for inspiration.

    Answer
    -- Main.hs
    module Main where
    
    import qualified Markup
    import qualified Html
    import Convert (convert)
    
    import System.Directory (doesFileExist)
    import System.Environment (getArgs)
    
    main :: IO ()
    main =
      getArgs >>= \args ->
        case args of
          -- No program arguments: reading from stdin and writing to stdout
          [] ->
            getContents >>= \content ->
              putStrLn (process "Empty title" content)
    
          -- With input and output file paths as program arguments
          [input, output] ->
            readFile input >>= \content ->
              doesFileExist output >>= \exists ->
                let
                  writeResult = writeFile output (process input content)
                in
                  if exists
                    then whenIO confirm writeResult
                    else writeResult
    
          -- Any other kind of program arguments
          _ ->
            putStrLn "Usage: runghc Main.hs [-- <input-file> <output-file>]"
    
    process :: Html.Title -> String -> String
    process title = Html.render . convert title . Markup.parse
    
    confirm :: IO Bool
    confirm =
      putStrLn "Are you sure? (y/n)" *>
        getLine >>= \answer ->
          case answer of
            "y" -> pure True
            "n" -> pure False
            _ -> putStrLn "Invalid response. use y or n" *>
              confirm
    
    whenIO :: IO Bool -> IO () -> IO ()
    whenIO cond action =
      cond >>= \result ->
        if result
          then action
          else pure ()
    

Do notation

While using >>= to chain IO actions is manageable, Haskell provides an even more convenient syntactic sugar called do notation which emulates imperative programming.

A do block starts with the do keyword and continues with one or more "statements" which can be one of the following:

  1. An expression of type IO (), such as:
    • putStrLn "Hello"
    • if True then putStrLn "Yes" else putStrLn "No"
  2. A let block, such as
    • let x = 1
    • or multiple let declarations:
      let
        x = 1
        y = 2
      
      Note that we do not write the in here.
  3. A binding <variable> <- <expresion>, such as
    line <- getLine
    

And the last "statement" must be an expression of type IO <something> - this will be the result type of the do block.

These constructs are desugared (translated) by the Haskell compiler to:

  1. <expression> *>,
  2. let ... in and
  3. <expression> >>= \<variable>

respectively.

For example:

greeting :: IO ()
greeting = do
  putStrLn "Tell me your name."
  let greet name = "Hello, " ++ name ++ "!"
  name <- getLine
  putStrLn (greet name)

Is just syntactic sugar for:

greeting :: IO ()
greeting =
  putStrLn "Tell me your name." *>
    let
      greet name = "Hello, " ++ name ++ "!"
    in
      getLine >>= \name ->
        putStrLn (greet name)

It's important to note the difference between let and <- (bind). let is used to give a new name to an expression that will be in scope for subsequent lines, and <- is used to bind the result a in an IO a action to a new name that will be in scope for subsequent lines.

code operator type of the left side type of the right side comment
let greeting = "hello"
=
String
String
Both sides are interchangeable
let mygetline = getLine
=
IO String
IO String
We just create a new name for getLine
name <- getLine
<-
String
IO String
Technically <- is not an operator, but just a syntactic sugar for >>= + lambda, where we bind the result of the computation to a variable

Do notation is very very common and is often preferable to using >>= directly.


  1. Exercise: Translate the examples in this chapter to do notation.

  2. Exercise: Translate our glue code for the blog generator to do notation.

    Solution
    -- Main.hs
    module Main where
    
    import qualified Markup
    import qualified Html
    import Convert (convert)
    
    import System.Directory (doesFileExist)
    import System.Environment (getArgs)
    
    main :: IO ()
    main = do
      args <- getArgs
      case args of
        -- No program arguments: reading from stdin and writing to stdout
        [] -> do
          content <- getContents
          putStrLn (process "Empty title" content)
    
        -- With input and output file paths as program arguments
        [input, output] -> do
          content <- readFile input
          exists <- doesFileExist output
          let
            writeResult = writeFile output (process input content)
          if exists
            then whenIO confirm writeResult
            else writeResult
    
        -- Any other kind of program arguments
        _ ->
          putStrLn "Usage: runghc Main.hs [-- <input-file> <output-file>]"
    
    process :: Html.Title -> String -> String
    process title = Html.render . convert title . Markup.parse
    
    confirm :: IO Bool
    confirm = do
      putStrLn "Are you sure? (y/n)"
      answer <- getLine
      case answer of
        "y" -> pure True
        "n" -> pure False
        _ -> do
          putStrLn "Invalid response. use y or n"
          confirm
    
    whenIO :: IO Bool -> IO () -> IO ()
    whenIO cond action = do
      result <- cond
      if result
        then action
        else pure ()
    

Summary

In this chapter, we discussed what "purely functional" means, where the initial motivation for being purely functional comes from, and how Haskell's I/O interface allows us to create descriptions of programs without breaking purity.

We have also achieved a major milestone. With this chapter, we have implemented enough pieces to finally run our program on a single document and get an HTML-rendered result!

However, our command-line interface is still sub-par. We want to render a blog with multiple articles, create an index page, and more. We still have more to do to be able to call our program a blog generator.

Let's keep going!

You can view the git commit of the changes we've made and the code up until now.