Thursday, March 12, 2009

enumerator/iteratees and output

I have recently received a few questions about writing output while using enumerator-based I/O. In some cases users have attempted to make enumerators (like enumFd) that will handle output as well, but have difficulty actually making it work.

I think these problems stem from an incorrect application of the enumerator model of I/O. When using enumerators, a file (or other data source) is a resource that can be enumerated over to process data, exactly as a list can be enumerated over in order to access the data contained in the list. Compare the following:

foldl f init xs

enumFd "SomeFile" ==<< stream2list

In the first function, 'xs' is the data to be processed, 'foldl' tells how to access individual items in the data collection, and 'f' and 'init' do the actual processing. In the second, "SomeFile" is the data, 'enumFd' tells how to access the data, and 'stream2list' does the processing. So how does writing fit in? The output file obviously isn't the data source, and it doesn't make sense to enumerate over your output file as there's no data there to process. So it must go within the Iteratee. It turns out that making an iteratee to write data is relatively simple:

> import Data.Iteratee
> import System.IO
> import Control.Monad
>
> writeOut :: FilePath -> IterateeGM [] Char IO ()
> writeOut file = do
> h <- liftIO $ openFile file WriteMode
> loop h
> where
> loop :: Handle -> IterateeGM [] Char IO ()
> loop h = do
> next <- Data.Iteratee.head
> case next of
> Just c -> liftIO $ hPutChar h c >> loop
> Nothing -> liftIO $ hClose h


Add some error handling and you've got a writer. This version could be polymorphic over different StreamChunk instances by generalizing the type (FlexibleContexts may be required as well). Other stream-specific versions could be written that would take advantage of the specific StreamChunk instance (e.g. using Data.ByteString.hPut instead of hPutChar).

I hope this will serve as a very basic introduction to output when using enumerators. In addition to a generic writer like this, it may frequently be beneficial to define special-purpose writers. In a future post I will show a writer that seeks within the output file using a threaded State monad.

No comments:

Post a Comment