| Safe Haskell | Safe |
|---|---|
| Language | Haskell2010 |
System.IO.Streams.Tutorial
Contents
Synopsis
Introduction
The io-streams package defines two "smart handles" for stream processing:
InputStream: a read-only smart handleOutputStream: a write-only smart handle
The InputStream type implements all the core operations we
expect for a read-only handle. We consume values using read, which returns a
Nothing when the resource is done:
read::InputStreamc ->IO(Maybec)
... and we can push back values using unRead:
unRead:: c ->InputStreamc ->IO()
The OutputStream type implements the
write operation which feeds it output, supplying Nothing
to signal resource exhaustion:
write::Maybec ->OutputStreamc ->IO()
These streams slightly resemble Haskell Handles, but support a
wider range of sources and sinks. For example, you can convert an ordinary list
to an InputStream source and interact with it using the
handle-based API:
ghci> import qualified System.IO.Streams as S ghci> listHandle <- S.fromList[1, 2] ghci> S.readlistHandle Just 1 ghci> S.readlistHandle Just 2 ghci> S.readlistHandle Nothing
Additionally, IO streams come with a library of stream transformations that
preserve their handle-like API. For example, you can map a function over an
InputStream, which generates a new handle to the same
stream that returns transformed values:
ghci> oldHandle <- S.fromList[1, 2, 3] ghci> newHandle <- S.mapM(\x ->return(x * 10)) oldHandle ghci> S.readnewHandle 10 ghci> -- We can still view the stream through the old handle ghci> S.readoldHandle 2 ghci> -- ... and switch back again ghci> S.readnewHandle 30
IO streams focus on preserving the convention of traditional handles while offering a wider library of stream-processing utilities.
Build Input Streams
The io-streams library provides a simple interface for creating your own
InputStreams and OutputStreams.
You can build an InputStream from any IO action that
generates output, as long as it wraps results in Just and uses Nothing to
signal EOF:
makeInputStream::IO(Maybea) ->IO(InputStreama)
As an example, let's wrap an ordinary read-only Handle in an
InputStream:
import Data.ByteString (ByteString) import qualified Data.ByteString as S import System.IO.Streams (InputStream) import qualified System.IO.Streams as Streams import System.IO (Handle,hFlush) bUFSIZ = 32752 upgradeReadOnlyHandle ::Handle->IO(InputStreamByteString) upgradeReadOnlyHandle h = Streams.makeInputStreamf where f = do x <- S.hGetSomeh bUFSIZreturn$! if S.nullx thenNothingelseJustx
We didn't even really need to write the upgradeReadOnlyHandle function,
because System.IO.Streams.Handle already provides one that uses the exact
same implementation given above:
handleToInputStream::Handle->IO(InputStreamByteString)
Build Output Streams
Similarly, you can build any OutputStream from an IO
action that accepts input, as long as it interprets Just as more input and
Nothing as EOF:
makeOutputStream:: (Maybea ->IO()) ->IO(OutputStreama)
A simple OutputStream might wrap putStrLn for ByteStrings:
import Data.ByteString (ByteString) import qualified Data.ByteString as S import System.IO.Streams (OutputStream) import qualified System.IO.Streams as Streams writeConsole ::IO(OutputStreamByteString) writeConsole = Streams.makeOutputStream$ \m -> case m ofJustbs -> S.putStrLnbsNothing->return()
The Just wraps more incoming data, whereas Nothing indicates the data is
exhausted. In principle, you can feed OutputStreams more
input after writing a Nothing to them, but IO streams only guarantee a
well-defined behavior up to the first Nothing. After receiving the first
Nothing, an OutputStream could respond to additional
input by:
- Using the input
- Ignoring the input
- Throwing an exception
Ideally, you should adhere to well-defined behavior and ensure that after you
write a Nothing to an OutputStream, you don't write
anything else.
Connect Streams
io-streams provides two ways to connect an InputStream
and OutputStream:
connect::InputStreama ->OutputStreama ->IO()supply::InputStreama ->OutputStreama ->IO()
connect feeds the OutputStream
exclusively with the given InputStream and passes along the
end-of-stream notification to the OutputStream.
supply feeds the OutputStream
non-exclusively with the given InputStream and does not
pass along the end-of-stream notification to the
OutputStream.
You can combine both supply and connect
to feed multiple InputStreams into a single
OutputStream:
import qualified System.IO.Streams as Streams import System.IO (IOMode(WriteMode)) main = do Streams.withFileAsOutput"out.txt"WriteMode$ \outStream -> Streams.withFileAsInput"in1.txt" $ \inStream1 -> Streams.withFileAsInput"in2.txt" $ \inStream2 -> Streams.withFileAsInput"in3.txt" $ \inStream3 -> Streams.supplyinStream1 outStream Streams.supplyinStream2 outStream Streams.connectinStream3 outStream
The final connect seals the
OutputStream when the final InputStream
terminates.
Keep in mind that you do not need to use connect or
supply at all: io-streams mainly provides them for user
convenience. You can always build your own abstractions on top of the
read and write operations.
Transform Streams
When we build or use IO streams we can tap into all the stream-processing
features the io-streams library provides. For example, we can decompress any
InputStream of ByteStrings:
import Control.Monad ((>=>)) import Data.ByteString (ByteString) import System.IO (Handle) import System.IO.Streams (InputStream,OutputStream) import qualified System.IO.Streams as Streams import qualified System.IO.Streams.File as Streams unzipHandle ::Handle->IO(InputStreamByteString) unzipHandle = Streams.handleToInputStream>=> Streams.decompress
... or we can guard it against a denial-of-service attack:
protectHandle ::Handle->IO(InputStreamByteString) protectHandle = Streams.handleToInputStream>=> Streams.throwIfProducesMoreThan1000000
io-streams provides many useful functions such as these in its standard
library and you take advantage of them by defining IO streams that wrap your
resources.
Resource and Exception Safety
IO streams use standard Haskell idioms for resource safety. Since all
operations occur in the IO monad, you can use catch,
bracket, or various "with..." functions to guard any
read or write without any special
considerations:
import qualified Data.ByteString as S import System.IO import System.IO.Streams (InputStream,OutputStream) import qualified System.IO.Streams as Streams import qualified System.IO.Streams.File as Streams main =withFile"test.txt"ReadMode$ \handle -> do stream <- Streams.handleToInputStreamhandle mBytes <- Streams.readstream case mBytes ofJustbytes -> S.putStrLnbytesNothing->putStrLn"EOF"
However, you can also simplify the above example by using the convenience
function withFileAsInput from
System.IO.Streams.File:
withFileAsInput::FilePath-> (InputStreamByteString->IOa) ->IOa
Pushback
All InputStreams support pushback, which simplifies many
types of operations. For example, we can peek at an
InputStream by combining read and
unRead:
peek::InputStreamc ->IO(Maybec)peeks = do x <- Streams.reads case x ofNothing->return()Justc -> Streams.unReadc sreturnx
... although System.IO.Streams already exports the above function.
InputStreams can customize pushback behavior to support
more sophisticated support for pushback. For example, if you protect a stream
using throwIfProducesMoreThan and
unRead input, it will subtract the unread input from the
total byte count. However, these extra features will not interfere with the
basic pushback contract, given by the following law:
unReadc stream >>readstream ==return(Justc)
When you build an InputStream using
makeInputStream, it supplies the default pushback behavior
which just saves the input for the next read call. More
advanced users can use System.IO.Streams.Internal to customize their own
pushback routines.
Thread Safety
IO stream operations are not thread-safe by default for performance reasons. However, you can transform an existing IO stream into a thread-safe one using the provided locking functions:
lockingInputStream::InputStreama ->IO(InputStreama)lockingOutputStream::OutputStreama ->IO(OutputStreama)
These functions do not prevent access to the previous IO stream, so you must take care to not save the reference to the previous stream.
Examples
The following examples show how to use the standard library to implement traditional command-line utilities:
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad ((>=>), join)
import qualified Data.ByteString.Char8 as S
import Data.Int (Int64)
import Data.Monoid ((<>))
import System.IO.Streams (InputStream)
import qualified System.IO.Streams as Streams
import System.IO
import Prelude hiding (head)
cat :: FilePath -> IO ()
cat file = withFile file ReadMode $ \h -> do
is <- Streams.handleToInputStream h
Streams.connect is Streams.stdout
grep :: S.ByteString -> FilePath -> IO ()
grep pattern file = withFile file ReadMode $ \h -> do
is <- Streams.handleToInputStream h >>=
Streams.lines >>=
Streams.filter (S.isInfixOf pattern)
os <- Streams.unlines Streams.stdout
Streams.connect is os
data Option = Bytes | Words | Lines
len :: InputStream a -> IO Int64
len = Streams.fold (\n _ -> n + 1) 0
wc :: Option -> FilePath -> IO ()
wc opt file = withFile file ReadMode $
Streams.handleToInputStream >=> count >=> print
where
count = case opt of
Bytes -> \is -> do
(is', cnt) <- Streams.countInput is
Streams.skipToEof is'
cnt
Words -> Streams.words >=> len
Lines -> Streams.lines >=> len
nl :: FilePath -> IO ()
nl file = withFile file ReadMode $ \h -> do
nats <- Streams.fromList [1..]
ls <- Streams.handleToInputStream h >>= Streams.lines
is <- Streams.zipWith
(\n bs -> S.pack (show n) <> " " <> bs)
nats
ls
os <- Streams.unlines Streams.stdout
Streams.connect is os
head :: Int64 -> FilePath -> IO ()
head n file = withFile file ReadMode $ \h -> do
is <- Streams.handleToInputStream h >>= Streams.lines >>= Streams.take n
os <- Streams.unlines Streams.stdout
Streams.connect is os
paste :: FilePath -> FilePath -> IO ()
paste file1 file2 =
withFile file1 ReadMode $ \h1 ->
withFile file2 ReadMode $ \h2 -> do
is1 <- Streams.handleToInputStream h1 >>= Streams.lines
is2 <- Streams.handleToInputStream h2 >>= Streams.lines
isT <- Streams.zipWith (\l1 l2 -> l1 <> "\t" <> l2) is1 is2
os <- Streams.unlines Streams.stdout
Streams.connect isT os
yes :: IO ()
yes = do
is <- Streams.fromList (repeat "y")
os <- Streams.unlines Streams.stdout
Streams.connect is os