| Copyright | (c) 2009, 2010 Bryan O'Sullivan, (c) 2009 Simon Marlow |
|---|---|
| License | BSD-style |
| Maintainer | bos@serpentine.com |
| Portability | GHC |
| Safe Haskell | Trustworthy |
| Language | Haskell2010 |
Data.Text.IO
Description
Efficient locale-sensitive support for text I/O.
The functions in this module obey the runtime system's locale, character set encoding, and line ending conversion settings.
If you want to do I/O using the UTF-8 encoding, use Data.Text.IO.Utf8, which is faster than this module.
If you know in advance that you will be working with data that has a specific encoding, and your application is highly performance sensitive, you may find that it is faster to perform I/O with bytestrings and to encode and decode yourself than to use the functions in this module.
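As a minimal sketch of how these settings interact with this module, the example below fixes the encoding and newline mode on a handle explicitly, so the result does not depend on the process locale. The file name example-latin1.txt is a hypothetical placeholder; hSetEncoding, latin1, hSetNewlineMode and noNewlineTranslation come from System.IO in base.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text.IO as TIO
import System.IO

main :: IO ()
main = do
  -- Hypothetical file name, used only for this illustration.
  let path = "example-latin1.txt" :: FilePath
  -- Write with an explicit encoding and no newline translation.
  withFile path WriteMode $ \h -> do
    hSetEncoding h latin1
    hSetNewlineMode h noNewlineTranslation
    TIO.hPutStr h "caf\233\n"
  -- Read back with the same settings; hGetContents is strict, so the
  -- text is fully read before withFile closes the handle.
  t <- withFile path ReadMode $ \h -> do
    hSetEncoding h latin1
    hSetNewlineMode h noNewlineTranslation
    TIO.hGetContents h
  -- Compare the round-tripped text with what was written.
  print (t == "caf\233\n")
```

Without the hSetEncoding calls, the same code would use whatever encoding the runtime derives from the locale.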
Synopsis
- readFile :: FilePath -> IO Text
- writeFile :: FilePath -> Text -> IO ()
- appendFile :: FilePath -> Text -> IO ()
- hGetContents :: Handle -> IO Text
- hGetChunk :: Handle -> IO Text
- hGetLine :: Handle -> IO Text
- hPutStr :: Handle -> Text -> IO ()
- hPutStrLn :: Handle -> Text -> IO ()
- interact :: (Text -> Text) -> IO ()
- getContents :: IO Text
- getLine :: IO Text
- putStr :: Text -> IO ()
- putStrLn :: Text -> IO ()
File-at-a-time operations
readFile :: FilePath -> IO Text
The readFile function reads a file and returns the contents of
the file as a string. The entire file is read strictly, as with
getContents.
Beware that this function (similarly to Prelude.readFile) is locale-dependent.
An unexpected system locale may cause your application to read corrupted data or
throw runtime exceptions about "invalid argument (invalid byte sequence)"
or "invalid argument (invalid character)". It is also slow, because GHC
first converts the entire input to UTF-32, which is afterwards converted to UTF-8.
If your data is UTF-8,
using decodeUtf8 <$> Data.ByteString.readFile
is a much faster and safer alternative.
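The UTF-8 alternative can be sketched as follows. This version assumes you prefer an Either result over an exception, so it uses decodeUtf8' from Data.Text.Encoding rather than decodeUtf8; the helper name readFileUtf8 and the file name are hypothetical.

```haskell
import qualified Data.ByteString as B
import Data.Text (Text)
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8')
import Data.Text.Encoding.Error (UnicodeException)

-- Hypothetical helper: read raw bytes, then decode them as UTF-8.
-- The locale plays no part in either step.
readFileUtf8 :: FilePath -> IO (Either UnicodeException Text)
readFileUtf8 path = decodeUtf8' <$> B.readFile path

main :: IO ()
main = do
  -- Write the UTF-8 bytes for 'h', U+00E9 and a newline.
  B.writeFile "example-utf8.txt" (B.pack [0x68, 0xC3, 0xA9, 0x0A])
  r <- readFileUtf8 "example-utf8.txt"
  case r of
    Left err -> putStrLn ("invalid UTF-8: " ++ show err)
    Right t  -> print (T.length t)
```

Returning Either makes malformed input an ordinary value that the caller has to handle, instead of a runtime exception.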
writeFile :: FilePath -> Text -> IO ()
Write a string to a file. The file is truncated to zero length before writing begins.
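A short sketch of the truncation behaviour, contrasted with appendFile from the synopsis. The file name example-log.txt is a hypothetical placeholder.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text.IO as TIO

main :: IO ()
main = do
  -- Hypothetical file name, used only for this illustration.
  let path = "example-log.txt" :: FilePath
  TIO.writeFile path "first\n"
  -- A second writeFile truncates the file, discarding "first\n".
  TIO.writeFile path "second\n"
  -- appendFile keeps the existing contents and adds to the end.
  TIO.appendFile path "third\n"
  TIO.readFile path >>= TIO.putStr
```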
Operations on handles
hGetContents :: Handle -> IO Text
Read the remaining contents of a Handle as a string. The
Handle is closed once the contents have been read, or if an
exception is thrown.
Internally, this function reads a chunk at a time from the lower-level buffering abstraction, and concatenates the chunks into a single string once the entire file has been read.
As a result, constructing the result requires approximately twice as much memory as the result itself occupies. For files larger than half of the available RAM, this may result in memory exhaustion.
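The strategy described above can be illustrated with a small sketch built on hGetChunk. This is not the library's actual implementation, only an approximation of the same idea; readAllChunks and the file name are hypothetical.

```haskell
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import System.IO

-- Illustrative only: collect chunks in a list, then concatenate once at EOF.
-- While T.concat runs, both the chunk list and the new result are live,
-- which is where the roughly doubled peak memory comes from.
readAllChunks :: Handle -> IO Text
readAllChunks h = go []
  where
    go acc = do
      chunk <- TIO.hGetChunk h
      if T.null chunk
        then pure (T.concat (reverse acc))  -- empty chunk: EOF reached
        else go (chunk : acc)

main :: IO ()
main = do
  -- Hypothetical file name, used only for this illustration.
  let path = "example-chunks.txt" :: FilePath
  TIO.writeFile path (T.replicate 10000 (T.pack "ab\n"))
  whole <- withFile path ReadMode readAllChunks
  print (T.length whole)
```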
hGetChunk :: Handle -> IO Text
Read a single chunk of strict text from a
Handle. The size of the chunk depends on the amount of input
currently buffered.
This function blocks only if there is no data available, and EOF has not yet been reached. Once EOF is reached, this function returns an empty string instead of throwing an exception.
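Because an empty chunk signals EOF, a handle can be processed incrementally without holding the whole input in memory. A minimal sketch, assuming a hypothetical helper countChars and a hypothetical file name:

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import System.IO

-- Count characters chunk by chunk. Only the current chunk and the
-- running total are kept, so memory use does not grow with file size.
countChars :: Handle -> IO Int
countChars h = go 0
  where
    go n = do
      chunk <- TIO.hGetChunk h
      if T.null chunk
        then pure n                       -- empty chunk: EOF reached
        else go $! n + T.length chunk     -- force the total at each step

main :: IO ()
main = do
  -- Hypothetical file name, used only for this illustration.
  let path = "example-count.txt"
  TIO.writeFile path (T.pack "one\ntwo\n")
  n <- withFile path ReadMode countChars
  print n
```

The strict application ($!) keeps the accumulator evaluated, so the loop does not build up a chain of unevaluated additions.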
Behavior
Unlike byte-oriented functions, hGetChunk operates on complete UTF-8
characters. Since UTF-8 characters can occupy 1 to 4 bytes, this function
cannot guarantee reading an exact number of bytes. Instead, it reads
complete characters up to the handle's internal buffer limit.
Buffer Size
The maximum chunk size is determined by the handle's internal character
buffer, which is set to 8192 bytes (2048 characters) by the GHC runtime
constant dEFAULT_CHAR_BUFFER_SIZE. This buffer size cannot be modified
through any public API.
UTF-8 Considerations
When working with UTF-8 encoded text:
- The function will never return a partial character
- The actual number of bytes read may vary depending on the character encoding (ASCII characters = 1 byte, other Unicode characters = 2-4 bytes)