dataframe-1.0.0.1: A fast, safe, and intuitive DataFrame library.
Safe Haskell: None
Language: Haskell2010

DataFrame.IO.Parquet.Seeking

Description

This module contains low-level utilities for file seeking.

It may also come to hold all Streamly-related low-level utilities.

It may later be renamed or moved to an internal module.

Documentation

data SeekableHandle Source #

This handle carries a proof that it is seekable. Note: Handle and SeekableHandle are not thread-safe and should not be shared across threads; beware of this when running parallel or concurrent code.

Not seekable:

- stdin / stdout
- pipes / FIFOs

Regular files, by contrast, are always seekable. Parquet fundamentally wants random access: a non-seekable source cannot be read efficiently without buffering the entire file.

data SeekMode #

A mode that determines the effect of hSeek hdl mode i.

Constructors

AbsoluteSeek

the position of hdl is set to i.

RelativeSeek

the position of hdl is set to offset i from the current position.

SeekFromEnd

the position of hdl is set to offset i from the end of the file.
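
The three constructors can be compared with a small sketch that uses only hSeek from System.IO and hGet from Data.ByteString. The helper name sampleSeekModes is hypothetical and is not part of this module; it assumes a seekable handle over a file of at least seven bytes.

```haskell
import qualified Data.ByteString as BS
import System.IO

-- Hypothetical helper: read one byte after each kind of seek.
sampleSeekModes :: Handle -> IO (BS.ByteString, BS.ByteString, BS.ByteString)
sampleSeekModes hdl = do
  hSeek hdl AbsoluteSeek 2     -- position becomes 2
  a <- BS.hGet hdl 1           -- reads offset 2; position is now 3
  hSeek hdl RelativeSeek 3     -- position becomes 3 + 3 = 6
  b <- BS.hGet hdl 1           -- reads offset 6
  hSeek hdl SeekFromEnd (-1)   -- position becomes (file size - 1)
  c <- BS.hGet hdl 1           -- reads the last byte of the file
  pure (a, b, c)
```

Note that SeekFromEnd takes a negative offset to land inside the file, and that RelativeSeek is measured from wherever the previous read left the position.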

Instances

Enum SeekMode
Ix SeekMode
Read SeekMode
Show SeekMode
Eq SeekMode
Ord SeekMode

All instances are defined in GHC.IO.Device. Since: base-4.2.0.0

data FileBufferedOrSeekable Source #

If we truly want to support non-seekable files, we also need to handle the case of buffering the entire file in memory.

Not thread-safe: it contains a mutable reference (as Handle already does).

For concurrent or parallel parsing, read into a ByteString first rather than sharing the same handle.
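
The seekable-or-buffered idea described above can be sketched as a two-constructor type. This is an illustration only, not the library's actual definition: the names Source, openSource, and readNext are hypothetical, and the use of an IORef for the read offset is an assumption about what the "mutable reference" might look like.

```haskell
import qualified Data.ByteString as BS
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO

-- Hypothetical stand-in for a "seekable or fully buffered" source.
data Source
  = Seekable Handle                     -- random access through the handle
  | Buffered BS.ByteString (IORef Int)  -- whole contents plus current offset

-- Prefer the seekable case; otherwise read everything into memory.
openSource :: Handle -> IO Source
openSource hdl = do
  ok <- hIsSeekable hdl
  if ok
    then pure (Seekable hdl)
    else do
      bytes <- BS.hGetContents hdl  -- strict: reads all input, closes hdl
      pos <- newIORef 0
      pure (Buffered bytes pos)

-- Read up to n bytes from the current position; may return fewer at the end.
readNext :: Int -> Source -> IO BS.ByteString
readNext n (Seekable hdl) = BS.hGet hdl n
readNext n (Buffered bytes ref) = do
  pos <- readIORef ref
  let chunk = BS.take n (BS.drop pos bytes)
  writeIORef ref (pos + BS.length chunk)
  pure chunk
```

Both constructors carry mutable state (the handle's position, or the IORef), which is why a value like this should not be shared between threads.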

type ForceNonSeekable = Maybe Bool Source #

For testing only

advanceBytes :: Int -> FileBufferedOrSeekable -> IO ByteString Source #

Note: this does not guarantee n bytes; fewer are returned if the input ends early.

mkFileBufferedOrSeekable :: ForceNonSeekable -> Handle -> IO FileBufferedOrSeekable Source #

Smart constructor for FileBufferedOrSeekable; it stays in the seekable case whenever possible.

mkSeekableHandle :: Handle -> IO (Maybe SeekableHandle) Source #

Smart constructor for SeekableHandle
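
As a rough illustration of the smart-constructor pattern, the sketch below wraps a Handle only after hIsSeekable (from System.IO) confirms it supports seeking. The names CheckedHandle and mkChecked are hypothetical stand-ins, not the library's definitions; in a real module the data constructor would be left unexported so the check cannot be bypassed.

```haskell
import System.IO

-- Hypothetical wrapper; holding one is evidence the check passed.
newtype CheckedHandle = CheckedHandle Handle

-- Returns Nothing for handles that cannot seek (for example, pipes).
mkChecked :: Handle -> IO (Maybe CheckedHandle)
mkChecked hdl = do
  ok <- hIsSeekable hdl
  pure (if ok then Just (CheckedHandle hdl) else Nothing)
```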

readLastBytes :: Integer -> FileBufferedOrSeekable -> IO ByteString Source #

Reads from the end of the file. Useful for reading metadata without loading the entire file.
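
For a plain seekable Handle, reading from the end can be sketched with hFileSize and SeekFromEnd. The helper readTail is hypothetical and only approximates the idea; clamping the request to the file size is an assumption of this sketch, not documented behaviour of readLastBytes.

```haskell
import qualified Data.ByteString as BS
import System.IO

-- Hypothetical helper: read at most the last n bytes of a seekable handle.
readTail :: Integer -> Handle -> IO BS.ByteString
readTail n hdl = do
  size <- hFileSize hdl
  let k = max 0 (min n size)         -- never seek before the start
  hSeek hdl SeekFromEnd (negate k)   -- k bytes back from the end
  BS.hGet hdl (fromIntegral k)
```

Because Parquet stores its file metadata in a footer at the end of the file, a read like this can fetch it without touching the column data.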

seekAndReadBytes :: Maybe (SeekMode, Integer) -> Int -> FileBufferedOrSeekable -> IO ByteString Source #

Note: this does not guarantee n bytes; fewer are returned if the input ends early.
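
The optional seek followed by a possibly short read can be sketched over a plain Handle as follows. The helper seekThenRead is hypothetical; treating Nothing as "read from the current position" is an assumption suggested by the Maybe argument, not something the documentation states.

```haskell
import qualified Data.ByteString as BS
import Data.Foldable (for_)
import System.IO

-- Hypothetical helper: optionally reposition, then read up to n bytes.
seekThenRead :: Maybe (SeekMode, Integer) -> Int -> Handle -> IO BS.ByteString
seekThenRead target n hdl = do
  -- With Nothing, no seek happens and the read continues where it left off.
  for_ target $ \(mode, offset) -> hSeek hdl mode offset
  -- hGet returns fewer than n bytes when the end of the file is reached.
  BS.hGet hdl n
```

Callers that need exactly n bytes should check the length of the result rather than assume the read was complete.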

seekAndStreamBytes :: MonadIO m => Maybe (SeekMode, Integer) -> Int -> FileBufferedOrSeekable -> m (Stream m Word8) Source #

Warning: the stream produced by this function reads from the mutable handle. If multiple streams are pulled from the same handle at the same time, the results are unpredictable. Make sure only one stream is running at a time for each SeekableHandle, and that a stream is not read again once it is no longer in use.

withFileBufferedOrSeekable :: ForceNonSeekable -> FilePath -> IOMode -> (FileBufferedOrSeekable -> IO a) -> IO a Source #

With / bracket pattern for FileBufferedOrSeekable

Warning: do not return the FileBufferedOrSeekable outside the scope of the action as it will be closed.
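
The same scoping rule applies to withBinaryFile from System.IO, which makes it a convenient way to illustrate the warning. In this sketch the helper names firstBytes and leakedHandle are hypothetical: the first consumes the resource inside the action and returns plain data, while the second lets the resource escape and hands back a handle that is already closed.

```haskell
import qualified Data.ByteString as BS
import System.IO

-- Fine: all reading happens inside the action; only bytes escape.
firstBytes :: FilePath -> Int -> IO BS.ByteString
firstBytes path n = withBinaryFile path ReadMode (\hdl -> BS.hGet hdl n)

-- Anti-pattern: the handle is returned, but it is closed on exit,
-- so any later read from it will fail.
leakedHandle :: FilePath -> IO Handle
leakedHandle path = withBinaryFile path ReadMode pure
```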