Copyright | (c) 2018 Composewell Technologies |
---|---|
License | BSD3 |
Maintainer | streamly@composewell.com |
Portability | GHC |
Safe Haskell | None |
Language | Haskell2010 |
Streamly.Internal.FileSystem.DirIO
Description
API design notes:
The paths returned by "read" can be absolute (/usr/bin/ls), relative to the current directory (./bin/ls), or path segments relative to the current directory (bin/ls). To accommodate all these cases we can provide a prefix to attach to the paths being generated. Alternatively, a higher layer could attach the prefix, but it is more efficient to allocate the path buffer once than to modify it later. We can do this by passing a fold to transform the output.
It may also be more efficient to filter the paths right here instead of in a layer above: cut the output at the source rather than generate it and discard it later. We can do this by passing a fold to filter the input.
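The two ideas in the notes so far (attach the prefix while generating, and drop unwanted names before anything is built for them) can be modelled in a few lines of plain Haskell. This is only a sketch over an in-memory list of names. `emitEntries`, its arguments, and the use of `String` are illustrative assumptions and not part of this module, which works on `Path` values and would take folds rather than a plain predicate.

```haskell
import Data.List (isPrefixOf)

-- | Hypothetical model of a directory read that filters and prefixes
-- entry names in a single pass. Names rejected by the predicate never
-- get a full path built for them.
emitEntries
    :: (String -> Bool)  -- ^ keep this entry name?
    -> String            -- ^ prefix to attach, e.g. the directory being read
    -> [String]          -- ^ raw entry names, as a readdir-like call might yield
    -> [String]
emitEntries keep prefix names =
    [ prefix ++ "/" ++ name | name <- names, keep name ]

main :: IO ()
main = do
    let raw = [".", "..", ".git", "bin", "src"]
        notHidden = not . isPrefixOf "."
    -- Hidden names (including "." and "..") are cut at the source.
    mapM_ putStrLn (emitEntries notHidden "/usr" raw)
```

Doing both steps in the producer means the caller never sees, and never pays for, the entries it would have discarded.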
When reading a directory symlink we can either resolve the symlink and read the destination directory, or just emit the file it points to and let the read happen next at the higher level, in the traversal logic (concatIterate). It is not clear whether one approach has any significant performance impact over the other. Similar thinking applies to mount points. Also, if we resolve symlinks in concatIterate, each resolution counts as a depth-level increment, whereas resolving at the lower level does not. We can do this by passing an option to modify the behavior.
When resolving cyclic directory symlinks, one way to curtail the cycle is ELOOP, where resolution gives up after encountering too many levels of symlinks. Another way is to use the inode information to check whether we are traversing an already traversed inode; this is helpful in graph traversal in general. We can ignore ELOOP by passing an option, but that may be inefficient because we may encounter the loop from any node in the cycle.
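The inode-based alternative amounts to ordinary graph traversal with a visited set. The sketch below models it over an in-memory graph. The `Inode` and `DirGraph` types and the `traverseOnce` function are hypothetical stand-ins (a real implementation would obtain inode and device numbers from a stat call), and nothing here is taken from this module.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- | Stand-in for an inode number; a real implementation would take it
-- from a stat call (together with the device id).
type Inode = Int

-- | Hypothetical directory graph: each inode maps to the inodes its
-- entries (including resolved symlinks) lead to.
type DirGraph = Map.Map Inode [Inode]

-- | Depth-first traversal that skips any inode it has already visited,
-- so a symlink cycle is cut the first time it closes.
traverseOnce :: DirGraph -> Inode -> [Inode]
traverseOnce graph root = reverse (snd (go (Set.empty, []) root))
  where
    go (seen, acc) node
        | node `Set.member` seen = (seen, acc)
        | otherwise =
            let children = Map.findWithDefault [] node graph
            in foldl go (Set.insert node seen, node : acc) children

main :: IO ()
main = do
    -- 1 -> 2 -> 3 -> 1 is a cycle; 4 hangs off 2.
    let graph = Map.fromList [(1, [2]), (2, [3, 4]), (3, [1]), (4, [])]
    print (traverseOnce graph 1)  -- prints [1,2,3,4]
```

Unlike relying on ELOOP, the visited set stops at the first repeated node regardless of where in the cycle the traversal entered.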
If we encounter an error reading a directory because of permission issues, should we ignore it in this low-level API or catch it in the higher-level traversal functionality? Similarly, if there are broken symlinks, where should the error be handled? Need to check performance when handling it in ListDir. Suppressing the error at the lower level may be more efficient than propagating it up and handling it there. We can do this by passing an option.
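To make the "suppress at the lower level" option concrete, here is a minimal sketch using `listDirectory` from the `directory` package rather than this module. The function name `listDirWith` and its boolean flag are hypothetical; they only illustrate an option that turns a permission error into an empty listing at the point where the directory is read.

```haskell
import Control.Exception (IOException, throwIO, try)
import System.Directory (listDirectory)
import System.IO.Error (isPermissionError)

-- | Hypothetical low-level read: when the flag is True a permission
-- error yields an empty listing instead of propagating to the caller.
-- Any other error is rethrown unchanged.
listDirWith :: Bool -> FilePath -> IO [FilePath]
listDirWith ignoreDenied dir = do
    result <- try (listDirectory dir) :: IO (Either IOException [FilePath])
    case result of
        Right names -> pure names
        Left err
            | ignoreDenied && isPermissionError err -> pure []
            | otherwise -> throwIO err

main :: IO ()
main = do
    names <- listDirWith True "."
    putStrLn (show (length names) ++ " entries in the current directory")
```

With the flag set, a traversal built on top never sees the exception, which is the behavior the note suggests may be cheaper than catching it one layer up.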
Returning the metadata:
Specific scans can be used to return the metadata in the output stream if needed. However, we may need three different APIs: one with no metadata, one with fast (minimal) metadata, and one with full metadata. In the latter two cases the fold input would be different.
- readMinimal: read only the path names, no metadata
- readStandard: read the path and minimal metadata
- readFull: read full metadata
NOTE: Full metadata can be read by mapping a stat call to a stream of paths rather than via readdir API. Does it help the performance to do it in the readdir API?
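The approach in the NOTE, listing names first and then mapping a metadata lookup over them, might look roughly like the following. This sketch uses the `directory` and `filepath` packages and plain lists instead of this module's streams. The `Meta` record, `statPath`, and `readWithMeta` are hypothetical names, and the record holds far less than a real stat result would.

```haskell
import System.Directory
    (doesDirectoryExist, doesFileExist, getFileSize, listDirectory)
import System.FilePath ((</>))

-- | A small, hypothetical metadata record; real "full metadata" would
-- carry everything a stat call returns.
data Meta = Meta
    { metaIsDir :: Bool
    , metaSize  :: Maybe Integer  -- ^ only queried for existing non-directories
    } deriving Show

-- | Look up metadata for one path with separate calls after the listing.
statPath :: FilePath -> IO (FilePath, Meta)
statPath path = do
    isDir <- doesDirectoryExist path
    isFile <- doesFileExist path
    size <- if isFile then Just <$> getFileSize path else pure Nothing
    pure (path, Meta isDir size)

-- | Read names first, then map the metadata lookup over them.
readWithMeta :: FilePath -> IO [(FilePath, Meta)]
readWithMeta dir = do
    names <- listDirectory dir
    mapM (statPath . (dir </>)) names

main :: IO ()
main = readWithMeta "." >>= mapM_ print
```

Each entry costs extra system calls after the listing, which is the overhead the NOTE asks about when it considers doing the same work inside the readdir API.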
Synopsis
- data ReadOptions = ReadOptions {}
- followSymlinks :: Bool -> ReadOptions -> ReadOptions
- ignoreMissing :: Bool -> ReadOptions -> ReadOptions
- ignoreSymlinkLoops :: Bool -> ReadOptions -> ReadOptions
- ignoreInaccessible :: Bool -> ReadOptions -> ReadOptions
- defaultReadOptions :: ReadOptions
- read :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => Path -> Stream m Path
- readFiles :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => (ReadOptions -> ReadOptions) -> Path -> Stream m Path
- readDirs :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => (ReadOptions -> ReadOptions) -> Path -> Stream m Path
- readEither :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => (ReadOptions -> ReadOptions) -> Path -> Stream m (Either Path Path)
- readEitherPaths :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => (ReadOptions -> ReadOptions) -> Path -> Stream m (Either Path Path)
- readEitherChunks :: forall (m :: Type -> Type). MonadIO m => (ReadOptions -> ReadOptions) -> [PosixPath] -> Stream m (Either [PosixPath] [PosixPath])
- reader :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => Unfold m Path Path
- fileReader :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => (ReadOptions -> ReadOptions) -> Unfold m Path Path
- dirReader :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => (ReadOptions -> ReadOptions) -> Unfold m Path Path
- eitherReader :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => (ReadOptions -> ReadOptions) -> Unfold m Path (Either Path Path)
- eitherReaderPaths :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => (ReadOptions -> ReadOptions) -> Unfold m Path (Either Path Path)
Metadata
Configuration
data ReadOptions
Options controlling the behavior of directory read.
Constructors
ReadOptions
followSymlinks :: Bool -> ReadOptions -> ReadOptions
Control how symbolic links are handled when determining the type of a directory entry.
- If set to True, symbolic links are resolved before classification. This means a symlink pointing to a directory will be treated as a directory, and a symlink pointing to a file will be treated as a non-directory.
- If set to False, all symbolic links are classified as non-directories, without attempting to resolve their targets.
Enabling resolution may cause additional errors to occur due to insufficient permissions, broken links, or symlink loops. Such errors can be ignored or handled using the appropriate options.
The default is False.
On Windows this option currently has no effect; symlinks are not followed to determine the type.
ignoreMissing :: Bool -> ReadOptions -> ReadOptions
When the followSymlinks option is enabled and a directory entry is a symbolic link, we resolve it to determine the type of the symlink target. This option controls the behavior when encountering broken symlink errors during resolution.
When set to True, broken symlink errors are ignored, and the type of the entry is reported as not a directory. When set to False, the directory read operation fails with an error.
The default is True.
On Windows this option currently has no effect; symlinks are not followed to determine the type.
ignoreSymlinkLoops :: Bool -> ReadOptions -> ReadOptions
When the followSymlinks option is enabled and a directory entry is a symbolic link, we resolve it to determine the type of the symlink target. This option controls the behavior when encountering symlink loop errors during resolution.
When set to True, symlink loop errors are ignored, and the type of the entry is reported as not a directory. When set to False, the directory read operation fails with an error.
The default is True.
On Windows this option currently has no effect; symlinks are not followed to determine the type.
ignoreInaccessible :: Bool -> ReadOptions -> ReadOptions
When the followSymlinks option is enabled and a directory entry is a symbolic link, we resolve it to determine the type of the symlink target. This option controls the behavior when encountering permission errors during resolution.
When set to True, any permission errors are ignored, and the type of the entry is reported as not a directory. When set to False, the directory read operation fails with an error.
The default is True.
On Windows this option currently has no effect; symlinks are not followed to determine the type.
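Because each option is a function of type `ReadOptions -> ReadOptions`, several settings can be combined with ordinary function composition and passed to any read function that takes a modifier. The sketch below relies on the signatures listed on this page. Two things are assumptions rather than statements from this page: that `Path.fromString` from `Streamly.FileSystem.Path` is the way to build a `Path`, and that passing `id` leaves the default options in effect. The name `followLinks` is hypothetical.

```haskell
import qualified Streamly.Data.Stream as Stream
import qualified Streamly.FileSystem.Path as Path
import qualified Streamly.Internal.FileSystem.DirIO as DirIO

-- | Resolve symlinks when classifying entries. The two "ignore"
-- settings are spelled out even though True is their documented default.
followLinks :: DirIO.ReadOptions -> DirIO.ReadOptions
followLinks =
      DirIO.followSymlinks True
    . DirIO.ignoreMissing True
    . DirIO.ignoreSymlinkLoops True

main :: IO ()
main = do
    -- Assumed constructor for Path values; not documented on this page.
    dir <- Path.fromString "."
    -- 'id' applies no changes to the options (assumed to mean defaults).
    plain <- Stream.toList (DirIO.readEither id dir)
    followed <- Stream.toList (DirIO.readEither followLinks dir)
    -- Count the entries classified as directories (Left) in each mode.
    print (length [() | Left _ <- plain], length [() | Left _ <- followed])
```

With `followLinks`, a symlink that points to a directory should move from the Right side to the Left side of the output, so the two counts can differ.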
Streams
read :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => Path -> Stream m Path
Raw read of a directory.
Pre-release
readFiles :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => (ReadOptions -> ReadOptions) -> Path -> Stream m Path
Read files only.
Internal
readDirs :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => (ReadOptions -> ReadOptions) -> Path -> Stream m Path
Read directories only.
Internal
readEither :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => (ReadOptions -> ReadOptions) -> Path -> Stream m (Either Path Path)
Read directories as Left and files as Right. Filter out "." and ".." entries. The output contains the names of the directories and files.
Pre-release
readEitherPaths :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => (ReadOptions -> ReadOptions) -> Path -> Stream m (Either Path Path)
Like readEither but prefix the names of the files and directories with the supplied directory path.
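Since readEitherPaths emits paths already prefixed with the directory that was read, each Left value can be fed straight back in as the next directory to read. The following is a rough sketch of a recursive listing built that way. `walk` is a hypothetical helper, written with plain recursion for readability rather than with the library's own traversal combinators, and `Path.fromString` and `Path.toString` from `Streamly.FileSystem.Path` are assumed to be the conversion functions.

```haskell
import Streamly.Data.Stream (Stream)
import Streamly.FileSystem.Path (Path)
import qualified Streamly.Data.Fold as Fold
import qualified Streamly.Data.Stream as Stream
import qualified Streamly.FileSystem.Path as Path
import qualified Streamly.Internal.FileSystem.DirIO as DirIO

-- | Emit every entry under a directory, descending into each entry
-- that is classified as a directory (Left).
walk :: Path -> Stream IO (Either Path Path)
walk dir = Stream.concatMap descend (DirIO.readEitherPaths id dir)
  where
    -- A directory is emitted first, followed by its contents.
    descend entry@(Left subdir) = Stream.cons entry (walk subdir)
    -- A non-directory is emitted as is.
    descend entry = Stream.fromPure entry

main :: IO ()
main = do
    root <- Path.fromString "."  -- assumed constructor
    let render = either Path.toString Path.toString
    Stream.fold (Fold.drainMapM (putStrLn . render)) (walk root)
```

With unmodified options symlinks are classified as non-directories according to this page, so this walk should not follow symlink cycles. An unreadable subdirectory would still raise an error here, because nothing in the sketch catches it.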
readEitherChunks :: forall (m :: Type -> Type). MonadIO m => (ReadOptions -> ReadOptions) -> [PosixPath] -> Stream m (Either [PosixPath] [PosixPath])
Unfolds
Use the more convenient stream APIs instead of unfolds where possible.
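For comparison, this is roughly how an unfold from this section is turned into a stream by hand with `Stream.unfold`. The sketch assumes, without confirmation from this page, that unfolding `eitherReader f` over a directory yields the same entries as `readEither f`. The names `viaUnfold` and `viaStream` are hypothetical, and `Path.fromString` is assumed as the `Path` constructor.

```haskell
import Streamly.Data.Stream (Stream)
import Streamly.FileSystem.Path (Path)
import qualified Streamly.Data.Fold as Fold
import qualified Streamly.Data.Stream as Stream
import qualified Streamly.FileSystem.Path as Path
import qualified Streamly.Internal.FileSystem.DirIO as DirIO

-- | Unfold style: the directory is the seed given to the unfold.
viaUnfold :: Path -> Stream IO (Either Path Path)
viaUnfold = Stream.unfold (DirIO.eitherReader id)

-- | Stream style: the same read through the stream API.
viaStream :: Path -> Stream IO (Either Path Path)
viaStream = DirIO.readEither id

main :: IO ()
main = do
    dir <- Path.fromString "."  -- assumed constructor
    n1 <- Stream.fold Fold.length (viaUnfold dir)
    n2 <- Stream.fold Fold.length (viaStream dir)
    -- Both counts are expected to agree under the assumption above.
    print (n1, n2)
```

The stream version needs no extra combinator at the call site, which is the convenience this section refers to.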
fileReader :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => (ReadOptions -> ReadOptions) -> Unfold m Path Path
Read files only.
Internal
dirReader :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => (ReadOptions -> ReadOptions) -> Unfold m Path Path
Read directories only. Filter out "." and ".." entries.
Internal
eitherReader :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => (ReadOptions -> ReadOptions) -> Unfold m Path (Either Path Path)
Read directories as Left and files as Right. Filter out "." and ".." entries.
Internal
eitherReaderPaths :: forall (m :: Type -> Type). (MonadIO m, MonadCatch m) => (ReadOptions -> ReadOptions) -> Unfold m Path (Either Path Path)