Safe Haskell | Safe-Inferred |
---|---|
Language | Haskell2010 |
Hledger.Read
Description
This is the entry point to hledger's reading system, which can read Journals from various data formats. Use this module if you want to parse journal data or read journal files. Generally it should not be necessary to import modules below this one.
Journal reading
Reading an input file (in journal, csv, timedot, or timeclock format, etc.) involves these steps:
- select an appropriate file format "reader" based on filename extension, file path prefix, or function parameter. A reader contains a parser and a finaliser (usually journalFinalise).
- run the parser to get a ParsedJournal (this may run additional sub-parsers to parse included files)
- run the finaliser to get a complete Journal, which passes standard checks
- if reading multiple files: merge the per-file Journals into one overall Journal
- if using -s/--strict: run additional strict checks
- if running print --new: save .latest files for each input file. (import also does this, as its final step.)
Journal merging
Journal implements the Semigroup class, so two Journals can be merged into one Journal with j1 <> j2. This is implemented by the journalConcat function, whose documentation explains what merging Journals means exactly.
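As a rough illustration of merging with <>, the sketch below parses two small journals from Text and combines them. It assumes the hledger-lib package is available, and that jtxns (the Journal field holding its transactions) is exported by Hledger.Data; the journal contents are made up for the example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
-- Sketch only: needs the hledger-lib package.
import Hledger.Data (jtxns)          -- assumption: Journal's transaction list field
import Hledger.Read (readJournal'')

-- Parse two independent journals, merge them, and count the transactions.
mergedTxnCount :: IO Int
mergedTxnCount = do
  j1 <- readJournal'' "2024-01-01 salary\n  assets:bank  100 USD\n  income:salary\n"
  j2 <- readJournal'' "2024-01-02 lunch\n  expenses:food  10 USD\n  assets:bank\n"
  let merged = j1 <> j2              -- Semigroup append, i.e. journalConcat
  pure (length (jtxns merged))

main :: IO ()
main = mergedTxnCount >>= print
```

Each input is finalised separately before the merge, which matches the note that journalFinalise is not called when Journals are merged.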
Journal finalising
This is post-processing done after parsing an input file, such as
inferring missing information, normalising amount styles,
checking for errors and so on - a delicate and influential stage
of data processing.
In hledger it is done by journalFinalise, which converts a preliminary ParsedJournal to a validated, ready-to-use Journal. This is called immediately after the parsing of each input file. It is not called when Journals are merged.
Journal reading API
There are three main Journal-reading functions:
- readJournal to read from a Handle. Selects a reader and calls its parser and finaliser, then does strict checking if needed.
- readJournalFile to read one file, or stdin if the file path is -. Uses the file path/file name to help select the reader, calls readJournal, then writes .latest files if needed.
- readJournalFiles to read multiple files. Calls readJournalFile for each file (without strict checking or .latest file writing), then merges the Journals into one, then does strict checking and .latest file writing at the end if needed.
Each of these also has an easier variant with a ' suffix, which uses default options and has a simpler type signature.
One more variant, readJournalFilesAndLatestDates, is like readJournalFiles but exposing the latest transaction date (and how many on the same day) seen for each file. This is used by the import command.
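A minimal usage sketch of the easy and the full variants follows. It assumes hledger-lib is installed, that definputopts (the default InputOpts value) is re-exported from Hledger.Read.InputOptions, and that jtxns is available from Hledger.Data; the file name example.journal is hypothetical and is created by the example itself.

```haskell
-- Sketch only: needs the hledger-lib package.
import Hledger.Data (jtxns)   -- assumption: Journal's transaction list field
import Hledger.Read (definputopts, readJournalFile, readJournalFile', runExceptT)

main :: IO ()
main = do
  -- Hypothetical sample file, written so the example has something to read.
  writeFile "example.journal"
    "2024-01-01 opening\n  assets:bank  100 USD\n  equity:opening\n"

  -- Easy variant: default options, failures are raised in IO.
  j <- readJournalFile' "example.journal"
  putStrLn ("easy variant read " ++ show (length (jtxns j)) ++ " transaction(s)")

  -- Full variant: explicit options, errors returned as Left.
  -- The "journal:" prefix forces the journal reader.
  r <- runExceptT (readJournalFile definputopts "journal:example.journal")
  case r of
    Left err -> putStrLn ("read failed: " ++ err)
    Right j' -> putStrLn ("full variant read " ++ show (length (jtxns j')) ++ " transaction(s)")
```

The full variants are the better fit when a caller wants to report or recover from read errors itself rather than let the program exit.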
Synopsis
- type PrefixedFilePath = FilePath
- defaultJournal :: IO Journal
- defaultJournalPath :: IO String
- requireJournalFileExists :: FilePath -> IO ()
- ensureJournalFileExists :: FilePath -> IO ()
- runExceptT :: ExceptT e m a -> m (Either e a)
- readJournal :: InputOpts -> Maybe FilePath -> Handle -> ExceptT String IO Journal
- readJournalFile :: InputOpts -> PrefixedFilePath -> ExceptT String IO Journal
- readJournalFiles :: InputOpts -> [PrefixedFilePath] -> ExceptT String IO Journal
- readJournalFilesAndLatestDates :: InputOpts -> [PrefixedFilePath] -> ExceptT String IO (Journal, [LatestDatesForFile])
- readJournal' :: Handle -> IO Journal
- readJournal'' :: Text -> IO Journal
- readJournalFile' :: PrefixedFilePath -> IO Journal
- readJournalFiles' :: [PrefixedFilePath] -> IO Journal
- orDieTrying :: MonadIO m => ExceptT String m a -> m a
- saveLatestDates :: LatestDates -> FilePath -> IO ()
- saveLatestDatesForFiles :: [LatestDatesForFile] -> IO ()
- tmpostingrulep :: Maybe Year -> JournalParser m TMPostingRule
- findReader :: MonadIO m => Maybe StorageFormat -> Maybe FilePath -> Maybe (Reader m)
- splitReaderPrefix :: PrefixedFilePath -> (Maybe StorageFormat, FilePath)
- runJournalParser :: Monad m => JournalParser m a -> Text -> m (Either HledgerParseErrors a)
- module Hledger.Read.Common
- module Hledger.Read.InputOptions
- tests_Read :: TestTree
Journal files
type PrefixedFilePath = FilePath
A file path optionally prefixed by a reader name and colon (journal:, csv:, timedot:, etc.).
defaultJournal :: IO Journal
Read the default journal file specified by the environment, or raise an error.
defaultJournalPath :: IO String
Get the default journal file path specified by the environment.
Like ledger, we look first for the LEDGER_FILE environment variable, and if that does not exist, for the legacy LEDGER environment variable. If neither is set, or the value is blank, return the hard-coded default, which is .hledger.journal in the user's home directory (or in the current directory, if we cannot determine a home directory).
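The lookup order can be modelled as a small pure function. This is a sketch, not hledger's code: chooseJournalPath is a hypothetical name, the environment values and home directory are passed in as arguments rather than read from the system, paths are joined with a POSIX-style "/", and it assumes that a LEDGER_FILE which is set but blank leads straight to the default.

```haskell
import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)

-- Hypothetical pure model of the decision described above.
chooseJournalPath
  :: Maybe String    -- value of LEDGER_FILE, if set
  -> Maybe String    -- value of the legacy LEDGER, if set
  -> Maybe FilePath  -- home directory, if it can be determined
  -> FilePath
chooseJournalPath ledgerFile ledger home =
  case ledgerFile <|> ledger of          -- LEDGER_FILE wins when both are set
    Just p | not (null p) -> p
    _ -> fromMaybe "." home ++ "/.hledger.journal"

main :: IO ()
main = mapM_ putStrLn
  [ chooseJournalPath (Just "a.journal") (Just "b.journal") (Just "/home/me")  -- a.journal
  , chooseJournalPath Nothing (Just "b.journal") (Just "/home/me")             -- b.journal
  , chooseJournalPath Nothing Nothing (Just "/home/me")                        -- /home/me/.hledger.journal
  , chooseJournalPath (Just "") Nothing Nothing                                -- ./.hledger.journal
  ]
```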
requireJournalFileExists :: FilePath -> IO ()
If the specified journal file does not exist (and is not "-"), give a helpful error and quit. (Using "journal file" generically here; it could be in any of hledger's supported formats.)
ensureJournalFileExists :: FilePath -> IO ()
Ensure there is a journal file at the given path, creating an empty one if needed.
On Windows, also ensure that the path contains no trailing dots which could cause data loss (see isWindowsUnsafeDotPath).
Journal parsing
runExceptT :: ExceptT e m a -> m (Either e a)
The inverse of ExceptT.
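For readers unfamiliar with ExceptT, here is a small standalone illustration using the transformers package that ships with GHC. The safeDiv function is a made-up example and has nothing to do with hledger.

```haskell
import Control.Monad.Trans.Except (ExceptT (..), runExceptT, throwE)

-- A made-up computation that can fail with a String error.
safeDiv :: Int -> Int -> ExceptT String IO Int
safeDiv _ 0 = throwE "division by zero"
safeDiv x y = pure (x `div` y)

main :: IO ()
main = do
  ok  <- runExceptT (safeDiv 10 2)
  bad <- runExceptT (safeDiv 1 0)
  print ok    -- Right 5
  print bad   -- Left "division by zero"
  -- "Inverse" in the sense that wrapping with ExceptT and then
  -- unwrapping with runExceptT gives back the same Either value.
  again <- runExceptT (ExceptT (pure ok))
  print (again == ok)   -- True
```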
readJournal :: InputOpts -> Maybe FilePath -> Handle -> ExceptT String IO Journal
readJournal iopts mfile txt
Read a Journal from some handle, with strict checks if enabled, or return an error message.
The reader (data format) is chosen based on, in this order:
- a reader name provided in iopts
- a reader prefix in the mfile path
- a file extension in mfile
If none of these is available, or if the reader name is unrecognised, the journal reader is used.
If a file path is not provided, "-" is assumed (and may appear in error messages, files output etc., where it will be a slight lie: it will mean "not from a file", not necessarily "from standard input").
readJournalFile :: InputOpts -> PrefixedFilePath -> ExceptT String IO Journal
Read a Journal from this file, or from stdin if the file path is -, with strict checks if enabled, or return an error message. (Note strict checks are disabled temporarily here when this is called by readJournalFiles). The file path can have a READER: prefix.
The reader (data format) to use is determined from (in priority order):
- the mformat_ specified in the input options, if any;
- the file path's READER: prefix, if any;
- a recognised file name extension.
If none of these identifies a known reader, the journal reader is used.
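The priority order above can be sketched as a toy model using only base. None of these names are hledger's: lookupReader, splitPrefix, extensionOf and chooseReader are hypothetical, reader names are plain Strings, and the model assumes that each format's file extension equals its reader name (the real readers may accept other extensions). Following the readJournal description, an explicitly given but unrecognised reader name falls back to the journal reader.

```haskell
import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)

-- Recognised reader names; "ssv" and "tsv" are aliases for csv.
lookupReader :: String -> Maybe String
lookupReader name
  | name `elem` ["journal", "csv", "timedot", "timeclock"] = Just name
  | name `elem` ["ssv", "tsv"] = Just "csv"
  | otherwise = Nothing

-- Split off a "READER:" prefix, but only if READER is a known name.
splitPrefix :: FilePath -> (Maybe String, FilePath)
splitPrefix path =
  case break (== ':') path of
    (pre, ':' : rest) | Just _ <- lookupReader pre -> (Just pre, rest)
    _ -> (Nothing, path)

-- The text after the last '.' of the final path component, if any.
extensionOf :: FilePath -> Maybe String
extensionOf path =
  case break (== '.') (reverse path) of
    (revExt, '.' : _) | not (null revExt), '/' `notElem` revExt -> Just (reverse revExt)
    _ -> Nothing

-- Explicit format, then path prefix, then extension, else "journal".
chooseReader :: Maybe String -> FilePath -> String
chooseReader mformat prefixedPath =
  case mformat <|> mprefix of
    Just name -> fromMaybe "journal" (lookupReader name)
    Nothing   -> fromMaybe "journal" (extensionOf path >>= lookupReader)
  where
    (mprefix, path) = splitPrefix prefixedPath

main :: IO ()
main = mapM_ (putStrLn . uncurry chooseReader)
  [ (Nothing, "bank.csv")             -- csv (from the extension)
  , (Nothing, "timedot:notes.txt")    -- timedot (from the prefix)
  , (Just "tsv", "export.txt")        -- csv (explicit alias)
  , (Just "nosuch", "bank.csv")       -- journal (unrecognised name)
  , (Nothing, "-")                    -- journal (nothing to go on)
  ]
```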
The input options can also configure balance assertion checking, automated posting generation, a rules file for converting CSV data, etc.
If using --new, and if latest-file writing is enabled in input options, and not deferred by readJournalFiles, and after passing strict checks if enabled, a .latest.FILE file will be created/updated (for the main file only, not for included files), to remember the latest transaction date processed.
readJournalFiles :: InputOpts -> [PrefixedFilePath] -> ExceptT String IO Journal
Read a Journal from each specified file path (using readJournalFile) and combine them into one; or return the first error message.
Combining Journals means concatenating them, basically. The parse state resets at the start of each file, which means that directives & aliases do not affect subsequent sibling or parent files. They do affect included child files though. Also the final parse state saved in the Journal does span all files.
Strict checks, if enabled, are temporarily deferred until all files are read, to ensure they see the whole journal, and/or to avoid redundant work. (Some checks, like assertions and ordereddates, might still be doing redundant work.)
Writing .latest files, if enabled, is also deferred till the end, and is done only if strict checks pass.
readJournalFilesAndLatestDates :: InputOpts -> [PrefixedFilePath] -> ExceptT String IO (Journal, [LatestDatesForFile])
Easy journal parsing
readJournal' :: Handle -> IO Journal
An easy version of readJournal which assumes default options, and fails in the IO monad.
readJournal'' :: Text -> IO Journal
An even easier version of readJournal which, like readJournal', assumes default options and fails in the IO monad, but takes a Text instead of a Handle.
readJournalFile' :: PrefixedFilePath -> IO Journal
An easy version of readJournalFile which assumes default options, and fails in the IO monad.
readJournalFiles' :: [PrefixedFilePath] -> IO Journal
An easy version of readJournalFiles which assumes default options, and fails in the IO monad.
orDieTrying :: MonadIO m => ExceptT String m a -> m a
Extract ExceptT to the IO monad, failing with an error message if necessary.
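A function with this type and behaviour could be written roughly as below. This is a stand-in named orDie, written for illustration with base and transformers only; it is not claimed to be hledger's implementation.

```haskell
import Control.Exception (IOException, try)
import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)

-- Hypothetical stand-in with the same type as orDieTrying:
-- unwrap the ExceptT, raising the error message as an IO error on Left.
orDie :: MonadIO m => ExceptT String m a -> m a
orDie action = runExceptT action >>= either (liftIO . ioError . userError) pure

main :: IO ()
main = do
  n <- orDie (pure 42 :: ExceptT String IO Int)
  print n   -- 42
  -- A failing action raises an IOException, caught here only to show it.
  r <- try (orDie (throwE "bad journal" :: ExceptT String IO ())) :: IO (Either IOException ())
  putStrLn (either (const "failed with an IO error") (const "succeeded") r)
```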
Misc
saveLatestDates :: LatestDates -> FilePath -> IO ()
Save the given latest date(s) seen in the given data FILE, in a hidden file named .latest.FILE, creating it if needed.
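The .latest.FILE naming can be illustrated with the filepath package that ships with GHC. latestFileFor is a hypothetical helper written for this example; it only computes the name and says nothing about the file's contents.

```haskell
import System.FilePath (splitFileName, (<.>), (</>))

-- Hypothetical helper: the hidden ".latest.FILE" path next to a data file.
latestFileFor :: FilePath -> FilePath
latestFileFor f = dir </> (".latest" <.> name)
  where
    (dir, name) = splitFileName f

main :: IO ()
main = putStrLn (latestFileFor "data/bank.csv")   -- data/.latest.bank.csv
```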
saveLatestDatesForFiles :: [LatestDatesForFile] -> IO ()
Save each file's latest dates.
Re-exported
tmpostingrulep :: Maybe Year -> JournalParser m TMPostingRule
findReader :: MonadIO m => Maybe StorageFormat -> Maybe FilePath -> Maybe (Reader m)
findReader mformat mpath
Find the reader named by mformat, if provided. ("ssv" and "tsv" are recognised as alternate names for the csv reader, which also handles those formats.) Or, if a file path is provided, find the first reader that handles its file extension, if any.
splitReaderPrefix :: PrefixedFilePath -> (Maybe StorageFormat, FilePath)
If a filepath is prefixed by one of the reader names and a colon, split that off. Eg "csv:-" -> (Just "csv", "-"). These reader prefixes can be used to force a specific reader, overriding the file extension.
runJournalParser :: Monad m => JournalParser m a -> Text -> m (Either HledgerParseErrors a)
Run a journal parser in some monad. See also: parseWithState.
module Hledger.Read.Common
module Hledger.Read.InputOptions