| Copyright | © 2015–2018 Megaparsec contributors © 2007 Paolo Martini © 1999–2001 Daan Leijen | 
|---|---|
| License | FreeBSD | 
| Maintainer | Mark Karpov <markkarpov92@gmail.com> | 
| Stability | experimental | 
| Portability | portable | 
| Safe Haskell | None | 
| Language | Haskell2010 | 
Text.Megaparsec
Description
This module includes everything you need to get started writing a parser. If you are new to Megaparsec and don't know where to begin, take a look at the tutorials https://markkarpov.com/learn-haskell.html#megaparsec-tutorials.
In addition to the Text.Megaparsec module, which exports and re-exports
 almost everything that you may need, we advise importing
 Text.Megaparsec.Char if you plan to work with a stream of Char tokens
 or Text.Megaparsec.Byte if you intend to parse binary data.
It is common to start working with the library by defining a type synonym like this:
type Parser = Parsec Void Text
                     ^    ^
                     |    |
Custom error component    Type of input
Then you can write type signatures like Parser Int, for a parser that
 returns an Int, for example.
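As a minimal sketch of this convention, the following defines the synonym and two parsers with explicit signatures. It assumes String input rather than Text, purely to keep the example free of extra dependencies; the names int and pair are illustrative and not part of the library.

```haskell
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (char)
import qualified Text.Megaparsec.Char.Lexer as L

-- Assumption: String is used as the input type instead of Text.
type Parser = Parsec Void String

-- A parser that returns an Int, with an explicit top-level signature.
int :: Parser Int
int = L.decimal

-- Two integers separated by a comma, e.g. "3,4".
pair :: Parser (Int, Int)
pair = (,) <$> int <* char ',' <*> int
```

With these definitions, parseMaybe pair "3,4" evaluates to Just (3,4).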
Similarly (since it's known to cause confusion), you should use
 the ParseError type parametrized like this:
ParseError Char Void
           ^    ^
           |    |
  Token type    Custom error component (the same you used in Parser)
Token type for String and Text (strict and lazy) is Char,
 for ByteStrings it's Word8.
Megaparsec uses some type-level machinery to provide flexibility without
 compromising on type safety. Thus type signatures are sometimes necessary
 to avoid ambiguous types. If you're seeing an error message that reads
 like “Type variable e0 is ambiguous …”, you need to give an explicit
 signature to your parser to resolve the ambiguity. It's a good idea to
 provide type signatures for all top-level definitions.
Megaparsec is capable of a lot. Apart from this standard functionality, you can parse permutation phrases with Text.Megaparsec.Perm, parse expressions with Text.Megaparsec.Expr, and do lexing with Text.Megaparsec.Char.Lexer and Text.Megaparsec.Byte.Lexer. These modules should be imported explicitly along with the modules mentioned above.
Synopsis
- module Text.Megaparsec.Pos
- module Text.Megaparsec.Error
- module Text.Megaparsec.Stream
- module Control.Monad.Combinators
- data State s = State { stateInput :: s, statePos :: NonEmpty SourcePos, stateTokensProcessed :: !Int, stateTabWidth :: Pos }
- type Parsec e s = ParsecT e s Identity
- data ParsecT e s m a
- parse :: Parsec e s a -> String -> s -> Either (ParseError (Token s) e) a
- parseMaybe :: (Ord e, Stream s) => Parsec e s a -> s -> Maybe a
- parseTest :: (ShowErrorComponent e, Ord (Token s), ShowToken (Token s), Show a) => Parsec e s a -> s -> IO ()
- parseTest' :: (ShowErrorComponent e, ShowToken (Token s), LineToken (Token s), Show a, Stream s) => Parsec e s a -> s -> IO ()
- runParser :: Parsec e s a -> String -> s -> Either (ParseError (Token s) e) a
- runParser' :: Parsec e s a -> State s -> (State s, Either (ParseError (Token s) e) a)
- runParserT :: Monad m => ParsecT e s m a -> String -> s -> m (Either (ParseError (Token s) e) a)
- runParserT' :: Monad m => ParsecT e s m a -> State s -> m (State s, Either (ParseError (Token s) e) a)
- class (Stream s, Alternative m, MonadPlus m) => MonadParsec e s m | m -> e s where
- (<?>) :: MonadParsec e s m => m a -> String -> m a
- unexpected :: MonadParsec e s m => ErrorItem (Token s) -> m a
- customFailure :: MonadParsec e s m => e -> m a
- match :: MonadParsec e s m => m a -> m (Tokens s, a)
- region :: MonadParsec e s m => (ParseError (Token s) e -> ParseError (Token s) e) -> m a -> m a
- takeRest :: MonadParsec e s m => m (Tokens s)
- atEnd :: MonadParsec e s m => m Bool
- getInput :: MonadParsec e s m => m s
- setInput :: MonadParsec e s m => s -> m ()
- getPosition :: MonadParsec e s m => m SourcePos
- getNextTokenPosition :: forall e s m. MonadParsec e s m => m (Maybe SourcePos)
- setPosition :: MonadParsec e s m => SourcePos -> m ()
- pushPosition :: MonadParsec e s m => SourcePos -> m ()
- popPosition :: MonadParsec e s m => m ()
- getTokensProcessed :: MonadParsec e s m => m Int
- setTokensProcessed :: MonadParsec e s m => Int -> m ()
- getTabWidth :: MonadParsec e s m => m Pos
- setTabWidth :: MonadParsec e s m => Pos -> m ()
- setParserState :: MonadParsec e s m => State s -> m ()
- dbg :: forall e s m a. (Stream s, ShowToken (Token s), ShowErrorComponent e, Show a) => String -> ParsecT e s m a -> ParsecT e s m a
Re-exports
Note that we re-export monadic combinators from
 Control.Monad.Combinators because these are more efficient than
 Applicative-based ones. Thus many and some may clash with the
 functions from Control.Applicative. You need to hide the functions like
 this:
import Control.Applicative hiding (many, some)
Also note that you can import Control.Monad.Combinators.NonEmpty if you
 wish that combinators like some return NonEmpty lists. The module
 lives in the parser-combinators package (you need at least version
 0.4.0).
This module is intended to be imported qualified:
import qualified Control.Monad.Combinators.NonEmpty as NE
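The two import patterns above can be combined as in the following sketch. It assumes String input and a parser-combinators version that provides Control.Monad.Combinators.NonEmpty; the names digits and digits1 are illustrative only.

```haskell
import Control.Applicative hiding (many, some)
import qualified Control.Monad.Combinators.NonEmpty as NE
import Data.List.NonEmpty (NonEmpty (..))
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (digitChar)

-- Assumption: String is used as the input type.
type Parser = Parsec Void String

-- many here is the monadic version re-exported by Text.Megaparsec;
-- the Control.Applicative import hides its own many and some to avoid a clash.
digits :: Parser String
digits = many digitChar

-- NE.some records in the result type that at least one element was parsed.
digits1 :: Parser (NonEmpty Char)
digits1 = NE.some digitChar
```

Here parseMaybe digits1 "123" evaluates to Just ('1' :| "23"), while parseMaybe digits1 "" is Nothing.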
module Text.Megaparsec.Pos
module Text.Megaparsec.Error
module Text.Megaparsec.Stream
module Control.Monad.Combinators
Data types
This is Megaparsec's state, parametrized over the stream type s.
Constructors
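| State | |
| Fields | stateInput :: s; statePos :: NonEmpty SourcePos; stateTokensProcessed :: !Int; stateTabWidth :: Pos |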
Instances
| Eq s => Eq (State s) | |
| Data s => Data (State s) | |
| Defined in Text.Megaparsec.State | |
| Show s => Show (State s) | |
| Generic (State s) | |
| NFData s => NFData (State s) | |
| Defined in Text.Megaparsec.State | |
| type Rep (State s) | |
| Defined in Text.Megaparsec.State type Rep (State s) = D1 (MetaData "State" "Text.Megaparsec.State" "megaparsec-6.5.0-4VKBtSFJhna3iLscGKIZAP" False) (C1 (MetaCons "State" PrefixI True) ((S1 (MetaSel (Just "stateInput") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 s) :*: S1 (MetaSel (Just "statePos") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 (NonEmpty SourcePos))) :*: (S1 (MetaSel (Just "stateTokensProcessed") SourceUnpack SourceStrict DecidedStrict) (Rec0 Int) :*: S1 (MetaSel (Just "stateTabWidth") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 Pos)))) | |
ParsecT e s m a is a parser with custom data component of error e, stream type s, underlying monad m and return type a.
Running parser
Arguments
| :: Parsec e s a | Parser to run | 
| -> String | Name of source file | 
| -> s | Input for parser | 
| -> Either (ParseError (Token s) e) a | 
parse p file input runs parser p over Identity (see runParserT
 if you're using the ParsecT monad transformer; parse itself is just a
 synonym for runParser). It returns either a ParseError (Left) or a
 value of type a (Right). parseErrorPretty can be used to turn
 ParseError into the string representation of the error message. See
 Text.Megaparsec.Error if you need to do more advanced error analysis.
main = case (parse numbers "" "11,2,43") of
         Left err -> putStr (parseErrorPretty err)
         Right xs -> print (sum xs)
numbers = integer `sepBy` char ','
parseMaybe :: (Ord e, Stream s) => Parsec e s a -> s -> Maybe a
parseMaybe p input runs the parser p on input and returns the
 result inside Just on success and Nothing on failure. This function
 also parses eof, so if the parser doesn't consume all of its input, it
 will fail.
The function is supposed to be useful for lightweight parsing, where error messages (and thus file names) are not important and the entire input should be parsed. For example, it can be used when parsing of a single number according to a specification of its format is desired.
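The difference between parse and parseMaybe with respect to leftover input can be sketched as follows. The example assumes String input and uses decimal from Text.Megaparsec.Char.Lexer for the numbers; sumPrefix and sumAll are hypothetical helper names.

```haskell
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (char)
import qualified Text.Megaparsec.Char.Lexer as L

-- Assumption: String is used as the input type.
type Parser = Parsec Void String

-- Comma-separated non-negative integers, e.g. "11,2,43".
numbers :: Parser [Integer]
numbers = L.decimal `sepBy` char ','

-- parse does not require the whole input to be consumed.
sumPrefix :: String -> Maybe Integer
sumPrefix input = either (const Nothing) (Just . sum) (parse numbers "" input)

-- parseMaybe additionally demands end of input after the parser.
sumAll :: String -> Maybe Integer
sumAll input = sum <$> parseMaybe numbers input
```

For the input "11,2,43 rest", sumPrefix returns Just 56 because the trailing text is simply left unconsumed, while sumAll returns Nothing.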
Arguments
| :: (ShowErrorComponent e, Ord (Token s), ShowToken (Token s), Show a) | |
| => Parsec e s a | Parser to run | 
| -> s | Input for parser | 
| -> IO () | 
The expression parseTest p input applies the parser p against
 input input and prints the result to stdout. Useful for testing.
Arguments
| :: (ShowErrorComponent e, ShowToken (Token s), LineToken (Token s), Show a, Stream s) | |
| => Parsec e s a | Parser to run | 
| -> s | Input for parser | 
| -> IO () | 
A version of parseTest that also prints the offending line in parse
 errors.
Since: megaparsec-6.0.0
Arguments
| :: Parsec e s a | Parser to run | 
| -> String | Name of source file | 
| -> s | Input for parser | 
| -> Either (ParseError (Token s) e) a | 
runParser p file input runs parser p on the input stream of
 tokens input, obtained from source file. The file is only used in
 error messages and may be the empty string. Returns either a ParseError
 (Left) or a value of type a (Right).
parseFromFile p file = runParser p file <$> readFile file
Arguments
| :: Monad m | |
| => ParsecT e s m a | Parser to run | 
| -> String | Name of source file | 
| -> s | Input for parser | 
| -> m (Either (ParseError (Token s) e) a) | 
runParserT p file input runs parser p on the input list of tokens
 input, obtained from source file. The file is only used in error
 messages and may be the empty string. Returns a computation in the
 underlying monad m that returns either a ParseError (Left) or a
 value of type a (Right).
Arguments
| :: Monad m | |
| => ParsecT e s m a | Parser to run | 
| -> State s | Initial state | 
| -> m (State s, Either (ParseError (Token s) e) a) | 
This function is similar to runParserT, but like runParser' it
 accepts and returns parser state. This is thus the most general way to
 run a parser.
Since: megaparsec-4.2.0
Primitive combinators
class (Stream s, Alternative m, MonadPlus m) => MonadParsec e s m | m -> e s where
Type class describing monads that implement the full set of primitive parsers.
Note carefully that the following primitives are “fast” and should be
 taken advantage of as much as possible if your aim is a fast parser:
 tokens, takeWhileP, takeWhile1P, and takeP.
Minimal complete definition
failure, fancyFailure, label, try, lookAhead, notFollowedBy, withRecovery, observing, eof, token, tokens, takeWhileP, takeWhile1P, takeP, getParserState, updateParserState
Methods
Arguments
| :: Maybe (ErrorItem (Token s)) | Unexpected item (if any) | 
| -> Set (ErrorItem (Token s)) | Expected items | 
| -> m a | 
The most general way to stop parsing and report a trivial
 ParseError.
Since: megaparsec-6.0.0
Arguments
| :: Set (ErrorFancy e) | Fancy error components | 
| -> m a | 
The most general way to stop parsing and report a fancy ParseError.
 To report a single custom parse error, see customFailure.
Since: megaparsec-6.0.0
label :: String -> m a -> m a
The parser label name p behaves as parser p, but whenever the
 parser p fails without consuming any input, it replaces names of
 “expected” tokens with the name name.
hidden p behaves just like parser p, but it doesn't show any
 “expected” tokens in the error message when p fails.
The parser try p behaves like parser p, except that it
 backtracks the parser state when p fails (either consuming input or
 not).
This combinator is used whenever arbitrary look ahead is needed. Since
 it pretends that it hasn't consumed any input when p fails, the
 (<|>) combinator will try its second alternative even if the first
 parser failed while consuming input.
For example, here is a parser that is supposed to parse the word “let” or the word “lexical”:
>>> parseTest (string "let" <|> string "lexical") "lexical"
1:1:
unexpected "lex"
expecting "let"
What happens here? The first parser consumes “le” and fails (because it
 doesn't see a “t”). The second parser, however, isn't tried, since the
 first parser has already consumed some input! try fixes this behavior
 and allows backtracking to work:
>>> parseTest (try (string "let") <|> string "lexical") "lexical"
"lexical"
try also improves error messages in case of overlapping alternatives,
 because Megaparsec's hint system can be used:
>>> parseTest (try (string "let") <|> string "lexical") "le"
1:1:
unexpected "le"
expecting "let" or "lexical"
Please note that as of Megaparsec 4.4.0, string backtracks
 automatically (see tokens), so it does not need try. However, the
 examples above demonstrate the idea behind try so well that it was
 decided to keep them. You still need to use try when your
 alternatives are complex, composite parsers.
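To illustrate the last point, here is a sketch with two composite alternatives that share a prefix. It assumes String input; the Stmt type and the parsers ident, assign and call are made up for the example.

```haskell
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (char, letterChar, string)
import qualified Text.Megaparsec.Char.Lexer as L

-- Assumption: String is used as the input type.
type Parser = Parsec Void String

data Stmt = Assign String Integer | Call String
  deriving (Eq, Show)

ident :: Parser String
ident = some letterChar

-- An assignment such as "x=42".
assign :: Parser Stmt
assign = Assign <$> ident <* char '=' <*> L.decimal

-- A call such as "f()".
call :: Parser Stmt
call = Call <$> ident <* string "()"

-- Without try, assign consumes the identifier before failing,
-- so call is never attempted on input like "f()".
withoutTry :: Parser Stmt
withoutTry = assign <|> call

-- With try, the state is restored and call gets its chance.
withTry :: Parser Stmt
withTry = try assign <|> call
```

On the input "f()", parseMaybe withoutTry gives Nothing, whereas parseMaybe withTry gives Just (Call "f").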
lookAhead :: m a -> m a
If p in lookAhead p succeeds (either consuming input or not), the
 whole parser behaves like p succeeded without consuming anything
 (parser state is not updated as well). If p fails, lookAhead has no
 effect, i.e. it will fail consuming input if p fails consuming input.
 Combine with try if this is undesirable.
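A small sketch of this behavior, assuming String input (peekThenWord is a hypothetical name): the character seen by lookAhead is still available to the parser that follows.

```haskell
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (letterChar)

-- Assumption: String is used as the input type.
type Parser = Parsec Void String

-- Peek at the first letter without consuming it, then parse the whole word.
peekThenWord :: Parser (Char, String)
peekThenWord = do
  c <- lookAhead letterChar
  w <- some letterChar
  pure (c, w)
```

For example, parseMaybe peekThenWord "abc" evaluates to Just ('a',"abc"): the peeked 'a' is also the first character of the parsed word.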
notFollowedBy :: m a -> m ()
notFollowedBy p only succeeds when the parser p fails. This
 parser never consumes any input and never modifies parser state. It
 can be used to implement the “longest match” rule.
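The “longest match” rule mentioned above might look like this for keywords. The sketch assumes String input; Lexeme, keyword and lexeme are illustrative names rather than library functions.

```haskell
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (alphaNumChar, string)

-- Assumption: String is used as the input type.
type Parser = Parsec Void String

data Lexeme = Keyword String | Identifier String
  deriving (Eq, Show)

-- A keyword must not be immediately followed by another identifier character.
-- try is needed because string has already consumed input when
-- notFollowedBy rejects the match.
keyword :: String -> Parser String
keyword w = try (string w <* notFollowedBy alphaNumChar)

-- "let" is a keyword, but "letter" is an ordinary identifier.
lexeme :: Parser Lexeme
lexeme = Keyword <$> keyword "let" <|> Identifier <$> some alphaNumChar
```

With these definitions, parseMaybe lexeme "let" yields Just (Keyword "let") and parseMaybe lexeme "letter" yields Just (Identifier "letter").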
Arguments
| :: (ParseError (Token s) e -> m a) | How to recover from failure | 
| -> m a | Original parser | 
| -> m a | Parser that can recover from failures | 
withRecovery r p allows parsing to continue even if parser p
 fails. In this case r is called with the actual ParseError as its
 argument. Typical usage is to return a value signifying failure to
 parse this particular object and to consume some part of the input up
 to the point where the next object starts.
Note that if r fails, the original error message is reported as if
 withRecovery were not used. The recovering parser r cannot influence
 error messages in any way.
Since: megaparsec-4.4.0
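The typical usage described above can be sketched like this: a bad item is replaced by Nothing and the input is skipped up to the next separator. The example assumes String input and a ';' separator; item and items are hypothetical names.

```haskell
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (char)
import qualified Text.Megaparsec.Char.Lexer as L

-- Assumption: String is used as the input type.
type Parser = Parsec Void String

-- One item: a decimal number, or Nothing if it could not be parsed.
item :: Parser (Maybe Integer)
item = withRecovery recover (Just <$> L.decimal)
  where
    -- Ignore the error, skip to the next ';' and signal failure with Nothing.
    recover _err = Nothing <$ takeWhileP Nothing (/= ';')

-- Items separated by semicolons, e.g. "1;x;3".
items :: Parser [Maybe Integer]
items = item `sepBy` char ';'
```

Here parseMaybe items "1;x;3" evaluates to Just [Just 1,Nothing,Just 3]: parsing continues past the malformed middle item.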
Arguments
| :: m a | The parser to run | 
| -> m (Either (ParseError (Token s) e) a) | 
observing p allows you to “observe” failure of the p parser, should
 it happen, without actually ending parsing, but instead getting the
 ParseError in Left. On success parsed value is returned in Right
 as usual. Note that this primitive just allows you to observe parse
 errors as they happen, it does not backtrack or change how the p
 parser works in any way.
Since: megaparsec-5.1.0
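A sketch of observing that also shows the lack of backtracking, assuming String input (attempt is a hypothetical helper that discards the error and keeps only success or failure):

```haskell
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (char)

-- Assumption: String is used as the input type.
type Parser = Parsec Void String

-- Turn failure of p into a Nothing result instead of ending the parse.
-- Input consumed by p before it failed stays consumed.
attempt :: Parser a -> Parser (Maybe a)
attempt p = either (const Nothing) Just <$> observing p

-- Try to parse "ab", then return whatever input is left.
abThenRest :: Parser (Maybe Char, String)
abThenRest = (,) <$> attempt (char 'a' *> char 'b') <*> takeRest
```

On the input "ac" the inner parser fails after consuming 'a', so parseMaybe abThenRest "ac" gives Just (Nothing,"c") rather than returning "ac" as the rest.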
This parser only succeeds at the end of the input.
Arguments
| :: (Token s -> Either (Maybe (ErrorItem (Token s)), Set (ErrorItem (Token s))) a) | Matching function for the token to parse; it also allows constructing an arbitrary error message on failure. The items in the tuple are the unexpected item (if any) and the expected items | 
| -> Maybe (Token s) | Token to report when input stream is empty | 
| -> m a | 
The parser token test mrep accepts a token t with result x
 when the function test t returns Right x. mrep may provide
 representation of the token to report in error messages when the input
 stream is empty.
This is the most primitive combinator for accepting tokens. For
 example, the satisfy parser is implemented as:
satisfy f = token testChar Nothing
  where
    testChar x =
      if f x
        then Right x
        else Left (pure (Tokens (x:|[])), Set.empty)
Arguments
| :: (Tokens s -> Tokens s -> Bool) | Predicate to check equality of chunks | 
| -> Tokens s | Chunk of input to match against | 
| -> m (Tokens s) | 
The parser tokens test chk parses a chunk of input chk and returns it.
 The supplied predicate test is used to check equality of given and parsed
 chunks after a candidate chunk of correct length is fetched from the
 stream.
This can be used for example to write string:
string = tokens (==)
Note that beginning from Megaparsec 4.4.0, this is an auto-backtracking
 primitive, which means that if it fails, it never consumes any input.
 This is done to make its consumption model match how error messages for
 this primitive are reported (which becomes an important thing as user
 gets more control with primitives like withRecovery):
>>> parseTest (string "abc") "abd"
1:1:
unexpected "abd"
expecting "abc"
This means, in particular, that it's no longer necessary to use try
 with tokens-based parsers, such as string and
 string'. This feature does not affect
 performance in any way.
Arguments
| :: Maybe String | Name for a single token in the row | 
| -> (Token s -> Bool) | Predicate to use to test tokens | 
| -> m (Tokens s) | A chunk of matching tokens | 
Parse zero or more tokens for which the supplied predicate holds.
 Try to use this as much as possible because for many streams the
 combinator is much faster than parsers built with many and
 satisfy.
The following equations should clarify the behavior:
takeWhileP (Just "foo") f = many (satisfy f <?> "foo")
takeWhileP Nothing f = many (satisfy f)
The combinator never fails, although it may parse an empty chunk.
Since: megaparsec-6.0.0
Arguments
| :: Maybe String | Name for a single token in the row | 
| -> (Token s -> Bool) | Predicate to use to test tokens | 
| -> m (Tokens s) | A chunk of matching tokens | 
Similar to takeWhileP, but fails if it can't parse at least one
 token. Note that the combinator either succeeds or fails without
 consuming any input, so try is not necessary with it.
Since: megaparsec-6.0.0
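The two combinators can be combined as in the following sketch, which assumes String input (so chunks are Strings as well); spaces, digits and padded are illustrative names.

```haskell
import Data.Char (isDigit)
import Data.Void (Void)
import Text.Megaparsec

-- Assumption: String is used as the input type.
type Parser = Parsec Void String

-- Zero or more spaces; never fails, may return an empty chunk.
spaces :: Parser String
spaces = takeWhileP (Just "space") (== ' ')

-- One or more digits; fails without consuming input if there are none.
digits :: Parser String
digits = takeWhile1P (Just "digit") isDigit

-- Digits surrounded by optional padding.
padded :: Parser String
padded = spaces *> digits <* spaces
```

For instance, parseMaybe padded "  42 " is Just "42", while parseMaybe padded "  " is Nothing because digits requires at least one token.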
Arguments
| :: Maybe String | Name for a single token in the row | 
| -> Int | How many tokens to extract | 
| -> m (Tokens s) | A chunk of matching tokens | 
Extract the specified number of tokens from the input stream and return them packed as a chunk of stream. If there are not enough tokens in the stream, a parse error will be signaled. It's guaranteed that if the parser succeeds, the requested number of tokens will be returned.
The parser is roughly equivalent to:
takeP (Just "foo") n = count n (anyChar <?> "foo")
takeP Nothing n = count n anyChar
Note that if the combinator fails due to insufficient number of tokens
 in the input stream, it backtracks automatically. No try is necessary
 with takeP.
Since: megaparsec-6.0.0
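A sketch of a fixed-width field parsed with takeP, assuming String input (year and yearOrRest are hypothetical names). The second parser relies on the automatic backtracking described above.

```haskell
import Data.Void (Void)
import Text.Megaparsec

-- Assumption: String is used as the input type.
type Parser = Parsec Void String

-- A fixed-width field of exactly four tokens, e.g. "2018".
year :: Parser String
year = takeP (Just "year character") 4

-- If fewer than four tokens remain, takeP fails without consuming input,
-- so the alternative can run without try.
yearOrRest :: Parser String
yearOrRest = year <|> takeRest
```

Here parseMaybe year "201" is Nothing, but parseMaybe yearOrRest "201" is Just "201".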
getParserState :: m (State s)
Return the full parser state as a State record.
updateParserState :: (State s -> State s) -> m ()
updateParserState f applies the function f to the parser state.
Derivatives of primitive combinators
(<?>) :: MonadParsec e s m => m a -> String -> m a infix 0
A synonym for label in the form of an operator.
unexpected :: MonadParsec e s m => ErrorItem (Token s) -> m a
The parser unexpected item fails with an error message telling about an unexpected item item without consuming any input.
unexpected item = failure (pure item) Set.empty
customFailure :: MonadParsec e s m => e -> m a
Report a custom parse error. For a more general version, see
 fancyFailure.
Since: megaparsec-6.3.0
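A minimal sketch of a custom error component used with customFailure, assuming String input. The RangeError type and the percent parser are invented for the example; the Ord instance is derived because functions such as parseMaybe require Ord e, as their signatures show.

```haskell
import Data.Void (Void)
import Text.Megaparsec
import qualified Text.Megaparsec.Char.Lexer as L

-- Hypothetical custom error component.
data RangeError = OutOfRange Integer
  deriving (Eq, Ord, Show)

-- Assumption: String is used as the input type.
type Parser = Parsec RangeError String

-- A percentage between 0 and 100; larger values are rejected
-- with a custom parse error.
percent :: Parser Integer
percent = do
  n <- L.decimal
  if n <= 100
    then pure n
    else customFailure (OutOfRange n)
```

With this definition, parseMaybe percent "42" is Just 42 and parseMaybe percent "250" is Nothing. Pretty-printing the custom error would additionally need a ShowErrorComponent instance, which is omitted here.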
match :: MonadParsec e s m => m a -> m (Tokens s, a)
Return both the result of a parse and a chunk of input that was
 consumed during parsing. This relies on the change of the
 stateTokensProcessed value to evaluate how many tokens were consumed.
 If you change that value manually in the argument parser, expect
 trouble.
Since: megaparsec-5.3.0
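As a small sketch, match can return a parsed number together with the exact text it was written as. The example assumes String input; numberWithSource is a hypothetical name.

```haskell
import Data.Void (Void)
import Text.Megaparsec
import qualified Text.Megaparsec.Char.Lexer as L

-- Assumption: String is used as the input type.
type Parser = Parsec Void String

-- Parse a decimal number and also return the consumed chunk of input.
numberWithSource :: Parser (String, Integer)
numberWithSource = match L.decimal
```

For example, parseMaybe numberWithSource "007" evaluates to Just ("007",7), so the leading zeros lost in the numeric value are still available in the chunk.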
Arguments
| :: MonadParsec e s m | |
| => (ParseError (Token s) e -> ParseError (Token s) e) | How to process ParseErrors | 
| -> m a | The “region” that the processing applies to | 
| -> m a | 
Specify how to process ParseErrors that happen inside of this
 wrapper. As a side effect of the current implementation, changing
 errorPos with this combinator will also change the final statePos in
 the parser state. Try to avoid that, because statePos will go out of
 sync with the actual position in the input stream. That is probably OK if
 you finish parsing right after that, but be warned.
Since: megaparsec-5.3.0
takeRest :: MonadParsec e s m => m (Tokens s)
Consume the rest of the input and return it as a chunk. This parser never fails, but may return an empty chunk.
takeRest = takeWhileP Nothing (const True)
Since: megaparsec-6.0.0
atEnd :: MonadParsec e s m => m Bool
Return True when end of input has been reached.
Since: megaparsec-6.0.0
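The following sketch shows takeRest and atEnd in use, assuming String input; keyAndRest and wordsUntilEnd are illustrative names.

```haskell
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (char, letterChar, space)

-- Assumption: String is used as the input type.
type Parser = Parsec Void String

-- Parse a key up to ':' and return it with everything that follows.
keyAndRest :: Parser (String, String)
keyAndRest = do
  key <- takeWhile1P (Just "key character") (/= ':')
  _ <- char ':'
  rest <- takeRest
  pure (key, rest)

-- Collect whitespace-separated words, checking for end of input explicitly.
wordsUntilEnd :: Parser [String]
wordsUntilEnd = do
  done <- atEnd
  if done
    then pure []
    else (:) <$> (some letterChar <* space) <*> wordsUntilEnd
```

Here parseMaybe keyAndRest "name: Mark" gives Just ("name"," Mark"), and parseMaybe wordsUntilEnd "a bc  d" gives Just ["a","bc","d"].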
Parser state combinators
getInput :: MonadParsec e s m => m s
Return the current input.
setInput :: MonadParsec e s m => s -> m ()
getPosition :: MonadParsec e s m => m SourcePos
Return the current source position.
See also: setPosition, pushPosition, popPosition, and SourcePos.
getNextTokenPosition :: forall e s m. MonadParsec e s m => m (Maybe SourcePos)
Get the position where the next token in the stream begins. If the
 stream is empty, return Nothing.
Since: megaparsec-5.3.0
setPosition :: MonadParsec e s m => SourcePos -> m ()
setPosition pos sets the current source position to pos.
See also: getPosition, pushPosition, popPosition, and SourcePos.
pushPosition :: MonadParsec e s m => SourcePos -> m ()
Push a position onto the stack of positions and continue parsing working with this position. Useful for working with include files and the like.
See also: getPosition, setPosition, popPosition, and SourcePos.
Since: megaparsec-5.0.0
popPosition :: MonadParsec e s m => m ()
Pop a position from the stack of positions unless it only contains one
 element (in that case the stack of positions remains the same). This is
 how to return to the previous source file after pushPosition.
See also: getPosition, setPosition, pushPosition, and SourcePos.
Since: megaparsec-5.0.0
getTokensProcessed :: MonadParsec e s m => m Int
Get the number of tokens processed so far.
Since: megaparsec-6.0.0
setTokensProcessed :: MonadParsec e s m => Int -> m ()
Set the number of tokens processed so far.
Since: megaparsec-6.0.0
getTabWidth :: MonadParsec e s m => m Pos
Return the tab width. The default tab width is equal to
 defaultTabWidth. You can set a different tab width with the help of
 setTabWidth.
setTabWidth :: MonadParsec e s m => Pos -> m ()
Set tab width. If the argument of the function is not a positive
 number, defaultTabWidth will be used.
setParserState :: MonadParsec e s m => State s -> m ()
setParserState st sets the parser state to st.
Debugging
Arguments
| :: (Stream s, ShowToken (Token s), ShowErrorComponent e, Show a) | |
| => String | Debugging label | 
| -> ParsecT e s m a | Parser to debug | 
| -> ParsecT e s m a | Parser that prints debugging messages | 
dbg label p parser works exactly like p, but when it's evaluated
 it also prints information useful for debugging. The label is only used
 to refer to this parser in the debugging output. This combinator uses the
 trace function from Debug.Trace under the hood.
Typical usage is to wrap every sub-parser in a misbehaving parser with
 dbg, assigning meaningful labels. Then give it a shot and go through the
 print-out. As of the current version, this combinator prints all available
 information except for hints, which are probably only interesting to
 the maintainer of Megaparsec itself and may be quite verbose to output in
 general. Let me know if you would like to be able to see hints in the
 debugging output.
The output itself is pretty self-explanatory, although the following abbreviations should be clarified (they are derived from the low-level source code):
- COK—“consumed OK”. The parser consumed input and succeeded.
- CERR—“consumed error”. The parser consumed input and failed.
- EOK—“empty OK”. The parser succeeded without consuming input.
- EERR—“empty error”. The parser failed without consuming input.
Finally, it's not possible to lift this function into some monad
 transformers without introducing surprising behavior (e.g. unexpected
 state backtracking) or adding otherwise redundant constraints (e.g.
 Show instance for state), so this helper is only available for
 ParsecT monad, not MonadParsec in general.
Since: megaparsec-5.1.0