| Copyright | (C) 2026 - Eitan Chatav |
|---|---|
| License | BSD-style (see the file LICENSE) |
| Maintainer | Eitan Chatav <eitan.chatav@gmail.com> |
| Stability | provisional |
| Portability | non-portable |
| Safe Haskell | None |
| Language | Haskell2010 |
Control.Lens.Grammar.Machine
Contents
Description
Synopsis
- class Matching word pattern | pattern -> word where
- transducer :: Bnf (RegEx token) -> Transducer token
- parseForest :: Categorized token => Transducer token -> [token] -> ([Tree (String, Int, Int, [token])], [token])
- languageSample :: (TokenAlgebra token (f token), Applicative f) => Transducer token -> f [[token]]
- expectNext :: Categorized token => Transducer token -> [token] -> TokenClass token
- unreachableRules :: Transducer token -> Set String
- data Transducer token = Transducer {
- transducerRelations :: IntMap (TransducerStep token)
- transducerRules :: Map String (IntSet, Bool)
- transducerStarts :: IntSet
- data TransducerStep token
Matching
class Matching word pattern | pattern -> word where Source #
Does a word match a pattern?
Instances
| Matching String RegBnf Source # | |
| Matching String RegString Source # | |
| Matching s k => Matching s (Grammor k a b) Source # | |
| AsEmpty s => Matching s (Parsor s [] a b) Source # | |
| Matching s (APrism s t a b) Source # | |
| Categorized token => Matching [token] (Bnf (RegEx token)) Source # | |
| Categorized token => Matching [token] (RegEx token) Source # | |
| Categorized token => Matching [token] (Transducer token) Source # | |
Defined in Control.Lens.Grammar.Machine Methods (=~) :: [token] -> Transducer token -> Bool Source # | |
Transducer
transducer :: Bnf (RegEx token) -> Transducer token Source #
Compile a RegExtended Bnf into a Transducer,
using a combination of Thompson's algorithm for regular expressions
and Earley's algorithm for context-free grammars. See Jim & Mandelbaum,
Efficient Earley Parsing with Regular Right-hand Sides,
and McIlroy, Enumerating the strings of regular languages.
A transducer is a form of finite state machine
that can be run in various ways like
=~, expectNext, languageSample, parseForest & unreachableRules.
Arguments
| :: Categorized token | |
| => Transducer token | |
| -> [token] | string |
| -> ([Tree (String, Int, Int, [token])], [token]) | parse forest & remaining unparsed tokens |
The parse forest of a string of tokens.
Arguments
| :: (TokenAlgebra token (f token), Applicative f) | |
| => Transducer token | transducer |
| -> f [[token]] |
languageSample lazily produces all words in a language from shortest to longest.
However since TokenClasses can resolve to infinite sets of tokens,
and the relevant case of Char tokens while not infinite is huge,
it samples tokens in an Applicative TokenAlgebra.
Arguments
| :: Categorized token | |
| => Transducer token | |
| -> [token] | prefix |
| -> TokenClass token |
What token is expected next?
The scanner frontier, expectNext returns the TokenClass
that can be scanned next after the given input prefix.
A falseB result means the current chart has no scanner transitions,
i.e. the prefix is a dead end for recognition.
unreachableRules :: Transducer token -> Set String Source #
Rule names that can never be entered from the start expression — dead productions. A non-empty result is a grammar-hygiene warning: those rules can be deleted without changing the recognized language.
data Transducer token Source #
A Transducer is a tuple
T = (Σ, Δ, Q, I ⊆ Q, F ∈ Q, transition ⊆ Q × (Σ ∪ ∆) × Q, output ⊆ Q × ∆)
Σis a (possibly infinite) set of terminal token classes, represented byTokenClasses.Δis a finite set of nonterminals, represented by the key set oftransducerRules.Qis a set of states, which is represented by the key set oftransducerRelations.Iare initial states represented bytransducerStarts.Fis a final state represented by0.transitionis a relation represented bytransducerRelationswithTransitionTokenClassandTransitionNonTerminaltransitions.outputis a relation represented bytransducerRelationswithEmitNonTerminaloutputs.
Constructors
| Transducer | |
Fields
| |
Instances
| Categorized token => Matching [token] (Transducer token) Source # | |
Defined in Control.Lens.Grammar.Machine Methods (=~) :: [token] -> Transducer token -> Bool Source # | |
data TransducerStep token Source #
A TransducerStep in a Transducer.
Constructors
| TransitionTokenClass (TokenClass token) IntSet | |
| TransitionNonTerminal String IntSet | |
| EmitNonTerminal String |