Copyright	(C) 2025 - Eitan Chatav
License	BSD-style (see the file LICENSE)
Maintainer	Eitan Chatav <eitan.chatav@gmail.com>
Stability	provisional
Portability	non-portable
Safe Haskell	None
Language	Haskell2010

Text.Grammar.Distributor

Contents

Grammar
Generators
RegEx

Description

See Joachim Breitner, Showcasing Applicative for idea to unify grammars.

Synopsis

type Grammar a = forall (p :: Type -> Type -> Type). Grammatical p => p a a
type Grammarr a b = forall (p :: Type -> Type -> Type). Grammatical p => p a a -> p b b
class (Alternator p, Filtrator p, Tokenized Char Char p, forall t. t ~ p () () => IsString t) => Grammatical (p :: Type -> Type -> Type) where
- inClass :: String -> p Char Char
- notInClass :: String -> p Char Char
- inCategory :: GeneralCategory -> p Char Char
- notInCategory :: GeneralCategory -> p Char Char
- rule :: String -> p a a -> p a a
- ruleRec :: String -> (p a a -> p a a) -> p a a
genReadS :: Grammar a -> ReadS a
readGrammar :: Grammar a -> String -> [a]
genShowS :: Alternative f => Grammar a -> a -> f ShowS
showGrammar :: Alternative f => Grammar a -> a -> f String
genRegEx :: Grammar a -> RegEx
genGrammar :: Grammar a -> [(String, RegEx)]
printGrammar :: Grammar a -> IO ()
data RegEx
- = Terminal String
- | Sequence RegEx RegEx
- | Fail
- | Alternate RegEx RegEx
- | KleeneOpt RegEx
- | KleeneStar RegEx
- | KleenePlus RegEx
- | AnyChar
- | InClass String
- | NotInClass String
- | InCategory GeneralCategory
- | NotInCategory GeneralCategory
- | NonTerminal String
regexNorm :: RegEx -> RegEx
regexParse :: String -> RegEx
regexString :: RegEx -> String
regexGrammar :: Grammar RegEx

Grammar

type Grammar a = forall (p :: Type -> Type -> Type). Grammatical p => p a a Source #

Grammar is a Backus-Naur form grammar, extended by regular expressions, embedded in Haskell, with combinators:

pattern matching >?, >?<
alternation <|>
sequencing >*<, >*, *<
Kleene quantifiers optionalP, manyP, someP
any character anyToken
regular predicates inClass, notInClass, inCategory, notInCategory
nonregular predicate satisfy
terminal strings tokens, fromString and -XOverloadedStrings
nonterminal rules rule, ruleRec
and more.

To see an example of a Grammar, look at the source of regexGrammar.

type Grammarr a b = forall (p :: Type -> Type -> Type). Grammatical p => p a a -> p b b Source #

A Grammarr is just a function of Grammars, useful for expressing one in terms of another Grammar. The arr is for arrow; and it should be pronounced like a pirate.

class (Alternator p, Filtrator p, Tokenized Char Char p, forall t. t ~ p () () => IsString t) => Grammatical (p :: Type -> Type -> Type) where Source #

One can create new generators from a Grammar by defining instances of Grammatical. For instance, one could create generators for Parsec style parsers, and use rule for labeling of parse errors.

A Grammatical Profunctor is a partial distributor, being an Alternator & Filtrator. It is also Tokenized with Char input & output tokens, and IsString with the property:

fromString = tokens

Grammatical has defaults for methods inClass, notInClass, inCategory, notInCategory in terms of satisfy; and rule & ruleRec in terms of id & fix.

Minimal complete definition

Nothing

Methods

inClass :: String -> p Char Char Source #

Only characters which are in the given String.

notInClass :: String -> p Char Char Source #

Only characters which are not in the given String.

inCategory :: GeneralCategory -> p Char Char Source #

Only characters which are in the given GeneralCategory.

notInCategory :: GeneralCategory -> p Char Char Source #

Only characters which are not in the given GeneralCategory.

rule :: String -> p a a -> p a a Source #

A nonterminal rule.

ruleRec :: String -> (p a a -> p a a) -> p a a Source #

A recursive, nonterminal rule.

Instances

Instances details

(Monad f, Alternative f, Filterable f, Cons s s Char Char) => Grammatical (Parsor s f) Source #
Instance details Defined in Text.Grammar.Distributor Methods inClass :: String -> Parsor s f Char Char Source # notInClass :: String -> Parsor s f Char Char Source # inCategory :: GeneralCategory -> Parsor s f Char Char Source # notInCategory :: GeneralCategory -> Parsor s f Char Char Source # rule :: String -> Parsor s f a a -> Parsor s f a a Source # ruleRec :: String -> (Parsor s f a a -> Parsor s f a a) -> Parsor s f a a Source #
(Alternative f, Cons s s Char Char) => Grammatical (Printor s f) Source #
Instance details Defined in Text.Grammar.Distributor Methods inClass :: String -> Printor s f Char Char Source # notInClass :: String -> Printor s f Char Char Source # inCategory :: GeneralCategory -> Printor s f Char Char Source # notInCategory :: GeneralCategory -> Printor s f Char Char Source # rule :: String -> Printor s f a a -> Printor s f a a Source # ruleRec :: String -> (Printor s f a a -> Printor s f a a) -> Printor s f a a Source #

Generators

genReadS :: Grammar a -> ReadS a Source #

Generate a ReadS parser from a Grammar.

readGrammar :: Grammar a -> String -> [a] Source #

Use a Grammar to parse a String.

genShowS :: Alternative f => Grammar a -> a -> f ShowS Source #

Generate ShowS printers from a Grammar.

showGrammar :: Alternative f => Grammar a -> a -> f String Source #

Use a Grammar to print Strings.

genRegEx :: Grammar a -> RegEx Source #

Generate a RegEx from a Grammar. This will infinite loop if your Grammar includes a ruleRec, otherwise it will inline all rules and produce a regular expression.

genGrammar :: Grammar a -> [(String, RegEx)] Source #

Generate a context free grammar, consisting of "start" & named RegEx rules, from a Grammar.

printGrammar :: Grammar a -> IO () Source #

Print a Grammar.

RegEx

data RegEx Source #

A version of regular expressions extended by nonterminals.

Constructors

Terminal String	abc123etc\.
Sequence RegEx RegEx	xy
Fail	\q
Alternate RegEx RegEx	x\|y
KleeneOpt RegEx	x?
KleeneStar RegEx	x*
KleenePlus RegEx	x+
AnyChar	.
InClass String	[abc]
NotInClass String	[^abc]
InCategory GeneralCategory	\p{Lu}
NotInCategory GeneralCategory	\P{Ll}
NonTerminal String	\q{rule-name}

Instances

Instances details

Generic RegEx Source #

Instance details

Defined in Text.Grammar.Distributor

Associated Types

type Rep RegEx

Instance details

Defined in Text.Grammar.Distributor

Methods

from :: RegEx -> Rep RegEx x #

to :: Rep RegEx x -> RegEx #

Show RegEx Source #

Instance details

Defined in Text.Grammar.Distributor

Methods

showsPrec :: Int -> RegEx -> ShowS #

show :: RegEx -> String #

showList :: [RegEx] -> ShowS #

Eq RegEx Source #

Instance details

Defined in Text.Grammar.Distributor

Methods

(==) :: RegEx -> RegEx -> Bool #

(/=) :: RegEx -> RegEx -> Bool #

Ord RegEx Source #

Instance details

Defined in Text.Grammar.Distributor

Methods

compare :: RegEx -> RegEx -> Ordering #

(<) :: RegEx -> RegEx -> Bool #

(<=) :: RegEx -> RegEx -> Bool #

(>) :: RegEx -> RegEx -> Bool #

(>=) :: RegEx -> RegEx -> Bool #

max :: RegEx -> RegEx -> RegEx #

min :: RegEx -> RegEx -> RegEx #

type Rep RegEx Source #

Instance details

Defined in Text.Grammar.Distributor

regexNorm :: RegEx -> RegEx Source #

Normalize a RegEx.

>>> regexNorm (Sequence (Terminal "abc") (Terminal "xyz"))
Terminal "abcxyz"

regexParse :: String -> RegEx Source #

Parse a RegEx from a String.

>>> let str = "xy|z+"
>>> regexParse str
Alternate (Terminal "xy") (KleenePlus (Terminal "z"))

Fail if the String is not a valid regular expression.

>>> let bad = ")("
>>> regexParse bad
Fail

regexString :: RegEx -> String Source #

The RegEx String.

>>> let rex = Alternate (Terminal "xy") (KleenePlus (Terminal "z"))
>>> putStrLn (regexString rex)
xy|z+

regexGrammar :: Grammar RegEx Source #

regexGrammar provides an important example of a Grammar. Take a look at the source to see its definition.

>>> printGrammar regexGrammar
start = \q{regex}
alternate = \q{sequence}(\|\q{sequence})*
any = \.
atom = \q{nonterminal}|\q{fail}|\q{class-in}|\q{class-not-in}|\q{category-in}|\q{category-not-in}|\q{char}|\q{any}|\q{parenthesized}
category = Ll|Lu|Lt|Lm|Lo|Mn|Mc|Me|Nd|Nl|No|Pc|Pd|Ps|Pe|Pi|Pf|Po|Sm|Sc|Sk|So|Zs|Zl|Zp|Cc|Cf|Cs|Co|Cn
category-in = \\p\{\q{category}\}
category-not-in = \\P\{\q{category}\}
char = \q{char-literal}|\q{char-escaped}
char-escaped = \\[\$\(\)\*\+\.\?\[\\\]\^\{\|\}]
char-literal = [^\$\(\)\*\+\.\?\[\\\]\^\{\|\}]
class-in = \[\q{char}*\]
class-not-in = \[\^\q{char}*\]
expression = \q{terminal}|\q{kleene-optional}|\q{kleene-star}|\q{kleene-plus}|\q{atom}
fail = \\q
kleene-optional = \q{atom}\?
kleene-plus = \q{atom}\+
kleene-star = \q{atom}\*
nonterminal = \\q\{\q{char}*\}
parenthesized = \(\q{regex}\)
regex = \q{alternate}
sequence = \q{expression}*
terminal = \q{char}+