scrappy-core-0.1.0.1: HTML pattern-matching library and high-level concurrent-requests interface for web scraping
Safe Haskell: None
Language: Haskell2010

Scrappy.Elem

Documentation

eitherP :: Alternative m => m a -> m b -> m (Either a b)
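A minimal sketch of a combinator with this signature, using only Alternative from base. The name eitherPSketch is hypothetical, and the body is an assumption consistent with the type rather than a confirmed copy of what this module exports: run the first action and tag its result Left, otherwise fall back to the second and tag it Right.

```haskell
import Control.Applicative (Alternative, (<|>))

-- Hypothetical stand-in for eitherP: left-biased choice that records
-- which branch succeeded.
eitherPSketch :: Alternative m => m a -> m b -> m (Either a b)
eitherPSketch a b = (Left <$> a) <|> (Right <$> b)

-- With Maybe as the Alternative, the first Just wins.
exampleLeft :: Maybe (Either Int Char)
exampleLeft = eitherPSketch (Just 1) (Just 'x')

-- When the first action fails, the second result is tagged Right.
exampleRight :: Maybe (Either Int Char)
exampleRight = eitherPSketch Nothing (Just 'x')
```

The same definition works for any parser type with an Alternative instance, which is why no parser library is needed for the sketch.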

el :: forall s (m :: Type -> Type) u. Stream s m Char => Elem -> [(String, String)] -> ParsecT s u m (Elem' String)

Try to cut out Megaparsec for now and get the export directly from Control.Applicative.

Note: could make class HtmlP a where { el :: a -> Elem; attrs :: a -> Attrs; innerText :: a -> Text }

A use case that keeps coming up while coding: what should happen when an element a contains another element a? Options:

1) Match in the parent only where the match is not inside an inner element of the same kind.
2) Get all matches in the parent element regardless.
3) Treat being inside the same element as a failure, then take the inner same element. This is like 1), but aims to carry minimal surrounding data and is more narrowly targeted.
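To make the nesting question concrete, here is a small Parsec sketch of the "keep everything in the parent" behaviour: it reads the body of an already-opened tag up to its matching close tag and keeps nested same-tag elements verbatim. The names balancedBody and outerBody are hypothetical, tags are assumed to carry no attributes, and this is not how the library itself resolves the question.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Consume the body of an already-opened <tag> up to its matching </tag>.
-- Nested <tag>...</tag> pairs are kept verbatim in the result.
balancedBody :: String -> Parser String
balancedBody tag = go
  where
    open  = "<" ++ tag ++ ">"
    close = "</" ++ tag ++ ">"
    -- Stop at the matching close tag, else descend into a nested element,
    -- else take one character and continue.
    go = (try (string close) >> pure "") <|> nested <|> plain
    nested = do
      _     <- try (string open)
      inner <- go              -- body of the nested element
      rest  <- go              -- remainder of the outer body
      pure (open ++ inner ++ close ++ rest)
    plain = do
      c    <- anyChar
      rest <- go
      pure (c : rest)

-- Parse an opening tag (no attributes) followed by its balanced body.
outerBody :: String -> String -> Either ParseError String
outerBody tag = parse (string ("<" ++ tag ++ ">") *> balancedBody tag) ""
```

For example, outerBody "div" applied to "<div>a<div>b</div>c</div>tail" yields the body "a<div>b</div>c", while an unclosed "<div>a" is a parse error. Options 1) and 3) would instead reject or isolate the inner element before matching.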

Simplest interface for building element patterns.

elemParser :: forall a s (m :: Type -> Type) u. (ShowHTML a, Stream s m Char) => Maybe [Elem] -> Maybe (ParsecT s u m a) -> [(String, Maybe String)] -> ParsecT s u m (Elem' a)

Generic interface for building HTML element patterns that do not discriminate on the element's inner content. To control which inner HTML patterns are allowed, see ChainHTML and/or TreeElemParser.
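The attribute argument has type [(String, Maybe String)]. A plausible reading, stated here as an assumption rather than documented behaviour, is that (name, Nothing) requires the attribute to be present with any value and (name, Just v) requires exactly the value v. The pure function below sketches that reading; attrsMatchSketch is a hypothetical name and is not part of the library.

```haskell
import Data.Maybe (isJust)

-- Assumed semantics: every wanted attribute must be present on the element;
-- Nothing accepts any value, Just v requires an exact match.
attrsMatchSketch :: [(String, Maybe String)] -> [(String, String)] -> Bool
attrsMatchSketch wanted actual = all ok wanted
  where
    ok (name, Nothing) = isJust (lookup name actual)
    ok (name, Just v)  = lookup name actual == Just v

-- Attributes as they might appear on <a href="/x" class="c">.
sampleAttrs :: [(String, String)]
sampleAttrs = [("href", "/x"), ("class", "c")]
```

Under that assumption, attrsMatchSketch [("href", Nothing)] sampleAttrs holds, while attrsMatchSketch [("class", Just "b")] sampleAttrs does not.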

selfClosing :: [String]

It might be worth redoing this with findNextMatch, which would open up the ability to return multiple matches inside a given element. It would need to keep handling three cases: self-closing (two forms: /> or > ... eof), match, and no match. In the no-match case (self-closing or simply no match) it needs to throw parserZero.

findNextMatch already handles the Eof case; this would be a redefinition of baseParser in the `let`.
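As a rough illustration of telling a self-closing tag apart from one that has a body, the Parsec sketch below classifies an opening tag by how it ends ("/>" or ">") and by its name. All names here are hypothetical. The voidTags list is the standard set of HTML void elements and is an assumption; it may differ from this module's selfClosing, and it replaces the "> ... eof" check mentioned above with a name lookup. A parse failure plays the role of the no-match case.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Standard HTML void elements (assumed; not taken from the library).
voidTags :: [String]
voidTags =
  [ "area", "base", "br", "col", "embed", "hr", "img"
  , "input", "link", "meta", "source", "track", "wbr" ]

-- How the opening tag was terminated.
data TagEnd = SlashClosed | AngleClosed deriving (Eq, Show)

-- Whether the element is complete or still needs a body and end tag.
data Shape = SelfClosing | HasBody deriving (Eq, Show)

-- End of an opening tag: "/>" or ">".
tagEnd :: Parser TagEnd
tagEnd = (SlashClosed <$ try (string "/>")) <|> (AngleClosed <$ char '>')

-- "/>" is always self-closing; ">" is self-closing only for void tags.
classify :: String -> TagEnd -> Shape
classify _ SlashClosed = SelfClosing
classify name AngleClosed
  | name `elem` voidTags = SelfClosing
  | otherwise            = HasBody

-- Opening tag without attributes, for brevity.
openTagShape :: Parser (String, Shape)
openTagShape = do
  _    <- char '<'
  name <- many1 alphaNum
  end  <- tagEnd
  pure (name, classify name end)
```

Here "<br>" and "<img/>" classify as SelfClosing and "<div>" as HasBody; an input such as "<p" with no terminator fails to parse.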

Case 1:

Case 2:

Case 3: The most general

Maybe elemParser could be abstracted into a class method.

Does this work with a parser meant to consume the whole inner content? I suppose it would, but it would also allow other content; that case is handled by treeElemParserSpecific.

innerElemParser :: forall a s (m :: Type -> Type) u. (ShowHTML a, Stream s m Char) => String -> Maybe (ParsecT s u m a) -> ParsecT s u m [HTMLMatcher Elem' a]

elemParserWhere

Arguments

:: forall a s (m :: Type -> Type) u. (ShowHTML a, Stream s m Char) 
=> Maybe [Elem] 
-> Maybe (ParsecT s u m a) 
-> String 
-> (String -> Bool)

An attribute name (the String argument above) and a predicate on that attribute's value

-> ParsecT s u m (Elem' a) 

Generic interface for building HTML element patterns that do not discriminate on the element's inner content. To control which inner HTML patterns are allowed, see ChainHTML and/or TreeElemParser.
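The String and (String -> Bool) arguments of elemParserWhere suggest a check of the following shape: look up one attribute by name and apply the predicate to its value. The sketch below is pure and self-contained. attrSatisfiesSketch is a hypothetical name, and treating a missing attribute as a failed match is an assumption.

```haskell
import Data.List (isPrefixOf)

-- Look up the named attribute and test its value; a missing attribute
-- is treated as a failed match (assumed behaviour).
attrSatisfiesSketch :: String -> (String -> Bool) -> [(String, String)] -> Bool
attrSatisfiesSketch name p attrs = maybe False p (lookup name attrs)

-- Example predicate: the value starts with "https://".
isHttps :: String -> Bool
isHttps = isPrefixOf "https://"
```

For instance, attrSatisfiesSketch "href" isHttps [("href", "https://example.org")] holds, while the same call on [("href", "/local")] or on an element with no href does not.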

clickableHref :: forall s (m :: Type -> Type) u. Stream s m Char => Bool -> LastUrl -> ParsecT s u m Clickable

clickableHref' :: forall s (m :: Type -> Type) a u. (Stream s m Char, ShowHTML a) => ParsecT s u m a -> Bool -> LastUrl -> ParsecT s u m Clickable

sameElTag :: forall a s (m :: Type -> Type) u. (ShowHTML a, Stream s m Char) => Elem -> Maybe (ParsecT s u m a) -> ParsecT s u m (Elem' a)

matchesInSameElTag :: forall a s (m :: Type -> Type) u. (ShowHTML a, Stream s m Char) => Elem -> Maybe (ParsecT s u m a) -> ParsecT s u m [a]

elSelfC :: forall s (m :: Type -> Type) u a. Stream s m Char => Maybe [Elem] -> [(String, Maybe String)] -> ParsecT s u m (Elem' a)

elSelfClosing :: forall s (m :: Type -> Type) u a. Stream s m Char => Maybe [Elem] -> Maybe (ParsecT s u m a) -> [(String, Maybe String)] -> ParsecT s u m (Elem' a)

elemWithBody :: forall a s (m :: Type -> Type) u. (ShowHTML a, Stream s m Char) => Maybe [Elem] -> Maybe (ParsecT s u m a) -> [(String, Maybe String)] -> ParsecT s u m (Elem' a)

elemParserInternal :: forall a s (m :: Type -> Type) u. (ShowHTML a, Stream s m Char) => Maybe [Elem] -> Maybe (ParsecT s u m a) -> [(String, Maybe String)] -> ParsecT s u m (Elem' a)

stylingElem :: forall s (m :: Type -> Type) u. Stream s m Char => ParsecT s u m String

Returns only the inner content.

parseInnerHTMLAndEndTag :: forall s (m :: Type -> Type) u. Stream s m Char => Elem -> Maybe (ParsecT s u m String) -> ParsecT s u m (InnerTextResult String)

Deprecated: use new elem parser directly

Does not get subsets; gets the innermost (Elem - Match) combination. A Monoid instance may need to be implemented so that mempty is available to help generalize.

elemParserOld :: forall s (m :: Type -> Type) u. Stream s m Char => Maybe [Elem] -> Maybe (ParsecT s u m String) -> [(String, Maybe String)] -> ParsecT s u m (Elem' String)

Deprecated: use elemParser

Note: when innerSpec is Nothing, the parser should be optional anyChar, i.e. ().
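One way to read this note, sketched with Parsec's own optional (which returns ()): with no inner spec, consume at most one arbitrary character and report nothing; with a spec, run it and discard its result. The name innerOrSkip is hypothetical, and this is an interpretation of the note rather than the library's code.

```haskell
import Data.Functor (void)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Nothing: skip at most one character, yielding ().
-- Just p:  run the supplied inner parser and discard its result.
innerOrSkip :: Maybe (Parser a) -> Parser ()
innerOrSkip Nothing  = optional anyChar
innerOrSkip (Just p) = void p

-- Helper for experimenting: run a parser, then return the unconsumed input.
leftoverAfter :: Parser () -> String -> Either ParseError String
leftoverAfter p = parse (p *> many anyChar) ""
```

With no spec, leftoverAfter on "abc" leaves "bc"; with Just (string "ab") it leaves "c". On empty input the Nothing case still succeeds, because optional accepts zero characters.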

attrs (Attr | AnyAttr) maybe discr elem