pcre-heavy: A regexp (regex) library on top of pcre-light you can actually use.

[ library, public-domain, web ] [ Propose Tags ] [ Report a vulnerability ]

A PCRE-based regular expressions library with support for multiple matches and replacements. Based on pcre-light. Takes and returns convertible strings everywhere. Includes a QuasiQuoter for regexps that does compile time checking.


[Skip to Readme]

Modules

[Last Documentation]

  • Text
    • Regex
      • PCRE
        • Text.Regex.PCRE.Heavy

Downloads

Maintainer's Corner

Package maintainers

For package maintainers and hackage trustees

Candidates

Versions [RSS] 0.1.0, 0.2.0, 0.2.1, 0.2.2, 0.2.3, 0.2.4, 0.2.5, 1.0.0, 1.0.0.1, 1.0.0.2, 1.0.0.3, 1.0.0.4
Dependencies base (>=4.3.0.0 && <5), base-compat (>=0.8.0), bytestring, pcre-light, semigroups, string-conversions, template-haskell (>=2.16.0.0) [details]
Tested with ghc ==9.10.1
License LicenseRef-PublicDomain
Copyright 2015-2025 Val Packett <val@packett.cool>
Author Val Packett
Maintainer val@packett.cool
Category Web
Home page https://codeberg.org/valpackett/pcre-heavy
Bug tracker https://codeberg.org/valpackett/pcre-heavy/issues
Source repo head: git clone https://codeberg.org/valpackett/pcre-heavy.git
Uploaded by valpackett at 2025-05-03T09:07:13Z
Distributions Arch:1.0.0.4, LTSHaskell:1.0.0.4, NixOS:1.0.0.4, Stackage:1.0.0.4
Reverse Dependencies 17 direct, 14 indirect [details]
Downloads 13271 total (56 in the last 30 days)
Rating 2.5 (votes: 5) [estimated by Bayesian average]
Your Rating
  • λ
  • λ
  • λ
Status Docs not available [build log]
All reported builds failed as of 2025-05-03 [all 2 reports]

Readme for pcre-heavy-1.0.0.4

[back to package description]

Hackage unlicense Support me on Patreon

pcre-heavy

A Haskell regular expressions library with support for multiple matches and replacements:

  • based on pcre-light, none of that regex-base complicated pluggable-backend stuff
  • takes and returns ConvertibleStrings everywhere, use any common string type (String, ByteString, Lazy ByteString, Text, Lazy Text) -- but you need a bit more type annotations (or ClassyPrelude's asText, asString, etc.) if you use OverloadedStrings which you probably can't live without
  • provides a QuasiQuoter for regexps that does compile time checking

Usage

{-# LANGUAGE QuasiQuotes, FlexibleContexts #-}
import           Text.Regex.PCRE.Heavy

Checking

>>> "https://val.packett.cool" =~ [re|^http.*|]
True

For UnicodeSyntax fans, it's also available as ≈ (U+2248 ALMOST EQUAL TO):

>>> "https://val.packett.cool" ≈ [re|^http.*|]
True

Matching (Searching)

(You can use any string type, not just String!)

scan returns all matches as pairs like (fullmatch, [group, group...]).

>>> scan [re|\s*entry (\d+) (\w+)\s*&?|] " entry 1 hello  &entry 2 hi" :: [(String, [String])]
[
  (" entry 1 hello  &", ["1", "hello"])
, ("entry 2 hi",        ["2", "hi"])
]

It is lazy! If you only need the first match, use head (or, much better, headMay from safe) -- no extra work will be performed!

>>> headMay $ scan [re|\s*entry (\d+) (\w+)\s*&?|] " entry 1 hello  &entry 2 hi"
Just (" entry 1 hello  &", ["1", "hello"])

Replacement

sub replaces the first match, gsub replaces all matches.

-- You can use a convertible string type `a` as the replacement...
>>> gsub [re|\d+|] "!!!NUMBER!!!" "Copyright (c) 2015 The 000 Group"
"Copyright (c) !!!NUMBER!!! The !!!NUMBER!!! Group"

-- or a ([a] -> a) function -- that will get the groups...
>>> gsub [re|%(\d+)(\w+)|] (\(d:w:_) -> "{" ++ d ++ " of " ++ w ++ "}" :: String) "Hello, %20thing"
"Hello, {20 of thing}"

-- or a (a -> a) function -- that will get the full match...
>>> gsub [re|-\w+|] (\x -> "+" ++ (reverse $ drop 1 x) :: String) "hello -world"
"hello +dlrow"

-- or a (a -> [a] -> a) function.
-- That will get both the full match and the groups.
-- I have no idea why you would want to use that, but that's there :-)

Note that functions are the only way to use captured groups in the replacement. There is no "in string" syntax like in Perl or in Python.

Splitting

split, well, splits.

>>> split [re|%(begin|next|end)%|] "%begin%hello%next%world%end%"
["","hello","world",""]

Options

You can pass pcre-light options by using the somethingO variants of functions (and mkRegexQQ for compile time options):

>>> let myRe = mkRegexQQ [multiline, utf8, ungreedy]
>>> scanO [myRe|\s*entry (\d+) (\w+)\s*&?|] [exec_no_utf8_check] " entry 1 hello  &entry 2 hi" :: [[String]]
>>> gsubO [myRe|\d+|] [exec_notempty] "!!!NUMBER!!!" "Copyright (c) 2015 The 000 Group"

utf8 is passed by default in the re QuasiQuoter.

Development

Use stack to build.
Use ghci to run tests quickly with :test (see the .ghci file).

$ stack build

$ stack test && rm tests.tix

$ stack ghci --ghc-options="-fno-hpc"

License

This is free and unencumbered software released into the public domain.
For more information, please refer to the UNLICENSE file or unlicense.org.