text-2.1.4: An efficient packed Unicode text type.
Copyright: (c) 2008 2009 Tom Harper
           (c) 2009 2010 Bryan O'Sullivan
           (c) 2009 Duncan Coutts
           (c) 2021 Andrew Lelechenko
License: BSD-style
Maintainer: bos@serpentine.com
Stability: experimental
Portability: GHC
Safe Haskell: None
Language: Haskell2010

Data.Text.Internal.Encoding.Utf8

Description

Warning: this is an internal module, and does not have a stable API or name. Functions in this module may not check or enforce preconditions expected by public modules. Use at your own risk!

Basic UTF-8 validation and character manipulation.

Documentation

utf8Length :: Char -> Int

Measure byte length of UTF-8 encoding for a given character.

Since: 2.0
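
As a rough illustration of what "byte length" means here, the sketch below maps code point ranges to the standard UTF-8 lengths. The name utf8LengthSketch is hypothetical and uses only base; it is an assumption about the behaviour, not the library's actual code.

```haskell
import Data.Char (ord)

-- Hypothetical re-implementation for illustration only.
-- Maps a code point to the number of bytes its UTF-8 encoding occupies.
utf8LengthSketch :: Char -> Int
utf8LengthSketch c
  | n < 0x80    = 1  -- U+0000..U+007F (ASCII)
  | n < 0x800   = 2  -- U+0080..U+07FF
  | n < 0x10000 = 3  -- U+0800..U+FFFF
  | otherwise   = 4  -- U+10000..U+10FFFF
  where
    n = ord c
```

Under these assumptions, 'a' needs one byte, U+00E9 two, U+20AC three, and U+1F600 four.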

utf8LengthByLeader :: Word8 -> Int

Measure the byte length of the UTF-8 encoding of a character, given the leading (first) byte of that encoding.

Since: 2.0
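
The idea can be sketched as follows: in UTF-8 the high bits of the first byte announce how many bytes the sequence has. The name leaderLengthSketch is hypothetical, and the sketch assumes its argument really is a valid leading byte; what the library returns for other bytes is not stated above.

```haskell
import Data.Word (Word8)

-- Hypothetical sketch: derive the sequence length from the leading byte.
-- The argument is assumed to be a valid leader; continuation bytes
-- (0x80..0xBF) are not leaders, and the 2 returned for them is meaningless.
leaderLengthSketch :: Word8 -> Int
leaderLengthSketch w
  | w < 0x80  = 1  -- 0xxxxxxx: ASCII
  | w < 0xE0  = 2  -- 110xxxxx: two-byte sequence
  | w < 0xF0  = 3  -- 1110xxxx: three-byte sequence
  | otherwise = 4  -- 11110xxx: four-byte sequence
```

A decoder can use such a function to learn how many further bytes to read before it attempts validation.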

ord2 :: Char -> (Word8, Word8)

Encode a character as UTF-8 bytes assuming that exactly 2 are needed. This precondition is not checked.

Since: 1.1.0.0

ord3 :: Char -> (Word8, Word8, Word8)

Encode a character as UTF-8 bytes assuming that exactly 3 are needed. This precondition is not checked.

Since: 1.1.0.0

ord4 :: Char -> (Word8, Word8, Word8, Word8)

Encode a character as UTF-8 bytes assuming that exactly 4 are needed. This precondition is not checked.

Since: 1.1.0.0
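
To make the bit layout behind ord2, ord3 and ord4 concrete, here is a minimal sketch of the standard UTF-8 encoding for each length. The names ord2Sketch, ord3Sketch and ord4Sketch are hypothetical stand-ins built only on base; like the functions above, they do not check that the character actually needs that many bytes.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical sketches of the standard UTF-8 bit layout.
-- Preconditions (unchecked): the character needs exactly 2, 3 or 4 bytes.

ord2Sketch :: Char -> (Word8, Word8)
ord2Sketch c = (x1, x2)
  where
    n  = ord c
    x1 = fromIntegral ((n `shiftR` 6) + 0xC0)             -- 110xxxxx
    x2 = fromIntegral ((n .&. 0x3F) + 0x80)               -- 10xxxxxx

ord3Sketch :: Char -> (Word8, Word8, Word8)
ord3Sketch c = (x1, x2, x3)
  where
    n  = ord c
    x1 = fromIntegral ((n `shiftR` 12) + 0xE0)            -- 1110xxxx
    x2 = fromIntegral (((n `shiftR` 6) .&. 0x3F) + 0x80)  -- 10xxxxxx
    x3 = fromIntegral ((n .&. 0x3F) + 0x80)               -- 10xxxxxx

ord4Sketch :: Char -> (Word8, Word8, Word8, Word8)
ord4Sketch c = (x1, x2, x3, x4)
  where
    n  = ord c
    x1 = fromIntegral ((n `shiftR` 18) + 0xF0)            -- 11110xxx
    x2 = fromIntegral (((n `shiftR` 12) .&. 0x3F) + 0x80) -- 10xxxxxx
    x3 = fromIntegral (((n `shiftR` 6) .&. 0x3F) + 0x80)  -- 10xxxxxx
    x4 = fromIntegral ((n .&. 0x3F) + 0x80)               -- 10xxxxxx
```

For instance, under this layout U+00E9 encodes as (0xC3, 0xA9) and U+20AC as (0xE2, 0x82, 0xAC). Calling a sketch with a character outside its range yields bytes that are not a valid encoding, which is why callers are expected to pick the right variant first.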

chr2 :: Word8 -> Word8 -> Char

Since: 1.1.0.0

chr3 :: Word8 -> Word8 -> Word8 -> Char

Since: 1.1.0.0

chr4 :: Word8 -> Word8 -> Word8 -> Word8 -> Char

Since: 1.1.0.0
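
The chr2, chr3 and chr4 entries carry no description. Judging by their names and types, they appear to be the inverses of ord2, ord3 and ord4: they assemble a character from 2, 3 or 4 UTF-8 bytes. The sketch below shows that reading under the standard UTF-8 bit layout; the names chr2Sketch, chr3Sketch, chr4Sketch and payload are hypothetical, and whether the library versions validate their input is not stated here.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- Hypothetical helper: keep only the payload bits selected by the mask.
payload :: Int -> Word8 -> Int
payload mask w = fromIntegral w .&. mask

-- Hypothetical sketches; the bytes are assumed to form a valid sequence.
chr2Sketch :: Word8 -> Word8 -> Char
chr2Sketch x1 x2 =
  chr ((payload 0x1F x1 `shiftL` 6) .|. payload 0x3F x2)

chr3Sketch :: Word8 -> Word8 -> Word8 -> Char
chr3Sketch x1 x2 x3 =
  chr ((payload 0x0F x1 `shiftL` 12)
       .|. (payload 0x3F x2 `shiftL` 6)
       .|. payload 0x3F x3)

chr4Sketch :: Word8 -> Word8 -> Word8 -> Word8 -> Char
chr4Sketch x1 x2 x3 x4 =
  chr ((payload 0x07 x1 `shiftL` 18)
       .|. (payload 0x3F x2 `shiftL` 12)
       .|. (payload 0x3F x3 `shiftL` 6)
       .|. payload 0x3F x4)
```

With valid input, chr3Sketch 0xE2 0x82 0xAC gives U+20AC. Since the payload masks discard the marker bits, invalid input silently produces some other character, so validation belongs before the call.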

Validation

validate1 :: Word8 -> Bool

Since: 1.1.0.0

validate2 :: Word8 -> Word8 -> Bool

Since: 1.1.0.0

validate3 :: Word8 -> Word8 -> Word8 -> Bool

Since: 1.1.0.0

validate4 :: Word8 -> Word8 -> Word8 -> Word8 -> Bool

Since: 1.1.0.0
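
The validateN entries are undocumented. A plausible reading, given the section title, is that validateN checks whether N bytes form one well-formed UTF-8 sequence. The sketch below encodes the well-formed byte ranges from the Unicode Standard (no overlong forms, no surrogates, nothing above U+10FFFF). All names ending in Sketch, plus between and cont, are hypothetical, and the exact rules the library applies are an assumption.

```haskell
import Data.Word (Word8)

-- Hypothetical helper: inclusive range check.
between :: Word8 -> Word8 -> Word8 -> Bool
between lo hi x = lo <= x && x <= hi

-- Hypothetical helper: is this a continuation byte (10xxxxxx)?
cont :: Word8 -> Bool
cont = between 0x80 0xBF

validate1Sketch :: Word8 -> Bool
validate1Sketch x1 = x1 <= 0x7F

-- Leaders 0xC0 and 0xC1 would only encode overlong forms, so start at 0xC2.
validate2Sketch :: Word8 -> Word8 -> Bool
validate2Sketch x1 x2 = between 0xC2 0xDF x1 && cont x2

validate3Sketch :: Word8 -> Word8 -> Word8 -> Bool
validate3Sketch x1 x2 x3 = secondOk && cont x3
  where
    secondOk
      | x1 == 0xE0           = between 0xA0 0xBF x2  -- reject overlong forms
      | x1 == 0xED           = between 0x80 0x9F x2  -- reject surrogates
      | between 0xE1 0xEF x1 = cont x2
      | otherwise            = False

validate4Sketch :: Word8 -> Word8 -> Word8 -> Word8 -> Bool
validate4Sketch x1 x2 x3 x4 = secondOk && cont x3 && cont x4
  where
    secondOk
      | x1 == 0xF0           = between 0x90 0xBF x2  -- reject overlong forms
      | x1 == 0xF4           = between 0x80 0x8F x2  -- stay within U+10FFFF
      | between 0xF1 0xF3 x1 = cont x2
      | otherwise            = False
```

Under these rules the overlong pair 0xC0 0x80 and the surrogate sequence 0xED 0xA0 0x80 are both rejected, while 0xE2 0x82 0xAC is accepted.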

Naive decoding

newtype DecoderState

Since: 2.0

Constructors

DecoderState Word8 

newtype CodePoint

Since: 2.0

Constructors

CodePoint Int