dataframe-0.3.3.6: A fast, safe, and intuitive DataFrame library.
Safe HaskellNone
LanguageHaskell2010

DataFrame.Internal.Column

Synopsis

Documentation

data Column where Source #

Our representation of a column is a GADT that can store data based on the underlying data.

This allows us to pattern match on data kinds and limit some operations to only some kinds of vectors. E.g. operations for missing data only happen in an OptionalColumn.

Constructors

BoxedColumn :: forall a. Columnable a => Vector a -> Column 
UnboxedColumn :: forall a. (Columnable a, Unbox a) => Vector a -> Column 
OptionalColumn :: forall a. Columnable a => Vector (Maybe a) -> Column 

Instances

Instances details
Show Column Source # 
Instance details

Defined in DataFrame.Internal.Column

Eq Column Source # 
Instance details

Defined in DataFrame.Internal.Column

Methods

(==) :: Column -> Column -> Bool #

(/=) :: Column -> Column -> Bool #

Ord Column Source # 
Instance details

Defined in DataFrame.Internal.Column

data MutableColumn where Source #

Constructors

MBoxedColumn :: forall a. Columnable a => IOVector a -> MutableColumn 
MUnboxedColumn :: forall a. (Columnable a, Unbox a) => IOVector a -> MutableColumn 

data TypedColumn a where Source #

A TypedColumn is a wrapper around our type-erased column. It is used to type check expressions on columns.

Constructors

TColumn :: forall a. Columnable a => Column -> TypedColumn a 

Instances

Instances details
Show a => Show (TypedColumn a) Source # 
Instance details

Defined in DataFrame.Internal.Column

Eq a => Eq (TypedColumn a) Source # 
Instance details

Defined in DataFrame.Internal.Column

Ord a => Ord (TypedColumn a) Source # 
Instance details

Defined in DataFrame.Internal.Column

unwrapTypedColumn :: TypedColumn a -> Column Source #

Gets the underlying value from a TypedColumn.

hasMissing :: Column -> Bool Source #

Checks if a column contains missing values.

isNumeric :: Column -> Bool Source #

Checks if a column contains numeric values.

hasElemType :: Columnable a => Column -> Bool Source #

Checks if a column is of a given type values.

columnVersionString :: Column -> String Source #

An internal/debugging function to get the column type of a column.

columnTypeString :: Column -> String Source #

An internal/debugging function to get the type stored in the outermost vector of a column.

class ColumnifyRep (r :: Rep) a where Source #

A class for converting a vector to a column of the appropriate type. Given each Rep we tell the toColumnRep function which Column type to pick.

Methods

toColumnRep :: Vector a -> Column Source #

Instances

Instances details
Columnable a => ColumnifyRep 'RBoxed a Source # 
Instance details

Defined in DataFrame.Internal.Column

Methods

toColumnRep :: Vector a -> Column Source #

(Columnable a, Unbox a) => ColumnifyRep 'RUnboxed a Source # 
Instance details

Defined in DataFrame.Internal.Column

Methods

toColumnRep :: Vector a -> Column Source #

Columnable a => ColumnifyRep 'ROptional (Maybe a) Source # 
Instance details

Defined in DataFrame.Internal.Column

Methods

toColumnRep :: Vector (Maybe a) -> Column Source #

type Columnable a = (Columnable' a, ColumnifyRep (KindOf a) a, UnboxIf a, IntegralIf a, FloatingIf a, SBoolI (Unboxable a), SBoolI (Numeric a), SBoolI (IntegralTypes a), SBoolI (FloatingTypes a)) Source #

Constraint synonym for what we can put into columns.

fromVector :: (Columnable a, ColumnifyRep (KindOf a) a) => Vector a -> Column Source #

O(n) Convert a vector to a column. Automatically picks the best representation of a vector to store the underlying data in.

Examples:

> import qualified Data.Vector as V
> fromVector (VB.fromList [(1 :: Int), 2, 3, 4])
[1,2,3,4]

fromUnboxedVector :: (Columnable a, Unbox a) => Vector a -> Column Source #

O(n) Convert an unboxed vector to a column. This avoids the extra conversion if you already have the data in an unboxed vector.

Examples:

> import qualified Data.Vector.Unboxed as V
> fromUnboxedVector (VB.fromList [(1 :: Int), 2, 3, 4])
[1,2,3,4]

fromList :: (Columnable a, ColumnifyRep (KindOf a) a) => [a] -> Column Source #

O(n) Convert a list to a column. Automatically picks the best representation of a vector to store the underlying data in.

Examples:

> fromList [(1 :: Int), 2, 3, 4]
[1,2,3,4]

mapColumn :: (Columnable b, Columnable c, UnboxIf c) => (b -> c) -> Column -> Either DataFrameException Column Source #

An internal function to map a function over the values of a column.

columnLength :: Column -> Int Source #

O(1) Gets the number of elements in the column.

numElements :: Column -> Int Source #

O(n) Gets the number of elements in the column.

takeColumn :: Int -> Column -> Column Source #

O(n) Takes the first n values of a column.

takeLastColumn :: Int -> Column -> Column Source #

O(n) Takes the last n values of a column.

sliceColumn :: Int -> Int -> Column -> Column Source #

O(n) Takes n values after a given column index.

atIndices :: Set Int -> Column -> Column Source #

O(n) Selects the elements at a given set of indices. May change the order.

atIndicesStable :: Vector Int -> Column -> Column Source #

O(n) Selects the elements at a given set of indices. Does not change the order.

getIndices :: Vector Int -> Vector a -> Vector a Source #

Internal helper to get indices in a boxed vector.

getIndicesUnboxed :: Unbox a => Vector Int -> Vector a -> Vector a Source #

Internal helper to get indices in an unboxed vector.

sortedIndexes :: Bool -> Column -> Vector Int Source #

An internal function that returns a vector of how indexes change after a column is sorted.

imapColumn :: (Columnable b, Columnable c) => (Int -> b -> c) -> Column -> Either DataFrameException Column Source #

Applies a function that returns an unboxed result to an unboxed vector, storing the result in a column.

ifilterColumn :: Columnable a => (Int -> a -> Bool) -> Column -> Either DataFrameException Column Source #

Filter column with index.

ifoldrColumn :: (Columnable a, Columnable b) => (Int -> a -> b -> b) -> b -> Column -> Either DataFrameException b Source #

Fold (right) column with index.

ifoldlColumn :: (Columnable a, Columnable b) => (b -> Int -> a -> b) -> b -> Column -> Either DataFrameException b Source #

Fold (left) column with index.

reduceColumn :: (Columnable a, Columnable b) => (a -> b) -> Column -> Either DataFrameException b Source #

Generic reduce function for all Column types.

zipColumns :: Column -> Column -> Column Source #

An internal, column version of zip.

zipWithColumns :: (Columnable a, Columnable b, Columnable c) => (a -> b -> c) -> Column -> Column -> Either DataFrameException Column Source #

An internal, column version of zipWith.

expandColumn :: Int -> Column -> Column Source #

Fills the end of a column, up to n, with Nothing. Does nothing if column has length greater than n.

leftExpandColumn :: Int -> Column -> Column Source #

Fills the beginning of a column, up to n, with Nothing. Does nothing if column has length greater than n.

concatColumns :: Column -> Column -> Either DataFrameException Column Source #

Concatenates two columns. Returns Nothing if the columns are of different types.

concatColumnsEither :: Column -> Column -> Column Source #

Concatenates two columns.

Works similar to concatColumns, but unlike that function, it will also combine columns of different types by wrapping the values in an Either.

E.g. combining Column containing [1,2] with Column containing ["a","b"] will result in a Column containing [Left 1, Left 2, Right "a", Right "b"].

toList :: Columnable a => Column -> [a] Source #

O(n) Converts a column to a list. Throws an exception if the wrong type is specified.

Examples:

> column = fromList [(1 :: Int), 2, 3, 4]
> toList Int column
[1,2,3,4]
> toList Double column
exception: ...

toVector :: forall a v. (Vector v a, Columnable a) => Column -> Either DataFrameException (v a) Source #

Converts a column to a vector of a specific type.

This is a type-safe conversion that requires the column's element type to exactly match the requested type. You must specify the desired type via type applications.

Type Parameters

Expand
a
The element type to convert to
v
The vector type (e.g., Vector, Vector)

Examples

Expand
>>> toVector @Int @VU.Vector column
Right (unboxed vector of Ints)
>>> toVector @Text @VB.Vector column
Right (boxed vector of Text)

Returns

Expand

See also

Expand

For numeric conversions with automatic type coercion, see toDoubleVector, toFloatVector, and toIntVector.

toDoubleVector :: Column -> Either DataFrameException (Vector Double) Source #

Converts a column to an unboxed vector of Double values.

This function performs intelligent type coercion for numeric types:

  • If the column is already Double, returns it directly
  • If the column contains other floating-point types, converts via realToFrac
  • If the column contains integral types, converts via fromIntegral (beware of overflow if the type is Integer).

Optional column handling

Expand

For OptionalColumn types, Nothing values are converted to NaN (Not a Number). This allows optional numeric data to be represented in the resulting vector.

Returns

Expand

toFloatVector :: Column -> Either DataFrameException (Vector Float) Source #

Converts a column to an unboxed vector of Float values.

This function performs intelligent type coercion for numeric types:

  • If the column is already Float, returns it directly
  • If the column contains other floating-point types, converts via realToFrac
  • If the column contains integral types, converts via fromIntegral
  • If the column is boxed Integer, converts via fromIntegral (beware of overflow for 64-bit integers and Integer)

Optional column handling

Expand

For OptionalColumn types, Nothing values are converted to NaN (Not a Number). This allows optional numeric data to be represented in the resulting vector.

Returns

Expand

Precision warning

Expand

Converting from Double to Float may result in loss of precision.

toIntVector :: Column -> Either DataFrameException (Vector Int) Source #

Converts a column to an unboxed vector of Int values.

This function performs intelligent type coercion for numeric types:

  • If the column is already Int, returns it directly
  • If the column contains floating-point types, rounds via round and converts
  • If the column contains other integral types, converts via fromIntegral
  • If the column is boxed Integer, converts via fromIntegral

Returns

Expand

Note

Expand

Unlike toDoubleVector and toFloatVector, this function does NOT support OptionalColumn. Optional columns must be handled separately.

Rounding behavior

Expand

Floating-point values are rounded to the nearest integer using round. For example: 2.5 rounds to 2, 3.5 rounds to 4 (banker's rounding).