dataframe-1.0.0.1: A fast, safe, and intuitive DataFrame library.
Safe Haskell: None
Language: Haskell2010

DataFrame.Operations.Transformations

Documentation

apply
  :: (Columnable b, Columnable c)
  => (b -> c)     -- function to apply
  -> Text         -- column name
  -> DataFrame    -- DataFrame to apply operation to
  -> DataFrame

O(k) Apply a function to a given column in a dataframe.

safeApply
  :: (Columnable b, Columnable c)
  => (b -> c)     -- function to apply
  -> Text         -- column name
  -> DataFrame    -- DataFrame to apply operation to
  -> Either DataFrameException DataFrame

O(k) Safe version of apply. Returns the error as a Left value instead of throwing it.
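
The relationship between a throwing function and its Either-returning counterpart can be sketched with a toy model. This is not the library's implementation: Frame, ToyError, safeApplyToy, and applyToy are hypothetical names, the frame is a plain association list of Int columns, and a missing column is only assumed to be one of the errors safeApply reports.

```haskell
import Control.Exception (Exception, throw)

-- Hypothetical error type standing in for DataFrameException.
newtype ToyError = ColumnNotFound String
  deriving (Show, Eq)

instance Exception ToyError

-- Toy frame: column names paired with Int values.
type Frame = [(String, [Int])]

-- Safe variant: a missing column is reported as a Left value.
safeApplyToy :: (Int -> Int) -> String -> Frame -> Either ToyError Frame
safeApplyToy f name frame
  | name `notElem` map fst frame = Left (ColumnNotFound name)
  | otherwise =
      Right [(n, if n == name then map f vs else vs) | (n, vs) <- frame]

-- Throwing variant: same logic, but the error escapes as an exception.
applyToy :: (Int -> Int) -> String -> Frame -> Frame
applyToy f name = either throw id . safeApplyToy f name

main :: IO ()
main = do
  let frame = [("age", [10, 20]), ("height", [150, 160])]
  print (safeApplyToy (+ 1) "age" frame)     -- a Right with the updated frame
  print (safeApplyToy (+ 1) "weight" frame)  -- a Left describing the failure
```

With the Either result, the caller decides how to handle a failure by pattern matching, rather than wrapping the call in an exception handler.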

derive :: Columnable a => Text -> Expr a -> DataFrame -> DataFrame

O(k) Evaluate an expression over a dataframe and store the result in a column with the given name.

deriveWithExpr :: Columnable a => Text -> Expr a -> DataFrame -> (Expr a, DataFrame)

O(k) Like derive, but also returns an expression that refers to the new column, so it can be reused in later operations.

Examples

>>> (z, df') = deriveWithExpr "z" (F.col @Int "x" + F.col "y") df
>>> filterWhere (z .>= 50) df'

applyMany :: (Columnable b, Columnable c) => (b -> c) -> [Text] -> DataFrame -> DataFrame

O(k * n) Apply a function to each of the named columns in a dataframe.
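
One way to read the O(k * n) bound is as n single-column applications of cost O(k) each. The sketch below models that reading as a left fold over the column names. It is a toy model over an association list, not the library's code, and Frame, applyToy, and applyManyToy are hypothetical names.

```haskell
import Data.List (foldl')

-- Toy frame: column names paired with Int values.
type Frame = [(String, [Int])]

-- Toy single-column apply: transform the named column, leave the rest alone.
applyToy :: (Int -> Int) -> String -> Frame -> Frame
applyToy f name frame =
  [(n, if n == name then map f vs else vs) | (n, vs) <- frame]

-- Toy applyMany: thread the frame through one applyToy per column name.
applyManyToy :: (Int -> Int) -> [String] -> Frame -> Frame
applyManyToy f names frame =
  foldl' (\acc name -> applyToy f name acc) frame names

main :: IO ()
main =
  print (applyManyToy (* 2) ["a", "b"] [("a", [1, 2]), ("b", [3]), ("c", [4])])
```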

applyInt
  :: Columnable b
  => (Int -> b)   -- function to apply
  -> Text         -- column name
  -> DataFrame    -- DataFrame to apply operation to
  -> DataFrame

O(k) Convenience version of apply for Int columns.

applyDouble
  :: Columnable b
  => (Double -> b)  -- function to apply
  -> Text           -- column name
  -> DataFrame      -- DataFrame to apply operation to
  -> DataFrame

O(k) Convenience version of apply for Double columns.

applyWhere
  :: (Columnable a, Columnable b)
  => (a -> Bool)  -- filter condition
  -> Text         -- criterion column
  -> (b -> b)     -- function to apply
  -> Text         -- column name
  -> DataFrame    -- DataFrame to apply operation to
  -> DataFrame

O(k * n) Apply a function to the values of a column, but only in rows where the value in the criterion column satisfies the filter condition.

applyWhere (<20) "Age" (const "Gen-Z") "Generation" df
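
The row-wise behaviour in the example above can be modelled with two parallel lists, one for the criterion column and one for the target column. This is a sketch of the semantics as described, not the library's implementation, and applyWhereToy is a hypothetical name.

```haskell
-- Toy model: update a target value only in rows where the criterion
-- value in the same row satisfies the predicate.
applyWhereToy :: (a -> Bool) -> [a] -> (b -> b) -> [b] -> [b]
applyWhereToy p criterion f target =
  zipWith (\c t -> if p c then f t else t) criterion target

main :: IO ()
main = do
  let ages = [15, 32, 19] :: [Int]
      generations = ["Unknown", "Unknown", "Unknown"]
  -- Rows 1 and 3 have Age < 20, so only those rows are rewritten.
  print (applyWhereToy (< 20) ages (const "Gen-Z") generations)
```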

applyAtIndex
  :: Columnable a
  => Int          -- index
  -> (a -> a)     -- function to apply
  -> Text         -- column name
  -> DataFrame    -- DataFrame to apply operation to
  -> DataFrame

O(k) Apply a function to the value at the given index of the named column.

imputeCore :: Columnable b => Expr (Maybe b) -> b -> DataFrame -> DataFrame

Core impute implementation for nullable columns. Silently no-ops on non-nullable columns.

class Columnable a => ImputeOp a where

Instances

Columnable a => ImputeOp a

Defined in DataFrame.Operations.Transformations

Columnable b => ImputeOp (Maybe b)

O(n) Impute missing values in a column using a derived scalar.

Given

  • an expression f :: Expr b -> Expr b that, when interpreted over a non-nullable column, produces the same value in every row (for example a mean, median, or other aggregate), and
  • a nullable column Expr (Maybe b)

this function:

  1. Drops all Nothing values from the target column.
  2. Interprets f on the remaining non-null values.
  3. Checks that the resulting column contains a single repeated value.
  4. Uses that value to impute all Nothings in the original column.
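
The four steps above can be sketched over a plain list of Maybe values. This is a toy model under stated assumptions, not the library's code: imputeWithToy and meanColumn are hypothetical names, the aggregate is modelled as a function from the observed values to a column of results, and failures are returned as Left strings here, whereas the real function throws.

```haskell
import Data.Maybe (catMaybes, fromMaybe)

-- Toy model of the steps: agg maps the observed values to a column of results.
imputeWithToy :: Eq b => ([b] -> [b]) -> [Maybe b] -> Either String [b]
imputeWithToy agg column =
  case agg (catMaybes column) of                   -- steps 1 and 2
    [] -> Left "no observed values to derive a scalar from"
    (v : vs)
      | all (== v) vs ->                           -- step 3
          Right (map (fromMaybe v) column)         -- step 4
      | otherwise -> Left "expression did not produce a single repeated value"

-- Broadcasts the mean to every row, mimicking an aggregate expression.
meanColumn :: [Double] -> [Double]
meanColumn xs = replicate (length xs) (sum xs / fromIntegral (length xs))

main :: IO ()
main = do
  print (imputeWithToy meanColumn [Just 10, Nothing, Just 20])
  -- A non-aggregate function such as id fails the check in step 3.
  print (imputeWithToy id [Just 1, Nothing, Just 2 :: Maybe Int])
```

The check in step 3 is what rejects row-wise expressions: only an expression that collapses the column to one value gives an unambiguous fill value.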

Throws

Example

>>> :set -XOverloadedStrings
>>> import qualified DataFrame as D
>>> import qualified DataFrame.Functions as F
>>> let df =
...       D.fromNamedColumns
...         [ ("age", D.fromList [Just 10, Nothing, Just 20 :: Maybe Int]) ]
>>>
>>> -- Impute missing ages with the mean of the observed ages
>>> D.imputeWith F.mean "age" df
-- age
-- ----
-- 10
-- 15
-- 20
Defined in DataFrame.Operations.Statistics

impute :: ImputeOp a => Expr a -> BaseType a -> DataFrame -> DataFrame

Replace all instances of Nothing in a column with the given value. When the column is already non-nullable, this is a silent no-op.