| Copyright | (c) 2026 |
|---|---|
| License | MIT |
| Maintainer | @ocramz |
| Safe Haskell | None |
| Language | Haskell2010 |
Test.Hspec.BenchGolden
Description
Overview
golds-gym is a framework for golden testing of performance benchmarks.
It integrates with hspec and uses benchpress for lightweight timing measurements.
Benchmarks can use robust statistics to mitigate the impact of outliers.
The library can be used both to assert that performance does not regress, and to set expectations
for improvements across project versions (see benchGoldenWithExpectation).
Quick Start
import Test.Hspec
import Test.Hspec.BenchGolden
import Data.List (sort)
main :: IO ()
main = hspec $ do
  describe "Performance" $ do

    -- Pure function with normal form evaluation
    benchGolden "list sorting" $
      nf sort [1000, 999..1]

    -- Weak head normal form (lazy evaluation)
    benchGolden "replicate" $
      whnf (replicate 1000) 42

    -- IO action with result forced to normal form
    benchGolden "file read" $
      nfIO (readFile "data.txt")
Evaluation strategies control how values are forced:
- nf - Force to normal form (deep evaluation, use for most cases)
- whnf - Force to weak head normal form (only the outermost constructor is evaluated)
- nfIO, whnfIO - Variants for IO actions
- nfAppIO, whnfAppIO - For functions returning IO
- io - Plain IO action without forcing
Without proper evaluation strategies, GHC may optimize away computations or share results across iterations, making benchmarks meaningless.
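For instance, here is a small sketch of how the choice between whnf and nf changes what gets measured; the function and input size are illustrative, and the list is rebuilt inside a lambda per the best-practice note below:

-- Only the outermost (:) constructor of the result is forced; most of the
-- map is left as unevaluated thunks, so very little real work is measured.
benchGolden "map increment (whnf)" $
  whnf (\n -> map (+ 1) [1 .. n]) (10000 :: Int)

-- The whole result list, including every element, is forced, so the full
-- cost of building and traversing the list is measured.
benchGolden "map increment (nf)" $
  nf (\n -> map (+ 1) [1 .. n]) (10000 :: Int)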
Best Practices: Avoiding Shared Thunks
CRITICAL: When benchmarking with data structures, ensure the data is reconstructed on each iteration to avoid measuring shared, cached results.
❌ Anti-pattern (shared list across iterations):
benchGolden "sum" $ nf sum [1..1000000]
The list [1..1000000] is constructed once and shared across all iterations.
This allocates the entire list in memory, creates GC pressure, and prevents
list fusion. The first iteration evaluates the shared thunk, and subsequent
iterations measure cached results.
✅ Correct pattern (list reconstructed per iteration):
benchGolden "sum" $ nf (\n -> sum [1..n]) 1000000
The lambda wrapper ensures the list is reconstructed on every iteration, measuring the true cost of both construction and computation.
Other considerations:
- Ensure return types are inhabited enough to depend on all computations (avoid b ~ () where GHC might optimize away the payload)
- For inlinable functions, ensure full saturation: prefer nf (\n -> f n) x over nf f x to guarantee inlining and rewrite rules fire (see the sketch below)
- Use NFData constraints where applicable to ensure deep evaluation
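A short sketch of the saturation advice above; sumTo is an illustrative function defined here, not part of the library:

sumTo :: Int -> Int
sumTo n = sum [1 .. n]
{-# INLINABLE sumTo #-}

-- Fully saturated call site: sumTo is applied inside the lambda, so GHC can
-- inline it and fire list-fusion rewrite rules on [1 .. n].
benchGolden "sumTo (saturated)" $ nf (\n -> sumTo n) 1000000

-- Unsaturated call site: passing sumTo as a bare function value may block
-- inlining and fusion, so a different program ends up being measured.
benchGolden "sumTo (unsaturated)" $ nf sumTo 1000000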
How It Works
- On first run, the benchmark is executed and results are saved to a golden file as the baseline.
- On subsequent runs, the benchmark is executed and compared against the baseline using a configurable tolerance or expectation combinators.
Architecture-Specific Baselines
Golden files are stored per-architecture to ensure benchmarks are only compared against equivalent hardware. The architecture identifier includes CPU type, OS, and CPU model.
Configuration
Use benchGoldenWith or benchGoldenWithExpectation with a custom BenchConfig:
Tolerance Configuration
The framework supports two tolerance mechanisms that work together:
- Percentage tolerance (tolerancePercent): Checks if the mean time change is within ±X% of the baseline. This is the traditional approach and works well for operations that take more than a few milliseconds.
- Absolute tolerance (absoluteToleranceMs): Checks if the absolute time difference is within X milliseconds. This prevents false failures for extremely fast operations (< 1ms), where measurement noise causes large percentage variations despite negligible absolute differences.
By default, benchmarks pass if EITHER tolerance is satisfied:
pass = (percentChange <= 15%) OR (absTimeDiff <= 0.01 ms)
This hybrid strategy combines the benefits of both approaches (see the numeric sketch below):
- For fast operations (< 1ms): Absolute tolerance dominates, preventing noise
- For slow operations (> 1ms): Percentage tolerance dominates, catching real regressions
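As a concrete illustration, the timings below (in milliseconds) are made up and the check is expressed with the exported withinHybrid helper documented later in this module:

-- Fast operation: 0.002 ms -> 0.004 ms is a 100% change but only a 0.002 ms
-- difference, so the 0.01 ms absolute tolerance lets it pass.
fastPasses :: Bool
fastPasses = withinHybrid 15.0 0.01 0.002 0.004   -- True

-- Slow operation: 50 ms -> 65 ms is a 30% change and a 15 ms difference,
-- so neither tolerance is satisfied and the benchmark fails.
slowFails :: Bool
slowFails = withinHybrid 15.0 0.01 50.0 65.0      -- False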
To disable absolute tolerance and use percentage-only comparison:
benchGoldenWith defaultBenchConfig
  { absoluteToleranceMs = Nothing
  }
  "benchmark" $ ...
To adjust the absolute tolerance threshold:
benchGoldenWith defaultBenchConfig
  { absoluteToleranceMs = Just 0.001 -- 1 microsecond (very strict)
  }
  "benchmark" $ ...
Synopsis
- benchGolden :: String -> BenchAction -> Spec
- benchGoldenWith :: BenchConfig -> String -> BenchAction -> Spec
- benchGoldenWithExpectation :: String -> BenchConfig -> [Expectation] -> BenchAction -> Spec
- data BenchConfig = BenchConfig {}
- defaultBenchConfig :: BenchConfig
- data BenchGolden = BenchGolden {
- benchName :: !String
- benchAction :: !BenchAction
- benchConfig :: !BenchConfig
- newtype BenchAction = BenchAction {
- runBenchAction :: Word64 -> IO ()
- data GoldenStats = GoldenStats {
- statsMean :: !Double
- statsStddev :: !Double
- statsMedian :: !Double
- statsMin :: !Double
- statsMax :: !Double
- statsPercentiles :: ![(Int, Double)]
- statsArch :: !Text
- statsTimestamp :: !UTCTime
- statsTrimmedMean :: !Double
- statsMAD :: !Double
- statsIQR :: !Double
- statsOutliers :: ![Double]
- data BenchResult
- = FirstRun !GoldenStats
- | Pass !GoldenStats !GoldenStats ![Warning]
- | Regression !GoldenStats !GoldenStats !Double !Double !(Maybe Double)
- | Improvement !GoldenStats !GoldenStats !Double !Double !(Maybe Double)
- data Warning
- = VarianceIncreased !Double !Double !Double !Double
- | VarianceDecreased !Double !Double !Double !Double
- | HighVariance !Double
- | OutliersDetected !Int ![Double]
- data ArchConfig = ArchConfig {}
- nf :: NFData b => (a -> b) -> a -> BenchAction
- whnf :: (a -> b) -> a -> BenchAction
- nfIO :: NFData a => IO a -> BenchAction
- whnfIO :: IO a -> BenchAction
- nfAppIO :: NFData b => (a -> IO b) -> a -> BenchAction
- whnfAppIO :: (a -> IO b) -> a -> BenchAction
- io :: IO () -> BenchAction
- runBenchGolden :: BenchGolden -> IO BenchResult
- expect :: Lens' GoldenStats Double -> Tolerance -> Expectation
- pattern And :: !Expectation -> !Expectation -> Expectation
- pattern ExpectStat :: !(Lens' GoldenStats Double) -> !Tolerance -> Expectation
- pattern Or :: !Expectation -> !Expectation -> Expectation
- data Tolerance
- metricFor :: BenchConfig -> Lens' GoldenStats Double
- varianceFor :: BenchConfig -> Lens' GoldenStats Double
- _statsMean :: Lens' GoldenStats Double
- _statsStddev :: Lens' GoldenStats Double
- _statsMedian :: Lens' GoldenStats Double
- _statsMin :: Lens' GoldenStats Double
- _statsMax :: Lens' GoldenStats Double
- _statsTrimmedMean :: Lens' GoldenStats Double
- _statsMAD :: Lens' GoldenStats Double
- _statsIQR :: Lens' GoldenStats Double
- expectStat :: Lens' GoldenStats Double -> Tolerance -> Expectation
- checkExpectation :: Expectation -> GoldenStats -> GoldenStats -> Bool
- withinPercent :: Double -> Double -> Double -> Bool
- withinAbsolute :: Double -> Double -> Double -> Bool
- withinHybrid :: Double -> Double -> Double -> Double -> Bool
- mustImprove :: Double -> Double -> Double -> Bool
- mustRegress :: Double -> Double -> Double -> Bool
- (@~) :: Double -> Double -> Double -> Bool
- (@<) :: Double -> Double -> Double -> Bool
- (@<<) :: Double -> Double -> Double -> Bool
- (@>>) :: Double -> Double -> Double -> Bool
- (&&~) :: Expectation -> Expectation -> Expectation
- (||~) :: Expectation -> Expectation -> Expectation
- percentDiff :: Double -> Double -> Double
- absDiff :: Double -> Double -> Double
- toleranceFromExpectation :: Expectation -> (Double, Maybe Double)
- toleranceValues :: Tolerance -> (Double, Maybe Double)
- module Test.Hspec.BenchGolden.Arch
Spec Combinators
benchGolden Source #
Arguments
| :: String | Name of the benchmark |
| -> BenchAction | The benchmarkable action |
| -> Spec |
Create a benchmark golden test with default configuration.
This is the simplest way to add a benchmark test:
describe "Sorting" $ do
  benchGolden "quicksort 1000 elements" $
    nf quicksort [1000, 999..1]
Use evaluation strategy combinators to control how values are forced:
- nf - Normal form (deep evaluation)
- whnf - Weak head normal form (shallow evaluation)
- nfIO - Normal form for IO actions
- whnfIO - WHNF for IO actions
- nfAppIO - Normal form for functions returning IO
- whnfAppIO - WHNF for functions returning IO
- io - Plain IO action (for backward compatibility)
Default configuration:
- 100 iterations
- 5 warm-up iterations
- 15% tolerance
- Variance warnings enabled
- Standard statistics (not robust mode)
benchGoldenWith Source #
Arguments
| :: BenchConfig | Configuration parameters |
| -> String | Name of the benchmark |
| -> BenchAction | The benchmarkable action |
| -> Spec |
Create a benchmark golden test with custom configuration.
Examples:
-- Tighter tolerance for critical code
benchGoldenWith defaultBenchConfig
  { iterations = 500
  , tolerancePercent = 5.0
  , warmupIterations = 20
  }
  "hot loop" $
    nf criticalFunction input

-- Robust statistics mode for noisy environments
benchGoldenWith defaultBenchConfig
  { useRobustStatistics = True
  , trimPercent = 10.0
  , outlierThreshold = 3.0
  }
  "benchmark with outliers" $
    whnf computation input
benchGoldenWithExpectation Source #
Arguments
| :: String | Name of the benchmark |
| -> BenchConfig | Configuration parameters |
| -> [Expectation] | List of expectations (all must pass) |
| -> BenchAction | The benchmarkable action |
| -> Spec |
Create a benchmark golden test with custom lens-based expectations.
This combinator allows you to specify custom performance expectations using
lenses and tolerance combinators. Expectations can be composed using boolean
operators (&&~, ||~).
Examples:
-- Median-based comparison (more robust to outliers)
benchGoldenWithExpectation "median test" defaultBenchConfig
  [expect _statsMedian (Percent 10.0)]
  (nf sort [1000, 999..1])

-- Multiple metrics must pass (AND composition)
benchGoldenWithExpectation "strict test" defaultBenchConfig
  [ expect _statsMean (Percent 15.0) &&~ expect _statsMAD (Percent 50.0) ]
  (nf algorithm input)

-- Either metric can pass (OR composition)
benchGoldenWithExpectation "flexible test" defaultBenchConfig
  [ expect _statsMedian (Percent 10.0) ||~ expect _statsMin (Absolute 0.01) ]
  (nf fastOp input)

-- Expect performance improvement (must be faster)
benchGoldenWithExpectation "optimization" defaultBenchConfig
  [expect _statsMean (MustImprove 10.0)]   -- Must be ≥10% faster
  (nf optimizedVersion input)

-- Expect controlled regression (for feature additions)
benchGoldenWithExpectation "new feature" defaultBenchConfig
  [expect _statsMean (MustRegress 5.0)]    -- Accept 5-20% slowdown
  (nf newFeature input)

-- Low variance requirement
benchGoldenWithExpectation "stable perf" defaultBenchConfig
  [ expect _statsMean (Percent 15.0) &&~ expect _statsIQR (Absolute 0.1) ]
  (nfIO stableOperation)
Note: Expectations are checked against golden files. On first run, a baseline
is created. Use GOLDS_GYM_ACCEPT=1 to regenerate baselines.
Configuration
data BenchConfig Source #
Configurable parameters for benchmark execution and comparison.
Constructors
| BenchConfig | |
Fields
| |
Instances
defaultBenchConfig :: BenchConfig Source #
Default benchmark configuration with sensible defaults.
- 100 iterations
- 5 warm-up iterations
- 15% tolerance on mean time
- 0.01 ms (10 microseconds) absolute tolerance - prevents false failures for fast operations
- Variance warnings enabled at 50% tolerance
- Output to the .golden/ directory
- Success on first run (creates baseline)
Hybrid Tolerance Strategy
The default configuration uses BOTH percentage and absolute tolerance:
- Benchmarks pass if mean time is within ±15% OR within ±0.01ms
- This prevents measurement noise from failing fast operations (< 1ms)
- For slower operations (> 1ms), percentage tolerance dominates
Set absoluteToleranceMs = Nothing for percentage-only comparison.
Types
data BenchGolden Source #
Configuration for a single benchmark golden test.
Constructors
| BenchGolden | |
Fields
| |
Instances
| Example BenchGolden Source # | Instance for BenchGolden without arguments. | ||||
Defined in Test.Hspec.BenchGolden Associated Types
Methods evaluateExample :: BenchGolden -> Params -> (ActionWith (Arg BenchGolden) -> IO ()) -> ProgressCallback -> IO Result # | |||||
| Example (arg -> BenchGolden) Source # | Instance for BenchGolden with an argument. This allows benchmarks to receive setup data from hspec (e.g. via before or around hooks). | ||||
Defined in Test.Hspec.BenchGolden Associated Types
Methods evaluateExample :: (arg -> BenchGolden) -> Params -> (ActionWith (Arg (arg -> BenchGolden)) -> IO ()) -> ProgressCallback -> IO Result # | |||||
| type Arg BenchGolden Source # | |||||
Defined in Test.Hspec.BenchGolden | |||||
| type Arg (arg -> BenchGolden) Source # | |||||
Defined in Test.Hspec.BenchGolden | |||||
newtype BenchAction Source #
A benchmarkable action that can be run multiple times.
The Word64 parameter represents the number of iterations to execute.
Constructors
| BenchAction | |
Fields
| |
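A minimal sketch of building a BenchAction by hand (the IORef workload is illustrative); in practice the nf/whnf/io combinators below are preferred, since they also take care of forcing results so nothing is cached between iterations:

import Control.Monad (forM_)
import Data.IORef (modifyIORef', newIORef)

manualAction :: BenchAction
manualAction = BenchAction $ \iters -> do
  ref <- newIORef (0 :: Int)
  forM_ [1 .. iters] $ \_ ->
    modifyIORef' ref (+ 1)   -- the work being timed, repeated iters times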
data GoldenStats Source #
Statistics stored in golden files.
These represent the baseline performance characteristics of a benchmark on a specific architecture.
Constructors
| GoldenStats | |
Fields
| |
Instances
| FromJSON GoldenStats Source # | |||||
Defined in Test.Hspec.BenchGolden.Types | |||||
| ToJSON GoldenStats Source # | |||||
Defined in Test.Hspec.BenchGolden.Types Methods toJSON :: GoldenStats -> Value # toEncoding :: GoldenStats -> Encoding # toJSONList :: [GoldenStats] -> Value # toEncodingList :: [GoldenStats] -> Encoding # omitField :: GoldenStats -> Bool # | |||||
| Generic GoldenStats Source # | |||||
Defined in Test.Hspec.BenchGolden.Types Associated Types
| |||||
| Show GoldenStats Source # | |||||
Defined in Test.Hspec.BenchGolden.Types Methods showsPrec :: Int -> GoldenStats -> ShowS # show :: GoldenStats -> String # showList :: [GoldenStats] -> ShowS # | |||||
| Eq GoldenStats Source # | |||||
Defined in Test.Hspec.BenchGolden.Types | |||||
| type Rep GoldenStats Source # | |||||
Defined in Test.Hspec.BenchGolden.Types type Rep GoldenStats = D1 ('MetaData "GoldenStats" "Test.Hspec.BenchGolden.Types" "golds-gym-0.4.0.0-3BAgiywNr8n8vcHD5cO6f6" 'False) (C1 ('MetaCons "GoldenStats" 'PrefixI 'True) (((S1 ('MetaSel ('Just "statsMean") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Double) :*: (S1 ('MetaSel ('Just "statsStddev") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Double) :*: S1 ('MetaSel ('Just "statsMedian") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Double))) :*: (S1 ('MetaSel ('Just "statsMin") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Double) :*: (S1 ('MetaSel ('Just "statsMax") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Double) :*: S1 ('MetaSel ('Just "statsPercentiles") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 [(Int, Double)])))) :*: ((S1 ('MetaSel ('Just "statsArch") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Text) :*: (S1 ('MetaSel ('Just "statsTimestamp") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 UTCTime) :*: S1 ('MetaSel ('Just "statsTrimmedMean") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Double))) :*: (S1 ('MetaSel ('Just "statsMAD") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Double) :*: (S1 ('MetaSel ('Just "statsIQR") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Double) :*: S1 ('MetaSel ('Just "statsOutliers") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 [Double])))))) | |||||
data BenchResult Source #
Result of running a benchmark and comparing against golden.
Constructors
| FirstRun !GoldenStats | No golden file existed; baseline created |
| Pass !GoldenStats !GoldenStats ![Warning] | Benchmark passed (golden stats, actual stats, warnings) |
| Regression !GoldenStats !GoldenStats !Double !Double !(Maybe Double) | Performance regression (golden, actual, percent change, tolerance, absolute tolerance) |
| Improvement !GoldenStats !GoldenStats !Double !Double !(Maybe Double) | Performance improvement (golden, actual, percent change, tolerance, absolute tolerance) |
Instances
| Show BenchResult Source # | |
Defined in Test.Hspec.BenchGolden.Types Methods showsPrec :: Int -> BenchResult -> ShowS # show :: BenchResult -> String # showList :: [BenchResult] -> ShowS # | |
| Eq BenchResult Source # | |
Defined in Test.Hspec.BenchGolden.Types | |
data Warning Source #
Warnings that may be emitted during benchmark comparison.
Constructors
| VarianceIncreased !Double !Double !Double !Double | Stddev increased (golden, actual, percent change, tolerance) |
| VarianceDecreased !Double !Double !Double !Double | Stddev decreased significantly (golden, actual, percent change, tolerance) |
| HighVariance !Double | Current run has unusually high variance |
| OutliersDetected !Int ![Double] | Outliers detected (count, list of outlier timings) |
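A small sketch of turning warnings into human-readable messages, for example when inspecting a Pass result from runBenchGolden below; the wording is illustrative:

describeWarning :: Warning -> String
describeWarning w = case w of
  VarianceIncreased golden actual pct tol ->
    "stddev rose from " ++ show golden ++ " to " ++ show actual
      ++ " ms (" ++ show pct ++ "% change, tolerance " ++ show tol ++ "%)"
  VarianceDecreased golden actual pct tol ->
    "stddev fell from " ++ show golden ++ " to " ++ show actual
      ++ " ms (" ++ show pct ++ "% change, tolerance " ++ show tol ++ "%)"
  HighVariance stddev ->
    "unusually high variance in this run: " ++ show stddev
  OutliersDetected n outliers ->
    show n ++ " outliers detected: " ++ show outliers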
data ArchConfig Source #
Machine architecture configuration.
Used to generate unique identifiers for golden file directories, ensuring benchmarks are only compared against equivalent hardware.
Constructors
| ArchConfig | |
Instances
| FromJSON ArchConfig Source # | |||||
Defined in Test.Hspec.BenchGolden.Types | |||||
| ToJSON ArchConfig Source # | |||||
Defined in Test.Hspec.BenchGolden.Types Methods toJSON :: ArchConfig -> Value # toEncoding :: ArchConfig -> Encoding # toJSONList :: [ArchConfig] -> Value # toEncodingList :: [ArchConfig] -> Encoding # omitField :: ArchConfig -> Bool # | |||||
| Generic ArchConfig Source # | |||||
Defined in Test.Hspec.BenchGolden.Types Associated Types
| |||||
| Show ArchConfig Source # | |||||
Defined in Test.Hspec.BenchGolden.Types Methods showsPrec :: Int -> ArchConfig -> ShowS # show :: ArchConfig -> String # showList :: [ArchConfig] -> ShowS # | |||||
| Eq ArchConfig Source # | |||||
Defined in Test.Hspec.BenchGolden.Types | |||||
| type Rep ArchConfig Source # | |||||
Defined in Test.Hspec.BenchGolden.Types type Rep ArchConfig = D1 ('MetaData "ArchConfig" "Test.Hspec.BenchGolden.Types" "golds-gym-0.4.0.0-3BAgiywNr8n8vcHD5cO6f6" 'False) (C1 ('MetaCons "ArchConfig" 'PrefixI 'True) ((S1 ('MetaSel ('Just "archId") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Text) :*: S1 ('MetaSel ('Just "archOS") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Text)) :*: (S1 ('MetaSel ('Just "archCPU") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 Text) :*: S1 ('MetaSel ('Just "archModel") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 (Maybe Text))))) | |||||
Benchmarkable Constructors
nf :: NFData b => (a -> b) -> a -> BenchAction Source #
Benchmark a pure function applied to an argument, forcing the result to
normal form (NF) using rnf from Control.DeepSeq.
This ensures the entire result structure is evaluated.
Example:
benchGolden "fib 30" (nf fib 30)
whnf :: (a -> b) -> a -> BenchAction Source #
Benchmark a pure function applied to an argument, forcing the result to weak head normal form (WHNF) only. This evaluates just the outermost constructor.
Example:
benchGolden "replicate" (whnf (replicate 1000) 42)
nfIO :: NFData a => IO a -> BenchAction Source #
Benchmark an IO action, forcing the result to normal form.
Example:
benchGolden "readFile" (nfIO $ readFile "data.txt")
whnfIO :: IO a -> BenchAction Source #
Benchmark an IO action, forcing the result to weak head normal form.
Example:
benchGolden "getLine" (whnfIO getLine)
nfAppIO :: NFData b => (a -> IO b) -> a -> BenchAction Source #
Benchmark a function that performs IO, forcing the result to normal form.
Example:
benchGolden "lookup in map" (nfAppIO lookupInDB "key")
whnfAppIO :: (a -> IO b) -> a -> BenchAction Source #
Benchmark a function that performs IO, forcing the result to weak head normal form.
Example:
benchGolden "query database" (whnfAppIO queryDB params)
io :: IO () -> BenchAction Source #
Benchmark an IO action, discarding the result.
This is for backward compatibility with code that uses IO () actions.
Example:
benchGolden "compute" (io $ do
result <- heavyComputation
evaluate result)
Low-Level API
runBenchGolden :: BenchGolden -> IO BenchResult Source #
Run a benchmark golden test.
This function:
- Runs warm-up iterations (discarded)
- Runs the actual benchmark
- Writes actual results to a .actual file
- If no golden exists, creates it (first run)
- Otherwise, compares against golden with tolerance
The result includes any warnings (e.g., variance changes).
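For example, a single benchmark can be driven outside of hspec and its result pattern-matched directly; the benchmark name and action are illustrative:

reportSum :: IO ()
reportSum = do
  let bench = BenchGolden
        { benchName   = "sum to one million"
        , benchAction = nf (\n -> sum [1 .. n :: Int]) 1000000
        , benchConfig = defaultBenchConfig
        }
  result <- runBenchGolden bench
  case result of
    FirstRun _              -> putStrLn "baseline created"
    Pass _ _ warnings       -> putStrLn ("pass with " ++ show (length warnings) ++ " warning(s)")
    Regression _ _ pct _ _  -> putStrLn ("regression: " ++ show pct ++ "% change vs baseline")
    Improvement _ _ pct _ _ -> putStrLn ("improvement: " ++ show pct ++ "% change vs baseline")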
Lens-Based Expectations
expect :: Lens' GoldenStats Double -> Tolerance -> Expectation Source #
Create an expectation for a specific statistic field.
Example:
expect _statsMedian (Percent 10.0)
expect _statsIQR (Absolute 0.5)
expect _statsMean (Hybrid 15.0 0.01)
expect _statsMean (MustImprove 10.0)
pattern And :: !Expectation -> !Expectation -> Expectation Source #
Both expectations must pass
pattern ExpectStat :: !(Lens' GoldenStats Double) -> !Tolerance -> Expectation Source #
Expect a specific field to be within tolerance
pattern Or :: !Expectation -> !Expectation -> Expectation Source #
Either expectation can pass
data Tolerance Source #
Tolerance specification for performance comparison.
Constructors
| Percent !Double | Percentage tolerance (e.g., Percent 15.0 for within ±15%) |
| Absolute !Double | Absolute tolerance in milliseconds (e.g., Absolute 0.01 for within ±0.01 ms) |
| Hybrid !Double !Double | Hybrid tolerance: pass if EITHER percentage OR absolute is satisfied (e.g., Hybrid 15.0 0.01) |
| MustImprove !Double | Must be faster by at least this percentage (e.g., MustImprove 10.0 for at least 10% faster) |
| MustRegress !Double | Must be slower by at least this percentage (e.g., MustRegress 5.0 for at least 5% slower) |
metricFor :: BenchConfig -> Lens' GoldenStats Double Source #
Select the appropriate central tendency metric based on configuration.
Returns:
- _statsTrimmedMean if useRobustStatistics is True
- _statsMean otherwise
Example:
let lens = metricFor config
baseline = golden ^. lens
current = actual ^. lens
varianceFor :: BenchConfig -> Lens' GoldenStats Double Source #
Select the appropriate dispersion metric based on configuration.
Returns:
- _statsMAD if useRobustStatistics is True
- _statsStddev otherwise
Example:
let vLens = varianceFor config
goldenVar = golden ^. vLens
actualVar = actual ^. vLens
_statsMean :: Lens' GoldenStats Double Source #
Lens for mean execution time in milliseconds.
_statsStddev :: Lens' GoldenStats Double Source #
Lens for standard deviation in milliseconds.
_statsMedian :: Lens' GoldenStats Double Source #
Lens for median execution time in milliseconds.
_statsTrimmedMean :: Lens' GoldenStats Double Source #
Lens for trimmed mean (with tails removed) in milliseconds.
_statsMAD :: Lens' GoldenStats Double Source #
Lens for median absolute deviation (MAD) in milliseconds.
_statsIQR :: Lens' GoldenStats Double Source #
Lens for interquartile range (IQR = Q3 - Q1) in milliseconds.
expectStat :: Lens' GoldenStats Double -> Tolerance -> Expectation Source #
Create an expectation using a custom lens.
This is an alias for expect for compatibility.
checkExpectation :: Expectation -> GoldenStats -> GoldenStats -> Bool Source #
withinPercent :: Double -> Double -> Double -> Bool Source #
Check if value is within percentage tolerance.
withinPercent 15.0 baseline actual -- within ±15%
withinAbsolute :: Double -> Double -> Double -> Bool Source #
Check if value is within absolute tolerance (milliseconds).
withinAbsolute 0.01 baseline actual -- within ±0.01ms
withinHybrid :: Double -> Double -> Double -> Double -> Bool Source #
Check if value satisfies hybrid tolerance (percentage OR absolute).
withinHybrid 15.0 0.01 baseline actual -- within ±15% OR ±0.01ms
mustImprove :: Double -> Double -> Double -> Bool Source #
Check if actual is faster than baseline by at least the given percentage.
mustImprove 10.0 baseline actual -- must be ≥10% faster
mustRegress :: Double -> Double -> Double -> Bool Source #
Check if actual is slower than baseline by at least the given percentage.
mustRegress 5.0 baseline actual -- must be ≥5% slower
(@~) :: Double -> Double -> Double -> Bool infixl 4 Source #
Infix operator for percentage tolerance check.
baseline @~ 15.0 $ actual -- within ±15%
(@<) :: Double -> Double -> Double -> Bool infixl 4 Source #
Infix operator for absolute tolerance check.
baseline @< 0.01 $ actual -- within ±0.01ms
(@<<) :: Double -> Double -> Double -> Bool infixl 4 Source #
Infix operator for "must improve" check.
baseline @<< 10.0 $ actual -- must be ≥10% faster
(@>>) :: Double -> Double -> Double -> Bool infixl 4 Source #
Infix operator for "must regress" check.
baseline @>> 5.0 $ actual -- must be ≥5% slower
(&&~) :: Expectation -> Expectation -> Expectation infixr 3 Source #
AND composition of expectations (both must pass).
expect _statsMean (Percent 15.0) &&~ expect _statsMAD (Percent 50.0)
(||~) :: Expectation -> Expectation -> Expectation infixr 2 Source #
OR composition of expectations (either can pass).
expect _statsMedian (Percent 10.0) ||~ expect _statsMin (Absolute 0.01)
percentDiff :: Double -> Double -> Double Source #
Calculate percentage difference between baseline and actual.
Returns: ((actual - baseline) / baseline) * 100
- Positive = regression (slower)
- Negative = improvement (faster)
- Zero = no change
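A quick numeric illustration of the sign convention (the timings, in milliseconds, are made up):

slower, faster :: Double
slower = percentDiff 0.50 0.55   -- roughly  10.0 (positive: regression)
faster = percentDiff 0.50 0.45   -- roughly -10.0 (negative: improvement)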
absDiff :: Double -> Double -> Double Source #
Calculate absolute difference between baseline and actual.
Returns: abs(actual - baseline)
toleranceFromExpectation :: Expectation -> (Double, Maybe Double) Source #
Extract tolerance description from an expectation for error messages. For compound expectations (And/Or), returns the first tolerance found.
toleranceValues :: Tolerance -> (Double, Maybe Double) Source #
Extract percentage and optional absolute tolerance from a Tolerance.
Re-exports
module Test.Hspec.BenchGolden.Arch
Orphan instances
| Example BenchGolden Source # | Instance for BenchGolden without arguments. | ||||
Associated Types
Methods evaluateExample :: BenchGolden -> Params -> (ActionWith (Arg BenchGolden) -> IO ()) -> ProgressCallback -> IO Result # | |||||
| Example (arg -> BenchGolden) Source # | Instance for BenchGolden with an argument. This allows benchmarks to receive setup data from hspec (e.g. via before or around hooks). | ||||
Associated Types
Methods evaluateExample :: (arg -> BenchGolden) -> Params -> (ActionWith (Arg (arg -> BenchGolden)) -> IO ()) -> ProgressCallback -> IO Result # | |||||