golds-gym-0.2.0.0: Golden testing framework for performance benchmarks

Copyright	(c) 2026
License	MIT
Maintainer	your.email@example.com
Safe Haskell	None
Language	Haskell2010

Test.Hspec.BenchGolden.Runner

Contents

Running Benchmarks
Golden File Operations
Comparison
Robust Statistics
Environment

Description

This module handles running benchmarks and comparing results against golden files. It includes:

Benchmark execution with warm-up iterations
Golden file IO (reading/writing JSON statistics)
Tolerance-based comparison with variance warnings
Support for updating baselines via the GOLDS_GYM_ACCEPT environment variable

Synopsis

runBenchGolden :: BenchGolden -> IO BenchResult
runBenchmark :: String -> IO () -> BenchConfig -> ArchConfig -> IO GoldenStats
runBenchmarkWithRawTimings :: String -> IO () -> BenchConfig -> ArchConfig -> IO GoldenStats
readGoldenFile :: FilePath -> IO (Either String GoldenStats)
writeGoldenFile :: FilePath -> FilePath -> String -> GoldenStats -> IO ()
writeActualFile :: FilePath -> FilePath -> String -> GoldenStats -> IO ()
getGoldenPath :: FilePath -> FilePath -> String -> FilePath
getActualPath :: FilePath -> FilePath -> String -> FilePath
compareStats :: BenchConfig -> GoldenStats -> GoldenStats -> BenchResult
checkVariance :: BenchConfig -> GoldenStats -> GoldenStats -> [Warning]
calculateRobustStats :: BenchConfig -> Vector Double -> Double -> (Double, Double, Double, [Double])
calculateTrimmedMean :: Double -> Vector Double -> Double
calculateMAD :: Vector Double -> Double -> Double
calculateIQR :: Vector Double -> Double
detectOutliers :: Double -> Vector Double -> Double -> Double -> [Double]
shouldUpdateGolden :: IO Bool
shouldSkipBenchmarks :: IO Bool
setAcceptGoldens :: Bool -> IO ()
setSkipBenchmarks :: Bool -> IO ()

Running Benchmarks

runBenchGolden :: BenchGolden -> IO BenchResult

Run a benchmark golden test.

This function:

Runs warm-up iterations (discarded)
Runs the actual benchmark
Writes actual results to an .actual file
If no golden exists, creates it (first run)
Otherwise, compares against golden with tolerance

The result includes any warnings (e.g., variance changes).

runBenchmark :: String -> IO () -> BenchConfig -> ArchConfig -> IO GoldenStats

Run a benchmark and collect statistics.

runBenchmarkWithRawTimings :: String -> IO () -> BenchConfig -> ArchConfig -> IO GoldenStats

Run a benchmark with raw timing collection for robust statistics.

Golden File Operations

readGoldenFile :: FilePath -> IO (Either String GoldenStats)

Read a golden file.

writeGoldenFile :: FilePath -> FilePath -> String -> GoldenStats -> IO ()

Write a golden file.

writeActualFile :: FilePath -> FilePath -> String -> GoldenStats -> IO ()

Write an actual results file.

getGoldenPath :: FilePath -> FilePath -> String -> FilePath

Get the path for a golden file.

getActualPath :: FilePath -> FilePath -> String -> FilePath

Get the path for an actual results file.

Comparison

compareStats :: BenchConfig -> GoldenStats -> GoldenStats -> BenchResult

Compare actual stats against golden stats.

Returns a BenchResult indicating whether the benchmark passed, regressed, or improved, along with any warnings.

Hybrid Tolerance Strategy

The comparison applies a percentage tolerance and, when configured, an absolute tolerance; passing either check is enough:

Calculate percentage difference: ((actual - golden) / golden) * 100
Pass if abs(percentDiff) <= tolerancePercent (percentage check)
OR if abs(actual - golden) <= absoluteToleranceMs (absolute check)

This prevents false failures for sub-millisecond operations where measurement noise creates large percentage variations despite negligible absolute differences.
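The either-check rule above can be sketched as follows. This is a minimal illustration, not the library's implementation: the Tolerance record is a hypothetical stand-in for the relevant BenchConfig fields, and modelling the optional absolute tolerance as a Maybe is an assumption.

```haskell
-- Hypothetical stand-in for the tolerance fields of BenchConfig;
-- the real record may be shaped differently.
data Tolerance = Tolerance
  { tolerancePercent    :: Double        -- allowed relative deviation, in percent
  , absoluteToleranceMs :: Maybe Double  -- optional absolute slack, in milliseconds
  }

-- Percentage difference of the actual value relative to the golden value.
-- A golden value of 0 is not handled here.
percentDiff :: Double -> Double -> Double
percentDiff golden actual = ((actual - golden) / golden) * 100

-- Pass when the percentage check OR the (optional) absolute check holds.
withinTolerance :: Tolerance -> Double -> Double -> Bool
withinTolerance tol golden actual = percentOk || absoluteOk
  where
    percentOk  = abs (percentDiff golden actual) <= tolerancePercent tol
    absoluteOk = maybe False (\ms -> abs (actual - golden) <= ms)
                       (absoluteToleranceMs tol)
```

For example, a golden value of 0.010 ms and an actual value of 0.013 ms differ by roughly 30 percent but only 0.003 ms, so withinTolerance (Tolerance 15 (Just 0.01)) 0.010 0.013 passes on the absolute check alone, while the same call with Nothing fails.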

checkVariance :: BenchConfig -> GoldenStats -> GoldenStats -> [Warning]

Check for variance changes and generate warnings.

Robust Statistics

calculateRobustStats :: BenchConfig -> Vector Double -> Double -> (Double, Double, Double, [Double])

Calculate robust statistics from raw timing data.

Returns: (trimmed mean, MAD, IQR, outliers)

calculateTrimmedMean :: Double -> Vector Double -> Double

Calculate trimmed mean by removing specified percentage from each tail.
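As an illustration of the idea, here is a sketch of a trimmed mean over a plain list rather than a Vector. The name trimmedMean, the rounding of the trim count with floor, and the result of 0 for empty input are assumptions, not details taken from the library.

```haskell
import Data.List (sort)

-- Trimmed mean over a plain list. 'pct' is the percentage dropped from EACH
-- tail (assumed 0 <= pct < 50). Returns 0 when nothing is left to average.
trimmedMean :: Double -> [Double] -> Double
trimmedMean pct xs
  | null kept = 0
  | otherwise = sum kept / fromIntegral (length kept)
  where
    sorted = sort xs
    n      = length sorted
    -- Number of samples to drop from each end (rounded down; an assumption).
    k      = floor (fromIntegral n * pct / 100) :: Int
    kept   = take (n - 2 * k) (drop k sorted)
```

With ten samples and pct = 10, one sample is dropped from each end, so a single slow run such as 100 among values 1 to 9 no longer affects the mean.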

calculateMAD :: Vector Double -> Double -> Double

Calculate Median Absolute Deviation (MAD).

MAD = median(|x_i - median(x)|)
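The formula can be sketched over a plain list as below. The helper names median and mad are hypothetical, and unlike calculateMAD, which takes an extra Double argument, this sketch computes the median itself.

```haskell
import Data.List (sort)

-- Median of a list; averages the two middle values for even lengths.
-- Returns 0 for an empty list (an assumption made for this sketch).
median :: [Double] -> Double
median [] = 0
median xs
  | even n    = (sorted !! (h - 1) + sorted !! h) / 2
  | otherwise = sorted !! h
  where
    sorted = sort xs
    n      = length sorted
    h      = n `div` 2

-- MAD = median(|x_i - median(x)|)
mad :: [Double] -> Double
mad xs = median [abs (x - m) | x <- xs]
  where
    m = median xs
```

For [1, 2, 3, 4, 100] the median is 3, the absolute deviations are [2, 1, 0, 1, 97], and the MAD is 1, so the single extreme value has no influence on the spread estimate.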

calculateIQR :: Vector Double -> Double

Calculate Interquartile Range (IQR = Q3 - Q1).
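Quartile conventions vary between libraries, and this page does not say which one calculateIQR uses. The sketch below assumes linear interpolation between order statistics; the names quantile and iqr are hypothetical.

```haskell
import Data.List (sort)

-- Quantile by linear interpolation between order statistics.
-- 'q' is assumed to lie in [0, 1]; returns 0 for an empty list.
quantile :: Double -> [Double] -> Double
quantile _ [] = 0
quantile q xs = lower + frac * (upper - lower)
  where
    sorted = sort xs
    lastIx = length sorted - 1
    pos    = q * fromIntegral lastIx       -- fractional index into 'sorted'
    i      = floor pos :: Int
    frac   = pos - fromIntegral i
    lower  = sorted !! i
    upper  = sorted !! min (i + 1) lastIx  -- clamp at the last element

-- IQR = Q3 - Q1
iqr :: [Double] -> Double
iqr xs = quantile 0.75 xs - quantile 0.25 xs
```

Under this convention, [1, 2, 3, 4, 5] has Q1 = 2 and Q3 = 4, giving an IQR of 2. Other quartile definitions can give slightly different values on small samples.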

detectOutliers :: Double -> Vector Double -> Double -> Double -> [Double]

Detect outliers using MAD-based threshold.

An observation is an outlier if: |x - median| > threshold * MAD
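The rule translates directly into a filter. In this sketch the function name outliersBy and its argument order (threshold, median, MAD, samples) are hypothetical and do not necessarily match detectOutliers.

```haskell
-- Keep the observations whose distance from the median exceeds threshold * MAD.
-- The caller supplies the median and MAD, computed elsewhere.
outliersBy :: Double -> Double -> Double -> [Double] -> [Double]
outliersBy threshold med madValue =
  filter (\x -> abs (x - med) > threshold * madValue)
```

With a threshold of 3, a median of 3 and a MAD of 1, only 100 is flagged in [1, 2, 3, 4, 100]. Note that when the MAD is 0, every value different from the median satisfies the inequality.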

Environment

shouldUpdateGolden :: IO Bool

Check if golden files should be updated.

Returns True if the GOLDS_GYM_ACCEPT environment variable is set.

Usage:

GOLDS_GYM_ACCEPT=1 cabal test
GOLDS_GYM_ACCEPT=1 stack test
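A presence check of this kind can be sketched with lookupEnv from base. The helper envIsSet and the demo action are hypothetical, and treating any value as "set" is an assumption based on the description above; the library may also consult the flag set by setAcceptGoldens.

```haskell
import Data.Maybe (isJust)
import System.Environment (lookupEnv, setEnv, unsetEnv)

-- True when the named variable is present in the environment,
-- regardless of its value.
envIsSet :: String -> IO Bool
envIsSet name = isJust <$> lookupEnv name

-- Round trip: set the variable, check it, unset it, check again.
demo :: IO (Bool, Bool)
demo = do
  setEnv "GOLDS_GYM_ACCEPT" "1"
  whenSet <- envIsSet "GOLDS_GYM_ACCEPT"
  unsetEnv "GOLDS_GYM_ACCEPT"
  whenUnset <- envIsSet "GOLDS_GYM_ACCEPT"
  pure (whenSet, whenUnset)
```

The same pattern would apply to GOLDS_GYM_SKIP.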

shouldSkipBenchmarks :: IO Bool

Check if benchmarks should be skipped entirely.

Returns True if the GOLDS_GYM_SKIP environment variable is set. Useful for CI environments where benchmark hardware is inconsistent.

Usage:

GOLDS_GYM_SKIP=1 cabal test
GOLDS_GYM_SKIP=1 stack test

setAcceptGoldens :: Bool -> IO ()

Set the accept-goldens flag (called from the BenchGolden Example instance).

setSkipBenchmarks :: Bool -> IO ()

Set the skip-benchmarks flag (called from the BenchGolden Example instance).