horde-ad-0.2.0.0: Higher Order Reverse Derivatives Efficiently - Automatic Differentiation
Safe Haskell: None
Language: GHC2024

HordeAd.Core.DeltaEval

Description

Evaluation of delta expressions, that is, transposition of the linear maps of which the delta expressions are sparse representations. See the comments in HordeAd.Core.Delta.

Synopsis

Delta expression evaluation

gradientFromDelta :: forall (x :: TK) (z :: TK) target. (ADReadyNoLet target, ShareTensor target) => FullShapeTK x -> FullShapeTK z -> target (ADTensorKind z) -> Delta target z -> target (ADTensorKind x) Source #

The top-level function for computing a gradient of an objective function.

Delta expressions naturally denote forward derivatives, as encoded in the function derivativeFromDelta. However, we are usually more interested in computing gradients, which is what gradientFromDelta does. The two functions are related by the equation from Lemma 5 of the paper "Provably correct, asymptotically efficient, higher-order reverse-mode automatic differentiation":

dt <.> derivativeFromDelta d ds = gradientFromDelta d dt <.> ds

where <.> denotes the generalized dot product (multiplying all tensors element-wise and summing the results), d is the top-level delta expression obtained from translating the objective function f to dual numbers, ds belongs to the domain of f and dt to its codomain. In other words, ds is a perturbation (small change) of the arguments of f, for which we compute the derivative, and dt is a sensitivity of the result of f, for which we compute the gradient. Nota bene, this property is checked for many example objective functions (and perturbations and sensitivities) in the horde-ad test suite.
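As a concrete sanity check of this equation, consider a linear map given by an ordinary matrix, with the transposed matrix playing the role of the gradient computation. The following is a minimal sketch using plain lists; dot, matVec and adjointHolds are local helpers, not horde-ad API:

import Data.List (transpose)

-- Generalized dot product on flat vectors.
dot :: [Double] -> [Double] -> Double
dot xs ys = sum (zipWith (*) xs ys)

-- Apply a matrix (a linear map) to a vector.
matVec :: [[Double]] -> [Double] -> [Double]
matVec m v = map (`dot` v) m

-- The adjoint property, up to floating-point noise:
-- dt <.> (m `matVec` ds)  ==  (transpose m `matVec` dt) <.> ds
adjointHolds :: [[Double]] -> [Double] -> [Double] -> Bool
adjointHolds m ds dt =
  abs (dot dt (matVec m ds) - dot (matVec (transpose m) dt) ds) < 1e-9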

derivativeFromDelta :: forall (x :: TK) (z :: TK) target. (ADReadyNoLet target, ShareTensor target) => Delta target z -> FullShapeTK (ADTensorKind x) -> target (ADTensorKind x) -> target (ADTensorKind z) Source #

The top-level function for computing a (forward) derivative of an objective function.
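To illustrate what a delta expression denotes, here is a toy delta language with just inputs, scaling and addition, together with a forward evaluator in the spirit of derivativeFromDelta. The types and names are illustrative only and are far simpler than horde-ad's actual Delta:

import qualified Data.Map.Strict as M

-- A toy delta language: inputs, scaling and addition.
data DeltaToy
  = InputToy Int              -- an input sensitivity variable
  | ScaleToy Double DeltaToy  -- scaling by a constant
  | AddToy DeltaToy DeltaToy  -- sum of two deltas

-- Forward derivative: evaluate the linear map on a perturbation ds,
-- given as a map from input identifiers to perturbation components.
fwdToy :: M.Map Int Double -> DeltaToy -> Double
fwdToy ds (InputToy i)   = M.findWithDefault 0 i ds
fwdToy ds (ScaleToy k d) = k * fwdToy ds d
fwdToy ds (AddToy d1 d2) = fwdToy ds d1 + fwdToy ds d2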

Exported to be specialized elsewhere

evalRev :: forall (y :: TK) target. (ADReadyNoLet target, ShareTensor target) => FullShapeTK y -> EvalState target -> target (ADTensorKind y) -> Delta target y -> EvalState target Source #

The reverse pass, that is, transposition/evaluation of a delta expression in order to produce the gradient of the objective function whose runtime trace the delta expression represents.

The first argument is the tensor kind that constrains the shapes of the cotangent accumulator and the delta expression arguments. The second is the evaluation state being modified. The third is the cotangent accumulator that will become an actual cotangent contribution when complete (see below for an explanation). The fourth is the delta expression node to evaluate.

Obtaining the gradient amounts to transposing the linear map that is straightforwardly represented by the delta expression. The evalRev function transposes the linear map and, at the same time, evaluates the transposed map on the cotangent accumulator value contained in the third argument. If the cotangent and the tensor operations are symbolic, the resulting value represents the transposed map itself, with its free variables playing the role of the map's inputs.
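Continuing the toy delta language sketched above for derivativeFromDelta, transposition-by-evaluation can be pictured as follows: each node receives a cotangent and pushes appropriately scaled contributions down towards the inputs, where they are summed. All names are again illustrative, not horde-ad's:

import qualified Data.Map.Strict as M

data DeltaToy = InputToy Int | ScaleToy Double DeltaToy | AddToy DeltaToy DeltaToy

-- Reverse pass: accumulate cotangent contributions per input identifier.
evalRevToy :: M.Map Int Double -> Double -> DeltaToy -> M.Map Int Double
evalRevToy st c (InputToy i)   = M.insertWith (+) i c st
evalRevToy st c (ScaleToy k d) = evalRevToy st (k * c) d            -- transpose of scaling is scaling
evalRevToy st c (AddToy d1 d2) = evalRevToy (evalRevToy st c d1) c d2  -- transpose of fan-in is fan-out

-- E.g., for f(x0, x1) = 3 * x0 + x1 and sensitivity dt = 1:
-- evalRevToy M.empty 1 (AddToy (ScaleToy 3 (InputToy 0)) (InputToy 1))
--   == M.fromList [(0, 3), (1, 1)]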

evalRevFTK :: forall (y :: TK) target. (ADReadyNoLet target, ShareTensor target) => EvalState target -> target (ADTensorKind y) -> Delta target y -> EvalState target Source #

A helper function to evalRev. The FTK suffix indicates that it does not receive an FTK as an argument but instead reconstructs it as needed.

All constructors that can have a type of TKProduct kind need to be handled here, as opposed to in evalRevSame, except for DeltaInput, which is always constructed only at basic kinds even though its type permits others.

evalRevSame :: forall (y :: TK) target. (ADReadyNoLet target, ShareTensor target, y ~ ADTensorKind y) => EvalState target -> target (ADTensorKind y) -> Delta target y -> EvalState target Source #

A helper function to evalRev. It assumes the scalar underlying the tensor kind of its arguments is differentiable.

All constructors whose types can only have non-TKProduct kinds (as well as the DeltaInput constructor and the vector space constructors) can be handled here, where the extra equality constraint makes this easier.

evalRevFromnMap :: forall (target :: Target). (ADReadyNoLet target, ShareTensor target) => EvalState target -> EvalState target Source #

data EvalState (target :: Target) Source #

The state of evaluation. It consists of several maps. The maps indexed by input identifiers and node identifiers eventually store cotangents for their respective nodes. The cotangents are built gradually during the evaluation, by summing cotangent contributions.

Data invariant: keys nMap == keys dMap.
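A toy analogue of such a state, with the invariant written out as a runnable check (the field and type names are illustrative, not the real record's):

import qualified Data.IntMap.Strict as IM

data EvalStateToy delta cotangent = EvalStateToy
  { dMapToy :: IM.IntMap cotangent  -- cotangent accumulated so far per node id
  , nMapToy :: IM.IntMap delta      -- the delta node registered under each node id
  }

-- The data invariant from the documentation above.
invariantHolds :: EvalStateToy delta cotangent -> Bool
invariantHolds s = IM.keys (nMapToy s) == IM.keys (dMapToy s)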