| Safe Haskell | None |
|---|---|
| Language | GHC2024 |
HordeAd.Core.DeltaEval
Description
Evaluation of delta expressions, that is, transposition of the linear maps of which the delta expressions are sparse representations. See the comments in HordeAd.Core.Delta.
Synopsis
- gradientFromDelta :: forall (x :: TK) (z :: TK) target. (ADReadyNoLet target, ShareTensor target) => FullShapeTK x -> FullShapeTK z -> target (ADTensorKind z) -> Delta target z -> target (ADTensorKind x)
- derivativeFromDelta :: forall (x :: TK) (z :: TK) target. (ADReadyNoLet target, ShareTensor target) => Delta target z -> FullShapeTK (ADTensorKind x) -> target (ADTensorKind x) -> target (ADTensorKind z)
- evalRev :: forall (y :: TK) target. (ADReadyNoLet target, ShareTensor target) => FullShapeTK y -> EvalState target -> target (ADTensorKind y) -> Delta target y -> EvalState target
- evalRevFTK :: forall (y :: TK) target. (ADReadyNoLet target, ShareTensor target) => EvalState target -> target (ADTensorKind y) -> Delta target y -> EvalState target
- evalRevSame :: forall (y :: TK) target. (ADReadyNoLet target, ShareTensor target, y ~ ADTensorKind y) => EvalState target -> target (ADTensorKind y) -> Delta target y -> EvalState target
- evalRevFromnMap :: forall (target :: Target). (ADReadyNoLet target, ShareTensor target) => EvalState target -> EvalState target
- data EvalState (target :: Target)
Delta expression evaluation
gradientFromDelta :: forall (x :: TK) (z :: TK) target. (ADReadyNoLet target, ShareTensor target) => FullShapeTK x -> FullShapeTK z -> target (ADTensorKind z) -> Delta target z -> target (ADTensorKind x) Source #
The top-level function for computing the gradient of an objective function.

Delta expressions naturally denote forward derivatives, as encoded in the function `derivativeFromDelta`. However, we are usually more interested in computing gradients, which is what `gradientFromDelta` does.

The two functions are bound by the equation from Lemma 5 of the paper "Provably correct, asymptotically efficient, higher-order reverse-mode automatic differentiation":

`dt <.> derivativeFromDelta d ds = gradientFromDelta d dt <.> ds`

where `<.>` denotes the generalized dot product (multiplying the tensors element-wise and summing the results), `d` is the top-level delta expression obtained from the translation of the objective function `f` to dual numbers, `ds` belongs to the domain of `f` and `dt` to its codomain. In other words, `ds` is a perturbation (a small change) of the arguments of `f`, for which we compute the derivative, and `dt` is a sensitivity of the result of `f`, for which we compute the gradient.

Nota bene: this property is checked for many example objective functions (and perturbations and sensitivities) in the horde-ad test suite.
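As a hedged illustration of the Lemma 5 equation (plain Haskell lists rather than horde-ad's API; all names below are local helpers invented for this sketch), the forward derivative of a linear map given by an explicit Jacobian multiplies by the matrix, the gradient multiplies by its transpose, and the two dot products agree:

```haskell
import Data.List (transpose)

-- Generalized dot product on flat vectors.
dotP :: [Double] -> [Double] -> Double
dotP u v = sum (zipWith (*) u v)

-- Matrix-vector product: applying the linear map (the forward derivative).
mulMV :: [[Double]] -> [Double] -> [Double]
mulMV m v = map (`dotP` v) m

main :: IO ()
main = do
  let jac = [[1,2,3],[4,5,6]] :: [[Double]]   -- a 2x3 Jacobian
      ds  = [1, -1, 2]                        -- perturbation in the domain
      dt  = [3, 0.5]                          -- sensitivity in the codomain
      lhs = dt `dotP` mulMV jac ds            -- dt <.> derivativeFromDelta d ds
      rhs = mulMV (transpose jac) dt `dotP` ds  -- gradientFromDelta d dt <.> ds
  print (lhs, rhs)  -- both components are 20.5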
derivativeFromDelta :: forall (x :: TK) (z :: TK) target. (ADReadyNoLet target, ShareTensor target) => Delta target z -> FullShapeTK (ADTensorKind x) -> target (ADTensorKind x) -> target (ADTensorKind z) Source #
The top-level function for computing a (forward) derivative of an objective function.
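As a hedged sketch of the idea (a toy, untyped delta type invented here; horde-ad's real `Delta` is a typed GADT with many more constructors), a delta expression denotes a linear map, and its forward derivative is computed by evaluating that map on a perturbation of the inputs:

```haskell
-- Toy stand-in for Delta: inputs, scaling and addition only.
data Toy
  = Input Int          -- an input, identified by its position
  | Scale Double Toy   -- scale a subexpression by a constant factor
  | Add Toy Toy        -- sum of two subexpressions

-- Forward derivative: evaluate the linear map denoted by a delta
-- expression on a perturbation ds of the inputs.
forward :: [Double] -> Toy -> Double
forward ds (Input i)   = ds !! i
forward ds (Scale c d) = c * forward ds d
forward ds (Add a b)   = forward ds a + forward ds b
```

For example, `d = Add (Scale 2 (Input 0)) (Input 1)` denotes the linear map `(x0, x1) -> 2*x0 + x1`, so `forward [3, 4] d` evaluates to `10`.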
Exported to be specialized elsewhere
evalRev :: forall (y :: TK) target. (ADReadyNoLet target, ShareTensor target) => FullShapeTK y -> EvalState target -> target (ADTensorKind y) -> Delta target y -> EvalState target Source #
Reverse pass, that is, transposition and evaluation of a delta expression, in order to produce the gradient for the objective-function runtime trace that the delta expression represents.

The first argument is the tensor kind that constrains the shapes of the cotangent accumulator and delta expression arguments. The second is the evaluation state being modified. The third is the cotangent accumulator that will become an actual cotangent contribution when complete (see below for an explanation). The fourth is the delta expression node to evaluate.

Obtaining the gradient amounts to transposing the linear map that is straightforwardly represented by the delta expression. The `evalRev` function transposes the linear map and, at the same time, evaluates the transposed map on the cotangent accumulator value contained in the third argument. If the cotangent and the tensor operations are symbolic, the resulting value represents the transposed map itself, provided its free variables are treated as the map's inputs.
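Continuing the hedged `Toy` sketch from `derivativeFromDelta` above (again invented for illustration, not horde-ad's API), the reverse pass transposes each construct while pushing the cotangent down and sums the per-input contributions, which is the role `EvalState` plays for `evalRev`:

```haskell
import qualified Data.Map.Strict as Map

-- Reverse pass: transpose each construct while pushing the cotangent c
-- down the expression, summing the contributions that reach each input.
evalRevToy :: Double -> Toy -> Map.Map Int Double -> Map.Map Int Double
evalRevToy c (Input i)   acc = Map.insertWith (+) i c acc
evalRevToy c (Scale k d) acc = evalRevToy (k * c) d acc
  -- the transpose of scaling by k is scaling by k
evalRevToy c (Add a b)   acc = evalRevToy c a (evalRevToy c b acc)
  -- fan-out in the map transposes to summation of contributions

-- The gradient for sensitivity dt: the transposed map applied to dt.
gradientToy :: Double -> Toy -> Map.Map Int Double
gradientToy dt d = evalRevToy dt d Map.empty
```

Here `gradientToy 1 (Add (Scale 2 (Input 0)) (Input 1))` yields `fromList [(0,2.0),(1,1.0)]`, the transpose of the row matrix `[2, 1]` applied to the sensitivity `1`.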
evalRevFTK :: forall (y :: TK) target. (ADReadyNoLet target, ShareTensor target) => EvalState target -> target (ADTensorKind y) -> Delta target y -> EvalState target Source #
A helper function of `evalRev`. The `FTK` suffix denotes that it doesn't get an `FTK` (a `FullShapeTK`) as an argument, but reconstructs it as needed.

All constructors that can have a type with a `TKProduct` kind need to be handled here, as opposed to in `evalRevSame`, except for `DeltaInput`, which is only ever constructed at basic kinds even though its type permits others.
evalRevSame :: forall (y :: TK) target. (ADReadyNoLet target, ShareTensor target, y ~ ADTensorKind y) => EvalState target -> target (ADTensorKind y) -> Delta target y -> EvalState target Source #
A helper function of `evalRev`. It assumes that the scalar underlying the tensor kind of its arguments is differentiable.

All constructors that can only have types with non-`TKProduct` kinds (as well as the `DeltaInput` constructor and the vector space constructors) can be handled here, where the extra equality constraint `y ~ ADTensorKind y` makes this easier.
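As a hedged illustration of what such an equality constraint buys (a toy type family invented for this sketch, not horde-ad's `ADTensorKind`): at kinds that are their own AD kind, cotangents have the same type as primal values, so a function constrained this way never sees the collapsed kinds:

```haskell
{-# LANGUAGE DataKinds, TypeFamilies #-}
import Data.Proxy (Proxy (..))

-- Toy kinds: a differentiable one, a non-differentiable one and a unit.
data K = DiffK | IntK | UnitK

-- Toy analogue of ADTensorKind: non-differentiable kinds collapse to UnitK.
type family ADK (k :: K) :: K where
  ADK 'DiffK = 'DiffK
  ADK 'IntK  = 'UnitK
  ADK 'UnitK = 'UnitK

-- Callable only at kinds that are their own AD kind:
-- sameOnly (Proxy :: Proxy 'DiffK) type-checks,
-- sameOnly (Proxy :: Proxy 'IntK) is rejected by the compiler.
sameOnly :: k ~ ADK k => Proxy (k :: K) -> ()
sameOnly _ = ()

main :: IO ()
main = print (sameOnly (Proxy :: Proxy 'DiffK))
```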
evalRevFromnMap :: forall (target :: Target). (ADReadyNoLet target, ShareTensor target) => EvalState target -> EvalState target Source #
data EvalState (target :: Target) Source #
The state of evaluation. It consists of several maps. The maps indexed by input identifiers and node identifiers eventually store cotangents for their respective nodes. The cotangents are built up gradually during the evaluation by summing cotangent contributions.

Data invariant: `keys nMap == keys dMap`.
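A hedged structural sketch of such a state (field and type names invented here; the real `EvalState` differs): maps from input identifiers to accumulated cotangents, from node identifiers to accumulated cotangents, and from node identifiers to pending delta expressions, with the invariant tying the last two together:

```haskell
import qualified Data.Map.Strict as Map

-- Invented identifier types, for this sketch only.
newtype InputId = InputId Int deriving (Eq, Ord, Show)
newtype NodeId  = NodeId Int  deriving (Eq, Ord, Show)

data EvalStateSketch delta cot = EvalStateSketch
  { iMap :: Map.Map InputId cot  -- cotangents accumulated for inputs
  , dMap :: Map.Map NodeId cot   -- cotangents accumulated for shared nodes
  , nMap :: Map.Map NodeId delta -- shared nodes still awaiting evaluation
  }

-- The documented invariant: every pending node has a cotangent entry.
invariantHolds :: EvalStateSketch delta cot -> Bool
invariantHolds s = Map.keys (nMap s) == Map.keys (dMap s)
```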