futhark-0.25.28: An optimising compiler for a functional, array-oriented language.
Safe HaskellSafe-Inferred
LanguageGHC2021

Futhark.Optimise.Fusion.RulesWithAccs

Description

This module consists of rules for fusion that involves WithAcc constructs. Currently, we support two non-trivial transformations: I. map-flatten-scatter: a map nest produces multi-dimensional index and values arrays that are then flattened and used in a scatter consumer. Such pattern can be fused by re-writing the scatter by means of a WithAcc containing a map-nest, thus eliminating the flatten operations. The obtained WithAcc can then be fused with the producer map nest, e.g., benefiting intra-group kernels. The eloquent target for this rule is an efficient implementation of radix-sort.

II. WithAcc-WithAcc fusion: two withaccs can be fused as long as the common accumulators use the same operator, and as long as the non-accumulator input of an WithAcc is not used as an accumulator in the other. This fusion opens the door for fusing the SOACs appearing inside the WithAccs. This is also intended to demonstrate that it is not so important where exactly the WithAccs were originally introduced in the code, it is more important that they can be transformed by various optimizations passes.

Synopsis

Documentation

ruleMFScat :: (HasScope SOACS m, MonadFreshNames m) => DepNode -> DepGraph -> m (Maybe DepGraph) Source #

Implements a specialized rule for fusing a pattern formed by a map o flatten o scatter, i.e., let (inds, vals) = map-nest f inps (finds, fvals) = (flatten inds, flatten vals) let res = scatter res0 finds fvals where inds & vals have higher rank than finds & fvals.

tryFuseWithAccs :: (HasScope SOACS m, MonadFreshNames m) => [VName] -> Stm SOACS -> Stm SOACS -> m (Maybe (Stm SOACS)) Source #

Simple case for fusing two withAccs (can be extended): let (b1, ..., bm, x1, ..., xq) = withAcc a1 ... am lam1 let (d1, ..., dn, y1, ..., yp) = withAcc c1 ... cn lam2 Notation: `b1 ... bm` are the accumulator results of the first withAcc and `d1, ..., dn` of the second withAcc. `x1 ... xq` and `y1, ..., yp` are non-accumulator results. Conservative conditions: 1. for any bi (i=1..m) either `bi IN {c1, ..., cm}` OR `bi NOT-IN FV(lam2)`, i.e., perfect producer-consumer relation on accums. Of course the binary-op should be the same. 2. The bs that are also accumulated upon in lam2 do NOT belong to the infusible set (they are destroyed) 3. x1 ... xq do not overlap with c1 ... cn Fusion will create one withacc that accumulates on the union of `a1 ... am` and `c1 ... cn` and returns, in addition to the accumulator arrays the union of regular variables `x1 ... xq` and `y1, ..., yp`