| Copyright | Copyright (c) 2006 Robert Dockins | 
|---|---|
| License | MIT; see COPYRIGHT file for terms and conditions | 
| Maintainer | robdockins AT fastmail DOT fm | 
| Stability | stable | 
| Portability | GHC, Hugs (MPTC and FD) | 
| Safe Haskell | Safe | 
| Language | Haskell98 | 
Data.Edison
Contents
Description
Edison is a library of purely functional data structures written by Chris Okasaki. It is named after Thomas Alva Edison and for the mnemonic value EDiSon (Efficent Data Structures).
Edison provides several families of abstractions, each with multiple implementations. The main abstractions provided by Edison are:
- Sequences such as stacks, queues, and dequeues,
- Collections such as sets, bags and heaps, and
- Associative Collections such as finite maps and priority queues where the priority and element are distinct.
Conventions:
Each data structure is implemented as a separate module.  These modules
   should always be imported qualified to prevent a flood of name clashes,
   and it is recommended to rename the module using the as keyword to reduce
   the overhead of qualified names and to make substituting one implementation
   for another as painless as possible.
Names have been chosen to match standard usage as much as possible.  This
   means that operations for abstractions frequently share the same name
   (for example, empty, null, size, etc).  It also means that in many
   cases names have been reused from the Prelude.  However, the use of
   qualified imports will prevent name reuse from becoming name clashes.  If
   for some reason you chose to import an Edison data structure unqualified,
   you will likely need to import the Prelude hiding the relevant names.
Edison modules also frequently share type names.  For example, most sequence
   type constructors are named Seq.  This additionally aids substituting
   implementations by simply importing a different module.
Argument orders are selected with the following points in mind:
- Partial application: arguments more likely to be static usually appear before other arguments in order to facilitate partial application.
- Collection appears last: in all cases where an operation queries a single collection or modifies an existing collection, the collection argument will appear last. This is something of a de facto standard for Haskell datastructure libraries and lends a degree of consistency to the API.
- Most usual order: where an operation represents a well-known mathematical function on more than one datastructure, the arguments are chosen to match the most usual argument order for the function.
Type classes:
Each family of abstractions is defined as a set of classes: a main class that every implementation of that abstraction should support and several auxiliary subclasses that an implementation may or may not support. However, not all applications require the power of type classes, so each method is also directly accessible from the implementation module. Thus you can choose to use overloading or not, as appropriate for your particular application.
Documentation about the behavior of data structure operations is defined in the modules Data.Edison.Seq, Data.Edison.Coll and Data.Edison.Assoc. Implementations are required to respect the descriptions and axioms found in these modules. In some cases time complexity is also given. Implementations may differ from these time complexities; if so, the differences will be given in the documentation for the individual implementation module.
Notes on Eq and Ord instances:
Many Edison data structures require Eq or Ord contexts to define equivalence
   and total ordering on elements or keys.  Edison makes the following assumptions
   about all such required instances:
- An Eqinstance correctly defines an equivalence relation (but not necessarily structural equality); that is, we assume(==)(considered as a relation) is reflexive, symmetric and transitive, but allow that equivalent items may be distinguishable by other means.
- An Ordinstance correctly defines a total order which is consistent with theEqinstance for that type.
These assumptions correspond to the usual meanings assigned to these classes.  If
   an Edison data structure is used with an Eq or Ord instance which violates these
   assumptions, then the behavior of that data structure is undefined.
Notes on Read and Show instances:
The usual Haskell convention for Read and Show instances (as championed by the
   Haskell "deriving" mechanism), is that show generates a string which is a
   valid Haskell expression built up
   using the data type's data constructors such that, if interpreted as Haskell code, the
   string would generate an identical data item.  Furthermore, the derived  Read
   instances are able to parse such strings, such that (read . show) === id.
   So, derived instances of Read and Show exhibit
   the following useful properties:
- readand- showare complementary; that is,- readis a useful inverse for- show
- showgenerates a string which is legal Haskell code representing the data item
For concrete data types, the deriving mechanism is usually quite sufficient.
   However, for abstract types the derived Read instance may allow users to create data
   which violates invariants. Furthermore, the strings resulting from show reference hidden
   data constructors which violates good software engineering principles and also
   cannot be compiled because the constructors are not available outside the defining module.
Edison avoids most of these problems and still maintains the above useful properties by
   doing conversions to and from lists and inserting explicit calls to the list conversion
   operations.  The corresponding Read instance strips the list conversion call before
   parsing the list.  In this way, private data constructors are not revealed and show strings
   are still legal, compilable Haskell code.  Furthermore, the showed strings gain a degree of
   independence from the underlying datastructure implementation.
For example, calling show on an empty Banker's queue will result in the following string:
Data.Edison.Seq.BankersQueue.fromList []
Datatypes which are not native Edison data structures (such as StandardSet and StandardMap)
   may or may not provide Read or Show instances and, if they exist, they may or may
   not also provide the properties that Edison native Read and Show instances do.
Notes on time complexities:
Some Edison data structures (only the sequences currently) have detailed time complexity information. Unless otherwise stated, these are amortized time complexities, assuming persistent usage of the datastructure. Much of this data comes from:
Martin Holters. Efficent Data Structures in a Lazy Functional Language. Master's Thesis. Chalmers University of Technology, Sweden. 2003.
Notes on unsafe functions:
There are a number of different notions of what constitutes an unsafe function.
   In Haskell, a function is generally called "unsafe" if it can subvert
   type safety or referential integrity, such as unsafePerformIO or unsafeCoerce#.
   In Edison, however, we downgrade the meaning of "unsafe" somewhat.  An
   "unsafe" Edison function is one which, if misused, can violate the structural
   invariants of a data structure.  Misusing an Edison "unsafe" function should
   never cause your runtime to crash or break referential integrity, but it may cause
   later uses of a data structure to behave in undefined ways.  Almost all unsafe functions
   in Edison are labeled with the unsafe prefix.  An exception to this rule is the
   With functions in the Set class, which are also unsafe but do not have
   the prefix.  Unsafe functions will have explicit preconditions listed in their
   documentation.
Notes on ambiguous functions:
Edison also contains some functions which are labeled "ambiguous".  These
   functions cannot violate the structural invariants of a data structure, but, under
   some conditions, the result of applying an ambiguous function is not well defined.
   For ambiguous functions, the result of applying the function may depend on otherwise
   unobservable internal state of the data structure, such as the actual shape of a
   balanced tree.  For example, the AssocX class contains the fold function, which
   folds over the elements in the collection in an arbitrary order.  If the combining
   function passed to fold is not fold-commutative (see below), then the result of
   the fold will depend on the actual order that elements are presented to the
   combining function, which is not defined.
To aid programmers, each API function is labeled ambiguous or unambiguous in its documentation. If a function is unambiguous only under some circumstances, that will also be explicitly stated.
An "unambiguous" operation is one where all correct implementations of the operation will return "indistinguishable" results. For concrete data types, "indistinguishable" means structural equality. An instance of an abstract data type is considered indistinguishable from another if all possible applications of unambiguous operations to both yield indistinguishable results. (Note: this definition is impredicative and rather imprecise. Should it become an issue, I will attempt to develop a better definition. I hope the intent is sufficiently clear).
A higher-order unambiguous operation may be rendered ambiguous if passed a "function" which
   does not respect referential integrity (one containing unsafePerformIO for example).
   Only do something like this if you are 110% sure you know what you are doing, and even then
   think it over two or three times.
How to choose a fold:
Folds are an important class of operations on data structures in a functional
   language; they perform essentially the same role that iterators perform in
   imperative languages.  Edison provides a dizzying array of folds which (hopefully)
   correspond to all the various ways a programmer might want to fold over a data
   structure.  However, it can be difficult to know which fold to choose for a
   particular application.  In general, you should choose a fold which provides
   the fewest guarantees necessary for correctness.  The folds which have fewer
   guarantees give data structure implementers more leeway to provide efficient
   implementations.  For example, if you which to fold a commutative, associative
   function, you should chose fold (which does not guarantee an order) over foldl
   or foldr, which specify particular orders.
Also, if your function is strict in
   the accumulating argument, you should prefer the strict folds (eg, fold'); they will
   often provide better space behavior.  Be aware, however, that the "strict" folds
   are not necessarily more strict than the "non-strict" folds; they merely give
   implementers the option to provide additional strictness if it improves performance.
For associative collections, only use with WithKey folds if you actually need the
   value of the key.
Painfully detailed information about ambiguous folds:
All of the folds that are listed ambiguous are ambiguous because they do not or cannot guarantee a stable order with which the folding function will be applied. However, some functions are order insensitive, and the result will be unambiguous regardless of the fold order chosen. Here we formalize this property, which we call "fold commutativity".
We say f :: a -> b -> b is fold-commutative iff f is unambiguous and
   forall w, z :: b; m, n :: a
      w = z ==> f m (f n w) = f n (f m z)
where = means indistinguishability.
This property is sufficient (but not necessary) to ensure that, for any collection of elements to fold over, folds over all permutations of those elements will generate indistinguishable results. In other words, an ambiguous fold applied to a fold-commutative combining function becomes unambiguous.
Some fold combining functions take their arguments in the reverse order.  We
   straightforwardly extend the notion of fold commutativity to such functions
   by reversing the arguments.  More formally, we say g :: b -> a -> b is fold
   commutative iff flip g :: a -> b -> b is fold commutative.
For folds which take both a key and an element value, we extend the notion of fold
   commutativity by considering the key and element to be a single, uncurried argument.
   More formally, we say g :: k -> a -> b -> b is fold commutative iff
\(k,x) z -> g k x z :: (k,a) -> b -> b
is fold commutative according to the above definition.
Note that for g :: a -> a -> a, if g is unambiguous,
   commutative, and associative, then g is fold-commutative.
Proof:
   let w = z, then
   g m (g n w) = g m (g n z)     g is unambiguous
               = g (g n z) m     commutative property of g
               = g n (g z m)     associative property of g
               = g n (g m z)     commutative property of gQed.
Thus, many common numeric combining functions, including (+) and (*) at
   integral types, are fold commutative and can be safely used with ambiguous
   folds.
Be aware however, that (+) and (*) at floating point types are only
   approximately commutative and associative due to rounding errors; using
   ambiguous folds with these operations may result in subtle differences in
   the results.  As always, be aware of the limitations and numeric
   properties of floating point representations.
About this module:
This module re-exports the various data structure abstraction classes, but
   not their methods. This allows you to write type signatures which have
   contexts that mention Edison type classes without having to import the
   appropriate modules qualified.  The class methods are not exported to
   avoid name clashes.  Obviously, to use the methods of these classes, you
   will have to import the appropriate modules.  This module additionally
   re-exports the entire Data.Edison.Prelude module.
Miscellaneous points:
Some implementations export a few extra functions beyond those included in the relevant classes. These are typically operations that are particularly efficient for that implementation, but are not general enough to warrant inclusion in a class.
Since qualified infix symbols are fairly ugly, they have been largely avoided. However, the Data.Edison.Sym module defines a number of infix operators which alias the prefix operators; this module is intended to be imported unqualified.
Most of the operations on most of the data structures are strict. This is inevitable for data structures with non-trivial invariants. Even given that, however, many of the operations are stricter than necessary. In fact, operations are never deliberately made lazy unless the laziness is required by the algorithm, as can happen with amortized data structures.
Note, however, that the various sequence implementations are always lazy in their elements. Similarly, associative collections are always lazy in their elements (but usually strict in their keys). Non-associative collections are usually strict in their elements.
- class (Functor s, MonadPlus s) => Sequence s
- class (Eq a, Monoid c) => CollX c a | c -> a
- class (CollX c a, Ord a) => OrdCollX c a | c -> a
- class CollX c a => SetX c a | c -> a
- class (OrdCollX c a, SetX c a) => OrdSetX c a | c -> a
- class CollX c a => Coll c a | c -> a
- class (Coll c a, OrdCollX c a) => OrdColl c a | c -> a
- class (Coll c a, SetX c a) => Set c a | c -> a
- class (OrdColl c a, Set c a) => OrdSet c a | c -> a
- class (Eq k, Functor m) => AssocX m k | m -> k
- class (AssocX m k, Ord k) => OrdAssocX m k | m -> k
- class AssocX m k => FiniteMapX m k | m -> k
- class (OrdAssocX m k, FiniteMapX m k) => OrdFiniteMapX m k | m -> k
- class AssocX m k => Assoc m k | m -> k
- class (Assoc m k, OrdAssocX m k) => OrdAssoc m k | m -> k
- class (Assoc m k, FiniteMapX m k) => FiniteMap m k | m -> k
- class (OrdAssoc m k, FiniteMap m k) => OrdFiniteMap m k | m -> k
- module Data.Edison.Prelude
Sequence class
class (Functor s, MonadPlus s) => Sequence s Source
The Sequence class defines an interface for datatypes which
   implement sequences.  A description for each function is
   given below.  
Minimal complete definition
lcons, rcons, fromList, copy, lview, lhead, lheadM, ltail, ltailM, rview, rhead, rheadM, rtail, rtailM, null, size, toList, concat, reverse, reverseOnto, fold, fold', fold1, fold1', foldr, foldr', foldl, foldl', foldr1, foldr1', foldl1, foldl1', reducer, reducer', reducel, reducel', reduce1, reduce1', take, drop, splitAt, subseq, filter, partition, takeWhile, dropWhile, splitWhile, inBounds, lookup, lookupM, lookupWithDefault, update, adjust, mapWithIndex, foldrWithIndex, foldrWithIndex', foldlWithIndex, foldlWithIndex', zip, zip3, zipWith, zipWith3, unzip, unzip3, unzipWith, unzipWith3, strict, strictWith, structuralInvariant, instanceName
Collection classes
Non-observable collections
class (Eq a, Monoid c) => CollX c a | c -> a Source
This is the root class of the collection hierarchy. However, it is perfectly adequate for many applications that use sets or bags.
class (CollX c a, Ord a) => OrdCollX c a | c -> a Source
Collections for which the elements have an ordering relation.
Minimal complete definition
deleteMin, deleteMax, unsafeInsertMin, unsafeInsertMax, unsafeFromOrdSeq, unsafeAppend, filterLT, filterLE, filterGT, filterGE, partitionLT_GE, partitionLE_GT, partitionLT_GT
class CollX c a => SetX c a | c -> a Source
A collection where the set property is maintained; that is, a set
   contains at most one element of the equivalence class formed by the
   Eq instance on the elements.
Minimal complete definition
intersection, difference, symmetricDifference, properSubset, subset
class (OrdCollX c a, SetX c a) => OrdSetX c a | c -> a Source
Sets where the elements also have an ordering relation.
   This class contains no methods; it is only an abbreviation for
   the context (OrdCollX c a,SetX c a).
Observable collections
class CollX c a => Coll c a | c -> a Source
Collections with observable elements. See the module documentation for comments on observability.
class (Coll c a, OrdCollX c a) => OrdColl c a | c -> a Source
Collections with observable elements where the elements additionally have an ordering relation. See the module documentation for comments on observability.
class (Coll c a, SetX c a) => Set c a | c -> a Source
Collections with observable elements where the set property is maintained;
   that is, a set contains at most one element of the equivalence class
   formed by the Eq instance on the elements.
WARNING: Each of the following "With" functions is unsafe. The passed in combining functions are used to choose which element is kept in the case of duplicates. They are required to satisfy the precondition that, given two equal elements, they return a third element equal to the other two. Usually, the combining function just returns its first or second argument, but it can combine elements in non-trivial ways.
The combining function should usually be associative. Where the function involves a sequence of elements, the elements will be combined from left-to-right, but with an unspecified associativity.
For example, if x == y == z,
   then fromSeqWith (+) [x,y,z] equals either
     single (x + (y + z))
   or
     single ((x + y) + z)
Minimal complete definition
fromSeqWith, insertWith, insertSeqWith, unionl, unionr, unionWith, unionSeqWith, intersectionWith
class (OrdColl c a, Set c a) => OrdSet c a | c -> a Source
Collections with observable elements where the set property is maintained and where additionally, there is an ordering relation on the elements. This class introduces no new methods, and is simply an abbreviation for the context:
(OrdColl c a,Set c a)
Associative collection classes
Non-observable associative collections
class (Eq k, Functor m) => AssocX m k | m -> k Source
The root class of the associative collection hierarchy.
Minimal complete definition
empty, singleton, fromSeq, insert, insertSeq, union, unionSeq, delete, deleteAll, deleteSeq, null, size, member, count, lookup, lookupM, lookupAll, lookupAndDelete, lookupAndDeleteM, lookupAndDeleteAll, lookupWithDefault, adjust, adjustAll, adjustOrInsert, adjustAllOrInsert, adjustOrDelete, adjustOrDeleteAll, fold, fold', fold1, fold1', filter, partition, elements, strict, strictWith, structuralInvariant, instanceName
class (AssocX m k, Ord k) => OrdAssocX m k | m -> k Source
An associative collection where the keys additionally have an ordering relation.
Minimal complete definition
minView, minElem, deleteMin, unsafeInsertMin, maxView, maxElem, deleteMax, unsafeInsertMax, foldr, foldr', foldl, foldl', foldr1, foldr1', foldl1, foldl1', unsafeFromOrdSeq, unsafeAppend, filterLT, filterLE, filterGT, filterGE, partitionLT_GE, partitionLE_GT, partitionLT_GT
class AssocX m k => FiniteMapX m k | m -> k Source
An associative collection where the keys form a set; that is, each key appears in the associative collection at most once.
Minimal complete definition
fromSeqWith, fromSeqWithKey, insertWith, insertWithKey, insertSeqWith, insertSeqWithKey, unionl, unionr, unionWith, unionSeqWith, intersectionWith, difference, properSubset, subset, submapBy, properSubmapBy, sameMapBy
class (OrdAssocX m k, FiniteMapX m k) => OrdFiniteMapX m k | m -> k Source
Finite maps where the keys additionally have an ordering relation. This class introduces no new methods.
Observable associative collections
class AssocX m k => Assoc m k | m -> k Source
Associative collections where the keys are observable.
Minimal complete definition
toSeq, keys, mapWithKey, foldWithKey, foldWithKey', filterWithKey, partitionWithKey
class (Assoc m k, OrdAssocX m k) => OrdAssoc m k | m -> k Source
An associative collection with observable keys where the keys additionally have an ordering relation.
Minimal complete definition
minViewWithKey, minElemWithKey, maxViewWithKey, maxElemWithKey, foldrWithKey, foldrWithKey', foldlWithKey, foldlWithKey', toOrdSeq
class (Assoc m k, FiniteMapX m k) => FiniteMap m k | m -> k Source
Finite maps with observable keys.
Minimal complete definition
class (OrdAssoc m k, FiniteMap m k) => OrdFiniteMap m k | m -> k Source
Finite maps with observable keys where the keys additionally have an ordering relation. This class introduces no new methods.
module Data.Edison.Prelude