hierarchical-clustering: Fast algorithms for single, average/UPGMA and complete linkage clustering.
This package provides a function to create a dendrogram from a list of items and a distance function between them. Initially a singleton cluster is created for each item, and then new, bigger clusters are created by repeatedly merging the two clusters with the smallest distance between them. The distance between two clusters is calculated according to the linkage type. The dendrogram represents not only the clusters but also the order in which they were created.
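The merge procedure described above can be sketched in a few lines of plain Haskell. This is only an illustrative, deliberately naive version using nothing beyond base; it is not the package's implementation. The names Linkage (with constructors Single, Complete, Average), leaves, clusterDistance, naiveDendrogram and cutBelow are hypothetical and chosen for this sketch, and the Dendrogram type here is a local stand-in, not the one exported by Data.Clustering.Hierarchical.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

-- | Local stand-in for a dendrogram (an assumption of this sketch):
-- leaves hold items, branches record the distance at which their two
-- sub-clusters were merged.
data Dendrogram a
  = Leaf a
  | Branch Double (Dendrogram a) (Dendrogram a)
  deriving (Eq, Show)

-- | Hypothetical linkage type: single, complete and average (UPGMA).
data Linkage = Single | Complete | Average
  deriving (Eq, Show)

-- | All items of a dendrogram, left to right.
leaves :: Dendrogram a -> [a]
leaves (Leaf x)       = [x]
leaves (Branch _ l r) = leaves l ++ leaves r

-- | Distance between two clusters, computed directly from all pairwise
-- item distances (no distance matrix is cached, so this is slow).
clusterDistance
  :: Linkage -> (a -> a -> Double) -> Dendrogram a -> Dendrogram a -> Double
clusterDistance linkage dist c1 c2 =
  case linkage of
    Single   -> minimum ds
    Complete -> maximum ds
    Average  -> sum ds / fromIntegral (length ds)
  where
    ds = [dist x y | x <- leaves c1, y <- leaves c2]

-- | Start with one singleton cluster per item and repeatedly merge the
-- closest pair until one cluster remains. Returns Nothing for no items.
naiveDendrogram
  :: Linkage -> [a] -> (a -> a -> Double) -> Maybe (Dendrogram a)
naiveDendrogram _ [] _ = Nothing
naiveDendrogram linkage items dist = Just (go (map Leaf items))
  where
    go [c] = c
    go cs  =
      let n = length cs
          pairs =
            [ (clusterDistance linkage dist (cs !! p) (cs !! q), p, q)
            | p <- [0 .. n - 1], q <- [p + 1 .. n - 1] ]
          -- Ties are broken in favour of the first closest pair found.
          (d, i, j) = minimumBy (comparing (\(x, _, _) -> x)) pairs
          merged    = Branch d (cs !! i) (cs !! j)
          rest      = [c | (k, c) <- zip [0 :: Int ..] cs, k /= i, k /= j]
      in go (merged : rest)

-- | Cut the tree at a threshold: keep every subtree whose merge distance
-- is at most the threshold (assumes merge distances grow towards the root).
cutBelow :: Double -> Dendrogram a -> [Dendrogram a]
cutBelow _ leaf@(Leaf _) = [leaf]
cutBelow t branch@(Branch d l r)
  | d <= t    = [branch]
  | otherwise = cutBelow t l ++ cutBelow t r
```

Loaded in GHCi, `naiveDendrogram Single [1, 2, 6, 7, 20] (\x y -> abs (x - y))` builds a tree whose branch distances record when each merge happened, and `cutBelow` recovers the flat clusters that existed at a given distance. The sketch recomputes cluster distances from scratch at every step, so it is far slower than the O(n^2) algorithms the package mentions.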
This package provides several implementations with different performance characteristics. The SLINK and CLINK implementations are optimal in both space and time. There are also naive implementations that use a distance matrix. The dendrogram function from Data.Clustering.Hierarchical automatically chooses the best implementation available.
Changes in version 0.4.4:
Remove most upper bounds.
Changes in version 0.4:
Specialize the distance type to Double for efficiency reasons. It's uncommon to use distances other than Double.
Implement SLINK and CLINK. These are optimal algorithms in both space and time for single and complete linkage, respectively, running in O(n^2) time and O(n) space.
Reorganized internal implementation.
Some performance improvements for the naive implementation.
Better test coverage. Also, performance improvements for the test suite, now running in 3 seconds (instead of one minute).
Changes in version 0.3.1.2 (version 0.3.1.1 was skipped):
Added tests for many things. Use cabal test =).
Changes in version 0.3.1:
Works with containers 0.4 (thanks, Doug Beardsley).
Removed some internal unnecessary overheads and added some strictness.
Changes in version 0.3.0.1:
Listed changes of unreleased version 0.2.
Changes in version 0.3:
Added function cutAt.
Fixed complexity in Haddock comments.
Changes in version 0.2:
Added function elements.
Added separate functions for each linkage type. This may be useful if you want to create a dendrogram and your distance data type isn't an instance of Floating.
Downloads
- hierarchical-clustering-0.4.4.tar.gz (Cabal source package)
- Package description (revised from the package)
Note: This package has metadata revisions in the cabal description newer than included in the tarball. To unpack the package including the revisions, use 'cabal get'.
Field | Value |
---|---|
Versions | 0.1, 0.3, 0.3.0.1, 0.3.1, 0.3.1.2, 0.4, 0.4.1, 0.4.2, 0.4.3, 0.4.4, 0.4.5, 0.4.6, 0.4.7 |
Dependencies | array (>=0.3), base (>=4 && <4.8), containers (>=0.3) |
License | BSD-3-Clause |
Author | Felipe Almeida Lessa |
Maintainer | felipe.lessa@gmail.com |
Revised | Revision 1 made by HerbertValerioRiedel at 2016-01-17T23:18:08Z |
Category | Clustering |
Source repo | head: darcs get http://patch-tag.com/r/felipe/hierarchical-clustering |
Uploaded | by FelipeLessa at 2014-08-14T18:56:52Z |
Distributions | Debian:0.4.7 |
Reverse Dependencies | 10 direct, 0 indirect |
Downloads | 10979 total (7 in the last 30 days) |
Status | Docs available; successful builds reported (1 report) |