1,326 research outputs found
HIERARCHICAL CLUSTERING USING LEVEL SETS
Over the past several decades, clustering algorithms have earned their place as a go-to solution for database mining. This paper introduces a new concept which is used to develop a new recursive version of DBSCAN that can successfully perform hierarchical clustering, called Level- Set Clustering (LSC). A level-set is a subset of points of a data-set whose densities are greater than some threshold, ‘t’. By graphing the size of each level-set against its respective ‘t,’ indents are produced in the line graph which correspond to clusters in the data-set, as the points in a cluster have very similar densities. This new algorithm is able to produce the clustering result with the same O(n log n) time complexity as DBSCAN and OPTICS, while catching clusters the others missed
Consistency, breakdown robustness, and algorithms for robust improper maximum likelihood clustering
The robust improper maximum likelihood estimator (RIMLE) is a new method for
robust multivariate clustering finding approximately Gaussian clusters. It
maximizes a pseudo-likelihood defined by adding a component with improper
constant density for accommodating outliers to a Gaussian mixture. A special
case of the RIMLE is MLE for multivariate finite Gaussian mixture models. In
this paper we treat existence, consistency, and breakdown theory for the RIMLE
comprehensively. RIMLE's existence is proved under non-smooth covariance matrix
constraints. It is shown that these can be implemented via a computationally
feasible Expectation-Conditional Maximization algorithm.Comment: The title of this paper was originally: "A consistent and breakdown
robust model-based clustering method
Mapping Topographic Structure in White Matter Pathways with Level Set Trees
Fiber tractography on diffusion imaging data offers rich potential for
describing white matter pathways in the human brain, but characterizing the
spatial organization in these large and complex data sets remains a challenge.
We show that level set trees---which provide a concise representation of the
hierarchical mode structure of probability density functions---offer a
statistically-principled framework for visualizing and analyzing topography in
fiber streamlines. Using diffusion spectrum imaging data collected on
neurologically healthy controls (N=30), we mapped white matter pathways from
the cortex into the striatum using a deterministic tractography algorithm that
estimates fiber bundles as dimensionless streamlines. Level set trees were used
for interactive exploration of patterns in the endpoint distributions of the
mapped fiber tracks and an efficient segmentation of the tracks that has
empirical accuracy comparable to standard nonparametric clustering methods. We
show that level set trees can also be generalized to model pseudo-density
functions in order to analyze a broader array of data types, including entire
fiber streamlines. Finally, resampling methods show the reliability of the
level set tree as a descriptive measure of topographic structure, illustrating
its potential as a statistical descriptor in brain imaging analysis. These
results highlight the broad applicability of level set trees for visualizing
and analyzing high-dimensional data like fiber tractography output
- …