On Randomly Projected Hierarchical Clustering with Guarantees
Hierarchical clustering (HC) algorithms are generally limited to small data
instances due to their runtime costs. Here we mitigate this shortcoming and
explore fast HC algorithms based on random projections for single (SLC) and
average (ALC) linkage clustering as well as for the minimum spanning tree
problem (MST). We present a thorough adaptive analysis of our algorithms, which
improve on the running time of prior work for a dataset of points in Euclidean
space. The algorithms maintain, with arbitrarily high probability, the outcome
of hierarchical clustering as well as the worst-case running-time guarantees.
We also present parameter-free instances of our algorithms.
Comment: This version contains the conference paper "On Randomly Projected
Hierarchical Clustering with Guarantees", SIAM International Conference on
Data Mining (SDM), 2014 and, additionally, proofs omitted in the conference
version.
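As a rough illustration of the random-projection idea named in the abstract above, the following minimal sketch projects points onto a random line and splits them at a random threshold. It is not the paper's algorithm: the helper names `random_unit_vector`, `project`, and `split_by_projection`, and the choice of a uniformly random threshold, are assumptions made for this example only.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere via Gaussian sampling."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def project(points, direction):
    """Return the scalar projection of every point onto the direction."""
    return [sum(p_i * d_i for p_i, d_i in zip(p, direction)) for p in points]


def split_by_projection(points, rng):
    """Split point indices into two groups at a random threshold on a random line.

    Hypothetical helper: points that are close in the original space are
    also close on the line, so they tend to land in the same group.
    """
    direction = random_unit_vector(len(points[0]), rng)
    values = project(points, direction)
    threshold = rng.uniform(min(values), max(values))
    left = [i for i, v in enumerate(values) if v <= threshold]
    right = [i for i, v in enumerate(values) if v > threshold]
    return left, right


rng = random.Random(0)
points = [[rng.uniform(0.0, 1.0) for _ in range(5)] for _ in range(20)]
left, right = split_by_projection(points, rng)
```

Projection onto a unit vector never increases distances, which is why nearby points are unlikely to be separated by such a split; repeating the split recursively would give a coarse hierarchy under these assumptions.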
Algorithms of maximum likelihood data clustering with applications
We address the problem of data clustering by introducing an unsupervised,
parameter-free approach based on the maximum likelihood principle. Starting from
the observation that data sets belonging to the same cluster share common
information, we construct an expression for the likelihood of any possible
cluster structure. The likelihood in turn depends only on the Pearson
coefficient of the data. We discuss clustering algorithms that provide a fast
and reliable approximation to maximum likelihood configurations. Compared to
standard clustering methods, our approach has the advantages that i) it is
parameter-free, ii) the number of clusters need not be fixed in advance and
iii) the interpretation of the results is transparent. In order to test our
approach and compare it with standard clustering algorithms, we analyze two
very different data sets: time series of financial market returns and gene
expression data. We find that different maximization algorithms produce similar
cluster structures whereas the outcome of standard algorithms has a much wider
variability.
Comment: Accepted by Physica A; 12 pages, 5 figures. More information at:
http://www.sissa.it/dataclusterin
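The abstract above says the likelihood depends only on the Pearson coefficient of the data, but the likelihood expression itself is not reproduced here. The sketch below therefore covers only the input such a method would need: the pairwise Pearson correlation matrix of a set of series. The function names `pearson` and `correlation_matrix` and the toy series are assumptions for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(series):
    """Pairwise Pearson coefficients for a list of series (hypothetical helper)."""
    return [[pearson(a, b) for b in series] for a in series]


# Toy data: the second series moves with the first, the third moves against it.
series = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
]
corr = correlation_matrix(series)
```

In this toy input the first two series are perfectly correlated and the third is perfectly anti-correlated with both, so a correlation-driven method would be expected to group the first two together.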
A Novel Clustering Algorithm Based on Quantum Games
Quantum algorithms have achieved enormous success during the last
decade. In this paper, we combine the quantum game with the problem of data
clustering, and then develop a quantum-game-based clustering algorithm, in
which data points in a dataset are considered as players who can make decisions
and implement quantum strategies in quantum games. After each round of a
quantum game, each player's expected payoff is calculated. Each player then
uses a link-removing-and-rewiring (LRR) function to change neighbors and adjust
the strength of the links connecting to them in order to maximize the payoff.
Further, algorithms are discussed and analyzed in two cases of strategies, two
payoff matrices and two LRR functions. The simulation results
demonstrate that data points in datasets are clustered reasonably and
efficiently, and the clustering algorithms have fast rates of convergence.
Moreover, the comparison with other algorithms also provides an indication of
the effectiveness of the proposed approach.
Comment: 19 pages, 5 figures, 5 tables
Review on recent developments in jet finding
We review recent developments related to jet clustering algorithms and jet
finding. These include fast implementations of sequential recombination
algorithms, new IRC safe algorithms, quantitative determination of jet areas
and quality measures for jet finding, among many others. We also briefly
discuss the status of jet finding in heavy ion collisions, where full QCD jets
have been measured for the first time at RHIC.
Comment: 5 pages, 5 figures, proceedings of the International Symposium on
Multiparticle Dynamics 08, 15-20 September 2008, DES