Dimensionality Reduction for k-Means Clustering and Low Rank Approximation
We show how to approximate a data matrix $\mathbf{A}$ with a much smaller
sketch $\tilde{\mathbf{A}}$ that can be used to solve a general class of
constrained k-rank approximation problems to within $(1+\epsilon)$ error.
Importantly, this class of problems includes k-means clustering and
unconstrained low rank approximation (i.e. principal component analysis). By
reducing data points to just $O(k)$ dimensions, our methods generically
accelerate any exact, approximate, or heuristic algorithm for these ubiquitous
problems.
For k-means dimensionality reduction, we provide $(1+\epsilon)$ relative
error results for many common sketching techniques, including random row
projection, column selection, and approximate SVD. For approximate principal
component analysis, we give a simple alternative to known algorithms that has
applications in the streaming setting. Additionally, we extend recent work on
column-based matrix reconstruction, giving column subsets that not only 'cover'
a good subspace for $\mathbf{A}$, but can be used directly to compute this
subspace.
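As a hedged illustration of the SVD-based sketch mentioned above, the following minimal NumPy example projects the rows of a matrix onto its top `m` right singular vectors and compares the k-means cost of one fixed partition before and after. The helper names (`kmeans_cost`, `svd_sketch`), the sizes, and the synthetic data are assumptions made for this illustration, and an exact SVD stands in for the approximate SVD the abstract refers to.

```python
import numpy as np

def kmeans_cost(X, labels, k):
    """Sum of squared distances from each row of X to its cluster centroid."""
    cost = 0.0
    for j in range(k):
        pts = X[labels == j]
        if len(pts) > 0:
            cost += float(((pts - pts.mean(axis=0)) ** 2).sum())
    return cost

def svd_sketch(A, m):
    """Project the rows of A onto its top-m right singular vectors.

    Returns the n x m sketch and the discarded energy ||A - A_m||_F^2.
    """
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    sketch = A @ Vt[:m].T
    tail = float((s[m:] ** 2).sum())
    return sketch, tail

# Synthetic data with a planted partition (illustrative sizes only).
rng = np.random.default_rng(0)
n, d, k, m = 300, 50, 3, 6
centers = rng.normal(scale=5.0, size=(k, d))
labels = rng.integers(0, k, size=n)
A = centers[labels] + rng.normal(size=(n, d))

sketch, tail = svd_sketch(A, m)
full_cost = kmeans_cost(A, labels, k)
sketch_cost = kmeans_cost(sketch, labels, k)
print(f"cost on A: {full_cost:.1f}  cost on sketch: {sketch_cost:.1f}  tail: {tail:.1f}")
```

Because the sketch is an orthogonal projection of the rows, the cost of any partition on the sketch never exceeds its cost on the original data, and the gap is at most the discarded energy `tail`. The example only checks this sandwich for one partition; it does not establish the relative-error guarantee claimed in the abstract.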
Finally, for k-means clustering, we show how to achieve a $(9+\epsilon)$
approximation by Johnson-Lindenstrauss projecting data points to just
$O(\log k/\epsilon^2)$ dimensions. This gives the first result that leverages the
specific structure of k-means to achieve dimension independent of input size
and sublinear in $k$.
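The Johnson-Lindenstrauss projection step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' construction: it uses a dense scaled Gaussian matrix, leaves the target dimension `m` as a free parameter rather than deriving it from any bound, and the names `jl_project` and `kmeans_cost` and the synthetic data are hypothetical.

```python
import numpy as np

def jl_project(X, m, seed=0):
    """Map the rows of X from d to m dimensions with a scaled Gaussian matrix."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Scaling by 1/sqrt(m) preserves squared norms in expectation.
    Pi = rng.normal(size=(d, m)) / np.sqrt(m)
    return X @ Pi

def kmeans_cost(X, labels, k):
    """Sum of squared distances from each row of X to its cluster centroid."""
    cost = 0.0
    for j in range(k):
        pts = X[labels == j]
        if len(pts) > 0:
            cost += float(((pts - pts.mean(axis=0)) ** 2).sum())
    return cost

# Synthetic data with a planted partition (illustrative sizes only).
rng = np.random.default_rng(1)
n, d, k, m = 200, 500, 4, 64
centers = rng.normal(scale=3.0, size=(k, d))
labels = rng.integers(0, k, size=n)
X = centers[labels] + rng.normal(size=(n, d))

X_low = jl_project(X, m)
ratio = kmeans_cost(X_low, labels, k) / kmeans_cost(X, labels, k)
print(f"projected cost / original cost = {ratio:.3f}")
```

Any k-means solver can then be run on `X_low` instead of `X`. The printed ratio only shows how well the cost of one partition survives the projection for this particular random draw.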
A new jet algorithm based on the k-means clustering for the reconstruction of heavy states from jets
A jet algorithm based on the k-means clustering procedure is proposed which
can be used for the invariant-mass reconstruction of heavy states decaying to
hadronic jets. The proposed algorithm was tested by reconstructing the
$e^+e^- \to t\bar{t} \to 6$ jets and $e^+e^- \to W^+W^- \to 4$ jets processes at $\sqrt{s}=500$ GeV using
a Monte Carlo simulation. It was shown that the algorithm has a reconstruction
efficiency similar to traditional jet-finding algorithms, and leads to 25% and
40% reduction of reconstruction width for top quarks and W bosons,
respectively, compared to the kT (Durham) algorithm. In addition, it is
expected that the peak positions measured with the new algorithm have smaller
systematic uncertainty.
Comment: 11 pages, 3 eps figures (Eur. Phys. J. C, in press
- …
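For readers unfamiliar with the quantity being reconstructed, the invariant mass of a group of jets follows from their summed four-momenta, $m^2 = (\sum E)^2 - |\sum \vec{p}|^2$ in natural units. The short sketch below computes it. It does not reproduce the k-means jet clustering itself, whose distance measure is not given in this abstract; the function name `invariant_mass` and the `(E, px, py, pz)` tuple layout are assumptions made for illustration.

```python
import math

def invariant_mass(jets):
    """Invariant mass of jets given as (E, px, py, pz) tuples, in GeV with c = 1."""
    E = sum(j[0] for j in jets)
    px = sum(j[1] for j in jets)
    py = sum(j[2] for j in jets)
    pz = sum(j[3] for j in jets)
    m2 = E * E - (px * px + py * py + pz * pz)
    # Clamp tiny negative values that can arise from rounding.
    return math.sqrt(max(m2, 0.0))

# Two massless, back-to-back 40 GeV jets (made-up numbers).
pair = [(40.0, 40.0, 0.0, 0.0), (40.0, -40.0, 0.0, 0.0)]
print(invariant_mass(pair))  # 80.0
```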