How to Round Subspaces: A New Spectral Clustering Algorithm
A basic problem in spectral clustering is the following: if a solution
obtained from the spectral relaxation is close to an integral solution, is it
possible to find this integral solution even though the two might be expressed
in completely different bases? In this paper, we propose a new spectral
clustering algorithm. It can recover a k-partition such that the subspace
corresponding to the span of its indicator vectors is close to the original
subspace in spectral norm, with the error close to the minimum possible.
Moreover, our algorithm does not impose any restriction on the cluster sizes.
Previously, no algorithm was known that could find a k-partition with a
comparable closeness guarantee.
We present two applications of our algorithm. The first finds a disjoint
union of bounded-degree expanders which approximates a given graph in spectral
norm. The second approximates the sparsest k-partition in a graph in which
each cluster has small expansion, under a gap condition on the eigenvalues of
the Laplacian matrix. This significantly improves upon previous algorithms,
which required a stronger condition.
Comment: Appeared in SODA 201
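For context, the rounding question above sits on top of the standard spectral clustering pipeline. A minimal NumPy sketch of that baseline (normalized-Laplacian embedding followed by k-means rounding — not the paper's subspace-rounding algorithm, and with an illustrative farthest-point initialization) might look like:

```python
import numpy as np

def spectral_embed(A, k):
    """Embed nodes via the bottom-k eigenvectors of the normalized Laplacian."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return vecs[:, :k]           # rows are node embeddings

def kmeans_round(X, k, iters=50):
    """Round the continuous embedding to an integral k-partition.

    Farthest-point initialization keeps the sketch deterministic."""
    centers = [X[0]]
    for _ in range(k - 1):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Two disjoint triangles: the rounded 2-partition should separate
# nodes {0, 1, 2} from nodes {3, 4, 5}.
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    A[i, j] = A[j, i] = 1.0
labels = kmeans_round(spectral_embed(A, 2), 2)
print(labels)
```

On this toy graph the bottom-two eigenvectors span the component indicators exactly, so the k-means step recovers the two triangles; the paper's contribution is making this rounding step work with provable spectral-norm guarantees.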
Macrostate Data Clustering
We develop an effective nonhierarchical data clustering method using an
analogy to the dynamic coarse graining of a stochastic system. Analyzing the
eigensystem of an interitem transition matrix identifies fuzzy clusters
corresponding to the metastable macroscopic states (macrostates) of a diffusive
system. A "minimum uncertainty criterion" determines the linear transformation
from eigenvectors to cluster-defining window functions. Eigenspectrum gap and
cluster certainty conditions identify the proper number of clusters. The
physically motivated fuzzy representation and the associated uncertainty
analysis distinguish macrostate clustering from spectral partitioning methods.
Macrostate data clustering solves a variety of test cases that challenge other
methods.
Comment: keywords: cluster analysis, clustering, pattern recognition, spectral
graph theory, dynamic eigenvectors, machine learning, macrostates,
classification
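A toy illustration of the eigenspectrum-gap idea (a simplified sketch of the general principle, not the authors' full macrostate method): build a row-stochastic inter-item transition matrix from a Gaussian similarity kernel, then read the number of clusters off the gap in its eigenvalue spectrum, since one eigenvalue per metastable macrostate stays near 1.

```python
import numpy as np

# Toy data: two well-separated 1-D groups of items.
x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])

# Inter-item transition matrix: Gaussian similarities, normalized so each
# row sums to 1 (a diffusion process hopping between items).
W = np.exp(-(x[:, None] - x[None, :]) ** 2)
T = W / W.sum(axis=1, keepdims=True)

# Eigenvalues of a row-stochastic matrix lie in [-1, 1]. One eigenvalue per
# metastable macrostate sits near 1; the largest gap in the sorted spectrum
# suggests the number of clusters.
vals = np.sort(np.abs(np.linalg.eigvals(T)))[::-1]
gaps = vals[:-1] - vals[1:]
n_clusters = int(np.argmax(gaps)) + 1
print("leading eigenvalues:", vals[:3].round(4))
print("estimated clusters:", n_clusters)
```

Here two eigenvalues are essentially 1 (the two near-disconnected groups) and the rest are near 0, so the gap criterion returns two clusters.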
Poisson noise reduction with non-local PCA
Photon-limited imaging arises when the number of photons collected by a
sensor array is small relative to the number of detector elements. Photon
limitations are an important concern for many applications such as spectral
imaging, night vision, nuclear medicine, and astronomy. Typically a Poisson
distribution is used to model these observations, and the inherent
heteroscedasticity of the data combined with standard noise removal methods
yields significant artifacts. This paper introduces a novel denoising algorithm
for photon-limited images which combines elements of dictionary learning and
sparse patch-based representations of images. The method employs both an
adaptation of Principal Component Analysis (PCA) for Poisson noise and recently
developed sparsity-regularized convex optimization algorithms for
photon-limited images. A comprehensive empirical evaluation of the proposed
method helps characterize the performance of this approach relative to other
state-of-the-art denoising methods. The results reveal that, despite its
conceptual simplicity, Poisson PCA-based denoising appears to be highly
competitive in very low light regimes.
Comment: erratum: the image "man" is wrongly named "pepper" in the journal version
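The heteroscedasticity mentioned above — Poisson variance equals the mean, so noise grows with signal intensity — is why standard Gaussian-noise denoisers produce artifacts on photon-limited data. A common baseline (distinct from the paper's Poisson PCA) is the Anscombe variance-stabilizing transform f(x) = 2*sqrt(x + 3/8); the sketch below checks numerically that it flattens the variance across intensity levels:

```python
import numpy as np

rng = np.random.default_rng(0)

def anscombe(x):
    """Anscombe transform: maps Poisson counts to roughly unit-variance data."""
    return 2.0 * np.sqrt(x + 3.0 / 8.0)

# Raw Poisson variance equals the mean (heteroscedastic); after the
# transform the variance is approximately 1 regardless of intensity.
stab_var = {}
for lam in [5.0, 20.0, 80.0]:
    counts = rng.poisson(lam, size=200_000)
    stab_var[lam] = anscombe(counts).var()
    print(f"mean {lam:5.1f}: raw var {counts.var():7.2f}, "
          f"stabilized var {stab_var[lam]:5.2f}")
```

After stabilization, off-the-shelf Gaussian denoisers become applicable; the transform's accuracy degrades at very low intensities, which is precisely the regime the paper's Poisson-adapted PCA targets.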
Algorithmic and Statistical Perspectives on Large-Scale Data Analysis
In recent years, ideas from statistics and scientific computing have begun to
interact in increasingly sophisticated and fruitful ways with ideas from
computer science and the theory of algorithms to aid in the development of
improved worst-case algorithms that are useful for large-scale scientific and
Internet data analysis problems. In this chapter, I will describe two recent
examples---one having to do with selecting good columns or features from a (DNA
Single Nucleotide Polymorphism) data matrix, and the other having to do with
selecting good clusters or communities from a data graph (representing a social
or information network)---that drew on ideas from both areas and that may serve
as a model for exploiting complementary algorithmic and statistical
perspectives in order to solve applied large-scale data analysis problems.
Comment: 33 pages. To appear in Uwe Naumann and Olaf Schenk, editors,
"Combinatorial Scientific Computing," Chapman and Hall/CRC Press, 201
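Column-selection problems of the kind described here are commonly approached with leverage scores: score each column by the squared row norms of the top-k right singular vectors, then keep the highest-scoring columns. The sketch below illustrates that general technique on a synthetic matrix (an illustration, not the chapter's exact algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)

def leverage_scores(A, k):
    """Rank-k leverage score of each column of A: squared row norms of
    the top-k right singular vectors."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return (Vt[:k] ** 2).sum(axis=0)  # one score per column

# Synthetic matrix: the first two columns carry the signal, the other
# eight are low-amplitude noise.
n = 100
signal = rng.normal(size=(n, 2))
noise = 0.01 * rng.normal(size=(n, 8))
A = np.hstack([signal, noise])

scores = leverage_scores(A, k=2)
top2 = set(np.argsort(scores)[-2:])
print("leverage scores:", scores.round(3))
print("selected columns:", sorted(top2))  # the signal columns should win
```

In the data-analysis setting the chapter discusses, the selected columns (e.g. informative SNPs) retain interpretability, which is a key statistical motivation for column selection over generic low-rank projections.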