Compressive PCA for Low-Rank Matrices on Graphs
We introduce a novel framework for the approximate recovery of data matrices
which are low-rank on graphs, from sampled measurements. The rows and columns
of such matrices belong to the span of the first few eigenvectors of the graphs
constructed between their rows and columns. We leverage this property to
recover the non-linear low-rank structures efficiently from sampled data
measurements, with a low cost (linear in n). First, a Restricted Isometry
Property (RIP) condition is introduced for efficient uniform sampling of the
rows and columns of such matrices based on the cumulative coherence of graph
eigenvectors. Secondly, a state-of-the-art fast low-rank recovery method is
suggested for the sampled data. Finally, several efficient, parallel and
parameter-free decoders are presented along with their theoretical analysis for
decoding the low-rank and cluster indicators for the full data matrix. Thus, we
overcome the computational limitations of the standard linear low-rank recovery
methods for big datasets. Our method can also be seen as a major step towards
efficient recovery of non-linear low-rank structures. For a matrix of size n ×
p, on a single-core machine, our method gains a speed-up of over Robust
Principal Component Analysis (RPCA), where k << p is the subspace dimension.
Numerically, we can recover a low-rank matrix of size 10304 × 1000, 100 times
faster than Robust PCA.
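
The sampling-and-decoding idea in this abstract can be illustrated with a minimal sketch. It is not the authors' decoder: it assumes a path graph as a stand-in for the graph between rows, treats only the row side, assumes the first k Laplacian eigenvectors are known, and uses a plain least-squares fit on uniformly sampled rows. All names (`Uk`, `rows`, `X_hat`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 100, 8, 5

# Laplacian of a path graph on n nodes (assumed stand-in for the row graph).
A = np.zeros((n, n))
idx = np.arange(n - 1)
A[idx, idx + 1] = 1.0
A[idx + 1, idx] = 1.0
L = np.diag(A.sum(axis=1)) - A

# eigh returns eigenvalues in ascending order, so the first k columns
# are the eigenvectors with the smallest eigenvalues.
_, eigvecs = np.linalg.eigh(L)
Uk = eigvecs[:, :k]

# Data matrix whose columns lie in the span of the first k eigenvectors.
X = Uk @ rng.standard_normal((k, p))

# Uniformly sample m rows without replacement.
m = 30
rows = rng.choice(n, size=m, replace=False)

# Least-squares decode: fit coefficients on the sampled rows only,
# then extend to all n rows through Uk.
coef, *_ = np.linalg.lstsq(Uk[rows], X[rows], rcond=None)
X_hat = Uk @ coef

err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print(f"relative reconstruction error: {err:.2e}")
```

In this noise-free toy setting the fit is exact up to floating-point error as long as the sampled rows of `Uk` have full column rank; the RIP condition mentioned in the abstract is what governs that kind of guarantee in general.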
CUR Decompositions, Similarity Matrices, and Subspace Clustering
A general framework for solving the subspace clustering problem using the CUR
decomposition is presented. The CUR decomposition provides a natural way to
construct similarity matrices for data that come from a union of unknown
subspaces. The similarity
matrices thus constructed give the exact clustering in the noise-free case.
Additionally, this decomposition gives rise to many distinct similarity
matrices from a given set of data, which allow enough flexibility to perform
accurate clustering of noisy data. We also show that two known methods for
subspace clustering can be derived from the CUR decomposition. An algorithm
based on the theoretical construction of similarity matrices is presented, and
experiments on synthetic and real data are presented to test the method.
Additionally, an adaptation of our CUR based similarity matrices is utilized
to provide a heuristic algorithm for subspace clustering; this algorithm yields
the best overall performance to date for clustering the Hopkins155 motion
segmentation dataset.
Comment: Approximately 30 pages. Current version contains improved algorithm
and numerical experiments from the previous version.
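
As a small illustration of the CUR decomposition this abstract builds on, the sketch below checks the standard identity A = C U⁺ R for an exactly low-rank matrix, which holds when the row/column intersection U has the same rank as A. It does not reproduce the paper's similarity-matrix construction or its clustering algorithm; the matrix sizes, random selection of rows and columns, and the `rcond` threshold are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 20, 15, 3

# Exactly rank-r matrix (noise-free case).
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

# Randomly chosen row and column subsets (assumed selection scheme).
rows = rng.choice(m, size=8, replace=False)
cols = rng.choice(n, size=8, replace=False)

C = A[:, cols]               # selected columns
R = A[rows, :]               # selected rows
U = A[np.ix_(rows, cols)]    # intersection of selected rows and columns

# U is 8 x 8 but only rank r, so rcond discards its numerically zero
# singular values before inversion.
A_cur = C @ np.linalg.pinv(U, rcond=1e-10) @ R

err = np.linalg.norm(A - A_cur) / np.linalg.norm(A)
print(f"rank(U) = {np.linalg.matrix_rank(U)}, relative error = {err:.2e}")
```

With noisy data the identity only holds approximately, and the result depends on which rows and columns are selected.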