2,748 research outputs found
Tensor Methods for Nonlinear Matrix Completion
In the low rank matrix completion (LRMC) problem, the low rank assumption
means that the columns (or rows) of the matrix to be completed are points on a
low-dimensional linear algebraic variety. This paper extends this thinking to
cases where the columns are points on a low-dimensional nonlinear algebraic
variety, a problem we call Low Algebraic Dimension Matrix Completion (LADMC).
Matrices whose columns belong to a union of subspaces (UoS) are an important
special case. We propose a LADMC algorithm that leverages existing LRMC methods
on a tensorized representation of the data. For example, a second-order
tensorization representation is formed by taking the outer product of each
column with itself, and we consider higher order tensorizations as well. This
approach will succeed in many cases where traditional LRMC is guaranteed to
fail because the data are low-rank in the tensorized representation but not in
the original representation. We also provide a formal mathematical
justification for the success of our method. In particular, we show bounds of
the rank of these data in the tensorized representation, and we prove sampling
requirements to guarantee uniqueness of the solution. Interestingly, the
sampling requirements of our LADMC algorithm nearly match the information
theoretic lower bounds for matrix completion under a UoS model. We also provide
experimental results showing that the new approach significantly outperforms
existing state-of-the-art methods for matrix completion in many situations
Sparse Subspace Clustering: Algorithm, Theory, and Applications
In many real-world problems, we are dealing with collections of
high-dimensional data, such as images, videos, text and web documents, DNA
microarray data, and more. Often, high-dimensional data lie close to
low-dimensional structures corresponding to several classes or categories the
data belongs to. In this paper, we propose and study an algorithm, called
Sparse Subspace Clustering (SSC), to cluster data points that lie in a union of
low-dimensional subspaces. The key idea is that, among infinitely many possible
representations of a data point in terms of other points, a sparse
representation corresponds to selecting a few points from the same subspace.
This motivates solving a sparse optimization program whose solution is used in
a spectral clustering framework to infer the clustering of data into subspaces.
Since solving the sparse optimization program is in general NP-hard, we
consider a convex relaxation and show that, under appropriate conditions on the
arrangement of subspaces and the distribution of data, the proposed
minimization program succeeds in recovering the desired sparse representations.
The proposed algorithm can be solved efficiently and can handle data points
near the intersections of subspaces. Another key advantage of the proposed
algorithm with respect to the state of the art is that it can deal with data
nuisances, such as noise, sparse outlying entries, and missing entries,
directly by incorporating the model of the data into the sparse optimization
program. We demonstrate the effectiveness of the proposed algorithm through
experiments on synthetic data as well as the two real-world problems of motion
segmentation and face clustering
- …