Completing Low-Rank Matrices with Corrupted Samples from Few Coefficients in General Basis
Subspace recovery from corrupted and missing data is crucial for various
applications in signal processing and information theory. To complete missing
values and detect column corruptions, existing robust Matrix Completion (MC)
methods mostly concentrate on recovering a low-rank matrix from few corrupted
coefficients w.r.t. the standard basis, which, however, does not apply to more
general bases, e.g., the Fourier basis. In this paper, we prove that the range
space of a low-rank matrix can be exactly recovered from few coefficients
w.r.t. a general basis, even when the rank and the number of corrupted samples
are both large. Our model covers
previous ones as special cases, and robust MC can recover the intrinsic matrix
with a higher rank. Moreover, we suggest a universal choice of the
regularization parameter. With our filtering algorithm, which has theoretical
guarantees, we can
further reduce the computational cost of our model. As an application, we also
find that the solutions to extended robust Low-Rank Representation and to our
extended robust MC are mutually expressible, so both our theory and algorithm
can be applied to the subspace clustering problem with missing values under
certain conditions. Experiments verify our theories. Comment: To appear in IEEE Transactions on Information Theory.
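As a rough illustration of the kind of convex program such robust MC methods solve, the sketch below uses the common nuclear-norm-plus-l2,1 formulation over entries observed in the standard basis. It does not reproduce the paper's general-basis model; the CVXPY formulation, variable names, and regularization weight are assumptions for illustration only.

```python
# A minimal sketch of a standard robust matrix completion program, assuming the
# common "nuclear norm + l2,1 column-sparse corruption" model:
#     min_{L,S} ||L||_* + lam * ||S||_{2,1}   s.t.  P_Omega(L + S) = P_Omega(M)
# Variable names and the use of CVXPY are illustrative, not the paper's code.
import numpy as np
import cvxpy as cp

def robust_mc(M, omega, lam=0.1):
    """M: data matrix; omega: 0/1 array marking observed entries; lam: regularization weight."""
    L = cp.Variable(M.shape)                     # low-rank component
    S = cp.Variable(M.shape)                     # column-wise corruptions
    l21 = cp.sum(cp.norm(S, 2, axis=0))          # l2,1 norm: sum of column l2 norms
    constraints = [cp.multiply(omega, L + S) == cp.multiply(omega, M)]
    cp.Problem(cp.Minimize(cp.normNuc(L) + lam * l21), constraints).solve()
    return L.value, S.value
```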
Oracle Based Active Set Algorithm for Scalable Elastic Net Subspace Clustering
State-of-the-art subspace clustering methods are based on expressing each
data point as a linear combination of other data points while regularizing the
matrix of coefficients with $\ell_1$, $\ell_2$ or nuclear norms. $\ell_1$
regularization is guaranteed to give a subspace-preserving affinity (i.e.,
there are no connections between points from different subspaces) under broad
theoretical conditions, but the clusters may not be connected. $\ell_2$ and
nuclear norm regularization often improve connectivity, but give a
subspace-preserving affinity only for independent subspaces. Mixed $\ell_1$, $\ell_2$
and nuclear norm regularizations offer a balance between the
subspace-preserving and connectedness properties, but this comes at the cost of
increased computational complexity. This paper studies the geometry of the
elastic net regularizer (a mixture of the $\ell_1$ and $\ell_2$ norms) and uses
it to derive a provably correct and scalable active set method for finding the
optimal coefficients. Our geometric analysis also provides a theoretical
justification and a geometric interpretation for the balance between the
connectedness (due to $\ell_2$ regularization) and subspace-preserving (due to $\ell_1$
regularization) properties for elastic net subspace clustering. Our
experiments show that the proposed active set method not only achieves
state-of-the-art clustering performance, but also efficiently handles
large-scale datasets. Comment: 15 pages, 6 figures, accepted to CVPR 2016 for oral presentation.
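For intuition about the self-expressive elastic net model (not the oracle-based active set solver proposed in the paper), a naive per-point solve followed by spectral clustering can be sketched as follows; scikit-learn's ElasticNet and the hyperparameter values are illustrative assumptions.

```python
# Naive elastic net subspace clustering sketch: express each point as an
# elastic-net-regularized combination of the other points, then spectrally
# cluster the symmetrized affinity. This brute-force per-point solve stands in
# for the paper's active set method; alpha / l1_ratio values are arbitrary.
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.cluster import SpectralClustering

def ensc_naive(X, n_clusters, alpha=1e-2, l1_ratio=0.9):
    """X: (n_samples, n_features) data matrix, rows assumed near a union of subspaces."""
    n = X.shape[0]
    C = np.zeros((n, n))
    for j in range(n):
        idx = np.delete(np.arange(n), j)                 # exclude the point itself
        model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, fit_intercept=False)
        model.fit(X[idx].T, X[j])                        # x_j ~ sum_i c_i x_i, i != j
        C[j, idx] = model.coef_
    A = np.abs(C) + np.abs(C).T                          # symmetric affinity
    return SpectralClustering(n_clusters=n_clusters,
                              affinity="precomputed").fit_predict(A)
```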
Scalable Sparse Subspace Clustering by Orthogonal Matching Pursuit
Subspace clustering methods based on $\ell_1$, $\ell_2$ or nuclear norm
regularization have become very popular due to their simplicity, theoretical
guarantees and empirical success. However, the choice of the regularizer can
greatly impact both theory and practice. For instance, $\ell_1$ regularization
is guaranteed to give a subspace-preserving affinity (i.e., there are no
connections between points from different subspaces) under broad conditions
(e.g., arbitrary subspaces and corrupted data). However, it requires solving a
large-scale convex optimization problem. On the other hand, $\ell_2$ and
nuclear norm regularization provide efficient closed form solutions, but
require very strong assumptions to guarantee a subspace-preserving affinity,
e.g., independent subspaces and uncorrupted data. In this paper we study a
subspace clustering method based on orthogonal matching pursuit. We show that
the method is both computationally efficient and guaranteed to give a
subspace-preserving affinity under broad conditions. Experiments on synthetic
data verify our theoretical analysis, and applications in handwritten digit and
face clustering show that our approach achieves the best trade-off between
accuracy and efficiency. Comment: 13 pages, 1 figure, 2 tables. Accepted to CVPR 2016 as an oral
presentation.
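A minimal sketch of the OMP-based self-expression idea is given below: each point is approximated by a k-sparse combination of the other points, and the resulting affinity is clustered spectrally. The scikit-learn solver and the choice of k are assumptions, not the authors' implementation.

```python
# Sketch of SSC by orthogonal matching pursuit: greedily select a few other
# points to represent each point, then spectrally cluster the affinity.
# k and the sklearn solver are illustrative choices only.
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit
from sklearn.cluster import SpectralClustering

def ssc_omp(X, n_clusters, k=5):
    """X: (n_samples, n_features); k: number of nonzero coefficients per point."""
    n = X.shape[0]
    C = np.zeros((n, n))
    for j in range(n):
        idx = np.delete(np.arange(n), j)                 # never use the point itself
        omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
        omp.fit(X[idx].T, X[j])                          # x_j ~ k-sparse combination
        C[j, idx] = omp.coef_
    A = np.abs(C) + np.abs(C).T
    return SpectralClustering(n_clusters=n_clusters,
                              affinity="precomputed").fit_predict(A)
```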
Graph Connectivity in Noisy Sparse Subspace Clustering
Subspace clustering is the problem of clustering data points into a union of
low-dimensional linear/affine subspaces. It is the mathematical abstraction of
many important problems in computer vision, image processing and machine
learning. A line of recent work [4, 19, 24, 20] provided strong theoretical
guarantees for sparse subspace clustering [4], the state-of-the-art algorithm
for subspace clustering, on both noiseless and noisy data sets. It was shown
that under mild conditions, with high probability no two points from different
subspaces are clustered together. Such a guarantee, however, is not sufficient
for the clustering to be correct, due to the notorious "graph connectivity
problem" [15]. In this paper, we investigate the graph connectivity problem for
noisy sparse subspace clustering and show that a simple post-processing
procedure is capable of delivering consistent clustering under certain "general
position" or "restricted eigenvalue" assumptions. We also show that our
condition is almost tight with adversarial noise perturbation by constructing a
counter-example. These results provide the first exact clustering guarantee of
noisy SSC for subspaces of dimension greater than 3. Comment: 14 pages. To appear in The 19th International Conference on
Artificial Intelligence and Statistics, held at Cadiz, Spain in 2016.
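To make the graph connectivity problem concrete, the following diagnostic counts the connected components of each ground-truth subspace's subgraph in an SSC affinity; more than one component signals an over-segmented class. This check only illustrates the issue and is not the post-processing procedure proposed in the paper.

```python
# Diagnostic sketch for the graph connectivity problem: even if no edge joins
# points from different subspaces, a subspace's own subgraph may split into
# several connected components.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def components_per_class(affinity, labels):
    """affinity: (n, n) symmetric SSC affinity; labels: ground-truth subspace labels."""
    counts = {}
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        sub = csr_matrix(affinity[np.ix_(idx, idx)])      # subgraph of one subspace
        n_comp, _ = connected_components(sub, directed=False)
        counts[c] = n_comp                                 # >1 means the class is split
    return counts
```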