Exactly Robust Kernel Principal Component Analysis
Robust principal component analysis (RPCA) can recover low-rank matrices when
they are corrupted by sparse noise. In practice, however, many matrices are of
high rank and hence cannot be recovered by RPCA. We propose a novel method
called robust kernel principal component analysis (RKPCA) to decompose a
partially corrupted matrix as a sparse matrix plus a high or full-rank matrix
with low latent dimensionality. RKPCA can be applied to many problems such as
noise removal and subspace clustering, and is still the only unsupervised
nonlinear method robust to sparse noise. Our theoretical analysis shows that,
with high probability, RKPCA can provide high recovery accuracy. The
optimization of RKPCA involves nonconvex and nondifferentiable problems. We
propose two nonconvex optimization algorithms for RKPCA: the alternating
direction method of multipliers (ADMM) with backtracking line search, and
proximal linearized minimization with an adaptive step size. Comparative studies in noise
removal and robust subspace clustering corroborate the effectiveness and
superiority of RKPCA.
Comment: The paper was accepted by IEEE Transactions on Neural Networks and Learning Systems.
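As a point of reference, the classical linear RPCA (principal component pursuit) that RKPCA generalizes can be sketched in a few lines. The ADMM loop below illustrates the linear model only, not the authors' kernelized algorithm; the parameter defaults are common heuristic choices, not values from the paper.

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def shrink(X, tau):
    """Soft thresholding: proximal operator of the elementwise l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca(M, lam=None, mu=None, n_iter=200):
    """Linear RPCA / principal component pursuit via ADMM:
    decompose M = L + S with L low-rank and S sparse."""
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))          # standard PCP weight
    if mu is None:
        mu = 0.25 * m * n / (np.abs(M).sum() + 1e-12)
    L, S, Y = np.zeros_like(M), np.zeros_like(M), np.zeros_like(M)
    for _ in range(n_iter):
        L = svt(M - S + Y / mu, 1.0 / mu)       # low-rank update
        S = shrink(M - L + Y / mu, lam / mu)    # sparse update
        Y = Y + mu * (M - L - S)                # dual ascent on the constraint
    return L, S
```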
Kernelized Low Rank Representation on Grassmann Manifolds
Low rank representation (LRR) has recently attracted great interest due to
its effectiveness in exploring low-dimensional subspace structures embedded
in data. One of its successful applications is subspace clustering, in which
data are clustered according to the subspaces they belong to. In this paper, at
a higher level, we intend to cluster subspaces into classes of subspaces. This
is naturally described as a clustering problem on the Grassmann manifold. The
novelty of this paper is to generalize LRR from Euclidean space to an LRR model
on the Grassmann manifold within a unified kernelized framework. The new methods have
many applications in computer vision tasks. Several clustering experiments are
conducted on handwritten digit images, dynamic textures, human face clips and
traffic scene sequences. The experimental results show that the proposed
methods outperform a number of state-of-the-art subspace clustering methods.
Comment: 13 pages.
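For context, the Euclidean LRR model that this paper lifts to the Grassmann manifold has a well-known closed form in the noiseless case: min ||Z||_* s.t. X = XZ is solved by the shape interaction matrix Z* = VV^T, where X = USV^T is the skinny SVD. A minimal sketch of that baseline follows; the kernelized Grassmann variants replace the Euclidean inner products with a manifold kernel, which is not shown here.

```python
import numpy as np

def lrr_noiseless(X, rank=None, tol=1e-10):
    """Closed-form noiseless LRR: min ||Z||_* s.t. X = X Z is solved by
    Z* = V V^T (the shape interaction matrix), with X = U S V^T."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # keep singular vectors above a relative tolerance unless rank is given
    r = rank if rank is not None else int((s > tol * s[0]).sum())
    V = Vt[:r].T
    return V @ V.T
```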
Low Rank Representation on Grassmann Manifolds: An Extrinsic Perspective
Many computer vision algorithms employ subspace models to represent data.
Low-rank representation (LRR) has been successfully applied in subspace
clustering, in which data are clustered according to their subspace structures.
The possibility of extending LRR to the Grassmann manifold is explored in this
paper. Rather than directly embedding the Grassmann manifold into a symmetric
matrix space, an extrinsic view is taken by building the self-representation of
LRR over the tangent space of each Grassmannian point. A new algorithm for
solving the proposed Grassmannian LRR model is designed and implemented.
Several clustering experiments are conducted on handwritten digits dataset,
dynamic texture video clips and YouTube celebrity face video data. The
experimental results show that our method outperforms a number of existing methods.
Comment: 9 pages.
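To make the objects concrete: a Grassmann point is the subspace spanned by an orthonormal basis, and the symmetric-matrix embedding mentioned above maps a basis X to its projection matrix XX^T. The sketch below shows this standard construction and the induced extrinsic (projection) distance; the paper's tangent-space self-representation is not reproduced here.

```python
import numpy as np

def grassmann_point(Y, q):
    """Represent a data set Y (d x m) by the span of its top-q left
    singular vectors: a point on the Grassmann manifold G(q, d)."""
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    return U[:, :q]

def projection_distance(X1, X2):
    """Extrinsic (chordal/projection) distance between Grassmann points,
    computed through the projection-matrix embedding X X^T."""
    P1, P2 = X1 @ X1.T, X2 @ X2.T
    return np.linalg.norm(P1 - P2, 'fro') / np.sqrt(2.0)
```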
Kernelized LRR on Grassmann Manifolds for Subspace Clustering
Low rank representation (LRR) has recently attracted great interest due to
its effectiveness in exploring low-dimensional subspace structures
embedded in data. One of its successful applications is subspace clustering, in
which data are clustered according to the subspaces they belong to. In this
paper, at a higher level, we intend to cluster subspaces into classes of
subspaces. This is naturally described as a clustering problem on the Grassmann
manifold. The novelty of this paper is to generalize LRR from Euclidean space
to an LRR model on the Grassmann manifold within a unified kernelized LRR framework.
The new method has many applications in data analysis for computer vision tasks.
The proposed models have been evaluated on a number of practical data analysis
applications. The experimental results show that the proposed models outperform
a number of state-of-the-art subspace clustering methods.
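Whatever variant produces the coefficient matrix Z, LRR-style subspace clustering typically finishes with the same post-processing step: symmetrize |Z| into an affinity matrix and apply spectral clustering. A minimal sketch of that step, using scikit-learn; the |Z| + |Z|^T affinity recipe is the common convention, assumed here rather than taken from this paper.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def cluster_from_lrr(Z, n_clusters):
    """Turn an LRR coefficient matrix Z into cluster labels: symmetrize
    |Z| into an affinity matrix and run spectral clustering on it."""
    W = 0.5 * (np.abs(Z) + np.abs(Z).T)
    sc = SpectralClustering(n_clusters=n_clusters, affinity='precomputed')
    return sc.fit_predict(W)
```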
Co-manifold learning with missing data
Representation learning is typically applied to only one mode of a data
matrix, either its rows or columns. Yet in many applications, there is an
underlying geometry to both the rows and the columns. We propose utilizing this
coupled structure to perform co-manifold learning: uncovering the underlying
geometry of both the rows and the columns of a given matrix, where we focus on
a missing data setting. Our unsupervised approach consists of three components.
We first solve a family of optimization problems to estimate a complete matrix
at multiple scales of smoothness. We then use this collection of smooth matrix
estimates to compute pairwise distances on the rows and columns based on a new
multi-scale metric that implicitly introduces a coupling between the rows and
the columns. Finally, we construct row and column representations from these
multi-scale metrics. We demonstrate that our approach outperforms competing
methods in both data visualization and clustering.
Comment: 16 pages, 9 figures.
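To illustrate the flavor of the multi-scale construction (not the paper's actual estimator or metric): below, Gaussian smoothing along the columns stands in for the paper's family of optimization-based smooth estimates, and row distances are simply averaged across scales. The paper's metric additionally couples the rows and columns and handles missing entries, which this sketch omits.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def multiscale_row_distances(A, sigmas=(0.5, 1.0, 2.0, 4.0)):
    """Illustrative multi-scale row metric: smooth the matrix at several
    scales and accumulate pairwise row distances across the collection."""
    n = A.shape[0]
    D = np.zeros((n, n))
    for sigma in sigmas:
        S = gaussian_filter(A, sigma=(0, sigma))   # smooth along columns only
        diff = S[:, None, :] - S[None, :, :]       # all pairwise row differences
        D += np.sqrt((diff ** 2).sum(axis=2))      # Euclidean row distances at this scale
    return D / len(sigmas)
```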
Clustering is semidefinitely not that hard: Nonnegative SDP for manifold disentangling
In solving hard computational problems, semidefinite program (SDP)
relaxations often play an important role because they come with a guarantee of
optimality. Here, we focus on a popular semidefinite relaxation of K-means
clustering which yields the same solution as the non-convex original
formulation for well segregated datasets. We report an unexpected finding: when
data contains (greater than zero-dimensional) manifolds, the SDP solution
captures such geometrical structures. Unlike traditional manifold embedding
techniques, our approach does not rely on manually defining a kernel but rather
enforces locality via a nonnegativity constraint. We thus call our approach
NOnnegative MAnifold Disentangling, or NOMAD. To build an intuitive
understanding of its manifold learning capabilities, we develop a theoretical
analysis of NOMAD on idealized datasets. While NOMAD is convex and its globally
optimal solution can be found by generic SDP solvers in polynomial time,
such solvers are too slow for modern datasets. To address this problem, we
analyze a non-convex heuristic and present a new, convex yet efficient
algorithm based on the conditional gradient method. Our results render NOMAD a
versatile, understandable, and powerful tool for manifold learning.
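The SDP the abstract refers to is the standard (Peng-Wei) semidefinite relaxation of K-means; the elementwise nonnegativity constraint on the Gram-like variable is what NOMAD exploits to enforce locality. A minimal cvxpy sketch of that relaxation, solved with a generic solver; the paper's efficient conditional-gradient algorithm is not reproduced.

```python
import numpy as np
import cvxpy as cp

def kmeans_sdp(points, k):
    """Peng-Wei SDP relaxation of K-means with the nonnegativity
    constraint that enforces locality in NOMAD."""
    n = len(points)
    G = points @ points.T
    d = np.diag(G)
    D = d[:, None] + d[None, :] - 2 * G          # squared Euclidean distances
    X = cp.Variable((n, n), PSD=True)
    constraints = [X >= 0,                       # elementwise nonnegativity (the key constraint)
                   cp.sum(X, axis=1) == 1,       # rows sum to one
                   cp.trace(X) == k]             # k clusters
    prob = cp.Problem(cp.Minimize(cp.trace(D @ X)), constraints)
    prob.solve(solver=cp.SCS)
    return X.value
```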
Localized LRR on Grassmann Manifolds: An Extrinsic View
Subspace data representation has recently become a common practice in many
computer vision tasks. It demands generalizing classical machine learning
algorithms to subspace data. Low-Rank Representation (LRR) is one of the most
successful models for clustering vectorial data according to their subspace
structures. This paper explores the possibility of extending LRR for subspace
data on Grassmann manifolds. Rather than directly embedding the Grassmann
manifolds into the symmetric matrix space, an extrinsic view is taken to build
the LRR self-representation in the local area of the tangent space at each
Grassmannian point, resulting in a localized LRR method on Grassmann manifolds.
A novel algorithm for solving the proposed model is investigated and
implemented. The performance of the new clustering algorithm is assessed
through experiments on several real-world datasets including MNIST handwritten
digits, ballet video clips, SKIG action clips, DynTex++ dataset and highway
traffic video clips. The experimental results show that the new method outperforms a
number of state-of-the-art clustering methods.
Comment: IEEE Transactions on Circuits and Systems for Video Technology, with minor revisions. arXiv admin note: text overlap with arXiv:1504.0180
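The basic geometric ingredient of the tangent-space construction is the projector onto the tangent space at a Grassmann point: for an orthonormal basis X, tangent vectors H satisfy X^T H = 0. A one-line sketch of that projector follows; the localized LRR model built on top of it is not shown.

```python
import numpy as np

def tangent_projection(X, G):
    """Project an ambient matrix G onto the tangent space of the Grassmann
    manifold at the point spanned by the orthonormal basis X:
    tangent vectors H satisfy X^T H = 0, so H = (I - X X^T) G."""
    return G - X @ (X.T @ G)
```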
Learning from Binary Multiway Data: Probabilistic Tensor Decomposition and its Statistical Optimality
We consider the problem of decomposing a higher-order tensor with binary
entries. Such data problems arise frequently in applications such as
neuroimaging, recommendation systems, topic modeling, and sensor network
localization. We propose a multilinear Bernoulli model, develop a
rank-constrained likelihood-based estimation method, and obtain theoretical
accuracy guarantees. In contrast to continuous-valued problems, the binary
tensor problem exhibits an interesting phase transition phenomenon according to
the signal-to-noise ratio. The error bound for the parameter tensor estimation
is established, and we show that the obtained rate is minimax optimal under the
considered model. Furthermore, we develop an alternating optimization algorithm
with convergence guarantees. The efficacy of our approach is demonstrated
through both simulations and analyses of multiple data sets on the tasks of
tensor completion and clustering.
Comment: 35 pages, 7 figures, 4 tables.
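A simplified sketch of the multilinear Bernoulli idea: model P(Y_ijk = 1) = sigmoid(Theta_ijk) with Theta a low-rank (here CP-format) tensor, and fit the factors by alternating gradient ascent on the Bernoulli log-likelihood. This is an illustrative stand-in, not the authors' rank-constrained estimator or their algorithm with convergence guarantees.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def logistic_cp(Y, rank=3, lr=0.05, n_iter=500, seed=0):
    """Fit P(Y_ijk = 1) = sigmoid(sum_r A_ir B_jr C_kr) to a binary
    tensor Y (I x J x K) by alternating gradient ascent on the
    Bernoulli log-likelihood; d(loglik)/d(theta) = Y - sigmoid(theta)."""
    rng = np.random.default_rng(seed)
    I, J, K = Y.shape
    A = 0.1 * rng.standard_normal((I, rank))
    B = 0.1 * rng.standard_normal((J, rank))
    C = 0.1 * rng.standard_normal((K, rank))
    for _ in range(n_iter):
        R = Y - sigmoid(np.einsum('ir,jr,kr->ijk', A, B, C))
        A += lr * np.einsum('ijk,jr,kr->ir', R, B, C)   # update factor A
        R = Y - sigmoid(np.einsum('ir,jr,kr->ijk', A, B, C))
        B += lr * np.einsum('ijk,ir,kr->jr', R, A, C)   # update factor B
        R = Y - sigmoid(np.einsum('ir,jr,kr->ijk', A, B, C))
        C += lr * np.einsum('ijk,ir,jr->kr', R, A, B)   # update factor C
    return A, B, C
```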
Group Preserving Label Embedding for Multi-Label Classification
Multi-label learning is concerned with the classification of data with
multiple class labels. This is in contrast to the traditional classification
problem, where every data instance has a single label. Due to the exponential
size of the output space, exploiting intrinsic information in the feature and label
spaces has been the major thrust of research in recent years, and the use of
parametrization and embedding has been the prime focus. Researchers have
studied several aspects of embedding which include label embedding, input
embedding, dimensionality reduction and feature selection. These approaches
differ from one another in their capability to capture other intrinsic
properties such as label correlation and local invariance. We assume here that
the input data form groups; as a result, the label matrix exhibits a
sparsity pattern, and the labels corresponding to objects in the same
group have similar sparsity. In this paper, we study the embedding of labels
together with the group information, with the objective of building an efficient
multi-label classifier. We assume the existence of a low-dimensional space
onto which the feature vectors and label vectors can be embedded. In order to
achieve this, we address three sub-problems, namely: (1) identification of
groups of labels; (2) embedding of label vectors into a low-rank space so that
the sparsity characteristic of individual groups remains invariant; and (3)
determining a linear mapping that embeds the feature vectors onto the same set
of points, as in stage (2), in the low-dimensional space. We compare our method
with seven well-known algorithms on twelve benchmark data sets. Our
experimental analysis demonstrates the superiority of the proposed method over
state-of-the-art algorithms for multi-label learning.
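A generic low-rank label-embedding baseline, in the spirit of sub-problems (2) and (3) above but ignoring the group structure of sub-problem (1), can be sketched as follows. This is a PLST-style construction assumed for illustration, not the authors' group-preserving method.

```python
import numpy as np

def fit_label_embedding(X, Y, dim):
    """Embed the binary label matrix Y (n x L) into a dim-dimensional
    space via truncated SVD, then learn a linear map from features X
    into that same space by least squares."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    V = Vt[:dim].T                              # L x dim label projection
    Z = Y @ V                                   # embedded labels, n x dim
    W, *_ = np.linalg.lstsq(X, Z, rcond=None)   # feature-to-embedding map
    return W, V

def predict(X, W, V, threshold=0.5):
    """Decode: map features into the embedding, project back to label
    space, and threshold to obtain binary label predictions."""
    return (X @ W @ V.T >= threshold).astype(int)
```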
Low-Rank Modeling and Its Applications in Image Analysis
Low-rank modeling generally refers to a class of methods that solve problems
by representing variables of interest as low-rank matrices. It has achieved
great success in various fields including computer vision, data mining, signal
processing and bioinformatics. Recently, much progress has been made in
theories, algorithms and applications of low-rank modeling, such as exact
low-rank matrix recovery via convex programming and matrix completion applied
to collaborative filtering. These advances have brought increasing
attention to this topic. In this paper, we review recent advances in
low-rank modeling, the state-of-the-art algorithms, and related applications in
image analysis. We first give an overview of the concept of low-rank modeling
and challenging problems in this area. Then, we summarize the models and
algorithms for low-rank matrix recovery and illustrate their advantages and
limitations with numerical experiments. Next, we introduce a few applications
of low-rank modeling in the context of image analysis. Finally, we conclude
this paper with some discussions.
Comment: To appear in ACM Computing Surveys.
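As one concrete instance of the matrix-completion line of work the survey covers, the singular value thresholding (SVT) algorithm of Cai, Candes, and Shen recovers a low-rank matrix from a subset of observed entries. A minimal sketch; the parameter defaults are common heuristics rather than tuned values.

```python
import numpy as np

def svt_complete(M, mask, tau=None, step=1.2, n_iter=300):
    """Matrix completion by singular value thresholding (SVT): recover a
    low-rank matrix from the entries of M where mask is True."""
    if tau is None:
        tau = 5.0 * np.sqrt(M.size)             # common heuristic threshold
    Y = np.zeros_like(M)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt   # shrink singular values
        Y = Y + step * mask * (M - X)           # gradient step on observed entries
    return X
```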