Robust Subspace Learning: Robust PCA, Robust Subspace Tracking, and Robust Subspace Recovery
PCA is one of the most widely used dimension reduction techniques. A related,
easier problem is "subspace learning" or "subspace estimation". Given
relatively clean data, both are easily solved via singular value decomposition
(SVD). The problem of subspace learning or PCA in the presence of outliers is
called robust subspace learning or robust PCA (RPCA). For long data sequences,
if one tries to use a single lower dimensional subspace to represent the data,
the required subspace dimension may end up being quite large. For such data, a
better model is to assume that it lies in a low-dimensional subspace that can
change over time, albeit gradually. The problem of tracking such data (and the
subspaces) while being robust to outliers is called robust subspace tracking
(RST). This article provides a magazine-style overview of the entire field of
robust subspace learning and tracking. In particular, solutions for three
problems are discussed in detail: RPCA via sparse+low-rank matrix decomposition
(S+LR), RST via S+LR, and "robust subspace recovery (RSR)". RSR assumes that an
entire data vector is either an outlier or an inlier. The S+LR formulation
instead assumes that outliers occur on only a few data vector indices and hence
are well modeled as sparse corruptions.
Comment: To appear, IEEE Signal Processing Magazine, July 201
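To make the S+LR idea concrete, here is a minimal sketch of one common convex formulation, principal component pursuit solved with a fixed-step ADMM loop. The abstract does not say which solver the article favors, so this is an illustrative assumption rather than the article's method. The function names (`shrink`, `svt`, `pcp`) and the default choices for `lam` and `mu` are also assumptions, taken from standard practice in the RPCA literature.

```python
import numpy as np


def shrink(X, tau):
    """Entrywise soft-thresholding: the proximal operator of tau * ||.||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def svt(X, tau):
    """Singular value thresholding: the proximal operator of tau * ||.||_*."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt


def pcp(M, n_iter=1000, tol=1e-7):
    """Split a nonzero matrix M into low-rank L plus sparse S (hypothetical helper).

    Approximately solves  min ||L||_* + lam * ||S||_1  subject to  L + S = M.
    """
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n))        # common default weight (assumption)
    mu = m * n / (4.0 * np.abs(M).sum())  # common default step size (assumption)
    S = np.zeros_like(M, dtype=float)
    Y = np.zeros_like(M, dtype=float)     # dual variable for L + S = M
    for _ in range(n_iter):
        L = svt(M - S + Y / mu, 1.0 / mu)
        S = shrink(M - L + Y / mu, lam / mu)
        R = M - L - S                     # primal residual
        Y = Y + mu * R
        if np.linalg.norm(R) <= tol * np.linalg.norm(M):
            break
    return L, S


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data: rank-2 matrix plus 5% large sparse corruptions.
    L0 = rng.standard_normal((60, 2)) @ rng.standard_normal((2, 60))
    S0 = np.zeros((60, 60))
    mask = rng.random((60, 60)) < 0.05
    S0[mask] = rng.choice([-10.0, 10.0], size=int(mask.sum()))
    L_hat, S_hat = pcp(L0 + S0)
    print("relative error in L:", np.linalg.norm(L_hat - L0) / np.linalg.norm(L0))
```

In this sketch the outliers hit only a few entries of each data vector, which matches the S+LR assumption described above. When whole columns are outliers (the RSR setting), an entrywise sparsity penalty is not the natural model.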
Dimension Detection with Local Homology
Detecting the dimension of a hidden manifold from a point sample has become
an important problem in the current data-driven era. Indeed, estimating the
shape dimension is often the first step in studying the processes or phenomena
associated with the data. Among the many dimension detection algorithms proposed
in various fields, a few can provide theoretical guarantees on the correctness
of the estimated dimension. However, the correctness usually requires certain
regularity of the input: the input points are either uniformly randomly sampled
in a statistical setting, or they form the so-called
-sample which can be neither too dense nor too sparse.
Here, we propose a purely topological technique to detect dimensions. Our
algorithm is provably correct and works under a more relaxed sampling
condition: we do not require uniformity, and we also allow Hausdorff noise. Our
approach detects dimension by determining local homology. The computation of
this topological structure is much less sensitive to the local distribution of
points, which leads to the relaxation of the sampling conditions. Furthermore,
by leveraging various developments in computational topology, we show that this
local homology at a point can be computed exactly for manifolds
using Vietoris-Rips complexes whose vertices are confined within a local
neighborhood of that point. We implement our algorithm and demonstrate the accuracy
and robustness of our method using both synthetic and real data sets.
Self-Dictionary Sparse Regression for Hyperspectral Unmixing: Greedy Pursuit and Pure Pixel Search are Related
This paper considers a recently emerged hyperspectral unmixing formulation
based on sparse regression of a self-dictionary multiple measurement vector
(SD-MMV) model, wherein the measured hyperspectral pixels are used as the
dictionary. Operating under the pure pixel assumption, this SD-MMV formalism is
special in that it allows simultaneous identification of the endmember spectral
signatures and the number of endmembers. Previous SD-MMV studies mainly focus
on convex relaxations. In this study, we explore the alternative of greedy
pursuit, which generally provides efficient and simple algorithms. In
particular, we design a greedy SD-MMV algorithm using simultaneous orthogonal
matching pursuit. Intriguingly, the proposed greedy algorithm is shown to be
closely related to some existing pure pixel search algorithms, especially the
successive projection algorithm (SPA). Thus, a link between SD-MMV and pure
pixel search is revealed. We then perform exact recovery analyses, and prove
that the proposed greedy algorithm is robust to noise---including its
identification of the (unknown) number of endmembers---under a sufficiently low
noise level. The identification performance of the proposed greedy algorithm is
demonstrated through both synthetic and real-data experiments.
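For readers unfamiliar with pure pixel search, the following is a minimal sketch of the successive projection algorithm (SPA) named in the abstract. It is not the paper's proposed greedy SD-MMV algorithm. The function name `spa`, the data layout (bands by pixels), and the assumption that the number of endmembers `r` is known in advance are illustrative choices. The sketch also assumes noiseless data in which each endmember appears as at least one pure pixel.

```python
import numpy as np


def spa(X, r):
    """Return indices of r columns of X selected by successive projection.

    X : array of shape (bands, pixels); each column is one measured pixel.
    r : number of endmembers to extract (assumed known in this sketch).
    """
    R = np.array(X, dtype=float)  # residual, updated after each selection
    selected = []
    for _ in range(r):
        # Pick the column with the largest residual energy.
        j = int(np.argmax(np.sum(R ** 2, axis=0)))
        selected.append(j)
        # Project all columns onto the orthogonal complement of that column.
        u = R[:, j] / np.linalg.norm(R[:, j])
        R = R - np.outer(u, u @ R)
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.random((20, 3)) + 0.1              # 3 synthetic endmember spectra
    mixed = rng.dirichlet(np.ones(3), 200).T   # abundances that sum to one
    S = np.hstack([np.eye(3), mixed])          # columns 0..2 are pure pixels
    X = A @ S
    print("selected pixel indices:", sorted(spa(X, 3)))
```

Each step keeps the pixel with the largest residual norm and then removes its direction from every pixel, which has the same select-then-project structure as orthogonal matching pursuit. Unlike the SD-MMV approach described in the abstract, this sketch does not estimate the number of endmembers.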