Robust PCA as Bilinear Decomposition with Outlier-Sparsity Regularization
Principal component analysis (PCA) is widely used for dimensionality
reduction, with well-documented merits in various applications involving
high-dimensional data, including computer vision, preference measurement, and
bioinformatics. In this context, the fresh look advocated here permeates
benefits from variable selection and compressive sampling, to robustify PCA
against outliers. A least-trimmed squares estimator of a low-rank bilinear
factor analysis model is shown closely related to that obtained from an
$\ell_0$-(pseudo)norm-regularized criterion encouraging sparsity in a matrix
explicitly modeling the outliers. This connection suggests robust PCA schemes
based on convex relaxation, which lead naturally to a family of robust
estimators encompassing Huber's optimal M-class as a special case. Outliers are
identified by tuning a regularization parameter, which amounts to controlling
sparsity of the outlier matrix along the whole robustification path of (group)
least-absolute shrinkage and selection operator (Lasso) solutions. Beyond its
neat ties to robust statistics, the developed outlier-aware PCA framework is
versatile to accommodate novel and scalable algorithms to: i) track the
low-rank signal subspace robustly, as new data are acquired in real time; and
ii) determine principal components robustly in (possibly) infinite-dimensional
feature spaces. Synthetic and real data tests corroborate the effectiveness of
the proposed robust PCA schemes, when used to identify aberrant responses in
personality assessment surveys, as well as unveil communities in social
networks, and intruders from video surveillance data.
Comment: 30 pages, submitted to IEEE Transactions on Signal Processing
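
The outlier-sparsity idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes centered data `Y` (one observation per row), a known target `rank`, and a hand-picked regularization weight `lam`, and it alternates between a truncated-SVD fit of the low-rank part and a group-Lasso (row-wise) shrinkage of the outlier matrix `O`. The function name `outlier_aware_pca` and all parameter names are hypothetical.

```python
import numpy as np

def outlier_aware_pca(Y, rank, lam, n_iter=50):
    """Alternating minimization of
        0.5 * ||Y - L - O||_F^2 + lam * sum_n ||O[n, :]||_2
    subject to rank(L) <= rank. Rows of O that end up nonzero
    are the observations flagged as outliers."""
    O = np.zeros_like(Y)
    for _ in range(n_iter):
        # Low-rank step: best rank-`rank` fit to the outlier-compensated data.
        U, s, Vt = np.linalg.svd(Y - O, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Outlier step: row-wise (group) soft-thresholding of the residual.
        R = Y - L
        norms = np.linalg.norm(R, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
        O = shrink * R
    return L, O
```

Sweeping `lam` from large to small values makes more rows of `O` nonzero, which is one way to picture the "robustification path" mentioned in the abstract.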
Robust Subspace Learning: Robust PCA, Robust Subspace Tracking, and Robust Subspace Recovery
PCA is one of the most widely used dimension reduction techniques. A related
easier problem is "subspace learning" or "subspace estimation". Given
relatively clean data, both are easily solved via singular value decomposition
(SVD). The problem of subspace learning or PCA in the presence of outliers is
called robust subspace learning or robust PCA (RPCA). For long data sequences,
if one tries to use a single lower dimensional subspace to represent the data,
the required subspace dimension may end up being quite large. For such data, a
better model is to assume that it lies in a low-dimensional subspace that can
change over time, albeit gradually. The problem of tracking such data (and the
subspaces) while being robust to outliers is called robust subspace tracking
(RST). This article provides a magazine-style overview of the entire field of
robust subspace learning and tracking. In particular, solutions for three
problems are discussed in detail: RPCA via sparse+low-rank matrix decomposition
(S+LR), RST via S+LR, and "robust subspace recovery (RSR)". RSR assumes that an
entire data vector is either an outlier or an inlier. The S+LR formulation
instead assumes that outliers occur on only a few data vector indices and hence
are well modeled as sparse corruptions.
Comment: To appear, IEEE Signal Processing Magazine, July 201
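
As a rough illustration of the S+LR formulation, the sketch below uses principal component pursuit solved by an ADMM-style iteration, one common convex approach to sparse+low-rank decomposition. The abstract does not say which solvers the article covers, so this choice is an assumption. The defaults for `lam` and `mu` follow widely used heuristics, and the helper names `soft`, `svt`, and `pcp` are hypothetical.

```python
import numpy as np

def soft(X, tau):
    """Entrywise soft-thresholding (proximal operator of the l1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding (proximal operator of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * soft(s, tau)) @ Vt

def pcp(M, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Split M into low-rank L plus sparse S by minimizing
    ||L||_* + lam * ||S||_1 subject to L + S = M."""
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
    mu = m * n / (4.0 * np.abs(M).sum()) if mu is None else mu
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # scaled dual variable for the constraint L + S = M
    for _ in range(max_iter):
        L = svt(M - S + Y / mu, 1.0 / mu)
        S = soft(M - L + Y / mu, lam / mu)
        resid = M - L - S
        Y = Y + mu * resid
        if np.linalg.norm(resid) <= tol * np.linalg.norm(M):
            break
    return L, S
```

Here the sparse term penalizes individual entries, matching the S+LR assumption that outliers hit only a few indices of a data vector. An RSR-style variant would instead penalize whole columns.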
From Sparse Signals to Sparse Residuals for Robust Sensing
One of the key challenges in sensor networks is the extraction of information
by fusing data from a multitude of distinct, but possibly unreliable sensors.
Recovering information from the maximum number of dependable sensors while
specifying the unreliable ones is critical for robust sensing. This sensing
task is formulated here as that of finding the maximum number of feasible
subsystems of linear equations, and proved to be NP-hard. Useful links are
established with compressive sampling, which aims at recovering vectors that
are sparse. In contrast, the signals here are not sparse, but give rise to
sparse residuals. Capitalizing on this form of sparsity, four sensing schemes
with complementary strengths are developed. The first scheme is a convex
relaxation of the original problem expressed as a second-order cone program
(SOCP). It is shown that when the involved sensing matrices are Gaussian and
the reliable measurements are sufficiently many, the SOCP can recover the
optimal solution with overwhelming probability. The second scheme is obtained
by replacing the initial objective function with a concave one. The third and
fourth schemes are tailored for noisy sensor data. The noisy case is cast as a
combinatorial problem that is subsequently surrogated by a (weighted) SOCP.
Interestingly, the derived cost functions fall into the framework of robust
multivariate linear regression, while an efficient block-coordinate descent
algorithm is developed for their minimization. The robust sensing capabilities
of all schemes are verified by simulated tests.
Comment: Under review for publication in the IEEE Transactions on Signal
Processing (revised version)
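
The sparse-residual idea for noisy data can be sketched in its simplest form, under assumptions that go beyond the abstract: each sensor returns one scalar measurement `y[i] = A[i] @ x + o[i] + noise`, and the outlier vector `o` is penalized with an l1 norm rather than the (weighted) SOCP formulations the paper develops. The block-coordinate descent below alternates a least-squares update of `x` with soft-thresholding of the residuals. The names `robust_ls` and `lam` are hypothetical.

```python
import numpy as np

def soft(v, tau):
    """Entrywise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def robust_ls(A, y, lam, n_iter=200):
    """Block-coordinate descent on
        0.5 * ||y - A x - o||_2^2 + lam * ||o||_1.
    Nonzero entries of o mark measurements treated as unreliable."""
    o = np.zeros_like(y)
    for _ in range(n_iter):
        # x-step: ordinary least squares on outlier-compensated data.
        x, *_ = np.linalg.lstsq(A, y - o, rcond=None)
        # o-step: residuals larger than lam are absorbed into o.
        o = soft(y - A @ x, lam)
    return x, o
```

Eliminating `o` from this cost yields a Huber-type loss on the residuals, which is consistent with the abstract's remark that the derived costs fall within robust linear regression.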