L1-norm Regularized L1-norm Best-fit line problem
Background
Conventional Principal Component Analysis (PCA) is a widely used technique for reducing data dimension. PCA finds linear combinations of the original features that capture the maximal variance of the data via Singular Value Decomposition (SVD). However, SVD is sensitive to outliers and often leads to high-dimensional results. To address these issues, we propose a new method to estimate the best-fit one-dimensional subspace, called the l1-norm regularized l1-norm best-fit line.
Methods
In this article, we describe a method to fit a lower-dimensional subspace by approximating a non-linear, non-convex, non-smooth optimization problem called the l1-regularized l1-norm Best-Fit Line problem: minimize a combination of the l1 error and the l1 regularization. The procedure can be performed simply, using ratios and sorting. We also present applications in the area of video surveillance, where our methodology allows for background subtraction in the presence of jitter, illumination changes, and clutter.
Results
We compared the performance of our method with that of SVD on synthetic data. The numerical results showed that our algorithm found a better principal component than SVD from grossly corrupted data, in terms of discordance. Moreover, our algorithm provided a sparser principal component than SVD. However, we expect it to be faster in a multi-node environment.
Conclusions
This paper proposes a new algorithm able to generate a sparse best-fit subspace that is robust to outliers. The subspaces found on non-contaminated data differ little from those of traditional PCA. On contaminated data, the method attains both arguably significantly smaller discordance and lower dimension than traditional PCA.
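The "ratios and sorting" step mentioned in the Methods can be illustrated with a classical one-dimensional building block: for a fixed direction v, the scalar a that minimizes sum_i |x_i - a * v_i| + lam * |a| is a weighted median of the ratios x_i / v_i, with weights |v_i| and one extra point at 0 with weight lam. The sketch below works under that assumption only; it is not the authors' algorithm, and the names `l1_fit_coefficient` and `lam` are hypothetical.

```python
import numpy as np

def l1_fit_coefficient(x, v, lam=0.0):
    """Minimize sum_i |x[i] - a * v[i]| + lam * |a| over the scalar a."""
    x = np.asarray(x, dtype=float)
    v = np.asarray(v, dtype=float)
    keep = v != 0  # terms with v[i] == 0 do not depend on a
    ratios = x[keep] / v[keep]
    weights = np.abs(v[keep])
    if lam > 0:
        # The penalty lam * |a - 0| behaves like one extra point at ratio 0.
        ratios = np.append(ratios, 0.0)
        weights = np.append(weights, lam)
    if ratios.size == 0:
        return 0.0
    order = np.argsort(ratios)
    ratios, weights = ratios[order], weights[order]
    cumulative = np.cumsum(weights)
    # Weighted median: first sorted ratio whose cumulative weight reaches half the total.
    idx = np.searchsorted(cumulative, 0.5 * cumulative[-1])
    return float(ratios[idx])

# One gross outlier (100) barely moves the l1 fit; a least-squares fit would give 26.5.
print(l1_fit_coefficient([1, 2, 3, 100], [1, 1, 1, 1]))
# A large penalty pulls the coefficient to exactly zero, which is the source of sparsity.
print(l1_fit_coefficient([1, 2, 3, 100], [1, 1, 1, 1], lam=10.0))
```

The weighted median needs only a sort and a cumulative sum, which is why a procedure built from such steps can be described in terms of ratios and sorting.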
Randomized Robust Subspace Recovery for High Dimensional Data Matrices
This paper explores and analyzes two randomized designs for robust Principal
Component Analysis (PCA) employing low-dimensional data sketching. In one
design, a data sketch is constructed using random column sampling followed by
low dimensional embedding, while in the other, sketching is based on random
column and row sampling. Both designs are shown to bring about substantial
savings in complexity and memory requirements for robust subspace learning over
conventional approaches that use the full scale data. A characterization of the
sample and computational complexity of both designs is derived in the context
of two distinct outlier models, namely, sparse and independent outlier models.
The proposed randomized approach can provably recover the correct subspace with
computational and sample complexity that are almost independent of the size of
the data. The results of the mathematical analysis are confirmed through
numerical simulations using both synthetic and real data.
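The two sketching designs described in this abstract can be illustrated as follows. This is a minimal sketch on clean synthetic data, assuming a Gaussian random embedding and uniform sampling without replacement; it shows only the sketching step, not the robust recovery procedure or its guarantees, and all function names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): 2000 columns of dimension 500 lying in a rank-3 subspace.
n_features, n_columns, rank = 500, 2000, 3
basis = np.linalg.qr(rng.standard_normal((n_features, rank)))[0]
data = basis @ rng.standard_normal((rank, n_columns))

def sketch_columns_then_embed(data, n_sampled, embed_dim, rng):
    """Design 1: random column sampling followed by a low-dimensional embedding."""
    cols = rng.choice(data.shape[1], size=n_sampled, replace=False)
    projection = rng.standard_normal((embed_dim, data.shape[0])) / np.sqrt(embed_dim)
    return projection @ data[:, cols]

def sketch_columns_and_rows(data, n_sampled_cols, n_sampled_rows, rng):
    """Design 2: random column sampling combined with random row sampling."""
    cols = rng.choice(data.shape[1], size=n_sampled_cols, replace=False)
    rows = rng.choice(data.shape[0], size=n_sampled_rows, replace=False)
    return data[np.ix_(rows, cols)]

def numerical_rank(matrix, rel_tol=1e-8):
    """Count singular values above rel_tol times the largest one."""
    s = np.linalg.svd(matrix, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

sketch_1 = sketch_columns_then_embed(data, n_sampled=50, embed_dim=20, rng=rng)
sketch_2 = sketch_columns_and_rows(data, n_sampled_cols=50, n_sampled_rows=20, rng=rng)
print(sketch_1.shape, numerical_rank(sketch_1))
print(sketch_2.shape, numerical_rank(sketch_2))
```

Both sketches are 20 x 50 instead of 500 x 2000, yet for this clean low-rank example they retain the rank of the underlying subspace, which is the intuition behind working on the sketch rather than the full data.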
Robust PCA as Bilinear Decomposition with Outlier-Sparsity Regularization
Principal component analysis (PCA) is widely used for dimensionality
reduction, with well-documented merits in various applications involving
high-dimensional data, including computer vision, preference measurement, and
bioinformatics. In this context, the fresh look advocated here permeates
benefits from variable selection and compressive sampling, to robustify PCA
against outliers. A least-trimmed squares estimator of a low-rank bilinear
factor analysis model is shown to be closely related to that obtained from an
$\ell_0$-(pseudo)norm-regularized criterion encouraging sparsity in a matrix
explicitly modeling the outliers. This connection suggests robust PCA schemes
based on convex relaxation, which lead naturally to a family of robust
estimators encompassing Huber's optimal M-class as a special case. Outliers are
identified by tuning a regularization parameter, which amounts to controlling
sparsity of the outlier matrix along the whole robustification path of (group)
least-absolute shrinkage and selection operator (Lasso) solutions. Beyond its
neat ties to robust statistics, the developed outlier-aware PCA framework is
versatile to accommodate novel and scalable algorithms to: i) track the
low-rank signal subspace robustly, as new data are acquired in real time; and
ii) determine principal components robustly in (possibly) infinite-dimensional
feature spaces. Synthetic and real data tests corroborate the effectiveness of
the proposed robust PCA schemes, when used to identify aberrant responses in
personality assessment surveys, as well as unveil communities in social
networks, and intruders from video surveillance data.
Comment: 30 pages, submitted to IEEE Transactions on Signal Processing
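The idea of modeling outliers explicitly in a sparse matrix can be sketched with a simple alternating scheme. The code below assumes an entrywise l1 penalty, that is, it minimizes 0.5 * ||X - L - O||_F^2 + lam * ||O||_1 subject to rank(L) <= rank, alternating a truncated SVD for L with soft-thresholding for O. It is an illustration of the convex-relaxation idea only, not the estimator developed in the paper (which also covers group-Lasso penalties); the function names are hypothetical.

```python
import numpy as np

def soft_threshold(values, threshold):
    """Entrywise Lasso shrinkage: move every entry toward zero by `threshold`."""
    return np.sign(values) * np.maximum(np.abs(values) - threshold, 0.0)

def outlier_aware_pca(data, rank, lam, n_iter=50):
    """Alternate a rank-constrained fit with a sparse outlier update."""
    outliers = np.zeros_like(data)
    for _ in range(n_iter):
        # Best rank-constrained fit to the outlier-compensated data (truncated SVD).
        u, s, vt = np.linalg.svd(data - outliers, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Residual entries larger than lam are kept as outliers; the rest become zero.
        outliers = soft_threshold(data - low_rank, lam)
    return low_rank, outliers

# Synthetic example (assumption): rank-2 data with a handful of gross corruptions.
rng = np.random.default_rng(0)
clean = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 40))
data = clean.copy()
data[rng.integers(0, 30, 5), rng.integers(0, 40, 5)] += 20.0

low_rank, outliers = outlier_aware_pca(data, rank=2, lam=1.0)
print(np.count_nonzero(outliers), "entries flagged as outliers")
```

Increasing `lam` flags fewer entries and decreasing it flags more, which mirrors the abstract's point that outliers are identified by tuning a regularization parameter that controls the sparsity of the outlier matrix.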
Robust Orthogonal Complement Principal Component Analysis
Recently, the robustification of principal component analysis has attracted
lots of attention from statisticians, engineers and computer scientists. In
this work we study the type of outliers that are not necessarily apparent in
the original observation space but can seriously affect the principal subspace
estimation. Based on a mathematical formulation of such transformed outliers, a
novel robust orthogonal complement principal component analysis (ROC-PCA) is
proposed. The framework combines the popular sparsity-enforcing and low rank
regularization techniques to deal with row-wise outliers as well as
element-wise outliers. A non-asymptotic oracle inequality guarantees the
accuracy and high breakdown performance of ROC-PCA in finite samples. To tackle
the computational challenges, an efficient algorithm is developed on the basis
of Stiefel manifold optimization and iterative thresholding. Furthermore, a
batch variant is proposed to significantly reduce the cost in ultra high
dimensions. The paper also points out a pitfall of a common practice of SVD
reduction in robust PCA. Experiments show the effectiveness and efficiency of
ROC-PCA in both synthetic and real data.
Robust Subspace Learning: Robust PCA, Robust Subspace Tracking, and Robust Subspace Recovery
PCA is one of the most widely used dimension reduction techniques. A related
easier problem is "subspace learning" or "subspace estimation". Given
relatively clean data, both are easily solved via singular value decomposition
(SVD). The problem of subspace learning or PCA in the presence of outliers is
called robust subspace learning or robust PCA (RPCA). For long data sequences,
if one tries to use a single lower dimensional subspace to represent the data,
the required subspace dimension may end up being quite large. For such data, a
better model is to assume that it lies in a low-dimensional subspace that can
change over time, albeit gradually. The problem of tracking such data (and the
subspaces) while being robust to outliers is called robust subspace tracking
(RST). This article provides a magazine-style overview of the entire field of
robust subspace learning and tracking. In particular, solutions for three
problems are discussed in detail: RPCA via sparse+low-rank matrix decomposition
(S+LR), RST via S+LR, and "robust subspace recovery (RSR)". RSR assumes that an
entire data vector is either an outlier or an inlier. The S+LR formulation
instead assumes that outliers occur on only a few data vector indices and hence
are well modeled as sparse corruptions.
Comment: To appear, IEEE Signal Processing Magazine, July 201
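The difference between the two outlier models can be written compactly. The notation below ($M$, $L$, $S$, $C$, $r$, $\lambda$) is assumed for illustration and does not come from the abstract; the convex program shown is the widely used principal component pursuit formulation, given here as one common way to pose the S+LR problem rather than as the article's specific method.

```latex
% S+LR model: every column may be corrupted, but only on a few indices.
M = L + S, \qquad \operatorname{rank}(L) \le r, \qquad S \ \text{entrywise sparse}

% One common convex relaxation (principal component pursuit):
\min_{L,\,S} \; \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{subject to} \quad L + S = M

% RSR model: a column is either an inlier or an outlier as a whole,
% so the corruption matrix C is column-sparse rather than entrywise sparse.
M = L + C, \qquad \operatorname{rank}(L) \le r, \qquad C \ \text{has few nonzero columns}
```

Here $\|L\|_{*}$ is the nuclear norm (the sum of singular values), used as a convex surrogate for rank, and $\|S\|_{1}$ is the entrywise l1 norm, used as a convex surrogate for sparsity.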