3,077 research outputs found
Robust Principal Component Analysis?
This paper is about a curious phenomenon. Suppose we have a data matrix,
which is the superposition of a low-rank component and a sparse component. Can
we recover each component individually? We prove that under some suitable
assumptions, it is possible to recover both the low-rank and the sparse
components exactly by solving a very convenient convex program called Principal
Component Pursuit; among all feasible decompositions, simply minimize a
weighted combination of the nuclear norm and of the L1 norm. This suggests the
possibility of a principled approach to robust principal component analysis
since our methodology and results assert that one can recover the principal
components of a data matrix even though a positive fraction of its entries are
arbitrarily corrupted. This extends to the situation where a fraction of the
entries are missing as well. We discuss an algorithm for solving this
optimization problem, and present applications in the area of video
surveillance, where our methodology allows for the detection of objects in a
cluttered background, and in the area of face recognition, where it offers a
principled way of removing shadows and specularities in images of faces.
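The convex program described in the abstract above can be illustrated with a small sketch. It minimizes ||L||_* + lam * ||S||_1 subject to L + S = M using an augmented-Lagrangian (ADMM-style) iteration. The function names (`shrink`, `svd_threshold`, `pcp`), the stopping tolerance, and the iteration cap are illustrative assumptions rather than the authors' code. The defaults lam = 1/sqrt(max(n1, n2)) and mu = n1*n2 / (4 * ||M||_1) follow the choices commonly attributed to the paper.

```python
import numpy as np


def shrink(X, tau):
    """Entrywise soft-thresholding: the proximal operator of tau * L1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def svd_threshold(X, tau):
    """Singular-value soft-thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt


def pcp(M, lam=None, mu=None, tol=1e-7, max_iter=500):
    """Sketch of Principal Component Pursuit: split M into low-rank L plus sparse S.

    Assumed defaults: lam = 1/sqrt(max(n1, n2)), mu = n1*n2 / (4 * sum|M|).
    """
    M = np.asarray(M, dtype=float)
    n1, n2 = M.shape
    lam = 1.0 / np.sqrt(max(n1, n2)) if lam is None else lam
    mu = n1 * n2 / (4.0 * np.abs(M).sum()) if mu is None else mu

    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # Lagrange multiplier for the constraint L + S = M
    norm_M = np.linalg.norm(M)

    for _ in range(max_iter):
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)  # update low-rank part
        S = shrink(M - L + Y / mu, lam / mu)         # update sparse part
        R = M - L - S                                # constraint residual
        Y = Y + mu * R                               # dual ascent step
        if np.linalg.norm(R) <= tol * norm_M:
            break
    return L, S


if __name__ == "__main__":
    # Synthetic example: rank-2 matrix with about 5% grossly corrupted entries.
    rng = np.random.default_rng(0)
    L0 = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 40))
    S0 = np.zeros((40, 40))
    mask = rng.random((40, 40)) < 0.05
    S0[mask] = 10.0 * rng.choice([-1.0, 1.0], size=mask.sum())
    L_hat, S_hat = pcp(L0 + S0)
    print("relative error in L:", np.linalg.norm(L_hat - L0) / np.linalg.norm(L0))
```

Each iteration costs one full SVD, so this sketch is only practical for small matrices; larger problems would typically need a partial SVD.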
Robust Principal Component Analysis on Graphs
Principal Component Analysis (PCA) is the most widely used tool for linear
dimensionality reduction and clustering. Still, it is highly sensitive to
outliers and does not scale well with respect to the number of data samples.
Robust PCA solves the first issue with a sparse penalty term. The second issue
can be handled with the matrix factorization model, which is, however,
non-convex. Besides, PCA-based clustering can also be enhanced by using a graph
of data similarity. In this article, we introduce a new model called "Robust
PCA on Graphs" which incorporates spectral graph regularization into the Robust
PCA framework. Our proposed model benefits from 1) the robustness of principal
components to occlusions and missing values, 2) enhanced low-rank recovery, 3)
improved clustering property due to the graph smoothness assumption on the
low-rank matrix, and 4) convexity of the resulting optimization problem.
Extensive experiments on 8 benchmark, 3 video and 2 artificial datasets with
corruptions clearly reveal that our model outperforms 10 other state-of-the-art
models in its clustering and low-rank recovery tasks.
ROBUST PRINCIPAL COMPONENT ANALYSIS
A common technique for robust dispersion estimators is to apply the classical estimator to some subset U of the data. Applying principal component analysis to the subset U can result in a robust principal component analysis with good properties.
Robust Principal Component Analysis for Compositional Tables
A data table which is arranged according to two factors can often be
considered as a compositional table. An example is the number of unemployed
people, split according to gender and age classes. Analyzed as compositions,
the relevant information would consist of ratios between different cells of
such a table. This is particularly useful when analyzing several compositional
tables jointly, where the absolute numbers are in very different ranges, e.g.
if unemployment data are considered from different countries. Within the
framework of the logratio methodology, compositional tables can be decomposed
into independent and interactive parts, and orthonormal coordinates can be
assigned to these parts. However, these coordinates usually require some prior
knowledge about the data, and they are not easy to handle for exploring the
relationships between the given factors.
Here we propose a special choice of coordinates with a direct relation to
centered logratio (clr) coefficients, which are particularly useful for an
interpretation in terms of the original cells of the tables. With these
coordinates, robust principal component analysis (PCA) is performed for
dimension reduction, making it possible to investigate the relationships
between the factors. The link between orthonormal coordinates and clr
coefficients makes it possible to apply robust PCA, which would otherwise
suffer from the singularity of clr coefficients.
Comment: 20 pages, 2 figure
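The singularity mentioned in the abstract above can be seen in a few lines. The sketch below computes centered logratio (clr) coefficients with the standard definition, the log of each part minus the mean of the logs. The function name `clr` and the example counts are hypothetical and are not taken from the paper.

```python
import numpy as np


def clr(x):
    """Centered logratio coefficients along the last axis.

    clr(x)_i = log(x_i) - mean_j log(x_j); all parts must be strictly positive.
    """
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)


if __name__ == "__main__":
    # Hypothetical 2 x 3 table (for example gender by age class), vectorized.
    table = np.array([[120.0, 340.0, 95.0],
                      [150.0, 310.0, 80.0]])
    coeffs = clr(table.ravel())

    # The coefficients sum to zero, so they lie in a hyperplane and their
    # covariance matrix is singular.
    print("sum of clr coefficients:", coeffs.sum())

    # Rescaling the table leaves the coefficients unchanged: only ratios matter.
    print("scale invariant:", np.allclose(coeffs, clr(1000.0 * table.ravel())))
```

Because the coefficients always sum to zero, robust covariance estimators that require a full-rank data cloud cannot be applied to them directly, which is the reason the abstract gives for working in orthonormal coordinates.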
Robust principal component analysis for functional data.
dispersion matrices;
Submodular Load Clustering with Robust Principal Component Analysis
Traditional load analysis is facing challenges with the new electricity usage
patterns due to demand response as well as increasing deployment of distributed
generations, including photovoltaics (PV), electric vehicles (EV), and energy
storage systems (ESS). At the transmission level, despite irregular load
behaviors in different areas, highly aggregated load shapes still share similar
characteristics. Load clustering aims to discover such intrinsic patterns and
provide useful information to other load applications, such as load forecasting
and load modeling. This paper proposes an efficient submodular load clustering
method for transmission-level load areas. Robust principal component analysis
(R-PCA) first decomposes the annual load profiles into low-rank components
and sparse components to extract key features. A novel submodular cluster
center selection technique is then applied to determine the optimal cluster
centers through a constructed similarity graph. Following the selection results,
load areas are efficiently assigned to different clusters for further load
analysis and applications. Numerical results obtained from PJM load data
demonstrate the effectiveness of the proposed approach.
Comment: Accepted by 2019 IEEE PES General Meeting, Atlanta, G
Algorithms for Projection - Pursuit robust principal component analysis.
The results of a standard principal component analysis (PCA) can be affected by the presence of outliers. Hence robust alternatives to PCA are needed. One of the most appealing robust methods for principal component analysis uses the Projection-Pursuit principle. Here, one projects the data on a lower-dimensional space such that a robust measure of variance of the projected data will be maximized. The Projection-Pursuit-based method for principal component analysis has recently been introduced in the field of chemometrics, where the number of variables is typically large. In this paper, it is shown that the currently available algorithm for robust Projection-Pursuit PCA performs poorly in the presence of many variables. A new algorithm is proposed that is more suitable for the analysis of chemical data. Its performance is studied by means of simulation experiments and illustrated on some real data sets. (c) 2007 Elsevier B.V. All rights reserved.
multivariate statistics; optimization; numerical precision; outliers; robustness; scale estimators; estimators; regression;
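The Projection-Pursuit idea in the abstract above can be sketched as follows. This is a simple candidate-direction variant, not the new algorithm the paper proposes: it searches only over directions that pass through the centered data points, and scores each by a robust scale of the projected data. The use of the MAD as the robust scale, the coordinatewise median as the center, and the names `mad` and `pp_pca` are all assumptions made for illustration.

```python
import numpy as np


def mad(z):
    """Median absolute deviation, scaled for consistency at the normal model."""
    return 1.4826 * np.median(np.abs(z - np.median(z)))


def pp_pca(X, k):
    """Sketch of Projection-Pursuit PCA with candidate directions from the data.

    Returns k orthonormal directions (rows) and the robust scale found on each.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - np.median(X, axis=0)  # assumed robust center: coordinatewise median
    directions, scales = [], []
    for _ in range(k):
        norms = np.linalg.norm(Xc, axis=1)
        keep = norms > 1e-10 * norms.max()       # skip points deflated to zero
        cands = Xc[keep] / norms[keep, None]     # unit vectors through the data
        scores = np.array([mad(Xc @ a) for a in cands])
        best = cands[np.argmax(scores)]          # direction of largest robust scale
        directions.append(best)
        scales.append(scores.max())
        Xc = Xc - np.outer(Xc @ best, best)      # deflate: remove that direction
    return np.array(directions), np.array(scales)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((60, 5)) * np.array([10.0, 3.0, 1.0, 1.0, 1.0])
    X[:5] += 100.0  # a few gross outliers
    dirs, scl = pp_pca(X, k=2)
    print("first direction:", np.round(dirs[0], 2))
    print("robust scales:", np.round(scl, 2))
```

With n observations there are at most n candidate directions, so when the number of variables is large relative to n the search covers very little of the space. That limitation is consistent with the abstract's remark that the existing algorithm performs poorly with many variables.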