Generalized mean for robust principal component analysis
In this paper, we propose a robust principal component analysis (PCA) to overcome the problem that PCA is prone to outliers included in the training set. Unlike other alternatives, which commonly replace the L2-norm with other distance measures, the proposed method alleviates the negative effect of outliers by exploiting the characteristics of the generalized mean while keeping the Euclidean distance. The optimization problem based on the generalized mean is solved by a novel method. We also present a generalized sample mean, a generalization of the sample mean, to estimate a robust mean in the presence of outliers. The proposed method shows performance better than or equivalent to that of conventional PCAs in various problems such as face reconstruction, clustering, and object categorization.
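As a rough illustration of the idea (not the authors' algorithm; the function names and the iterative weighting rule below are my assumptions), the power mean with p < 1 is dominated less by large terms than the arithmetic mean, and building a location estimate on it down-weights distant points:

```python
import numpy as np

def generalized_mean(values, p):
    """Power (generalized) mean: M_p = ((1/n) * sum(v_i^p))^(1/p), p != 0.
    For p < 1 it is dominated less by large entries than the arithmetic mean."""
    v = np.asarray(values, dtype=float)
    return np.mean(v ** p) ** (1.0 / p)

def generalized_sample_mean(X, p=0.5, iters=50):
    """Hypothetical sketch of a robust location estimate: treating the
    generalized mean (p < 1) of squared distances as the objective gives
    weights that decrease with distance, so outliers are down-weighted."""
    mu = X.mean(axis=0)
    for _ in range(iters):
        d2 = np.sum((X - mu) ** 2, axis=1) + 1e-12  # guard against zero distance
        w = d2 ** (p - 1.0)                         # p < 1 => far points get small weight
        mu = (w / w.sum()) @ X
    return mu
```

With p = 1 both functions reduce to the ordinary sample mean; smaller p trades statistical efficiency for robustness.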
Fault Detection of Single and Interval Valued Data Using Statistical Process Monitoring Techniques
Principal component analysis (PCA) is a linear data analysis technique widely used for fault detection and isolation, data modeling, and noise filtration. PCA may be combined with statistical hypothesis testing methods, such as the generalized likelihood ratio (GLR) technique, in order to detect faults. GLR uses maximum likelihood estimation (MLE) to maximize the detection rate for a fixed false alarm rate. The benchmark Tennessee Eastman Process (TEP) is used to examine the performance of the different techniques, and the results show that for processes that experience shifts in the mean and/or variance, the best performance is achieved by independently monitoring the mean and variance using two separate GLR charts, rather than simultaneously monitoring them using a single chart. Moreover, single-valued data can be aggregated into interval form in order to provide a more robust model with improved fault detection performance using PCA and GLR. The TEP example is used once more to demonstrate the effectiveness of using interval-valued data over single-valued data.
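A toy sketch of the monitoring pipeline (a simplified form under my naming, not the exact TEP study): fit PCA on fault-free data, score new samples by their squared prediction error against the retained subspace, and flag mean shifts with a GLR statistic, which for a zero-mean Gaussian null with known variance reduces to n·x̄²/(2σ²):

```python
import numpy as np

def fit_pca(X_train, n_comp):
    """PCA model from fault-free training data: mean + retained loadings."""
    mean = X_train.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, Vt[:n_comp].T

def spe(X, mean, P):
    """Squared prediction error (residual) of each row w.r.t. the PCA subspace."""
    R = (X - mean) @ (np.eye(P.shape[0]) - P @ P.T)
    return np.sum(R ** 2, axis=1)

def glr_mean_shift(window, sigma):
    """GLR statistic for H0: mean 0 vs H1: mean = its MLE, known sigma:
    L = n * xbar^2 / (2 * sigma^2); alarm when L exceeds a threshold."""
    w = np.asarray(window, dtype=float)
    return w.size * w.mean() ** 2 / (2.0 * sigma ** 2)
```

Monitoring the mean and the variance with two separate charts, as the abstract recommends, would pair this statistic with an analogous GLR chart for a variance change.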
On Weighted Multivariate Sign Functions
Multivariate sign functions are often used for robust estimation and
inference. We propose using data dependent weights in association with such
functions. The proposed weighted sign functions retain desirable robustness
properties, while significantly improving efficiency in estimation and
inference compared to unweighted multivariate sign-based methods. Using
weighted signs, we demonstrate methods of robust location estimation and robust
principal component analysis. We extend the scope of using robust multivariate
methods to include robust sufficient dimension reduction and functional outlier
detection. Several numerical studies and real data applications demonstrate the
efficacy of the proposed methodology. Comment: Keywords: Multivariate sign,
Principal component analysis, Data depth, Sufficient dimension reduction
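A minimal sketch of sign-based robust PCA (the unweighted spatial-sign version; the paper's data-dependent weights enter only as the optional `weights` hook below, whose form is an assumption):

```python
import numpy as np

def spatial_signs(X, center):
    """Spatial sign S(x) = (x - center) / ||x - center||: every observation,
    outlier or not, is pulled onto the unit sphere before averaging."""
    Z = X - center
    n = np.linalg.norm(Z, axis=1, keepdims=True)
    n[n == 0] = 1.0
    return Z / n

def weighted_sign_pca(X, center, weights=None):
    """Eigen-decomposition of the (weighted) spatial sign covariance matrix.
    weights=None recovers the classical unweighted sign covariance."""
    S = spatial_signs(X, center)
    w = np.ones(len(S)) if weights is None else np.asarray(weights, dtype=float)
    C = (S * w[:, None]).T @ S / w.sum()
    vals, vecs = np.linalg.eigh(C)
    return vals[::-1], vecs[:, ::-1]  # descending eigenvalue order
```

Because each observation contributes a unit vector, a single gross outlier can tilt the estimated axes only slightly, which is the robustness property the abstract relies on.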
Robust Orthogonal Complement Principal Component Analysis
Recently, the robustification of principal component analysis has attracted
considerable attention from statisticians, engineers and computer scientists. In
this work we study the type of outliers that are not necessarily apparent in
the original observation space but can seriously affect the principal subspace
estimation. Based on a mathematical formulation of such transformed outliers, a
novel robust orthogonal complement principal component analysis (ROC-PCA) is
proposed. The framework combines the popular sparsity-enforcing and low rank
regularization techniques to deal with row-wise outliers as well as
element-wise outliers. A non-asymptotic oracle inequality guarantees the
accuracy and high breakdown performance of ROC-PCA in finite samples. To tackle
the computational challenges, an efficient algorithm is developed on the basis
of Stiefel manifold optimization and iterative thresholding. Furthermore, a
batch variant is proposed to significantly reduce the cost in ultra high
dimensions. The paper also points out a pitfall of a common practice of SVD
reduction in robust PCA. Experiments show the effectiveness and efficiency of
ROC-PCA on both synthetic and real data.
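Two of the building blocks named above can be sketched directly (function names are mine, and this is far from the full ROC-PCA solver): elementwise soft-thresholding, the proximal operator behind the sparsity-enforcing penalty, and a QR retraction back onto the Stiefel manifold of orthonormal frames:

```python
import numpy as np

def soft_threshold(A, tau):
    """Elementwise soft-thresholding: the prox of tau * ||.||_1,
    shrinking entries toward zero and zeroing small ones."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def stiefel_retract(V):
    """Retract a full-column-rank matrix onto the Stiefel manifold
    (orthonormal columns) via QR, fixing column signs for uniqueness."""
    Q, R = np.linalg.qr(V)
    return Q * np.sign(np.diag(R))
```

An iterative-thresholding scheme alternates steps like these: a manifold update of the subspace frame followed by thresholding of the outlier components.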
Relaxed 2-D Principal Component Analysis by Norm for Face Recognition
A relaxed two-dimensional principal component analysis (R2DPCA) approach is
proposed for face recognition. Different from 2DPCA, 2DPCA- and G2DPCA,
the R2DPCA utilizes the label information (if known) of training samples to
calculate a relaxation vector and assigns a weight to each subset of the
training data. A new relaxed scatter matrix is defined, and the computed
projection axes are able to increase the accuracy of face recognition. The
optimal norms are selected in a reasonable range. Numerical experiments on
practical face databases indicate that the R2DPCA has high generalization
ability and can achieve a higher recognition rate than state-of-the-art
methods. Comment: 19 pages, 11 figures
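For context, plain 2DPCA, the baseline that R2DPCA relaxes, works on image matrices directly rather than on vectorized images: it averages the per-image scatter matrices and projects each image onto the leading eigenvectors. A sketch under my naming:

```python
import numpy as np

def twod_pca(images, k):
    """Standard 2DPCA: scatter G = mean_n (A_n - Abar)^T (A_n - Abar),
    then project each image as A_n @ W with the top-k eigenvectors W of G."""
    A = np.asarray(images, dtype=float)           # shape (N, h, w)
    D = A - A.mean(axis=0)
    G = np.einsum('nij,nik->jk', D, D) / len(A)   # (w, w) image scatter matrix
    _, vecs = np.linalg.eigh(G)
    W = vecs[:, ::-1][:, :k]                      # top-k projection axes
    return np.einsum('nij,jk->nik', A, W), W
```

R2DPCA's additions, the label-driven relaxation vector and per-subset weights, would enter by reweighting the terms of G; that weighting is the paper's contribution and is not reproduced here.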
Structural Analysis of Network Traffic Matrix via Relaxed Principal Component Pursuit
The network traffic matrix is widely used in network operation and
management. It is therefore of crucial importance to analyze the components and
the structure of the network traffic matrix, for which several mathematical
approaches such as Principal Component Analysis (PCA) were proposed. In this
paper, we first argue that PCA performs poorly for analyzing a traffic matrix
that is polluted by large volume anomalies, and then propose a new
decomposition model for the network traffic matrix. According to this model, we
carry out the structural analysis by decomposing the network traffic matrix
into three sub-matrices, namely, the deterministic traffic, the anomaly traffic
and the noise traffic matrix, which is similar to the Robust Principal
Component Analysis (RPCA) problem previously studied in [13]. Based on the
Relaxed Principal Component Pursuit (Relaxed PCP) method and the Accelerated
Proximal Gradient (APG) algorithm, we present an iterative approach for
decomposing a traffic matrix, and demonstrate its efficiency and flexibility by
experimental results. Finally, we further discuss several features of the
deterministic and noise traffic. Our study develops a novel method for the
problem of structural analysis of the traffic matrix, which is robust against
pollution by large volume anomalies. Comment: Accepted to Elsevier Computer Networks
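The relaxed formulation min ‖L‖* + λ‖S‖₁ + (1/2μ)‖M − L − S‖²F can be attacked with simple alternating proximal steps, a cruder cousin of the paper's APG scheme (parameter choices below are my assumptions): singular value thresholding recovers the low-rank deterministic traffic, soft-thresholding the sparse anomalies, and the leftover residual plays the role of noise.

```python
import numpy as np

def soft(A, tau):
    """Elementwise soft-thresholding (prox of the l1 norm)."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def svt(A, tau):
    """Singular value thresholding (prox of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * soft(s, tau)) @ Vt

def relaxed_pcp(M, lam, mu, iters=200):
    """Alternating minimization of the relaxed PCP objective
    ||L||_* + lam * ||S||_1 + (1 / (2 * mu)) * ||M - L - S||_F^2.
    A simplified sketch, not the accelerated (APG) solver of the paper."""
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(iters):
        L = svt(M - S, mu)           # prox step in the low-rank block
        S = soft(M - L, lam * mu)    # prox step in the sparse block
    return L, S
```

On a traffic matrix this returns the deterministic part in L, the volume anomalies in S, and leaves M − L − S as the small noise component.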