Generalized mean for robust principal component analysis
In this paper, we propose a robust principal component analysis (PCA) to overcome the problem that PCA is prone to outliers included in the training set. Unlike other alternatives, which commonly replace the L2-norm with other distance measures, the proposed method alleviates the negative effect of outliers by exploiting the characteristics of the generalized mean while keeping the Euclidean distance. The optimization problem based on the generalized mean is solved by a novel method. We also present a generalized sample mean, a generalization of the sample mean, to estimate a robust mean in the presence of outliers. The proposed method shows performance better than or equivalent to that of conventional PCAs in various problems such as face reconstruction, clustering, and object categorization.
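As a rough illustration of the idea (not the authors' algorithm; the function names and the iterative weighting rule below are my assumptions), the power mean with p < 1 is dominated less by large terms than the arithmetic mean, and building a location estimate on it down-weights distant points:

```python
import numpy as np

def generalized_mean(values, p):
    """Power (generalized) mean: M_p = ((1/n) * sum(v_i^p))^(1/p), p != 0.
    For p < 1 it is dominated less by large entries than the arithmetic mean."""
    v = np.asarray(values, dtype=float)
    return np.mean(v ** p) ** (1.0 / p)

def generalized_sample_mean(X, p=0.5, iters=50):
    """Hypothetical sketch of a robust location estimate: treating the
    generalized mean (p < 1) of squared distances as the objective gives
    weights that decrease with distance, so outliers are down-weighted."""
    mu = X.mean(axis=0)
    for _ in range(iters):
        d2 = np.sum((X - mu) ** 2, axis=1) + 1e-12  # guard against zero distance
        w = d2 ** (p - 1.0)                         # p < 1 => far points get small weight
        mu = (w / w.sum()) @ X
    return mu
```

With p = 1 both functions reduce to the ordinary sample mean; smaller p trades statistical efficiency for robustness.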
Fault Detection of Single and Interval Valued Data Using Statistical Process Monitoring Techniques
Principal component analysis (PCA) is a linear data analysis technique widely used for fault detection and isolation, data modeling, and noise filtration. PCA may be combined with statistical hypothesis testing methods, such as the generalized likelihood ratio (GLR) technique, in order to detect faults. GLR uses maximum likelihood estimation (MLE) to maximize the detection rate for a fixed false alarm rate. The benchmark Tennessee Eastman Process (TEP) is used to examine the performance of the different techniques, and the results show that for processes that experience shifts in the mean and/or variance, the best performance is achieved by independently monitoring the mean and variance using two separate GLR charts, rather than simultaneously monitoring them using a single chart. Moreover, single-valued data can be aggregated into interval form in order to provide a more robust model with improved fault detection performance using PCA and GLR. The TEP example is used once more to demonstrate the effectiveness of using interval-valued data over single-valued data.
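A toy sketch of the monitoring pipeline (a simplified form under my naming, not the exact TEP study): fit PCA on fault-free data, score new samples by their squared prediction error against the retained subspace, and flag mean shifts with a GLR statistic, which for a zero-mean Gaussian null with known variance reduces to n·x̄²/(2σ²):

```python
import numpy as np

def fit_pca(X_train, n_comp):
    """PCA model from fault-free training data: mean + retained loadings."""
    mean = X_train.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, Vt[:n_comp].T

def spe(X, mean, P):
    """Squared prediction error (residual) of each row w.r.t. the PCA subspace."""
    R = (X - mean) @ (np.eye(P.shape[0]) - P @ P.T)
    return np.sum(R ** 2, axis=1)

def glr_mean_shift(window, sigma):
    """GLR statistic for H0: mean 0 vs H1: mean = its MLE, known sigma:
    L = n * xbar^2 / (2 * sigma^2); alarm when L exceeds a threshold."""
    w = np.asarray(window, dtype=float)
    return w.size * w.mean() ** 2 / (2.0 * sigma ** 2)
```

Monitoring the mean and the variance with two separate charts, as the abstract recommends, would pair this statistic with an analogous GLR chart for a variance change.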
On Weighted Multivariate Sign Functions
Multivariate sign functions are often used for robust estimation and
inference. We propose using data dependent weights in association with such
functions. The proposed weighted sign functions retain desirable robustness
properties, while significantly improving efficiency in estimation and
inference compared to unweighted multivariate sign-based methods. Using
weighted signs, we demonstrate methods of robust location estimation and robust
principal component analysis. We extend the scope of using robust multivariate
methods to include robust sufficient dimension reduction and functional outlier
detection. Several numerical studies and real data applications demonstrate the
efficacy of the proposed methodology. Comment: Keywords: Multivariate sign,
Principal component analysis, Data depth, Sufficient dimension reduction
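A minimal sketch of sign-based robust PCA (the unweighted spatial-sign version; the paper's data-dependent weights enter only as the optional `weights` hook below, whose form is an assumption):

```python
import numpy as np

def spatial_signs(X, center):
    """Spatial sign S(x) = (x - center) / ||x - center||: every observation,
    outlier or not, is pulled onto the unit sphere before averaging."""
    Z = X - center
    n = np.linalg.norm(Z, axis=1, keepdims=True)
    n[n == 0] = 1.0
    return Z / n

def weighted_sign_pca(X, center, weights=None):
    """Eigen-decomposition of the (weighted) spatial sign covariance matrix.
    weights=None recovers the classical unweighted sign covariance."""
    S = spatial_signs(X, center)
    w = np.ones(len(S)) if weights is None else np.asarray(weights, dtype=float)
    C = (S * w[:, None]).T @ S / w.sum()
    vals, vecs = np.linalg.eigh(C)
    return vals[::-1], vecs[:, ::-1]  # descending eigenvalue order
```

Because each observation contributes a unit vector, a single gross outlier can tilt the estimated axes only slightly, which is the robustness property the abstract relies on.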
Robust Orthogonal Complement Principal Component Analysis
Recently, the robustification of principal component analysis has attracted
considerable attention from statisticians, engineers and computer scientists. In
this work we study the type of outliers that are not necessarily apparent in
the original observation space but can seriously affect the principal subspace
estimation. Based on a mathematical formulation of such transformed outliers, a
novel robust orthogonal complement principal component analysis (ROC-PCA) is
proposed. The framework combines the popular sparsity-enforcing and low rank
regularization techniques to deal with row-wise outliers as well as
element-wise outliers. A non-asymptotic oracle inequality guarantees the
accuracy and high breakdown performance of ROC-PCA in finite samples. To tackle
the computational challenges, an efficient algorithm is developed on the basis
of Stiefel manifold optimization and iterative thresholding. Furthermore, a
batch variant is proposed to significantly reduce the cost in ultra high
dimensions. The paper also points out a pitfall of a common practice of SVD
reduction in robust PCA. Experiments show the effectiveness and efficiency of
ROC-PCA on both synthetic and real data.
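Two of the building blocks named above can be sketched directly (function names are mine, and this is far from the full ROC-PCA solver): elementwise soft-thresholding, the proximal operator behind the sparsity-enforcing penalty, and a QR retraction back onto the Stiefel manifold of orthonormal frames:

```python
import numpy as np

def soft_threshold(A, tau):
    """Elementwise soft-thresholding: the prox of tau * ||.||_1,
    shrinking entries toward zero and zeroing small ones."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def stiefel_retract(V):
    """Retract a full-column-rank matrix onto the Stiefel manifold
    (orthonormal columns) via QR, fixing column signs for uniqueness."""
    Q, R = np.linalg.qr(V)
    return Q * np.sign(np.diag(R))
```

An iterative-thresholding scheme alternates steps like these: a manifold update of the subspace frame followed by thresholding of the outlier components.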
Relaxed 2-D Principal Component Analysis by Norm for Face Recognition
A relaxed two-dimensional principal component analysis (R2DPCA) approach is
proposed for face recognition. Different from 2DPCA, 2DPCA- and G2DPCA,
the R2DPCA utilizes the label information (if known) of training samples to
calculate a relaxation vector and assigns a weight to each subset of the
training data. A new relaxed scatter matrix is defined, and the computed
projection axes are able to increase the accuracy of face recognition. The
optimal norms are selected in a reasonable range. Numerical experiments on
practical face databases indicate that the R2DPCA has high generalization
ability and can achieve a higher recognition rate than state-of-the-art
methods. Comment: 19 pages, 11 figures
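For context, plain 2DPCA, the baseline that R2DPCA relaxes, works on image matrices directly rather than on vectorized images: it averages the per-image scatter matrices and projects each image onto the leading eigenvectors. A sketch under my naming:

```python
import numpy as np

def twod_pca(images, k):
    """Standard 2DPCA: scatter G = mean_n (A_n - Abar)^T (A_n - Abar),
    then project each image as A_n @ W with the top-k eigenvectors W of G."""
    A = np.asarray(images, dtype=float)           # shape (N, h, w)
    D = A - A.mean(axis=0)
    G = np.einsum('nij,nik->jk', D, D) / len(A)   # (w, w) image scatter matrix
    _, vecs = np.linalg.eigh(G)
    W = vecs[:, ::-1][:, :k]                      # top-k projection axes
    return np.einsum('nij,jk->nik', A, W), W
```

R2DPCA's additions, the label-driven relaxation vector and per-subset weights, would enter by reweighting the terms of G; that weighting is the paper's contribution and is not reproduced here.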
Structural Analysis of Network Traffic Matrix via Relaxed Principal Component Pursuit
The network traffic matrix is widely used in network operation and
management. It is therefore of crucial importance to analyze the components and
the structure of the network traffic matrix, for which several mathematical
approaches such as Principal Component Analysis (PCA) were proposed. In this
paper, we first argue that PCA performs poorly for analyzing a traffic matrix
that is polluted by large volume anomalies, and then propose a new
decomposition model for the network traffic matrix. According to this model, we
carry out the structural analysis by decomposing the network traffic matrix
into three sub-matrices, namely, the deterministic traffic, the anomaly traffic
and the noise traffic matrix, which is similar to the Robust Principal
Component Analysis (RPCA) problem previously studied in [13]. Based on the
Relaxed Principal Component Pursuit (Relaxed PCP) method and the Accelerated
Proximal Gradient (APG) algorithm, we present an iterative approach for
decomposing a traffic matrix, and demonstrate its efficiency and flexibility by
experimental results. Finally, we further discuss several features of the
deterministic and noise traffic. Our study develops a novel method for the
problem of structural analysis of the traffic matrix, which is robust against
pollution by large volume anomalies. Comment: Accepted to Elsevier Computer Networks
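The relaxed formulation min ‖L‖* + λ‖S‖₁ + (1/2μ)‖M − L − S‖²F can be attacked with simple alternating proximal steps, a cruder cousin of the paper's APG scheme (parameter choices below are my assumptions): singular value thresholding recovers the low-rank deterministic traffic, soft-thresholding the sparse anomalies, and the leftover residual plays the role of noise.

```python
import numpy as np

def soft(A, tau):
    """Elementwise soft-thresholding (prox of the l1 norm)."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def svt(A, tau):
    """Singular value thresholding (prox of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * soft(s, tau)) @ Vt

def relaxed_pcp(M, lam, mu, iters=200):
    """Alternating minimization of the relaxed PCP objective
    ||L||_* + lam * ||S||_1 + (1 / (2 * mu)) * ||M - L - S||_F^2.
    A simplified sketch, not the accelerated (APG) solver of the paper."""
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(iters):
        L = svt(M - S, mu)           # prox step in the low-rank block
        S = soft(M - L, lam * mu)    # prox step in the sparse block
    return L, S
```

On a traffic matrix this returns the deterministic part in L, the volume anomalies in S, and leaves M − L − S as the small noise component.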