
1,795 research outputs found

Robust sparse principal component analysis.

Author: Croux Christophe
Filzmoser Peter
Fritz Heinrich

A method for principal component analysis is proposed that is sparse and robust at the same time. The sparsity delivers principal components that have loadings on a small number of variables, making them easier to interpret. The robustness makes the analysis resistant to outlying observations. The principal components correspond to directions that maximize a robust measure of the variance, with an additional penalty term to take sparseness into account. We propose an algorithm to compute the sparse and robust principal components. The method is applied to several real data examples, and diagnostic plots for detecting outliers and for selecting the degree of sparsity are provided. A simulation experiment studies the loss in statistical efficiency incurred by requiring both robustness and sparsity.

Keywords: Dispersion measure; Projection-pursuit; Outliers; Variable selection
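The idea of maximizing a robust variance measure minus a sparsity penalty can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the robust scale is the median absolute deviation (MAD), the penalty is `lam` times the L1 norm of the unit-length direction, and the search is over a small, hand-picked list of candidate directions rather than the authors' algorithm. The function names are hypothetical.

```python
import numpy as np

def mad_scale(z):
    """Median absolute deviation, rescaled to match the standard deviation for normal data."""
    z = np.asarray(z, dtype=float)
    return 1.4826 * np.median(np.abs(z - np.median(z)))

def penalized_robust_variance(X, a, lam):
    """Robust variance of the projection X @ a minus an L1 penalty on the unit direction a."""
    a = np.asarray(a, dtype=float)
    a = a / np.linalg.norm(a)
    return mad_scale(X @ a) ** 2 - lam * np.sum(np.abs(a))

def best_candidate(X, candidates, lam):
    """Return the unit-norm candidate direction with the highest penalized objective."""
    scores = [penalized_robust_variance(X, a, lam) for a in candidates]
    best = np.asarray(candidates[int(np.argmax(scores))], dtype=float)
    return best / np.linalg.norm(best)

# Toy data: most of the spread is along variable 0; variable 1 holds a few gross outliers.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 1.0])
X[:5, 1] = 1000.0

# Hypothetical candidate set: the three coordinate axes and one dense direction.
candidates = [np.eye(3)[0], np.eye(3)[1], np.eye(3)[2], np.array([1.0, 1.0, 0.0])]
direction = best_candidate(X, candidates, lam=1.0)
```

In this toy setting the classical variance is dominated by the outliers in variable 1, whereas the MAD-based objective is barely affected by them, and the L1 penalty favors the axis-aligned direction over the dense one.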

Research Papers in Economics

Relaxed 2-D Principal Component Analysis by $L_p$ Norm for Face Recognition

Author: A d’Aspremont
A Pentland
D Meng
DM Witten
H Shen
H Wang
H Zou
I Jolliffe
J Wang
J Yang
J Ye
L Sirovich
L Zhao
M Kirby
M Turk
M Zhao
N Kwak
N Kwak
Q Chang
R Ma
X Li
Z Jia
Z Jia
Z Jia
Z Jia
Z Jia
Z Liang
Z-G Jia
ZZ Liang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/05/2019

A relaxed two-dimensional principal component analysis (R2DPCA) approach is proposed for face recognition. Unlike 2DPCA, 2DPCA-$L_1$, and G2DPCA, R2DPCA uses the label information (if known) of the training samples to calculate a relaxation vector and assigns a weight to each subset of the training data. A new relaxed scatter matrix is defined, and the computed projection axes are able to increase the accuracy of face recognition. The optimal $L_p$-norms are selected in a reasonable range. Numerical experiments on practical face databases indicate that R2DPCA has high generalization ability and can achieve a higher recognition rate than state-of-the-art methods.

Comment: 19 pages, 11 figures
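For orientation, the scatter-matrix step that 2DPCA-style methods share can be sketched as below. This is a hedged illustration only: it computes a sample-weighted image scatter matrix with the ordinary squared (L2) criterion and takes its leading eigenvectors as projection axes. The `weights` argument is a hypothetical stand-in for per-sample weighting; it is not the paper's relaxation vector, and the $L_p$-norm optimization is not reproduced.

```python
import numpy as np

def weighted_2dpca_axes(images, weights, k):
    """Leading k projection axes of a sample-weighted image scatter matrix.

    images  : array of shape (N, h, w), one 2-D image per sample
    weights : array of shape (N,), nonnegative sample weights (hypothetical)
    returns : array of shape (w, k) with orthonormal columns
    """
    images = np.asarray(images, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()

    # Weighted mean image, shape (h, w).
    mean_image = np.tensordot(weights, images, axes=1)
    centered = images - mean_image

    # G = sum_i weights[i] * centered[i].T @ centered[i], shape (w, w).
    G = np.einsum("n,nhi,nhj->ij", weights, centered, centered)

    # eigh returns eigenvalues in ascending order; reverse to get the largest first.
    _, eigvecs = np.linalg.eigh(G)
    return eigvecs[:, ::-1][:, :k]

# Usage on random stand-in "images": project each image onto the k axes.
rng = np.random.default_rng(0)
images = rng.normal(size=(10, 4, 5))
axes = weighted_2dpca_axes(images, np.ones(10), k=2)
features = images @ axes  # shape (10, 4, 2)
```

With uniform weights this reduces to the standard 2DPCA image covariance; non-uniform weights simply let some training samples contribute more to the scatter matrix.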

arXiv.org e-Print Archive

Crossref

Alternating Maximization: Unifying Framework for 8 Sparse PCA Formulations and Efficient Parallel Codes

Author: Ahipaşaoğlu Selin Damla
Jahani Majid
Richtárik Peter
Takáč Martin
Publication date: 06/05/2020

Given a multivariate data set, sparse principal component analysis (SPCA) aims to extract several linear combinations of the variables that together explain the variance in the data as much as possible, while controlling the number of nonzero loadings in these combinations. In this paper we consider 8 different optimization formulations for computing a single sparse loading vector; these are obtained by combining the following factors: we employ two norms for measuring variance (L2, L1) and two sparsity-inducing norms (L0, L1), which are used in two different ways (constraint, penalty). Three of our formulations, notably the one with L0 constraint and L1 variance, have not been considered in the literature. We give a unifying reformulation which we propose to solve via a natural alternating maximization (AM) method. We show that the AM method is nontrivially equivalent to GPower (Journée et al.; JMLR 11:517-553, 2010) for all our formulations. Besides this, we provide 24 efficient parallel SPCA implementations: 3 codes (multi-core, GPU and cluster) for each of the 8 problems. Parallelism in the methods is aimed at i) speeding up computations (our GPU code can be 100 times faster than an efficient serial code written in C++), ii) obtaining solutions explaining more variance and iii) dealing with big data problems (our cluster code is able to solve a 357 GB problem in about a minute).

Comment: 29 pages, 9 tables, 7 figures (the paper is accompanied by a release of the open-source code '24am')
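To make the alternating maximization idea concrete, here is a minimal serial sketch for just one of the eight formulations: L2 variance with an L1 penalty, i.e. maximizing ||Ax||_2 - gamma * ||x||_1 over unit vectors x. It assumes the standard reformulation with an auxiliary unit vector z, under which both partial maximizations have closed forms (normalization for z, soft-thresholding followed by normalization for x). The function names, the initialization, and the stopping rule are illustrative assumptions; this is not the '24am' code.

```python
import numpy as np

def soft_threshold(c, gamma):
    """Shrink each entry of c toward zero by gamma, zeroing entries smaller than gamma."""
    return np.sign(c) * np.maximum(np.abs(c) - gamma, 0.0)

def sparse_pc_am(A, gamma, n_iter=200, tol=1e-10):
    """One sparse loading vector via alternating maximization (L2 variance, L1 penalty).

    A     : data matrix of shape (n_samples, n_variables), assumed centered
    gamma : penalty weight; larger values give sparser loadings
    """
    p = A.shape[1]
    # Assumed initialization: the coordinate of the column with the largest norm.
    x = np.zeros(p)
    x[np.argmax(np.linalg.norm(A, axis=0))] = 1.0

    for _ in range(n_iter):
        # Step 1: for fixed x, the best unit vector z is A x normalized.
        Ax = A @ x
        z = Ax / np.linalg.norm(Ax)

        # Step 2: for fixed z, the best x is the normalized soft-thresholded A^T z.
        s = soft_threshold(A.T @ z, gamma)
        norm_s = np.linalg.norm(s)
        if norm_s == 0.0:
            return np.zeros(p)  # gamma is so large that every loading is removed
        x_new = s / norm_s

        converged = np.linalg.norm(x_new - x) < tol
        x = x_new
        if converged:
            break
    return x

# Usage: two strongly correlated variables plus four low-variance noise variables.
rng = np.random.default_rng(0)
A = 0.1 * rng.normal(size=(100, 6))
t = rng.normal(size=100)
A[:, 0] += 3.0 * t
A[:, 1] += 3.0 * t
loading = sparse_pc_am(A, gamma=2.0)
```

With `gamma=0` the iteration reduces to a power method for the leading right singular vector of `A`; increasing `gamma` trades explained variance for fewer nonzero loadings.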

arXiv.org e-Print Archive

Southampton (e-Prints Soton)