Introduction to Principal Components Analysis
Understanding the inverse equivalent width - luminosity relationship (Baldwin
Effect), the topic of this meeting, requires extracting information on
continuum and emission line parameters from samples of AGN. We wish to discover
whether, and how, different subsets of measured parameters may correlate with
each other. This general problem is the domain of Principal Components Analysis
(PCA). We discuss the purpose, principles, and the interpretation of PCA, using
some examples from QSO spectroscopy. The hope is that identification of
relationships among subsets of correlated variables may lead to new physical
insight.
Comment: Invited review to appear in "Quasars and Cosmology", A.S.P.
Conference Series 1999, eds. G. J. Ferland, J. A. Baldwin (San Francisco:
ASP). 10 pages, 2 figures.
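As a concrete illustration of what PCA computes, here is a minimal numpy sketch on synthetic data standing in for a table of measured parameters (the data, dimensions, and variable names are hypothetical, not from the review): two latent factors drive five correlated observables, and PCA reports how much variance each direction carries.

```python
import numpy as np

# Hypothetical data: two latent factors drive five correlated observables.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))              # two underlying drivers
mixing = rng.normal(size=(2, 5))                # how they appear in observables
X = latent @ mixing + 0.1 * rng.normal(size=(500, 5))

# Center, form the sample covariance, and eigendecompose it.
Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / (len(Xc) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]               # sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()             # variance fraction per PC
scores = Xc @ eigvecs                           # object coordinates on the PCs
```

With two latent drivers, the first two principal components absorb nearly all the variance, which is the kind of dimensional collapse that motivates looking for physical meaning in the leading eigenvectors.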
Integrating Data Transformation in Principal Components Analysis
Principal component analysis (PCA) is a popular dimension-reduction method to reduce the complexity and obtain the informative aspects of high-dimensional datasets. When the data distribution is skewed, data transformation is commonly used prior to applying PCA. Such a transformation is usually obtained from previous studies, prior knowledge, or trial-and-error. In this work, we develop a model-based method that integrates data transformation in PCA and finds an appropriate data transformation using the maximum profile likelihood. Extensions of the method to handle functional data and missing values are also developed. Several numerical algorithms are provided for efficient computation. The proposed method is illustrated using simulated and real-world data examples. Supplementary materials for this article are available online.
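The common two-step practice the abstract contrasts itself with can be sketched as follows. This is not the paper's integrated method: it merely picks a Box-Cox exponent per column by maximum likelihood first, then runs PCA on the transformed data; the data and grid are hypothetical.

```python
import numpy as np

# Hypothetical skewed data: lognormal columns.
rng = np.random.default_rng(1)
X = rng.lognormal(mean=0.0, sigma=1.0, size=(400, 3))

def boxcox(x, lam):
    """Box-Cox transform; lam = 0 gives the log transform."""
    return np.log(x) if lam == 0 else (x**lam - 1.0) / lam

def boxcox_loglik(x, lam):
    """Profile log-likelihood of the Box-Cox model, up to a constant."""
    z = boxcox(x, lam)
    return -0.5 * len(x) * np.log(z.var()) + (lam - 1.0) * np.log(x).sum()

# Grid-search lambda for each column, transform, then standard PCA.
grid = np.linspace(-2.0, 2.0, 81)
Z, lams = np.empty_like(X), []
for j in range(X.shape[1]):
    lam = grid[np.argmax([boxcox_loglik(X[:, j], g) for g in grid])]
    lams.append(lam)
    Z[:, j] = boxcox(X[:, j], lam)

Zc = Z - Z.mean(axis=0)
eigvals = np.linalg.eigvalsh(Zc.T @ Zc / (len(Zc) - 1))[::-1]
```

For lognormal columns the selected exponent lands near zero (the log transform), removing most of the skewness before the covariance is formed; the paper's contribution is to choose the transformation jointly with the PCA model rather than in this separate first step.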
Properties of Design-Based Functional Principal Components Analysis
This work aims at performing Functional Principal Components Analysis (FPCA)
with Horvitz-Thompson estimators when the observations are curves collected
with survey sampling techniques. One important motivation for this study is
that FPCA is a dimension reduction tool which is the first step to develop
model assisted approaches that can take auxiliary information into account.
FPCA relies on the estimation of the eigenelements of the covariance operator
which can be seen as nonlinear functionals. Adapting to our functional context
the linearization technique based on the influence function developed by
Deville (1999), we prove that these estimators are asymptotically design
unbiased and consistent. Under mild assumptions, asymptotic variances are
derived for the FPCA estimators, and consistent estimators of these variances
are proposed. Our approach is illustrated with a simulation study, in which we
check the good properties of the proposed estimators of the eigenelements as
well as of their variance estimators obtained with the linearization approach.
Comment: Revised version for J. of Statistical Planning and Inference (January
2009).
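A simplified numeric sketch of the setting may help; the grid, population, and inclusion probabilities below are all hypothetical, and this is not the paper's exact estimator. Curves observed on a common grid are drawn by Poisson sampling with known inclusion probabilities, and the mean curve and covariance operator are estimated with Horvitz-Thompson weights 1/pi_i before eigendecomposition.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 50)          # common observation grid
N = 2000                               # finite population size

# Hypothetical population of curves spanned by two smooth modes.
s1 = rng.normal(scale=2.0, size=N)
s2 = rng.normal(scale=1.0, size=N)
pop = s1[:, None] * np.sin(2 * np.pi * t) + s2[:, None] * np.cos(2 * np.pi * t)

# Poisson sampling with unequal, known inclusion probabilities.
pi = 0.1 + 0.2 * np.abs(s1) / np.abs(s1).max()
sampled = rng.random(N) < pi
Y, w = pop[sampled], 1.0 / pi[sampled]          # Horvitz-Thompson weights

# HT estimators of the mean curve and covariance operator (on the grid).
mu_hat = (w[:, None] * Y).sum(axis=0) / N
R = Y - mu_hat
cov_hat = (w[:, None, None] * R[:, :, None] * R[:, None, :]).sum(axis=0) / N

# Eigenelements: eigenvalues and discretized eigenfunctions.
eigvals, eigvecs = np.linalg.eigh(cov_hat)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]   # descending order
```

Because the eigenelements are nonlinear functionals of the covariance operator, their design-based variances are what the paper's linearization (influence-function) argument delivers; the sketch above only produces the point estimates.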
Sparse logistic principal components analysis for binary data
We develop a new principal components analysis (PCA) type dimension reduction
method for binary data. Different from the standard PCA which is defined on the
observed data, the proposed PCA is defined on the logit transform of the
success probabilities of the binary observations. Sparsity is introduced to the
principal component (PC) loading vectors for enhanced interpretability and more
stable extraction of the principal components. Our sparse PCA is formulated as
solving an optimization problem with a criterion function motivated from a
penalized Bernoulli likelihood. A Majorization--Minimization algorithm is
developed to efficiently solve the optimization problem. The effectiveness of
the proposed sparse logistic PCA method is illustrated by application to a
single nucleotide polymorphism data set and a simulation study.
Comment: Published in the Annals of Applied Statistics
(http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics
(http://www.imstat.org) at http://dx.doi.org/10.1214/10-AOAS327.
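A heavily simplified sketch of the idea, not the paper's Majorization-Minimization algorithm: a rank-one logistic PCA fit by alternating gradient steps on a penalized Bernoulli log-likelihood, where soft-thresholding the loading vector b (the proximal map of the L1 penalty) produces sparsity. The step size and penalty level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 300, 20

# Ground truth: a sparse loading drives binary data through the logit link.
b_true = np.zeros(d)
b_true[:5] = 2.0
a_true = rng.normal(size=n)
p_true = 1.0 / (1.0 + np.exp(-np.outer(a_true, b_true)))
X = (rng.random((n, d)) < p_true).astype(float)

def soft(v, thr):
    """Soft-thresholding operator (proximal map of the L1 penalty)."""
    return np.sign(v) * np.maximum(np.abs(v) - thr, 0.0)

a, b = rng.normal(size=n), rng.normal(size=d)
step, lam = 0.01, 2.0                           # illustrative tuning values
for _ in range(500):
    P = 1.0 / (1.0 + np.exp(-np.outer(a, b)))   # model success probabilities
    G = X - P                                   # gradient of the log-likelihood
    a_new = a + step * (G @ b)                  # ascent step for the PC scores
    a_new *= np.sqrt(n) / np.linalg.norm(a_new) # fix the scale indeterminacy
    b = soft(b + step * (G.T @ a), step * lam)  # sparse loading update
    a = a_new
```

The penalty drives the loadings on inactive variables toward zero, which is the interpretability gain the abstract describes; the paper's MM algorithm replaces these raw gradient steps with closed-form majorizer updates.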
Wishart Mechanism for Differentially Private Principal Components Analysis
We propose a new input perturbation mechanism for publishing a covariance
matrix to achieve (ε, 0)-differential privacy. Our mechanism uses a Wishart
distribution to generate matrix noise. In particular, we apply this mechanism
to principal component analysis. Our mechanism preserves the positive
semi-definiteness of the published covariance matrix, and thus gives rise to a
general publishing framework for input perturbation of a symmetric positive
semidefinite matrix. Moreover, compared with the classic Laplace mechanism, our
method has a better utility guarantee. To the best of our knowledge, the
Wishart mechanism is the best input perturbation approach for (ε, 0)-
differentially private PCA. We also compare our work with previous exponential
mechanism algorithms in the literature, providing a near-optimal bound while
offering more flexibility and lower computational cost.
Comment: A full version with technical proofs. Accepted to AAAI-1
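The input-perturbation idea can be sketched as follows: add a Wishart-distributed noise matrix W = Z Z^T to the empirical covariance before eigendecomposition. Since both terms are symmetric positive semidefinite, the published matrix remains a valid covariance. The noise scale below is illustrative only; a real deployment must calibrate the Wishart parameters to the privacy budget ε, which this sketch does not do and therefore carries no privacy guarantee.

```python
import numpy as np

rng = np.random.default_rng(4)
d, n = 5, 1000

# Hypothetical correlated data and its empirical covariance.
X = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))
C = X.T @ X / n

# Wishart noise with d + 1 degrees of freedom: W = Z Z^T is PSD by construction.
scale = 0.1                                    # illustrative, NOT calibrated to epsilon
Z = rng.normal(scale=scale, size=(d, d + 1))
C_priv = C + Z @ Z.T                           # sanitized covariance, still symmetric PSD

# Standard PCA can run directly on the sanitized matrix.
eigvals = np.linalg.eigvalsh(C_priv)[::-1]
```

Preserving positive semi-definiteness is the practical advantage over additive Laplace noise, which can push the published matrix outside the cone of valid covariances and force a projection step before PCA.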