174,302 research outputs found

    Determining Principal Component Cardinality through the Principle of Minimum Description Length

    PCA (Principal Component Analysis) and its variants are ubiquitous techniques for matrix dimension reduction and reduced-dimension latent-factor extraction. One significant challenge in using PCA is the choice of the number of principal components. The information-theoretic MDL (Minimum Description Length) principle gives objective compression-based criteria for model selection, but it is difficult to analytically apply its modern definition - NML (Normalized Maximum Likelihood) - to the problem of PCA. This work shows a general reduction of NML problems to lower-dimension problems. Applying this reduction, it bounds the NML of PCA by terms of the NML of linear regression, which are known. Comment: LOD 201
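
    The abstract's core idea, picking the number of components by a description-length criterion, can be illustrated with a crude two-part (BIC-like) code. This is a minimal sketch, not the paper's NML bound; the model is a probabilistic-PCA-style Gaussian, and all names and constants are illustrative:

        # MDL-style selection of the number of principal components.
        # Two-part code: Gaussian data cost plus a parameter-count penalty.
        # NOT the paper's NML criterion; a hedged BIC-like stand-in.
        import numpy as np

        def mdl_num_components(X, k_max=None):
            n, d = X.shape
            Xc = X - X.mean(axis=0)
            # Eigenvalues of the sample covariance, largest first.
            eig = np.linalg.eigvalsh(Xc.T @ Xc / n)[::-1]
            eig = np.maximum(eig, 1e-12)
            k_max = k_max or d - 1
            costs = []
            for k in range(1, k_max + 1):
                # Residual variance spread over the discarded directions.
                sigma2 = eig[k:].mean()
                # Gaussian code length for the data under a k-component
                # model (terms constant in k are dropped).
                data_cost = 0.5 * n * (np.sum(np.log(eig[:k]))
                                       + (d - k) * np.log(sigma2))
                # Roughly log(n)/2 nats per free model parameter.
                n_params = d * k - k * (k - 1) / 2 + 1
                model_cost = 0.5 * n_params * np.log(n)
                costs.append(data_cost + model_cost)
            return int(np.argmin(costs)) + 1

    The chosen k minimizes data cost plus model cost; the paper's contribution is a principled NML version of this trade-off, which the sketch above does not implement.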

    Use of principal components to aggregate rare variants in case-control and family-based association studies in the presence of multiple covariates

    Rare variants may help to explain some of the missing heritability of complex diseases. Technological advances in next-generation sequencing give us the opportunity to test this hypothesis. We propose two new methods (one for case-control studies and one for family-based studies) that combine aggregated rare variants and common variants located within a region through principal components analysis and allow for covariate adjustment. We analyzed 200 replicates consisting of 209 case subjects and 488 control subjects and compared the results to weight-based and step-up aggregation methods. The principal components and collapsing method showed an association between the gene FLT1 and the quantitative trait Q1 (P < 10−30) in a fraction of the computation time of the other methods. The proposed family-based test gave inconclusive results. The two methods provide a fast way to simultaneously analyze rare and common variants at the gene level while adjusting for covariates. However, further evaluation of the statistical efficiency of this approach is warranted.
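
    As a hedged sketch of the case-control variant, the "collapse rare variants, take principal components, test with covariate adjustment" pipeline might look as follows; the MAF threshold, number of PCs, and likelihood-ratio test are illustrative assumptions, not the authors' exact choices:

        # Sketch: collapse rare variants into a burden score, run PCA on
        # that score plus the common variants, and test the leading PCs
        # in a covariate-adjusted logistic model.
        import numpy as np
        import statsmodels.api as sm
        from scipy.stats import chi2

        def pc_collapse_test(G, y, covars, maf_threshold=0.01, n_pcs=2):
            maf = G.mean(axis=0) / 2.0               # additive 0/1/2 coding
            rare = G[:, maf < maf_threshold]
            common = G[:, maf >= maf_threshold]
            burden = rare.sum(axis=1, keepdims=True)  # collapsed rare score
            M = np.hstack([burden, common])
            M = (M - M.mean(axis=0)) / (M.std(axis=0) + 1e-12)
            # Leading principal components of the combined matrix.
            U, s, _ = np.linalg.svd(M, full_matrices=False)
            pcs = U[:, :n_pcs] * s[:n_pcs]
            # Likelihood-ratio test of the PCs, adjusting for covariates.
            X0 = sm.add_constant(covars)
            X1 = sm.add_constant(np.hstack([covars, pcs]))
            ll0 = sm.Logit(y, X0).fit(disp=0).llf
            ll1 = sm.Logit(y, X1).fit(disp=0).llf
            return chi2.sf(2 * (ll1 - ll0), df=n_pcs)

    The single gene-level test over PCs is what keeps the per-gene computation cheap relative to variant-by-variant aggregation schemes.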

    A combined linkage, microarray and exome analysis suggests MAP3K11 as a candidate gene for left ventricular hypertrophy

    Background: Electrocardiographic measures of left ventricular hypertrophy (LVH) are used as predictors of cardiovascular risk. We combined linkage and association analyses to discover novel rare genetic variants involved in three such measures and two principal components derived from them. Methods: The study was conducted among participants from the Erasmus Rucphen Family Study (ERF), a Dutch family-based sample from the southwestern Netherlands. Variance components linkage analyses were performed using Merlin. Regions of interest (LOD > 1.9) were fine-mapped using microarray and exome sequence data. Results: We observed one significant LOD score for the second principal component on chromosome 15 (LOD score = 3.01) and 12 suggestive LOD scores. Several loci contained variants identified in GWAS for these traits; however, these did not explain the linkage peaks, nor did other common variants. Exome sequence data identified two associated variants after multiple testing corrections were applied. Conclusions: We did not find common SNPs explaining these linkage signals. Exome sequencing uncovered a relatively rare variant in MAP3K11 on chromosome 11 (MAF = 0.01) that helped account for the suggestive linkage peak observed for the first principal component. Conditional analysis revealed a drop in LOD from 2.01 to 0.88 for MAP3K11, suggesting that this variant may partially explain the linkage signal at this chromosomal location. MAP3K11 is related to the JNK pathway and is a pro-apoptotic kinase that plays an important role in the induction of cardiomyocyte apoptosis in various pathologies, including LVH.
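
    The variance-components linkage step itself (run in Merlin) is beyond a short example, but the phenotype-construction step, deriving principal-component traits from the three ECG measures, can be sketched; trait handling and names here are assumptions:

        # Sketch: derive PC1/PC2 phenotypes from three ECG LVH measures,
        # for use as quantitative traits in a downstream linkage analysis.
        import numpy as np

        def pc_phenotypes(traits):
            """traits: (n_subjects, 3) array of ECG LVH measures."""
            Z = (traits - traits.mean(axis=0)) / traits.std(axis=0)
            # Eigen-decomposition of the trait correlation matrix.
            evals, evecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
            order = np.argsort(evals)[::-1]
            # PC1 and PC2 scores, exported as linkage phenotypes.
            return Z @ evecs[:, order[:2]]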

    Integrating joint feature selection into subspace learning: A formulation of 2DPCA for outliers robust feature selection

    Since principal component analysis and its variants are sensitive to outliers that affect their performance and applicability in the real world, several variants have been proposed to improve robustness. However, most existing methods are still sensitive to outliers and are unable to select useful features. To overcome the sensitivity of PCA to outliers, in this paper we introduce two-dimensional outliers-robust principal component analysis (ORPCA) by imposing joint constraints on the objective function. ORPCA relaxes the orthogonality constraints and penalizes the regression coefficients; it thus selects important features while ignoring features already captured by other principal components. It is well known that the squared Frobenius norm is sensitive to outliers. To overcome this issue, we devise an alternative way to derive the objective function. Experimental results on four publicly available benchmark datasets show the effectiveness of joint feature selection and better performance compared to state-of-the-art dimensionality-reduction methods.
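
    To illustrate the robustness idea (without the paper's joint feature-selection constraints), the following is a generic iteratively reweighted scheme that swaps the squared Frobenius loss for a row-wise L2,1-style loss; it is a sketch of the principle, not ORPCA itself:

        # Sketch: robust PCA via iteratively reweighted least squares.
        # Minimizes sum_i ||x_i - W W^T x_i||_2 (row-wise, unsquared),
        # so single outlying rows no longer dominate the fit.
        import numpy as np

        def robust_pca_l21(X, k, n_iter=20, eps=1e-6):
            Xc = X - X.mean(axis=0)
            w = np.ones(Xc.shape[0])
            for _ in range(n_iter):
                # Weighted covariance down-weights high-residual rows.
                C = (Xc * w[:, None]).T @ Xc
                evals, evecs = np.linalg.eigh(C)
                W = evecs[:, np.argsort(evals)[::-1][:k]]  # top-k directions
                resid = np.linalg.norm(Xc - Xc @ W @ W.T, axis=1)
                w = 1.0 / np.maximum(resid, eps)           # IRLS reweighting
            return W

    Each reweighting step is equivalent to one majorize-minimize step on the L2,1 objective, which is why outlying rows progressively lose influence over the fitted subspace.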

    Two-phase incremental kernel PCA for learning massive or online datasets

    As a powerful nonlinear feature extractor, kernel principal component analysis (KPCA) has been widely adopted in many machine learning applications. However, KPCA is usually performed in a batch mode, leading to potential problems when handling massive or online datasets. To overcome this drawback of KPCA, in this paper we propose a two-phase incremental KPCA (TP-IKPCA) algorithm that can incorporate data into KPCA in an incremental fashion. In the first phase, an incremental algorithm is developed to explicitly express the data in the kernel space. In the second phase, we extend incremental principal component analysis (IPCA) to estimate the kernel principal components. Extensive experiments on both synthetic and real datasets show that the proposed TP-IKPCA produces principal components similar to those of conventional batch-based KPCA but is computationally faster than KPCA and several of its incremental variants. Therefore, our algorithm can be applied to massive or online datasets where batch methods are infeasible.
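
    A hedged sketch of the two-phase pattern, using scikit-learn's Nystroem approximation as a stand-in for the paper's explicit kernel-space expression and IncrementalPCA for the second phase; the kernel, batch sizes, and dimensions are illustrative:

        # Sketch: (1) express data explicitly in an approximate kernel
        # feature space, (2) run incremental PCA over mini-batches.
        # Stand-in components, not the authors' exact construction.
        import numpy as np
        from sklearn.kernel_approximation import Nystroem
        from sklearn.decomposition import IncrementalPCA

        rng = np.random.default_rng(0)
        X = rng.normal(size=(10_000, 20))        # stand-in dataset

        # Phase 1: explicit (approximate) kernel feature map, fitted on
        # a small landmark sample rather than the full kernel matrix.
        feature_map = Nystroem(kernel="rbf", gamma=0.1, n_components=200)
        feature_map.fit(X[:500])

        # Phase 2: incremental PCA over mini-batches of mapped data.
        ipca = IncrementalPCA(n_components=10)
        for batch in np.array_split(X, 20):
            ipca.partial_fit(feature_map.transform(batch))

        kpc_scores = ipca.transform(feature_map.transform(X[:5]))

    Because each phase touches only a landmark sample or a mini-batch at a time, memory stays bounded even when the full n-by-n kernel matrix would not fit, which is the practical point of the batch-free formulation.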
