58 research outputs found

    Penalized Orthogonal-Components Regression for Large p Small n Data

    We propose a penalized orthogonal-components regression (POCRE) for large p small n data. Orthogonal components are sequentially constructed to maximize, upon standardization, their correlation to the response residuals. A new penalization framework, implemented via empirical Bayes thresholding, is presented to effectively identify sparse predictors of each component. POCRE is computationally efficient owing to its sequential construction of leading sparse principal components. In addition, such construction offers other properties such as grouping highly correlated predictors and allowing for collinear or nearly collinear predictors. With multivariate responses, POCRE can construct common components and thus build up latent-variable models for large p small n data.

    Coefficients of Determination for Mixed-Effects Models

    The coefficient of determination is well defined for linear models and its extension is long wanted for mixed-effects models. We revisit its extension to define measures for proportions of variation explained by the whole model, fixed effects only, and random effects only. We propose to calculate unexplained variations conditional on individual random and/or fixed effects so as to keep individual heterogeneity brought by available predictors. While naturally defined for linear mixed models, these measures can be defined for a generalized linear mixed model using a distance measured along its variance function, accounting for its heteroscedasticity.
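    For orientation, the linear-model quantity that this abstract starts from can be sketched as below. This is only the ordinary coefficient of determination, 1 - SS_res / SS_tot, for a single-predictor least-squares fit; it is not the mixed-model extension the authors propose. The function names (`fit_line`, `r_squared`) and the toy data are hypothetical and chosen purely for illustration.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for one predictor (illustrative helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(y, y_hat):
    """Ordinary coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)              # total variation
    return 1.0 - ss_res / ss_tot


# Toy data (made up for this sketch, not taken from the paper).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
y_hat = [intercept + slope * xi for xi in x]
r2 = r_squared(y, y_hat)
print(f"R^2 = {r2:.4f}")
```

    In a mixed-effects setting the open question is which fitted values (with or without the random effects) enter `y_hat`, which is the choice the abstract's three measures address.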

    Brain APOE expression quantitative trait loci-based association study identified one susceptibility locus for Alzheimer's disease by interacting with APOE epsilon 4

    Some studies have demonstrated interactions between AD-risk single nucleotide polymorphisms (SNPs) in non-APOE regions and APOE genotype. Nevertheless, no study has reported interactions between expression quantitative trait loci (eQTLs) for APOE and APOE genotype. In the present study, we included 9286 unrelated AD patients and 8479 normal controls from 12 cohorts of the NIA Genetics of Alzheimer’s Disease Data Storage Site (NIAGADS) and the Alzheimer’s Disease Neuroimaging Initiative (ADNI). Thirty-four unrelated brain eQTLs for APOE were compiled from BRAINEAC and GTEx. We used multi-covariate logistic regression analysis to identify eQTLs that interact with APOE ε4. Adjusted for age and gender, the substantia nigra eQTL rs438811 for APOE showed a significant, strong interaction with APOE ε4 status (OR, 1.448; CI, 1.124–1.430; P-value = 7.94 × 10⁻⁶). APOE ε4-based subgroup analyses revealed that carrying one minor allele T of rs438811 increases the risk of developing AD by 26.75% in APOE ε4 carriers but not in non-carriers. These results show that the substantia nigra eQTL rs438811 for APOE interacts with APOE ε4 and confers risk in APOE ε4 carriers only.

    Case-control genome-wide association study of rheumatoid arthritis from Genetic Analysis Workshop 16 using penalized orthogonal-components regression-linear discriminant analysis

    Currently, genome-wide association studies (GWAS) are conducted by collecting a massive number of SNPs (i.e., large p) for a relatively small number of individuals (i.e., small n) and associations are made between clinical phenotypes and genetic variation one single-nucleotide polymorphism (SNP) at a time. Univariate association approaches like this ignore the linkage disequilibrium between SNPs in regions of low recombination. This results in a low reliability of candidate gene identification. Here we propose to improve the case-control GWAS approach by implementing linear discriminant analysis (LDA) through a penalized orthogonal-components regression (POCRE), a newly developed variable selection method for large p small n data. The proposed POCRE-LDA method was applied to the Genetic Analysis Workshop 16 case-control data for rheumatoid arthritis (RA). In addition to the two regions on chromosomes 6 and 9 previously associated with RA by GWAS, we identified SNPs on chromosomes 10 and 18 as potential candidates for further investigation.

    Inferring Gene Regulatory Networks from a Population of Yeast Segregants

    Constructing gene regulatory networks is crucial to unraveling the genetic architecture of complex traits and to understanding the mechanisms of diseases. On the basis of gene expression and single nucleotide polymorphism data in the yeast, Saccharomyces cerevisiae, we constructed gene regulatory networks using a two-stage penalized least squares method. A large system of structural equations via optimal prediction of a set of surrogate variables was established at the first stage, followed by consistent selection of regulatory effects at the second stage. Using this approach, we identified subnetworks that were enriched in gene ontology categories, revealing directional regulatory mechanisms controlling these biological pathways. Our mapping and analysis of expression-based quantitative trait loci uncovered a known alteration of gene expression within a biological pathway that results in regulatory effects on companion pathway genes in the phosphocholine network. In addition, we identify nodes in these gene ontology-enriched subnetworks that are coordinately controlled by transcription factors driven by trans-acting expression quantitative trait loci. Altogether, the integration of documented transcription factor regulatory associations with subnetworks defined by a system of structural equations using quantitative trait loci data is an effective means to delineate the transcriptional control of biological pathways.

    Genome-wide association analysis of GAW17 data using an empirical Bayes variable selection

    Next-generation sequencing technologies enable us to explore rare functional variants. However, most current statistical techniques are too underpowered to capture signals of rare variants in genome-wide association studies. We propose a supervised coalescing of single-nucleotide polymorphisms to obtain gene-based markers that can stably reveal possible genetic effects related to rare alleles. We use a newly developed empirical Bayes variable selection algorithm to identify associations between studied traits and genetic markers. Using our novel method, we analyzed the three continuous phenotypes in the GAW17 data set across 200 replicates, with intriguing results.