2,608 research outputs found

    A fast algorithm for detecting gene-gene interactions in genome-wide association studies

    Full text link
    With the recent advent of high-throughput genotyping techniques, genetic data for genome-wide association studies (GWAS) have become increasingly available, which entails the development of efficient and effective statistical approaches. Although many such approaches have been developed and used to identify single-nucleotide polymorphisms (SNPs) that are associated with complex traits or diseases, few are able to detect gene-gene interactions among different SNPs. Genetic interactions, also known as epistasis, have been recognized to play a pivotal role in contributing to the genetic variation of phenotypic traits. However, because of an extremely large number of SNP-SNP combinations in GWAS, the model dimensionality can quickly become so overwhelming that no prevailing variable selection methods are capable of handling this problem. In this paper, we present a statistical framework for characterizing main genetic effects and epistatic interactions in a GWAS study. Specifically, we first propose a two-stage sure independence screening (TS-SIS) procedure and generate a pool of candidate SNPs and interactions, which serve as predictors to explain and predict the phenotypes of a complex trait. We also propose a rates adjusted thresholding estimation (RATE) approach to determine the size of the reduced model selected by an independence screening. Regularization regression methods, such as LASSO or SCAD, are then applied to further identify important genetic effects. Simulation studies show that the TS-SIS procedure is computationally efficient and has an outstanding finite sample performance in selecting potential SNPs as well as gene-gene interactions. We apply the proposed framework to analyze an ultrahigh-dimensional GWAS data set from the Framingham Heart Study, and select 23 active SNPs and 24 active epistatic interactions for the body mass index variation. It shows the capability of our procedure to resolve the complexity of genetic control.Comment: Published in at http://dx.doi.org/10.1214/14-AOAS771 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    The evolution of genetic architectures underlying quantitative traits

    Full text link
    In the classic view introduced by R. A. Fisher, a quantitative trait is encoded by many loci with small, additive effects. Recent advances in QTL mapping have begun to elucidate the genetic architectures underlying vast numbers of phenotypes across diverse taxa, producing observations that sometimes contrast with Fisher's blueprint. Despite these considerable empirical efforts to map the genetic determinants of traits, it remains poorly understood how the genetic architecture of a trait should evolve, or how it depends on the selection pressures on the trait. Here we develop a simple, population-genetic model for the evolution of genetic architectures. Our model predicts that traits under moderate selection should be encoded by many loci with highly variable effects, whereas traits under either weak or strong selection should be encoded by relatively few loci. We compare these theoretical predictions to qualitative trends in the genetics of human traits, and to systematic data on the genetics of gene expression levels in yeast. Our analysis provides an evolutionary explanation for broad empirical patterns in the genetic basis of traits, and it introduces a single framework that unifies the diversity of observed genetic architectures, ranging from Mendelian to Fisherian.Comment: Minor changes in the text; Added supplementary materia

    Multiple locus linkage analysis of genomewide expression in yeast.

    Get PDF
    With the ability to measure thousands of related phenotypes from a single biological sample, it is now feasible to genetically dissect systems-level biological phenomena. The genetics of transcriptional regulation and protein abundance are likely to be complex, meaning that genetic variation at multiple loci will influence these phenotypes. Several recent studies have investigated the role of genetic variation in transcription by applying traditional linkage analysis methods to genomewide expression data, where each gene expression level was treated as a quantitative trait and analyzed separately from one another. Here, we develop a new, computationally efficient method for simultaneously mapping multiple gene expression quantitative trait loci that directly uses all of the available data. Information shared across gene expression traits is captured in a way that makes minimal assumptions about the statistical properties of the data. The method produces easy-to-interpret measures of statistical significance for both individual loci and the overall joint significance of multiple loci selected for a given expression trait. We apply the new method to a cross between two strains of the budding yeast Saccharomyces cerevisiae, and estimate that at least 37% of all gene expression traits show two simultaneous linkages, where we have allowed for epistatic interactions. Pairs of jointly linking quantitative trait loci are identified with high confidence for 170 gene expression traits, where it is expected that both loci are true positives for at least 153 traits. In addition, we are able to show that epistatic interactions contribute to gene expression variation for at least 14% of all traits. We compare the proposed approach to an exhaustive two-dimensional scan over all pairs of loci. Surprisingly, we demonstrate that an exhaustive two-dimensional scan is less powerful than the sequential search used here. In addition, we show that a two-dimensional scan does not truly allow one to test for simultaneous linkage, and the statistical significance measured from this existing method cannot be interpreted among many traits

    Dissection of QTL effects for root traits using a chromosome arm-specific mapping population in bread wheat

    Get PDF
    A high-resolution chromosome arm-specific mapping population was used in an attempt to locate/detect gene(s)/QTL for different root traits on the short arm of rye chromosome 1 (1RS) in bread wheat. This population consisted of induced homoeologous recombinants of 1RS with 1BS, each originating from a different crossover event and distinct from all other recombinants in the proportions of rye and wheat chromatin present. It provides a simple and powerful approach to detect even small QTL effects using fewer progeny. A promising empirical Bayes method was applied to estimate additive and epistatic effects for all possible marker pairs simultaneously in a single model. This method has an advantage for QTL analysis in minimizing the error variance and detecting interaction effects between loci with no main effect. A total of 15 QTL effects, 6 additive and 9 epistatic, were detected for different traits of root length and root weight in 1RS wheat. Epistatic interactions were further partitioned into inter-genomic (wheat and rye alleles) and intra-genomic (rye–rye or wheat–wheat alleles) interactions affecting various root traits. Four common regions were identified involving all the QTL for root traits. Two regions carried QTL for almost all the root traits and were responsible for all the epistatic interactions. Evidence for inter-genomic interactions is provided. Comparison of mean values supported the QTL detection

    Informative Bayesian Model Selection: a method for identifying interactions in genome-wide data

    Get PDF
    In high-dimensional genome-wide (GWA) data, a key challenge is to detect genomic variants that interact in a nonlinear fashion in their association with disease. Identifying such genomic interactions is important for elucidating the inheritance of complex phenotypes and diseases. In this paper, we introduce a new computational method called Informative Bayesian Model Selection (IBMS) that leverages correlation among variants in GWA data due to the linkage disequilibrium to identify interactions accurately in a computationally efficient manner. IBMS combines several statistical methods including canonical correlation analysis, logistic regression analysis, and a Bayesians statistical measure of evaluating interactions. Compared to BOOST and BEAM that are two widely used methods for detecting genomic interactions, IBMS had significantly higher power when evaluated on synthetic data. Furthermore, when applied to Alzheimer's disease GWA data, IBMS identified previously reported interactions. IBMS is a useful method for identifying variants in GWA data, and software that implements IBMS is freely available online from http://lbb.ut.ac.ir/Download/ LBBsoft/IBMS. This journal is © the Partner Organisations 2014
    corecore