967 research outputs found

    A fast algorithm for detecting gene-gene interactions in genome-wide association studies

    With the recent advent of high-throughput genotyping techniques, genetic data for genome-wide association studies (GWAS) have become increasingly available, which entails the development of efficient and effective statistical approaches. Although many such approaches have been developed and used to identify single-nucleotide polymorphisms (SNPs) that are associated with complex traits or diseases, few are able to detect gene-gene interactions among different SNPs. Genetic interactions, also known as epistasis, have been recognized to play a pivotal role in contributing to the genetic variation of phenotypic traits. However, because of an extremely large number of SNP-SNP combinations in GWAS, the model dimensionality can quickly become so overwhelming that no prevailing variable selection methods are capable of handling this problem. In this paper, we present a statistical framework for characterizing main genetic effects and epistatic interactions in a GWAS study. Specifically, we first propose a two-stage sure independence screening (TS-SIS) procedure and generate a pool of candidate SNPs and interactions, which serve as predictors to explain and predict the phenotypes of a complex trait. We also propose a rates adjusted thresholding estimation (RATE) approach to determine the size of the reduced model selected by an independence screening. Regularization regression methods, such as LASSO or SCAD, are then applied to further identify important genetic effects. Simulation studies show that the TS-SIS procedure is computationally efficient and has an outstanding finite sample performance in selecting potential SNPs as well as gene-gene interactions. We apply the proposed framework to analyze an ultrahigh-dimensional GWAS data set from the Framingham Heart Study, and select 23 active SNPs and 24 active epistatic interactions for the body mass index variation. 
It shows the capability of our procedure to resolve the complexity of genetic control. Comment: Published at http://dx.doi.org/10.1214/14-AOAS771 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org
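
The screening idea in the abstract above can be illustrated with a small sketch. This is not the authors' TS-SIS or RATE procedure: it assumes a simplified two-stage rule (rank SNPs by absolute marginal correlation with the phenotype, then rank pairwise products of the retained SNPs the same way), uses fixed hypothetical cutoffs `keep_main` and `keep_pairs` in place of RATE, and runs on simulated toy genotypes. All function names are illustrative.

```python
import random
from itertools import combinations
from statistics import mean


def abs_corr(x, y):
    """Absolute Pearson correlation; 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def screen(columns, y, keep):
    """Return the `keep` keys whose columns correlate most strongly with y."""
    ranked = sorted(columns, key=lambda k: abs_corr(columns[k], y), reverse=True)
    return ranked[:keep]


def two_stage_screen(snps, y, keep_main, keep_pairs):
    """Stage 1: screen single SNPs. Stage 2: screen products of retained SNPs."""
    main = screen(snps, y, keep_main)
    products = {
        (i, j): [a * b for a, b in zip(snps[i], snps[j])]
        for i, j in combinations(sorted(main), 2)
    }
    pairs = screen(products, y, keep_pairs)
    return main, pairs


# Toy data (assumption): 300 individuals, 30 SNPs coded 0/1/2,
# phenotype driven by SNP 3, SNP 7 and their interaction.
rng = random.Random(0)
n, p = 300, 30
snps = {j: [rng.choice([0, 1, 2]) for _ in range(n)] for j in range(p)}
y = [
    2 * snps[3][i] + 2 * snps[7][i] + 2 * snps[3][i] * snps[7][i] + rng.gauss(0, 0.1)
    for i in range(n)
]

main, pairs = two_stage_screen(snps, y, keep_main=5, keep_pairs=3)
print("retained SNPs:", main)
print("retained interactions:", pairs)
```

In the framework described in the abstract, the retained SNPs and interactions would then be passed to a regularized regression such as LASSO or SCAD; that step is omitted here.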

    Multiple locus linkage analysis of genomewide expression in yeast.

    With the ability to measure thousands of related phenotypes from a single biological sample, it is now feasible to genetically dissect systems-level biological phenomena. The genetics of transcriptional regulation and protein abundance are likely to be complex, meaning that genetic variation at multiple loci will influence these phenotypes. Several recent studies have investigated the role of genetic variation in transcription by applying traditional linkage analysis methods to genomewide expression data, where each gene expression level was treated as a quantitative trait and analyzed separately from one another. Here, we develop a new, computationally efficient method for simultaneously mapping multiple gene expression quantitative trait loci that directly uses all of the available data. Information shared across gene expression traits is captured in a way that makes minimal assumptions about the statistical properties of the data. The method produces easy-to-interpret measures of statistical significance for both individual loci and the overall joint significance of multiple loci selected for a given expression trait. We apply the new method to a cross between two strains of the budding yeast Saccharomyces cerevisiae, and estimate that at least 37% of all gene expression traits show two simultaneous linkages, where we have allowed for epistatic interactions. Pairs of jointly linking quantitative trait loci are identified with high confidence for 170 gene expression traits, where it is expected that both loci are true positives for at least 153 traits. In addition, we are able to show that epistatic interactions contribute to gene expression variation for at least 14% of all traits. We compare the proposed approach to an exhaustive two-dimensional scan over all pairs of loci. Surprisingly, we demonstrate that an exhaustive two-dimensional scan is less powerful than the sequential search used here. 
In addition, we show that a two-dimensional scan does not truly allow one to test for simultaneous linkage, and the statistical significance measured from this existing method cannot be interpreted among many traits

    Genome-Wide Interaction-Based Association Analysis Identified Multiple New Susceptibility Loci for Common Diseases

    Genome-wide interaction-based association (GWIBA) analysis has the potential to identify novel susceptibility loci. These interaction effects could be missed with the prevailing approaches in genome-wide association studies (GWAS). However, no convincing loci have been discovered exclusively from GWIBA methods, and the intensive computation involved is a major barrier for application. Here, we developed a fast, multi-thread/parallel program named “pair-wise interaction-based association mapping” (PIAM) for exhaustive two-locus searches. With this program, we performed a complete GWIBA analysis on seven diseases with stringent control for false positives, and we validated the results for three of these diseases. We identified one pair-wise interaction between a previously identified locus, C1orf106, and one new locus, TEC, that was specific for Crohn's disease, with a Bonferroni corrected P<0.05 (P = 0.039). This interaction was replicated with a pair of proxy linked loci (P = 0.013) on an independent dataset. Five other interactions had corrected P<0.5. We identified the allelic effect of a locus close to SLC7A13 for coronary artery disease. This was replicated with a linked locus on an independent dataset (P = 1.09×10^−7). Through a local validation analysis that evaluated association signals, rather than locus-based associations, we found that several other regions showed association/interaction signals with nominal P<0.05. In conclusion, this study demonstrated that the GWIBA approach was successful for identifying novel loci, and the results provide new insights into the genetic architecture of common diseases. In addition, our PIAM program was capable of handling very large GWAS datasets that are likely to be produced in the future.
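
To give a sense of why an exhaustive two-locus search needs stringent control for false positives, the arithmetic of a Bonferroni correction over all SNP pairs can be sketched as follows. The SNP count of 500,000 is a hypothetical figure chosen for illustration; the abstract does not state how many SNPs were tested, and the helper names are not part of PIAM.

```python
from math import comb


def n_pairs(n_snps):
    """Number of unordered SNP pairs in an exhaustive two-locus scan."""
    return comb(n_snps, 2)


def bonferroni(p_nominal, n_tests):
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p_nominal * n_tests)


def nominal_threshold(alpha, n_tests):
    """Nominal P value a single test must beat to reach corrected level alpha."""
    return alpha / n_tests


n_snps = 500_000  # hypothetical panel size, not taken from the abstract
tests = n_pairs(n_snps)
print("pairwise tests:", tests)
print("nominal threshold for corrected P < 0.05:", nominal_threshold(0.05, tests))
print("corrected P for nominal 1e-13:", bonferroni(1e-13, tests))
```

Under this assumed panel size, a pair must reach a nominal P value on the order of 10^-13 to survive correction, which is one reason corrected P values such as 0.039 are reported alongside replication on independent data.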

    Expression Profiles Reveal Parallel Evolution of Epistatic Interactions Involving the CRP Regulon in Escherichia coli

    The extent and nature of epistatic interactions between mutations are issues of fundamental importance in evolutionary biology. However, they are difficult to study and their influence on adaptation remains poorly understood. Here, we use a systems-level approach to examine epistatic interactions that arose during the evolution of Escherichia coli in a defined environment. We used expression arrays to compare the effect on global patterns of gene expression of deleting a central regulatory gene, crp. Effects were measured in two lineages that had independently evolved for 20,000 generations and in their common ancestor. We found that deleting crp had a much more dramatic effect on the expression profile of the two evolved lines than on the ancestor. Because the sequence of the crp gene was unchanged during evolution, these differences indicate epistatic interactions between crp and mutations at other loci that accumulated during evolution. Moreover, a striking degree of parallelism was observed between the two independently evolved lines; 115 genes that were not crp-dependent in the ancestor became dependent on crp in both evolved lines. An analysis of changes in crp dependence of well-characterized regulons identified a number of regulatory genes as candidates for harboring beneficial mutations that could account for these parallel expression changes. Mutations within three of these genes have previously been found and shown to contribute to fitness. Overall, these findings indicate that epistasis has been important in the adaptive evolution of these lines, and they provide new insight into the types of genetic changes through which epistasis can evolve. More generally, we demonstrate that expression profiles can be profitably used to investigate epistatic interactions

    Genetic analysis of ear development and tassel architecture in maize (Zea mays L. ssp. mays)

    Yield potential of maize (Zea mays L.) has been increased significantly during the last century. Along with genetic gains for grain yield, changes in other traits have included an increase in the number of ears per plant (i.e. fewer barren plants) and a reduction in tassel size. The objectives of this study were 1) to identify Quantitative Trait Loci (QTL) associated with number of ears per plant (EPP), growing degree units to anthesis (GDU), plant height (PH) and tassel architectural traits, and 2) to evaluate the consistency of the QTL across environments. A population of 218 recombinant inbred lines (RILs) derived from two nearly isogenic inbreds, C103 and C103AP was evaluated for EPP, GDU, PH and four tassel architectural traits. The genetic map of 123 Simple Sequence Repeat (SSR) loci covered 894 cM. At least 5 novel regions for EPP were detected on chromosomes 2, 3, 6, 8 and 9. A region flanked by loci umc1858 and umc1309, on chromosome 8 (bins 8.04-8.05; a bin is an arbitrary subdivision of the maize genome based on a set of core markers) had a major influence on EPP, PH and GDU to anthesis. With respect to tassel morphology, a total of 32 QTL were identified for tassel branch number (TBN), tassel length (TL), central spike (CSL) and branching zone length (BZL). The majority of these QTL were located on chromosomes 1, 2, 3, 4 and 8. The QTL for TBN, TL and CSL with strong association to the phenotypic variance were located in bins 2.01, 2.06, 2.08 and 9.03. In these bins candidate genes and QTL have not been identified; therefore, this is the first report of a biological function with respect to tassel morphology for those regions in the genome. Comprehensive descriptions of the QTL related to the traits evaluated in this study are provided in the individual chapters of this dissertation. Many results found have not been described previously in the literature and will contribute to the current knowledge. 
Finally, further study of these regions is required for better understanding of the genetic factors affecting meristem initiation, maintenance and development in maize