    Association mapping from sequencing reads using k-mers

    Genome-wide association studies (GWAS) rely on microarrays, or more recently on mapping of sequencing reads, to genotype individuals. The reliance on prior sequencing of a reference genome limits the scope of association studies and precludes mapping associations outside of the reference. We present an alignment-free method for association studies of categorical phenotypes based on counting k-mers in whole-genome sequencing reads, testing for associations directly between k-mers and the trait of interest, and locally assembling the statistically significant k-mers to identify sequence differences. An analysis of the 1000 Genomes data shows that sequences identified by our method largely agree with results obtained using the standard approach. However, unlike standard GWAS, our method identifies associations with structural variations and with sites not present in the reference genome. We also demonstrate that population stratification can be inferred from k-mers. Finally, application to an E. coli dataset on ampicillin resistance validates the approach.
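
    The first two steps described in this abstract (counting k-mers per sample, then testing each k-mer against a categorical trait) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names are hypothetical, k-mers are reduced to presence/absence, and a two-sided Fisher exact test is swapped in because the abstract does not say which test statistic the authors use. Reverse-complement handling, multiple-testing correction, and the local assembly step are omitted.

```python
from collections import Counter
from math import comb


def count_kmers(reads, k):
    """Count every length-k substring across one sample's reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))


def kmer_association(samples, labels, k):
    """Test each k-mer's presence/absence against a 0/1 phenotype.

    samples: list of read lists, one per individual (hypothetical input layout).
    labels:  list of 0/1 phenotype labels, same order as samples.
    Returns a dict mapping k-mer -> p-value.
    """
    presence = [set(count_kmers(reads, k)) for reads in samples]
    n_case = sum(labels)
    n_ctrl = len(labels) - n_case
    results = {}
    for kmer in set().union(*presence):
        case_with = sum(1 for s, y in zip(presence, labels) if y == 1 and kmer in s)
        ctrl_with = sum(1 for s, y in zip(presence, labels) if y == 0 and kmer in s)
        results[kmer] = fisher_exact_two_sided(
            case_with, n_case - case_with, ctrl_with, n_ctrl - ctrl_with
        )
    return results
```

    In a toy run with three cases carrying the read "ACGTA" and three controls carrying "ACGAA", the shared k-mer "ACG" gets no evidence of association, while "CGT" (cases only) gets the smallest p-value the sample size allows. Real data would need far larger cohorts and a correction for the number of k-mers tested.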

    A complete classification of epistatic two-locus models

    Background: The study of epistasis is of great importance in statistical genetics in fields such as linkage and association analysis and QTL mapping. In an effort to classify the types of epistasis in the case of two biallelic loci, Li and Reich listed and described all models in the simplest case of 0/1 penetrance values. However, they left open the problem of finding a classification of two-locus models with continuous penetrance values. Results: We provide a complete classification of biallelic two-locus models. In addition to solving the classification problem for dichotomous trait disease models, our results apply to any instance where real numbers are assigned to genotypes, and provide a complete framework for studying epistasis in QTL data. Our approach is geometric and we show that there are 387 distinct types of two-locus models, which can be reduced to 69 when symmetry between loci and alleles is accounted for. The model types are defined by 86 circuits, which are linear combinations of genotype values, each of which measures a fundamental unit of interaction. Conclusion: The circuits provide information on epistasis beyond that contained in the additive × additive, additive × dominance, and dominance × dominance interaction terms. We discuss th

    Resultants in genetic linkage analysis

    Statistical models for genetic linkage analysis of k-locus diseases are k-dimensional subvarieties of a (3^k − 1)-dimensional probability simplex. We determine the algebraic invariants of these models with general characteristics for k=1; in particular we recover, and generalize, the Hardy–Weinberg curve. For k=2, the algebraic invariants are presented as determinants of 32×32 matrices of linear forms in nine unknowns, a suitable format for computations with numerical data.
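
    As a small illustration of what an algebraic invariant means here, the classical one-locus Hardy–Weinberg curve can be checked numerically. This sketch covers only the textbook curve, not the generalization developed in the paper; the function names and the allele-frequency parameter theta are assumptions introduced for the example.

```python
def hardy_weinberg(theta):
    """Genotype probabilities (AA, Aa, aa) for allele frequency theta."""
    return (theta ** 2, 2 * theta * (1 - theta), (1 - theta) ** 2)


def hw_invariant(p_AA, p_Aa, p_aa):
    """Polynomial that vanishes exactly on the Hardy-Weinberg curve
    inside the probability simplex: p_Aa^2 - 4 * p_AA * p_aa."""
    return p_Aa ** 2 - 4 * p_AA * p_aa


# Points on the curve make the invariant vanish (up to rounding).
for theta in (0.0, 0.1, 0.37, 0.5, 0.9, 1.0):
    point = hardy_weinberg(theta)
    assert abs(sum(point) - 1.0) < 1e-12
    assert abs(hw_invariant(*point)) < 1e-12

# A distribution off the curve (no heterozygotes) does not.
assert hw_invariant(0.5, 0.0, 0.5) != 0
```

    The three genotype probabilities live in a 2-dimensional simplex (3^1 − 1 = 2), and the single polynomial above cuts out the 1-dimensional model inside it.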

    Additional file 1: Figure S1. of A longitudinal genome-wide association study of anti-tumor necrosis factor response among Japanese patients with rheumatoid arthritis

    Regional plots showing association results from GEE models at 6q15, 6q27 and 10q25.3, when the analyses were restricted to patients with moderate or severe disease activity at baseline (n = 413). (PDF 350 kb)