
    Genetic Imputation: Accuracy to Application

    Genotype imputation, the process of inferring genotypes for untyped variants, is used to identify and refine genetic association findings. This body of work focuses on assessing imputation accuracy and uses imputed data to identify genetic contributors to mentholated cigarette preference. Inaccuracies in imputed data can distort the observed association between variants and a disease. Many statistics are used to assess accuracy; some compare imputed to genotyped data and others are calculated without reference to true genotypes. Prior work has shown that the Imputation Quality Score (IQS), which is based on Cohen's kappa statistic and compares imputed genotype probabilities to true genotypes, appropriately adjusts for chance agreement; however, it is not commonly used. To identify differences in accuracy assessment, we compared IQS with concordance rate, squared correlation, and accuracy measures built into imputation programs. Genotypes from the 1000 Genomes reference populations (AFR N = 246 and EUR N = 379) were masked to match the typed single nucleotide polymorphism (SNP) coverage of several SNP arrays and were imputed with BEAGLE 3.3.2 and IMPUTE2 in regions associated with smoking behaviors. Additional masking and imputation were conducted for sequenced subjects from the Collaborative Genetic Study of Nicotine Dependence and the Genetic Study of Nicotine Dependence in African Americans (N = 1,481 African Americans and N = 1,480 European Americans). Our results offer further evidence that concordance rate inflates accuracy estimates, particularly for rare and low frequency variants. For common variants, squared correlation, BEAGLE R2, IMPUTE2 INFO, and IQS produce similar assessments of imputation accuracy. However, for rare and low frequency variants, compared to IQS, the other statistics tend to be more liberal in their assessment of accuracy.
IQS is important to consider when evaluating imputation accuracy, particularly for rare and low frequency variants. This work directly impacts the interpretation of association studies by improving our understanding of accuracy assessments of imputed variants. Mentholated cigarettes are addictive, widely available, and commonly used, particularly by African American smokers. We aim to identify genetic variants that increase susceptibility to mentholated cigarette use in hopes of gaining biological insights into risk that may ultimately improve cessation efforts. We begin by pursuing hypothesis-driven candidate genes and regions (TAS2R38, CHRNA5/A3/B4, CHRNB3/A6, and CYP2A6/A7) and extend to a genome-wide approach. This study involves 3,571 nicotine-dependent current smokers (1,365 African Americans and 2,206 European Americans) from the Collaborative Genetic Study of Nicotine Dependence (COGEND) and the Transdisciplinary Tobacco Use Research Center (UW-TTURC). Analyses were conducted within each cohort, and meta-analysis was used to combine results across studies and across ancestral groups. We identified some suggestively associated variants, although none met genome-wide significance. This study represents a new, important aspect of understanding menthol cigarette preference. Further work is necessary to better understand this smoking behavior in efforts to improve cessation.

    Associations among ancestry, geography and breast cancer incidence, mortality, and survival in Trinidad and Tobago

    Breast cancer (BC) is the most common newly diagnosed cancer among women in Trinidad and Tobago (TT) and BC mortality rates are among the highest in the world. Globally, racial/ethnic trends in BC incidence, mortality and survival have been reported. However, such investigations have not been conducted in TT, which has been noted for its rich diversity. In this study, we investigated associations among ancestry, geography and BC incidence, mortality and survival in TT. Data on 3767 incident BC cases, reported to the National Cancer Registry of TT from 1995 to 2007, were analyzed. Women of African ancestry had significantly higher BC incidence and mortality rates (Incidence: 66.96; Mortality: 30.82 per 100,000) compared to women of East Indian (Incidence: 41.04; Mortality: 14.19 per 100,000) or mixed ancestry (Incidence: 36.72; Mortality: 13.80 per 100,000). Geographically, women residing in the North West Regional Health Authority (RHA) catchment area followed by the North Central RHA exhibited the highest incidence and mortality rates. Notable ancestral differences in survival were also observed. Women of East Indian and mixed ancestry experienced significantly longer survival than those of African ancestry. Differences in survival by geography were not observed. In TT, ancestry and geographical residence seem to be strong predictors of BC incidence and mortality rates. Additionally, disparities in survival by ancestry were found. These data should be considered in the design and implementation of strategies to reduce BC incidence and mortality rates in TT.

    Gene–Environment Interactions at Nucleotide Resolution

    Interactions among genes and the environment are a common source of phenotypic variation. To characterize the interplay between genetics and the environment at single nucleotide resolution, we quantified the genetic and environmental interactions of four quantitative trait nucleotides (QTN) that govern yeast sporulation efficiency. We first constructed a panel of strains that together carry all 32 possible combinations of the 4 QTN genotypes in 2 distinct genetic backgrounds. We then measured the sporulation efficiencies of these 32 strains across 8 controlled environments. This dataset shows that variation in sporulation efficiency is shaped largely by genetic and environmental interactions. We find clear examples of QTN:environment, QTN:background, and environment:background interactions. However, we find no QTN:QTN interactions that occur consistently across the entire dataset. Instead, interactions between QTN only occur under specific combinations of environment and genetic background. Thus, what might appear to be a QTN:QTN interaction in one background and environment becomes a more complex QTN:QTN:environment:background interaction when we consider the dataset as a whole. As a result, the phenotypic impact of a set of QTN alleles cannot be predicted from genotype alone. Our results instead demonstrate that the effects of QTN and their interactions are inextricably linked both to genetic background and to environmental variation.
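
The strain panel described in this abstract is a full factorial design, which can be enumerated in a few lines. The sketch below is only an illustration of the counting: the QTN, background, and environment labels are hypothetical placeholders, and coding each QTN as two alleles (0/1) is an assumption consistent with the stated total of 32 strains.

```python
from itertools import product

# Hypothetical labels; the abstract does not name the QTN, backgrounds,
# or environments, so these are placeholders for illustration only.
QTN_NAMES = ["qtn1", "qtn2", "qtn3", "qtn4"]
ALLELES = (0, 1)  # assumed: two alleles per QTN
BACKGROUNDS = ["background_A", "background_B"]
ENVIRONMENTS = [f"env_{k}" for k in range(1, 9)]

# One strain per combination of background and 4-QTN genotype: 2 * 2**4 = 32.
strains = [
    {"background": bg, **dict(zip(QTN_NAMES, alleles))}
    for bg in BACKGROUNDS
    for alleles in product(ALLELES, repeat=len(QTN_NAMES))
]

# Every strain is measured in every environment.
measurements = [
    (strain_index, env)
    for strain_index, env in product(range(len(strains)), ENVIRONMENTS)
]

print(len(strains), "strains,", len(measurements), "strain-by-environment cells")
```

Under these assumptions the design has 32 strains and 256 strain-by-environment cells, which is what makes it possible to ask whether a QTN:QTN effect holds across every background and environment or only in some of them.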

    When Does Choice of Accuracy Measure Alter Imputation Accuracy Assessments?

    Imputation, the process of inferring genotypes for untyped variants, is used to identify and refine genetic association findings. Inaccuracies in imputed data can distort the observed association between variants and a disease. Many statistics are used to assess accuracy; some compare imputed to genotyped data and others are calculated without reference to true genotypes. Prior work has shown that the Imputation Quality Score (IQS), which is based on Cohen's kappa statistic and compares imputed genotype probabilities to true genotypes, appropriately adjusts for chance agreement; however, it is not commonly used. To identify differences in accuracy assessment, we compared IQS with concordance rate, squared correlation, and accuracy measures built into imputation programs. Genotypes from the 1000 Genomes reference populations (AFR N = 246 and EUR N = 379) were masked to match the typed single nucleotide polymorphism (SNP) coverage of several SNP arrays and were imputed with BEAGLE 3.3.2 and IMPUTE2 in regions associated with smoking behaviors. Additional masking and imputation were conducted for sequenced subjects from the Collaborative Genetic Study of Nicotine Dependence and the Genetic Study of Nicotine Dependence in African Americans (N = 1,481 African Americans and N = 1,480 European Americans). Our results offer further evidence that concordance rate inflates accuracy estimates, particularly for rare and low frequency variants. For common variants, squared correlation, BEAGLE R2, IMPUTE2 INFO, and IQS produce similar assessments of imputation accuracy. However, for rare and low frequency variants, compared to IQS, the other statistics tend to be more liberal in their assessment of accuracy. IQS is important to consider when evaluating imputation accuracy, particularly for rare and low frequency variants.
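
A toy example may help show why concordance rate can look high for a rare variant even when the imputation is poor. The sketch below is not the authors' pipeline: the data are invented, the function names are hypothetical, and squared correlation is computed here between best-guess minor-allele counts, which is an assumption (the abstract does not say whether best-guess genotypes or dosages were used).

```python
import math

def concordance_rate(truth, imputed):
    """Fraction of individuals whose best-guess genotype equals the true genotype."""
    matches = sum(1 for t, g in zip(truth, imputed) if t == g)
    return matches / len(truth)

def squared_correlation(truth, imputed):
    """Squared Pearson correlation between true and imputed allele counts.

    Returns NaN when either vector has zero variance (correlation undefined).
    """
    n = len(truth)
    mean_t = sum(truth) / n
    mean_g = sum(imputed) / n
    sxy = sum((t - mean_t) * (g - mean_g) for t, g in zip(truth, imputed))
    sxx = sum((t - mean_t) ** 2 for t in truth)
    syy = sum((g - mean_g) ** 2 for g in imputed)
    if sxx == 0 or syy == 0:
        return math.nan
    return sxy ** 2 / (sxx * syy)

# Invented rare-variant data: 100 individuals, coded as minor-allele counts.
# Two true heterozygotes; the imputation finds one, misses one, and
# wrongly calls one non-carrier a heterozygote.
truth = [1, 1] + [0] * 98
imputed = [1, 0, 1] + [0] * 97

print("concordance rate:", concordance_rate(truth, imputed))
print("squared correlation:", round(squared_correlation(truth, imputed), 4))
```

In this invented case 98 of 100 genotypes agree, yet only one of the two carriers is recovered, so the squared correlation is far lower than the concordance rate. Most of the agreement comes from the many major-allele homozygotes, which is the kind of chance agreement that a kappa-based statistic such as IQS is designed to discount.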

    Scatterplots of squared correlation and IQS.

    Data for all 13,442 variants are displayed in panel A, while the results for variants with MAF > 5% (N = 6,480) are found in panel B. The line y = x is denoted in red.

    Calculating concordance (P<sub>0</sub>) and IQS from imputed genotype probabilities and actual genotypes.

    The table was created by summing over probabilities for all N individuals (n = 1 to N) in each cell with p<sub>ij_n</sub> representing the probability that the nth individual has the imputed genotype i and actual genotype j, where 1 corresponds to AA, 2 corresponds to AB, and 3 corresponds to BB. N<sub>1</sub> = number of individuals with AA actual genotype, N<sub>2</sub> = number of individuals with AB actual genotype, N<sub>3</sub> = number of individuals with BB actual genotype, and N = number of total individuals.
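
The table construction described in this caption can be sketched in code. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, genotypes are indexed 0, 1, 2 for AA, AB, BB (rather than 1 to 3), and IQS is computed with the standard Cohen's kappa form, (P0 - Pc) / (1 - Pc), where Pc is the chance agreement implied by the table margins.

```python
import math

def agreement_table(probs, truth):
    """Build the 3x3 table: rows = imputed genotype, columns = actual genotype.

    probs: per-individual tuples (P(AA), P(AB), P(BB)), each summing to 1.
    truth: per-individual actual genotype index (0 = AA, 1 = AB, 2 = BB).
    """
    table = [[0.0] * 3 for _ in range(3)]
    for p, j in zip(probs, truth):
        for i in range(3):
            table[i][j] += p[i]  # add this individual's probability to cell (i, j)
    return table

def concordance_and_iqs(probs, truth):
    """Return (P0, IQS) computed from imputed probabilities and actual genotypes."""
    table = agreement_table(probs, truth)
    n = float(len(truth))
    p0 = sum(table[i][i] for i in range(3)) / n  # observed agreement
    row_totals = [sum(table[i]) for i in range(3)]
    col_totals = [sum(table[i][j] for i in range(3)) for j in range(3)]  # N_1, N_2, N_3
    pc = sum(row_totals[i] * col_totals[i] for i in range(3)) / n ** 2  # chance agreement
    iqs = (p0 - pc) / (1.0 - pc) if pc < 1.0 else math.nan
    return p0, iqs

# Invented rare-variant example: 98 AA and 2 AB individuals, with every
# individual imputed as AA with probability 1.
truth = [0] * 98 + [1] * 2
all_major = [(1.0, 0.0, 0.0)] * 100
p0, iqs = concordance_and_iqs(all_major, truth)
print("P0 =", p0, "IQS =", iqs)
```

In this invented example the concordance is 0.98 while IQS is 0, because calling everyone AA agrees with the truth no more often than the table margins would predict by chance.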