527 research outputs found

    Testing gene-environment interactions in gene-based association studies

    Gene-based and single-nucleotide polymorphism (SNP) set association studies provide an important complement to SNP analysis. Kernel-based nonparametric regression has recently emerged as a powerful and flexible tool for this purpose. Our goal is to explore whether this approach can be extended to incorporate and test for interaction effects, especially for genes containing rare variant SNPs. Here, we construct nonparametric regression models that can be used to include a gene-environment interaction effect under the framework of the least-squares kernel machine and examine the performance of the proposed method on the Genetic Analysis Workshop 17 unrelated individuals data set. Two hundred simulated replicates were used to explore the power for detecting interaction. We demonstrate through a genome scan of the quantitative phenotype Q1 that the simulated gene-environment interaction effect in the data can be detected with reasonable power by using the least-squares kernel machine method

    SUP: an extension to SLINK to allow a larger number of marker loci to be simulated in pedigrees conditional on trait values

    BACKGROUND: With the recent advances in high-throughput genotyping technologies that allow for large-scale association mapping of human complex traits, promising statistical designs and methods have been emerging. Efficient simulation software is a key element for the evaluation of the properties of new statistical tests. SLINK is a flexible simulation tool that has been widely used to generate the segregation and recombination processes of markers linked to, and possibly associated with, a trait locus, conditional on trait values in arbitrary pedigrees. In practice, its most serious limitation is the small number of loci that can be simulated, since the complexity of the algorithm scales exponentially with this number. RESULTS: I describe the implementation of a two-step algorithm to be used in conjunction with SLINK to enable the simulation of a large number of marker loci linked to a trait locus and conditional on trait values in families, with the possibility for the loci to be in linkage disequilibrium. SLINK is used in the first step to simulate genotypes at the trait locus conditional on the observed trait values, and also to generate an indicator of the descent path of the simulated alleles. In the second step, marker alleles or haplotypes are generated in the founders, conditional on the trait locus genotypes simulated in the first step. Then the recombination process between the marker loci takes place conditionally on the descent path and on the trait locus genotypes. This two-step implementation is often computationally faster than other software designed to generate marker data linked to, and possibly associated with, a trait locus.
CONCLUSION: Because the proposed method uses SLINK to simulate the segregation process, it benefits from its flexibility: the trait may be qualitative with the possibility of defining different liability classes (which allows for the simulation of gene-environment interactions or even the simulation of multi-locus effects between unlinked susceptibility regions) or it may be quantitative and normally distributed. In particular, this implementation is the only one available that can generate a large number of marker loci conditional on the set of observed quantitative trait values in pedigrees
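The second step described above, recombination between marker loci conditional on the descent path, can be illustrated with a minimal Python sketch. It is not the SUP or SLINK implementation: the function name `transmit_gamete`, the convention that the trait locus coincides with one marker index, and the list of adjacent recombination fractions `thetas` are all assumptions made for illustration.

```python
import random

def transmit_gamete(hap_a, hap_b, origin_at_trait, trait_index, thetas, rng):
    """Build one transmitted gamete from a parent's two marker haplotypes.

    origin_at_trait: 0 if hap_a was transmitted at the trait locus, 1 if hap_b
                     (this plays the role of the descent-path indicator).
    trait_index:     marker position assumed to coincide with the trait locus.
    thetas[i]:       recombination fraction between markers i and i + 1.
    """
    n = len(hap_a)
    origin = [0] * n
    origin[trait_index] = origin_at_trait
    # Walk to the right of the trait locus, switching strand on each crossover.
    for i in range(trait_index + 1, n):
        crossover = rng.random() < thetas[i - 1]
        origin[i] = origin[i - 1] ^ crossover
    # Walk to the left of the trait locus in the same way.
    for i in range(trait_index - 1, -1, -1):
        crossover = rng.random() < thetas[i]
        origin[i] = origin[i + 1] ^ crossover
    strands = (hap_a, hap_b)
    return [strands[o][i] for i, o in enumerate(origin)]
```

Fixing the strand at the trait locus first and then letting crossovers propagate outward keeps the transmitted markers consistent with the genotype already simulated at the trait locus, which is the point of conditioning on the descent path.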

    Strong signature of natural selection within an FHIT intron implicated in prostate cancer risk

    Previously, a candidate gene linkage approach on brother pairs affected with prostate cancer identified a locus of prostate cancer susceptibility at D3S1234 within the fragile histidine triad gene (FHIT), a tumor suppressor that induces apoptosis. Subsequent association tests on 16 SNPs spanning approximately 381 kb surrounding D3S1234 in Americans of European descent revealed significant evidence of association for a single SNP within intron 5 of FHIT. In the current study, resequencing and genotyping within a 28.5 kb region surrounding this SNP further delineated the association with prostate cancer risk to a 15 kb region. Multiple SNPs in sequences under evolutionary constraint within intron 5 of FHIT defined several related haplotypes with an increased risk of prostate cancer in European-Americans. Strong associations were detected for a risk haplotype defined by SNPs 138543, 142413, and 152494 in all cases (Pearson's χ² = 12.34, df 1, P = 0.00045) and for the homozygous risk haplotype defined by SNPs 144716, 142413, and 148444 in cases that shared 2 alleles identical by descent with their affected brothers (Pearson's χ² = 11.50, df 1, P = 0.00070). In addition to highly conserved sequences encompassing SNPs 148444 and 152413, population studies revealed strong signatures of natural selection for a 1 kb window covering the SNP 144716 in two human populations, the European American (π = 0.0072, Tajima's D = 3.31, 14 SNPs) and the Japanese (π = 0.0049, Fay & Wu's H = 8.05, 14 SNPs), as well as in chimpanzees (Fay & Wu's H = 8.62, 12 SNPs). These results strongly support the involvement of the FHIT intronic region in an increased risk of prostate cancer. © 2008 Ding et al

    Use of principal components to aggregate rare variants in case-control and family-based association studies in the presence of multiple covariates

    Rare variants may help to explain some of the missing heritability of complex diseases. Technological advances in next-generation sequencing give us the opportunity to test this hypothesis. We propose two new methods (one for case-control studies and one for family-based studies) that combine aggregated rare variants and common variants located within a region through principal components analysis and allow for covariate adjustment. We analyzed 200 replicates consisting of 209 case subjects and 488 control subjects and compared the results to weight-based and step-up aggregation methods. The principal components and collapsing method showed an association between the gene FLT1 and the quantitative trait Q1 (P < 10⁻³⁰) in a fraction of the computation time of the other methods. The proposed family-based test has inconclusive results. The two methods provide a fast way to analyze simultaneously rare and common variants at the gene level while adjusting for covariates. However, further evaluation of the statistical efficiency of this approach is warranted

    Haplotype Reconstruction Error as a Classical Misclassification Problem: Introducing Sensitivity and Specificity as Error Measures

    BACKGROUND: Statistically reconstructing haplotypes from single nucleotide polymorphism (SNP) genotypes can lead to falsely classified haplotypes. This can be an issue when interpreting haplotype association results or when selecting subjects with certain haplotypes for subsequent functional studies. It was our aim to quantify haplotype reconstruction error and to provide tools for it. METHODS AND RESULTS: Using numerous simulation scenarios, we systematically investigated several error measures, including discrepancy, error rate, and R², and introduced sensitivity and specificity to this context. We exemplified several measures in the KORA study, a large population-based study from Southern Germany. We find that the specificity is slightly reduced only for common haplotypes, while the sensitivity is decreased for some, but not all, rare haplotypes. The overall error rate generally increased with increasing number of loci, increasing minor allele frequency of SNPs, decreasing correlation between the alleles and increasing ambiguity. CONCLUSIONS: We conclude that, with the analytical approach presented here, haplotype-specific error measures can be computed to gain insight into the haplotype uncertainty. This method indicates whether a specific risk haplotype can be expected to be reconstructed with little or with high misclassification, and thus the magnitude of expected bias in association estimates. We also illustrate that sensitivity and specificity separate two dimensions of the haplotype reconstruction error, which completely describe the misclassification matrix and thus provide the prerequisite for methods accounting for misclassification
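As a rough illustration of haplotype-specific sensitivity and specificity, the Python sketch below treats each chromosome as one observation and compares its true haplotype with the reconstructed one. The function name, the per-chromosome pairing, and the toy data are assumptions for illustration; the abstract does not spell out the exact estimator used in the study.

```python
def haplotype_sens_spec(true_haps, inferred_haps, target):
    """Sensitivity and specificity of reconstruction for one target haplotype.

    Sensitivity: share of chromosomes truly carrying `target` that were
                 reconstructed as `target`.
    Specificity: share of chromosomes not carrying `target` that were
                 reconstructed as something other than `target`.
    """
    tp = fn = tn = fp = 0
    for truth, inferred in zip(true_haps, inferred_haps):
        if truth == target:
            if inferred == target:
                tp += 1
            else:
                fn += 1
        else:
            if inferred == target:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Toy data (hypothetical): eight chromosomes, two of them misreconstructed.
true_haps = ["AC", "AC", "AT", "GT", "AC", "GT", "AT", "GC"]
inferred_haps = ["AC", "AT", "AT", "GT", "AC", "GT", "AC", "GC"]
sens, spec = haplotype_sens_spec(true_haps, inferred_haps, "AC")
```

For the toy data, two of the three true "AC" chromosomes are recovered (sensitivity 2/3) and four of the five other chromosomes are correctly not called "AC" (specificity 0.8), which shows how the two measures capture different directions of misclassification.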

    CAG repeat length in the androgen receptor gene is related to age at diagnosis of prostate cancer and response to endocrine therapy, but not to prostate cancer risk

    The length of the polymorphic CAG repeat in the N-terminal of the androgen receptor (AR) gene is inversely correlated with the transactivation function of the AR. Some studies have indicated that short CAG repeats are related to higher risk of prostate cancer. We performed a case–control study to investigate relations between CAG repeat length and prostate cancer risk, tumour grade, tumour stage, age at diagnosis and response to endocrine therapy. The study included 190 AR alleles from prostate cancer patients and 186 AR alleles from female control subjects. All were whites from southern Sweden. The frequency distribution of CAG repeat length was strikingly similar for cases and controls, and no significant correlation between CAG repeat length and prostate cancer risk was detected. However, for men with non-hereditary prostate cancer (n = 160), shorter CAG repeats correlated with younger age at diagnosis (P = 0.03). There were also trends toward associations between short CAG repeats and high grade (P = 0.07) and high stage (P = 0.07) disease. Furthermore, we found that patients with long CAG repeats responded better to endocrine therapy, even after adjusting for pretreatment level of prostate-specific antigen and tumour grade and stage (P = 0.05). We conclude that short CAG repeats in the AR gene correlate with young age at diagnosis of prostate cancer, but not with higher risk of the disease. Selection of patients with early onset prostate cancer in case–control studies could therefore lead to an over-estimation of the risk of prostate cancer for men with short CAG repeats. An association between long CAG repeats and good response to endocrine therapy was also found, but the mechanism and clinical relevance are unclear. © 1999 Cancer Research Campaign

    Meta-analysis on the effect of the N363S polymorphism of the glucocorticoid receptor gene (GRL) on human obesity

    BACKGROUND: Since both excess glucocorticoid secretion and central obesity are clinical features of some obese patients, it is worthwhile to study a possible association of glucocorticoid receptor gene (GRL) variants with obesity. Previous studies have linked the N363S variant of the GRL gene to increased glucocorticoid effects such as higher body fat, a lower lean-body mass and a larger insulin response to dexamethasone. However, contradictory findings have also been reported about the association between this variant and obesity phenotypes. Individual studies may lack statistical power, which may result in disparate results. This limitation can be overcome using meta-analytic techniques. METHODS: We conducted a meta-analysis to assess the association between the N363S polymorphism of the GRL gene and obesity risk. In addition to published research, we also included our own unpublished data (three novel case-control studies) in the meta-analysis. The new case-control studies were conducted in German and Spanish children, adolescents and adults (total number of subjects: 1,117). Genotype was assessed by PCR-RFLP (Tsp509I). The final formal meta-analysis included a total number of 5,909 individuals. RESULTS: The meta-analysis revealed a higher body mass index (BMI) with an overall estimation of +0.18 kg/m² (95% CI: +0.004 to +0.35) for homo-/heterozygous carriers of the 363S allele of the GRL gene in comparison to non-carriers. Moreover, differences in pooled BMI were statistically significant and positive when considering one-group studies from the literature in which participants had a BMI below 27 kg/m² (+0.41 kg/m² [95% CI +0.17 to +0.66]), but the differences in BMI were negative when only our novel data from younger (aged under 45) and normal weight subjects were pooled together (-0.50 kg/m² [95% CI -0.84 to -0.17]).
The overall risk for obesity for homo-/heterozygous carriers of the 363S allele was not statistically significant in the meta-analysis (pooled OR = 1.02; 95% CI: 0.56–1.87). CONCLUSION: Although certain genotypic effects could be population-specific, we conclude that there is no compelling evidence that the N363S polymorphism of the GRL gene is associated with either average BMI or obesity risk
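Pooled estimates with 95% confidence intervals of the kind reported above are commonly obtained by inverse-variance weighting. The Python sketch below shows a fixed-effect version only; the abstract does not state which pooling model the authors used, and the function names and any per-study inputs passed to them are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval

def se_from_ci(lower, upper):
    """Recover a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * Z_95)

def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance (fixed-effect) pooled estimate with its 95% CI."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled - Z_95 * pooled_se, pooled + Z_95 * pooled_se
```

For instance, `se_from_ci(0.004, 0.35)` gives a standard error of about 0.088 kg/m² for the overall BMI difference quoted in the abstract. An interval whose lower bound sits this close to zero is consistent with the authors' cautious conclusion.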

    Robust Association Tests Under Different Genetic Models, Allowing for Binary or Quantitative Traits and Covariates

    The association of genetic variants with outcomes is usually assessed under an additive model, for example by the trend test. However, misspecification of the genetic model will lead to a reduction in power. More robust tests for association might therefore be preferred. A useful approach is to consider the maximum of the three test statistics under additive, dominant and recessive models (MAX3). The p-value however has to be adjusted to maintain the type I error rate. Previous studies and software on robust association tests have focused on binary traits without covariates. In this study we developed an analytic approach to robust association tests using MAX3, allowing for quantitative or binary traits as well as covariates. The p-values from our theoretical calculations match very well with those from a bootstrap resampling procedure. The methodology is implemented in the R package RobustSNP which is able to handle both small-scale studies and GWAS. The package and documentation are available at http://sites.google.com/site/honcheongso/software/robustsnp
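The MAX3 idea can be sketched in a few lines of Python. This is not the RobustSNP package or its analytic p-value: the sketch uses a score-type statistic, sqrt(n) times the correlation between the coded genotype and the trait, ignores covariates, and adjusts the p-value by permutation rather than by the theoretical calculation or the bootstrap mentioned in the abstract. All function names are illustrative assumptions.

```python
import math
import random

# Score assigned to 0, 1 or 2 copies of the minor allele under each model.
CODINGS = {
    "additive": (0, 1, 2),
    "dominant": (0, 1, 1),
    "recessive": (0, 0, 1),
}

def score_z(x, y):
    """sqrt(n) * correlation(x, y); returns 0.0 if x or y is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return math.sqrt(n) * sxy / math.sqrt(sxx * syy)

def max3(genotypes, trait):
    """Largest absolute statistic over the three models, and that model's name."""
    stats = {
        model: score_z([coding[g] for g in genotypes], trait)
        for model, coding in CODINGS.items()
    }
    best = max(stats, key=lambda m: abs(stats[m]))
    return abs(stats[best]), best

def max3_permutation_p(genotypes, trait, n_perm=2000, seed=0):
    """Permutation p-value for MAX3, which accounts for taking the maximum."""
    rng = random.Random(seed)
    observed, _ = max3(genotypes, trait)
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max3(genotypes, shuffled)[0] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Because the maximum of three correlated statistics is compared with its own permutation distribution, the type I error rate is maintained without a separate multiplicity correction. The same correlation-based statistic accepts either a 0/1 case-control indicator or a quantitative trait.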