323 research outputs found

    QTLRel: an R Package for Genome-wide Association Studies in which Relatedness is a Concern

    BACKGROUND Existing software for quantitative trait mapping is either not able to model polygenic variation or does not allow incorporation of more than one genetic variance component. Improperly modeling the genetic relatedness among subjects can result in excessive false positives. We have developed an R package, QTLRel, to enable more flexible modeling of genetic relatedness as well as covariates and non-genetic variance components. RESULTS We have successfully used the package to analyze many datasets, including F₃₄ body weight data that contains 688 individuals genotyped at 3105 SNP markers and identified 11 QTL. It took 295 seconds to estimate variance components and 70 seconds to perform the genome scan on a Linux machine equipped with a 2.40GHz Intel(R) Core(TM)2 Quad CPU. CONCLUSIONS QTLRel provides a toolkit for genome-wide association studies that is capable of calculating genetic incidence matrices from pedigrees, estimating variance components, performing genome scans, incorporating interactive covariates and genetic and non-genetic variance components, as well as other functionalities such as multiple-QTL mapping and genome-wide epistasis. This project was supported by NIH grants R01DA021336, R01MH079103 and R21DA024845.

    QTLRel: An R package for genome-wide association studies in which relatedness is a concern

    Abstract Background: Existing software for quantitative trait mapping is either not able to model polygenic variation or does not allow incorporation of more than one genetic variance component. Improperly modeling the genetic relatedness among subjects can result in excessive false positives. We have developed an R package, QTLRel, to enable more flexible modeling of genetic relatedness as well as covariates and non-genetic variance components. Results: We have successfully used the package to analyze many datasets, including F₃₄ body weight data that contains 688 individuals genotyped at 3105 SNP markers and identified 11 QTL. It took 295 seconds to estimate variance components and 70 seconds to perform the genome scan on a Linux machine equipped with a 2.40GHz Intel(R) Core(TM)2 Quad CPU. Conclusions: QTLRel provides a toolkit for genome-wide association studies that is capable of calculating genetic incidence matrices from pedigrees, estimating variance components, performing genome scans, incorporating interactive covariates and genetic and non-genetic variance components, as well as other functionalities such as multiple-QTL mapping and genome-wide epistasis.

    Promoter Variant of PIK3C3 Is Associated with Autoimmunity against Ro and Sm Epitopes in African-American Lupus Patients

    The PIK3C3 locus was implicated in a case-case genome-wide association study of systemic lupus erythematosus (SLE) which we had performed to detect genes associated with autoantibodies and serum interferon-alpha (IFN-α). Herein, we examine a PIK3C3 promoter variant (rs3813065/-442 C/T) in an independent multiancestral cohort of 478 SLE cases and 522 controls. rs3813065 C was strongly associated with the simultaneous presence of both anti-Ro and anti-Sm antibodies in African-American patients [OR = 2.24 (1.34–3.73), P = 2.0 × 10⁻³]. This autoantibody profile was associated with higher serum IFN-α (P = 7.6 × 10⁻⁶). In the HapMap Yoruba population, rs3813065 was associated with differential expression of ERAP2 (P = 2.0 × 10⁻⁵), which encodes an enzyme involved in MHC class I peptide processing. Thus, rs3813065 C is associated with a particular autoantibody profile and altered expression of an MHC peptide processing enzyme, suggesting that this variant modulates serologic autoimmunity in African-American SLE patients.

    Trait-stratified genome-wide association study identifies novel and diverse genetic associations with serologic and cytokine phenotypes in systemic lupus erythematosus

    INTRODUCTION: Systemic lupus erythematosus (SLE) is a highly heterogeneous disorder, characterized by differences in autoantibody profile, serum cytokines, and clinical manifestations. SLE-associated autoantibodies and high serum interferon alpha (IFN-α) are important heritable phenotypes in SLE which are correlated with each other, and play a role in disease pathogenesis. These two heritable risk factors are shared between ancestral backgrounds. The aim of the study was to detect genetic factors associated with autoantibody profiles and serum IFN-α in SLE. METHODS: We undertook a case-case genome-wide association study of SLE patients stratified by ancestry and extremes of phenotype in serology and serum IFN-α. Single nucleotide polymorphisms (SNPs) in seven loci were selected for follow-up in a large independent cohort of 538 SLE patients and 522 controls using a multi-step screening approach based on novel metrics and expert database review. The seven loci were: leucine-rich repeat containing 20 (LRRC20); protein phosphatase 1 H (PPM1H); lysophosphatidic acid receptor 1 (LPAR1); ankyrin repeat and sterile alpha motif domain 1A (ANKS1A); protein tyrosine phosphatase, receptor type M (PTPRM); ephrin A5 (EFNA5); and V-set and immunoglobulin domain containing 2 (VSIG2). RESULTS: SNPs in the LRRC20, PPM1H, LPAR1, ANKS1A, and VSIG2 loci each demonstrated strong association with a particular serologic profile (all odds ratios > 2.2 and P < 3.5 × 10⁻⁴). Each of these serologic profiles was associated with increased serum IFN-α. SNPs in both PTPRM and LRRC20 were associated with increased serum IFN-α independent of serologic profile (P = 2.2 × 10⁻⁶ and P = 2.6 × 10⁻³, respectively). None of the SNPs were strongly associated with SLE in case-control analysis, suggesting that the major impact of these variants will be upon subphenotypes in SLE. CONCLUSIONS: This study demonstrates the power of using serologic and cytokine subphenotypes to elucidate genetic factors involved in complex autoimmune disease. The distinct associations observed emphasize the heterogeneity of molecular pathogenesis in SLE, and the need for stratification by subphenotypes in genetic studies. We hypothesize that these genetic variants play a role in disease manifestations and severity in SLE.

    Two-stage analyses of sequence variants in association with quantitative traits

    We propose a two-stage design for the analysis of sequence variants in which a proportion of genes that show some evidence of association are identified initially and then followed up in an independent data set. We compare two different approaches. In both approaches the same summary measure (total number of minor alleles) is used for each gene in the initial analysis. In the first (simple) approach the same summary measure is used in the analysis of the independent data set. In the second (alternative) approach a more specific hypothesis is formed for the second stage; the summary measure used is the count of minor alleles in only those variants that in the initial data showed the same direction of association as was seen overall. We applied the methods to the simulated quantitative traits of Genetic Analysis Workshop 17, blind to the simulation model, and then evaluated their performance once the underlying model was known. Performance was similar for most genes, but the simple strategy considerably out-performed the alternative strategy for one gene, where most of the effect was due to very rare variants; this suggests that the alternative approach would not be advisable when the effect is seen in very rare variants. Further simulations are needed to investigate the potential superior power of the alternative method when some variants within a gene have opposing effects. Overall, the power to detect associations was low; this was also true when using a more powerful joint analysis that combined the two stages of the study
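The two summary measures described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: genotypes are coded as minor-allele counts (0, 1 or 2) in an individuals-by-variants table, and the "direction of association" is taken to be the sign of the sample covariance between allele count and trait. The abstract does not specify how direction was determined, and all function names here are hypothetical.

```python
def covariance(xs, ys):
    """Sample covariance of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def total_minor_alleles(genotypes):
    """Simple summary: per-individual minor-allele count over all variants in a gene."""
    return [sum(row) for row in genotypes]


def concordant_variants(genotypes, trait):
    """Indices of variants whose stage-1 direction matches the gene-level direction.

    Direction is the sign of the covariance with the trait (an assumption).
    """
    overall = sign(covariance(total_minor_alleles(genotypes), trait))
    n_variants = len(genotypes[0])
    return [
        j for j in range(n_variants)
        if sign(covariance([row[j] for row in genotypes], trait)) == overall
    ]


def filtered_minor_alleles(genotypes, keep):
    """Alternative summary: count minor alleles only in the retained variants."""
    return [sum(row[j] for j in keep) for row in genotypes]


# Toy data, invented for illustration only (rows = individuals, columns = variants).
stage1_genotypes = [[0, 1, 0], [1, 0, 0], [2, 0, 1], [1, 1, 0]]
stage1_trait = [1.0, 2.0, 4.0, 2.5]
stage2_genotypes = [[1, 1, 1], [0, 2, 0]]

keep = concordant_variants(stage1_genotypes, stage1_trait)
simple_summary = total_minor_alleles(stage2_genotypes)
alternative_summary = filtered_minor_alleles(stage2_genotypes, keep)
```

In the simple approach the stage-2 test uses `simple_summary`. In the alternative approach it uses `alternative_summary`, which drops variants whose stage-1 direction disagreed with the gene-level direction.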

    Calibrating the Performance of SNP Arrays for Whole-Genome Association Studies

    To facilitate whole-genome association studies (WGAS), several high-density SNP genotyping arrays have been developed. Genetic coverage and statistical power are the primary benchmark metrics in evaluating the performance of SNP arrays. Ideally, such evaluations would be done on a SNP set and a cohort of individuals that are both independently sampled from the original SNPs and individuals used in developing the arrays. Without utilization of an independent test set, previous estimates of genetic coverage and statistical power may be subject to an overfitting bias. Additionally, the SNP arrays' statistical power in WGAS has not been systematically assessed on real traits. One robust setting for doing so is to evaluate statistical power on thousands of traits measured from a single set of individuals. In this study, 359 newly sampled Americans of European descent were genotyped using both Affymetrix 500K (Affx500K) and Illumina 650Y (Ilmn650K) SNP arrays. From these data, we were able to obtain estimates of genetic coverage, which are robust to overfitting, by constructing an independent test set from among these genotypes and individuals. Furthermore, we collected liver tissue RNA from the participants and profiled these samples on a comprehensive gene expression microarray. The RNA levels were used as a large-scale set of quantitative traits to calibrate the relative statistical power of the commercial arrays. Our genetic coverage estimates are lower than previous reports, providing evidence that previous estimates may be inflated due to overfitting. The Ilmn650K platform showed reasonable power (50% or greater) to detect SNPs associated with quantitative traits when the signal-to-noise ratio (SNR) is greater than or equal to 0.5 and the causal SNP's minor allele frequency (MAF) is greater than or equal to 20% (N = 359). In testing each of the more than 40,000 gene expression traits for association to each of the SNPs on the Ilmn650K and Affx500K arrays, we found that the Ilmn650K yielded 15% more discoveries than the Affx500K at the same false discovery rate (FDR) level.

    Genetic Heterogeneity in Colorectal Cancer Associations in Americans of African vs. European Descent

    Genome-wide association studies of colorectal cancer (CRC) have identified risk variants in 10 genomic regions. None of these studies included African Americans, who have the highest incidence and mortality from CRC in the US. For the 10 genomic regions, we performed an association study of Americans of African and European descent.

    Effectiveness of strategies to increase the validity of findings from association studies: size vs. replication

    Abstract Background: The capacity of multiple comparisons to produce false positive findings in genetic association studies is abundantly clear. To address this issue, the concept of false positive report probability (FPRP) measures "the probability of no true association between a genetic variant and disease given a statistically significant finding". This concept involves the notion of prior probability of an association between a genetic variant and a disease, making it difficult to achieve acceptable levels for the FPRP when the prior probability is low. Increasing the sample size does little to improve the situation. Methods: To further clarify this problem, the concept of true report probability (TRP) is introduced by analogy to the positive predictive value (PPV) of diagnostic testing. The approach is extended to consider the effects of replication studies. The formula for the TRP after k replication studies is mathematically derived and shown to depend only on prior probability, alpha, power, and number of replication studies. Results: Case-control association studies are used to illustrate the TRP concept for replication strategies. Based on power considerations, a relationship is derived between TRP after k replication studies and sample size of each individual study. That relationship enables study designers to optimize study plans. Further, it is demonstrated that replication is efficient in increasing the TRP even in the case of low prior probability of an association and without requiring very large sample sizes for each individual study. Conclusions: True report probability is a comprehensive and straightforward concept for assessing the validity of positive statistical testing results in association studies. By its extension to replication strategies it can be demonstrated in a transparent manner that replication is highly effective in distinguishing spurious from true associations. Based on the generalized TRP method for replication designs, optimal research strategy and sample size planning become possible.
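The dependence of the TRP on prior probability, alpha, power, and number of replications can be illustrated with a short Python sketch. The abstract does not give the paper's formula, so what follows is the standard Bayesian positive-predictive-value form, under the assumption that the initial study and all k replications are independent, share the same alpha and power, and are all statistically significant. The function name and the example values are hypothetical.

```python
def true_report_probability(prior, alpha, power, k=0):
    """Probability that an association is real given k + 1 significant studies.

    Assumes independent studies with identical alpha and power (a simplifying
    assumption, not necessarily the paper's exact derivation).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    if not (0.0 < alpha < 1.0 and 0.0 < power <= 1.0):
        raise ValueError("alpha must be in (0, 1) and power in (0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    n_studies = k + 1
    true_positive = prior * power ** n_studies
    false_positive = (1.0 - prior) * alpha ** n_studies
    return true_positive / (true_positive + false_positive)


# Illustrative values only: a low prior of 1%, alpha = 0.05, power = 0.8.
trp_single = true_report_probability(0.01, 0.05, 0.8)        # no replication
trp_replicated = true_report_probability(0.01, 0.05, 0.8, k=1)  # one replication
fprp_single = 1.0 - trp_single  # the FPRP is the complement of the TRP
```

Under these assumptions the false-positive term shrinks by a factor of alpha with each replication while the true-positive term shrinks only by a factor of power, which is why a single replication raises the TRP sharply even when the prior is low.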

    The systemic lupus erythematosus IRF5 risk haplotype is associated with systemic sclerosis

    Systemic sclerosis (SSc) is a fibrotic autoimmune disease in which the genetic component plays an important role. One of the strongest SSc association signals outside the human leukocyte antigen (HLA) region corresponds to interferon (IFN) regulatory factor 5 (IRF5), a major regulator of the type I IFN pathway. In this study we aimed to evaluate whether three different haplotypic blocks within this locus, which have been shown to alter the protein function influencing systemic lupus erythematosus (SLE) susceptibility, are involved in SSc susceptibility and clinical phenotypes. For that purpose, we genotyped one representative single-nucleotide polymorphism (SNP) of each block (rs10488631, rs2004640, and rs4728142) in a total of 3,361 SSc patients and 4,012 unaffected controls of Caucasian origin from Spain, Germany, The Netherlands, Italy and the United Kingdom. A meta-analysis of the allele frequencies was performed to analyse the overall effect of these IRF5 genetic variants on SSc. Allelic combination and dependency tests were also carried out. The three SNPs showed strong associations with the global disease (rs4728142: P = 1.34×10⁻⁸, OR = 1.22, 95% CI = 1.14–1.30; rs2004640: P = 4.60×10⁻⁷, OR = 0.84, 95% CI = 0.78–0.90; rs10488631: P = 7.53×10⁻²⁰, OR = 1.63, 95% CI = 1.47–1.81). However, the association of rs2004640 with SSc was not independent of rs4728142 (conditioned P = 0.598). The haplotype containing the risk alleles (rs4728142*A-rs2004640*T-rs10488631*C: P = 9.04×10⁻²², OR = 1.75, 95% CI = 1.56–1.97) better explained the observed association (likelihood P-value = 1.48×10⁻⁴), suggesting an additive effect of the three haplotypic blocks. No statistical significance was observed in the comparisons amongst SSc patients with and without the main clinical characteristics. Our data clearly indicate that the SLE risk haplotype also influences SSc predisposition, and that this association is not sub-phenotype-specific.