86 research outputs found

    SNP imputation bias reduces effect size determination

    Imputation is a commonly used technique that exploits linkage disequilibrium to infer missing genotypes in genetic datasets, using a well characterized reference population. While there is agreement that the reference population has to match the ethnicity of the query dataset, it is common practice to use the same reference to impute genotypes for a wide variety of phenotypes. We hypothesized that using a reference composed of samples with a different phenotype than the query dataset would introduce imputation bias. To test this hypothesis we used GWAS datasets from amyotrophic lateral sclerosis, Parkinson disease, and Crohn disease. First, we masked and then performed imputation of 100 disease-associated markers and 100 non-associated markers from each study. Two references for imputation were used in parallel: one consisting of healthy controls and another consisting of patients with the same disease. We assessed the discordance (imprecision) and bias (inaccuracy) of imputation by comparing predicted genotypes to those assayed by SNP-chip. We also assessed the bias on the observed effect size when the predicted genotypes were used in a GWAS study. When healthy controls were used as reference for imputation, a significant bias was observed, particularly in the disease-associated markers. Using cases as reference significantly attenuated this bias. For nearly all markers, the direction of the bias favored the non-risk allele. In GWAS studies of the three diseases (with healthy reference controls from the 1000 Genomes as reference), the mean OR for disease-associated markers obtained by imputation was lower than that obtained using original assayed genotypes. We found that the bias is inherent to imputation as using different methods did not alter the results. In conclusion, imputation is a powerful method to predict genotypes and estimate genetic risk for GWAS. However, a careful choice of reference population is needed to minimize biases inherent to this approach.
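    The comparison of imputed and assayed genotypes described above can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not give formulas, so discordance is assumed here to be the fraction of samples whose imputed genotype differs from the assayed one, and bias is assumed to be the mean signed difference in risk-allele count. The function name and the 0/1/2 coding are hypothetical.

```python
def discordance_and_bias(assayed, imputed):
    """Compare imputed with assayed genotypes for a single marker.

    Both arguments are equal-length sequences of risk-allele counts (0, 1 or 2),
    one entry per sample. The definitions below are assumptions for illustration.
    """
    if len(assayed) != len(imputed) or len(assayed) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(assayed)
    # Discordance (imprecision): share of samples with a mismatching call.
    discordance = sum(a != b for a, b in zip(assayed, imputed)) / n
    # Bias (inaccuracy): mean signed error in risk-allele count.
    # A negative value means imputation under-calls the risk allele.
    bias = sum(b - a for a, b in zip(assayed, imputed)) / n
    return discordance, bias
```

    Under these assumed definitions, a negative bias corresponds to the direction reported in the abstract, where imputation favored the non-risk allele.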

    Genetic variation in the odorant receptors family 13 and the mhc loci influence mate selection in a multiple sclerosis dataset

    BACKGROUND: When selecting mates, many vertebrate species seek partners with major histocompatibility complex (MHC) genes different from their own, presumably in response to selective pressure against inbreeding and towards MHC diversity. Attempts at replication of these genetic results in human studies, however, have reached conflicting conclusions. RESULTS: Using a multi-analytical strategy, we report validated genome-wide relationships between genetic identity and human mate choice in 930 couples of European ancestry. We found significant similarity between spouses in the MHC at class I region in chromosome 6p21, and at the odorant receptor family 13 locus in chromosome 9. Conversely, there was significant dissimilarity in the MHC class II region, near the HLA-DQA1 and -DQB1 genes. We also found that genomic regions with significant similarity between spouses show excessive homozygosity in the general population (assessed in the HapMap CEU dataset). Conversely, loci that were significantly dissimilar among spouses were more likely to show excessive heterozygosity in the general population. CONCLUSIONS: This study highlights complex patterns of genomic identity among partners in unrelated couples, consistent with a multi-faceted role for genetic factors in mate choice behavior in human populations.

    Pathway and network-based analysis of genome-wide association studies in multiple sclerosis

    Genome-wide association studies (GWAS) testing several hundred thousand SNPs have been performed in multiple sclerosis (MS) and other complex diseases. Typically, the number of markers in which the evidence for association exceeds the genome-wide significance threshold is very small, and markers that do not exceed this threshold are generally neglected. Classical statistical analysis of these datasets in MS revealed genes with known immunological functions. However, many of the markers showing modest association may represent false negatives. We hypothesize that certain combinations of genes flagged by these markers can be identified if they belong to a common biological pathway. Here we conduct a pathway-oriented analysis of two GWAS in MS that takes into account all SNPs with nominal evidence of association (P < 0.05). Gene-wise P-values were superimposed on a human protein interaction network and searches were conducted to identify sub-networks containing a higher proportion of genes associated with MS than expected by chance. These sub-networks, and others generated at random as a control, were categorized for membership of biological pathways. GWAS from eight other diseases were analyzed to assess the specificity of the pathways identified. In the MS datasets, we identified sub-networks of genes from several immunological pathways including cell adhesion, communication and signaling. Remarkably, neural pathways, namely axon-guidance and synaptic potentiation, were also over-represented in MS. In addition to the immunological pathways previously identified, we report here for the first time the potential involvement of neural pathways in MS susceptibility

    Sequencing of the IL6 gene in a case–control study of cerebral palsy in children

    BACKGROUND: Cerebral palsy (CP) is a group of nonprogressive disorders of movement and posture caused by abnormal development of, or damage to, motor control centers of the brain. A single nucleotide polymorphism (SNP), rs1800795, in the promoter region of the interleukin-6 (IL6) gene has been implicated in the pathogenesis of CP by mediating IL-6 protein levels in amniotic fluid and cord plasma and within brain lesions. This SNP has been associated with other neurological, vascular, and malignant processes as well, often as part of a haplotype block. METHODS: To refine the regional genetic association with CP, we sequenced (Sanger) the IL6 gene and part of the promoter region in 250 infants with CP and 305 controls. RESULTS: We identified a haplotype of 7 SNPs that includes rs1800795. In a recessive model of inheritance, the variant haplotype conferred greater risk (OR = 4.3, CI = [2.0-10.1], p = 0.00007) than did the lone variant at rs1800795 (OR = 2.5, CI = [1.4-4.6], p = 0.002). The risk haplotype contains one SNP (rs2069845, OR = 2.3, CI = [1.2-4.3], p = 0.009) that disrupts a methylation site. CONCLUSIONS: The risk haplotype identified in this study overlaps with previously identified haplotypes that include additional promoter SNPs. A risk haplotype at the IL6 gene likely confers risk to CP, and perhaps other diseases, via a multi-factorial mechanism.
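    For readers unfamiliar with the OR and CI figures quoted above, the sketch below shows one common way to obtain an odds ratio and an approximate 95% confidence interval from a 2x2 case-control table (the Woolf log-based interval). The abstract does not say which method the authors used or give the underlying counts, so the function name, the method and the counts in the usage note are assumptions for illustration only.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-based) confidence interval.

    a: cases carrying the risk genotype      b: cases not carrying it
    c: controls carrying the risk genotype   d: controls not carrying it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

    With made-up counts, `odds_ratio_ci(20, 10, 10, 20)` gives an odds ratio of 4.0. In a recessive model, "carrying the risk genotype" would mean carrying two copies of the variant.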

    Genetic overlap between autoimmune diseases and non-Hodgkin lymphoma subtypes

    Epidemiologic studies show an increased risk of non-Hodgkin lymphoma (NHL) in patients with autoimmune disease (AD), due to a combination of shared environmental factors and/or genetic factors, or a causative cascade: chronic inflammation/antigen-stimulation in one disease leads to another. Here we assess shared genetic risk in genome-wide association studies (GWAS). We performed a secondary analysis of GWAS of NHL subtypes (chronic lymphocytic leukemia, diffuse large B-cell lymphoma, follicular lymphoma, and marginal zone lymphoma) and ADs (rheumatoid arthritis, systemic lupus erythematosus, and multiple sclerosis). Shared genetic risk was assessed by (a) description of regional genetic overlap, (b) polygenic risk score (PRS), (c) "diseasome", and (d) meta-analysis. Descriptive analysis revealed few shared genetic factors between each AD and each NHL subtype. The PRS of ADs were not increased in NHL patients (nor vice versa). In the diseasome, NHLs shared more genetic etiology with ADs than solid cancers (p = .0041). A meta-analysis (combining AD with NHL) implicated genes of apoptosis and telomere length. This GWAS-based analysis of four NHL subtypes and three ADs revealed few weakly-associated shared loci, explaining little total risk. This suggests common genetic variation, as assessed by GWAS in these sample sizes, may not be the primary explanation for the link between these ADs and NHLs.

    Single nucleotide polymorphism (SNP)-strings: an alternative method for assessing genetic associations.

    BACKGROUND: Genome-wide association studies (GWAS) identify disease-associations for single-nucleotide-polymorphisms (SNPs) from scattered genomic-locations. However, SNPs frequently reside on several different SNP-haplotypes, only some of which may be disease-associated. This circumstance lowers the observed odds-ratio for disease-association. METHODOLOGY/PRINCIPAL FINDINGS: Here we develop a method to identify the two SNP-haplotypes, which combine to produce each person's SNP-genotype over specified chromosomal segments. Two multiple sclerosis (MS)-associated genetic regions were modeled; DRB1 (a Class II molecule of the major histocompatibility complex) and MMEL1 (an endopeptidase that degrades both neuropeptides and β-amyloid). For each locus, we considered sets of eleven adjacent SNPs, surrounding the putative disease-associated gene and spanning ∼200 kb of DNA. The SNP-information was converted into an ordered-set of eleven-numbers (subject-vectors) based on whether a person had zero, one, or two copies of a particular SNP-variant at each sequential SNP-location. SNP-strings were defined as those ordered-combinations of eleven-numbers (0 or 1), representing a haplotype, two of which combined to form the observed subject-vector. Subject-vectors were resolved using probabilistic methods. In both regions, only a small number of SNP-strings were present. We compared our method to the SHAPEIT-2 phasing-algorithm. When the SNP-information spanning 200 kb was used, SHAPEIT-2 was inaccurate. When the SHAPEIT-2 window was increased to 2,000 kb, the concordance between the two methods, in both of these eleven-SNP regions, was over 99%, suggesting that, in these regions, both methods were quite accurate. Nevertheless, correspondence was not uniformly high over the entire DNA-span but, rather, was characterized by alternating peaks and valleys of concordance. Moreover, in the valleys of poor-correspondence, SHAPEIT-2 was also inconsistent with itself, suggesting that the SNP-string method is more accurate across the entire region. CONCLUSIONS/SIGNIFICANCE: Accurate haplotype identification will enhance the detection of genetic-associations. The SNP-string method provides a simple means to accomplish this and can be extended to cover larger genomic regions, thereby improving a GWAS's power, even for those published previously.
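    The relationship between a subject-vector and its two SNP-strings can be illustrated with a short sketch. It only enumerates the candidate pairs of 0/1 strings whose element-wise sum equals a given 0/1/2 subject-vector; the probabilistic step the authors use to choose among candidates is not described in the abstract and is not reproduced here. The function name is hypothetical.

```python
from itertools import product

def candidate_string_pairs(subject_vector):
    """Enumerate unordered pairs of 0/1 SNP-strings that sum to the subject-vector.

    subject_vector holds the copy number (0, 1 or 2) of the coded variant at
    each SNP position. Homozygous positions (0 or 2) are fixed in both strings;
    each heterozygous position (1) can be split between the strings either way.
    """
    if any(g not in (0, 1, 2) for g in subject_vector):
        raise ValueError("genotypes must be coded 0, 1 or 2")
    het_positions = [i for i, g in enumerate(subject_vector) if g == 1]
    # Start from the homozygous calls: genotype 2 -> 1 on both strings, else 0.
    base = [1 if g == 2 else 0 for g in subject_vector]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_positions)):
        first, second = list(base), list(base)
        for pos, bit in zip(het_positions, bits):
            first[pos] = bit
            second[pos] = 1 - bit
        # Sort so that (x, y) and (y, x) count as the same unordered pair.
        pairs.add(tuple(sorted((tuple(first), tuple(second)))))
    return sorted(pairs)
```

    A subject-vector with k heterozygous positions (k ≥ 1) has 2^(k-1) candidate pairs, which is why a resolving step such as the probabilistic one mentioned in the abstract is needed when several positions are heterozygous.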

    SNPs used for the SNP-String Analysis.

    ‡ Nucleotide base (of the pair at each SNP), which is coded as having 0, 1, or 2 copies, is shown in parentheses.
    * Distance from the center of each SNP-cluster. The DRB1 cluster spans 160.7 kb and includes a gap of 136.5 kb between SNPs (n7) and (n8). The MMEL1 cluster spans 228.6 kb and includes a gap of 137.0 kb between SNPs (n7) and (n8).

    SNP-Strings “identified” by the SNP-String Analysis*.

    * Only “identified” SNP-strings are displayed in the Table (see text). Other “novel” SNP-strings, which had a frequency that rounded to 0, are not included.

    Association of DRB1 SNP-Strings with MS (all participants)*.

    * Data is taken from the “complete” analysis (see text). Only selected SNP-strings are displayed.