14,748 research outputs found

    A robust clustering algorithm for identifying problematic samples in genome-wide association studies

    Summary: High-throughput genotyping arrays provide an efficient way to survey single nucleotide polymorphisms (SNPs) across the genome in large numbers of individuals. Downstream analysis of the data, for example in genome-wide association studies (GWAS), often involves statistical models of genotype frequencies across individuals. The complexities of the sample collection process and the potential for errors in the experimental assay can lead to biases and artefacts in an individual's inferred genotypes. Rather than attempting to model these complications, it has become a standard practice to remove individuals whose genome-wide data differ from the sample at large. Here we describe a simple, but robust, statistical algorithm to identify samples with atypical summaries of genome-wide variation. Its use as a semi-automated quality control tool is demonstrated using several summary statistics, selected to identify different potential problems, and it is applied to two different genotyping platforms and sample collections

    Genome-Wide Multiple Sclerosis Association Data and Coagulation

    The emerging concept of crosstalk between hemostasis, inflammation, and the immune system has prompted recent work on the coagulation cascade in multiple sclerosis (MS). Studies of MS pathology have implicated several coagulation factors from the earliest stages of disease pathophysiology: fibrin deposition accompanying breakdown of the blood-brain barrier, and coagulation factors within active plaques, may exert a pathogenic role, especially through the innate immune system. Studies of circulating coagulation factors showed a complex imbalance involving several components of the hemostasis cascade (thrombin, factor X, factor XII). To analyze the role of the coagulation process in connection with other pathogenic pathways, we implemented a systematic matching of genome-wide association study (GWAS) data with an informative and unbiased network of coagulation pathways. Using MetaCore (version 6.35 build 69300, 2018), we analyzed the connectivity (i.e., direct and indirect interactions between two networks) between the network of the coagulation process and the network obtained by feeding the MS GWAS data into MetaCore. The two networks showed remarkable over-connectivity: 958 connections vs. 561 expected by chance; z-score = 17.39; p-value < 0.00001. Moreover, the genes coding for cluster of differentiation 40 (CD40) and plasminogen activator, urokinase (PLAU) were shared by both networks, pointing to an integral interplay between the coagulation cascade and the main pathogenic immune effectors. In fact, the CD40 pathway is especially operative in B cells, which are currently a major therapeutic target in MS. The potential interaction of PLAU with a signal of paramount importance for B cell pathogenicity, such as CD40, suggests new lines of research and paves the way toward new therapeutic targets
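The over-connectivity figures quoted in this abstract can be related to one another with the generic z-score definition, z = (observed - expected) / sd. The sketch below is only an arithmetic illustration: the abstract does not report the standard deviation of the null distribution or describe how MetaCore computes the score, so the function names are hypothetical and the standard deviation is simply backed out from the three reported numbers.

```python
def z_score(observed: float, expected: float, sd: float) -> float:
    """Generic z-score: distance of the observed count from the expected count, in units of sd."""
    return (observed - expected) / sd


def implied_sd(observed: float, expected: float, z: float) -> float:
    """Standard deviation implied by a reported z-score (assumes the generic definition above)."""
    return (observed - expected) / z


# Values reported in the abstract: 958 connections observed, 561 expected, z = 17.39.
sd = implied_sd(958, 561, 17.39)      # the null-model sd is NOT reported; this is derived
z = z_score(958, 561, sd)             # recovers the reported z-score by construction
print(f"implied sd = {sd:.2f}, z = {z:.2f}")
```

Under that assumption, an excess of 397 connections over the chance expectation corresponds to a null standard deviation of roughly 23 connections.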

    A “Candidate-Interactome” Aggregate Analysis of Genome-Wide Association Data in Multiple Sclerosis

    Though difficult, the study of gene-environment interactions in multifactorial diseases is crucial for interpreting the relevance of non-heritable factors and helps avoid overlooking genetic associations with small but measurable effects. We propose a "candidate interactome" (i.e. a group of genes whose products are known to physically interact with environmental factors that may be relevant for disease pathogenesis) analysis of genome-wide association data in multiple sclerosis. We looked for statistical enrichment of associations among interactomes that, at the current state of knowledge, may be representative of gene-environment interactions of potential, uncertain or unlikely relevance for multiple sclerosis pathogenesis: Epstein-Barr virus, human immunodeficiency virus, hepatitis B virus, hepatitis C virus, cytomegalovirus, HHV8-Kaposi sarcoma, H1N1-influenza, JC virus, human innate immunity interactome for type I interferon, autoimmune regulator, vitamin D receptor, aryl hydrocarbon receptor and a panel of proteins targeted by 70 innate immune-modulating viral open reading frames from 30 viral species. Interactomes were either obtained from the literature or manually curated. The P values of all single nucleotide polymorphisms mapping to a given interactome were obtained from the most recent genome-wide association study of the International Multiple Sclerosis Genetics Consortium and the Wellcome Trust Case Control Consortium 2. The interaction between genotype and Epstein-Barr virus emerges as relevant for multiple sclerosis etiology. However, in line with recent data on the coexistence of common and unique strategies used by viruses to perturb the human molecular system, other viruses also have a similar potential, though they are probably less relevant in epidemiological terms

    Genetic risk and a primary role for cell-mediated immune mechanisms in multiple sclerosis.

    Multiple sclerosis is a common disease of the central nervous system in which the interplay between inflammatory and neurodegenerative processes typically results in intermittent neurological disturbance followed by progressive accumulation of disability. Epidemiological studies have shown that genetic factors are primarily responsible for the substantially increased frequency of the disease seen in the relatives of affected individuals, and systematic attempts to identify linkage in multiplex families have confirmed that variation within the major histocompatibility complex (MHC) exerts the greatest individual effect on risk. Modestly powered genome-wide association studies (GWAS) have enabled more than 20 additional risk loci to be identified and have shown that multiple variants exerting modest individual effects have a key role in disease susceptibility. Most of the genetic architecture underlying susceptibility to the disease remains to be defined and is anticipated to require the analysis of sample sizes that are beyond the numbers currently available to individual research groups. In a collaborative GWAS involving 9,772 cases of European descent collected by 23 research groups working in 15 different countries, we have replicated almost all of the previously suggested associations and identified at least a further 29 novel susceptibility loci. Within the MHC we have refined the identity of the HLA-DRB1 risk alleles and confirmed that variation in the HLA-A gene underlies the independent protective effect attributable to the class I region. Immunologically relevant genes are significantly overrepresented among those mapping close to the identified loci and particularly implicate T-helper-cell differentiation in the pathogenesis of multiple sclerosis

    C6orf10 low-frequency and rare variants in Italian multiple sclerosis patients

    In light of the complex nature of multiple sclerosis (MS) and the recently estimated contribution of low-frequency variants to disease, decoding its genetic risk components requires novel variant prioritization strategies. By reviewing MS genome-wide association studies (GWAS), we selected 107 candidate loci marked by intragenic single nucleotide polymorphisms (SNPs) with a remarkable association (p-value ≤ 5 × 10^-6). A whole exome sequencing (WES)-based pilot study of SNPs with minor allele frequency (MAF) ≤ 0.04, conducted in three Italian families, revealed 15 exonic low-frequency SNPs with affected parent-child transmission. These variants were detected in 65/120 unrelated Italian MS patients, also in combination (22 patients). Compared with databases (controls gnomAD, dbSNP150, ExAC, Tuscany-1000 Genome), the allelic frequencies of C6orf10 rs16870005 and IL2RA rs12722600 were significantly higher (i.e., controls gnomAD, p = 9.89 × 10^-7 and p < 1 × 10^-20). TET2 rs61744960 and TRAF3 rs138943371 frequencies were also significantly higher, except in Tuscany-1000 Genome. Interestingly, the association of C6orf10 rs16870005 (Ala431Thr) with MS did not depend on its linkage disequilibrium with the HLA-DRB1 locus. Sequencing of the C6orf10 3′ region in the MS cohort revealed 14 rare mutations (10 not previously reported). Four variants were null, and significantly more frequent than in the databases. Further, the C6orf10 rare variants were observed in combinations, both intra-locus and with other low-frequency SNPs. The C6orf10 Ser389Xfr was found homozygous in a patient with early-onset MS. Taking into account the potentially functional impact of the identified exonic variants, their expression in combination at the protein level could provide functional insights into the heterogeneous pathogenetic mechanisms contributing to MS

    Association of Genetic Markers with CSF Oligoclonal Bands in Multiple Sclerosis Patients

    Objective: To explore the association between genetic markers and oligoclonal bands (OCB) in the cerebrospinal fluid (CSF) of Italian multiple sclerosis patients. Methods: We genotyped 1115 Italian patients for HLA-DRB1*15 and HLA-A*02. In a subset of 925 patients we tested association with 52 non-HLA SNPs associated with MS susceptibility and calculated a weighted Genetic Risk Score. Finally, we performed a genome-wide association study (GWAS) with OCB status on a subset of 562 patients. The best associated SNPs of the Italian GWAS were replicated in silico in Scandinavian and Belgian populations, and meta-analyzed. Results: HLA-DRB1*15 is associated with OCB+: p = 0.03, Odds Ratio (OR) = 1.6, 95% Confidence Limits (CL) = 1.1-2.4. None of the 52 non-HLA MS susceptibility loci was associated with OCB, except one SNP (rs2546890) near the IL12B gene (OR: 1.45; 1.09-1.92). The mean weighted Genetic Risk Score was significantly (p = 0.0008) higher in OCB+ (7.668) than in OCB- (7.412) patients. After meta-analysis of the three datasets (Italian, Scandinavian and Belgian) for the best associated signals resulting from the Italian GWAS, the strongest signal was a SNP (rs9320598) on chromosome 6q (p = 9.4 × 10^-7) outside the HLA region (65 Mb). Discussion: Genetic factors predispose to the development of OCB
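The abstract mentions a weighted Genetic Risk Score without stating how it was weighted. A common convention, assumed here and not confirmed by the source, is to sum each patient's risk-allele counts (0, 1 or 2 per SNP) weighted by the natural log of that SNP's published odds ratio. The following is a minimal sketch under that assumption; the function name and the three-SNP example values are hypothetical and are not data from the study.

```python
import math
from typing import Sequence


def weighted_grs(risk_allele_counts: Sequence[int], odds_ratios: Sequence[float]) -> float:
    """Sum of risk-allele counts weighted by ln(odds ratio).

    Assumption: log-OR weighting, a common convention; the abstract does not
    specify the weights actually used.
    """
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("need one odds ratio per SNP")
    for count in risk_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("risk-allele counts must be 0, 1 or 2")
    return sum(c * math.log(o) for c, o in zip(risk_allele_counts, odds_ratios))


# Hypothetical patient genotyped at three SNPs (illustrative values only).
counts = [2, 1, 0]
ors = [1.45, 1.20, 1.10]
print(round(weighted_grs(counts, ors), 3))
```

With this weighting, a SNP with a larger odds ratio contributes more to the score per risk allele, and a patient carrying no risk alleles scores zero.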