61 research outputs found
VSEAMS: a pipeline for variant set enrichment analysis using summary GWAS data identifies IKZF3, BATF and ESRRA as key transcription factors in type 1 diabetes.
MOTIVATION: Genome-wide association studies (GWAS) have identified many loci implicated in disease susceptibility. Integration of GWAS summary statistics (P-values) and functional genomic datasets should help to elucidate mechanisms. RESULTS: We extended a non-parametric SNP set enrichment method to test for enrichment of GWAS signals in functionally defined loci to a situation where only GWAS P-values are available. The approach is implemented in VSEAMS, a freely available software pipeline. We use VSEAMS to identify enrichment of type 1 diabetes (T1D) GWAS associations near genes that are targets for the transcription factors IKZF3, BATF and ESRRA. IKZF3 lies in a known T1D susceptibility region, while BATF and ESRRA overlap other immune disease susceptibility regions, validating our approach and suggesting novel avenues of research for T1D. AVAILABILITY AND IMPLEMENTATION: VSEAMS is available for download (http://github.com/ollyburren/vseams). This work was funded by the JDRF (9-2011-253), the Wellcome Trust (091157) and the National Institute for Health Research Cambridge Biomedical Research Centre. The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 241447 (NAIMIT). The Cambridge Institute for Medical Research is in receipt of a Wellcome Trust Strategic Award (100140). C.W. and H.G. are supported by the Wellcome Trust (089989). ImmunoBase.org is supported by Eli Lilly and Company. This is the final published version, also available from OUP at http://bioinformatics.oxfordjournals.org/content/early/2014/09/18/bioinformatics.btu571.short?rss=1
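The enrichment idea in the VSEAMS abstract above can be illustrated with a minimal sketch of a rank-based (Wilcoxon rank-sum) comparison of GWAS P-values between a test SNP set and a control SNP set. This is not the VSEAMS implementation: the function names are hypothetical, the normal approximation assumes independent SNPs, and no correction for linkage disequilibrium or ties is applied, which a real pipeline would need.

```python
import math

def rank_with_ties(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def wilcoxon_enrichment(test_p, control_p):
    """Z-score and one-sided P-value for test-set P-values being smaller
    than control-set P-values (hypothetical helper, assumes independence)."""
    n1, n2 = len(test_p), len(control_p)
    ranks = rank_with_ties(list(test_p) + list(control_p))
    w = sum(ranks[:n1])                    # rank sum of the test set
    mean_w = n1 * (n1 + n2 + 1) / 2        # expectation under the null
    var_w = n1 * n2 * (n1 + n2 + 1) / 12   # null variance, no LD or tie correction
    z = (mean_w - w) / math.sqrt(var_w)    # positive when test P-values rank lower
    one_sided_p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, one_sided_p

# Toy data: SNPs near hypothetical target genes versus matched control SNPs.
z, p = wilcoxon_enrichment([0.001, 0.002, 0.003], [0.5, 0.6, 0.7, 0.8])
print(round(z, 3), p < 0.05)
```

With correlated SNPs the null variance is larger than the formula above suggests, so this sketch would overstate significance on real GWAS data.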
Prioritisation of Candidate Genes Underpinning COVID-19 Host Genetic Traits Based on High-Resolution 3D Chromosomal Topology
Genetic variants showing associations with specific biological traits and diseases detected by genome-wide association studies (GWAS) commonly map to non-coding DNA regulatory regions. Many of these regions are located considerable distances away from the genes they regulate and come into their proximity through 3D chromosomal interactions. We previously developed COGS, a statistical pipeline for linking GWAS variants with their putative target genes based on 3D chromosomal interaction data arising from high-resolution assays such as Promoter Capture Hi-C (PCHi-C). Here, we applied COGS to COVID-19 Host Genetic Consortium (HGI) GWAS meta-analysis data on COVID-19 susceptibility and severity using our previously generated PCHi-C results in 17 human primary cell types and SARS-CoV-2-infected lung carcinoma cells. We prioritise 251 genes putatively associated with these traits, including 16 out of 47 genes highlighted by the GWAS meta-analysis authors. The prioritised genes are expressed in a broad array of tissues, including, but not limited to, blood and brain cells, and are enriched for genes involved in the inflammatory response to viral infection. Our prioritised genes and pathways, in conjunction with results from other prioritisation approaches and targeted validation experiments, will aid in the understanding of COVID-19 pathology, paving the way for novel treatments.
A method for gene-based pathway analysis using genomewide association study summary statistics reveals nine new type 1 diabetes associations.
Pathway analysis can complement point-wise single nucleotide polymorphism (SNP) analysis in exploring genomewide association study (GWAS) data to identify specific disease-associated genes that can be candidate causal genes. We propose a straightforward methodology that can be used for conducting a gene-based pathway analysis using summary GWAS statistics in combination with widely available reference genotype data. We used this method to perform a gene-based pathway analysis of a type 1 diabetes (T1D) meta-analysis GWAS (of 7,514 cases and 9,045 controls). An important feature of the conducted analysis is the removal of the major histocompatibility complex gene region, the major genetic risk factor for T1D. Thirty-one of the 1,583 (2%) tested pathways were identified to be enriched for association with T1D at a 5% false discovery rate. We analyzed these 31 pathways and their genes to identify SNPs in or near these pathway genes that showed potentially novel association with T1D and attempted to replicate the association of 22 SNPs in additional samples. Replication P-values were skewed (P=9.85×10⁻¹¹) with 12 of the 22 SNPs showing P<0.05. Support, including replication evidence, was obtained for nine T1D associated variants in genes ITGB7 (rs11170466, P=7.86×10⁻⁹), NRP1 (rs722988, 4.88×10⁻⁸), BAD (rs694739, 2.37×10⁻⁷), CTSB (rs1296023, 2.79×10⁻⁷), FYN (rs11964650, P=5.60×10⁻⁷), UBE2G1 (rs9906760, 5.08×10⁻⁷), MAP3K14 (rs17759555, 9.67×10⁻⁷), ITGB1 (rs1557150, 1.93×10⁻⁶), and IL7R (rs1445898, 2.76×10⁻⁶). The proposed methodology can be applied to other GWAS datasets for which only summary level data are available. This is the final version. It was first published by Wiley at http://onlinelibrary.wiley.com/doi/10.1002/gepi.21853/abstract
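The abstract above reports pathways passing a 5% false discovery rate but does not name the procedure used. As an illustration only, the sketch below assumes the Benjamini-Hochberg step-up rule, a common way of applying such a threshold to a list of pathway P-values; the function name and the toy P-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which P-values are declared
    significant by the Benjamini-Hochberg step-up rule (an assumed
    procedure, not one confirmed by the abstract)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    # Find the largest rank k with p_(k) <= fdr * k / m.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    significant = [False] * m
    # Everything ranked at or below the cutoff is rejected.
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant

# Toy pathway P-values (made up for illustration).
toy = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(sum(benjamini_hochberg(toy)))
```

The step-up form matters: a P-value that misses its own threshold is still rejected if a larger P-value further down the ranking meets its threshold.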
Epigenetic analysis of regulatory T cells using multiplex bisulfite sequencing.
This work was supported by Wellcome Trust Grant 096388, JDRF Grant 9-2011-253, the National Institute for Health Research Cambridge Biomedical Research Centre (BRC) and Award P01AI039671 (to L.S.W. and J.A.T.) from the National Institute of Allergy and Infectious Diseases (NIAID). C.W. is supported by the Wellcome Trust (089989). The content of this article is solely the responsibility of the authors and does not necessarily represent the official views of NIAID or the National Institutes of Health. The Cambridge Institute for Medical Research is in receipt of Wellcome Trust Strategic Award 100140. We gratefully acknowledge the participation of all NIHR Cambridge BioResource volunteers. We thank the Cambridge BioResource staff for their help with volunteer recruitment. We thank members of the Cambridge BioResource SAB and Management Committee for their support of our study and the National Institute for Health Research Cambridge Biomedical Research Centre for funding. We thank Fay Rodger and Ruth Littleboy for running the Illumina MiSeq in the Molecular Genetics Laboratories, Addenbrooke's Hospital, Cambridge. This research was supported by the Cambridge NIHR BRC Cell Phenotyping Hub. In particular, we wish to thank Anna Petrunkina Harrison, Simon McCallum, Christopher Bowman, Natalia Savinykh, Esther Perez and Jelena Markovic Djuric for their advice and support in cell sorting. We also thank Helen Stevens, Pamela Clarke, Gillian Coleman, Sarah Dawson, Jennifer Denesha, Simon Duley, Meeta Maisuria-Armer and Trupti Mistry for acquisition and preparation of samples. This is the final version of the article. It first appeared from Wiley via http://dx.doi.org/10.1002/eji.20154564
Resolving mechanisms of immune-mediated disease in primary CD4 T cells
Deriving mechanisms of immune-mediated disease from GWAS data remains a formidable challenge, with attempts to identify causal variants being frequently hampered by linkage disequilibrium. To determine whether causal variants could be identified via their functional effects, we adapted a massively-parallel reporter assay for use in primary CD4 T-cells, key effectors of many immune-mediated diseases. Using the results to guide further study, we provide a generalisable framework for resolving disease mechanisms from non-coding associations, illustrated by a locus linked to 6 immune-mediated diseases, where the lead functional variant causally disrupts a super-enhancer within an NF-κB-driven regulatory circuit, triggering unrestrained T-cell activation.
Discovery, linkage disequilibrium and association analyses of polymorphisms of the immune complement inhibitor, decay-accelerating factor gene (DAF/CD55) in type 1 diabetes.
BACKGROUND: Type 1 diabetes (T1D) is a common autoimmune disease resulting from T-cell mediated destruction of pancreatic beta cells. Decay accelerating factor (DAF, CD55), a glycosylphosphatidylinositol-anchored membrane protein, is a candidate for autoimmune disease susceptibility based on its role in restricting complement activation and evidence that DAF expression modulates the phenotype of mouse models for autoimmune disease. In this study, we adopt a linkage disequilibrium (LD) mapping approach to test for an association between the DAF gene and T1D. RESULTS: Initially, we used HapMap II genotype data to examine LD across the DAF region. Additional resequencing was required, identifying 16 novel polymorphisms. Combining both datasets, an LD mapping approach was adopted to test for association with T1D. Seven tag SNPs were selected and genotyped in case-control (3,523 cases and 3,817 controls) and family (725 families) collections. CONCLUSION: We obtained no evidence of association between T1D and the DAF region in two independent collections. In addition, we assessed the impact of using only HapMap II genotypes for the selection of tag SNPs and, based on this study, found that HapMap II genotypes may require additional SNP discovery for comprehensive LD mapping of some genes in common disease. RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.
Widespread seasonal gene expression reveals annual differences in human immunity and physiology.
Seasonal variations are rarely considered a contributing component to human tissue function or health, although many diseases and physiological processes display annual periodicities. Here we find more than 4,000 protein-coding mRNAs in white blood cells and adipose tissue to have seasonal expression profiles, with inverted patterns observed between Europe and Oceania. We also find the cellular composition of blood to vary by season, and these changes, which differ between the United Kingdom and The Gambia, could explain the gene expression periodicity. With regard to tissue function, the immune system has a profound pro-inflammatory transcriptomic profile during European winter, with increased levels of soluble IL-6 receptor and C-reactive protein, risk biomarkers for cardiovascular, psychiatric and autoimmune diseases that have peak incidences in winter. Circannual rhythms thus require further exploration as contributors to various aspects of human physiology and disease. The Gambian study providing data for analysis was supported by core funding MC-A760-5QX00 to the International Nutrition Group by the UK Medical Research Council (MRC) and the UK Department for International Development (DFID) under the MRC/DFID Concordat agreement. This work was supported by the JDRF UK Centre for Diabetes-Genes, Autoimmunity and Prevention (D-GAP; 4-2007-1003), the JDRF (9-2011-253), the Wellcome Trust (WT061858/091157), the National Institute for Health Research Cambridge Biomedical Research Centre (CBRC) and the Medical Research Council (MRC) Cusrow Wadia Fund. The research leading to these results has received funding from the European Union’s 7th Framework Programme (FP7/2007–2013) under grant agreement no. 241447 (NAIMIT). The Cambridge Institute for Medical Research (CIMR) is in receipt of a Wellcome Trust Strategic Award (WT100140). X.C.D. was a University of Cambridge/Wellcome Trust Infection and Immunity PhD student. R.C.F. is funded by a JDRF post-doctoral fellowship (3-2011-374). C.W. and H.G. are funded by the Wellcome Trust (WT089989). The BABYDIET study was supported by grants from the Deutsche Forschungsgemeinschaft (DFG ZI-310/14-1 to -4), the JDRF (JDRF 17-2012-16 and 1-2006-665) and the German Center for Diabetes Research (DZD e.V.). E.B. is supported by the DFG Research Center and Cluster of Excellence—Center for Regenerative Therapies Dresden (FZ 111). This is the final published version. It first appeared at http://www.nature.com/ncomms/2015/150512/ncomms8000/full/ncomms8000.html
Dissection of a Complex Disease Susceptibility Region Using a Bayesian Stochastic Search Approach to Fine Mapping.
Identification of candidate causal variants in regions associated with risk of common diseases is complicated by linkage disequilibrium (LD) and multiple association signals. Nonetheless, accurate maps of these variants are needed, both to fully exploit detailed cell specific chromatin annotation data to highlight disease causal mechanisms and cells, and for design of the functional studies that will ultimately be required to confirm causal mechanisms. We adapted a Bayesian evolutionary stochastic search algorithm to the fine mapping problem, and demonstrated its improved performance over conventional stepwise and regularised regression through simulation studies. We then applied it to fine map the established multiple sclerosis (MS) and type 1 diabetes (T1D) associations in the IL-2RA (CD25) gene region. For T1D, both stepwise and stochastic search approaches identified four T1D association signals, with the major effect tagged by the single nucleotide polymorphism, rs12722496. In contrast, for MS, the stochastic search found two distinct competing models: a single candidate causal variant, tagged by rs2104286 and reported previously using stepwise analysis; and a more complex model with two association signals, one of which was tagged by the major T1D associated rs12722496 and the other by rs56382813. There is low to moderate LD between rs2104286 and both rs12722496 and rs56382813 (r² ≃ 0.3) and our two SNP model could not be recovered through a forward stepwise search after conditioning on rs2104286. Both signals in the two variant model for MS affect CD25 expression on distinct subpopulations of CD4+ T cells, which are key cells in the autoimmune process. The results support a shared causal variant for T1D and MS. Our study illustrates the benefit of using a purposely designed model search strategy for fine mapping and the advantage of combining disease and protein expression data. We acknowledge use of DNA from The UK Blood Services collection of Common Controls (UKBS-CC collection), which is funded by the Wellcome Trust grant 076113/C/04/Z and by the USA National Institute for Health Research program grant to the National Health Service Blood and Transplant (RP-PG-0310-1002). We acknowledge the use of DNA from the British 1958 Birth Cohort collection, which is funded by the UK Medical Research Council grant G0000934 and the Wellcome Trust grant 068545/Z/02. This research utilized resources provided by the Type 1 Diabetes Genetics Consortium, a collaborative clinical study sponsored by the National Institute of Diabetes and Digestive and Kidney Diseases, the National Institute of Allergy and Infectious Diseases, the National Human Genome Research Institute, the National Institute of Child Health and Human Development and the JDRF and is supported by the USA National Institutes of Health grant U01-DK062418. The JDRF/Wellcome Trust Diabetes and Inflammation Laboratory is funded by the JDRF (9-2011-253), the Wellcome Trust (091157) and the National Institute for Health Research Cambridge Biomedical Centre. The research leading to these results has received funding from the European Union's 7th Framework Programme (FP7/2007-2013) under grant agreement no. 241447 (NAIMIT). The Cambridge Institute for Medical Research (CIMR) is in receipt of a Wellcome Trust Strategic Award (100140). CW is supported by the Wellcome Trust (089989). We acknowledge the National Institute for Health Research Cambridge Biomedical Research Centre for funding. This is the final version of the article. It first appeared from PLOS via http://dx.doi.org/10.1371/journal.pgen.100527
Development of an integrated genome informatics, data management and workflow infrastructure: a toolbox for the study of complex disease genetics.
The genetic dissection of complex disease remains a significant challenge. Sample-tracking and the recording, processing and storage of high-throughput laboratory data with public domain data, require integration of databases, genome informatics and genetic analyses in an easily updated and scalable format. To find genes involved in multifactorial diseases such as type 1 diabetes (T1D), chromosome regions are defined based on functional candidate gene content, linkage information from humans and animal model mapping information. For each region, genomic information is extracted from Ensembl, converted and loaded into ACeDB for manual gene annotation. Homology information is examined using ACeDB tools and the gene structure verified. Manually curated genes are extracted from ACeDB and read into the feature database, which holds relevant local genomic feature data and an audit trail of laboratory investigations. Public domain information, manually curated genes, polymorphisms, primers, linkage and association analyses, with links to our genotyping database, are shown in Gbrowse. This system scales to include genetic, statistical, quality control (QC) and biological data such as expression analyses of RNA or protein, all linked from a genomics integrative display. Our system is applicable to any genetic study of complex disease, of either large or small scale. RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.
Genetic feature engineering enables characterisation of shared risk factors in immune-mediated diseases
Abstract: Background: Genome-wide association studies (GWAS) have identified pervasive sharing of genetic architectures across multiple immune-mediated diseases (IMD). By learning the genetic basis of IMD risk from common diseases, this sharing can be exploited to enable analysis of less frequent IMD where, due to limited sample size, traditional GWAS techniques are challenging. Methods: Exploiting ideas from Bayesian genetic fine-mapping, we developed a disease-focused shrinkage approach to allow us to distill genetic risk components from GWAS summary statistics for a set of related diseases. We applied this technique to 13 larger GWAS of common IMD, deriving a reduced dimension “basis” that summarised the multidimensional components of genetic risk. We used independent datasets including the UK Biobank to assess the performance of the basis and characterise individual axes. Finally, we projected summary GWAS data for smaller IMD studies, with less than 1000 cases, to assess whether the approach was able to provide additional insights into genetic architecture of less common IMD or IMD subtypes, where cohort collection is challenging. Results: We identified 13 IMD genetic risk components. The projection of independent UK Biobank data demonstrated the IMD specificity and accuracy of the basis even for traits with very limited case-size (e.g. vitiligo, 150 cases). Projection of additional IMD-relevant studies allowed us to add biological interpretation to specific components, e.g. related to raised eosinophil counts in blood and serum concentration of the chemokine CXCL10 (IP-10). On application to 22 rare IMD and IMD subtypes, we were able to not only highlight subtype-discriminating axes (e.g. for juvenile idiopathic arthritis) but also suggest eight novel genetic associations. 
Conclusions: Requiring only summary-level data, our unsupervised approach allows the genetic architectures across any range of clinically related traits to be characterised in fewer dimensions. This facilitates the analysis of studies with modest sample size by matching shared axes of both genetic and biological risk across a wider disease domain, and provides an evidence base for possible therapeutic repurposing opportunities.
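The projection step described in the abstract above (placing a small study's summary statistics onto a previously learned reduced-dimension basis) can be sketched generically as centring a vector of per-SNP effect estimates and taking its dot product with each component's loadings. This is a plain linear projection under assumed inputs, not the authors' shrinkage method: the function name, the `center` vector and the toy loadings are all hypothetical.

```python
def project_onto_basis(effects, center, loadings):
    """Project per-SNP effect estimates onto basis components.

    effects  : effect estimates for the new study, one per SNP
    center   : per-SNP means used when the basis was built (assumed available)
    loadings : one list of per-SNP weights for each basis component
    Returns one score per component.
    """
    if len(effects) != len(center):
        raise ValueError("effects and center must cover the same SNPs")
    centered = [e - c for e, c in zip(effects, center)]
    scores = []
    for component in loadings:
        if len(component) != len(centered):
            raise ValueError("each component needs one weight per SNP")
        scores.append(sum(x * w for x, w in zip(centered, component)))
    return scores

# Toy example with three SNPs and two hypothetical components.
scores = project_onto_basis(
    effects=[0.2, -0.1, 0.4],
    center=[0.1, 0.0, 0.1],
    loadings=[[1.0, 0.0, 0.0], [0.0, 0.6, 0.8]],
)
print([round(s, 2) for s in scores])
```

Because only summary-level effect estimates enter the calculation, a study with few cases can be scored on the same axes as the large studies used to build the basis, although its scores will carry more sampling noise.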