39 research outputs found
eQTL Catalogue 2023: New datasets, X chromosome QTLs, and improved detection and visualisation of transcript-level QTLs
The eQTL Catalogue is an open database of uniformly processed human molecular quantitative trait loci (QTLs). We are continuously updating the resource to further increase its utility for interpreting genetic associations with complex traits. Over the past two years, we have increased the number of uniformly processed studies from 21 to 31 and added X chromosome QTLs for 19 compatible studies. We have also implemented Leafcutter to directly identify splice-junction usage QTLs in all RNA sequencing datasets. Finally, to improve the interpretability of transcript-level QTLs, we have developed static QTL coverage plots that visualise the association between the genotype and average RNA sequencing read coverage in the region for all 1.7 million fine-mapped associations. To illustrate the utility of these updates to the eQTL Catalogue, we performed colocalisation analysis between vitamin D levels in the UK Biobank and all molecular QTLs in the eQTL Catalogue. Although most GWAS loci colocalised both with eQTLs and transcript-level QTLs, we found that visual inspection could sometimes be used to distinguish primary splicing QTLs from those that appear to be secondary consequences of large-effect gene expression QTLs. While these visually confirmed primary splicing QTLs explain just 6/53 of the colocalising signals, they are significantly less pleiotropic than eQTLs and identify a prioritised causal gene in 4/6 cases.
Increased brain expression of GPNMB is associated with genome wide significant risk for Parkinson's disease on chromosome 7p15.3
Genome-wide association studies (GWAS) for Parkinson's disease (PD) have previously revealed a significant association with a locus on chromosome 7p15.3, initially designated as the glycoprotein non-metastatic melanoma protein B (GPNMB) locus. In this study, the functional consequences of this association on expression were explored in depth by integrating different expression quantitative trait locus (eQTL) datasets (Braineac, CAGEseq, GTEx, and Phenotype-Genotype Integrator (PheGenI)). eQTLs for the top risk SNP rs199347 showed increased expression of GPNMB, KLHL7, and NUPL2 with the major allele (AA) in brain, with the most significant eQTLs in cortical regions, followed by putamen. In addition, decreased expression of the antisense RNA KLHL7-AS1 was observed in GTEx. Furthermore, rs199347 is an eQTL for a long non-coding RNA (AC005082.12) in human tissues other than brain. Interestingly, transcript-specific eQTLs in immune-related tissues (spleen and lymphoblastoid cells) for NUPL2 and KLHL7-AS1 were observed, which suggests a complex functional role of this eQTL in specific tissues and cell types at specific time points. Significantly increased expression of GPNMB linked to rs199347 was consistent across all datasets, and taken in combination with the risk SNP being located within the GPNMB gene, these results suggest that increased expression of GPNMB is the causative link explaining the association of this locus with PD. However, other transcript eQTLs and subsequent functional roles cannot be excluded. This highlights the importance of further investigations to understand the functional interactions between the coding genes, antisense, and non-coding RNA species considering the tissue and cell-type specificity to understand the underlying biological mechanisms in PD.
Novel C12orf65 mutations in patients with axonal neuropathy and optic atrophy
Charcot-Marie-Tooth disease (CMT) forms a clinically and genetically heterogeneous group of disorders. Although a number of disease genes have been identified for CMT, gene discovery for some complex forms of CMT has lagged behind. The association of neuropathy and optic atrophy (also known as CMT type 6) has been described with autosomal dominant, recessive and X-linked modes of inheritance. Mutations in Mitofusin 2 have been found to cause dominant forms of CMT6. Phosphoribosylpyrophosphate synthetase-I mutations cause X-linked CMT6, but until now, mutations in the recessive forms of disease have never been identified.
Fine-Mapping, Gene Expression and Splicing Analysis of the Disease Associated LRRK2 Locus
Association studies have identified several signals at the LRRK2 locus for Parkinson's disease (PD), Crohn's disease (CD) and leprosy. However, little is known about the molecular mechanisms mediating these effects. To further characterize this locus, we fine-mapped the risk association in 5,802 PD cases and 5,556 controls using a dense genotyping array (ImmunoChip). Using samples from 134 post-mortem control adult human brains (UK Human Brain Expression Consortium), where up to ten brain regions were available per individual, we studied the regional variation, splicing and regulation of LRRK2. We found convincing evidence for a common variant PD association located outside of the LRRK2 protein coding region (rs117762348, A>G, P = 2.56 × 10^-8, case/control MAF 0.083/0.074, odds ratio 0.86 for the minor allele with 95% confidence interval [0.80-0.91]). We show that mRNA expression levels are highest in cortical regions and lowest in cerebellum. We find an exon quantitative trait locus (QTL) in brain samples that localizes to exons 32-33 and investigate the molecular basis of this eQTL using RNA-Seq data in n = 8 brain samples. The genotype underlying this eQTL is in strong linkage disequilibrium with the CD associated non-synonymous SNP rs3761863 (M2397T). We found two additional QTLs in liver and monocyte samples but none of these explained the common variant PD association at rs117762348. Our results characterize the LRRK2 locus, and highlight the importance and difficulties of fine-mapping and integration of multiple datasets to delineate pathogenic variants and thus develop an understanding of disease mechanisms.
Identification of novel risk loci, causal insights, and heritable risk for Parkinson's disease: a meta-analysis of genome-wide association studies
Background Genome-wide association studies (GWAS) in Parkinson's disease have increased the scope of biological knowledge about the disease over the past decade. We aimed to use the largest aggregate of GWAS data to identify novel risk loci and gain further insight into the causes of Parkinson's disease. Methods We did a meta-analysis of 17 datasets from Parkinson's disease GWAS available from European ancestry samples to nominate novel loci for disease risk. These datasets incorporated all available data. We then used these data to estimate heritable risk and develop predictive models of this heritability. We also used large gene expression and methylation resources to examine possible functional consequences as well as tissue, cell type, and biological pathway enrichments for the identified risk factors. Additionally, we examined shared genetic risk between Parkinson's disease and other phenotypes of interest via genetic correlations followed by Mendelian randomisation. Findings Between Oct 1, 2017, and Aug 9, 2018, we analysed 7·8 million single nucleotide polymorphisms in 37 688 cases, 18 618 UK Biobank proxy-cases (ie, individuals who do not have Parkinson's disease but have a first degree relative that does), and 1·4 million controls. We identified 90 independent genome-wide significant risk signals across 78 genomic regions, including 38 novel independent risk signals in 37 loci. These 90 variants explained 16–36% of the heritable risk of Parkinson's disease depending on prevalence. Integrating methylation and expression data within a Mendelian randomisation framework identified putatively associated genes at 70 risk signals underlying GWAS loci for follow-up functional studies. Tissue-specific expression enrichment analyses suggested Parkinson's disease loci were heavily brain-enriched, with specific neuronal cell types being implicated from single cell data. 
We found significant genetic correlations with brain volumes (false discovery rate-adjusted p=0·0035 for intracranial volume, p=0·024 for putamen volume), smoking status (p=0·024), and educational attainment (p=0·038). Mendelian randomisation between cognitive performance and Parkinson's disease risk showed a robust association (p=8·00 × 10^−7). Interpretation These data provide the most comprehensive survey of genetic risk within Parkinson's disease to date, to the best of our knowledge, by revealing many additional Parkinson's disease risk loci, providing a biological context for these risk factors, and showing that a considerable genetic component of this disease remains unidentified. These associations derived from European ancestry datasets will need to be followed up with more diverse data. Funding The National Institute on Aging at the National Institutes of Health (USA), The Michael J Fox Foundation, and The Parkinson's Foundation (see appendix for full list of funding sources).
Novel genetic loci underlying human intracranial volume identified through genome-wide association
Intracranial volume reflects the maximally attained brain size during development, and remains stable with loss of tissue in late life. It is highly heritable, but the underlying genes remain largely undetermined. In a genome-wide association study of 32,438 adults, we discovered five novel loci for intracranial volume and confirmed two known signals. Four of the loci are also associated with adult human stature, but these remained associated with intracranial volume after adjusting for height. We found a high genetic correlation with child head circumference (ρgenetic=0.748), which indicated a similar genetic background and allowed for the identification of four additional loci through meta-analysis (Ncombined = 37,345). Variants for intracranial volume were also related to childhood and adult cognitive function, Parkinson’s disease, and enriched near genes involved in growth pathways including PI3K–AKT signaling. These findings identify biological underpinnings of intracranial volume and provide genetic support for theories on brain reserve and brain overgrowth.
Identification of candidate Parkinson disease genes by integrating genome-wide association study, expression, and epigenetic data sets
Importance Substantial genome-wide association study (GWAS) work in Parkinson disease (PD) has led to the discovery of an increasing number of loci shown reliably to be associated with increased risk of disease. Improved understanding of the underlying genes and mechanisms at these loci will be key to understanding the pathogenesis of PD.
Objective To investigate what genes and genomic processes underlie the risk of sporadic PD.
Design and Setting This genetic association study used the bioinformatic tools Coloc and transcriptome-wide association study (TWAS) to integrate PD case-control GWAS data published in 2017 with expression data (from Braineac, the Genotype-Tissue Expression [GTEx], and CommonMind) and methylation data (derived from UK Parkinson brain samples) to uncover putative gene expression and splicing mechanisms associated with PD GWAS signals. Candidate genes were further characterized using cell-type specificity, weighted gene coexpression networks, and weighted protein-protein interaction networks.
Main Outcomes and Measures It was hypothesized a priori that some genes underlying PD loci would alter PD risk through changes to expression, splicing, or methylation. Candidate genes are presented whose change in expression, splicing, or methylation are associated with risk of PD as well as the functional pathways and cell types in which these genes have an important role.
Results Gene-level analysis of expression revealed 5 genes (WDR6 [OMIM 606031], CD38 [OMIM 107270], GPNMB [OMIM 604368], RAB29 [OMIM 603949], and TMEM163 [OMIM 618978]) that replicated using both Coloc and TWAS analyses in both the GTEx and Braineac expression data sets. A further 6 genes (ZRANB3 [OMIM 615655], PCGF3 [OMIM 617543], NEK1 [OMIM 604588], NUPL2 [NCBI 11097], GALC [OMIM 606890], and CTSB [OMIM 116810]) showed evidence of disease-associated splicing effects. Cell-type specificity analysis revealed that gene expression was overall more prevalent in glial cell types compared with neurons. The weighted gene coexpression performed on the GTEx data set showed that NUPL2 is a key gene in 3 modules implicated in catabolic processes associated with protein ubiquitination and in the ubiquitin-dependent protein catabolic process in the nucleus accumbens, caudate, and putamen. TMEM163 and ZRANB3 were both important in modules in the frontal cortex and caudate, respectively, indicating regulation of signaling and cell communication. Protein interactor analysis and simulations using random networks demonstrated that the candidate genes interact significantly more with known mendelian PD and parkinsonism proteins than would be expected by chance.
Conclusions and Relevance Together, these results suggest that several candidate genes and pathways are associated with the findings observed in PD GWAS studies.
The transcriptional landscape of age in human peripheral blood
Disease incidences increase with age, but the molecular characteristics of ageing that lead to increased disease susceptibility remain inadequately understood. Here we perform a whole-blood gene expression meta-analysis in 14,983 individuals of European ancestry (including replication) and identify 1,497 genes that are differentially expressed with chronological age. The age-associated genes do not harbor more age-associated CpG-methylation sites than other genes, but are instead enriched for the presence of potentially functional CpG-methylation sites in enhancer and insulator regions that associate with both chronological age and gene expression levels. We further used the gene expression profiles to calculate the 'transcriptomic age' of an individual, and show that differences between transcriptomic age and chronological age are associated with biological features linked to ageing, such as blood pressure, cholesterol levels, fasting glucose, and body mass index. The transcriptomic prediction model adds biological relevance and complements existing epigenetic prediction models, and can be used by others to calculate transcriptomic age in external cohorts.
Recursive splicing in long vertebrate genes
It is generally believed that splicing removes introns as single units from precursor messenger RNA transcripts. However, some long Drosophila melanogaster introns contain a cryptic site, known as a recursive splice site (RS-site), that enables a multi-step process of intron removal termed recursive splicing. The extent to which recursive splicing occurs in other species and its mechanistic basis have not been examined. Here we identify highly conserved RS-sites in genes expressed in the mammalian brain that encode proteins functioning in neuronal development. Moreover, the RS-sites are found in some of the longest introns across vertebrates. We find that vertebrate recursive splicing requires initial definition of an RS-exon that follows the RS-site. The RS-exon is then excluded from the dominant mRNA isoform owing to competition with a reconstituted 5′ splice site formed at the RS-site after the first splicing step. Conversely, the RS-exon is included when preceded by cryptic promoters or exons that fail to reconstitute an efficient 5′ splice site. Most RS-exons contain a premature stop codon such that their inclusion can decrease mRNA stability. Thus, by establishing a binary splicing switch, RS-sites demarcate different mRNA isoforms emerging from long genes by coupling cryptic elements with inclusion of RS-exons.