102 research outputs found
Personalized Proteome: Comparing Proteogenomics and Open Variant Search Approaches for Single Amino Acid Variant Detection
Item does not contain fulltext
Clinical improvement of DM1 patients reflected by reversal of disease-induced gene expression in blood
Background: Myotonic dystrophy type 1 (DM1) is an incurable multisystem disease caused by a CTG-repeat expansion in the DM1 protein kinase (DMPK) gene. The OPTIMISTIC clinical trial demonstrated positive and heterogeneous effects of cognitive behavioral therapy (CBT) on the capacity for activity and social participation in DM1 patients. Through a process of reverse engineering, this study aims to identify druggable molecular biomarkers associated with the clinical improvement in the OPTIMISTIC cohort. Methods: Based on full blood samples collected during OPTIMISTIC, we performed paired mRNA sequencing for 27 patients before and after the CBT intervention. Linear mixed effect models were used to identify biomarkers associated with the disease-causing CTG expansion and the mean clinical improvement across all clinical outcome measures. Results: We identified 608 genes whose expression was significantly associated with the CTG-repeat expansion, as well as 1176 genes significantly associated with the average clinical response towards the intervention. Remarkably, all 97 genes associated with both returned to more normal levels in patients who benefited the most from CBT. This main finding has been replicated based on an external dataset of mRNA data of DM1 patients and controls, singling these genes out as candidate biomarkers for therapy response. Among these candidate genes were DNAJB12, HDAC5, and TRIM8, each belonging to a protein family that is being studied in the context of neurological disorders or muscular dystrophies. Across the different gene sets, gene pathway enrichment analysis revealed disease-relevant impaired signaling in, among others, insulin-, metabolism-, and immune-related pathways. Furthermore, evidence for shared dysregulations with another neuromuscular disease, Duchenne muscular dystrophy, was found, suggesting a partial overlap in blood-based gene dysregulation.
Conclusions: DM1-relevant disease signatures can be identified on a molecular level in peripheral blood, opening new avenues for drug discovery and therapy efficacy assessments.
Towards FAIRification of sensitive and fragmented rare disease patient data: challenges and solutions in European reference network registries
INTRODUCTION: Rare disease patient data are typically sensitive, present in multiple registries controlled by different custodians, and non-interoperable. Making these data Findable, Accessible, Interoperable, and Reusable (FAIR) for humans and machines at source enables federated discovery and analysis across data custodians. This facilitates accurate diagnosis, optimal clinical management, and personalised treatments. In Europe, twenty-four European Reference Networks (ERNs) work on rare disease registries in different clinical domains. The process and the implementation choices for making data FAIR (‘FAIRification’) differ among ERN registries. For example, registries use different software systems and are subject to different legal regulations. To support the ERNs in making informed decisions and to harmonise FAIRification, the FAIRification steward team was established to work as liaisons between ERNs and researchers from the European Joint Programme on Rare Diseases. RESULTS: The FAIRification steward team inventoried the FAIRification challenges of the ERN registries and proposed solutions collectively with involved stakeholders to address them. Ninety-eight FAIRification challenges from 24 ERNs’ registries were collected and categorised into “training” (31), “community” (9), “modelling” (12), “implementation” (26), and “legal” (20). After curating and aggregating highly similar challenges, 41 unique FAIRification challenges remained. The two categories with the most challenges were “training” (15) and “implementation” (9), followed by “community” (7), and then “modelling” (5) and “legal” (5). To address all challenges, eleven types of solutions were proposed. Among them, the provision of guidelines and the organisation of training activities, which ranged from less-technical “coffee-rounds” to technical workshops and from informal FAIR Games to formal hackathons, resolved the “training” challenges.
Obtaining implementation support from technical experts was the solution type for tackling the “implementation” challenges. CONCLUSION: This work shows that a dedicated team of FAIR data stewards is an asset for harmonising the various processes of making data FAIR in a large organisation with multiple stakeholders. Additionally, multi-levelled training activities are required to accommodate the diverse needs of the ERNs. Finally, the lessons learned from the experience of the FAIRification steward team described in this paper may help to increase FAIR awareness and provide insights into FAIRification challenges and solutions of rare disease registries. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13023-022-02558-5
A Resource for Guiding Data Stewards to Make European Rare Disease Patient Registries FAIR
Objective: This paper reports on the development of a dynamic data management planning questionnaire to guide data stewards of the European Reference Network (ERN) rare disease patient registries to make their data findable, accessible, interoperable, and reusable (FAIR). As part of this work, the questionnaire was validated through expert review and aligned with existing resources on rare diseases and FAIR data management. Materials and Methods: The questionnaire was developed for the Data Stewardship Wizard, a tool for data management planning. Knowledge sources on FAIR data, ERN patient registries, and data management were used to compose questions. Ten domain experts validated the questionnaire. The topics in the questionnaire were aligned with existing knowledge bases. Results: A total of 57 questions were included in the questionnaire. Twenty-three references to the FAIR Cookbook and Research Data Management toolkit for Life Sciences were added. Expert validation provided a total of 166 comments on content, structure, and software-related issues. A public instance of the Data Stewardship Wizard was deployed for use by data stewards of ERN patient registries. Discussion: The questionnaire addresses issues that ERNs encounter when making their registries FAIR and follows the implementation choices made by the European rare disease community. A challenging task for future research is to extend the questionnaire to other types of registries and to validate with users. Conclusion: This smart questionnaire is the first model created for the Data Stewardship Wizard that helps ERN patient registries with making their data FAIR. It will assist data stewards in aligning their efforts and providing guidance on FAIR data
Controlling bias and inflation in epigenome- and transcriptome-wide association studies using the empirical null distribution
We show that epigenome- and transcriptome-wide association studies (EWAS and TWAS) are prone to significant inflation and bias of test statistics, an unrecognized phenomenon introducing spurious findings if left unaddressed. Neither GWAS-based methodology nor state-of-the-art confounder adjustment methods completely remove bias and inflation. We propose a Bayesian method to control bias and inflation in EWAS and TWAS based on estimation of the empirical null distribution. Using simulations and real data, we demonstrate that our method maximizes power while properly controlling the false positive rate. We illustrate the utility of our method in large-scale EWAS and TWAS meta-analyses of age and smoking
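The idea of an empirical null can be illustrated with a minimal sketch. It is not the Bayesian method proposed in the abstract: it swaps in a simple robust estimator (median and scaled median absolute deviation) and assumes that most tests are true nulls, so that the bulk of the z-scores follows a normal distribution with mean equal to the bias and standard deviation equal to the inflation. All function names and the simulated values (bias 0.2, inflation 1.3) are hypothetical and chosen only for illustration.

```python
import random
import statistics


def estimate_empirical_null(z_scores):
    """Return (bias, inflation) as robust location and scale of the z-scores.

    Assumes most tests are true nulls, so the bulk of the distribution
    reflects the null rather than real signal.
    """
    bias = statistics.median(z_scores)
    # Median absolute deviation, scaled to match the standard deviation
    # of a normal distribution.
    mad = statistics.median([abs(z - bias) for z in z_scores])
    inflation = 1.4826 * mad
    return bias, inflation


def recalibrate(z_scores, bias, inflation):
    """Shift and rescale z-scores so the estimated null becomes N(0, 1)."""
    return [(z - bias) / inflation for z in z_scores]


def simulate_null_z_scores(n, bias, inflation, seed=0):
    """Hypothetical helper: draw null z-scores from N(bias, inflation^2)."""
    rng = random.Random(seed)
    return [rng.gauss(bias, inflation) for _ in range(n)]


# Illustrative values only; they are not taken from the study.
z = simulate_null_z_scores(20000, bias=0.2, inflation=1.3)
est_bias, est_inflation = estimate_empirical_null(z)
print(f"estimated bias={est_bias:.2f}, inflation={est_inflation:.2f}")
```

Under the theoretical null, the bias would be 0 and the inflation 1; estimates that depart from these values indicate that p-values computed against N(0, 1) would be miscalibrated.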
Refining Attention-Deficit/Hyperactivity Disorder and Autism Spectrum Disorder Genetic Loci by Integrating Summary Data From Genome-wide Association, Gene Expression, and DNA Methylation Studies
Background: Recent genome-wide association studies (GWASs) identified the first genetic loci associated with attention-deficit/hyperactivity disorder (ADHD) and autism spectrum disorder (ASD). The next step is to use these results to increase our understanding of the biological mechanisms involved. Most of the identified variants likely influence gene regulation. The aim of the current study is to shed light on the mechanisms underlying the genetic signals and prioritize genes by integrating GWAS results with gene expression and DNA methylation (DNAm) levels. Methods: We applied summary-data–based Mendelian randomization to integrate ADHD and ASD GWAS data with fetal brain expression and methylation quantitative trait loci, given the early onset of these disorders. We also analyzed expression and methylation quantitative trait loci datasets of adult brain and blood, as these provide increased statistical power. We subsequently used summary-data–based Mendelian randomization to investigate if the same variant influences both DNAm and gene expression levels. Results: We identified multiple gene expression and DNAm levels in fetal brain at chromosomes 1 and 17 that were associated with ADHD and ASD, respectively, through pleiotropy at shared genetic variants. The analyses in brain and blood showed additional associated gene expression and DNAm levels at the same and additional loci, likely because of increased statistical power. Several of the associated genes have not been identified in ADHD and ASD GWASs before. Conclusions: Our findings identified the genetic variants associated with ADHD and ASD that likely act through gene regulation. This facilitates prioritization of candidate genes for functional follow-up studies
Blood lipids influence DNA methylation in circulating cells
Background: Cells can be primed by external stimuli to obtain a long-term epigenetic memory. We hypothesize that long-term exposure to elevated blood lipids can prime circulating immune cells through changes in DNA methylation, a process that may contribute to the development of atherosclerosis. To interrogate the causal relationship between triglyceride, low-density lipoprotein (LDL) cholesterol, and high-density lipoprotein (HDL) cholesterol levels and genome-wide DNA methylation while excluding confounding and pleiotropy, we perform a stepwise Mendelian randomization analysis in whole blood of 3296 individuals. Results: This analysis shows that differential methylation is the consequence of inter-individual variation in blood lipid levels and not vice versa. Specifically, we observe an effect of triglycerides on DNA methylation at three CpGs, of LDL cholesterol at one CpG, and of HDL cholesterol at two CpGs using multivariable Mendelian randomization. Using RNA-seq data available for a large subset of individuals (N = 2044), DNA methylation of these six CpGs is associated with the expression of CPT1A and SREBF1 (for triglycerides), DHCR24 (for LDL cholesterol) and
Improving Phenotypic Prediction by Combining Genetic and Epigenetic Associations
We tested whether DNA-methylation profiles account for inter-individual variation in body mass index (BMI) and height and whether they predict these phenotypes over and above genetic factors. Genetic predictors were derived from published summary results from the largest genome-wide association studies on BMI (n ∼ 350,000) and height (n ∼ 250,000) to date. We derived methylation predictors by estimating probe-trait effects in discovery samples and tested them in external samples. Methylation profiles associated with BMI in older individuals from the Lothian Birth Cohorts (LBCs, n = 1,366) explained 4.9% of the variation in BMI in Dutch adults from the LifeLines DEEP study (n = 750) but did not account for any BMI variation in adolescents from the Brisbane Systems Genetic Study (BSGS, n = 403). Methylation profiles based on the Dutch sample explained 4.9% and 3.6% of the variation in BMI in the LBCs and BSGS, respectively. Methylation profiles predicted BMI independently of genetic profiles in an additive manner: 7%, 8%, and 14% of variance of BMI in the LBCs were explained by the methylation predictor, the genetic predictor, and a model containing both, respectively. The corresponding percentages for LifeLines DEEP were 5%, 9%, and 13%, respectively, suggesting that the methylation profiles represent environmental effects. The differential effects of the BMI methylation profiles by age support previous observations of age modulation of genetic contributions. In contrast, methylation profiles accounted for almost no variation in height, consistent with a mainly genetic contribution to inter-individual variation. The BMI results suggest that combining genetic and epigenetic information might have greater utility for complex-trait prediction
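The additive pattern reported above (the joint model explaining roughly the sum of the two separate models) is what one would expect when two predictors are nearly uncorrelated. The following sketch simulates that situation under stated assumptions: a phenotype built from independent "genetic" and "methylation" scores plus noise, with variance shares (0.08 and 0.07) picked for illustration rather than taken from the study data. The helper names are hypothetical, and the joint R² uses the standard closed form for a two-predictor linear regression.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r2_single(x, y):
    """Variance in y explained by a one-predictor linear model."""
    return pearson(x, y) ** 2


def r2_joint(x1, x2, y):
    """Variance in y explained by a linear model with both predictors."""
    r1, r2, r12 = pearson(x1, y), pearson(x2, y), pearson(x1, x2)
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)


def simulate(n, var_genetic, var_methylation, seed=0):
    """Phenotype = genetic score + methylation score + noise (total variance 1)."""
    rng = random.Random(seed)
    sd_g = math.sqrt(var_genetic)
    sd_m = math.sqrt(var_methylation)
    sd_e = math.sqrt(1.0 - var_genetic - var_methylation)
    g = [rng.gauss(0.0, sd_g) for _ in range(n)]
    m = [rng.gauss(0.0, sd_m) for _ in range(n)]
    y = [a + b + rng.gauss(0.0, sd_e) for a, b in zip(g, m)]
    return g, m, y


# Illustrative variance shares only; they are not the study's estimates.
g, m, y = simulate(20000, var_genetic=0.08, var_methylation=0.07)
print(f"genetic R2:     {r2_single(g, y):.3f}")
print(f"methylation R2: {r2_single(m, y):.3f}")
print(f"joint R2:       {r2_joint(g, m, y):.3f}")
```

If the two scores were strongly correlated, the joint R² would fall below the sum of the separate values, which is why near-additivity is read as evidence that the predictors capture largely independent information.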
Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network
Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of transcription start sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probe these unassigned TSSs and show that, in all species studied, a significant fraction of CAGE peaks initiate at microsatellites, also called short tandem repeats (STRs). To confirm this transcription, we develop Cap Trap RNA-seq, a technology which combines cap trapping and long read MinION sequencing. We train sequence-based deep learning models able to predict CAGE signal at STRs with high accuracy. These models unveil the importance of STR surrounding sequences not only to distinguish STR classes, but also to predict the level of transcription initiation. Importantly, genetic variants linked to human diseases are preferentially found at STRs with high transcription initiation level, supporting the biological and clinical relevance of transcription initiation at STRs. Together, our results extend the repertoire of non-coding transcription associated with DNA tandem repeats and complexify STR polymorphism
- …