Unravelling Inflammatory Pathways in Parkinson's Disease: Insights from Pathway-Based Machine Learning Analysis of Transcriptomics Data.
The analysis of Parkinson's disease transcriptomics data using aggregated higher-level functional representations identified neuroinflammatory and immune response pathways. In particular, several TNF-alpha/NF-kappa B signalling complexes were identified.
Pathway-based machine learning analysis of Parkinson’s disease transcriptomics data reveals coordinated alterations in inflammatory pathways
Peer reviewed
Introduction:
Neuroinflammation may be a critical component in the progression of Parkinson's disease (PD), as increasing evidence suggests that it contributes to the degeneration of dopaminergic neurons. Despite extensive research, the underlying molecular pathways that drive neuroinflammation in PD remain largely unknown. However, machine learning and systems-level pathway analyses of omics data have the potential to uncover relevant mechanisms and processes that can help to improve the understanding of neuroinflammation in PD.
Methods:
We apply statistical and machine learning analyses on cross-sectional and longitudinal transcriptomics data from PD patients and controls, investigating both gene level alterations and aggregated functional representations, such as pathway-level, cell compartment-level and protein complex-level features. These higher-level representations allow us to identify coordinated changes in cellular compartments and processes in PD, focusing on inflammatory and immune system pathways. Apart from comparing these features statistically at baseline between patients and controls, we study consistent longitudinal changes in PD over consecutive clinical visits. Finally, we build interpretable machine learning models for motor-stage PD vs. control classification and validate them using a nested cross-validation and testing on independent hold-out data.
Results:
The results highlight significant alterations in individual genes and pathways associated with inflammation and immune response in PD patients vs. controls at baseline and longitudinally in patients only. Specifically, we identified PD progression associated patterns for pathways associated with humoral immune response, complement receptor signaling, and response to cytokine stimuli. For the prediction of baseline diagnostic status, alterations were observed in the transcriptomics of cellular processes related to the production of interleukins, receptor signaling, TNF-alpha/NF-kappa B complex, and B-cell specific protein complexes.
Conclusion:
Overall, our analyses reveal distinct patterns of gene expression alterations in inflammation and immune response pathways in PD patients compared to controls at baseline and longitudinally. These alterations also provided significant information content for building predictive machine learning models. Interestingly, the time series analyses mainly identified different affected genes and pathways than those found altered in the baseline comparison against controls, underscoring the relevance of considering temporal profiles and dynamic biomarkers. These results may contribute to a better understanding of coordinated inflammation and immune system associated changes in PD, with applications in diagnostic and prognostic biomarker development and the prioritization of cellular pathways for drug target discovery.
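The aggregated functional representations mentioned in the Methods can be illustrated with a minimal sketch. The abstract does not state how gene-level values were aggregated, so the approach below (averaging per-gene z-scores within a gene set) is an assumption chosen for simplicity, and the gene names, the gene set, and the expression values are hypothetical toy data, not study data.

```python
from statistics import mean, pstdev

def zscore_rows(expr):
    """Standardize each gene across samples (mean 0, unit variance)."""
    scaled = {}
    for gene, values in expr.items():
        mu, sd = mean(values), pstdev(values)
        # A gene with no variance carries no signal; map it to zeros.
        scaled[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return scaled

def pathway_scores(expr, gene_sets):
    """Average the z-scores of each set's measured member genes, per sample.

    ASSUMPTION: mean-of-z-scores is one simple aggregation choice; the
    source does not specify which aggregation method was used.
    """
    scaled = zscore_rows(expr)
    n_samples = len(next(iter(expr.values())))
    scores = {}
    for name, genes in gene_sets.items():
        members = [scaled[g] for g in genes if g in scaled]
        if not members:
            continue  # skip gene sets with no measured genes
        scores[name] = [mean(m[i] for m in members) for i in range(n_samples)]
    return scores

# Hypothetical toy input: 3 genes measured in 4 samples.
expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [5.0, 5.0, 5.0, 5.0],
}
# GENE_X is not measured and is ignored during aggregation.
gene_sets = {"hypothetical_pathway": ["GENE_A", "GENE_B", "GENE_X"]}
scores = pathway_scores(expr, gene_sets)
```

Each sample ends up with one score per gene set, which can then be compared between groups or passed to a classifier in place of, or alongside, the gene-level features.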
LuxPARK Metabolomics data analysis and modelling
Statistical, network and machine learning analyses of metabolomics data from the LuxPARK cohort reveal coordinated alterations in xanthine metabolism
Interpreting Omics Data in Parkinson’s Disease: A Statistical, Machine Learning, and Graph Representation Learning Approach
Parkinson’s disease (PD) is characterized by the heterogeneity and complexity of both its clinical symptoms and molecular mechanisms, which hinders the development of reliable diagnostic and prognostic biomarkers. This thesis presents an integrated approach to identify cross-sectional and longitudinal molecular signatures associated with PD diagnosis and motor symptoms by incorporating domain-specific knowledge into the analysis and modeling of blood transcriptomics and metabolomics.
Statistical analyses and machine learning algorithms were applied to identify, compare, and interpret relevant factors for predicting PD diagnosis and motor dysfunction severity using molecular measurements at baseline and over time. Both individual molecules and aggregated, higher-level functional representations of global activity changes in cellular pathways, compartments, and protein complex signatures were examined. In addition, two modelling pipelines exploiting graph representation learning on sample similarity networks and molecular interaction networks were implemented for PD case-control classification.
Although the resulting machine learning models still have limitations in terms of predictive performance, they highlight a number of robust and pronounced PD-specific changes at baseline and over time, including changes in mitochondrial β-oxidation of fatty acids and purine/xanthine metabolism. These changes remain significant when the analyses are adjusted for relevant confounders, such as the effects of dopaminergic medications on plasma metabolomics.
In addition to different machine learning methods, different feature selection approaches were evaluated, highlighting the Lasso approach with unsupervised filters as a favorable strategy. Furthermore, the investigation of longitudinal data showed that even with a limited number of available time points, identified candidate dynamic biomarkers hold promise for further validation studies in larger cohorts with multiple follow-up examinations.
Finally, the study of omics data using graph representation learning on molecular interaction networks provided mechanistic insights, confirming changes in known PD-associated genes and metabolites, and uncovering promising new candidate markers. While the use of molecular interaction networks is limited by experimental biases and the incompleteness of known interactions, networks built upon sample similarity among omics profiles can provide an unbiased graph structure, although interpretation of the results may be more challenging.
Overall, the comprehensive study of statistical, machine learning, and graph representation learning models presented in this thesis highlights the benefits of using prior domain knowledge for omics data analysis and reveals robust disease associations at the level of single molecules and higher-level representations. The work illustrates the potential of higher-level functional and network representations, together with dynamic biomarker analysis of longitudinal data, for building predictive models to study a complex and heterogeneous disease such as PD. In addition to these methodological findings, the biological results provide new insights into relevant disease mechanisms in PD and lay the groundwork for validation studies in larger, independent cohorts.
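The feature selection strategy named in the thesis abstract (Lasso combined with unsupervised filters) can be sketched in a few lines. This is an illustration under stated assumptions, not the thesis pipeline: the abstract does not say which unsupervised filter was used, so a variance filter stands in for it; a linear-regression Lasso solved by coordinate descent stands in for whatever Lasso variant was applied; and the data, the penalty `lam`, and all function names are hypothetical.

```python
from statistics import mean, pvariance

def variance_filter(X, threshold=1e-8):
    """Unsupervised filter: indices of columns whose variance exceeds threshold.

    ASSUMPTION: a variance filter is used here only as an example of an
    unsupervised filter; it ignores the outcome entirely.
    """
    return [j for j in range(len(X[0]))
            if pvariance([row[j] for row in X]) > threshold]

def center(values):
    """Subtract the mean so that no intercept term is needed."""
    mu = mean(values)
    return [v - mu for v in values]

def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso_cd(columns, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + lam*||w||_1.

    `columns` holds centered, non-constant feature columns; `y` is centered.
    """
    n, p = len(y), len(columns)
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Residual with feature j's own contribution left out.
            partial = [y[i] - sum(columns[k][i] * w[k] for k in range(p) if k != j)
                       for i in range(n)]
            rho = sum(columns[j][i] * partial[i] for i in range(n)) / n
            scale = sum(v * v for v in columns[j]) / n  # > 0 after filtering
            w[j] = soft_threshold(rho, lam) / scale
    return w

# Hypothetical toy data: feature 0 drives y, feature 1 is unrelated,
# feature 2 is constant and should be removed by the filter.
X = [[1, 3, 7], [2, 1, 7], [3, 2, 7], [4, 2, 7], [5, 1, 7], [6, 3, 7]]
y = [3, 5, 7, 9, 11, 13]

kept = variance_filter(X)                       # step 1: unsupervised filter
columns = [center([row[j] for row in X]) for j in kept]
weights = lasso_cd(columns, center(y), lam=0.1) # step 2: Lasso on what remains
selected = [j for j, w in zip(kept, weights) if w != 0.0]
```

The filter reduces the feature space without looking at the outcome, and the L1 penalty then sets the coefficients of uninformative features exactly to zero, so the surviving indices in `selected` form the reduced feature set.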
Predictive Factors and Risk Model for Positive Circumferential Resection Margin Rate after Transanal Total Mesorectal Excision in 2653 Patients with Rectal Cancer
The aim of this study was to determine the incidence of, and preoperative risk factors for, positive circumferential resection margin (CRM) after transanal total mesorectal excision (TaTME).
Background:
TaTME has the potential to further reduce the rate of positive CRM for patients with low rectal cancer, thereby improving oncological outcome.
Methods:
A prospective registry-based study including all cases recorded on the international TaTME registry between July 2014 and January 2018 was performed. Endpoints were the incidence of, and predictive factors for, positive CRM. Univariate and multivariate logistic regressions were performed, and factors for positive CRM were then assessed by formulating a predictive model.
Results:
In total, 2653 patients undergoing TaTME for rectal cancer were included. Positive CRM occurred in 107 patients (4.0%). In multivariate logistic regression analysis, a positive CRM after TaTME was significantly associated with tumors located up to 1 cm from the anorectal junction, anterior tumors, cT4 tumors, extra-mural venous invasion (EMVI), and threatened or involved CRM on baseline MRI (odds ratios 2.09, 1.66, 1.93, 1.94, and 1.72, respectively). The predictive model showed adequate discrimination (area under the receiver-operating characteristic curve >0.70), and predicted a 28% risk of positive CRM if all risk factors were present.
Conclusion:
Five preoperative tumor-related characteristics had an adverse effect on CRM involvement after TaTME. The predicted risk of positive CRM after TaTME for a specific patient can be calculated preoperatively with the proposed model and may help guide patient selection for optimal treatment and enhance a tailored treatment approach to further optimize oncological outcomes.
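How a logistic regression model turns odds ratios into a patient-specific risk can be sketched as follows. The five odds ratios and the 28% figure are taken from the abstract; everything else is an assumption. In particular, the abstract does not report the model intercept, so the baseline odds below are back-calculated from the 28% figure under the assumption of a main-effects model with no interaction terms. The sketch therefore shows the arithmetic only and is not the published risk calculator.

```python
import math

# Odds ratios reported in the abstract for the five preoperative risk factors.
# The dictionary keys are hypothetical labels chosen for this sketch.
ODDS_RATIOS = {
    "tumor_within_1cm_of_anorectal_junction": 2.09,
    "anterior_tumor": 1.66,
    "cT4_tumor": 1.93,
    "EMVI": 1.94,
    "threatened_or_involved_CRM_on_MRI": 1.72,
}

def predicted_risk(baseline_odds, present_factors):
    """Multiply the baseline odds by each present factor's odds ratio,
    then convert the resulting odds to a probability."""
    odds = baseline_odds
    for factor in present_factors:
        odds *= ODDS_RATIOS[factor]
    return odds / (1.0 + odds)

# ASSUMPTION: the intercept is not given in the abstract. The baseline odds
# are chosen so that all five factors together reproduce the reported 28%.
combined_or = math.prod(ODDS_RATIOS.values())
baseline_odds = (0.28 / (1.0 - 0.28)) / combined_or

risk_all = predicted_risk(baseline_odds, list(ODDS_RATIOS))
risk_two = predicted_risk(baseline_odds, ["anterior_tumor", "EMVI"])
```

Because odds ratios act multiplicatively on the odds rather than on the probability, the risk for any subset of factors has to be computed on the odds scale and converted back at the end. The values for partial combinations such as `risk_two` depend entirely on the back-calculated baseline and should not be read as results of the study.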