10 research outputs found

    Machine learning applied to higher order functional representations of omics data reveals biological pathways associated with Parkinson’s Disease

    Background: Despite the increasing prevalence of Parkinson’s Disease (PD) and research efforts to understand its underlying molecular pathogenesis, early diagnosis of PD remains a challenge. Machine learning analysis of blood-based omics data is a promising non-invasive approach to finding molecular fingerprints associated with PD that may enable an early and accurate diagnosis. Description: We applied several machine learning classification methods to public omics data from PD case/control studies. We used aggregation statistics and Pathifier’s pathway deregulation scores to generate higher order functional representations of the data such as pathway-level features. The models’ performance and most relevant predictive features were compared with individual feature level predictors. The resulting diagnostic models from individual features and Pathifier’s pathway deregulation scores achieve significant area under the receiver operating characteristic curve (AUC) scores for both cross-validation and external testing. Furthermore, we identify plausible biological pathways associated with PD diagnosis. Conclusions: We have successfully built machine learning models at pathway-level and single-feature level to study blood-based omics data for PD diagnosis. Plausible biological pathway associations were identified. Furthermore, we show that pathway deregulation scores can serve as robust and biologically interpretable predictors for PD.

    Graph neural networks for investigating complex diseases: A case study on Parkinson's Disease

    Omics data analysis is a critical component in the study of complex diseases, but the high dimension and heterogeneity of the data often pose challenges that are difficult to address by classical statistical and machine learning methods. Recently, structured data analyses using graph neural networks (GNNs) have emerged as a promising complementary approach, particularly for investigating the relational information between samples. However, it is still unclear which strategies for designing and optimizing GNNs are most effective when working with real-world data from complex disorders, such as Parkinson's disease (PD). Our study addresses this gap by examining the application of various GNN models, including Graph Convolutional Network, ChebyNet, and Graph Attention Network, to identify and interpret discriminative patterns between PD patients and controls using omics data. The developed pipeline integrates Lasso penalty-based feature selection, similarity graph construction, and final modeling for sample classification. Through an end-to-end model building and evaluation process, we assess the practical utility of the pipeline on independent PD omics datasets. Overall, our analyses highlight some of the benefits and challenges of using graph structure data for machine learning analysis of disease-related omics data and provide directions for further research. R-AGR-0621 - Dons Alzheimer Projekt (Dr. Glaab) (20151026-20480119) - SCHNEIDER Reinhar

    Cytogenetic status in newborns and their parents in Madrid: The BioMadrid study

    Monitoring cytogenetic damage is frequently used to assess population exposure to environmental mutagens. The cytokinesis-block micronucleus assay is one of the most widely used methods employed in these studies. In the present study we used this assay to assess the baseline frequency of micronuclei in a healthy population of father-pregnant woman-newborn trios drawn from two Madrid areas. We also investigated the association between micronucleus frequency and specific socioeconomic, environmental, and demographic factors collected by questionnaire. Mercury, arsenic, lead, and cadmium blood levels were measured by atomic absorption spectrometry. The association between micronucleated cell frequency and the variables collected by questionnaire, as well as the risk associated with the presence of elevated levels of metals in blood, was estimated using Poisson models, taking the number of micronucleated cells in 1,000 binucleated cells (MNBCs) as the dependent variable. Separate analyses were conducted for the 110 newborns, 136 pregnant women, and 134 fathers in whom micronuclei could be assessed. The mean number of micronucleated cells per 1,000 binucleated cells was 3.9, 6.5, and 6.1 respectively. Our results show a statistically significant correlation in MNBC frequency between fathers and mothers, and between parents and newborns. Elevated blood mercury levels in fathers were associated with significantly higher MNBC frequency, compared with fathers who had normal mercury levels (RR: 1.21; 95% CI: 1.02-1.43). This last result suggests the need to implement greater control over populations that, by reason of their occupation or lifestyle, are among those most exposed to this metal. Peer reviewed

    Age at onset as stratifier in idiopathic Parkinson’s disease – effect of ageing and polygenic risk score on clinical phenotypes

    Several phenotypic differences observed in Parkinson’s disease (PD) patients have been linked to age at onset (AAO). We endeavoured to find out whether these differences are due to the ageing process itself by using a combined dataset of idiopathic PD (n = 430) and healthy controls (HC; n = 556) excluding carriers of known PD-linked genetic mutations in both groups. We found several significant effects of AAO on motor and non-motor symptoms in PD, but when comparing the effects of age on these symptoms with HC (using age at assessment, AAA), only positive associations of AAA with burden of motor symptoms and cognitive impairment were significantly different between PD and HC. Furthermore, we explored a potential effect of polygenic risk score (PRS) on clinical phenotype and identified a significant inverse correlation between AAO and PRS in PD. No significant association between PRS and severity of clinical symptoms was found. We conclude that the observed non-motor phenotypic differences in PD based on AAO are largely driven by the ageing process itself and not by a specific profile of neurodegeneration linked to AAO in the idiopathic PD patients.

    Ten Quick Tips for Biomarker Discovery and Validation Analyses Using Machine Learning

    High-throughput experimental methods for biosample profiling and growing collections of clinical and health record data provide ample opportunities for biomarker discovery and medical decision support. However, many of the new data types, including single-cell omics and high-resolution cellular imaging data, also pose particular challenges for data analysis. A high dimensionality of the data in relation to small numbers of available samples, influences of additive and multiplicative noise, large numbers of uninformative or redundant data features, outliers, confounding factors and imbalanced sample group numbers are all common characteristics of current biomedical data collections. While first successes have been achieved in developing clinical decision support tools using multifactorial omics data, there is still an unmet need and great potential for earlier, more accurate and robust diagnostic and prognostic tools for many complex diseases. Here, we provide a set of broadly applicable tips to address some of the most common pitfalls and limitations for biomarker signature development, including supervised and unsupervised machine learning, feature selection and hypothesis testing approaches. In contrast to previous guidelines discussing detailed aspects of quality control, statistics or study reporting, we give a broader overview of the typical challenges and sort the quick tips to address them chronologically by the study phase (starting with study design, then covering consecutive phases of biomarker signature discovery and validation, see also the overview in Fig. 1). While these tips are not comprehensive, they are chosen to cover what we consider as the most frequent, significant, and practically relevant issues and risks in biomarker development. By pointing the reader to further relevant literature on the covered aspects of biomarker discovery and validation, we hope to provide an initial guideline and entry point into the more detailed technical and application-specific aspects of this field.

    Accurate long-read sequencing identified GBA1 as major risk factor in the Luxembourgish Parkinson’s study

    Heterozygous variants in the glucocerebrosidase GBA1 gene are an increasingly recognized risk factor for Parkinson’s disease (PD). Due to the GBAP1 pseudogene, which shares 96% sequence homology with the GBA1 coding region, accurate variant calling by array-based or short-read sequencing methods remains a major challenge in understanding the genetic landscape of GBA1-associated PD. We analyzed 660 patients with PD, 100 patients with Parkinsonism and 808 healthy controls from the Luxembourg Parkinson’s study, sequenced using amplicon-based long-read DNA sequencing technology. We found that 12.1% (77/637) of PD patients carried GBA1 variants, with 10.5% (67/637) of them carrying known pathogenic variants (including severe, mild, and risk variants). In comparison, 5% (34/675) of the healthy controls carried GBA1 variants, and among them, 4.3% (29/675) were identified as pathogenic variant carriers. We found four GBA1 variants in patients with atypical parkinsonism. Pathogenic GBA1 variants were 2.6-fold more frequently observed in PD patients compared to controls (OR = 2.6; CI = [1.6, 4.1]). Three novel variants of unknown significance (VUS) were identified. Using a structure-based approach, we defined a potential risk prediction method for VUS. This study describes the full landscape of GBA1-related parkinsonism in Luxembourg, showing a high prevalence of GBA1 variants as the major genetic risk for PD. Although the long-read DNA sequencing technique used in our study may be limited in its effectiveness to detect potential structural variants, our approach provides an important advancement for highly accurate GBA1 variant calling, which is essential for providing access to emerging causative therapies for GBA1 carriers.

    Education as Risk Factor of Mild Cognitive Impairment: The Link to the Gut Microbiome

    Peer reviewed. Background: With differences apparent in the gut microbiome in mild cognitive impairment (MCI) and dementia, and risk factors of dementia linked to alterations of the gut microbiome, the question remains if gut microbiome characteristics may mediate associations of education with MCI. Objectives: We sought to examine potential mediation of the association of education and MCI by gut microbiome diversity or composition. Design: Cross-sectional study. Setting: Luxembourg, the Greater Region (surrounding areas in Belgium, France, Germany). Participants: Control participants of the Luxembourg Parkinson’s Study. Measurements: Gut microbiome composition, ascertained with 16S rRNA gene amplicon sequencing. Differential abundance, assessed across education groups (0–10, 11–16, 16+ years of education). Alpha diversity (Chao1, Shannon and inverse Simpson indices). Mediation analysis with effect decomposition was conducted with education as exposure, MCI as outcome and gut microbiome metrics as mediators. Results: After exclusion of participants below 50, or with missing data, n=258 participants (n=58 MCI) were included (M [SD] Age=64.6 [8.3] years). Higher education (16+ years) was associated with MCI (Odds ratio natural direct effect=0.35 [95% CI 0.15–0.81]). Streptococcus and Lachnospiraceae-UCG-001 genera were more abundant in higher education. Conclusions: Education is associated with gut microbiome composition and MCI risk without clear evidence for mediation. However, our results suggest signatures of the gut microbiome that have been identified previously in AD and MCI to be reflected in lower education and suggest education as an important covariate in microbiome studies. MCI-BIOME_20193. Good health and well-bein