51 research outputs found

    A comparative evaluation of the generalised predictive ability of eight machine learning algorithms across ten clinical metabolomics data sets for binary classification

    Introduction: Metabolomics is increasingly being used in the clinical setting for disease diagnosis, prognosis and risk prediction. Machine learning algorithms are particularly important in the construction of multivariate metabolite prediction. Historically, partial least squares (PLS) regression has been the gold standard for binary classification. Nonlinear machine learning methods such as random forests (RF), kernel support vector machines (SVM) and artificial neural networks (ANN) may be more suited to modelling possible nonlinear metabolite covariance, and thus provide better predictive models. Objectives: We hypothesise that for binary classification using metabolomics data, non-linear machine learning methods will provide superior generalised predictive ability when compared to linear alternatives, in particular when compared with the current gold standard PLS discriminant analysis. Methods: We compared the general predictive performance of eight archetypal machine learning algorithms across ten publicly available clinical metabolomics data sets. The algorithms were implemented in the Python programming language. All code and results have been made publicly available as Jupyter notebooks. Results: There was only marginal improvement in predictive ability for SVM and ANN over PLS across all data sets. RF performance was comparatively poor. The use of out-of-bag bootstrap confidence intervals provided a measure of uncertainty of model prediction such that the quality of metabolomics data was observed to be a bigger influence on generalised performance than model choice. Conclusion: The size of the data set, and choice of performance metric, had a greater influence on generalised predictive performance than the choice of machine learning algorithm

    Migrating from partial least squares discriminant analysis to artificial neural networks: A comparison of functionally equivalent visualisation and feature contribution tools using Jupyter Notebooks

    Introduction: Metabolomics data is commonly modelled multivariately using partial least squares discriminant analysis (PLS-DA). Its success is primarily due to ease of interpretation, through projection to latent structures, and transparent assessment of feature importance using regression coefficients and Variable Importance in Projection scores. In recent years several non-linear machine learning (ML) methods have grown in popularity but with limited uptake essentially due to convoluted optimisation and interpretation. Artificial neural networks (ANNs) are a non-linear projection-based ML method that share a structural equivalence with PLS, and as such should be amenable to equivalent optimisation and interpretation methods. Objectives: We hypothesise that standardised optimisation, visualisation, evaluation and statistical inference techniques commonly used by metabolomics researchers for PLS-DA can be migrated to a non-linear, single hidden layer, ANN. Methods: We compared a standardised optimisation, visualisation, evaluation and statistical inference techniques workflow for PLS with the proposed ANN workflow. Both workflows were implemented in the Python programming language. All code and results have been made publicly available as Jupyter notebooks on GitHub. Results: The migration of the PLS workflow to a non-linear, single hidden layer, ANN was successful. There was a similarity in significant metabolites determined using PLS model coefficients and ANN Connection Weight Approach. Conclusion: We have shown that it is possible to migrate the standardised PLS-DA workflow to simple non-linear ANNs. This result opens the door for more widespread use and to the investigation of transparent interpretation of more complex ANN architectures

    Toward collaborative open data science in metabolomics using Jupyter Notebooks and cloud computing

    Background: A lack of transparency and reporting standards in the scientific community has led to increasing and widespread concerns relating to reproduction and integrity of results. As an omics science, which generates vast amounts of data and relies heavily on data science for deriving biological meaning, metabolomics is highly vulnerable to irreproducibility. The metabolomics community has made substantial efforts to align with FAIR data standards by promoting open data formats, data repositories, online spectral libraries, and metabolite databases. Open data analysis platforms also exist; however, they tend to be inflexible and rely on the user to adequately report their methods and results. To enable FAIR data science in metabolomics, methods and results need to be transparently disseminated in a manner that is rapid, reusable, and fully integrated with the published work. To ensure broad use within the community such a framework also needs to be inclusive and intuitive for both computational novices and experts alike. Aim of Review: To encourage metabolomics researchers from all backgrounds to take control of their own data science, mould it to their personal requirements, and enthusiastically share resources through open science. Key Scientific Concepts of Review: This tutorial introduces the concept of interactive web-based computational laboratory notebooks. The reader is guided through a set of experiential tutorials specifically targeted at metabolomics researchers, based around the Jupyter Notebook web application, GitHub data repository, and Binder cloud computing platform

    Protection of specific maternal messenger RNAs by the P body protein CGH-1 (Dhh1/RCK) during Caenorhabditis elegans oogenesis

    During oogenesis, numerous messenger RNAs (mRNAs) are maintained in a translationally silenced state. In eukaryotic cells, various translation inhibition and mRNA degradation mechanisms congregate in cytoplasmic processing bodies (P bodies). The P body protein Dhh1 inhibits translation and promotes decapping-mediated mRNA decay together with Pat1 in yeast, and has been implicated in mRNA storage in metazoan oocytes. Here, we have investigated in Caenorhabditis elegans whether Dhh1 and Pat1 generally function together, and how they influence mRNA sequestration during oogenesis. We show that in somatic tissues, the Dhh1 orthologue (CGH-1) forms Pat1 (patr-1)-dependent P bodies that are involved in mRNA decapping. In contrast, during oogenesis, CGH-1 forms patr-1–independent mRNA storage bodies. CGH-1 then associates with translational regulators and a specific set of maternal mRNAs, and prevents those mRNAs from being degraded. Our results identify somatic and germ cell CGH-1 functions that are distinguished by the involvement of PATR-1, and reveal that during oogenesis, numerous translationally regulated mRNAs are specifically protected by a CGH-1–dependent mechanism

    Metabolomics reveals mouse plasma metabolite responses to acute exercise and effects of disrupting AMPK-glycogen interactions

    Introduction: The AMP-activated protein kinase (AMPK) is a master regulator of energy homeostasis that becomes activated by exercise and binds glycogen, an important energy store required to meet exercise-induced energy demands. Disruption of AMPK-glycogen interactions in mice reduces exercise capacity and impairs whole-body metabolism. However, the mechanisms underlying these phenotypic effects at rest and following exercise are unknown. Furthermore, the plasma metabolite responses to an acute exercise challenge in mice remain largely uncharacterized. Methods: Plasma samples were collected from wild type (WT) and AMPK double knock-in (DKI) mice with disrupted AMPK-glycogen binding at rest and following 30-min submaximal treadmill running. An untargeted metabolomics approach was utilized to determine the breadth of plasma metabolite changes occurring in response to acute exercise and the effects of disrupting AMPK-glycogen binding. Results: Relative to WT mice, DKI mice had reduced maximal running speed (p < 0.0001) concomitant with increased body mass (p < 0.01) and adiposity (p < 0.001). A total of 83 plasma metabolites were identified/annotated, with 17 metabolites significantly different (p < 0.05; FDR<0.1) in exercised (↑6; ↓11) versus rested mice, including amino acids, acylcarnitines and steroid hormones. Pantothenic acid was reduced in DKI mice versus WT. Distinct plasma metabolite profiles were observed between the rest and exercise conditions and between WT and DKI mice at rest, while metabolite profiles of both genotypes converged following exercise. These differences in metabolite profiles were primarily explained by exercise-associated increases in acylcarnitines and steroid hormones as well as decreases in amino acids and derivatives following exercise. DKI plasma showed greater decreases in amino acids following exercise versus WT. 
    Conclusion: This is the first study to map mouse plasma metabolomic changes following a bout of acute exercise in WT mice and the effects of disrupting AMPK-glycogen interactions in DKI mice. Untargeted metabolomics revealed alterations in metabolite profiles between rested and exercised mice in both genotypes, and between genotypes at rest. This study has uncovered known and previously unreported plasma metabolite responses to acute exercise in WT mice, as well as greater decreases in amino acids following exercise in DKI plasma. Reduced pantothenic acid levels may contribute to differences in fuel utilization in DKI mice

    Changes to the gut microbiome in young children showing early behavioral signs of autism

    The human gut microbiome has increasingly been associated with autism spectrum disorder (ASD), which is a neurological developmental disorder, characterized by impairments to social interaction. The ability of the gut microbiota to signal across the gut-brain-microbiota axis with metabolites, including short-chain fatty acids, impacts brain health and has been identified to play a role in the gastrointestinal and developmental symptoms affecting autistic children. The fecal microbiome of older children with ASD has repeatedly shown particular shifts in the bacterial and fungal microbial community, which are significantly different from age-matched neurotypical controls, but it is still unclear whether these characteristic shifts are detectable before diagnosis. Early microbial colonization patterns can have long-lasting effects on human health, and pre-emptive intervention may be an important mediator to more severe autism. In this study, we characterized both the microbiome and short-chain fatty acid concentrations of fecal samples from young children between 21 and 40 months who were showing early behavioral signs of ASD. The fungal richness and acetic acid concentrations were observed to be higher with increasing autism severity, and the abundance of several bacterial taxa also changed due to the severity of ASD. Bacterial diversity and SCFA concentrations were also associated with stool form, and some bacterial families were found with differential abundance according to stool firmness. An exploratory analysis of the microbiome associated with pre-emptive treatment also showed significant differences at multiple taxonomic levels. These differences may impact the microbial signaling across the gut-brain-microbiota axis and the neurological development of the children

    Metabolomics analysis identifies sex-associated metabotypes of oxidative stress and the autotaxin-lysoPA axis in COPD.

    Chronic obstructive pulmonary disease (COPD) is a heterogeneous disease and a leading cause of mortality and morbidity worldwide. The aim of this study was to investigate the sex dependency of circulating metabolic profiles in COPD. Serum from healthy never-smokers (healthy), smokers with normal lung function (smokers), and smokers with COPD (COPD; Global Initiative for Chronic Obstructive Lung Disease stages I-II/A-B) from the Karolinska COSMIC cohort (n=116) was analysed using our nontargeted liquid chromatography-high resolution mass spectrometry metabolomics platform. Pathway analyses revealed that several altered metabolites are involved in oxidative stress. Supervised multivariate modelling showed significant classification of smokers from COPD (p=2.8×10⁻⁷). Sex stratification indicated that the separation was driven by females (p=2.4×10⁻⁷) relative to males (p=4.0×10⁻⁴). Significantly altered metabolites were confirmed quantitatively using targeted metabolomics. Multivariate modelling of targeted metabolomics data confirmed enhanced metabolic dysregulation in females with COPD (p=3.0×10⁻³) relative to males (p=0.10). The autotaxin products lysoPA (16:0) and lysoPA (18:2) correlated with lung function (forced expiratory volume in 1 s) in males with COPD (r=0.86; p<0.0001), but not females (r=0.44; p=0.15), potentially related to observed dysregulation of the miR-29 family in the lung. These findings highlight the role of oxidative stress in COPD, and suggest that sex-enhanced dysregulation in oxidative stress, and potentially the autotaxin-lysoPA axis, are associated with disease mechanisms and/or prevalence

    Guidelines and considerations for the use of system suitability and quality control samples in mass spectrometry assays applied in untargeted clinical metabolomic studies

    Background: Quality assurance (QA) and quality control (QC) are two quality management processes that are integral to the success of metabolomics including their application for the acquisition of high quality data in any high-throughput analytical chemistry laboratory. QA defines all the planned and systematic activities implemented before samples are collected, to provide confidence that a subsequent analytical process will fulfil predetermined requirements for quality. QC can be defined as the operational techniques and activities used to measure and report these quality requirements after data acquisition. Aim of review: This tutorial review will guide the reader through the use of system suitability and QC samples, why these samples should be applied and how the quality of data can be reported. Key scientific concepts of review: System suitability samples are applied to assess the operation and lack of contamination of the analytical platform prior to sample analysis. Isotopically-labelled internal standards are applied to assess system stability for each sample analysed. Pooled QC samples are applied to condition the analytical platform, perform intra-study reproducibility measurements (QC) and to correct mathematically for systematic errors. Standard reference materials and long-term reference QC samples are applied for inter-study and inter-laboratory assessment of data

    Maternal prebiotic supplementation during pregnancy and lactation modifies the microbiome and short chain fatty acid profile of both mother and infant

    Background & aims: Improving maternal gut health in pregnancy and lactation is a potential strategy to improve immune and metabolic health in offspring and curtail the rising rates of inflammatory diseases linked to alterations in gut microbiota. Here, we investigate the effects of a maternal prebiotic supplement (galacto-oligosaccharides and fructo-oligosaccharides), ingested daily from <21 weeks’ gestation to six months’ post-partum, in a double-blinded, randomised placebo-controlled trial. Methods: Stool samples were collected at multiple timepoints from 74 mother–infant pairs as part of a larger, double-blinded, randomised controlled allergy intervention trial. The participants were randomised to one of two groups; with one group receiving 14.2 g per day of prebiotic powder (galacto-oligosaccharides GOS and fructo-oligosaccharides FOS in ratio 9:1), and the other receiving a placebo powder consisting of 8.7 g per day of maltodextrin. The faecal microbiota of both mother and infants were assessed based on the analysis of bacterial 16S rRNA gene (V4 region) sequences, and short chain fatty acid (SCFA) concentrations in stool. Results: Significant differences in the maternal microbiota profiles between baseline and either 28-weeks’ or 36-weeks’ gestation were found in the prebiotic supplemented women. Infant microbial beta-diversity also significantly differed between prebiotic and placebo groups at 12-months of age. Supplementation was associated with increased abundance of commensal Bifidobacteria in the maternal microbiota, and a reduction in the abundance of Negativicutes in both maternal and infant microbiota. There were also changes in SCFA concentrations with maternal prebiotics supplementation, including significant differences in acetic acid concentration between intervention and control groups from 20 to 28-weeks’ gestation.
    Conclusion: Maternal prebiotic supplementation of 14.2 g per day GOS/FOS was found to favourably modify both the maternal and the developing infant gut microbiome. These results build on our understanding of the importance of maternal diet during pregnancy, and indicate that it is possible to intervene and modify the development of the infant microbiome by dietary modulation of the maternal gut microbiome