
    An interactive online software platform for the analysis of small molecules using hyphenated mass spectrometry: MeltDB and ALLocator

    Kessler N. An interactive online software platform for the analysis of small molecules using hyphenated mass spectrometry: MeltDB and ALLocator. Bielefeld: Universität Bielefeld; 2018

    Learning to classify organic and conventional wheat - a machine-learning driven approach using the MeltDB 2.0 metabolomics analysis platform

    Kessler N, Bonte A, Albaum S, et al. Learning to classify organic and conventional wheat - a machine-learning driven approach using the MeltDB 2.0 metabolomics analysis platform. Frontiers in Bioinformatics and Computational Biology. 2015;3: 35.
    We present results of our machine learning approach to the problem of classifying GC-MS data originating from wheat grains of different farming systems. The aim is to investigate the potential of learning algorithms to classify GC-MS data as coming from either conventionally grown or organically grown samples, taking different cultivars into account. Our work is motivated by today's increased demand for organic food in post-industrialized societies and the necessity to prove organic food authenticity. Our data set comprises up to eleven wheat cultivars that were cultivated in both farming systems, organic and conventional, over three years. More than 300 GC-MS measurements were recorded and subsequently processed and analyzed in the MeltDB 2.0 metabolomics analysis platform, which is briefly outlined in this paper. We further describe how unsupervised (t-SNE, PCA) and supervised (RF, SVM) methods can be applied for sample visualization and classification. Our results clearly show that the year has the most influence and the wheat cultivar the second-most influence on the metabolic composition of a sample. We can also show that, for a given year and cultivar, organic and conventional cultivation can be distinguished by machine-learning algorithms.

    DATA ANALYSIS WORKFLOW FOR GAS CHROMATOGRAPHY MASS SPECTROMETRY-BASED METABOLOMICS STUDIES

    Metabolomics has emerged as an integral part of systems biology research that attempts to comprehensively study low molecular weight organic and inorganic metabolites under certain conditions within a biological system. Technological advances in the past decade have made it possible to carry out metabolomics studies in a high-throughput fashion using gas chromatography coupled with mass spectrometry. As a result, large volumes of data are produced from these studies and there is a pressing need for algorithms that can efficiently process and analyze the data in a high-throughput fashion as well. To address this need, we have developed computational algorithms and an associated software tool named Automated Data Analysis Pipeline (ADAP). ADAP allows data to flow seamlessly through the data processing steps, which include denoising, peak detection, deconvolution, alignment, compound identification and quantitation. The development of ADAP started in 2009 and the past four years have witnessed continuous improvements in its performance from ADAP-GC 1.0, to ADAP-GC 2.0, and to the current ADAP-GC 3.0. As part of the performance assessment of ADAP-GC, we have compared it with three other software tools. In this dissertation, I will present the computational details of these three versions of ADAP-GC, the capabilities of the software tool, and the results of the software comparison.

    FELLA: an R package to enrich metabolomics data

    Background: Pathway enrichment techniques are useful for understanding experimental metabolomics data. Their purpose is to give context to the affected metabolites in terms of the prior knowledge contained in metabolic pathways. However, the interpretation of a prioritized pathway list is still challenging, as pathways show overlap and cross talk effects. Results: We introduce FELLA, an R package to perform a network-based enrichment of a list of affected metabolites. FELLA builds a hierarchical representation of an organism's biochemistry from the Kyoto Encyclopedia of Genes and Genomes (KEGG), containing pathways, modules, enzymes, reactions and metabolites. In addition to providing a list of pathways, FELLA reports intermediate entities (modules, enzymes, reactions) that link the input metabolites to them. This sheds light on pathway cross talk and potential enzymes or metabolites as targets for the condition under study. FELLA has been applied to six public datasets (three from Homo sapiens, two from Danio rerio and one from Mus musculus) and has reproduced findings from the original studies and from independent literature. Conclusions: The R package FELLA offers an innovative enrichment concept starting from a list of metabolites, based on a knowledge graph representation of the KEGG database that focuses on interpretability. Besides reporting a list of pathways, FELLA suggests intermediate entities that are of interest per se. Its usefulness has been shown at several molecular levels on six public datasets, including human and animal models. The user can run the enrichment analysis through a simple interactive graphical interface or programmatically. FELLA is publicly available in Bioconductor under the GPL-3 license.

    IMass time: The future, in future!

    Joseph John Thomson discovered and proved the existence of electrons through a series of experiments. His work earned him a Nobel Prize in 1906 and initiated the era of mass spectrometry (MS). In the intervening time, other researchers have also been awarded the Nobel Prize for significant advances in MS technology. The development of soft ionization techniques was central to the application of MS to large biological molecules and led to an unprecedented interest in the study of biomolecules such as proteins (proteomics), metabolites (metabolomics), carbohydrates (glycomics), and lipids (lipidomics), allowing a better understanding of the molecular underpinnings of health and disease. The interest in large molecules drove improvements in MS resolution, and the challenge now lies in data deconvolution, intelligent exploitation of heterogeneous data, and interpretation, all of which can be ameliorated with a proposed IMass technology. We define IMass as a combination of MS and artificial intelligence, with each performing a specific role. IMass will offer advantages such as improved speed and sensitivity, and analyses of large data that are presently not possible with MS alone. In this study, we present an overview of MS, considering historical perspectives, applications, and challenges, as well as insightful highlights of IMass.

    Development and application of a platform for harmonisation and integration of metabolomics data

    Integrating diverse metabolomics data for molecular epidemiology analyses provides both opportunities and challenges in the field of human health research. Combining patient cohorts may improve power and sensitivity of analyses but is challenging due to significant technical and analytical variability. Additionally, current systems for the storage and analysis of metabolomics data suffer from scalability, query-ability, and integration issues that limit their adoption for molecular epidemiological research. Here, a novel platform for integrative metabolomics is developed, which addresses issues of storage, harmonisation, querying, scaling, and analysis of large-scale metabolomics data. Its use is demonstrated through an investigation of molecular trends of ageing in an integrated four-cohort dataset where the advantages and disadvantages of combining balanced and unbalanced cohorts are explored, and robust metabolite trends are successfully identified and shown to be concordant with previous studies.

    Ranking metabolite sets by their activity levels

    Related metabolites can be grouped into sets in many ways, e.g., by their participation in series of chemical reactions (forming metabolic pathways), or based on fragmentation spectral similarities or shared chemical substructures. Understanding how such metabolite sets change in relation to experimental factors can be very useful in the interpretation of complex metabolomics data sets. However, many of the available tools that are used to perform this analysis are not entirely suitable for the analysis of untargeted metabolomics measurements. Here, we present PALS (Pathway Activity Level Scoring), a Python library, command line tool, and Web application that performs the ranking of significantly changing metabolite sets over different experimental conditions. The main algorithm in PALS is based on the pathway level analysis of gene expression (PLAGE) factorisation method and is denoted as mPLAGE (PLAGE for metabolomics). As an example of an application, PALS is used to analyse metabolites grouped as metabolic pathways and by shared tandem mass spectrometry fragmentation patterns. A comparison of mPLAGE with two other commonly used methods (overrepresentation analysis (ORA) and gene set enrichment analysis (GSEA)) is also given and reveals that mPLAGE is more robust to missing features and noisy data than the alternatives. As further examples, PALS is also applied to human African trypanosomiasis, Rhamnaceae, and American Gut Project data. In addition, normalisation can have a significant impact on pathway analysis results, and PALS offers a framework to further investigate this. PALS is freely available from our project Web site.

    Computational methods for high-throughput metabolomics

    Hoffmann N. Computational methods for high-throughput metabolomics. Bielefeld: Universität Bielefeld; 2014.
    The advent of analytical technologies being broadly and routinely applied in biology and biochemistry for the analysis and characterization of small molecules in biological organisms has brought with it the need to process, analyze, compare, and evaluate large amounts of experimental data in a highly automated fashion. The most prominent methods used in these fields are chromatographic methods capable of separating complex mixtures of chemical compounds by properties like size or charge, coupled to mass spectrometry detectors that measure the mass and intensity of a compound's ion or its fragments eluting from the chromatographic separation system. One major problem in these high-throughput applications is the automatic extraction of features quantifying the compounds contained in the measured results and their reliable association among multiple measurements for quantification and statistical analysis. The main goal of this thesis is the creation of scalable and robust methods for highly automated processing of large numbers of samples. Of special importance is the comparison of different samples in order to find similarities and differences in the context of metabolomics, the study of small chemical compounds in biological organisms. We herein describe novel algorithms for retention time alignment of peak and chromatogram data from one- and two-dimensional gas chromatography-mass spectrometry experiments in the application area of metabolomics. We also perform a comprehensive evaluation of each method against other state-of-the-art methods on publicly available datasets with genuine biological backgrounds. In addition to these methods, we also describe the underlying software framework Maltcms and the accompanying graphical user interface Maui, and demonstrate their use on instructive application examples.