75 research outputs found

    The metaRbolomics Toolbox in Bioconductor and beyond

    Metabolomics aims to measure and characterise the complex composition of metabolites in a biological system. Metabolomics studies involve sophisticated analytical techniques such as mass spectrometry and nuclear magnetic resonance spectroscopy, and generate large amounts of high-dimensional and complex experimental data. Open source processing and analysis tools are of major interest in light of innovative, open and reproducible science. The scientific community has developed a wide range of open source software, providing freely available advanced processing and analysis approaches. The programming and statistics environment R has emerged as one of the most popular environments to process and analyse metabolomics datasets. A major benefit of such an environment is the possibility of connecting different tools into more complex workflows. Combining reusable data processing R scripts with the experimental data thus allows for open, reproducible research. This review provides an extensive overview of existing packages in R for different steps in a typical computational metabolomics workflow, including data processing, biostatistics, metabolite annotation and identification, and biochemical network and pathway analysis. Multifunctional workflows, possible user interfaces and integration into workflow management systems are also reviewed. In total, this review summarises more than two hundred metabolomics-specific packages primarily available on CRAN, Bioconductor and GitHub.

    Legal responsibilities of the actors in risk management


    biosigner: A New Method for the Discovery of Significant Molecular Signatures from Omics Data

    High-throughput technologies such as transcriptomics, proteomics, and metabolomics show great promise for the discovery of biomarkers for diagnosis and prognosis. Selection of the most promising candidates between the initial untargeted step and the subsequent validation phases is critical within the pipeline leading to clinical tests. Several statistical and data mining methods have been described for feature selection: in particular, wrapper approaches iteratively assess the performance of the classifier on distinct subsets of variables. Current wrappers, however, do not estimate the significance of the selected features. We therefore developed a new methodology to find the smallest feature subset which significantly contributes to the model performance, by using a combination of resampling, ranking of variable importance, significance assessment by permutation of the feature values in the test subsets, and half-interval search. We wrapped our biosigner algorithm around three reference binary classifiers (Partial Least Squares-Discriminant Analysis, Random Forest, and Support Vector Machines) which have been shown to achieve specific performances depending on the structure of the dataset. By using three real biological and clinical metabolomics and transcriptomics datasets (containing up to 7000 features), complementary signatures were obtained in a few minutes, generally providing higher prediction accuracies than the initial full model. Comparison with alternative feature selection approaches further indicated that our method provides signatures of restricted size and high stability. Finally, by using our methodology to seek metabolites discriminating type 1 from type 2 diabetic patients, several features were selected, including a fragment from the taurochenodeoxycholic bile acid. Our methodology, implemented in the biosigner R/Bioconductor package and Galaxy/Workflow4metabolomics module, should be of interest for both experimenters and statisticians to identify robust molecular signatures from large omics datasets in the process of developing new diagnostics.
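The "significance assessment by permutation of the feature values in the test subsets" mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the biosigner implementation: the nearest-centroid classifier, the function names (`nearest_centroid_fit`, `nearest_centroid_predict`, `permutation_score`), the simulated data, and the number of permutations are all assumptions chosen for illustration, and the resampling, ranking, and half-interval search steps are left out.

```python
import numpy as np

def nearest_centroid_fit(X, y):
    """Return the per-class centroids of a binary (0/1) training set."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def nearest_centroid_predict(centroids, X):
    """Assign each row of X to the class with the closest centroid."""
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def permutation_score(centroids, X_test, y_test, feature, n_perm=200, seed=0):
    """Fraction of permutations in which shuffling one test-set feature
    does NOT lower the accuracy; a small value suggests the feature matters."""
    rng = np.random.default_rng(seed)
    base = (nearest_centroid_predict(centroids, X_test) == y_test).mean()
    not_worse = 0
    for _ in range(n_perm):
        X_perm = X_test.copy()
        X_perm[:, feature] = rng.permutation(X_perm[:, feature])
        acc = (nearest_centroid_predict(centroids, X_perm) == y_test).mean()
        not_worse += int(acc >= base)
    return not_worse / n_perm

# Hypothetical toy data: only feature 0 separates the two classes.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 50)
X = rng.normal(size=(100, 2))
X[:, 0] += 4.0 * y

train = np.arange(100) % 2 == 0  # alternate samples go to the training set
centroids = nearest_centroid_fit(X[train], y[train])
p_informative = permutation_score(centroids, X[~train], y[~train], feature=0)
p_noise = permutation_score(centroids, X[~train], y[~train], feature=1)
print(p_informative, p_noise)
```

In this toy setting, shuffling the informative feature in the test subset should degrade the predictions far more often than shuffling the noise feature, which is the intuition behind keeping only features whose permutation significantly hurts the model.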

    ptairMS: real-time processing and analysis of PTR-TOF-MS data for biomarker discovery in exhaled breath

    Motivation: Analysis of volatile organic compounds (VOCs) in exhaled breath by proton transfer reaction time-of-flight mass spectrometry (PTR-TOF-MS) is of increasing interest for real-time, non-invasive diagnosis, phenotyping and therapeutic drug monitoring in the clinics. However, there is currently a lack of methods and software tools for the processing of PTR-TOF-MS data from cohorts and suited for biomarker discovery studies. Results: We developed a comprehensive suite of algorithms that process raw data from patient acquisitions and generate the table of feature intensities. Notably, we included an innovative two-dimensional peak deconvolution model based on penalized splines signal regression for accurate estimation of the temporal profile and feature quantification, as well as a method to specifically select the VOCs from exhaled breath. The workflow was implemented as the ptairMS software, which contains a graphical interface to facilitate cohort management and data analysis. The approach was validated on both simulated and experimental datasets, and we showed that the sensitivity and specificity of the VOC detection reached 99% and 98.4%, respectively, and that the error of quantification was below 8.1% for concentrations down to 19 ppb. Availability and implementation: The ptairMS software is publicly available as an R package on Bioconductor (doi: 10.18129/B9.bioc.ptairMS), as well as its companion experiment package ptairData (doi: 10.18129/B9.bioc.ptairData).

    Sacurine toy dataset - 6 samples / 3 QC / 3 Blanks

    Objective: influence of age, body mass index, and gender on the urine metabolome. Cohort: 183 employees from CEA. LC-HRMS: LTQ-Orbitrap (negative ionization mode).

    Analysis of the Human Adult Urinary Metabolome Variations with Age, Body Mass Index, and Gender by Implementing a Comprehensive Workflow for Univariate and OPLS Statistical Analyses

    Urine metabolomics is widely used for biomarker research in the fields of medicine and toxicology. As a consequence, characterization of the variations of the urine metabolome under basal conditions becomes critical in order to avoid confounding effects in cohort studies. Such physiological information is however very scarce in the literature and in metabolomics databases so far. Here we studied the influence of age, body mass index (BMI), and gender on metabolite concentrations in a large cohort of 183 adults by using liquid chromatography coupled with high-resolution mass spectrometry (LC-HRMS). We implemented a comprehensive statistical workflow for univariate hypothesis testing and modeling by orthogonal partial least-squares (OPLS), which we made available to the metabolomics community within the online Workflow4Metabolomics.org resource. We found 108 urine metabolites displaying concentration variations with either age, BMI, or gender, by integrating the results from univariate p-values and multivariate variable importance in projection (VIP). Several metabolite clusters were further evidenced by correlation analysis, and they allowed stratification of the cohort. In conclusion, our study highlights the impact of gender and age on the urinary metabolome, and thus it indicates that these factors should be taken into account for the design of metabolomics studies.

    Impact of collection conditions on the metabolite content of human urine samples as analyzed by liquid chromatography coupled to mass spectrometry and nuclear magnetic resonance spectroscopy

    There is a lack of comprehensive studies documenting the impact of sample collection conditions on metabolic composition of human urine. To address this issue, two experiments were performed at a 3-month interval, in which midstream urine samples from healthy individuals were collected, pooled, divided into several aliquots and kept under specific conditions (room temperature, 4 °C, with or without preservative) up to 72 h before storage at -80 °C. Samples were analyzed by high-performance liquid chromatography coupled to high-resolution mass spectrometry and bacterial contamination was monitored by turbidimetry. Multivariate analyses showed that urinary metabolic fingerprints were affected by the presence of preservatives and also by storage at room temperature from 24 to 72 h, whereas no change was observed for urine samples stored at 4 °C over a 72-h period. Investigations were then focused on 280 metabolites previously identified in urine: 19 of them were impacted by the kind of sample collection protocol in both experiments, including 12 metabolites affected by bacterial contamination and 7 exhibiting poor chemical stability. Finally, our results emphasize that the use of preservative prevents bacterial overgrowth, but does not avoid metabolite instability in solution, whereas storage at 4 °C inhibits bacterial overgrowth at least over a 72-h period and slows the chemical degradation process. Consequently, and for further LC/MS analyses, human urine samples should be kept at 4 °C if their collection is performed over 24 h.

    proFIA: A data preprocessing workflow for Flow Injection Analysis coupled to High-Resolution Mass Spectrometry

    Motivation: Flow Injection Analysis coupled to High-Resolution Mass Spectrometry (FIA-HRMS) is a promising approach for high-throughput metabolomics. FIA-HRMS data, however, cannot be preprocessed with current software tools which rely on liquid chromatography separation, or handle low resolution data only. Results: We thus developed the proFIA package, which implements a suite of innovative algorithms to preprocess FIA-HRMS raw files, and generates the table of peak intensities. The workflow consists of 3 steps: i) noise estimation, peak detection and quantification, ii) peak grouping across samples, and iii) missing value imputation. In addition, we have implemented a new indicator to quantify the potential alteration of the feature peak shape due to matrix effect. The preprocessing is fast (less than 15 s per file), and the value of the main parameters (ppm and dmz) can be easily inferred from the mass resolution of the instrument. Application to two metabolomics datasets (including spiked serum samples) showed high precision (96%) and recall (98%) compared with manual integration. These results demonstrate that proFIA achieves very efficient and robust detection and quantification of FIA-HRMS data, and opens new opportunities for high-throughput phenotyping. Availability: The proFIA software (as well as the plasFIA data set) is available as an R package on the Bioconductor repository (http://bioconductor.org/packages/proFIA), and as a Galaxy module on the Main Toolshed (https://toolshed.g2.bx.psu.edu/) and on the Workflow4Metabolomics online infrastructure (http://workflow4metabolomics.org).
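Step ii of the workflow described in this abstract (peak grouping across samples) depends on an m/z tolerance. The Python sketch below shows one way such a tolerance could drive a grouping, under loudly stated assumptions: it treats `ppm` as a relative tolerance and `dmz` as an absolute floor in Da, which is an interpretation rather than something the abstract states, and the greedy grouping, the function names (`mz_tolerance`, `group_peaks`), and the example values are hypothetical. It is not the proFIA algorithm.

```python
def mz_tolerance(mz, ppm, dmz):
    """Tolerance in Da at a given m/z: the relative (ppm) tolerance,
    but never smaller than the absolute floor dmz (assumed semantics)."""
    return max(mz * ppm * 1e-6, dmz)

def group_peaks(mz_values, ppm=5.0, dmz=0.001):
    """Greedily group sorted m/z values: a peak joins the current group
    when it lies within the tolerance of the group's last member."""
    groups = []
    for mz in sorted(mz_values):
        if groups and mz - groups[-1][-1] <= mz_tolerance(mz, ppm, dmz):
            groups[-1].append(mz)
        else:
            groups.append([mz])
    return groups

# Hypothetical m/z values pooled from several samples.
pooled = [500.002, 100.0000, 500.000, 100.0004, 500.010]
groups = group_peaks(pooled)
for g in groups:
    print(g)
```

With these example values, the two peaks near m/z 100 fall under the absolute floor, the peaks at 500.000 and 500.002 fall under the 5 ppm tolerance, and 500.010 starts its own group.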

    Spectral Database: from data model to web interface

    MetaboHUB is a metabolomics and fluxomics infrastructure that provides tools to research teams and partners. The Bioinformatics and Biostatistics service specializes in NMR, GC-MS and LC-MS data processing and analysis, from raw data to metabolite identification. To address the challenge of annotating these data and to centralize knowledge, a dedicated team is building software to assist in identification, including a compound and spectra database. The core of the "MetaboHUB Spectral Database", called "data-model", is a computational representation of each entity involved in spectral analysis and chemical compound identification. One of the strengths of the project is the collaboration between chemistry experts and bioinformaticians on the data model design, which ensures that the logic and constraints used in metabolomics are respected during data manipulation and storage. The software architecture allows parts of the project to be used as standalone software, available to the community. The data-model appears able to manage several types of chemical compounds (such as standards or sub-structures) and different types of spectra (MS, MS/MS and NMR; simple, JRES and multidimensional). We will be able to validate the data-model with data from the chemical libraries provided by MetaboHUB members. One of the final goals of the spectral database is to provide a computer-aided spectra identification tool that uses all these data through a web portal. Two milestones are coming: first, a mechanism to import spectral data into the data-model (and therefore into the database); second, a definition of the metadata around spectral analysis.