32 research outputs found

    An open software development-based ecosystem of R packages for metabolomics data analysis

    A frequent problem with scientific research software is the lack of support, maintenance and further development. In particular, development by a single researcher can easily result in orphaned software packages, especially if combined with poor documentation or lack of adherence to open software development standards. The RforMassSpectrometry initiative aims to develop an efficient and stable infrastructure for mass spectrometry (MS) data analysis. As part of this initiative, a growing ecosystem of R software packages is being developed covering different aspects of metabolomics and proteomics data analysis. To avoid the aforementioned problems, community contributions are fostered, and open development, documentation and long-term support are emphasized. At the heart of the package ecosystem is the Spectra package, which provides the core infrastructure to handle and analyze MS data. Its design allows easy expansion to support additional file or data formats, including data representations with a minimal memory footprint or remote data access. The xcms package for LC-MS data preprocessing was updated to reuse this infrastructure, which now also enables the analysis of very large or remote data. In addition, this integration simplifies complete analysis workflows, which can include the MsFeatures package for compounding and the MetaboAnnotation package for annotation of untargeted metabolomics experiments. Public annotation resources can be easily accessed through packages such as MsBackendMassbank, MsBackendMgf, MsBackendMsp or CompoundDb, the latter also allowing users to create and manage lab-specific compound databases. Finally, the MsCoreUtils and MetaboCoreUtils packages provide efficient implementations of commonly used algorithms, designed to be re-used in other R packages. Ultimately, and in contrast to a monolithic software design, the package ecosystem makes it possible to build customized, modular, and reproducible analysis workflows. Future development will focus on improved data structures and analysis methods for chromatographic data, and better interoperability with other open source software, including a direct integration with Python MS libraries.

    Visualization of proteomics data using R and Bioconductor.

    Data visualization plays a key role in high-throughput biology. It is an essential tool for data exploration, helping to shed light on data structure and patterns of interest. Visualization is also of paramount importance as a form of communicating data to a broad audience. Here, we provide a short overview of the application of the R software to the visualization of proteomics data. We present a summary of R's plotting systems and how they are used to visualize and understand raw and processed MS-based proteomics data. LG was supported by the European Union 7th Framework Program (PRIME-XS project, grant agreement number 262067) and a BBSRC Strategic Longer and Larger grant (Award BB/L002817/1). LMB was supported by a BBSRC Tools and Resources Development Fund (Award BB/K00137X/1). TN was supported by an ERASMUS Placement scholarship. This is the final published version of the article. It was originally published in Proteomics (PROTEOMICS Special Issue: Proteomics Data Visualisation, Volume 15, Issue 8, pages 1375–1389, April 2015. DOI: 10.1002/pmic.201400392). The final version is available at http://onlinelibrary.wiley.com/doi/10.1002/pmic.201400392/abstract

    Genome-wide association of the metabolic shifts underpinning dark-induced senescence in Arabidopsis

    Dark-induced senescence provokes profound metabolic shifts to recycle nutrients and to guarantee plant survival. To date, research on these processes has largely focused on characterizing mutants deficient in individual pathways. Here, we adopted a time-resolved genome-wide association-based approach to characterize dark-induced senescence by evaluating the photochemical efficiency and content of primary and lipid metabolites at the beginning, or after 3 or 6 days in darkness. We discovered six patterns of metabolic shifts and identified 215 associations with 81 candidate genes being involved in this process. Among these associations, we validated the roles of four genes associated with glycine, galactinol, threonine, and ornithine levels. We also demonstrated the function of threonine and galactinol catabolism during dark-induced senescence. Intriguingly, we determined that the association between tyrosine contents and TYROSINE AMINOTRANSFERASE 1 influences enzyme activity of the encoded protein and transcriptional activity of the gene under normal and dark conditions, respectively. Moreover, the single-nucleotide polymorphisms affecting the expression of THREONINE ALDOLASE 1 and the amino acid transporter gene AVT1B underlie the variation in threonine and glycine levels, respectively, only in the dark. Taken together, these results allow us to present a very detailed model of the metabolic aspects of dark-induced senescence, as well as the process itself.

    Conifers are a major source of sedimentary leaf wax n-alkanes when dominant on the landscape: Case studies from the Paleogene

    Paleobotanical site information; terpenoid, n-alkane, and other biomarker quantification; and carbon isotope data from sediment samples collected from North American Paleogene fossil leaf sites that extend from Colorado to the High Arctic. Sediment samples were collected laterally along fossil leaf-bearing zones. To disentangle the vegetation source of sediment n-alkanes, we measured the carbon isotope (δ13C) values of nonsteroidal triterpenoids (angiosperm biomarkers) and tricyclic diterpenoids (conifer biomarkers) to determine angiosperm and conifer end member δ13C values. Compounds were isolated using column chromatography and identified and quantified with an Agilent 7890A gas chromatograph (GC) interfaced to an Agilent 5975C quadrupole mass selective detector (MSD) and flame ionization detector (FID). Compound-specific carbon isotope analyses were performed, where possible, on n-C27 through n-C35 alkanes, diterpenoids, and triterpenoids by gas chromatograph-combustion-isotope ratio mass spectrometry (GC-C-IRMS).

    Annotation of Specialized Metabolites from High-Throughput and High Resolution Mass Spectrometry Metabolomics

    High-throughput mass spectrometry (MS) metabolomics profiling of highly complex samples allows the comprehensive detection of hundreds to thousands of metabolites under a given condition and point in time and produces information-rich data sets on known and unknown metabolites. One of the main challenges is the identification and annotation of metabolites from these complex data sets, since the number of authentic standards available for specialized metabolites is far too low to account for the number of mass spectral features. Previously, we reported two novel tools, MetNet and MetCirc, for putative annotation and structural prediction of unknown metabolites using known metabolites as baits. MetNet employs differences between m/z values of MS1 features, which correspond to metabolic transformations, and statistical associations, while MetCirc uses MS/MS features as input and calculates similarity scores of aligned spectra between features to guide the annotation of metabolites. Here, we showcase the use of MetNet and MetCirc to putatively annotate metabolites and provide detailed instructions on how they can be used. While our case studies are from plants, the tools find equal utility in studies on bacterial, fungal, or mammalian xenobiotic samples.
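
    The mass-difference idea described in this abstract can be illustrated with a minimal Python sketch. It is not the MetNet implementation or its API (MetNet is an R package): the function name, the transformation table, the tolerance and the example m/z values are all assumptions chosen for illustration, and the statistical-association step is left out entirely.

```python
from itertools import combinations

# Hypothetical transformation table (illustrative only):
# name -> monoisotopic mass shift in Da.
TRANSFORMATIONS = {
    "hydroxylation (+O)": 15.9949,
    "methylation (+CH2)": 14.0157,
    "hexosylation (+C6H10O5)": 162.0528,
}


def match_mass_differences(mz_values, transformations, tol=0.005):
    """Link pairs of MS1 features whose m/z difference matches a known shift.

    Returns a list of (index_a, index_b, transformation_name) tuples.
    `tol` is an absolute tolerance in Da (an assumed value, not a default
    taken from any published tool).
    """
    hits = []
    for (i, mz_i), (j, mz_j) in combinations(enumerate(mz_values), 2):
        diff = abs(mz_j - mz_i)
        for name, shift in transformations.items():
            if abs(diff - shift) <= tol:
                hits.append((i, j, name))
    return hits


# Made-up m/z values for four MS1 features.
features = [181.0707, 197.0656, 343.1235, 500.0000]
edges = match_mass_differences(features, TRANSFORMATIONS)

for a, b, name in edges:
    print(f"feature {a} <-> feature {b}: {name}")
```

    Each returned tuple can be read as an edge in a feature network. If one feature of a linked pair is a known metabolite, the matched transformation suggests a putative structure for the other.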