
    Updates in metabolomics tools and resources: 2014-2015

    Data processing and interpretation represent the most challenging and time-consuming steps in high-throughput metabolomic experiments, regardless of the analytical platform (MS or NMR spectroscopy based) used for data acquisition. Improved machinery in metabolomics generates increasingly complex datasets that create the need for more and better processing and analysis software and in silico approaches to understand the resulting data. However, a comprehensive source of information describing the utility of the most recently developed and released metabolomics resources, in the form of tools, software, and databases, is currently lacking. Thus, here we provide an overview of freely available, open-source tools, algorithms, and frameworks to make both upcoming and established metabolomics researchers aware of recent developments, in an attempt to advance and facilitate data processing workflows in their metabolomics research. The major topics include tools and resources for data processing, data annotation, and data visualization in MS and NMR-based metabolomics. Most tools described in this review are dedicated to untargeted metabolomics workflows; however, some more specialist tools are described as well. All tools and resources described, including their analytical and computational platform dependencies, are summarized in an overview table.

    Software Tools and Approaches for Compound Identification of LC-MS/MS Data in Metabolomics.

    The annotation of small molecules remains a major challenge in untargeted mass spectrometry-based metabolomics. We here critically discuss structured elucidation approaches and software that are designed to help during the annotation of unknown compounds. Only by elucidating unknown metabolites first is it possible to biologically interpret complex systems, to map compounds to pathways and to create reliable predictive metabolic models for translational and clinical research. These strategies include the construction and quality of tandem mass spectral databases such as the coalition of MassBank repositories and investigations of MS/MS matching confidence. We present in silico fragmentation tools such as MS-FINDER, CFM-ID, MetFrag, ChemDistiller and CSI:FingerID that can annotate compounds from existing structure databases and that have been used in the CASMI (Critical Assessment of Small Molecule Identification) contests. Furthermore, the use of retention time models from liquid chromatography and the utility of collision cross-section modelling from ion mobility experiments are covered. Workflows and published examples of successfully annotated unknown compounds are included.

    The metaRbolomics Toolbox in Bioconductor and beyond

    Metabolomics aims to measure and characterise the complex composition of metabolites in a biological system. Metabolomics studies involve sophisticated analytical techniques such as mass spectrometry and nuclear magnetic resonance spectroscopy, and generate large amounts of high-dimensional and complex experimental data. Open source processing and analysis tools are of major interest in light of innovative, open and reproducible science. The scientific community has developed a wide range of open source software, providing freely available advanced processing and analysis approaches. The programming and statistics environment R has emerged as one of the most popular environments to process and analyse metabolomics datasets. A major benefit of such an environment is the possibility of connecting different tools into more complex workflows. Combining reusable data processing R scripts with the experimental data thus allows for open, reproducible research. This review provides an extensive overview of existing packages in R for different steps in a typical computational metabolomics workflow, including data processing, biostatistics, metabolite annotation and identification, and biochemical network and pathway analysis. Multifunctional workflows, possible user interfaces and integration into workflow management systems are also reviewed. In total, this review summarises more than two hundred metabolomics-specific packages primarily available on CRAN, Bioconductor and GitHub.

    Navigating the Human Metabolome for Biomarker Identification and Design of Pharmaceutical Molecules

    Metabolomics is a rapidly evolving discipline that involves the systematic study of endogenous small molecules that characterize the metabolic pathways of biological systems. The study of metabolism at a global level has the potential to contribute significantly to biomedical research, clinical medical practice, and drug discovery. In this paper, we present the most up-to-date metabolite and metabolic pathway resources, and we summarize the statistical and machine-learning tools used for the analysis of data from clinical metabolomics. Through specific applications to cancer, diabetes, neurological and other diseases, we demonstrate how these tools can facilitate diagnosis and the identification of potential biomarkers. Additionally, we discuss the increasing importance of integrating metabolomics data in drug discovery. In a case study based on the Human Metabolome Database (HMDB) and the Chinese Natural Product Database (CNPD), we demonstrate the close relatedness of the two sets of compounds, and we further illustrate how structural similarity with human metabolites could assist in the design of novel pharmaceuticals and the elucidation of the molecular mechanisms of medicinal plants.

    Molecular biological and biochemical approaches to expand the spectrum of fungal natural products

    At least 3.5 billion years ago, the first life on earth arose. This was the starting point of the evolutionary development of numerous living beings. According to current estimations, there are 10¹² different species on our planet. Most of this enormous biodiversity originates from the kingdom of bacteria and archaea. Based on these estimations, only 0.001 % of all species are known to this day. The omnipresent competition between living beings led to the development of secondary metabolism. The metabolites derived from this metabolism are not essential for survival, yet their production offers the organism various selection advantages. Plants, bacteria, and fungi are the main producers of secondary metabolites. The more than 2,140,000 known secondary metabolites can be divided into five large groups: (1) non-ribosomal polypeptides, (2) polyketides, (3) alkaloids, (4) terpenoids and steroids and (5) enzyme cofactors. Many of these natural compounds show a biological or pharmaceutical activity and were used for the development of drugs. The large number of not yet identified microorganisms harbors an enormous, mostly unused genetic potential to produce further new natural compounds. Such compounds may be suitable for the development of urgently needed new drugs. Various approaches, such as heterologous expression in suitable host organisms, are being investigated to make this potential accessible. Additionally, through synthetic biology approaches, the diversity of natural substances can be further extended, and new natural substances can be discovered or produced. In the context of research on secondary metabolites, this work focuses on three main topics: 1. The extension of the spectrum of possible substrates for prenyltransferases, by using a database to predict new substrates. 2.
The identification and characterization of previously unknown biosynthetic gene clusters, as well as the investigation of a possible application of the enzymes involved to produce new natural substances. 3. The generation of a host for the heterologous expression of secondary metabolite genes and investigation of their unknown products. Prenyltransferases catalyze the transfer of prenyl units (n × C5) to their target substrates. This is of importance, as an increase in the biological activity of prenylated compounds compared to their unprenylated counterparts has been observed for many compounds. A special property of prenyltransferases is their promiscuity with respect to the substrates. This makes them suitable candidates to produce pharmaceutically active substances. However, in practice, it is difficult to identify new substrates for prenyltransferases. In order to address this problem, a database, PrenDB, was developed for the prediction of such substrates. The predictive power of this database was experimentally tested with 38 predicted substrates by their acceptance with the prenyltransferases FtmPT1, FgapT2, and CdpNPT. For 27 of the 38 substrates, prenylation by at least one of the three tested enzymes was observed, 17 with conversion yields of more than 50 %. This proved the predictive power of the developed database and enabled the targeted selection of new potential substrates and the identification of new substrate classes. The identification of biosynthetic gene clusters and the subsequent biochemical characterization of the enzymes involved in the biosynthetic pathways form the basis for synthetic biology approaches to produce natural products. Based on the cyclic dipeptide echinulin, a possible procedure for the identification of the responsible gene cluster and the use of the involved enzymes for the biosynthesis of new substances was described. 
The enzymatic prerequisites for the biosynthesis of echinulin were determined based on the structural peculiarities of echinulin. Potential candidate gene clusters must encode one non-ribosomal peptide synthetase and several prenyltransferases. In the genome of the echinulin producer Aspergillus ruber, a gene cluster with these prerequisites was identified. Enzyme assays with the echinulin precursor cyclo-L-tryptophanyl-L-alaninyl and the heterologously produced prenyltransferases EchPT1 and EchPT2 led to a well-founded biosynthetic hypothesis and confirmed the involvement of this cluster in the biosynthesis of echinulin. The combination of EchPT1 and EchPT2 with cyclo-L-tryptophanyl-L-alaninyl as a substrate led to the formation of 7 products with different degrees of prenylation. This special property was subsequently used to prenylate further cyclic dipeptides. The stereoisomers of cyclo-tryptophanyl-alaninyl and cyclo-tryptophanyl-prolinyl were used for this purpose. Analogous to the biosynthesis of echinulin, this led to the formation of triprenylated main products prenylated at position C2, C5 and C7, as well as further di-, tri- and tetraprenylated side products. Another possibility to investigate and produce secondary metabolites is the heterologous expression in a suitable host. A potential new host for heterologous expression, Penicillium crustosum, was examined in this thesis. The genome of the fungus was sequenced and the involvement of the polyketide synthase Pcr4401 in the biosynthesis of the melanin precursor YWA1 was confirmed by deletion and expression experiments. Successful integration of foreign genes in the pcr4401 gene locus can easily be recognized by the occurrence of an albino phenotype. For better use as an expression host, a pyrG deficient strain and two plasmids were generated to integrate foreign genes into the pcr4401 gene locus. 
The applicability as an expression host was subsequently verified by the successful expression of three PKS genes and the structural elucidation of the formed products.

    Advances in structure elucidation of small molecules using mass spectrometry

    The structural elucidation of small molecules using mass spectrometry plays an important role in modern life sciences and bioanalytical approaches. This review covers different soft and hard ionization techniques and figures of merit for modern mass spectrometers, such as mass resolving power, mass accuracy, isotopic abundance accuracy, accurate mass multiple-stage MS(n) capability, as well as hybrid mass spectrometric and orthogonal chromatographic approaches. The latter part discusses mass spectral data handling strategies, which include background and noise subtraction, adduct formation and detection, charge state determination, accurate mass measurements, elemental composition determinations, and complex data-dependent setups with ion maps and ion trees. The importance of mass spectral library search algorithms for tandem mass spectra and multiple-stage MS(n) mass spectra, as well as mass spectral tree libraries that combine multiple-stage mass spectra, is outlined. The subsequent chapter discusses mass spectral fragmentation pathways, biotransformation reactions and drug metabolism studies, the mass spectral simulation and generation of in silico mass spectra, expert systems for mass spectral interpretation, and the use of computational chemistry to explain gas-phase phenomena. A single chapter discusses data handling for hyphenated approaches, including mass spectral deconvolution for clean mass spectra, cheminformatics approaches and structure retention relationships, and retention index predictions for gas and liquid chromatography. The last section reviews the current state of electronic data sharing of mass spectra and discusses the importance of software development for the advancement of structure elucidation of small molecules.