422 research outputs found

    Advances in structure elucidation of small molecules using mass spectrometry

    The structural elucidation of small molecules using mass spectrometry plays an important role in modern life sciences and bioanalytical approaches. This review covers different soft and hard ionization techniques and figures of merit for modern mass spectrometers, such as mass resolving power, mass accuracy, isotopic abundance accuracy, and accurate mass multiple-stage MS(n) capability, as well as hybrid mass spectrometric and orthogonal chromatographic approaches. The latter part discusses mass spectral data handling strategies, which include background and noise subtraction, adduct formation and detection, charge state determination, accurate mass measurements, elemental composition determinations, and complex data-dependent setups with ion maps and ion trees. The importance of mass spectral library search algorithms for tandem mass spectra and multiple-stage MS(n) mass spectra, as well as mass spectral tree libraries that combine multiple-stage mass spectra, is outlined. The subsequent chapter discusses mass spectral fragmentation pathways, biotransformation reactions and drug metabolism studies, the simulation and generation of in silico mass spectra, expert systems for mass spectral interpretation, and the use of computational chemistry to explain gas-phase phenomena. A single chapter discusses data handling for hyphenated approaches, including mass spectral deconvolution for clean mass spectra, cheminformatics approaches and structure retention relationships, and retention index predictions for gas and liquid chromatography. The last section reviews the current state of electronic data sharing of mass spectra and discusses the importance of software development for the advancement of structure elucidation of small molecules.
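
As a rough illustration of one figure of merit named in this abstract, mass accuracy is commonly reported as the relative deviation of a measured m/z from its theoretical value, in parts per million (ppm). The sketch below assumes that convention. The function names `ppm_error` and `within_tolerance`, the 5 ppm default, and the example values are hypothetical and are not taken from the review.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Return the signed mass error in parts per million (ppm)."""
    if theoretical_mz <= 0:
        raise ValueError("theoretical m/z must be positive")
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz: float, theoretical_mz: float,
                     tol_ppm: float = 5.0) -> bool:
    """Check whether a measurement lies inside a symmetric ppm window.

    The 5 ppm default is an illustrative assumption, not a recommendation.
    """
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


if __name__ == "__main__":
    # Hypothetical values chosen only to make the arithmetic easy to follow.
    print(round(ppm_error(500.001, 500.0), 3))
    print(within_tolerance(500.001, 500.0))
```

A relative (ppm) window rather than an absolute one in Da is used here because the same absolute deviation matters more at low m/z than at high m/z.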

    Hydrocarbon phenotyping of algal species using pyrolysis-gas chromatography mass spectrometry

    Background: Biofuels derived from algae biomass and algae lipids might reduce dependence on fossil fuels. Existing analytical techniques need to facilitate rapid characterization of algal species by phenotyping hydrocarbon-related constituents. Results: In this study, we compared the hydrocarbon-rich alga Botryococcus braunii against the photoautotrophic model alga Chlamydomonas reinhardtii using pyrolysis-gas chromatography quadrupole mass spectrometry (pyGC-MS). Sequences of up to 48 dried samples can be analyzed using pyGC-MS in an automated manner without any sample preparation. Chromatograms of 30-min run times are sufficient to profile pyrolysis products from C8 to C40 carbon chain length. The freely available software tools AMDIS and SpectConnect enable straightforward data processing. In Botryococcus samples, we identified fatty acids, vitamins, sterols, fatty acid esters, and several long-chain hydrocarbons. The algae species C. reinhardtii, B. braunii race A and B. braunii race B were readily discriminated using their hydrocarbon phenotypes. Substructure annotation and spectral clustering yielded network graphs of similar components for visual overviews of abundant and minor constituents. Conclusion: Pyrolysis-GC-MS facilitates large-scale screening of hydrocarbon phenotypes for comparisons of strain differences in algae or the impact of altered growth and nutrient conditions.

    Finding Characteristic Substructures for Metabolite Classes

    We introduce a method for finding a characteristic substructure for a set of molecular structures. Unlike common approaches, such as computing the maximum common subgraph, the resulting substructure does not have to be contained in its exact form in all input molecules. Our approach is part of the identification pipeline for unknown metabolites using fragmentation trees. Searching databases using fragmentation tree alignment results in hit lists containing compounds with large structural similarity to the unknown metabolite. The characteristic substructure of the molecules in the hit list may be a key structural element of the unknown compound and might be used as a starting point for structure elucidation. We evaluate our method on different data sets and find that it retrieves essential substructures if the input lists are not too heterogeneous. We apply our method to predict structural elements for five unknown samples from Icelandic poppy.

    Novel methods for the analysis of small molecule fragmentation mass spectra

    The identification of small molecules, such as metabolites, in a high-throughput manner plays an important role in many research areas. Mass spectrometry (MS) is one of the predominant analysis technologies and is much more sensitive than nuclear magnetic resonance spectroscopy. Fragmentation of the molecules is used to obtain information beyond their mass. Gas chromatography-MS is one of the oldest and most widespread techniques for the analysis of small molecules. Commonly, the molecule is fragmented using electron ionization (EI). Using this technique, the molecular ion peak is often barely visible in the mass spectrum or even absent. We present a method to calculate fragmentation trees from high mass accuracy EI spectra, which annotate the peaks in the mass spectrum with molecular formulas of fragments and explain relevant fragmentation pathways. Fragmentation trees enable the identification of the molecular ion and its molecular formula if the molecular ion is present in the spectrum. The method works even if the molecular ion is of very low abundance. MS experts confirm that the calculated trees correspond very well to known fragmentation mechanisms. Using pairwise local alignments of fragmentation trees, structural and chemical similarities to already-known molecules can be determined. In order to compare a fragmentation tree of an unknown metabolite to a huge database of fragmentation trees, fast algorithms for solving the tree alignment problem are required. Unfortunately, the alignment of unordered trees, such as fragmentation trees, is NP-hard. We present three exact algorithms for the problem. Evaluation of our methods showed that thousands of alignments can be computed in a matter of minutes. Both the computation and the comparison of fragmentation trees are rule-free approaches that require no chemical knowledge about the unknown molecule and thus will be very helpful in the automated analysis of metabolites that are not included in common libraries.
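
To make the notion of a fragmentation tree in this abstract more concrete, the sketch below shows one possible in-memory representation: nodes are fragment molecular formulas, and each parent-child edge is labelled with the formula difference (the loss). This illustrates the data structure only, under the assumption that edges correspond to losses. It is not the thesis's method for computing or aligning trees, and the names `parse_formula`, `neutral_loss`, `edge_losses` and the example tree are hypothetical.

```python
import re
from collections import Counter

# Matches an element symbol followed by an optional count, e.g. "C2", "H", "Cl3".
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula: str) -> Counter:
    """Parse a flat molecular formula (no brackets or charges) into element counts."""
    counts = Counter()
    for element, number in _TOKEN.findall(formula):
        counts[element] += int(number) if number else 1
    return counts


def neutral_loss(parent: str, child: str) -> dict:
    """Return the element-wise difference parent - child as a dict."""
    p, c = parse_formula(parent), parse_formula(child)
    loss = {}
    for element in set(p) | set(c):
        diff = p[element] - c[element]
        if diff < 0:
            raise ValueError(f"{child} is not a sub-formula of {parent}")
        if diff:
            loss[element] = diff
    return loss


def edge_losses(tree: dict) -> list:
    """List (parent, child, loss) for every edge of a parent -> children mapping."""
    return [(parent, child, neutral_loss(parent, child))
            for parent, children in tree.items()
            for child in children]


if __name__ == "__main__":
    # Hypothetical toy tree: a root formula with a single fragment.
    toy_tree = {"C2H6O": ["C2H4"]}
    for parent, child, loss in edge_losses(toy_tree):
        print(parent, "->", child, "loss:", loss)
```

The check for negative differences enforces the basic consistency condition that a fragment cannot contain more atoms of any element than its parent.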

    Comparing Fragmentation Trees from Electron Impact Mass Spectra with Annotated Fragmentation Pathways

    Electron impact ionization (EI) is the most common form of ionization for GC-MS analysis of small molecules. This ionization method results in a mass spectrum that does not necessarily contain the molecular ion peak. The fragmentation of small compounds during EI is well understood, but manual interpretation of mass spectra is tedious and time-consuming. Methods for automated analysis are highly sought, but are currently limited to database searching and rule-based approaches. With the computation of hypothetical fragmentation trees from high mass accuracy GC-MS data, the high-throughput interpretation of such spectra may become feasible. We compare these trees with annotated fragmentation pathways. We find that fragmentation trees explain the origin of the ions found in the mass spectra in accordance with the literature. No peak is annotated with an incorrect fragment formula and 78.7% of the fragmentation processes are correctly reconstructed.

    Bayesian methods for small molecule identification

    Confident identification of small molecules remains a major challenge in untargeted metabolomics, natural product research and related fields. Liquid chromatography-tandem mass spectrometry is a predominant technique for the high-throughput analysis of small molecules and can detect thousands of different compounds in a biological sample. The automated interpretation of the resulting tandem mass spectra is highly non-trivial, and many studies are limited to re-discovering known compounds by searching mass spectra in spectral reference libraries. But these libraries are vastly incomplete and a large portion of measured compounds remains unidentified. This constitutes a major bottleneck in the comprehensive, high-throughput analysis of metabolomics data. In this thesis, we present two computational methods that address different steps in the identification process of small molecules from tandem mass spectra. ZODIAC is a novel method for de novo, that is, database-independent, molecular formula annotation in complete datasets. It exploits similarities of compounds co-occurring in a sample to find the most likely molecular formula for each individual compound. ZODIAC improves on the currently best-performing method SIRIUS, on one dataset by 16.5-fold. We show that de novo molecular formula annotation is not just a theoretical advantage: we discover multiple novel molecular formulas absent from PubChem, one of the biggest structure databases. Furthermore, we introduce a novel scoring for CSI:FingerID, a state-of-the-art method for searching tandem mass spectra in a structure database. This scoring models dependencies between different molecular properties in a predicted molecular fingerprint via Bayesian networks. This problem has the unusual property that the marginal probabilities differ for each predicted query fingerprint. Thus, we need to apply Bayesian networks in a novel, non-standard fashion. Modeling dependencies improves on the currently best scoring.

    Non-target screening with high-resolution mass spectrometry: critical review using a collaborative trial on water analysis

    In this article, a dataset from a collaborative non-target screening trial organised by the NORMAN Association is used to review the state-of-the-art and discuss future perspectives of non-target screening using high-resolution mass spectrometry in water analysis. A total of 18 institutes from 12 European countries analysed an extract of the same water sample collected from the River Danube with either one or both of liquid and gas chromatography coupled with mass spectrometry detection. This article focuses mainly on the use of high-resolution screening techniques with target, suspect, and non-target workflows to identify substances in environmental samples. Specific examples are given to emphasise major challenges including isobaric and co-eluting substances, dependence on target and suspect lists, formula assignment, the use of retention information, and the confidence of identification. Approaches and methods applicable to unit resolution data are also discussed. Although most substances were identified using high-resolution data with target and suspect-screening approaches, some participants proposed tentative non-target identifications. This comprehensive dataset revealed that non-target analytical techniques are already substantially harmonised between the participants, but the data processing remains time-consuming. Although the objective of a "fully automated identification workflow" remains elusive in the short term, important steps in this direction have been taken, exemplified by the growing popularity of suspect screening approaches.
    Major recommendations to improve non-target screening include better integration and connection of desired features into software packages, the exchange of target and suspect lists, and the contribution of more spectra from standard substances into (openly accessible) databases. This work was supported in part by the SOLUTIONS project, which received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under Grant Agreement No. 603437.

    Confident metabolite structure annotation with COSMIC

    Small molecules are key to biomarker discovery, drug development, toxicity screenings of ecosystems like rivers and lakes, and many more important research areas in multiple life sciences. Elucidating the exact structure of these metabolites is often crucial in determining their functionality; however, confident annotation of these structures remains a major challenge. To analyse samples of small molecules occurring in nature, mass spectrometry is currently the predominant technique. While mass spectrometry is used to measure the mass of a compound, tandem mass spectrometry can be used to additionally measure the mass of its fragments. The resulting spectral data, however, are highly non-trivial to interpret. This bottleneck accelerates the development of computational tools to annotate metabolite structures from mass spectrometry data, which enables rapid, large-scale structure annotation independent from spectral libraries. These tools return some proportion of incorrect annotations, which can vastly outnumber correct annotations. Scientists using these tools need to be able to differentiate correct from incorrect annotations. We develop an E-value computation that is based on proxy decoys drawn from the PubChem database and show that this E-value score outperforms the current CSI:FingerID hit score for the task of separating correct from incorrect annotations. To further improve on this, we develop a Percolator-inspired machine learning approach, where we train linear support vector machines for this separation task. The confidence score outperforms the original CSI:FingerID hit score, the E-value score and all other tools that participated in the CASMI 2016 contest by a wide margin. Arguably, our confidence score enables confident structure annotation for a relevant portion of a dataset for the first time. We then show the power of this COSMIC workflow by annotating novel bile acid conjugate structures never reported before in a mouse fecal dataset.