Search CORE

1,596 research outputs found

Software Tools and Approaches for Compound Identification of LC-MS/MS Data in Metabolomics.

Author: Blaženović Ivana
Fiehn Oliver
Ji Jian
Kind Tobias
Publication venue: eScholarship, University of California
Publication date: 01/05/2018
Field of study

The annotation of small molecules remains a major challenge in untargeted mass spectrometry-based metabolomics. We here critically discuss structured elucidation approaches and software that are designed to help during the annotation of unknown compounds. Only by elucidating unknown metabolites first is it possible to biologically interpret complex systems, to map compounds to pathways and to create reliable predictive metabolic models for translational and clinical research. These strategies include the construction and quality of tandem mass spectral databases such as the coalition of MassBank repositories and investigations of MS/MS matching confidence. We present in silico fragmentation tools such as MS-FINDER, CFM-ID, MetFrag, ChemDistiller and CSI:FingerID that can annotate compounds from existing structure databases and that have been used in the CASMI (critical assessment of small molecule identification) contests. Furthermore, the use of retention time models from liquid chromatography and the utility of collision cross-section modelling from ion mobility experiments are covered. Workflows and published examples of successfully annotated unknown compounds are included

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

eScholarship - University of California

Rapid prediction of NMR spectral properties with quantified uncertainty

Author: Jonas Eric
Kuhn Stefan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/07/2018
Field of study

open access articleAccurate calculation of specific spectral properties for NMR is an important step for molecular structure elucidation. Here we report the development of a novel machine learning technique for accurately predicting chemical shifts of both 1H and 13C nuclei which exceeds DFT-accessible accuracy for 13C and 1H for a subset of nuclei, while being orders of magnitude more performant. Our method produces estimates of uncertainty, allowing for robust and confident predictions, and suggests future avenues for improved performance

De Montfort University Open Research Archive

Mass Spectra Prediction with Structural Motif-based Graph Neural Networks

Author: Jo Jeonghee
Park Jiwon
Yoon Sungroh
Publication venue
Publication date: 28/06/2023
Field of study

Mass spectra, which are agglomerations of ionized fragments from targeted molecules, play a crucial role across various fields for the identification of molecular structures. A prevalent analysis method involves spectral library searches,where unknown spectra are cross-referenced with a database. The effectiveness of such search-based approaches, however, is restricted by the scope of the existing mass spectra database, underscoring the need to expand the database via mass spectra prediction. In this research, we propose the Motif-based Mass Spectrum Prediction Network (MoMS-Net), a system that predicts mass spectra using the information derived from structural motifs and the implementation of Graph Neural Networks (GNNs). We have tested our model across diverse mass spectra and have observed its superiority over other existing models. MoMS-Net considers substructure at the graph level, which facilitates the incorporation of long-range dependencies while using less memory compared to the graph transformer model.Comment: 19 pages, 3figure

arXiv.org e-Print Archive

MassFormer: Tandem Mass Spectrum Prediction with Graph Transformers

Author: Röst Hannes
Wang Bo
Young Adamo
Publication venue
Publication date: 15/11/2021
Field of study

Mass spectrometry is a key tool in the study of small molecules, playing an important role in metabolomics, drug discovery, and environmental chemistry. Tandem mass spectra capture fragmentation patterns that provide key structural information about a molecule and help with its identification. Practitioners often rely on spectral library searches to match unknown spectra with known compounds. However, such search-based methods are limited by availability of reference experimental data. In this work we show that graph transformers can be used to accurately predict tandem mass spectra. Our model, MassFormer, outperforms competing deep learning approaches for spectrum prediction, and includes an interpretable attention mechanism to help explain predictions. We demonstrate that our model can be used to improve reference library coverage on a synthetic molecule identification task. Through quantitative analysis and visual inspection, we verify that our model recovers prior knowledge about the effect of collision energy on the generated spectrum. We evaluate our model on different types of mass spectra from two independent MS datasets and show that its performance generalizes. Code available at github.com/Roestlab/massformer.Comment: 14 pages (10 without bibliography), 5 figures, 3 table

arXiv.org e-Print Archive

Analyzing Learned Molecular Representations for Property Prediction

Author: Barzilay Regina
Coley Connor
Eiden Philipp
Gao Hua
Guzman-Perez Angel
Hopper Timothy
Jaakkola Tommi
Jensen Klavs
Jin Wengong
Kelley Brian
Mathea Miriam
Palmer Andrew
Settels Volker
Swanson Kyle
Yang Kevin
Publication venue
Publication date: 03/04/2019
Field of study

Advancements in neural machinery have led to a wide range of algorithmic solutions for molecular property prediction. Two classes of models in particular have yielded promising results: neural networks applied to computed molecular fingerprints or expert-crafted descriptors, and graph convolutional neural networks that construct a learned molecular representation by operating on the graph structure of the molecule. However, recent literature has yet to clearly determine which of these two methods is superior when generalizing to new chemical space. Furthermore, prior research has rarely examined these new models in industry research settings in comparison to existing employed models. In this paper, we benchmark models extensively on 19 public and 16 proprietary industrial datasets spanning a wide variety of chemical endpoints. In addition, we introduce a graph convolutional model that consistently matches or outperforms models using fixed molecular descriptors as well as previous graph neural architectures on both public and proprietary datasets. Our empirical findings indicate that while approaches based on these representations have yet to reach the level of experimental reproducibility, our proposed model nevertheless offers significant improvements over models currently used in industrial workflows

arXiv.org e-Print Archive

DSpace@MIT

Crossref

FigShare

Collision Cross Section Prediction with Molecular Fingerprint Using Machine Learning

Author: Preud’homme H.
Samanipour S.
van Herwerden D.
Yang F.
Publication venue: 'MDPI AG'
Publication date: 29/09/2022
Field of study

High-resolution mass spectrometry is a promising technique in non-target screening (NTS) to monitor contaminants of emerging concern in complex samples. Current chemical identification strategies in NTS experiments typically depend on spectral libraries, chemical databases, and in silico fragmentation tools. However, small molecule identification remains challenging due to the lack of orthogonal sources of information (e.g., unique fragments). Collision cross section (CCS) values measured by ion mobility spectrometry (IMS) offer an additional identification dimension to increase the confidence level. Thanks to the advances in analytical instrumentation, an increasing application of IMS hybrid with high-resolution mass spectrometry (HRMS) in NTS has been reported in the recent decades. Several CCS prediction tools have been developed. However, limited CCS prediction methods were based on a large scale of chemical classes and cross-platform CCS measurements. We successfully developed two prediction models using a random forest machine learning algorithm. One of the approaches was based on chemicals’ super classes; the other model was direct CCS prediction using molecular fingerprint. Over 13,324 CCS values from six different laboratories and PubChem using a variety of ion-mobility separation techniques were used for training and testing the models. The test accuracy for all the prediction models was over 0.85, and the median of relative residual was around 2.2%. The models can be applied to different IMS platforms to eliminate false positives in small molecule identification

Multidisciplinary Digital Publishing Institute

PubMed Central

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Novel methods for the analysis of small molecule fragmentation mass spectra

Author: Hufsky Franziska
Publication venue
Publication date: 05/06/2014
Field of study

The identification of small molecules, such as metabolites, in a high throughput manner plays an important in many research areas. Mass spectrometry (MS) is one of the predominant analysis technologies and is much more sensitive than nuclear magnetic resonance spectroscopy. Fragmentation of the molecules is used to obtain information beyond its mass. Gas chromatography-MS is one of the oldest and most widespread techniques for the analysis of small molecules. Commonly, the molecule is fragmented using electron ionization (EI). Using this technique, the molecular ion peak is often barely visible in the mass spectrum or even absent. We present a method to calculate fragmentation trees from high mass accuracy EI spectra, which annotate the peaks in the mass spectrum with molecular formulas of fragments and explain relevant fragmentation pathways. Fragmentation trees enable the identification of the molecular ion and its molecular formula if the molecular ion is present in the spectrum. The method works even if the molecular ion is of very low abundance. MS experts confirm that the calculated trees correspond very well to known fragmentation mechanisms.Using pairwise local alignments of fragmentation trees, structural and chemical similarities to already-known molecules can be determined. In order to compare a fragmentation tree of an unknown metabolite to a huge database of fragmentation trees, fast algorithms for solving the tree alignment problem are required. Unfortunately the alignment of unordered trees, such as fragmentation trees, is NP-hard. We present three exact algorithms for the problem. Evaluation of our methods showed that thousands of alignments can be computed in a matter of minutes. Both the computation and the comparison of fragmentation trees are rule-free approaches that require no chemical knowledge about the unknown molecule and thus will be very helpful in the automated analysis of metabolites that are not included in common libraries

Digitale Bibliothek Thüringen