
3 research outputs found

Identification of metabolites from tandem mass spectra with a machine learning approach utilizing structural features

Author: Bork P.
Gavin A.C.
Kuhn M.
Li Y.
Publication venue: Oxford University Press
Publication date: 01/01/2019

MOTIVATION: Untargeted mass spectrometry is a powerful method for detecting metabolites in biological samples. However, fast and accurate identification of the metabolites' structures from MS/MS spectra is still a great challenge. RESULTS: We present a new analysis method, called SF-Matching, that is based on the hypothesis that molecules with similar structural features will exhibit similar fragmentation patterns. We combine information on fragmentation patterns of molecules with shared substructures and then use random forest models to predict whether a given structure can yield a certain fragmentation pattern. These models can then be used to score candidate molecules for a given mass spectrum. For rapid identification, we pre-compute such scores for common biological molecular structure databases. Using benchmarking datasets, we find that our method has similar performance to CSI:FingerID and that very high accuracies can be achieved by combining our method with CSI:FingerID. Rarefaction analysis of the training dataset shows that the performance of our method will increase as more experimental data become available. AVAILABILITY: SF-Matching is available from http://www.bork.embl.de/Docu/sf_matching. CONTACT: [email protected] (M.K.), [email protected] (P.B.)
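
The candidate-scoring idea in this abstract can be illustrated with a minimal sketch. It is not the SF-Matching implementation: the per-pattern models are stubbed as plain callables rather than trained random forests, combining them by summing log-probabilities is an assumption, and every name here (`score_candidate`, `rank_candidates`, the toy feature and pattern labels) is hypothetical.

```python
import math

def score_candidate(features, observed_patterns, pattern_models):
    """Sum log-probabilities that a candidate yields each observed pattern.

    pattern_models maps a pattern label to a callable returning a
    probability in [0, 1] for a set of structural features (a stand-in
    for a trained classifier; an assumption of this sketch).
    """
    score = 0.0
    for pattern in observed_patterns:
        model = pattern_models.get(pattern)
        if model is None:
            continue  # no model for this pattern: it contributes nothing
        # Clamp to avoid log(0) for overconfident models.
        p = min(max(model(features), 1e-6), 1.0 - 1e-6)
        score += math.log(p)
    return score

def rank_candidates(candidates, observed_patterns, pattern_models):
    """Return candidate names ordered from best to worst score."""
    scored = [
        (score_candidate(feats, observed_patterns, pattern_models), name)
        for name, feats in candidates.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]

# Toy data, invented for illustration only.
pattern_models = {
    "loss_H2O": lambda f: 0.9 if "hydroxyl" in f else 0.1,
    "loss_CO2": lambda f: 0.8 if "carboxyl" in f else 0.05,
}
candidates = {
    "cand_A": {"hydroxyl", "carboxyl"},
    "cand_B": {"hydroxyl"},
    "cand_C": {"amine"},
}
ranking = rank_candidates(candidates, ["loss_H2O", "loss_CO2"], pattern_models)
```

Because the scores depend only on the candidate structure and the pattern label, they could in principle be computed ahead of time for a whole structure database, which is the pre-computation the abstract mentions.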

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

MDC Repository

Archive ouverte UNIGE

Joint structural annotation of small molecules using liquid chromatography retention order and tandem mass spectrometry data

Author: Bach Eric
Rousu Juho
Schymanski Emma
Publication date: 01/01/2022

Structural annotation of small molecules in biological samples remains a key bottleneck in untargeted metabolomics, despite rapid progress in predictive methods and tools during the past decade. Liquid chromatography–tandem mass spectrometry, one of the most widely used analysis platforms, can detect thousands of molecules in a sample, the vast majority of which remain unidentified even with best-of-class methods. Here we present LC-MS2Struct, a machine learning framework for structural annotation of small-molecule data arising from liquid chromatography–tandem mass spectrometry (LC-MS2) measurements. LC-MS2Struct jointly predicts the annotations for a set of mass spectrometry features in a sample, using a novel structured prediction model trained to optimally combine the output of state-of-the-art MS2 scorers and observed retention orders. We evaluate our method on a dataset covering all publicly available reversed-phase LC-MS2 data in the MassBank reference database, including 4,327 molecules measured using 18 different LC conditions from 16 contributors, greatly expanding the chemical analytical space covered in previous multi-MS2 scorer evaluations. LC-MS2Struct obtains significantly higher annotation accuracy than earlier methods and improves the annotation accuracy of state-of-the-art MS2 scorers by up to 106%. The use of stereochemistry-aware molecular fingerprints improves prediction performance, which highlights limitations in existing approaches and has strong implications for future computational LC-MS2 developments.
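
To make the contrast between joint and independent annotation concrete, here is a toy sketch. It is not the LC-MS2Struct model, which the abstract describes as a trained structured prediction model. This version simply enumerates all assignments and adds a fixed bonus for every pair of features whose hypothetical predicted order value agrees with the observed retention order. The function name, the `weight` parameter, and the data layout are all assumptions.

```python
from itertools import product

def joint_annotate(features, weight=1.0):
    """Pick one candidate per feature, maximizing a joint score.

    features: list of (retention_time, candidates), where candidates is a
    list of (name, ms2_score, order_value). order_value is a hypothetical
    predicted retention-order score for that candidate.
    """
    best_score, best = float("-inf"), None
    # Brute force over all assignments; only viable for toy inputs.
    for assignment in product(*(cands for _, cands in features)):
        score = sum(ms2 for _, ms2, _ in assignment)
        for i in range(len(features)):
            for j in range(i + 1, len(features)):
                rt_diff = features[i][0] - features[j][0]
                order_diff = assignment[i][2] - assignment[j][2]
                if rt_diff * order_diff > 0:
                    score += weight  # predicted order matches observed order
        if score > best_score:
            best_score = score
            best = [name for name, _, _ in assignment]
    return best

# Toy data, invented for illustration only.
features = [
    (1.0, [("A1", 0.6, 0.9), ("A2", 0.5, 0.2)]),
    (2.0, [("B1", 0.7, 0.5)]),
]
joint = joint_annotate(features)                     # uses retention order
independent = joint_annotate(features, weight=0.0)   # MS2 scores only
```

With the retention-order term switched off, the first feature is annotated with its top MS2 candidate; with it switched on, the candidate whose predicted order fits the observed elution order wins instead.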

Open Repository and Bibliography - Luxembourg

Identification of metabolites from tandem mass spectra with a machine learning approach utilizing structural features

Author: Alfonso Valencia
Anne-Claude Gavin
Michael Kuhn
Peer Bork
Yuanyue Li
Publication venue: Oxford University Press (OUP)

Crossref