
Identifying metabolites by integrating metabolome databases with mass spectrometry cheminformatics.

Author: A El-Tayeb
AD Hanson
AL Hartman
Atsushi Ogiwara
C Fattuoni
C Ruttkies
CL Linster
DY Lee
F Allen
Gert Wohlgemuth
GJ Patti
H Sperber
H Tsugawa
H Tsugawa
H Tsugawa
HD Flosadóttir
Hiroshi Tsugawa
I Yamamoto
J Budczies
JG Jeffryes
John Meissen
K Haug
Kohei Takeuchi
M Sud
Masanori Arita
Matthew Mueller
Megan Showalter
MP Styczynski
MR Showalter
O Fiehn
O Fiehn
O Fiehn
O Khersonsky
Oliver Fiehn
Peter Beal
RR da Silva
S Kim
S Kumari
Sajjan Mehta
SE Stein
SM Rappaport
T Kind
Tobias Kind
WR Wikoff
Yuxuan Zheng
Zijuan Lai
Publication venue: eScholarship, University of California
Publication date: 01/01/2018

Novel metabolites distinct from canonical pathways can be identified through the integration of three cheminformatics tools: BinVestigate, which queries the BinBase gas chromatography-mass spectrometry (GC-MS) metabolome database to match unknowns with biological metadata across over 110,000 samples; MS-DIAL 2.0, a software tool for chromatographic deconvolution of high-resolution GC-MS or liquid chromatography-mass spectrometry (LC-MS) data; and MS-FINDER 2.0, a structure-elucidation program that uses a combination of 14 metabolome databases in addition to an enzyme promiscuity library. We showcase our workflow by annotating N-methyl-uridine monophosphate (UMP), lysomonogalactosyl-monopalmitin, N-methylalanine, and two propofol derivatives.

eScholarship - University of California

How Large Is the Metabolome? A Critical Analysis of Data Exchange Practices in Chemistry

Author: A Oikawa
AJ Williams
BJ Strasser
C Steinbeck
CA Smith
CF Taylor
D Flaxbart
DB Baker
DL Wheeler
DS Wishart
DW Hill
EL Schymanski
EL Willighagen
F Mu
FH Allen
IV Filippov
J Downing
J Park
J Rhodes
JR McDaniel
LB De Silva
LW Sumner
M Arita
MA Ott
Martin Scholz
Michael Polymenis
O Casher
O Fiehn
O Fiehn
Oliver Fiehn
P Corbett
P Ibison
P Jaiswal
P Murray-Rust
Q Cui
R Apodaca
R Austin
R Caspi
R Guha
R Kidd
S Kuhn
SE Stein
SM Paley
SR Heller
SR Heller
T Kind
T Kind
T Kind
T Kind
Tobias Kind
Y Zhou
Publication venue: Public Library of Science

Calculating the metabolome size of species by genome-guided reconstruction of metabolic pathways misses all products from orphan genes and from enzymes lacking annotated genes. Hence, metabolomes need to be determined experimentally. Annotations by mass spectrometry would greatly benefit if peer-reviewed public databases could be queried to compile target lists of structures that have already been reported for a given species. We detail current obstacles to compiling such a knowledge base of metabolites. As an example, results are presented for rice. Two rice (Oryza sativa) subspecies have been fully sequenced, japonica and indica. Several major small-molecule databases were compared for their listings of known rice metabolites: PubChem, Chemical Abstracts, Beilstein, patent databases, Dictionary of Natural Products, SetupX/BinBase, KNApSAcK DB, and finally those databases which were obtained by computational approaches, i.e. RiceCyc, KEGG, and Reactome. More than 5,000 small molecules were retrieved when searching these databases. Unfortunately, genuine rice metabolites were most often retrieved together with non-metabolite database entries such as pesticides. Overlaps between database compound lists were very difficult to compare because structures were either not encoded in machine-readable format or because compound identifiers were not cross-referenced between databases. We conclude that present databases are not capable of comprehensively retrieving all known metabolites. Metabolome lists are still mostly restricted to genome-reconstructed pathways. We suggest that providers of (bio)chemical databases cross-reference their database identifiers to PubChem IDs and InChIKeys to enable cross-database queries. In addition, peer-reviewed journal repositories need to mandate submission of structures and spectra in machine-readable format to allow automated semantic annotation of articles containing chemical structures. Such changes in publication standards and database architectures will enable researchers to compile current knowledge about the metabolome of species, which may extend to derived information such as spectral libraries, organ-specific metabolites, and cross-study comparisons.

Seven Golden Rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry

Author: A Makarov
AA Pontet
AJ Dempster
AL Rockwood
AM Richard
AW Jensen
B Seebass
BG Buchanan
C Djerassi
C Steinbeck
C Steinbeck
DA Laws
DL Olson
DL Wheeler
DR Scott
DS Wishart
F Csizmadia
H Budzikiewicz
HE Dayringer
J Braun
J Chen
J Lederberg
JC Lindon
JF Zhang
JJ Irwin
JK Senior
JL Faulon
JM Halket
JR De Laeter
L Sleno
M Badertscher
MD Soffer
ME Elyashberg
MP Balogh
N Huang
O Fiehn
O Fiehn
Oliver Fiehn
P Murray-Rust
QY Wu
RG Dromey
S Heuerding
S Noury
S Omura
SE Stein
SR Heller
SR Heller
T Fink
T Kind
T Morikawa
Tobias Kind
V Wray
W Windig
WD Ihlenfeldt
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Structure elucidation of unknown small molecules by mass spectrometry is a challenge despite advances in instrumentation. The first crucial step is to obtain correct elemental compositions. In order to automatically constrain the thousands of possible candidate structures, rules need to be developed to select the most likely and chemically correct molecular formulas. RESULTS: An algorithm for filtering molecular formulas is derived from seven heuristic rules: (1) restrictions for the number of elements, (2) LEWIS and SENIOR chemical rules, (3) isotopic patterns, (4) hydrogen/carbon ratios, (5) element ratio of nitrogen, oxygen, phosphorus, and sulphur versus carbon, (6) element ratio probabilities and (7) presence of trimethylsilylated compounds. Formulas are ranked according to their isotopic patterns and subsequently constrained by presence in public chemical databases. The seven rules were developed on 68,237 existing molecular formulas and were validated in four experiments. First, 432,968 formulas covering five million PubChem database entries were checked for consistency. Only 0.6% of these compounds did not pass all rules. Next, the rules were shown to effectively reduce the complement of all eight billion theoretically possible C, H, N, S, O, P-formulas up to 2000 Da to only 623 million most probable elemental compositions. Third, 6,000 pharmaceutical, toxic and natural compounds were selected from the DrugBank, TSCA and DNP databases. The correct formulas were retrieved as top hit at 80–99% probability when assuming data acquisition with complete resolution of unique compounds, 5% absolute isotope ratio deviation and 3 ppm mass accuracy. Last, some exemplary compounds were analyzed by Fourier transform ion cyclotron resonance mass spectrometry and by gas chromatography-time of flight mass spectrometry. In each case, the correct formula was ranked as top hit when combining the seven rules with database queries.
CONCLUSION: The seven rules enable an automatic exclusion of molecular formulas which are either wrong or which contain an unlikely high or low number of elements. The correct molecular formula is assigned with a probability of 98% if the formula exists in a compound database. For truly novel compounds that are not present in databases, the correct formula is found in the first three hits with a probability of 65–81%. Corresponding software and supplemental data are available for download from the authors' website.
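
As a rough illustration of how two of the checks named in this abstract (a mass-accuracy window and a hydrogen/carbon ratio test) might be combined, here is a minimal Python sketch. It is not the authors' software. The function names, the H/C bounds of 0.2 to 3.1, and the treatment of the measured value as a neutral monoisotopic mass are all assumptions made for the example; only the 3 ppm tolerance is taken from the abstract.

```python
# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.973762,
    "S": 31.972071,
}


def monoisotopic_mass(formula):
    """Sum element masses for a formula given as {element: count}."""
    return sum(MONOISOTOPIC_MASS[el] * n for el, n in formula.items())


def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def passes_hc_ratio(formula, low=0.2, high=3.1):
    """Hypothetical H/C ratio check; the bounds are illustrative assumptions."""
    carbons = formula.get("C", 0)
    if carbons == 0:
        return False  # this sketch only considers carbon-containing formulas
    return low <= formula.get("H", 0) / carbons <= high


def filter_candidates(measured_mass, candidates, tol_ppm=3.0):
    """Keep candidates inside the ppm window that also pass the H/C check.

    Returns (formula, ppm_error) pairs sorted by absolute mass error.
    """
    kept = []
    for formula in candidates:
        err = ppm_error(measured_mass, monoisotopic_mass(formula))
        if abs(err) <= tol_ppm and passes_hc_ratio(formula):
            kept.append((formula, err))
    return sorted(kept, key=lambda pair: abs(pair[1]))


glucose = {"C": 6, "H": 12, "O": 6}
other = {"C": 7, "H": 16, "O": 5}
hits = filter_candidates(180.0634, [glucose, other])
```

In this example only the C6H12O6 candidate survives, because the second formula lies roughly 200 ppm away from the measured mass. A fuller implementation would also need the isotopic-pattern and valence rules, which are not sketched here.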

Public Library of Science (PLOS)

Pharmacogenetics Meets Metabolomics: Discovery of Tryptophan as a New Endogenous OCT2 Substrate Related to Metformin Disposition

Author: A Saghatelian
BM Warrack
C Gieger
Christian Holscher
CM Slupsky
D Kidd
Do Yup Lee
EJ Saude
H Gu
H Koepsell
HJ Kang
Hyunmi Kim
I González
Im-Sook Song
Inmyoung Park
IS Song
IS Song
Jae-Gook Shin
JC Lindon
JK Nicholson
JL Griffin
JW Jonker
KA Lê Cao
Kwang-Hyeon Liu
Kyoung Heon Kim
L Zhang
M Assfalg
M Nakakariya
M Okuda
M Scholz
Min-Hye Shin
MK Choi
MK Leabman
N Kimura
N Paterson
O Fiehn
O Fiehn
Oliver Fiehn
P Arndt
R Zamora-Ros
RC Meyer
S Jozefczuk
S Kaewmokul
T Fujita
T Kind
Tobias Kind
X Zhang
Y Chen
Yun Gyong Ahn
Publication venue: Public Library of Science
Publication date: 08/05/2012

Genetic polymorphisms of the organic cation transporter 2 (OCT2), encoded by SLC22A2, have been investigated in association with metformin disposition. A functional decrease in transport function has been shown to be associated with the OCT2 variants. Using metabolomics, our study aims to comprehensively monitor primary metabolite changes in order to understand the biochemical alterations associated with OCT2 polymorphisms and to discover potential endogenous metabolites related to the genetic variation of OCT2. Using GC-TOF MS based metabolite profiling, clear clustering of samples was observed in Partial Least Squares Discriminant Analysis, showing that metabolic profiles were linked to the genetic variants of OCT2. Tryptophan and uridine presented the most significant alteration in the SLC22A2-808TT homozygous and the SLC22A2-808G>T heterozygous variants relative to the reference. In particular, tryptophan showed gene-dose effects of transporter activity according to OCT2 genotypes and the greatest linear association with the pharmacokinetic parameters (Clrenal, Clsec, Cl/F/kg, and Vd/F/kg) of metformin. An inhibition assay demonstrated the inhibitory effect of tryptophan on the uptake of 1-methyl-4-phenylpyridinium in a concentration-dependent manner, and a subsequent uptake experiment revealed differential tryptophan uptake rates in oocytes expressing the OCT2 reference and variant (808G>T). Our results collectively indicate that tryptophan can serve as one of the endogenous substrates of OCT2 as well as a biomarker candidate indicating the variability of the transport activity of OCT2.

FigShare

Quantum Chemistry Calculations for Metabolomics

Author: Borges Ricardo M.
Colby Sean M.
Das Susanta K.
Edison Arthur S.
Fiehn Oliver
Kind Tobias
Lee Jesi
Merrill Amy T.
Merz Kenneth M., Jr.
Metz Thomas O.
Nunez Jamie R.
Renslow Ryan S.
Tantillo Dean J.
Wang Lee-Ping
Wang Shunyang
Publication venue: Digital Commons @ Kettering University
Publication date: 12/05/2021

A primary goal of metabolomics studies is to fully characterize the small-molecule composition of complex biological and environmental samples. However, despite advances in analytical technologies over the past two decades, the majority of small molecules in complex samples are not readily identifiable due to the immense structural and chemical diversity present within the metabolome. Current gold-standard identification methods rely on reference libraries built using authentic chemical materials (“standards”), which are not available for most molecules. Computational quantum chemistry methods, which can be used to calculate chemical properties that are then measured by analytical platforms, offer an alternative route for building reference libraries, i.e., in silico libraries for “standards-free” identification. In this review, we cover the major roadblocks currently facing metabolomics and discuss applications where quantum chemistry calculations offer a solution. Several successful examples for nuclear magnetic resonance spectroscopy, ion mobility spectrometry, infrared spectroscopy, and mass spectrometry methods are reviewed. Finally, we consider current best practices, sources of error, and provide an outlook for quantum chemistry calculations in metabolomics studies. We expect this review will inspire researchers in the field of small-molecule identification to accelerate adoption of in silico methods for generation of reference libraries and to add quantum chemistry calculations as another tool at their disposal to characterize complex samples.

Kettering University

Multivariate classification of urine metabolome profiles for breast cancer diagnosis

Author: Bong Chul Chung
Byung Hwa Jung
C Denkert
D Agranoff
D Theodorescu
Doheon Lee
Imhoi Koo
J Klawitter
J-N Wang
JR Quinlan
K Kim
L Breiman
L Schnackenberg
L Vanajakshi
M Aivado
M Katajamaa
M Katajamaa
MC Walsh
O Fiehn
PGA Pedrioli
PJG Lisboa
PR Hoyer J
T Kind
WHM Heijne
Younghoon Kim
Publication venue: BioMed Central
Publication date: 01/01/2010

Software platform virtualization in chemistry research and university teaching

Author: A Saghatelian
C Border
C Steinbeck
D Bullard
D Butina
D Field
DJ Wild
DJ Wild
E Russo
EL Schymanski
G Pirok
J Gasteiger
Julie A Leary
K Skapinetz
M Hann
M Katajamaa
M Stockman
MR Marty
MS Molchanova
N Bishop
N Brown
O Spjuth
Oliver Fiehn
P Ertl
R Figueiredo
R Guha
RF Boisvert
RP Goldberg
S Wold
T Kind
TI Oprea
Tim Leamy
Tobias Kind
VV Ramkumar
W Colon
W Vogels
Publication venue: BioMed Central
Publication date: 01/11/2009

Background: Modern chemistry laboratories operate with a wide range of software applications under different operating systems, such as Windows, Linux or Mac OS X. Instead of installing software on different computers, it is possible to install those applications on a single computer using virtual machine software. Software platform virtualization allows a single host operating system to run multiple guest operating systems on the same computer. We apply and discuss the use of virtual machines in chemistry research and teaching laboratories. Results: Virtual machines are commonly used for cheminformatics software development and testing. By benchmarking multiple chemistry software packages we have confirmed that the computational speed penalty for using virtual machines is low, at around 5% to 10%. Software virtualization in a teaching environment allows faster deployment and easy use of commercial and open source software in hands-on computer teaching labs. Conclusion: Software virtualization in chemistry, mass spectrometry and cheminformatics is needed for software testing and for development of software for different operating systems. In order to obtain maximum performance, the virtualization software should be multi-core enabled and allow the use of multiprocessor configurations in the virtual machine environment. Server consolidation, by running multiple tasks and operating systems on a single physical machine, can lead to lower maintenance and hardware costs, especially in small research labs. The use of virtual machines can prevent software virus infections and security breaches when used as a sandbox system for internet access and software testing. Complex software setups can be created with virtual machines and are easily deployed later to multiple computers for hands-on teaching classes. We discuss the popularity of bioinformatics compared to cheminformatics as well as the missing cheminformatics education at universities worldwide.

Metabolomics approach for determining growth-specific metabolites based on Fourier transform ion cyclotron resonance mass spectrometry

Author: A Aharoni
A Bairoch
A Oikawa
AG Marshall
AL Boulesteix
Daisaku Ohta
DL Wheeler
DW Grogan
ER Vimr
H Suzuki
Hiroki Takahashi
J Laskin
JP Merlie
JR Laeter De
JW Gauthier
K Magnuson
Ken Kurokawa
Kenichi Tanaka
Kosuke Kai
L Stein
M Altaf-Ul-Amin
M Ishinaga
M Kanehisa
M Yano
Md. Altaf-Ul-Amin
MJ Brauer
MY Hirai
MY Hirai
Naotake Ogasawara
O Fiehn
RD Hall
S Goto
S Kanaya
SE Polakis
SG Villas-Boas
Shigehiko Kanaya
ST Ali
T Abe
T Kind
T Kind
T Soga
T Tohge
Taku Oshima
V Luca De
Y Nakamura
Yoko Shinbo
YY Chang
Publication venue: Springer-Verlag
Publication date: 01/01/2008

Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR/MS) is the best MS technology for obtaining exact mass measurements owing to its great resolution and accuracy, and several outstanding FT-ICR/MS-based metabolomics approaches have been reported. A reliable annotation scheme is needed to deal with direct-infusion FT-ICR/MS metabolic profiling. Correlation analyses can help us not only uncover relations between the ions but also annotate the ions originating from identical metabolites (metabolite derivative ions). In the present study, we propose a procedure for metabolite annotation on direct-infusion FT-ICR/MS that takes into consideration the classification of metabolite-derived ions using correlation analyses. Integrated analysis based on information from isotope relations, fragmentation patterns by MS/MS analysis, co-occurring metabolites, and database searches (KNApSAcK and KEGG) makes it possible to annotate ions as metabolites and to estimate cellular conditions based on metabolite composition. A total of 220 detected ions were classified into 174 metabolite derivative groups and 72 ions were assigned to candidate metabolites in the present work. Finally, metabolic profiling was able to distinguish between the growth stages with the aid of PCA. The model constructed using PLS regression for OD600 values as a function of metabolic profiles is very useful for identifying to what degree the ions contribute to the growth stages. Ten phospholipids which largely influence the constructed model are highly abundant in the cells. Our analyses reveal that global modification of those phospholipids occurs as E. coli enters the stationary phase. Thus, the integrated approach involving correlation analyses, metabolic profiling, and database searching is efficient for high-throughput metabolomics.
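
The idea of grouping ions that derive from the same metabolite by correlation analysis can be sketched as follows. This is only an illustration: the abstract does not state which correlation measure, threshold, or grouping method was used, so the Pearson coefficient, the 0.9 cut-off, the single-linkage grouping, and the ion labels below are all assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length intensity profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / sqrt(var_x * var_y)


def group_ions(profiles, threshold=0.9):
    """Group ions whose profiles correlate at or above the threshold.

    profiles maps an ion label to its intensities across samples.
    Ions are merged transitively (single linkage) with a union-find.
    """
    labels = list(profiles)
    parent = {label: label for label in labels}

    def find(label):
        while parent[label] != label:
            parent[label] = parent[parent[label]]  # path compression
            label = parent[label]
        return label

    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for label in labels:
        groups.setdefault(find(label), []).append(label)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical intensity profiles across four samples.
profiles = {
    "ion_A": [1.0, 2.0, 3.0, 4.0],
    "ion_B": [2.0, 4.0, 6.0, 8.1],
    "ion_C": [4.0, 1.0, 3.0, 2.0],
}
groups = group_ions(profiles)
```

Here ion_A and ion_B rise together and end up in one group, while ion_C stays on its own. Single linkage was chosen only because it is short to write; it can chain loosely related ions together, so a real analysis might prefer a stricter clustering method.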

The University of Manchester - Institutional Repository

Separating the wheat from the chaff: a prioritisation pipeline for the analysis of metabolomics datasets

Author: A Kamleh
A Scalbert
Andris Jankevics
BO Keller
CA Smith
CA Smith
E Takano
E Zelena
Eriko Takano
K Dettmer
K Nieselt
M Arita
M Oldiges
Marcel de Vries
Maria Elena Merlo
O Fiehn
PD Karp
PD Karp
R Scheltema
R Scheltema
R Scheltema
R Tautenhahn
Rainer Breitling
Roel J. Vonk
S Kol
SD Bentley
T Kind
T Sangster
VP Shah
W Lu
W Windig
Publication venue: Springer US
Publication date: 01/01/2011

Liquid Chromatography Mass Spectrometry (LC-MS) is a powerful and widely applied method for the study of biological systems, biomarker discovery and pharmacological interventions. LC-MS measurements are, however, significantly complicated by several technical challenges, including: (1) ionisation suppression/enhancement, disturbing the correct quantification of analytes, and (2) the detection of large numbers of separate derivative ions, increasing the complexity of the spectra, but not their information content. Here we introduce an experimental and analytical strategy that leads to robust metabolome profiles in the face of these challenges. Our method is based on rigorous filtering of the measured signals based on a series of sample dilutions. Such data sets have the additional characteristic that they allow a more robust assessment of detection signal quality for each metabolite. Using our method, almost 80% of the recorded signals can be discarded as uninformative, while important information is retained. As a consequence, we obtain a broader understanding of the information content of our analyses and a better assessment of the metabolites detected in the analyzed data sets. We illustrate the applicability of this method using standard mixtures, as well as cell extracts from bacterial samples. It is evident that this method can be applied in many types of LC-MS analyses and more specifically in untargeted metabolomics.
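
The abstract does not spell out the filtering criterion applied to the dilution series, so the following Python sketch uses a stand-in rule purely for illustration: a signal is kept only if its intensity never increases as the sample is diluted further and falls to at most half its starting value by the last dilution. The function names, the `min_drop` parameter, and the example signals are hypothetical.

```python
def responds_to_dilution(intensities, min_drop=0.5):
    """Return True if a signal fades as the sample is diluted.

    intensities are ordered from the most concentrated sample to the
    most diluted one. The monotonic rule and min_drop are assumptions
    made for this sketch, not the published criterion.
    """
    if len(intensities) < 2 or intensities[0] <= 0:
        return False
    monotonic = all(
        later <= earlier
        for earlier, later in zip(intensities, intensities[1:])
    )
    return monotonic and intensities[-1] <= min_drop * intensities[0]


def filter_signals(signals, min_drop=0.5):
    """Keep only the signals that respond to the dilution series."""
    return {
        name: values
        for name, values in signals.items()
        if responds_to_dilution(values, min_drop)
    }


# Hypothetical signals measured at four dilution steps.
signals = {
    "metabolite_like": [1000, 480, 260, 120],
    "background": [300, 310, 295, 305],
    "sporadic": [0, 50, 0, 20],
}
kept = filter_signals(signals)
```

In this toy data only the first signal is retained; the flat background and the sporadic signal are discarded. A correlation-based or regression-based criterion would be a natural alternative where measurement noise makes strict monotonicity too harsh.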

University of Groningen

Enlighten

Proceedings - University of Groningen

Springer

ARTS repository - University of Groningen