Identification and Quantification of 1-Hydroxybutene-2-yl Mercapturic Acid in Human Urine by UPLC-HILIC-MS/MS as a Novel Biomarker for 1,3-Butadiene Exposure
1,3-Butadiene (BD) is a Class 1 carcinogen present in workplaces,
in polluted air, in automobile exhaust, and in tobacco smoke. 2-Hydroxybutene-1-yl
mercapturic acid (2-MHBMA) is a urinary metabolite often measured
as a biomarker for exposure to BD. Here, we show for the first time
that an additional MHBMA isomer, 1-hydroxybutene-2-yl mercapturic acid
(1-MHBMA), is present in significant amounts in human urine. For
its quantification, a highly sensitive UPLC-HILIC-MS/MS method was
developed and validated. Analyzing urinary samples of 183 volunteers,
we demonstrate that 1-MHBMA is a novel and potentially more reliable
biomarker for BD exposure than the commonly analyzed 2-MHBMA.
Bayesian Independent Component Analysis Recovers Pathway Signatures from Blood Metabolomics Data
Interpreting the complex interplay of metabolites in heterogeneous
biosamples still poses a challenging task. In this study, we propose
independent component analysis (ICA) as a multivariate analysis tool
for the interpretation of large-scale metabolomics data. In particular,
we employ a Bayesian ICA method based on a mean-field approach, which
allows us to statistically infer the number of independent components
to be reconstructed. The advantage of ICA over correlation-based methods
like principal component analysis (PCA) is the utilization of higher
order statistical dependencies, which not only yield additional information
but also allow a more meaningful representation of the data with fewer
components. We performed the described ICA approach on a large-scale
metabolomics data set of human serum samples, comprising a total of
1764 study probands with 218 measured metabolites. Inspecting the <i>source matrix</i> of statistically independent metabolite profiles
using a weighted enrichment algorithm, we observe strong enrichment
of specific metabolic pathways in all components. This includes signatures
from amino acid metabolism, energy-related processes, carbohydrate
metabolism, and lipid metabolism. Our results imply that the human
blood metabolome is composed of a distinct set of overlaying, statistically
independent signals. ICA furthermore produces a <i>mixing matrix</i>, describing the strength of each independent component for each
of the study probands. Correlating these values with plasma high-density
lipoprotein (HDL) levels, we establish a novel association between
HDL plasma levels and the branched-chain amino acid pathway. We conclude
that the Bayesian ICA methodology has the power and flexibility to
replace many of the PCA and clustering-based analyses that are
currently common in the research field.
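The contrast this abstract draws between correlation-based methods and ICA can be illustrated on toy data. What follows is a minimal sketch of FastICA in NumPy, which is a different and simpler algorithm than the Bayesian mean-field ICA the authors describe, and it does not infer the number of components. The function names (`whiten`, `fast_ica`), the synthetic signals, and the mixing matrix are all assumptions made for illustration, not metabolomics data.

```python
import numpy as np

def whiten(X):
    """Center X (samples x features) and decorrelate it to unit variance.
    This step uses only second-order statistics, as PCA does."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    return (Xc @ eigvecs) / np.sqrt(eigvals)

def fast_ica(X, n_components, n_iter=200, tol=1e-8, seed=0):
    """Deflation FastICA with a tanh nonlinearity (illustrative only)."""
    rng = np.random.default_rng(seed)
    Z = whiten(X)
    W = np.zeros((n_components, Z.shape[1]))
    for k in range(n_components):
        w = rng.normal(size=Z.shape[1])
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            g = np.tanh(Z @ w)
            # Fixed-point update: E[z g(w.z)] - E[g'(w.z)] w
            w_new = (Z * g[:, None]).mean(axis=0) - (1.0 - g**2).mean() * w
            # Remove directions already found, then renormalize
            w_new -= W[:k].T @ (W[:k] @ w_new)
            w_new /= np.linalg.norm(w_new)
            converged = abs(abs(w_new @ w) - 1.0) < tol
            w = w_new
            if converged:
                break
        W[k] = w
    return Z @ W.T  # estimated independent signals, samples x components

# Toy data: two non-Gaussian hidden signals, linearly mixed (all hypothetical)
rng = np.random.default_rng(42)
n = 5000
sources = np.column_stack([rng.uniform(-1, 1, n), rng.laplace(0, 1, n)])
mixing = np.array([[1.0, 0.5], [0.3, 1.0]])
observed = sources @ mixing.T

recovered = fast_ica(observed, n_components=2)

# |correlation| between each true source (rows) and each recovered signal (columns)
match = np.abs(np.corrcoef(sources.T, recovered.T)[:2, 2:])
print(np.round(match, 2))
```

Whitening alone leaves the rotation of the signals undetermined. The tanh-based update is what brings in higher-order statistics, which is the advantage over PCA that the abstract refers to.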
Network-Based Approach for Analyzing Intra- and Interfluid Metabolite Associations in Human Blood, Urine, and Saliva
Most studies investigating human metabolomics measurements are
limited to a single biofluid, most often blood or urine. An organism’s
biochemical pool, however, comprises complex transboundary relationships,
which can only be understood by investigating metabolic interactions
and physiological processes spanning multiple parts of the human body.
Therefore, we here propose a data-driven network-based approach to
generate an integrated picture of metabolomics associations over multiple
fluids. We performed an analysis of 2251 metabolites measured in plasma,
urine, and saliva, from 374 participants of the Qatar Metabolomics
Study on Diabetes (QMDiab). Gaussian graphical models (GGMs) were
used to estimate metabolite-metabolite interactions on different subsets
of the data set. First, we compared similarities and differences of
the metabolome and the association networks between the three fluids.
Second, we investigated the cross-talk between the fluids by analyzing
correlations occurring between them. Third, we propose a framework
for the analysis of medically relevant phenotypes by integrating type
2 diabetes, sex, age, and body mass index into our networks. In conclusion,
we present a generic, data-driven network-based approach for structuring
and visualizing metabolite correlations within and between multiple
body fluids, enabling unbiased interpretation of metabolomics multifluid
data.
Experimental confirmation of X-14208 as phenylalanylserine.
<p>Two possible dipeptide variants were predicted and consequently tested. The fragmentation spectrum of the 253.1 m/z ion (positive mode) of the pure Phe-Ser matches that of the unknown compound, whereas the spectrum for pure Ser-Phe differs visibly. Moreover, the retention index (RI) of Phe-Ser is similar to the RI of X-14208, whereas that of Ser-Phe is significantly different.</p>
Gaussian graphical modeling.
<p>GGMs embed unknown metabolites into their biochemical context. A: Complete network presentation of partial correlations that are significantly different from zero at α = 0.05 after Bonferroni correction. The unknown metabolites are spread over the entire network and are involved in various metabolic pathways. B–D: Selected high-scoring sub-networks. We observe that GGM edges directly correspond to chemical reactions which alter specific chemical groups (e.g. carbonyl groups and methyl groups). Solid lines denote positive partial correlation. Dashed lines indicate negative partial correlations. Line widths represent partial correlation strengths.</p>
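The edge criterion in this caption (partial correlations significantly different from zero at α = 0.05 after Bonferroni correction) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the p-value uses the Fisher z normal approximation rather than whatever test the paper applied, and the three-variable chain is synthetic.

```python
import math
import numpy as np

def partial_correlations(X):
    """Partial correlation matrix from the inverse covariance of X
    (samples x variables)."""
    precision = np.linalg.inv(np.cov(X, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def ggm_edges(X, alpha=0.05):
    """Return (i, j, partial correlation) for pairs passing a
    Bonferroni-corrected test. The p-value comes from the Fisher z
    normal approximation, an assumption of this sketch."""
    n, p = X.shape
    pcor = partial_correlations(X)
    n_tests = p * (p - 1) // 2
    edges = []
    for i in range(p):
        for j in range(i + 1, p):
            r = float(pcor[i, j])
            z = math.atanh(r) * math.sqrt(n - (p - 2) - 3)
            p_value = math.erfc(abs(z) / math.sqrt(2.0))
            if p_value < alpha / n_tests:
                edges.append((i, j, round(r, 3)))
    return edges

# Synthetic chain a -> b -> c: a and c correlate only through b
rng = np.random.default_rng(0)
n = 2000
a = rng.normal(size=n)
b = a + rng.normal(size=n)
c = b + rng.normal(size=n)
X = np.column_stack([a, b, c])

pcor = partial_correlations(X)
edges = ggm_edges(X)
print("marginal a-c correlation:", round(float(np.corrcoef(X, rowvar=False)[0, 2]), 2))
print("partial  a-c correlation:", round(float(pcor[0, 2]), 2))
print("edges:", edges)
```

In the chain, the marginal correlation between `a` and `c` is substantial, while their partial correlation is close to zero once `b` is conditioned on. This is why GGM edges tend to track direct relationships rather than indirect ones.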
Semi-automatic prediction of unknown metabolite identities.
<p>A: Examples of how to determine pathway classifications based on the functional annotations of GGM and GWAS hits. We present two metabolites, X-11421 and X-11244, whose GGM and GWAS associations clearly point to carnitine and steroid metabolism, respectively. B: Overview of unknowns functionally annotated by both GGMs and the GWAS approach. ‘GGM’ refers to an unknown metabolite which is three or fewer steps away from a known metabolite in the GGM, whereas ‘direct GGM’ represents direct neighbors in the network. C: Pathway predictions for the 16 unknowns with both direct GGM and GWAS annotations. Unknowns marked with a star were subjected to in-depth analysis and subsequent experimental validation.</p>
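The distinction between ‘GGM’ (three or fewer steps from a known metabolite) and ‘direct GGM’ (a direct neighbor) amounts to a shortest-path query on the network. A minimal breadth-first-search sketch is given below. The function names, the adjacency-dict representation, and the toy node names are hypothetical and do not come from the paper's data.

```python
from collections import deque

def steps_to_nearest_known(graph, start, known, max_steps=3):
    """Breadth-first search from `start`; return the number of edges to the
    closest node in `known`, or None if none lies within `max_steps`."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node in known and node != start:
            return dist
        if dist == max_steps:
            continue  # do not expand beyond the step limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

def annotation_class(graph, unknown, known):
    """Return the most specific label: 'direct GGM' for a direct neighbor,
    'GGM' for two or three steps, None otherwise."""
    steps = steps_to_nearest_known(graph, unknown, known)
    if steps is None:
        return None
    return "direct GGM" if steps == 1 else "GGM"

# Toy undirected network as an adjacency dict (placeholder names)
graph = {
    "unknown_A": ["known_1"],
    "known_1": ["unknown_A", "unknown_B"],
    "unknown_B": ["known_1", "unknown_C"],
    "unknown_C": ["unknown_B", "unknown_D"],
    "unknown_D": ["unknown_C", "unknown_E"],
    "unknown_E": ["unknown_D"],
}
known = {"known_1"}

for name in sorted(n for n in graph if n not in known):
    print(name, annotation_class(graph, name, known))
```

Note that in the caption ‘direct GGM’ is a subset of ‘GGM’; the sketch reports only the more specific label for each unknown.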
Data integration workflow for the systematic classification of unknown metabolites.
<p>We combine high-throughput metabolomics and genotyping data in Gaussian graphical models (GGMs) <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003005#pgen.1003005-Krumsiek1" target="_blank">[21]</a> and in genome-wide association studies (GWAS) <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003005#pgen.1003005-Suhre2" target="_blank">[5]</a> in order to produce testable predictions of the unknown metabolites' identities. These hypotheses are then subject to experimental verification by mass-spectrometry. Six such cases have been fully worked through and are presented in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003005#pgen-1003005-t003" target="_blank">Table 3</a>.</p>
Six specific scenarios and their experimental validations.
<p>We investigated six scenarios that included a total of nine unknown metabolites. The first three scenarios, DIPEPTIDE, STEROID, and HETE, are discussed in the main text of this paper; the remaining three scenarios, CARNITINE, BILIRUBIN, and ASCORBATE, are discussed in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003005#pgen.1003005.s006" target="_blank">Text S3</a>. Predictions marked by * are confirmed by exact mass, fragmentation pattern, and chromatographic retention time; however, validation using a pure standard compound as a reference is pending since these compounds are presently commercially unavailable in pure form.</p>
Manhattan plot of genetic association.
<p>The strength of association for known (bottom) and unknown (top) metabolites is indicated as the negative logarithm of the p-value for the linear model (see Methods). Only metabolite-SNP associations with p-values below 10<sup>−6</sup> are plotted (grey circles). Triangles represent metabolite-SNP associations with p-values below 10<sup>−40</sup>. Horizontal lines indicate the threshold for genome-wide significance (p = 1.6×10<sup>−10</sup>, corresponding to α = 0.05 after Bonferroni correction); red vertical dashes indicate loci at which this threshold is attained.</p>
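The thresholds in this caption can be checked with a few lines of arithmetic. The sketch below assumes a plain Bonferroni correction (threshold = α / number of tests) and a base-10 logarithm on the plot axis; the caption does not state the number of tests, so the value computed here is only what the quoted threshold implies, and `marker_for` is a hypothetical helper.

```python
import math

alpha = 0.05
threshold = 1.6e-10  # genome-wide significance threshold quoted in the caption

# Plain Bonferroni: threshold = alpha / n_tests, so n_tests = alpha / threshold
implied_tests = alpha / threshold

# Height of the horizontal significance line on a -log10(p) axis
line_height = -math.log10(threshold)

def marker_for(p_value):
    """Plot marker described in the caption, or None if the point is not drawn."""
    if p_value < 1e-40:
        return "triangle"
    if p_value < 1e-6:
        return "circle"
    return None

print(f"implied number of tests: {implied_tests:.4g}")
print(f"significance line at -log10(p) = {line_height:.2f}")
print(marker_for(1e-50), marker_for(1e-8), marker_for(1e-3))
```

Under these assumptions the threshold implies roughly 3.1 × 10^8 metabolite-SNP tests, and the significance line sits at about 9.8 on the vertical axis.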