15 research outputs found
Optimizing the Use of Quality Control Samples for Signal Drift Correction in Large-Scale Urine Metabolic Profiling Studies
The evident importance of metabolic profiling for biomarker discovery
and hypothesis generation has led to interest in incorporating this
technique into large-scale studies, e.g., clinical and molecular phenotyping
studies. Nevertheless, these lengthy studies mandate the use of analytical
methods with proven reproducibility. An integrated experimental plan
for LC–MS profiling of urine, involving sample sequence design
and postacquisition correction routines, has been developed. This
plan is based on the optimization of the frequency of analyzing identical
quality control (QC) specimen injections and using the QC intensities
of each metabolite feature to construct a correction trace for all
the samples. The QC-based methods were tested against other current
correction practices, such as total intensity normalization. The evaluation
was based on the reproducibility obtained from technical replicates
of 46 samples and showed the feature-based signal correction (FBSC)
methods to be superior to other methods, resulting in ∼1000
and 600 metabolite features with coefficient of variation (CV) <
15% within and between two blocks, respectively. Additionally, the
required frequency of QC sample injection was investigated and the
best signal correction results were achieved with at least one QC
injection every 2 h of urine sample injections (<i>n</i> = 10). Higher rates of QC injections (1 QC/h) resulted in slightly
better correction but at the expense of longer total analysis time.
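The QC-based correction described above can be sketched for a single metabolite feature as follows. This is a minimal illustration, not the authors' FBSC implementation: the function name `qc_drift_correct`, the piecewise-linear correction trace, and the synthetic drift data are all assumptions made here for the example.

```python
import numpy as np

def qc_drift_correct(intensity, order, is_qc):
    """Correct one feature using a trace interpolated through its QC injections.

    Assumption: the correction trace is a piecewise-linear interpolation of
    the QC intensities over run order; the abstract does not name a smoother.
    """
    intensity = np.asarray(intensity, dtype=float)
    order = np.asarray(order, dtype=float)
    is_qc = np.asarray(is_qc, dtype=bool)
    # np.interp needs increasing x values, so sort the QC points by run order.
    idx = np.argsort(order[is_qc])
    qc_order = order[is_qc][idx]
    qc_intensity = intensity[is_qc][idx]
    trace = np.interp(order, qc_order, qc_intensity)
    # Divide out the drift, then rescale to the median QC level.
    return intensity / trace * np.median(qc_intensity)

def cv(x):
    """Coefficient of variation (sample standard deviation over mean)."""
    x = np.asarray(x, dtype=float)
    return np.std(x, ddof=1) / np.mean(x)

# Synthetic example: identical material injected 21 times with a linear 2% per
# injection signal loss, and a QC injection at every fifth position.
order = np.arange(21)
is_qc = order % 5 == 0
intensity = 100.0 * (1.0 - 0.02 * order)
corrected = qc_drift_correct(intensity, order, is_qc)
```

Because the synthetic drift is exactly linear, the interpolated trace reproduces it and the corrected intensities collapse to a single value; real data would retain residual variation, which is what the CV threshold in the abstract measures.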
Optimized Phenotypic Biomarker Discovery and Confounder Elimination via Covariate-Adjusted Projection to Latent Structures from Metabolic Spectroscopy Data
Metabolism is altered by genetics,
diet, disease status, environment,
and many other factors. Modeling either one of these is often done
without considering the effects of the other covariates. Attributing
differences in metabolic profile to one of these factors needs to
be done while controlling for the metabolic influence of the rest.
We describe here a data analysis framework and novel confounder-adjustment
algorithm for multivariate analysis of metabolic profiling data. Using
simulated data, we show that similar numbers of true associations
and significantly fewer false positives are found compared to other
commonly used methods. Covariate-adjusted projections to latent structures
(CA-PLS) are exemplified here using a large-scale metabolic phenotyping
study of two Chinese populations at different risks for cardiovascular
disease. Using CA-PLS, we find that some previously reported differences
are actually associated with external factors and discover a number
of previously unreported biomarkers linked to different metabolic
pathways. CA-PLS can be applied to any multivariate data where confounding
may be an issue and the confounder-adjustment procedure is translatable
to other multivariate regression techniques.
Subset Optimization by Reference Matching (STORM): An Optimized Statistical Approach for Recovery of Metabolic Biomarker Structural Information from <sup>1</sup>H NMR Spectra of Biofluids
We describe a new multivariate statistical approach to recover
metabolite structure information from multiple <sup>1</sup>H NMR spectra
in population sample sets. Subset optimization by reference matching
(STORM) was developed to select subsets of <sup>1</sup>H NMR spectra
that contain specific spectroscopic signatures of biomarkers differentiating
between different human populations. STORM aims to improve the visualization
of structural correlations in spectroscopic data by using these reduced
spectral subsets containing smaller numbers of samples than the number
of variables (<i>n</i> ≪ <i>p</i>). We
have used statistical shrinkage to limit the number of false positive
associations and to simplify the overall interpretation of the autocorrelation
matrix. The STORM approach has been applied to findings from an ongoing
human metabolome-wide association study on body mass index to identify
a biomarker metabolite present in a subset of the population. Moreover,
we have shown how STORM improves the visualization of more abundant
NMR peaks compared to a previously published method (statistical total
correlation spectroscopy, STOCSY). STORM is a useful new tool for
biomarker discovery in the “omic” sciences that has
widespread applicability. It can be applied to any type of data, provided
that there is interpretable correlation among variables, and can also
be applied to data with more than one dimension (e.g., 2D NMR spectra).
A Combination of Transcriptomics and Metabolomics Uncovers Enhanced Bile Acid Biosynthesis in HepG2 Cells Expressing CCAAT/Enhancer-Binding Protein β (C/EBPβ), Hepatocyte Nuclear Factor 4α (HNF4α), and Constitutive Androstane Receptor (CAR)
The development of hepatoma-based in vitro models to study hepatocyte
physiology is an invaluable tool for both industry and academia. Here,
we develop an in vitro model based on the HepG2 cell line that produces
chenodeoxycholic acid, the main bile acid in humans, in amounts comparable
to human hepatocytes. A combination of adenoviral transfections for
CCAAT/enhancer-binding protein β (C/EBPβ), hepatocyte
nuclear factor 4α (HNF4α), and constitutive androstane
receptor (CAR) decreased intracellular glutamate, succinate, leucine,
and valine levels in HepG2 cells, suggestive of a switch to catabolism
to increase lipogenic acetyl CoA and increased anaplerosis to replenish
the tricarboxylic acid cycle. Transcripts of key genes involved in
bile acid synthesis were significantly induced by approximately 160-fold.
Consistently, chenodeoxycholic acid production rate was increased
by more than 20-fold. Comparison between mRNA and bile acid levels
suggests that 12-alpha hydroxylation of 7-alpha-hydroxy-4-cholesten-3-one
is the limiting step in cholic acid synthesis in HepG2 cells. These
data reveal that introduction of three hepatocyte-related transcription
factors enhances anabolic reactions in HepG2 cells and provides a suitable
model to study bile acid biosynthesis under pathophysiological conditions.
Power Analysis and Sample Size Determination in Metabolic Phenotyping
Estimation of statistical
power and sample size is a key aspect
of experimental design. However, in metabolic phenotyping, there is
currently no accepted approach for these tasks, in large part due
to the unknown nature of the expected effect. In such hypothesis free
science, neither the number or class of important analytes nor the
effect size are known <i>a priori</i>. We introduce a new
approach, based on multivariate simulation, which deals effectively
with the highly correlated structure and high-dimensionality of metabolic
phenotyping data. First, a large data set is simulated based on the
characteristics of a pilot study investigating a given biomedical
issue. An effect of a given size, corresponding either to a discrete
(classification) or continuous (regression) outcome is then added.
Different sample sizes are modeled by randomly selecting data sets
of various sizes from the simulated data. We investigate different
methods for effect detection, including univariate and multivariate
techniques. Our framework allows us to investigate the complex relationship
between sample size, power, and effect size for real multivariate
data sets. For instance, we demonstrate for an example pilot data
set that certain features achieve a power of 0.8 for a sample size
of 20 samples or that a cross-validated predictivity <i>Q</i><sub>Y</sub><sup>2</sup> of 0.8 is reached with an effect size of
0.2 and 200 samples. We exemplify the approach for both nuclear magnetic
resonance and liquid chromatography–mass spectrometry data
from humans and the model organism <i>C. elegans</i>.
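The simulation workflow in this abstract (simulate a large data set from pilot characteristics, add an effect, subsample at different sizes, test for detection) can be sketched as below. This is only an illustration under stated assumptions: the function name `simulate_power`, the multivariate normal model, the placement of the effect on the first feature, and the normal-approximation threshold of 1.96 on a Welch-type statistic are all choices made here, not details taken from the paper.

```python
import numpy as np

def simulate_power(pilot, effect_size, n_per_group,
                   pool_size=2000, n_rep=500, seed=0):
    """Estimate the power to detect a class effect on feature 0.

    Assumptions (not from the source): data are multivariate normal with the
    pilot mean and covariance, and detection uses |t| > 1.96.
    """
    rng = np.random.default_rng(seed)
    pilot = np.asarray(pilot, dtype=float)
    mean = pilot.mean(axis=0)
    cov = np.cov(pilot, rowvar=False)
    # Step 1: simulate a large data set that keeps the pilot correlation structure.
    control = rng.multivariate_normal(mean, cov, size=pool_size)
    case = rng.multivariate_normal(mean, cov, size=pool_size)
    # Step 2: add a discrete (classification) effect, in pilot SD units.
    case[:, 0] += effect_size * np.sqrt(cov[0, 0])
    hits = 0
    for _ in range(n_rep):
        # Step 3: model a study of the chosen size by random subsampling.
        a = control[rng.choice(pool_size, n_per_group, replace=False), 0]
        b = case[rng.choice(pool_size, n_per_group, replace=False), 0]
        se = np.sqrt(a.var(ddof=1) / n_per_group + b.var(ddof=1) / n_per_group)
        # Step 4: univariate detection with a Welch-type statistic.
        hits += int(abs((b.mean() - a.mean()) / se) > 1.96)
    return hits / n_rep

# Hypothetical pilot data: 30 samples by 5 features of random noise.
pilot = np.random.default_rng(1).normal(size=(30, 5))
power_small = simulate_power(pilot, effect_size=1.0, n_per_group=5)
power_large = simulate_power(pilot, effect_size=1.0, n_per_group=40)
```

Sweeping `n_per_group` and `effect_size` over a grid would trace out the relationship between sample size, power, and effect size; a multivariate detection method could be substituted in step 4.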
Untargeted Metabolome Quantitative Trait Locus Mapping Associates Variation in Urine Glycerate to Mutant Glycerate Kinase
With successes of genome-wide association studies, molecular phenotyping systems are developed to identify genetically determined disease-associated biomarkers. Genetic studies of the human metabolome are emerging but exclusively apply targeted approaches, which restricts the analysis to a limited number of well-known metabolites. We have developed novel technical and statistical methods for systematic and automated quantification of untargeted NMR spectral data designed to perform robust and accurate quantitative trait locus (QTL) mapping of known and previously unreported molecular compounds of the metabolome. For each spectral peak, six summary statistics were calculated and independently tested for evidence of genetic linkage in a cohort of F2 (129S6xBALB/c) mice. The most significant evidence of linkage was obtained with NMR signals characterizing glycerate (LOD10-42) at the mutant glycerate kinase locus, which demonstrates the power of metabolomics in quantitative genetics to identify the biological function of genetic variants. These results provide new insights into the resolution of the complex nature of metabolic regulations and novel analytical techniques that maximize the full utilization of metabolomic spectra in human genetics to discover mappable disease-associated biomarkers.
Histogram of estimated ICC.
<p>Estimated intraclass correlation coefficients (ICC) calculated using the formula, 1 − (Total CV of QC samples)<sup>2</sup> / (Total CV of subject samples)<sup>2</sup>.</p>
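The ICC formula in this caption can be written as a small helper. The function name `estimated_icc` and the example CV values are hypothetical and are not taken from the study.

```python
def estimated_icc(cv_qc, cv_subject):
    """ICC estimate: 1 - (total CV of QC samples)^2 / (total CV of subject samples)^2."""
    if cv_subject <= 0:
        raise ValueError("cv_subject must be positive")
    return 1.0 - (cv_qc / cv_subject) ** 2

# Hypothetical metabolite: 10% CV in QC samples, 50% CV in subject samples.
icc_example = estimated_icc(0.10, 0.50)
```

Under this formula a metabolite whose QC variability is small relative to its between-subject variability gives an ICC close to 1, and the estimate turns negative when the QC CV exceeds the subject CV.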
Histogram of CV for each metabolite in participant samples.
<p>(A) Coefficient of variation (CV) for each detected metabolite in participant plasma samples. (B) Inter- and intra-batch CV for each metabolite in participant samples. Inter- and intra-batch CV were computed using linear mixed models.</p>
Histogram of CV for 94 metabolites in QC samples.
<p>(A) Coefficients of variation (CV) for the 94 detected metabolites in quality control (QC) samples. (B) Inter- and intra-batch CV for each metabolite in QC samples. Inter- and intra-batch CV were computed using linear mixed models.</p>
Bland-Altman plots for uric acid.
<p>X-axis indicates the mean uric acid concentrations (μmol/L) of capillary electrophoresis-mass spectrometry (CE-MS) and clinical assay, and Y-axis indicates percentage of differences between these two methods.</p>