Optimizing the Use of Quality Control Samples for Signal Drift Correction in Large-Scale Urine Metabolic Profiling Studies

Author: Elizabeth J. Want (1354113)
Konstantina Spagou (1939336)
Muhammad Anas Kamleh (2101477)
Perrine Masson (2101480)
Timothy M. D. Ebbels (225779)

The evident importance of metabolic profiling for biomarker discovery and hypothesis generation has led to interest in incorporating this technique into large-scale studies, e.g., clinical and molecular phenotyping studies. Nevertheless, these lengthy studies mandate the use of analytical methods with proven reproducibility. An integrated experimental plan for LC–MS profiling of urine, involving sample sequence design and postacquisition correction routines, has been developed. This plan is based on optimizing the frequency of injections of an identical quality control (QC) specimen and using the QC intensities of each metabolite feature to construct a correction trace for all the samples. The QC-based methods were tested against other current correction practices, such as total intensity normalization. The evaluation was based on the reproducibility obtained from technical replicates of 46 samples and showed the feature-based signal correction (FBSC) methods to be superior to other methods, resulting in ∼1000 and 600 metabolite features with coefficient of variation (CV) < 15% within and between two blocks, respectively. Additionally, the required frequency of QC sample injection was investigated and the best signal correction results were achieved with at least one QC injection every 2 h of urine sample injections (n = 10). Higher rates of QC injections (1 QC/h) resulted in slightly better correction but at the expense of longer total analysis time.
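The idea of a per-feature correction trace built from QC injections can be illustrated with a minimal sketch. It is not the FBSC implementation from the paper: it assumes simple linear interpolation between neighbouring QC intensities (the published methods may use smoothing instead), and the function names `qc_correct` and `cv_percent`, the run layout, and the simulated drift are all hypothetical.

```python
from statistics import mean, median, stdev


def cv_percent(values):
    """Coefficient of variation in percent (sample SD / mean * 100)."""
    return 100.0 * stdev(values) / mean(values)


def qc_correct(intensities, is_qc):
    """Correct one feature's intensities, given in injection order.

    A correction trace is linearly interpolated between QC injections
    (an assumption of this sketch); each injection is divided by the
    trace and rescaled to the median QC intensity.
    """
    qc_idx = [i for i, flag in enumerate(is_qc) if flag]
    if len(qc_idx) < 2:
        raise ValueError("need at least two QC injections")
    target = median(intensities[i] for i in qc_idx)

    corrected = []
    for i, x in enumerate(intensities):
        if i <= qc_idx[0]:
            # Before the first QC: reuse the first QC intensity.
            trace = intensities[qc_idx[0]]
        elif i >= qc_idx[-1]:
            # After the last QC: reuse the last QC intensity.
            trace = intensities[qc_idx[-1]]
        else:
            left = max(j for j in qc_idx if j <= i)
            right = min(j for j in qc_idx if j >= i)
            if left == right:
                trace = intensities[left]
            else:
                w = (i - left) / (right - left)
                trace = (1 - w) * intensities[left] + w * intensities[right]
        corrected.append(x * target / trace)
    return corrected


# Hypothetical run: 11 injections of the same material, QC at every 5th
# position, with a simulated 2% signal loss per injection.
is_qc = [i % 5 == 0 for i in range(11)]
raw = [100.0 * (1 - 0.02 * i) for i in range(11)]
corrected = qc_correct(raw, is_qc)

print(f"CV before: {cv_percent(raw):.2f}%  after: {cv_percent(corrected):.2f}%")
```

Because the simulated drift is exactly linear, interpolation removes it completely here; real drift is rarely this regular, which is why the spacing of QC injections matters.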

FigShare

Optimized Phenotypic Biomarker Discovery and Confounder Elimination via Covariate-Adjusted Projection to Latent Structures from Metabolic Spectroscopy Data

Author: Elaine Holmes (40919)
Isabel Garcia-Perez (488742)
Jeremiah Stamler (2047609)
Jeremy K. Nicholson (40923)
John C. Lindon (206364)
Joram M. Posma (1696315)
Paul Elliott (27295)
Timothy M. D. Ebbels (225779)

Metabolism is altered by genetics, diet, disease status, environment, and many other factors. Modeling any one of these factors is often done without considering the effects of the other covariates. Attributing differences in metabolic profile to one of these factors needs to be done while controlling for the metabolic influence of the rest. We describe here a data analysis framework and novel confounder-adjustment algorithm for multivariate analysis of metabolic profiling data. Using simulated data, we show that similar numbers of true associations and significantly fewer false positives are found compared to other commonly used methods. Covariate-adjusted projections to latent structures (CA-PLS) are exemplified here using a large-scale metabolic phenotyping study of two Chinese populations at different risks for cardiovascular disease. Using CA-PLS, we find that some previously reported differences are actually associated with external factors and discover a number of previously unreported biomarkers linked to different metabolic pathways. CA-PLS can be applied to any multivariate data where confounding may be an issue, and the confounder-adjustment procedure is translatable to other multivariate regression techniques.
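Why confounder adjustment matters can be shown with a small univariate sketch. This is not CA-PLS: it swaps in plain ordinary-least-squares residualization on a single covariate, and the variable names (`age`, `metabolite`, `outcome`) and simulated values are hypothetical.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def residualize(values, covariate):
    """Remove the linear effect of one covariate via an OLS fit."""
    mc, mv = mean(covariate), mean(values)
    slope = (sum((c - mc) * (v - mv) for c, v in zip(covariate, values))
             / sum((c - mc) ** 2 for c in covariate))
    return [v - mv - slope * (c - mc) for c, v in zip(covariate, values)]


# Simulated data: age drives both the metabolite and the outcome,
# which have no direct relationship with each other.
rng = random.Random(1)
age = [rng.uniform(20, 70) for _ in range(500)]
metabolite = [0.5 * a + rng.gauss(0, 1) for a in age]
outcome = [0.3 * a + rng.gauss(0, 1) for a in age]

raw_r = pearson(metabolite, outcome)
adjusted_r = pearson(residualize(metabolite, age), residualize(outcome, age))

print(f"unadjusted r = {raw_r:.2f}, age-adjusted r = {adjusted_r:.2f}")
```

The unadjusted correlation is strong only because both variables track age; once age is regressed out of each, the association largely disappears, which is the kind of false positive a confounder-adjusted model is meant to avoid.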

FigShare

Subset Optimization by Reference Matching (STORM): An Optimized Statistical Approach for Recovery of Metabolic Biomarker Structural Information from ¹H NMR Spectra of Biofluids

Author: Elaine Holmes (40919)
Isabel Garcia-Perez (488742)
Jeremy K. Nicholson (40923)
John C. Lindon (206364)
Joram M. Posma (1761559)
Maria De Iorio (32163)
Paul Elliott (27295)
Timothy M. D. Ebbels (225779)

We describe a new multivariate statistical approach to recover metabolite structure information from multiple ¹H NMR spectra in population sample sets. Subset optimization by reference matching (STORM) was developed to select subsets of ¹H NMR spectra that contain specific spectroscopic signatures of biomarkers differentiating between different human populations. STORM aims to improve the visualization of structural correlations in spectroscopic data by using these reduced spectral subsets containing smaller numbers of samples than the number of variables (n ≪ p). We have used statistical shrinkage to limit the number of false positive associations and to simplify the overall interpretation of the autocorrelation matrix. The STORM approach has been applied to findings from an ongoing human metabolome-wide association study on body mass index to identify a biomarker metabolite present in a subset of the population. Moreover, we have shown how STORM improves the visualization of more abundant NMR peaks compared to a previously published method (statistical total correlation spectroscopy, STOCSY). STORM is a useful new tool for biomarker discovery in the “omic” sciences that has widespread applicability. It can be applied to any type of data, provided that there is interpretable correlation among variables, and can also be applied to data with more than one dimension (e.g., 2D NMR spectra).

FigShare

A Combination of Transcriptomics and Metabolomics Uncovers Enhanced Bile Acid Biosynthesis in HepG2 Cells Expressing CCAAT/Enhancer-Binding Protein β (C/EBPβ), Hepatocyte Nuclear Factor 4α (HNF4α), and Constitutive Androstane Receptor (CAR)

Author: Agustín Lahoz (1951570)
Aitor Carretero (1951567)
Hector C. Keun (225788)
James K. Ellis (1951573)
José V. Castell (1951561)
Marina Blazquez (1951564)
Rachel Cavill (225747)
Roque Bort (513085)
Timothy M. D. Ebbels (225779)
Toby J. Athersuch (225760)

The development of hepatoma-based in vitro models to study hepatocyte physiology is an invaluable tool for both industry and academia. Here, we develop an in vitro model based on the HepG2 cell line that produces chenodeoxycholic acid, the main bile acid in humans, in amounts comparable to human hepatocytes. A combination of adenoviral transfections for CCAAT/enhancer-binding protein β (C/EBPβ), hepatocyte nuclear factor 4α (HNF4α), and constitutive androstane receptor (CAR) decreased intracellular glutamate, succinate, leucine, and valine levels in HepG2 cells, suggestive of a switch to catabolism to increase lipogenic acetyl-CoA and increased anaplerosis to replenish the tricarboxylic acid cycle. Transcripts of key genes involved in bile acid synthesis were significantly induced, by approximately 160-fold. Consistently, the chenodeoxycholic acid production rate increased more than 20-fold. Comparison between mRNA and bile acid levels suggests that 12-alpha hydroxylation of 7-alpha-hydroxy-4-cholesten-3-one is the limiting step in cholic acid synthesis in HepG2 cells. These data reveal that introduction of three hepatocyte-related transcription factors enhances anabolic reactions in HepG2 cells and provides a suitable model to study bile acid biosynthesis under pathophysiological conditions.

FigShare

Power Analysis and Sample Size Determination in Metabolic Phenotyping

Author: Adrienne Tin (160761)
Anne-Claire Vergnaud (91115)
Benjamin J. Blaise (2111488)
Elaine Holmes (40919)
Gonçalo Correia (2702089)
J. Hunter Young (274937)
Jake T. M. Pearce (1740700)
Jeremy K. Nicholson (1522261)
Matthew Lewis (275018)
Paul Elliott (27295)
Timothy M. D. Ebbels (225779)

Estimation of statistical power and sample size is a key aspect of experimental design. However, in metabolic phenotyping, there is currently no accepted approach for these tasks, in large part due to the unknown nature of the expected effect. In such hypothesis-free science, neither the number or class of important analytes nor the effect size is known a priori. We introduce a new approach, based on multivariate simulation, which deals effectively with the highly correlated structure and high dimensionality of metabolic phenotyping data. First, a large data set is simulated based on the characteristics of a pilot study investigating a given biomedical issue. An effect of a given size, corresponding either to a discrete (classification) or continuous (regression) outcome, is then added. Different sample sizes are modeled by randomly selecting data sets of various sizes from the simulated data. We investigate different methods for effect detection, including univariate and multivariate techniques. Our framework allows us to investigate the complex relationship between sample size, power, and effect size for real multivariate data sets. For instance, we demonstrate for an example pilot data set that certain features achieve a power of 0.8 at a sample size of 20 or that a cross-validated predictivity Q²Y of 0.8 is reached with an effect size of 0.2 and 200 samples. We exemplify the approach for both nuclear magnetic resonance and liquid chromatography–mass spectrometry data from humans and the model organism C. elegans.
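The simulate, add an effect, subsample, and test loop described above can be sketched for a single feature. This is a deliberately simplified, univariate stand-in for the multivariate framework in the abstract: it assumes standard-normal data rather than data simulated from a pilot study, uses a normal approximation to a two-sample test, and the function name `estimate_power` and its parameters are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev


def estimate_power(n_per_group, effect_size, n_sim=1000, alpha=0.05, seed=0):
    """Monte Carlo estimate of power for a two-group comparison.

    Each simulation draws two groups of size n_per_group from unit-variance
    normal distributions whose means differ by effect_size (in SD units),
    then applies a two-sided test using a normal approximation.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    detections = 0
    for _ in range(n_sim):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        case = [rng.gauss(effect_size, 1.0) for _ in range(n_per_group)]
        std_err = ((stdev(control) ** 2 + stdev(case) ** 2) / n_per_group) ** 0.5
        z = (mean(case) - mean(control)) / std_err
        if abs(z) > z_crit:
            detections += 1
    return detections / n_sim


# Power at two hypothetical sample sizes for the same effect size.
p_small = estimate_power(10, 0.8)
p_large = estimate_power(50, 0.8)
print(f"n=10 per group: power ~ {p_small:.2f}; n=50 per group: power ~ {p_large:.2f}")
```

Repeating the call over a grid of sample sizes and effect sizes gives a rough power surface; the normal approximation is slightly optimistic for very small groups, so a t-test would be the more careful choice there.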

FigShare

Untargeted Metabolome Quantitative Trait Locus Mapping Associates Variation in Urine Glycerate to Mutant Glycerate Kinase

Author: Benjamin J. Blaise (2118058)
Dominique Gauguier (40924)
Elaine C. Holmes (2118055)
James Scott (40922)
Jean-Baptiste Cazier (2118067)
Jeremy K. Nicholson (40923)
John C. Lindon (206364)
Karène Argoud (2118052)
Kirill Veselkov (2007508)
Marc-Emmanuel Dumas (40913)
Marie-Thérèse Bihoreau (2118061)
Pamela J. Kaisaki (52706)
Steve C. Mitchell (2118064)
Timothy M. D. Ebbels (225779)
Tsz Tsang (2118070)
Yulan Wang (50197)

With the success of genome-wide association studies, molecular phenotyping systems are being developed to identify genetically determined disease-associated biomarkers. Genetic studies of the human metabolome are emerging but exclusively apply targeted approaches, which restricts the analysis to a limited number of well-known metabolites. We have developed novel technical and statistical methods for systematic and automated quantification of untargeted NMR spectral data, designed to perform robust and accurate quantitative trait locus (QTL) mapping of known and previously unreported molecular compounds of the metabolome. For each spectral peak, six summary statistics were calculated and independently tested for evidence of genetic linkage in a cohort of F2 (129S6xBALB/c) mice. The most significant evidence of linkage was obtained with NMR signals characterizing glycerate (LOD 10–42) at the mutant glycerate kinase locus, which demonstrates the power of metabolomics in quantitative genetics to identify the biological function of genetic variants. These results provide new insights into the resolution of the complex nature of metabolic regulation and novel analytical techniques that maximize the utilization of metabolomic spectra in human genetics to discover mappable disease-associated biomarkers.

FigShare

Histogram of estimated ICC.

Histogram of CV for each metabolite in participant samples.

Histogram of CV for 94 metabolites in QC samples.

Bland-Altman plots for uric acid.