7 research outputs found

    A semiparametric modeling framework for potential biomarker discovery and the development of metabonomic profiles

    Background: The discovery of biomarkers is an important step towards the development of criteria for early diagnosis of disease status. Recently, electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI) time-of-flight (TOF) mass spectrometry have been used to identify biomarkers in both proteomics and metabonomics studies. Data sets generated from such studies are generally very large and thus require sophisticated statistical techniques to glean useful information. Most recent attempts to process these types of data model each compound's intensity either discretely, by positional (mass-to-charge ratio) clustering, or through each compound's own intensity distribution. Traditionally, data processing steps such as noise removal, background elimination and m/z alignment are carried out separately, resulting in unsatisfactory propagation of signals in the final model. Results: In the present study a novel semi-parametric approach has been developed to distinguish urinary metabolic profiles in a group of traumatic patients from those of a control group consisting of normal individuals. Data sets obtained from the replicates of a single subject were used to develop a functional profile through a Dirichlet mixture of beta distributions. This functional profile is flexible enough to accommodate variability of the instrument and the inherent variability of each individual, thus simultaneously addressing different sources of systematic error. To address instrument variability, all data sets were analyzed in replicate, an important issue ignored by most studies in the past. Different model comparisons were performed to select the best model for each subject. The m/z values in the window of the irregular pattern are then further recommended for possible biomarker discovery. Conclusion: To the best of our knowledge this is the very first attempt to model the physical process behind time-of-flight mass spectrometry. Most state-of-the-art techniques do not take these physical principles into consideration while modeling such data. The proposed modeling process will apply as long as the basic physical principle presented in this paper is valid. Notably, we have confined our present work mostly to the modeling aspect. Nevertheless, clinical validation of our recommended list of potential biomarkers will be required. Hence, we have termed our modeling approach a "framework" for further work.

    Metabolomics of ApcMin/+ mice genetically susceptible to intestinal cancer

    BACKGROUND: To determine how diets high in saturated fat could increase polyp formation in the mouse model of intestinal neoplasia, Apc(Min/+), we conducted large-scale metabolome analysis and association study of colon and small intestine polyp formation from plasma and liver samples of Apc(Min/+) vs. wild-type littermates, kept on low vs. high-fat diet. Label-free mass spectrometry was used to quantify untargeted plasma and acyl-CoA liver compounds, respectively. Differences in contrasts of interest were analyzed statistically by unsupervised and supervised modeling approaches, namely Principal Component Analysis and Linear Model of analysis of variance. Correlation between plasma metabolite concentrations and polyp numbers was analyzed with a zero-inflated Generalized Linear Model. RESULTS: Plasma metabolome in parallel to promotion of tumor development comprises a clearly distinct profile in Apc(Min/+) mice vs. wild-type littermates, which is further altered by high-fat diet. Further, functional metabolomics pathway and network analyses in Apc(Min/+) mice on high-fat diet revealed associations between polyp formation and plasma metabolic compounds including those involved in amino acid metabolism as well as nicotinamide and hippuric acid metabolic pathways. Finally, we also show changes in liver acyl-CoA profiles, which may result from a combination of Apc(Min/+)-mediated tumor progression and high-fat diet. The biological significance of these findings is discussed in the context of intestinal cancer progression. CONCLUSIONS: These studies show that high-throughput metabolomics combined with appropriate statistical modeling and large-scale functional approaches can be used to monitor and infer changes and interactions in the metabolome and genome of the host under controlled experimental conditions. Further, these studies demonstrate the impact of diet on metabolic pathways and its relation to intestinal cancer progression. 
Based on our results, metabolic signatures and metabolic pathways of polyposis and intestinal carcinoma have been identified, which may serve as useful targets for the development of therapeutic interventions.

    Metabolomics of ApcMin/+ Mice Genetically Susceptible to Intestinal Cancer

    Background: To determine how diets high in saturated fat could increase polyp formation in the mouse model of intestinal neoplasia, ApcMin/+, we conducted large-scale metabolome analysis and association study of colon and small intestine polyp formation from plasma and liver samples of ApcMin/+ vs. wild-type littermates, kept on low vs. high-fat diet. Label-free mass spectrometry was used to quantify untargeted plasma and acyl-CoA liver compounds, respectively. Differences in contrasts of interest were analyzed statistically by unsupervised and supervised modeling approaches, namely Principal Component Analysis and Linear Model of analysis of variance. Correlation between plasma metabolite concentrations and polyp numbers was analyzed with a zero-inflated Generalized Linear Model. Results: Plasma metabolome in parallel to promotion of tumor development comprises a clearly distinct profile in ApcMin/+ mice vs. wild-type littermates, which is further altered by high-fat diet. Further, functional metabolomics pathway and network analyses in ApcMin/+ mice on high-fat diet revealed associations between polyp formation and plasma metabolic compounds including those involved in amino acid metabolism as well as nicotinamide and hippuric acid metabolic pathways. Finally, we also show changes in liver acyl-CoA profiles, which may result from a combination of ApcMin/+-mediated tumor progression and high-fat diet. The biological significance of these findings is discussed in the context of intestinal cancer progression. Conclusions: These studies show that high-throughput metabolomics combined with appropriate statistical modeling and large-scale functional approaches can be used to monitor and infer changes and interactions in the metabolome and genome of the host under controlled experimental conditions. Further, these studies demonstrate the impact of diet on metabolic pathways and its relation to intestinal cancer progression. 
Based on our results, metabolic signatures and metabolic pathways of polyposis and intestinal carcinoma have been identified, which may serve as useful targets for the development of therapeutic interventions. © 2014 Dazard et al.; licensee BioMed Central Ltd

    From Metabolite Concentration to Flux – A Systematic Assessment of Error in Cell Culture Metabolomics

    The growing availability of genomic, transcriptomic, and metabolomic data has opened the door to the synthesis of multiple levels of information in biological research. As a consequence, there has been a push to analyze biological systems in a comprehensive manner through the integration of their interactions into mathematical models, with the process frequently referred to as “systems biology”. Despite the potential for this approach to greatly improve our knowledge of biological systems, the definition of mathematical relationships between different levels of information opens the door to diverse sources of error, requiring precise, unbiased quantification as well as robust validation methods. Failure to account for differences in uncertainty across multiple levels of data analysis may cause errors to drown out any useful outcomes of the synthesis. The application of a systems biology approach has been particularly important in metabolic modeling. There has been a concentrated effort to build models directly from genomic data and to incorporate as much of the metabolome as possible in the analysis. Metabolomic data collection has been expanded through the recent use of hydrogen Nuclear Magnetic Resonance (1H-NMR) spectroscopy for cell culture monitoring. However, the combination of uncertainty from model construction and measurement error from NMR (or other means of metabolomic) analysis complicates data interpretation. This thesis establishes the precision and accuracy of NMR spectroscopy in the context of cell cultivation while developing a methodology for assessing model error in Metabolic Flux Analysis (MFA). The analysis of cell culture media via NMR has been made possible by the development of specialized software for the “deconvolution” of complex spectra, however, the process is semi-qualitative. 
A human “profiler” is required to manually fit idealized peaks from a compound library to an observed spectrum, where the quality of fit is often subject to considerable interpretation. Work presented in this thesis establishes baseline accuracy as approximately 2%-10% of the theoretical mean, with a relative standard deviation of 1.5% to 3%. Higher variabilities were associated primarily with profiling error, while lower variabilities were due in part to tube insertion (and the steps leading up to spectra acquisition). Although a human profiler contributed to overall uncertainty, the net impact did not make the deconvolution process prohibitively imprecise. Analysis was then expanded to consider solutions that are more representative of cell culture supernatant. The combination of metabolites at different concentration levels was efficiently represented by a Plackett-Burman experiment. The orthogonality of this design ensured that every level of metabolite concentration was combined with an equal number of high and low concentrations of all other variable metabolites, providing a worst-case scenario for variance estimation. Analysis of media-like mixtures revealed a median error and standard deviation of approximately 10%, although estimating low metabolite concentrations resulted in a considerable loss of accuracy and precision in the presence of resonance overlap. Furthermore, an iterative regression process identified a number of cases where an increase in the concentration of one metabolite resulted in increased quantification error of another. More importantly, the analysis established a general methodology for estimating the quantification variability of media-specific metabolite concentrations. Subsequent application of NMR analysis to time-course data from cell cultivation revealed correlated deviations from calculated trends. 
Similar deviations were observed for multiple (chemically) unrelated metabolites, amounting to approximately 1%-10% of the metabolite’s concentration. The nature of these deviations suggested the cause to be inaccuracies in internal standard addition or quantification, resulting in a skew of all quantified metabolite concentrations within a sample by the same relative amount. Error magnitude was estimated by calculating the median relative deviation from a smoothing fit for all compounds at a given timepoint. A metabolite time-course simulation was developed to determine the frequency and magnitude of such deviations arising from typical measurement error (without added bias from incorrect internal standard addition). Multiple smoothing functions were tested on simulated time-courses and cubic spline regression was found to minimize the median relative deviation from measurement noise to approximately 2.5%. Based on these results, an iterative smoothing correction method was implemented to identify and correct median deviations greater than 2.5%, with both simulation and correction code released as the “metcourse” package for the R programming language. Finally, a t-test validation method was developed to assess the impact of measurement and model error on MFA, with a Chinese hamster ovary (CHO) cell model chosen as a case study. The standard MFA formulation was recast as a generalized least squares (GLS) problem, with calculated fluxes subject to a t-significance test. NMR data was collected for a CHO cell bioreactor run, with another set of data simulated directly from the model and perturbed by observed measurement error. The frequency of rejected fluxes in the simulated data (free of model error) was attributed to measurement uncertainty alone. The rejection of fluxes calculated from observed data as non-significant that were not rejected in the simulated data was attributed to a lack of model fit, i.e. model error. 
Applying this method to the observed data revealed a considerable level of error that was not identified by traditional χ² validation. Further simulation was carried out to assess the impact of measurement error and model structure, both of which were found to have a dramatic impact on statistical significance and calculation error that has yet to be addressed in the context of MFA.
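
The median-deviation idea described in this abstract can be illustrated with a rough sketch. It is not the “metcourse” package or its API: the function names, the synthetic data, and the use of a cubic polynomial fit in place of cubic spline regression are all assumptions made for illustration, and only a single correction pass is shown instead of the iterative procedure. The 2.5% threshold is the figure quoted in the abstract.

```python
import numpy as np


def median_relative_deviation(times, conc, degree=3):
    """Per-timepoint median relative deviation from a smooth trend.

    conc has shape (n_timepoints, n_metabolites). A polynomial fit stands in
    for the cubic spline regression named in the abstract (an assumption).
    """
    times = np.asarray(times, dtype=float)
    conc = np.asarray(conc, dtype=float)
    fitted = np.empty_like(conc)
    for j in range(conc.shape[1]):
        coeffs = np.polyfit(times, conc[:, j], degree)
        fitted[:, j] = np.polyval(coeffs, times)
    # A shared skew (e.g. internal standard error) shifts every metabolite
    # in a sample by the same relative amount, so the median picks it up.
    return np.median((conc - fitted) / fitted, axis=1)


def correct_sample_bias(times, conc, threshold=0.025, degree=3):
    """Single-pass correction: rescale samples whose median deviation
    exceeds the threshold (hypothetical helper, not the thesis code)."""
    conc = np.asarray(conc, dtype=float)
    dev = median_relative_deviation(times, conc, degree)
    scale = np.where(np.abs(dev) > threshold, 1.0 + dev, 1.0)
    return conc / scale[:, None], dev


# Synthetic time-course: three made-up metabolites, one sample skewed by 8%.
times = np.arange(10.0)
true_conc = np.column_stack(
    [10 - 0.5 * times, 2 + 0.3 * times, 5 + 0.1 * times**2]
)
observed = true_conc.copy()
observed[5] *= 1.08  # simulated internal standard error at timepoint 5
corrected, dev = correct_sample_bias(times, observed)
```

Because the skewed sample also pulls the fitted trend towards itself, one pass removes only part of the bias, which is presumably one reason the abstract describes the correction as iterative.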

    Bayesian Methods for Metabolomics

    Metabolomics, the large-scale study of small molecules, enables the underlying biochemical activity and state of cells or tissues to be directly captured. Nuclear Magnetic Resonance (NMR) Spectroscopy is one of the major data capturing techniques for metabolomics, as it provides highly reproducible, quantitative information on a wide variety of metabolites. This work presents possible solutions for three problems involved to aid the development of better algorithms for NMR data analysis. After reviewing relevant concepts and literature, we first utilise observed NMR chemical shift titration data for a range of urinary metabolites and develop a theoretical model of chemical shift using a Bayesian statistical framework and model selection procedures to estimate the number of protonation sites, a key parameter to model the relationship between chemical shift variation and pH and usually unknown in uncatalogued metabolites. Secondly, with the aim of obtaining explicit concentration estimates for metabolites from NMR spectra, we discuss a Monte Carlo Co-ordinate Ascent Variational Inference (MC-CAVI) algorithm that combines Markov chain Monte Carlo (MCMC) methods with Co-ordinate Ascent VI (CAVI), demonstrate MC-CAVI’s suitability for models with hard constraints and compare MC-CAVI’s performance with that of MCMC in an important complex model used in NMR spectroscopy data analysis. The third contribution seeks to improve metabolite identification, one of the biggest bottlenecks in metabolomics and severely hindered by resonance overlapping in one-dimensional NMR spectroscopy. In particular, we present a novel Bayesian method for widely used two-dimensional (2D) 1H J-resolved (JRES) NMR spectroscopy, which has considerable potential to accurately identify and quantify metabolites within complex biological samples, through combining B-spline tight wavelet frames with theoretical templates. 
We then demonstrate the effectiveness of our approach via analyses of JRES datasets from serum and urine.