513 research outputs found

    Bayesian Methods for Metabolomics

    Metabolomics, the large-scale study of small molecules, enables the underlying biochemical activity and state of cells or tissues to be directly captured. Nuclear Magnetic Resonance (NMR) spectroscopy is one of the major data capturing techniques for metabolomics, as it provides highly reproducible, quantitative information on a wide variety of metabolites. This work presents possible solutions to three problems, with the aim of aiding the development of better algorithms for NMR data analysis. After reviewing relevant concepts and literature, we first utilise observed NMR chemical shift titration data for a range of urinary metabolites and develop a theoretical model of chemical shift using a Bayesian statistical framework and model selection procedures to estimate the number of protonation sites, a key parameter for modelling the relationship between chemical shift variation and pH that is usually unknown in uncatalogued metabolites. Secondly, with the aim of obtaining explicit concentration estimates for metabolites from NMR spectra, we discuss a Monte Carlo Co-ordinate Ascent Variational Inference (MC-CAVI) algorithm that combines Markov chain Monte Carlo (MCMC) methods with Co-ordinate Ascent VI (CAVI), demonstrate MC-CAVI's suitability for models with hard constraints and compare MC-CAVI's performance with that of MCMC in an important complex model used in NMR spectroscopy data analysis. The third contribution seeks to improve metabolite identification, one of the biggest bottlenecks in metabolomics and one severely hindered by resonance overlap in one-dimensional NMR spectroscopy. In particular, we present a novel Bayesian method for widely used two-dimensional (2D) 1H J-resolved (JRES) NMR spectroscopy, which has considerable potential to accurately identify and quantify metabolites within complex biological samples, through combining B-spline tight wavelet frames with theoretical templates. We then demonstrate the effectiveness of our approach via analyses of JRES datasets from serum and urine.
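    As an illustration of the relationship between chemical shift and pH referred to in this abstract, the sketch below evaluates the textbook fast-exchange expression for a single protonation site, in which the observed shift is the population-weighted average of the protonated and deprotonated shifts. It is not the Bayesian model developed in the thesis; the function name, the parameter names and the numerical values are hypothetical and chosen only for demonstration.

```python
def chemical_shift(ph, pka, delta_acid, delta_base):
    """Observed chemical shift (ppm) for ONE protonation site.

    Assumes fast exchange between the protonated form (shift delta_acid)
    and the deprotonated form (shift delta_base); illustrative only.
    """
    # Henderson-Hasselbalch: [HA]/[A-] = 10^(pKa - pH)
    ratio = 10.0 ** (pka - ph)
    # Population-weighted average of the two limiting shifts
    return (delta_base + delta_acid * ratio) / (1.0 + ratio)


if __name__ == "__main__":
    # Hypothetical site: pKa 7.0, limiting shifts 8.0 ppm and 7.0 ppm
    for ph in (3.0, 6.0, 7.0, 8.0, 11.0):
        print(ph, round(chemical_shift(ph, 7.0, 8.0, 7.0), 4))
```

    The curve is a sigmoid in pH centred on the pKa. A metabolite with several protonation sites would need one such transition per site, which is why the number of sites has to be known or estimated before the curve can be fitted.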

    Sample Preparation in Metabolomics

    Metabolomics is increasingly being used to explore the dynamic responses of living systems in biochemical research. The complexity of the metabolome is considerable, requiring the use of complementary analytical platforms and methods for its quantitative or qualitative profiling. In alignment with the selected analytical approach and the study aim, sample collection and preparation are critical steps that must be carefully selected and optimized to generate high-quality metabolomic data. This book showcases some of the most recent developments in the field of sample preparation for metabolomics studies. Novel technologies presented include electromembrane extraction of polar metabolites from plasma samples and guidelines for the preparation of biospecimens for analysis with high-resolution μ magic-angle spinning nuclear magnetic resonance (HR-μMAS NMR). In the following chapters, the spotlight is on sample preparation approaches that have been optimized for diverse bioanalytical applications, including the analysis of cell lines, bacteria, single spheroids, extracellular vesicles, human milk, plant natural products and forest trees.

    Development and application of a platform for harmonisation and integration of metabolomics data

    Integrating diverse metabolomics data for molecular epidemiology analyses provides both opportunities and challenges in the field of human health research. Combining patient cohorts may improve power and sensitivity of analyses but is challenging due to significant technical and analytical variability. Additionally, current systems for the storage and analysis of metabolomics data suffer from scalability, query-ability, and integration issues that limit their adoption for molecular epidemiological research. Here, a novel platform for integrative metabolomics is developed, which addresses issues of storage, harmonisation, querying, scaling, and analysis of large-scale metabolomics data. Its use is demonstrated through an investigation of molecular trends of ageing in an integrated four-cohort dataset where the advantages and disadvantages of combining balanced and unbalanced cohorts are explored, and robust metabolite trends are successfully identified and shown to be concordant with previous studies.

    Incorporating standardised drift-tube ion mobility to enhance non-targeted assessment of the wine metabolome (LC×IM-MS)

    Liquid chromatography with drift-tube ion mobility spectrometry-mass spectrometry (LC×IM-MS) is emerging as a powerful addition to existing LC-MS workflows for addressing a diverse range of metabolomics-related questions [1,2]. Importantly, the excellent precision of drift-tube IM separations under repeatability and reproducibility conditions [3] supports the development of non-targeted approaches for complex metabolome assessment such as wine characterisation [4]. In this work, the fundamentals of this new analytical metabolomics approach are introduced and its application to the analysis of 90 authentic red and white wine samples originating from Macedonia is presented. Following measurements, intersample alignment of metabolites using non-targeted extraction and three-dimensional alignment of molecular features (retention time, collision cross section, and high-resolution mass spectra) provides confidence for metabolite identity confirmation. Applying a fingerprinting metabolomics workflow allows statistical assessment of the influence of geographic region, variety, and age. This approach is a state-of-the-art tool to assess wine chemodiversity and is particularly beneficial for the discovery of wine biomarkers and for establishing product authenticity based on the development of fingerprint libraries.
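    The three-dimensional alignment of molecular features described in this abstract can be pictured as a tolerance check on each dimension. The sketch below is a minimal illustration under stated assumptions, not the workflow used in the study: the `Feature` class, the `matches` function and all three tolerance values are hypothetical, and the mass dimension is reduced to a single m/z value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    rt_min: float   # retention time, minutes
    ccs_a2: float   # collision cross section, square angstroms
    mz: float       # mass-to-charge ratio


def matches(a, b, rt_tol_min=0.1, ccs_tol_pct=1.0, mz_tol_ppm=10.0):
    """Return True when two features agree in all three dimensions.

    The default tolerances are placeholders, not values from the study.
    """
    rt_ok = abs(a.rt_min - b.rt_min) <= rt_tol_min
    ccs_ok = abs(a.ccs_a2 - b.ccs_a2) / a.ccs_a2 * 100.0 <= ccs_tol_pct
    mz_ok = abs(a.mz - b.mz) / a.mz * 1e6 <= mz_tol_ppm
    return rt_ok and ccs_ok and mz_ok


if __name__ == "__main__":
    reference = Feature(rt_min=5.00, ccs_a2=150.0, mz=301.1000)
    candidate = Feature(rt_min=5.05, ccs_a2=150.5, mz=301.1010)
    print(matches(reference, candidate))
```

    Requiring agreement in all three dimensions at once is what separates this from retention-time and mass matching alone: two features with the same mass and similar elution time are still treated as different if their collision cross sections disagree.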

    Development of an integrated computational platform for metabolomics data analysis and knowledge extraction

    Master's dissertation in Computing Engineering. In the last few years, biological and biomedical research has been generating a large amount of quantitative data, given the surge of high-throughput techniques that are able to quantify different types of molecules in the cell. While transcriptomics and proteomics, which measure gene expression and amounts of proteins respectively, are the most mature, metabolomics, the quantification of small compounds, has emerged in recent years as an advantageous alternative in many applications. As with other omics data, metabolomics brings important challenges regarding the extraction of relevant knowledge from typically large amounts of data. To respond to these challenges, an integrated computational platform for metabolomics data analysis and knowledge extraction was created to facilitate the use of several methods of visualization, data analysis and data mining. In the first stage of the project, a state-of-the-art analysis was conducted to assess the existing methods and computational tools in the field and to identify what was missing or difficult to use for a common user without computational expertise. This step helped to determine which strategies to adopt and which main functionalities were important to develop in the software. R was chosen as the supporting framework, given the ease of creating and documenting data analysis scripts and the possibility of developing new packages that add new functions, while taking advantage of the numerous resources created by the vibrant R community. The next step was therefore to develop an R package with an integrated set of functions for running a metabolomics data analysis pipeline with reduced effort, making it possible to explore the data, apply different data analysis methods and visualize their results, and in this way supporting the extraction of relevant knowledge from metabolomics data.
Regarding data analysis, the package includes functions for data loading from different formats and pre-processing, as well as different methods for univariate and multivariate data analysis, including t-tests, analysis of variance, correlations, principal component analysis and clustering. It also includes a large set of methods for machine learning, with distinct models for classification and regression, as well as feature selection methods. The package supports the analysis of metabolomics data from infrared, ultraviolet-visible and nuclear magnetic resonance spectroscopies. The package has been validated on real examples, considering three case studies, including the analysis of data from natural products such as bee propolis and cassava, as well as metabolomics data from cancer patients. Each of these datasets was analyzed using the developed package with different analysis pipelines, and HTML reports that include both the analysis scripts and their results were generated using the documentation features provided by the package.

    Deriving statistical inference from the application of artificial neural networks to clinical metabolomics data

    Metabolomics data are complex with a high degree of multicollinearity. As such, multivariate linear projection methods, such as partial least squares discriminant analysis (PLS-DA), have become standard. Non-linear projection methods, typified by Artificial Neural Networks (ANNs), may be more appropriate to model potential nonlinear latent covariance; however, they are not widely used due to the difficulty of deriving statistical inference, and thus biological interpretation. Herein, we illustrate the utility of ANNs for clinical metabolomics using publicly available data sets and develop an open framework for deriving and visualising statistical inference from ANNs equivalent to standard PLS-DA methods.

    Application of nuclear magnetic resonance spectroscopy in the study of complex matrices

    The aim of this PhD work was to apply the NMR-based metabolomic approach to the study of complex matrices such as several food plants (pepper, celery, tomatoes, hemp, baobab, teas, blueberries and olive oils). A comprehensive description of the chemical composition in terms of primary and secondary metabolites, obtained by means of 1D and 2D experiments, was reported, and information regarding specific aspects (variety, type of production, etc.) was obtained. A study of stool samples from patients with liver cirrhosis was also carried out, confirming the important contribution of the NMR approach to the investigation of the disease.

    Automated Analysis of Quantitative NMR Spectra

    NMR spectroscopy is an invaluable tool for structure elucidation in chemistry and molecular biology, which is able to provide unique information not easily obtained by other analytical methods. However, performing quantitative NMR experiments and mixture analysis is considerably less common due to constraints in sensitivity/resolution and the fact that NMR observes individual nuclei, not molecules. The advances in instrument design in the last 25 years have substantially increased the sensitivity of NMR spectrometers, diminishing the main weakness of NMR, while increases in field strength and ever more intricate experiments have improved the resolving power and expanded the attainable information. The minimal need for sample preparation and its non-specific nature make quantitative NMR suitable for many applications ranging from quality control to metabolome characterization. Furthermore, the development of automated sample changers and fully automated acquisition has made high-throughput NMR acquisition a more feasible and attractive, yet expensive, possibility. This work discusses the fundamental principles and limitations of quantitative liquid-state NMR spectroscopy, and tries to put together a summary of its various aspects scattered across the literature. Many of these more subtle features can be neglected in simple routine spectroscopy, but become important when extracting quantitative data and/or when trying to acquire and process vast amounts of spectra consistently. The original research presented in this thesis provides improved methods for data acquisition of quantitative 13C-detected NMR spectra in the form of modified INEPT-based experiments (Q-INEPT-CT and Q-INEPT-2D), while software tools for automated processing and analysis of NMR spectra are also presented (ImatraNMR and SimpeleNMR). The application of these tools is demonstrated in the analysis of complex hydrocarbon mixtures (base oils), plant extracts and blood plasma samples.
The increased capability of NMR spectroscopy, the rising interest in metabolomics and, for example, the recent introduction of benchtop NMR spectrometers are likely to expand the future use of quantitative NMR in the analysis of complex mixtures. For this reason, the further development of robust, accurate and feasible analysis methods and tools is essential.
NMR spectroscopy is a key analytical method used in, among other fields, chemistry and molecular biology; it is based on detecting atomic nuclei in a strong magnetic field using radio waves. The method is particularly well suited to elucidating molecular structures, and it can also provide information on the three-dimensional structure of molecules and the interactions between them. NMR spectroscopy is also a non-selective method with which different types of samples can easily be studied without complicated pretreatment. Traditionally, the weaknesses of NMR spectroscopy have been the high cost of spectrometers and poor sensitivity, which has limited its use in the analysis of dilute samples and especially mixtures. However, improvements in instrumentation and analysis techniques over the last 20-30 years have improved the situation considerably, and the use of NMR spectroscopy for the quantitative analysis of mixtures is clearly growing. The analysis of metabolites in various biological samples, in particular, has become an important application. This development has also been accelerated by advances in the automation of sample handling and spectral processing, which make it easier to study large numbers of samples. However, most software intended for processing NMR spectra is not yet designed primarily for analysing large sample series or mixtures. This work focuses on quantitative NMR spectroscopy and its applications.
In this work, quantitative NMR methods (pulse sequences) were developed, together with software tools suited to spectral analysis (ImatraNMR and SimpeleNMR), whose aim is in particular to facilitate the automated analysis of large sample series. The tools developed were used to analyse hydrocarbon mixtures (base oils) and plant extracts, but they can also be applied to many other sample series or, for example, to the analysis of reaction mixtures.

    Metabolic engineering of microorganisms for the overproduction of fatty acids

    Fatty acids naturally synthesized in many organisms are promising starting points for the catalytic production of industrial chemicals and diesel-like biofuels. However, bio-production of fatty acids in microbial hosts relies heavily on manipulating tightly regulated fatty acid biosynthetic pathways, thus complicating the engineering for higher yields. With the advent of systems metabolic engineering, we demonstrated an iterative metabolic engineering effort that integrates computationally driven predictions and metabolic flux analysis (MFA) to meet this challenge. With wild-type E. coli fluxomic data, the OptForce procedure was employed to suggest genetic manipulations for fatty acid overproduction. In accordance with the OptForce prioritization of interventions, fabZ and acyl-ACP thioesterase were upregulated and fadD was deleted to arrive at a strain that produces 1.70 g/L of C14-16 fatty acids, at a yield of 0.14 g fatty acid/g glucose, in minimal medium. However, OptForce does not infer gene regulation, enzyme inhibition or metabolic toxicity. Along with transcriptomics and metabolomics analysis, we re-deployed the OptForce simulation using the redefined flux distribution as constraints to generate predictions for the second-generation fatty acid-overproducing strain. MFA identified up-regulation of the TCA cycle and down-regulation of the pentose phosphate pathway under fatty acid overproduction to meet the need for energy and reducing molecules. The elevation of intracellular metabolite levels in the TCA cycle complemented the flux findings. With the re-defined flux boundary of the first-generation strain, OptForce suggested interruption of the TCA cycle, such as removal of succinate dehydrogenase, as the highest-priority genetic intervention to further improve fatty acid production.
Meanwhile, whole-genome transcriptional analysis revealed acid stress response, membrane disruption, colanic acid and biofilm formation during fatty acid production, thus pinpointing targets for future metabolic engineering efforts. These results highlight the benefit of using computational strain design and systems metabolic engineering tools to systematically guide strain design for the production of free fatty acids. Saccharomyces cerevisiae is another attractive host organism for the production of biochemicals and biofuels. However, S. cerevisiae is very susceptible to octanoic acid toxicity. Transcriptomics analysis revealed membrane stress and intracellular acidification during octanoic acid stress. MFA showed an increase of flux in the TCA cycle, possibly to facilitate ATP-binding-cassette transporter activities. Further efforts can focus on improving membrane integrity or exploring oleaginous yeasts to enhance tolerance to fatty acids.