959 research outputs found

    Updates in metabolomics tools and resources: 2014-2015

    Data processing and interpretation represent the most challenging and time-consuming steps in high-throughput metabolomic experiments, regardless of the analytical platform (MS or NMR spectroscopy based) used for data acquisition. Improved instrumentation in metabolomics generates increasingly complex datasets, which creates the need for more and better processing and analysis software and for in silico approaches to understand the resulting data. However, a comprehensive source of information describing the utility of the most recently developed and released metabolomics resources (tools, software, and databases) is currently lacking. Here we therefore provide an overview of freely available and open-source tools, algorithms, and frameworks, to make both new and established metabolomics researchers aware of recent developments and to facilitate data processing workflows in their research. The major topics include tools and resources for data processing, data annotation, and data visualization in MS- and NMR-based metabolomics. Most tools described in this review are dedicated to untargeted metabolomics workflows; however, some more specialist tools are described as well. All tools and resources described, including their analytical and computational platform dependencies, are summarized in an overview table.

    Development of computational tools for the analysis of 2D-nuclear magnetic resonance data

    Master's dissertation in Bioinformatics. Metabolomics is one of the omics sciences that has been gaining a lot of interest due to its potential for correlating an organism's biochemical activity with its phenotype. The applications of metabolomics keep expanding as new techniques reveal new information on metabolic profiles and molecules, thus elucidating biological, chemical and functional knowledge. The main data collection techniques are based on mass spectrometry and nuclear magnetic resonance (NMR) spectroscopy. The latter has the advantage of analyzing a sample in vivo without damaging it, and while its sensitivity has been pointed out as a disadvantage, multidimensional NMR offers a solution to this issue. By measuring additional nuclei, it adds layers of information, generating a new type of data that requires advanced bioinformatics methods in order to extract biological meaning. Since there are several approaches to multidimensional NMR, the need to establish an integrated framework that allows researchers to load their data and extract relevant knowledge has become more pressing over the years. In addition, establishing common data analysis pipelines for one-dimensional and multidimensional NMR remains a challenge in current scientific research, hindering reproducibility across research groups. In recent work from the host group, specmine, an R package for metabolomics and spectral data analysis/mining, was developed to wrap and deliver key metabolomic methods that allow a researcher to perform a complete analysis. Based on this package, an integrated web platform, WebSpecmine, was later developed with the same purpose, providing an easier and friendlier user interface. In this dissertation, tools integrated in specmine were developed to read, visualize and analyze two-dimensional (2D) NMR data. A new specmine structure was created for this type of data, easing interpretation and data visualization. In terms of visualization, a novel approach based on three-dimensional environments enables users to interact with their data, allowing peak hovering and identification of resonance-rich regions. When the user does not specify which samples to plot, the selection is based on a signal-to-noise ratio scale, so that the samples with the highest and lowest signal-to-noise ratios are plotted first. A method to perform peak detection on 2D NMR data based on a local maximum search was implemented to obtain a simplified data structure that best benefits from specmine's functionalities. These include preprocessing, univariate and multivariate analysis, as well as machine learning and feature selection methods. The 2D NMR functions were validated using experimental data from two scientific papers, available in metabolomics databases, applying the preprocessing steps necessary to compare spectra and results. These data gave rise to two case studies covering different NMR instrument vendors, Bruker and Varian, which underlines specmine's flexibility regarding the data formats it can read. The case studies were carried out mainly with specmine, although external packages were needed for specific processing steps, such as probabilistic quotient normalization. A pipeline to analyze 2D NMR data was added to specmine in the form of a vignette, a long-form documentation format for R packages, to provide users with a guideline for the correct use of the newly developed functionalities.
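The local maximum search mentioned in the abstract above can be illustrated with a minimal sketch. This is not the specmine implementation (specmine is an R package); the function name `find_peaks_2d`, the 8-neighbour rule, the intensity threshold and the toy matrix are all assumptions chosen for illustration.

```python
def find_peaks_2d(spectrum, threshold=0.0):
    """Return (row, col, intensity) for every point that lies above
    `threshold` and is strictly greater than all of its neighbours
    (up to 8, fewer on the border of the matrix)."""
    n_rows = len(spectrum)
    n_cols = len(spectrum[0]) if n_rows else 0
    peaks = []
    for i in range(n_rows):
        for j in range(n_cols):
            value = spectrum[i][j]
            if value <= threshold:
                continue  # treat low-intensity points as noise
            is_peak = True
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    if di == 0 and dj == 0:
                        continue
                    ni, nj = i + di, j + dj
                    inside = 0 <= ni < n_rows and 0 <= nj < n_cols
                    if inside and spectrum[ni][nj] >= value:
                        is_peak = False
            if is_peak:
                peaks.append((i, j, value))
    return peaks


# Toy intensity matrix (hypothetical values, not real NMR data).
toy = [
    [0, 1, 0, 0],
    [1, 5, 1, 0],
    [0, 1, 0, 2],
    [0, 0, 2, 9],
]
peaks = find_peaks_2d(toy, threshold=1.5)
print(peaks)
```

Reducing each 2D spectrum to a peak list in this way gives a compact table of features, which is the kind of simplified structure the abstract says downstream preprocessing and multivariate methods benefit from.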

    Bayesian Deconvolution and Quantification of Metabolites from J-Resolved NMR Spectroscopy

    Two-dimensional (2D) nuclear magnetic resonance (NMR) methods have become increasingly popular in metabolomics, since they have considerable potential to accurately identify and quantify metabolites within complex biological samples. 2D 1H J-resolved (JRES) NMR spectroscopy is a widely used method that expands overlapping resonances into a second dimension. However, existing analytical processing methods do not fully exploit the information in the JRES spectrum and, more importantly, do not provide measures of uncertainty associated with the estimates of quantities of interest, such as metabolite concentration. Combining the data-generating mechanisms and the extensive prior knowledge available in online databases, we develop a Bayesian method to analyse 2D JRES data, which allows for automatic deconvolution, identification and quantification of metabolites. The model extends and improves previous work on one-dimensional NMR spectral data. Our approach is based on a combination of B-spline tight wavelet frames and theoretical templates, and thus enables the automatic incorporation of expert knowledge within the inferential framework. Posterior inference is performed through specially devised Markov chain Monte Carlo methods. We demonstrate the performance of our approach via analyses of datasets from serum and urine, showing its advantages in terms of identification and quantification of metabolites.

    Bayesian Methods for Metabolomics

    Metabolomics, the large-scale study of small molecules, enables the underlying biochemical activity and state of cells or tissues to be directly captured. Nuclear Magnetic Resonance (NMR) spectroscopy is one of the major data capturing techniques for metabolomics, as it provides highly reproducible, quantitative information on a wide variety of metabolites. This work presents possible solutions to three problems, with the aim of aiding the development of better algorithms for NMR data analysis. After reviewing relevant concepts and literature, we first utilise observed NMR chemical shift titration data for a range of urinary metabolites and develop a theoretical model of chemical shift using a Bayesian statistical framework and model selection procedures to estimate the number of protonation sites, a key parameter for modelling the relationship between chemical shift variation and pH that is usually unknown for uncatalogued metabolites. Secondly, with the aim of obtaining explicit concentration estimates for metabolites from NMR spectra, we discuss a Monte Carlo Co-ordinate Ascent Variational Inference (MC-CAVI) algorithm that combines Markov chain Monte Carlo (MCMC) methods with Co-ordinate Ascent VI (CAVI), demonstrate MC-CAVI's suitability for models with hard constraints, and compare MC-CAVI's performance with that of MCMC in an important complex model used in NMR spectroscopy data analysis. The third contribution seeks to improve metabolite identification, one of the biggest bottlenecks in metabolomics and one that is severely hindered by resonance overlap in one-dimensional NMR spectroscopy. In particular, we present a novel Bayesian method for the widely used two-dimensional (2D) 1H J-resolved (JRES) NMR spectroscopy, which has considerable potential to accurately identify and quantify metabolites within complex biological samples, by combining B-spline tight wavelet frames with theoretical templates. We then demonstrate the effectiveness of our approach via analyses of JRES datasets from serum and urine.
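To make the relationship between chemical shift and pH more concrete, here is a minimal sketch of the textbook single-site case: under fast exchange, the observed shift is the population-weighted average of the protonated and deprotonated limiting shifts, with populations given by the Henderson-Hasselbalch relation. This is not the Bayesian model developed in the thesis, which also estimates the number of protonation sites; the function name and all numeric values below are hypothetical.

```python
def observed_shift(ph, pka, delta_acid, delta_base):
    """Observed chemical shift (ppm) for one protonation site under
    fast exchange between the protonated (delta_acid) and
    deprotonated (delta_base) forms."""
    ratio = 10 ** (pka - ph)  # [HA] / [A-] from Henderson-Hasselbalch
    return (delta_base + delta_acid * ratio) / (1 + ratio)


# Hypothetical metabolite: pKa 4.0, limiting shifts 2.0 and 3.0 ppm.
for ph in (2.0, 4.0, 6.0):
    print(ph, round(observed_shift(ph, 4.0, 2.0, 3.0), 4))
```

At pH equal to the pKa the two forms are equally populated, so the observed shift sits midway between the limiting values; far from the pKa it approaches one limit or the other, giving the familiar sigmoidal titration curve.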

    Dolphin and whale: development, evaluation and application of novel bioinformatics tools for metabolite profiling in high throughput 1H-NMR analysis

    Metabolite profiling is the most challenging task in NMR spectral analysis. It aims to understand the biological processes occurring at a given moment by identifying and quantifying the metabolites present in complex mixtures analysed by NMR. An NMR spectrum is composed of resonances from a huge number of metabolites, and these resonances often overlap with each other, shift position depending on the sample pH, and can be masked by signals from macromolecules. All these drawbacks hinder metabolite identification and quantification, so obtaining a curated metabolite profile of a sample can be a major challenge even for expert users. In this context, this thesis was motivated by the aim of providing automation and user-friendly interactive functions for NMR metabolite profiling, improving the quality of the results and reducing analysis time. To do so, several algorithms were implemented and packaged into two software tools, Dolphin and Whale.

    Statistical Methods in Metabolomics

    Metabolomics lies at the fulcrum of the systems biology ‘omics’. Metabolic profiling offers researchers new insight into genetic and environmental interactions, responses to pathophysiological stimuli and novel biomarker discovery. Metabolomics lacks the simplicity of a single data capturing technique; instead, increasingly sophisticated multivariate statistical techniques are required to tease out useful metabolic features from various complex datasets. In this work, two major metabolomics methods are examined: Nuclear Magnetic Resonance (NMR) Spectroscopy and Liquid Chromatography-Mass Spectrometry (LC-MS). MetAssimulo, a 1H-NMR metabolic-profile simulator, was developed in part by this author and is described in Chapter 2. Peak positional variation is a phenomenon occurring in NMR spectra that complicates metabolomic analysis, so Chapter 3 focuses on modelling the effect of pH on peak position. Analysis of LC-MS data is somewhat more complex given its 2-D structure, so I review existing pre-processing and feature detection techniques in Chapter 4 and then attempt to tackle the issue from a Bayesian viewpoint. A Bayesian Partition Model is developed to distinguish chromatographic peaks representing useful features from chemical and instrumental interference and noise. Another of the LC-MS pre-processing problems, data binning, is also explored as part of H-MS: a pre-processing algorithm incorporating wavelet smoothing and novel Gaussian and Exponentially Modified Gaussian peak detection. The performance of H-MS is compared with that of two existing pre-processing packages: apLC-MS and XCMS.

    An R-Package for the Deconvolution and Integration of 1D NMR Data: MetaboDecon1D

    NMR spectroscopy is a widely used method for the detection and quantification of metabolites in complex biological fluids. However, the large number of metabolites present in a biological sample such as urine or plasma leads to considerable signal overlap in one-dimensional NMR spectra, which in turn hampers both signal identification and quantification. As a consequence, we have developed an easy-to-use R package that allows the fully automated deconvolution of overlapping signals into their underlying Lorentzian line shapes. We show that precise integral values are computed, which are required to obtain both relative and absolute quantitative information. The algorithm is independent of any knowledge of the corresponding metabolites, which also allows the quantitative description of features of as yet unknown identity.
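The idea of separating overlapping signals into Lorentzian line shapes and integrating each one can be sketched as follows. This is a deliberately simplified illustration and not the MetaboDecon1D algorithm: it assumes the peak centres and half-widths are already known, so only the two amplitudes are estimated, by linear least squares. All names and numbers are hypothetical.

```python
import math


def lorentzian(x, center, half_width):
    """Unit-height Lorentzian line shape."""
    return half_width ** 2 / ((x - center) ** 2 + half_width ** 2)


def fit_two_amplitudes(xs, ys, peak1, peak2):
    """Least-squares amplitudes for two Lorentzians whose
    (center, half_width) pairs are fixed in advance."""
    b1 = [lorentzian(x, *peak1) for x in xs]
    b2 = [lorentzian(x, *peak2) for x in xs]
    # Normal equations of the 2-parameter linear model.
    s11 = sum(u * u for u in b1)
    s22 = sum(v * v for v in b2)
    s12 = sum(u * v for u, v in zip(b1, b2))
    t1 = sum(u * y for u, y in zip(b1, ys))
    t2 = sum(v * y for v, y in zip(b2, ys))
    det = s11 * s22 - s12 * s12
    a1 = (t1 * s22 - s12 * t2) / det
    a2 = (s11 * t2 - s12 * t1) / det
    return a1, a2


def lorentzian_area(amplitude, half_width):
    """Analytic integral of amplitude * lorentzian over the whole axis."""
    return math.pi * amplitude * half_width


# Synthetic, noise-free spectrum: two overlapping lines 0.02 ppm apart.
peak1, peak2 = (1.19, 0.01), (1.21, 0.01)
xs = [1.0 + 0.001 * i for i in range(400)]
ys = [3.0 * lorentzian(x, *peak1) + 1.5 * lorentzian(x, *peak2) for x in xs]

a1, a2 = fit_two_amplitudes(xs, ys, peak1, peak2)
area1 = lorentzian_area(a1, peak1[1])
area2 = lorentzian_area(a2, peak2[1])
print(a1, a2, area1, area2)
```

Because each component has a closed-form integral, the fitted parameters give signal areas directly, even where the raw lines overlap too much for simple numerical integration between fixed bounds. A real deconvolution also has to estimate the number of peaks, their positions and their widths, which makes the problem nonlinear.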

    Quantitative NMR-Based Biomedical Metabolomics: Current Status and Applications

    Nuclear Magnetic Resonance (NMR) spectroscopy is a quantitative analytical tool commonly utilized for metabolomics analysis. Quantitative NMR (qNMR) is a field of NMR spectroscopy dedicated to the measurement of analytes through signal intensity and its linear relationship with analyte concentration. NMR-based metabolomics exploits this quantitative relationship to identify and measure biomarkers within complex biological samples such as serum, plasma, and urine. This review of quantitative NMR-based metabolomics evaluates the advancements and limitations of current techniques for metabolite quantification, as well as the applications of qNMR in biomedical metabolomics. While qNMR is limited by sensitivity and dynamic range, its simple method development, minimal sample derivatization, and simultaneous qualitative and quantitative information provide a unique landscape for biomedical metabolomics that is not available to other techniques. Furthermore, the non-destructive nature of NMR-based metabolomics allows for multidimensional analysis of biomarkers that facilitates unambiguous assignment and quantification of metabolites in complex biofluids.
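The linear relationship between signal intensity and concentration is commonly used with an internal standard of known concentration: each integral is divided by the number of nuclei contributing to that signal, and the ratio of the normalised integrals scales the standard's concentration. The sketch below shows that arithmetic only; the function name and the example numbers are hypothetical, and it assumes fully relaxed spectra with both signals measured in the same sample.

```python
def concentration_from_integrals(integral_analyte, nuclei_analyte,
                                 integral_standard, nuclei_standard,
                                 conc_standard):
    """Analyte concentration, in the same units as conc_standard,
    from integrals normalised by the number of contributing nuclei."""
    per_nucleus_analyte = integral_analyte / nuclei_analyte
    per_nucleus_standard = integral_standard / nuclei_standard
    return conc_standard * per_nucleus_analyte / per_nucleus_standard


# Hypothetical example: a 3-proton analyte signal with integral 6.0 and
# a 9-proton reference signal with integral 9.0 at 0.5 mM.
conc = concentration_from_integrals(6.0, 3, 9.0, 9, 0.5)
print(conc)  # concentration in mM
```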

    MetaboHunter: an automatic approach for identification of metabolites from 1H-NMR spectra of complex mixtures

    Background: One-dimensional 1H-NMR spectroscopy is widely used for high-throughput characterization of metabolites in complex biological mixtures. However, the accurate identification of individual compounds is still a challenging task, particularly in spectral regions with higher peak densities. The need for automatic tools that facilitate such tasks and further improve their accuracy, while using increasingly large reference spectral libraries, has become a priority of current metabolomics research. Results: We introduce a web server application, called MetaboHunter, which can be used for automatic assignment of 1H-NMR spectra of metabolites. MetaboHunter provides methods for automatic metabolite identification based on spectra or peak lists, with three different search methods and with allowance for peak drift within a user-defined spectral range. The assignment is performed using as reference libraries manually curated data from two major publicly available databases of NMR metabolite standard measurements (HMDB and MMCD). Tests using a variety of synthetic and experimental spectra of single- and multi-metabolite mixtures show that MetaboHunter is able to identify, on average, more than 80% of detectable metabolites from spectra of synthetic mixtures and more than 50% from spectra corresponding to experimental mixtures. This work also suggests that better scoring functions improve the performance of MetaboHunter's metabolite identification methods by more than 30%. Conclusions: MetaboHunter is a freely accessible, easy-to-use and user-friendly 1H-NMR-based web server application that provides efficient data input and pre-processing, flexible parameter settings, fast and automatic metabolite fingerprinting, and results visualization via intuitive plotting and compound peak hit maps. Compared to other published and freely accessible metabolomics tools, MetaboHunter implements three efficient methods to search for metabolites in manually curated data from two reference libraries. Availability: http://www.nrcbioinformatics.ca/metabohunter/
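Library-based identification with a tolerance for peak drift can be sketched generically as follows. The abstract does not describe MetaboHunter's three search methods or its scoring functions, so this is not its algorithm: the scoring rule (fraction of reference peaks matched), the function names, the tolerance and the toy library are all assumptions, and the peak positions are illustrative rather than taken from HMDB or MMCD.

```python
def match_score(query_peaks, reference_peaks, tolerance=0.02):
    """Fraction of reference peaks that have at least one query peak
    within `tolerance` ppm."""
    if not reference_peaks:
        return 0.0
    matched = sum(
        1 for ref in reference_peaks
        if any(abs(ref - q) <= tolerance for q in query_peaks)
    )
    return matched / len(reference_peaks)


def rank_candidates(query_peaks, library, tolerance=0.02):
    """Score every library entry and sort from best to worst."""
    scores = {
        name: match_score(query_peaks, peaks, tolerance)
        for name, peaks in library.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy library and query peak list (ppm); values are illustrative only.
library = {
    "compound_A": [1.33, 4.11],
    "compound_B": [1.48, 3.78],
    "compound_C": [2.05, 3.20, 4.50],
}
query = [1.34, 4.10, 3.77]
ranking = rank_candidates(query, library)
print(ranking)
```

A wider tolerance accommodates larger pH- or matrix-induced drift but also increases false matches in crowded regions, which is one reason the choice of scoring function matters for identification accuracy.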