439 research outputs found

    Updates in metabolomics tools and resources: 2014-2015

    Data processing and interpretation represent the most challenging and time-consuming steps in high-throughput metabolomic experiments, regardless of the analytical platforms (MS or NMR spectroscopy based) used for data acquisition. Improved machinery in metabolomics generates increasingly complex datasets that create the need for more and better processing and analysis software and in silico approaches to understand the resulting data. However, a comprehensive source of information describing the utility of the most recently developed and released metabolomics resources (tools, software, and databases) is currently lacking. Thus, here we provide an overview of freely available and open-source tools, algorithms, and frameworks to make both upcoming and established metabolomics researchers aware of recent developments, in an attempt to advance and facilitate data processing workflows in their metabolomics research. The major topics include tools and resources for data processing, data annotation, and data visualization in MS and NMR-based metabolomics. Most tools described in this review are dedicated to untargeted metabolomics workflows; however, some more specialist tools are described as well. All tools and resources described, including their analytical and computational platform dependencies, are summarized in an overview table

    Scientific Workflows for Metabolic Flux Analysis

    Metabolic engineering is a highly interdisciplinary research domain that interfaces biology, mathematics, computer science, and engineering. Metabolic flux analysis with carbon tracer experiments (13C-MFA) is a particularly challenging metabolic engineering application that consists of several tightly interwoven building blocks such as modeling, simulation, and experimental design. While several general-purpose workflow solutions have emerged in recent years to support the realization of complex scientific applications, these approaches are only partially transferable to 13C-MFA workflows. While problems in other research fields (e.g., bioinformatics) are primarily centered around scientific data processing, 13C-MFA workflows have more in common with business workflows. For instance, many bioinformatics workflows are designed to identify, compare, and annotate genomic sequences by "pipelining" them through standard tools like BLAST. Typically, the next workflow task in the pipeline can be automatically determined by the outcome of the previous step. Five computational challenges have been identified in the endeavor of conducting 13C-MFA studies: organization of heterogeneous data, standardization of processes and the unification of tools and data, interactive workflow steering, distributed computing, and service orientation. The outcome of this thesis is a scientific workflow framework (SWF) that is custom-tailored for the specific requirements of 13C-MFA applications. The proposed approach, namely designing the SWF as a collection of loosely coupled modules that are glued together with web services, eases the realization of 13C-MFA workflows by offering several features. By design, existing tools are integrated into the SWF using web service interfaces and foreign programming language bindings (e.g., Java or Python). Although the attributes "easy-to-use" and "general-purpose" are rarely associated with distributed computing software, the presented use cases show that the proposed Hadoop MapReduce framework eases the deployment of computationally demanding simulations on cloud and cluster computing resources. An important building block for allowing interactive researcher-driven workflows is the ability to track all data that is needed to understand and reproduce a workflow. The standardization of 13C-MFA studies using a folder structure template and the corresponding services and web interfaces improves the exchange of information within a group of researchers. Finally, several auxiliary tools are developed in the course of this work to complement the SWF modules, ranging from simple helper scripts to visualization and data conversion programs. This solution distinguishes itself from other scientific workflow approaches by offering a system of loosely coupled components that are flexibly arranged to match the typical requirements in the metabolic engineering domain. Because the framework is modern and service-oriented, new applications are easily composed by reusing existing components

    The Experiment Data Depot: A Web-Based Software Tool for Biological Experimental Data Storage, Sharing, and Visualization

    Although recent advances in synthetic biology allow us to produce biological designs more efficiently than ever, our ability to predict the end result of these designs is still nascent. Predictive models require large amounts of high-quality data to be parametrized and tested, which are not generally available. Here, we present the Experiment Data Depot (EDD), an online tool designed as a repository of experimental data and metadata. EDD provides a convenient way to upload a variety of data types, visualize these data, and export them in a standardized fashion for use with predictive algorithms. In this paper, we describe EDD and showcase its utility for three different use cases: storage of characterized synthetic biology parts, leveraging proteomics data to improve biofuel yield, and the use of extracellular metabolite concentrations to predict intracellular metabolic fluxes

    From spectrometric data to metabolic networks: an integrated view of cell metabolism

    Understanding the molecular basis of life has been in the spotlight of biochemistry research for more than a century. Molecular biology has taken medicine forward thanks to technological breakthroughs like DNA sequencing and CRISPR editing. However, in order to understand metabolism we must rely on the study of metabolite profiles and metabolic reactions. The purpose of this thesis is to contribute to this area, which unites the fields of proteomics and metabolomics. Traditionally, omics data analysis treats variables independently, even though it is well established that molecular mechanisms involve the interaction of diverse molecules and pathways; data should therefore be analyzed accordingly. A vast number of metabolic pathways have been described, together with the enzymes responsible for metabolite transformations, and this information has been assembled in databases that, in turn, can be used to build metabolic networks. Here, we use metabolic networks to develop an algorithm that predicts metabolite dysregulation based on enzyme expression profiles from quantitative proteomics. To validate the predictions, it is possible to measure the abundance of metabolites or their flux, namely the rate at which they are transformed, using stable isotope labelling experiments; both measurements can be performed by metabolomics. In this thesis, two metabolomics-based stable isotope labelling approaches have been developed: one for the targeted quantification of flux in central carbon metabolites, and one for the untargeted detection of isotope-labelled metabolites in other metabolic pathways. These approaches have been tested in different studies and have yielded remarkable results, revealing new molecular mechanisms in a diabetes complication and in cancer metabolism

    The Design of FluxML: A Universal Modeling Language for 13C Metabolic Flux Analysis

    13C metabolic flux analysis (MFA) is the method of choice when a detailed inference of intracellular metabolic fluxes in living organisms under metabolic quasi-steady state conditions is desired. Continuously developed for two decades, the technology has made major contributions to the quantitative characterization of organisms in all fields of biotechnology and health-related research. 13C MFA, however, stands out from other “-omics sciences” in that it requires not only experimental-analytical data, but also mathematical models and a computational toolset to infer the quantities of interest, i.e., the metabolic fluxes. At present, these models cannot be conveniently exchanged between different labs. Here, we present the implementation-independent model description language FluxML for specifying 13C MFA models. The core of FluxML captures the metabolic reaction network together with atom mappings, constraints on the model parameters, and the wealth of data configurations. In particular, we describe the governing design processes that shaped the FluxML language. We demonstrate the utility of FluxML to represent many contemporary experimental-analytical requirements in the field of 13C MFA. The major aim of FluxML is to offer a sound, open, and future-proof language to unambiguously express and conserve all the information necessary for model re-use, exchange, and comparison. Along with FluxML, several powerful computational tools are supplied that make handling easy while maintaining a maximum of flexibility. Altogether, the FluxML collection is an “all-around carefree package” for 13C MFA modelers. We believe that FluxML improves scientific productivity as well as transparency and thereby contributes to the efficiency and reproducibility of computational modeling efforts in the field of 13C MFA

    A system-wide stable isotope labeling approach for connecting natural products to their biosynthetic gene clusters

    Although the first bacterial genome sequence was published almost 20 years ago, there is still no generalizable method for automatically assigning natural products to their cognate biosynthetic gene clusters (BGCs). This thesis describes the development of a mass spectrometry-based parallel stable isotope labeling (SIL) platform, termed IsoAnalyst, which automatically associates metabolite stable isotope labeling patterns with BGC structure prediction in order to connect natural products to their cognate BGCs. The parallel SIL experiments were optimized for small scale, and a custom tool written in Python was developed for the untargeted detection and interpretation of SIL patterns. This approach was validated in the industrial production strains Saccharopolyspora erythraea and Amycolatopsis mediterranei, demonstrating that the compounds erythromycin A and rifamycin SV, respectively, could be associated with the proper BGCs based on the distribution of isotopomer labeling patterns. The method was further validated by connecting known biosynthetic intermediates of these compounds to their associated BGCs and by identifying various siderophores through a combination of SIL patterns and MS/MS fragmentation data. Extension to environmental organisms using a sequenced Micromonospora sp. from our Actinobacterial isolate library led to the discovery of lobosamide D, a new member of the lobosamide family of natural products, and an update to the lobosamide BGC to include relevant tailoring enzymes. This discovery illustrates the power of the IsoAnalyst platform for identifying new compounds, linking molecules to BGCs, and generating new knowledge about biosynthesis

    Deciphering Chronometabolic Dynamics Through Metabolomics, Stable Isotope Tracers, And Genome-Scale Reaction Modeling

    Synchrony across environmental cues, endogenous genetic clocks, sleep/wake cycles, and metabolism evokes physiological harmony for organismal health. Perturbation of this synchrony has recently been correlated with a growing list of pathologies, which is alarming given the ubiquity of sleep deprivation, mistimed light exposure, and altered eating schedules in modern society. Deeper insights into clocks, sleep, and metabolism are necessary to understand these outcomes. In this work, extensive metabolic profiles of circadian systems were obtained through the development of new liquid chromatography mass spectrometry (LC-MS) metabolomics methods. These methods were applied to Drosophila melanogaster to discern the relative influences of environmental and genetic drivers of metabolic cycles. Unique sets of metabolites oscillated with 24-hour circadian periods under light:dark (LD) and constant darkness (DD) conditions, and ultradian rhythms were noted for clock mutant flies under LD, suggesting clock-independent metabolic cycles driven by environmental inputs. However, this metabolomic analysis does not fully capture the inherently dynamic nature of circadian metabolism. These LC-MS methods were therefore adapted to analyze isotope enrichments from a novel 13C6 glucose injection platform in Drosophila. Metabolic flux cycles were noted from glucose carbons into serine, glutamine, and reduced glutathione biosynthesis, and were altered under sleep deprivation, demonstrating unique energy and redox demands in perturbed sleep/wake cycles. Global isotopolome shifts were most notable in WT flies after lights-on, suggesting a catabolic rush from glucose oxidation early in the active phase. As the scope of these isotope tracer-based metabolomic analyses expands, attributing labeling patterns to specific reactions requires consideration of genome-scale metabolic networks. A new computational approach, called the IsoPathFinder, was developed; it uncovered biosynthetic paths from glucose to serine and extends to glycine and glutathione production. Carbon flux into glutamine was predicted to occur through the TCA cycle, supported by enzyme thermodynamics and circadian expression datasets. This tool is presented as a new mechanism to simulate additional isotope tracer experiments, with broad applicability beyond circadian research. Collectively, a new set of analytical and computational tools is developed to both produce dynamic metabolomic data and improve data interpretability, with applications to uncover new chronometabolic connections

    Cell-free Metabolic Engineering Strategies for Accelerated Biomanufacturing

    Biomanufacturing propels the bioeconomy. Accelerating bioeconomic growth thus requires the expedited development of biomanufacturing processes that can expand the current bioproduct portfolio. Lysate-based cell-free systems provide unique advantages for simplified metabolic pathway construction. Their central metabolic pathways and transcriptional-translational (TX-TL) machineries are free from genome regulation and are amenable to direct manipulation, enabling the streamlined construction of biomanufacturing processes. While their utility as prototyping platforms for accelerating cellular metabolic engineering has been demonstrated, the potential to rapidly build commercial “cell-free factories” capable of sophisticated bioconversion has not been fully realized. Lysates with high-yield pathways are projected to enable commercialized cell-free biomanufacturing of high-value chemicals. However, strategies that can be incorporated into frameworks for lysate pathway yield optimization rely on traditional cell-based metabolic engineering techniques that are cumbersome to extracts’ source strains (e.g., gene knockouts and cell-based overexpression) and thus lengthen design-build-test-learn (DBTL) cycles. The work in this dissertation introduces new strategies for cell-free pathway engineering that can benefit conversion yields in lysates, with a focus on 1) restructuring the endogenous metabolic proteome post-lysis and 2) optimizing the lysate TX-TL machinery for the cell-free overexpression of complex heterologous enzymes. By minimizing the involvement of cell-based engineering, these strategies enable faster endogenous and heterologous pathway build cycles compared to previous approaches. Part one describes the development of the first lysate flux-rewiring approach that enables conversion through a native pathway at 100% of the theoretical yield. 
Part two reports on the design and development of a plate reader assay for troubleshooting the expression of biosynthetic enzymes in lysates. The assay allows the generalizable screening of cell-free expression conditions with higher throughput than previous approaches and is leveraged to improve the lysate-based expression of natural product forming megasynthases. Refining and integrating these approaches into cost-effective cell-free metabolic engineering (CFME) workflows will enable rapid high-yield metabolic pathway construction, advancing lysates as sustainable biomanufacturing platforms

    Improvement of KiMoSys framework for kinetic modelling

    In recent years, the increasing amount of biological data being produced has shown the importance of data repositories. Databases provide an easier way to reuse and share research data within the scientific community. Among their most important features are quick access to data, described by metadata and available in standard formats, and compliance with the FAIR (Findable, Accessible, Interoperable and Reusable) Data Principles for data management. KiMoSys (https://kimosys.org) is a public domain-specific repository of experimental data, containing enzyme and metabolite concentration data and flux data. It offers a web-based interface and upload facility to publish data, making it accessible in standard formats, while also integrating kinetic models related to the data. This thesis is a contribution to the improvement and extension of KiMoSys. It includes the addition of more downloadable data formats, the introduction of data visualization, the incorporation of more tools to filter data, the integration of a simulation environment for kinetic models, and the inclusion of a unique persistent identifier system. As a result, a new version of KiMoSys is provided, with a renewed interface, multiple new features, and enhancements to the previously existing ones. These are in accordance with all FAIR data principles. Therefore, it is believed that KiMoSys v2.0 will be an important tool for the systems biology modeling community