933 research outputs found

    Atom Identifiers Generated by a Neighborhood-Specific Graph Coloring Method Enable Compound Harmonization across Metabolic Databases

    Metabolic flux analysis requires both a reliable metabolic model and reliable metabolic profiles in characterizing metabolic reprogramming. Advances in analytic methodologies enable production of high-quality metabolomics datasets capturing isotopic flux. However, useful metabolic models can be difficult to derive due to the lack of relatively complete atom-resolved metabolic networks for a variety of organisms, including human. Here, we developed a neighborhood-specific graph coloring method that creates unique identifiers for each atom in a compound facilitating construction of an atom-resolved metabolic network. What is more, this method is guaranteed to generate the same identifier for symmetric atoms, enabling automatic identification of possible additional mappings caused by molecular symmetry. Furthermore, a compound coloring identifier derived from the corresponding atom coloring identifiers can be used for compound harmonization across various metabolic network databases, which is an essential first step in network integration. With the compound coloring identifiers, 8865 correspondences between KEGG (Kyoto Encyclopedia of Genes and Genomes) and MetaCyc compounds are detected, with 5451 of them confirmed by other identifiers provided by the two databases. In addition, we found that the Enzyme Commission numbers (EC) of reactions can be used to validate possible correspondence pairs, with 1848 unconfirmed pairs validated by commonality in reaction ECs. Moreover, we were able to detect various issues and errors with compound representation in KEGG and MetaCyc databases by compound coloring identifiers, demonstrating the usefulness of this methodology for database curation
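
The idea of deriving atom identifiers from each atom's neighborhood can be illustrated with a generic iterative color-refinement scheme (in the style of Weisfeiler-Lehman or Morgan labeling). This is only a minimal sketch under stated assumptions, not the method described in the abstract above: the function names `atom_colors` and `compound_color`, the input layout (element symbols plus `(i, j, bond_order)` tuples), and the use of a truncated SHA-1 digest as the color are all hypothetical choices made for illustration. Stereochemistry, charges, and hydrogens are ignored.

```python
import hashlib


def atom_colors(atoms, bonds):
    """Return one color string per atom via iterative neighborhood refinement.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j, bond_order) tuples using indices into `atoms`
    (Hypothetical input layout chosen for this sketch.)
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j, order in bonds:
        neighbors[i].append((j, order))
        neighbors[j].append((i, order))

    colors = list(atoms)  # round 0: the element symbol alone
    for _ in range(len(atoms)):
        new_colors = []
        for i in range(len(atoms)):
            # Describe the atom by its own color plus the sorted
            # (bond order, neighbor color) pairs around it.
            env = sorted(f"{order}{colors[j]}" for j, order in neighbors[i])
            label = colors[i] + "(" + ",".join(env) + ")"
            new_colors.append(hashlib.sha1(label.encode()).hexdigest()[:12])
        # Each round can only split color classes, never merge them, so an
        # unchanged class count means the partition is stable.
        stable = len(set(new_colors)) == len(set(colors))
        colors = new_colors
        if stable:
            break
    return colors


def compound_color(atoms, bonds):
    """Order-independent compound identifier built from the atom colors."""
    return "|".join(sorted(atom_colors(atoms, bonds)))


# Carbon skeleton of propane: the two terminal carbons are symmetric.
propane = atom_colors(["C", "C", "C"], [(0, 1, 1), (1, 2, 1)])
print(propane[0] == propane[2], propane[0] == propane[1])

# The same acetate-like skeleton written with two different atom orderings.
a = compound_color(["C", "C", "O", "O"], [(0, 1, 1), (1, 2, 2), (1, 3, 1)])
b = compound_color(["O", "C", "O", "C"], [(1, 3, 1), (1, 0, 2), (1, 2, 1)])
print(a == b)
```

In this sketch, atoms related by a molecular symmetry receive the same color, and the sorted list of atom colors gives a compound-level identifier that does not depend on how a database happens to number the atoms. Plain color refinement cannot separate every pair of non-isomorphic structures, so a production method would need additional safeguards.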

    DEVELOPMENT OF TOOLS FOR ATOM-LEVEL INTERPRETATION OF STABLE ISOTOPE-RESOLVED METABOLOMICS DATASETS

    Metabolomics is the global study of small molecules in living systems under a given state, emerging as a new ‘omics’ study in systems biology. It has shown great promise in elucidating biological mechanisms in various areas. Many diseases, especially cancers, are closely linked to reprogrammed metabolism. As the end point of biological processes, metabolic profiles are more representative of the biological phenotype compared to genomic or proteomic profiles. Therefore, characterizing the metabolic phenotype of various diseases will help clarify the metabolic mechanisms and promote the development of novel and effective treatment strategies. Advances in analytical technologies such as nuclear magnetic resonance and mass spectrometry greatly contribute to the detection and characterization of global metabolites in a biological system. Furthermore, application of these analytical tools to stable isotope-resolved metabolomics (SIRM) experiments can generate large-scale high-quality metabolomics data containing isotopic flow through cellular metabolism. However, the lack of corresponding computational analysis tools hinders the characterization of metabolic phenotypes and the downstream applications. Both detailed metabolic modeling and quantitative analysis are required for proper interpretation of these complex metabolomics data. For metabolic modeling, currently there is no comprehensive metabolic network at an atom-resolved level that can be used for deriving context-specific metabolic models for SIRM metabolomics datasets. For quantitative analysis, most available tools conduct metabolic flux analysis based on a well-defined metabolic model, which is hard to achieve for complex biological systems due to the limitations in our knowledge. Here, we developed a set of methods to address these problems. First, we developed a neighborhood-specific coloring method that can create an identifier for each atom in a specific compound. 
With the atom identifiers, we successfully harmonized compounds and reactions across KEGG and MetaCyc databases at various levels. In addition, we evaluated the atom mappings of the harmonized metabolic reactions. These results will contribute to the construction of a comprehensive atom-resolved metabolic network. In addition, this method can be easily applied to any metabolic database that provides a molfile representation of compounds, which will greatly facilitate future expansion. In addition, we developed a moiety modeling framework to deconvolute metabolite isotopologue profiles using moiety models along with the analysis and selection of the best moiety model(s) based on the experimental data. To our knowledge, this is the first method that can analyze datasets involving multiple isotope tracers. Furthermore, instead of a single predefined metabolic model, this method allows the comparison of multiple metabolic models derived from a given metabolic profile, and we have demonstrated the robust performance of the moiety modeling framework in model selection with a 13C-labeled UDP-GlcNAc isotopologue dataset. We further explored the data quality requirements and the factors that affect model selection. Collectively, these methods and tools help interpret SIRM metabolomics datasets from metabolic modeling to quantitative analysis

    Identification of metabolic pathways using pathfinding approaches: A systematic review

    Metabolic pathways have become increasingly available for various microorganisms. Such pathways have spurred the development of a wide array of computational tools, in particular, mathematical pathfinding approaches. This article can facilitate the understanding of computational analysis of metabolic pathways in genomics. Moreover, stoichiometric and pathfinding approaches in metabolic pathway analysis are discussed. Three major types of studies are elaborated: stoichiometric identification models, pathway-based graph analysis and pathfinding approaches in cellular metabolism. Furthermore, evaluation of the outcomes of the pathways with mathematical benchmarking metrics is provided. This review would lead to better comprehension of metabolism behaviors in living cells, in terms of computed pathfinding approaches. © The Author 2016

    Automatic mapping of atoms across both simple and complex chemical reactions

    Mapping atoms across chemical reactions is important for substructure searches, automatic extraction of reaction rules, identification of metabolic pathways, and more. Unfortunately, the existing mapping algorithms can deal adequately only with relatively simple reactions but not those in which expert chemists would benefit from computer's help. Here we report how a combination of algorithmics and expert chemical knowledge significantly improves the performance of atom mapping, allowing the machine to deal with even the most mechanistically complex chemical and biochemical transformations. The key feature of our approach is the use of few but judiciously chosen reaction templates that are used to generate plausible "intermediate" atom assignments which then guide a graph-theoretical algorithm towards the chemically correct isomorphic mappings. The algorithm performs significantly better than the available state-of-the-art reaction mappers, suggesting its uses in database curation, mechanism assignments, and - above all - machine extraction of reaction rules underlying modern synthesis-planning programs

    Molecular Similarity and Xenobiotic Metabolism

    MetaPrint2D, a new software tool implementing a data-mining approach for predicting sites of xenobiotic metabolism has been developed. The algorithm is based on a statistical analysis of the occurrences of atom centred circular fingerprints in both substrates and metabolites. This approach has undergone extensive evaluation and been shown to be of comparable accuracy to current best-in-class tools, but is able to make much faster predictions, for the first time enabling chemists to explore the effects of structural modifications on a compound’s metabolism in a highly responsive and interactive manner. MetaPrint2D is able to assign a confidence score to the predictions it generates, based on the availability of relevant data and the degree to which a compound is modelled by the algorithm. In the course of the evaluation of MetaPrint2D a novel metric for assessing the performance of site of metabolism predictions has been introduced. This overcomes the bias introduced by molecule size and the number of sites of metabolism inherent to the most commonly reported metrics used to evaluate site of metabolism predictions. This data mining approach to site of metabolism prediction has been augmented by a set of reaction type definitions to produce MetaPrint2D-React, enabling prediction of the types of transformations a compound is likely to undergo and the metabolites that are formed. This approach has been evaluated against both historical data and metabolic schemes reported in a number of recently published studies. Results suggest that the ability of this method to predict metabolic transformations is highly dependent on the relevance of the training set data to the query compounds. MetaPrint2D has been released as an open source software library, and both MetaPrint2D and MetaPrint2D-React are available for chemists to use through the Unilever Centre for Molecular Science Informatics website.----Boehringer-Ingelhie

    Computational Studies on the Evolution of Metabolism

    Living organisms throughout evolution have developed desired properties, such as the ability of maintaining functionality despite changes in the environment or their inner structure, the formation of functional modules, from metabolic pathways to organs, and most essentially the capacity to adapt and evolve in a process called natural selection. It can be observed in the metabolic networks of modern organisms that many key pathways such as the citric acid cycle, glycolysis, or the biosynthesis of most amino acids are common to all of them. Understanding the evolutionary mechanisms behind this development of complex biological systems is an intriguing and important task of current research in biology as well as artificial life. Several competing hypotheses for the formation of metabolic pathways and the mechanisms that shape metabolic networks have been discussed in the literature, each of which finds support from comparative analysis of extant genomes. However, while being powerful tools for the investigation of metabolic evolution, these traditional methods do not allow to look back in evolution far enough to the time when metabolism had to emerge and evolve to the form we can observe today. To this end, simulation studies have been introduced to discover the principles of metabolic evolution and the sources for the emergence of metabolism properties. These approaches differ considerably in the realism and explicitness of the underlying models. A difficult trade-off between realism and computational feasibility has to be made and further modeling decisions on many scales have to be taken into account, requiring the combination of knowledge from different fields such as chemistry, physics, biology and last but not least also computer science. In this thesis, a novel computational model for the in silico evolution of early metabolism is introduced. 
It comprises all the components on different scales to resemble a situation of evolving metabolic protocells in an RNA-world. Therefore, the model contains a minimal RNA-based genetics and an evolving metabolism of catalytic ribozymes that manipulate a rich underlying chemistry. To allow the metabolic organization to escape from the confines of the chemical space set by the initial conditions of the simulation and in general an open-ended evolution, an evolvable sequence-to-function map is used. At the heart of the metabolic subsystem is a graph-based artificial chemistry equipped with a built-in thermodynamics. The generation of the metabolic reaction network is realized as a rule-based stochastic simulation. The necessary reaction rates are calculated from the chemical graphs of the reactants on the fly. The selection procedure among the population of protocells is based on the optimal metabolic yield of the protocells, which is computed using flux balance analysis. The introduced computational model allows for profound investigations of the evolution of early metabolism and the underlying evolutionary mechanisms. One application in this thesis is the study of the formation of metabolic pathways. Therefore, four established hypotheses, namely the backwards evolution, forward evolution, patchwork evolution and the shell hypothesis, are discussed within the realms of this in silico evolution study. The metabolic pathways of the networks, evolved in various simulation runs, are determined and analyzed in terms of their evolutionary direction. The simulation results suggest that the seemingly mutually exclusive hypotheses may well be compatible when considering that different processes dominate different phases in the evolution of a metabolic system. Further, it is found that forward evolution shapes the metabolic network in the very early steps of evolution. 
In later and more complex stages, enzyme recruitment supersedes forward evolution, keeping a core set of pathways from the early phase. Backward evolution can only be observed under conditions of steady environmental change. Additionally, the evolutionary histories of enzymes and metabolites were studied on the network level as well as for single instances, showing a great variety of evolutionary mechanisms at work. The second major focus of the in silico evolutionary study is the emergence of complex system properties, such as robustness and modularity. To this end, several techniques to analyze the metabolic systems were used. The measures for complex properties stem from the fields of graph theory, steady state analysis and neutral network theory. Some are used in general network analysis and others were developed specifically for the purpose introduced in this work. To discover potential sources for the emergence of system properties, three different evolutionary scenarios were tested and compared. The first two scenarios are the same as for the first part of the investigation, one scenario of evolution under static conditions and one incorporating a steady change in the set of “food” molecules. A third scenario was added that also simulates a static evolution but with an increased mutation rate and regular events of horizontal gene transfer between protocells of the population. The comparison of all three scenarios with real world metabolic networks shows a significant similarity in structure and properties. Among the three scenarios, the two static evolutions yield the most robust metabolic networks; however, the networks evolved under environmental change exhibit their own strategy to a robustness more suited to their conditions. As expected from theory, horizontal gene transfer and changes in the environment seem to produce higher degrees of modularity in metabolism. 
Both scenarios develop rather different kinds of modularity: while horizontal gene transfer provides for more isolated modules, the modules of the second scenario are far more interconnected

    Computational Studies on Cellular Metabolism: From Biochemical Pathways to Complex Metabolic Networks

    Biotechnology promises the biologically and ecologically sustainable production of commodity chemicals, biofuels, pharmaceuticals and other high-value products using industrial platform microorganisms. Metabolic engineering plays a key role in this process, providing the tools for targeted modifications of microbial metabolism to create efficient microbial cell factories that convert low value substrates to value-added chemicals. Engineering microbes for the bioproduction of chemicals has been practiced through three different approaches: (i) optimization of native pathways of a host organism; (ii) incorporation of heterologous pathways in an amenable organism; and finally (iii) design and introduction of synthetic pathways in an organism. So far, the progress that has been made in the biosynthesis of chemicals was mostly achieved using the first two approaches. Nevertheless, many novel biosynthetic pathways for the production of native and non-native compounds that have potential to provide near-theoretical yields and high specific production rates of chemicals remain yet to be discovered. Therefore, the third approach is crucial for the advancement of bio-based production of value-added chemicals. We need to fully comprehend and analyze the existing knowledge of metabolism in order to generate new hypotheses and design de novo pathways. In this thesis, through development and application of efficient computational methods, we took the research path to expand our understanding of cell metabolism with the aim to discover novel knowledge about metabolic networks. We analyze different aspects of metabolism through five distinct studies. In the first study, we begin with a holistic view of the enzymatic reactions across all the species, and we propose a computational approach for identifying all the theoretically possible enzymatic reactions based on the known biochemistry. We organize our results in a web-based database called "Atlas of biochemistry". 
In the second study, we focus on one of the most structurally diverse and ubiquitous constituents of metabolism, the lipid metabolism. Here we propose a computational framework for integrating lipid species with unknown metabolic/catabolic pathways into metabolic networks. In our next study, we investigate the full metabolic capacity of E. coli. We explore computationally all enzymatic potentials of this organism, and we introduce the "Super E. coli", a new and advanced chassis for metabolic engineering studies. Our next contribution concentrates on the development of a new method for the atom-level description of metabolic networks. We demonstrate the significance of our approach through the reconstruction of atom-level map of the E. coli central metabolism. In the last study, we turn our focus on studying the thermodynamics of metabolism and we present our original approach for estimating the thermodynamic properties of an important class of metabolites. So far, the available thermodynamic properties either from experiments or the computational methods are estimated with respect to the standard conditions, which are different from typical biological conditions. Our workflow paves the way for reliable computing of thermochemical properties of biomolecules at biological conditions of temperature and pressure. Finally, in the conclusion chapter, we discuss the outlook of this work and the potential further applications of the computational methods that were developed in this thesis