1,151 research outputs found

    Efficient Reconstruction of Metabolic Pathways by Bidirectional Chemical Search

    One of the main challenges in systems biology is the establishment of the metabolome: a catalogue of the metabolites and biochemical reactions present in a specific organism. Current knowledge of biochemical pathways, as stored in public databases such as KEGG, is based on carefully curated genomic evidence for the presence of specific metabolites and enzymes that activate particular biochemical reactions. In this paper, we present an efficient method, based on bidirectional chemical search, to build a substantial portion of the artificial chemistry defined by the metabolites and biochemical reactions in a given metabolic pathway. Computational results on the pathways stored in KEGG reveal novel biochemical pathways.
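The general idea of searching from both ends of a pathway can be illustrated with a minimal sketch. It is a plain bidirectional breadth-first search over a directed substrate-to-product graph, not the chemical search method of the paper, which the abstract does not detail. The function names and the toy reaction list (the first steps of glycolysis plus one branch) are assumptions made for illustration.

```python
from collections import deque

# Toy substrate -> product edges (illustrative only, not taken from KEGG).
TOY_REACTIONS = [
    ("glucose", "G6P"),
    ("G6P", "F6P"),
    ("F6P", "F16BP"),
    ("G6P", "6PGL"),
]

def _expand(frontier, parents, other_parents, neighbours):
    """Expand one BFS level; return a node seen by both searches, or None."""
    for _ in range(len(frontier)):
        node = frontier.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in parents:
                parents[nxt] = node
                if nxt in other_parents:
                    return nxt
                frontier.append(nxt)
    return None

def _join(meet, fwd_parents, bwd_parents):
    """Stitch the source->meet and meet->target halves into one path."""
    path = []
    node = meet
    while node is not None:
        path.append(node)
        node = fwd_parents[node]
    path.reverse()
    node = bwd_parents[meet]
    while node is not None:
        path.append(node)
        node = bwd_parents[node]
    return path

def bidirectional_search(reactions, source, target):
    """Return a path of metabolites from source to target, or None."""
    if source == target:
        return [source]
    forward, backward = {}, {}
    for substrate, product in reactions:
        forward.setdefault(substrate, []).append(product)
        backward.setdefault(product, []).append(substrate)
    fwd_parents, bwd_parents = {source: None}, {target: None}
    fwd_frontier, bwd_frontier = deque([source]), deque([target])
    while fwd_frontier and bwd_frontier:
        # Always grow the smaller frontier to keep both searches balanced.
        if len(fwd_frontier) <= len(bwd_frontier):
            meet = _expand(fwd_frontier, fwd_parents, bwd_parents, forward)
        else:
            meet = _expand(bwd_frontier, bwd_parents, fwd_parents, backward)
        if meet is not None:
            return _join(meet, fwd_parents, bwd_parents)
    return None
```

Calling `bidirectional_search(TOY_REACTIONS, "glucose", "F16BP")` walks forward from the substrate and backward from the product until the two frontiers share a metabolite. Growing the smaller frontier first is one common way to limit the number of nodes visited.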

    Implementation of new tools and approaches for the reconstruction of genome-scale metabolic models

    Master's dissertation in Bioinformatics. The reconstruction of high-quality genome-scale metabolic (GSM) models can play a relevant role in the study of an organism, since these mathematical models can be used to manipulate an organism phenotypically and to predict its response, in silico, under different environmental conditions or genetic modifications. Several bioinformatics tools have been developed to facilitate and accelerate the reconstruction of these models by automating some of the steps of the traditional reconstruction process. "Metabolic Models Reconstruction Using Genome-Scale Information" (merlin) is a free, user-friendly Java application that automates the main stages of the reconstruction of a GSM model for any microorganism. Although it has already been used successfully in several works, many plugins are still being developed to improve its resources and make it more accessible to any user. In this work, the new tools integrated in merlin are described in detail, as well as the improvement of other features present on the platform. The general improvements and the new tools improve the overall user experience when reconstructing GSM models in merlin. The main feature implemented in this work is the incorporation of the BiGG Integration Tool (BIT) in merlin. This plugin collects the metabolic data that make up the models in the BiGG Models database and associates them, by homology, with the genome of the organism under study, creating, where possible, the Boolean rule for each BiGG reaction in the model under construction. All the computation required to execute merlin's BIT takes place remotely, to accelerate the process. Within a few minutes, the results are returned by the server and imported into the user's workspace. Running the tool outside the user's machine also brings advantages in terms of information storage, since the BiGG data structure that supports the entire tool is available remotely. This tool provides an alternative to obtaining metabolic information from the KEGG database, previously the only option available in merlin. To test the implemented tool, several draft genome-scale metabolic networks were generated and analyzed.
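To make the notion of a Boolean rule concrete, here is a minimal sketch of how such a gene rule could be evaluated against the set of genes found in a genome. It assumes the common convention in which `and` joins subunits of a complex and `or` joins alternative isozymes. The function name `evaluate_rule` and the gene identifiers are hypothetical, and nothing here reflects how merlin or BIT builds or stores its rules.

```python
import re

# A token is a parenthesis or any run of characters without spaces/parentheses.
TOKEN = re.compile(r"\(|\)|[^\s()]+")

def evaluate_rule(rule, present_genes):
    """Evaluate a rule such as '(g1 and g2) or g3' against a set of gene IDs."""
    tokens = TOKEN.findall(rule)
    pos = 0

    def parse_or():
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos].lower() == "or":
            pos += 1
            rhs = parse_and()  # parse first so short-circuiting skips nothing
            value = value or rhs
        return value

    def parse_and():
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos].lower() == "and":
            pos += 1
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of rule")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        if token == ")" or token.lower() in ("and", "or"):
            raise ValueError(f"unexpected token {token!r}")
        # Any other token is treated as a gene identifier.
        return token in present_genes

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result
```

For example, `evaluate_rule("(g1 and g2) or g3", {"g3"})` is true because the isozyme `g3` alone satisfies the rule, whereas `{"g1"}` alone does not, since the complex also needs `g2`.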

    Constraint-based probabilistic learning of metabolic pathways from tomato volatiles

    Clustering and correlation analysis techniques have become popular tools for the analysis of data produced by metabolomics experiments. The results obtained from these approaches provide an overview of the interactions between objects of interest. Often in these experiments, one is more interested in information about the nature of these relationships, e.g., cause-effect relationships, than in the actual strength of the interactions. Finding such relationships is of crucial importance as most biological processes can only be understood in this way. Bayesian networks allow representation of these cause-effect relationships among variables of interest in terms of whether and how they influence each other given that a third, possibly empty, group of variables is known. This technique also allows the incorporation of prior knowledge as established from the literature or from biologists. The representation of these relationships as a directed graph is highly intuitive and helps to understand these processes. This paper describes how constraint-based Bayesian networks can be applied to metabolomics data and can be used to uncover the important pathways which play a significant role in the ripening of fresh tomatoes. We also show here how this method of reconstructing pathways is intuitive and performs better than classical techniques. Methods for learning Bayesian network models are powerful tools for the analysis of data of the magnitude generated by metabolomics experiments. They allow one to model cause-effect relationships and help in understanding the underlying processes.
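Constraint-based learning decides which edges to keep by testing whether two variables remain dependent once a third is known. The sketch below illustrates that idea with a first-order partial correlation on simulated data. The choice of partial correlation as the test, the simulated chain x -> z -> y and all function names are assumptions for illustration; the abstract does not state which independence test the authors used.

```python
import math
import random

def correlation(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy = correlation(x, y)
    r_xz = correlation(x, z)
    r_yz = correlation(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def simulate_chain(n=2000, seed=0):
    """Simulate x -> z -> y, so x and y are related only through z."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [xi + rng.gauss(0, 1) for xi in x]
    y = [zi + rng.gauss(0, 1) for zi in z]
    return x, y, z

x, y, z = simulate_chain()
marginal = correlation(x, y)                # clearly non-zero
conditional = partial_correlation(x, y, z)  # close to zero
```

In this simulated chain, x and y are strongly correlated on their own, but the correlation largely vanishes once z is accounted for. A constraint-based learner would therefore drop the direct x-y edge and keep only the edges through z, which a plain correlation analysis could not distinguish.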

    On the use of networks in biomedicine

    The concept of a "neural network" emerged from electronic models inspired by the neural structure of the human brain. Neural networks aim to solve problems currently beyond the calculation capacity of computers by mimicking the role of the human brain. Recently, the number of biology-based applications using neural networks has been growing. Biological networks represent correlations extracted from sets of clinical data, diseases, mutations, patients, and many other types of clinical or biological features. Biological networks are used to model both the state of a range of functionalities at a particular moment and the space-time distribution of biological and clinical events. The study of biological networks, their analysis and modeling are important tasks in life sciences. Most biological networks are still far from being complete and they are often difficult to interpret due to the complexity of relationships and the peculiarities of the data. Starting from preliminary notions about neural networks, we focus on biological networks and discuss some well-known applications, such as protein-protein interaction networks, gene regulatory networks (DNA-protein interaction networks), metabolic networks, signaling networks, neuronal networks, phylogenetic trees and special networks. Finally, we consider the use of biological networks within a proposed model to map health-related data.

    Creation and Application of Various Tools for the Reconstruction, Curation, and Analysis of Genome-Scale Models of Metabolism

    Systems biology uses mathematical tools, modeling, and analysis for the holistic understanding and design of biological systems, allowing the investigation of metabolism and the generation of actionable hypotheses based on model analyses. Detailed here are several systems biology tools for model reconstruction, curation, analysis, and application through synthetic biology. The first, OptFill, is a holistic (whole model) and conservative (minimizing change) tool to aid in genome-scale model (GSM) reconstructions by filling metabolic gaps caused by lack of system knowledge. This is accomplished through Mixed Integer Linear Programming (MILP), one step of which may also be independently used as an additional curation tool. OptFill is applied to a GSM reconstruction of the melanized fungus Exophiala dermatitidis, which underwent various analyses investigating pigmentogenesis and similarity to human melanogenesis. Analyses suggest that carotenoids serve a currently unknown function in E. dermatitidis and that E. dermatitidis could serve as a model of human melanocytes for biomedical applications. Next, a new approach to dynamic Flux Balance Analysis (dFBA) is detailed, the Optimization- and Runge-Kutta-based Approach (ORKA). The ORKA is applied to the model plant Arabidopsis thaliana to show its ability to recreate in vivo observations. The analyzed model is more detailed than previous models, encompassing a larger time scale, modeling more tissues, and with higher accuracy. Finally, a pair of tools, the Eukaryotic Genetic Circuit Design (EuGeneCiD) and Modeling (EuGeneCiM) tools, is introduced which can aid in the design and modeling of synthetic biology applications hypothesized using systems biology. These tools bring a computational approach to synthetic biology, and are applied to Arabidopsis thaliana to design thousands of potential two-input genetic circuits which satisfy 27 different input and logic gate combinations. 
EuGeneCiM is further used to model a repressilator circuit. Efforts are ongoing to disseminate these tools to maximize their impact on the field of systems biology. Future research will include further investigation of E. dermatitidis through modeling and expanding my expertise to kinetic models of metabolism. Advisor: Rajib Sah

    Dicyemid Mesozoans: A Unique Parasitic Lifestyle and a Reduced Genome

    Dicyemids, previously called "mesozoans" (intermediates between unicellular protozoans and multicellular metazoans), are an enigmatic animal group. They have a highly simplified adult body, comprising only approximately 30 cells, and they have a unique parasitic lifestyle. Recently, dicyemids were shown to be spiralians, with affinities to the Platyhelminthes. In order to understand the molecular mechanisms involved in the evolution of this odd animal, we sequenced the genome of Dicyema japonicum and a reference transcriptome assembly using mixed-stage samples. The D. japonicum genome features a high proportion of repetitive sequences that account for 49% of the genome. The dicyemid genome is reduced to approximately 67.5 Mb with 5,012 protein-coding genes. Only four Hox genes exist in the genome, with no clustering. Gene distribution in KEGG pathways shows that D. japonicum has fewer genes in most pathways. Instead of eliminating entire critical metabolic pathways, parasitic lineages likely simplify pathways by eliminating pathway-specific genes, while genes with fundamental functions may be retained in multiple pathways. In principle, parasites can stand to lose genes that are unnecessary, in order to conserve energy. However, whether retained genes in incomplete pathways serve intermediate functions and how parasites overcome the physiological needs served by lost genes remain to be investigated in future studies.

    Methods for the refinement of genome-scale metabolic networks

    More accurate metabolic networks of pathogens and parasites are required to support the identification of important enzymes or transporters that could be potential targets for new drugs. The overall aim of this thesis is to contribute towards a new level of quality for metabolic network reconstruction, through the application of several different approaches. After building a draft metabolic network using an automated method, a large amount of manual curation effort is still necessary before an accurate model can be reached. PathwayBooster, a standalone software package, which I developed in Python, supports the first steps of model curation, providing easy access to enzymatic function information and a visual pathway display to enable the rapid identification of inaccuracies in the model. A major current problem in model refinement is the identification of genes encoding enzymes which are believed to be present but cannot be found using standard methods. Current searches for enzymes are mainly based on strong sequence similarity to proteins of known function, although in some cases it may be appropriate to consider more distant relatives as candidates for filling these pathway holes. With this objective in mind, a protocol was devised to search a proteome for superfamily relatives of a given enzymatic function, returning candidate enzymes to perform this function. Another, related approach tackles the problem of misannotation errors in public gene databases and their influence on metabolic models through the propagation of erroneous annotations. I show that the topological properties of metabolic networks contain useful information about annotation quality and can therefore play a role in methods for gene function assignment. An evolutionary perspective into functional changes within homologous domains opens up the possibility of integrating information from multiple genomes to support the reconstruction of metabolic models. 
I have therefore developed a methodology to predict functional change within a gene superfamily phylogeny.