41 research outputs found

    Meneco, a Topology-Based Gap-Filling Tool Applicable to Degraded Genome-Wide Metabolic Networks

    Increasing amounts of sequence data are becoming available for a wide range of non-model organisms. Investigating and modelling the metabolic behaviour of those organisms is highly relevant to understanding their biology and ecology. As sequences are often incomplete and poorly annotated, draft networks of their metabolism are largely incomplete. Appropriate gap-filling methods to identify and add missing reactions are therefore required. However, current tools rely on phenotypic or taxonomic information, or are very sensitive to the stoichiometric balance of metabolic reactions, especially concerning the co-factors. This type of information is often unavailable, or at least error-prone, for newly explored organisms. Here we introduce Meneco, a tool dedicated to the topological gap-filling of genome-scale draft metabolic networks. Meneco reformulates gap-filling as a qualitative combinatorial optimization problem, omitting the stoichiometric constraints considered in other methods, and solves this problem using Answer Set Programming. Run on several artificial test sets gathering 10,800 degraded Escherichia coli networks, Meneco was able to efficiently identify essential reactions missing in networks at high degradation rates, outperforming the stoichiometry-based tools in scalability. To demonstrate the utility of Meneco we applied it to two case studies. Its application to recent metabolic networks reconstructed for the brown algal model Ectocarpus siliculosus and an associated bacterium, Candidatus Phaeomarinobacter ectocarpi, revealed several candidate metabolic pathways for algal-bacterial interactions. Meneco was then used to reconstruct, from transcriptomic and metabolomic data, the first metabolic network for the microalga Euglena mutabilis. These two case studies show that Meneco is a versatile tool to complete draft genome-scale metabolic networks produced from heterogeneous data, and to suggest relevant reactions that explain the metabolic capacity of a biological system.
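
    The topological view of gap-filling described in this abstract can be illustrated with a small sketch. It is not Meneco's implementation: Meneco encodes the problem in Answer Set Programming, whereas the code below uses plain Python and a brute-force search that is only practical for toy networks. The function names (`scope`, `minimal_completions`), the reaction identifiers and the metabolite names are all hypothetical. A metabolite counts as producible if it is a seed or the product of a reaction whose reactants are all producible; stoichiometric coefficients are ignored.

```python
from itertools import combinations


def scope(reactions, seeds):
    """Return every metabolite reachable from the seeds.

    `reactions` maps a reaction id to a (reactants, products) pair.
    Stoichiometry is ignored: a reaction fires once all of its
    reactants are reachable.
    """
    reachable = set(seeds)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions.values():
            if set(reactants) <= reachable and not set(products) <= reachable:
                reachable |= set(products)
                changed = True
    return reachable


def minimal_completions(draft, repair_db, seeds, targets):
    """Enumerate the smallest sets of repair reactions that make all
    targets producible (brute force, illustration only)."""
    targets = set(targets)
    candidates = sorted(repair_db)
    for size in range(len(candidates) + 1):
        found = []
        for subset in combinations(candidates, size):
            network = dict(draft)
            network.update({r: repair_db[r] for r in subset})
            if targets <= scope(network, seeds):
                found.append(set(subset))
        if found:
            return found  # all completions of minimal size
    return []  # targets cannot be restored with this repair database


# Toy data (hypothetical): a draft network with one reaction and a
# small repair database.
draft = {"R1": (("glucose",), ("g6p",))}
repair_db = {
    "R2": (("g6p",), ("f6p",)),
    "R3": (("f6p",), ("pyruvate",)),
    "R4": (("g6p", "atp"), ("pyruvate",)),
    "R5": (("x",), ("atp",)),
}
seeds = {"glucose"}
targets = {"pyruvate"}

completions = minimal_completions(draft, repair_db, seeds, targets)
```

    In this toy case the only minimal completion adds R2 and R3; R4 does not help because neither the seeds nor any candidate reaction can supply its co-reactant. The exhaustive search grows exponentially with the size of the repair database, which is one reason a dedicated combinatorial solver is used in practice.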

    Étude de la coopération hôte-microbiote par des problèmes d'optimisation basés sur la complétion de réseaux métaboliques

    Systems biology relies on computational biology to integrate knowledge and data, for a better understanding of organisms' physiology. One challenge is the applicability of methods and tools to non-model organisms, for instance in marine biology. Sequencing advances and the growing importance of elucidating the roles of microbiotas have led to an increased interest in these organisms. This thesis focuses on the modeling of metabolism through networks, and of its functionality through semantics based on graphs and stoichiometric constraints. The first part presents work on gap-filling metabolic networks in the context of non-model organisms. A graph-based method is benchmarked and validated, and a hybrid one is developed using Answer Set Programming (ASP) and linear programming. Such gap-filling is applied to algae and extended to decipher putative interactions between Ectocarpus siliculosus and a symbiotic bacterium. Building on these gap-filling methods, the second part of the thesis proposes formalisms and the implementation of a tool for selecting and screening communities of interest within microbiotas. It scales to large microbiotas and, with a two-step approach, suggests symbionts that fit the desired objective. The modeling supports the computation of exchanges, and solving can cover the whole solution space. Applications are presented on the human gut microbiota and the selection of bacterial communities for a brown alga. Altogether, this thesis proposes modeling, software and biological applications using graph-based semantics to support the elaboration of hypotheses for elucidating the metabolism of organisms.

    Complétion combinatoire pour la reconstruction de réseaux métaboliques, et application au modèle des algues brunes Ectocarpus siliculosus

    In this thesis we focused on the development of a comprehensive approach to reconstructing metabolic networks for unconventional biological species about which little information is available. Traditionally, this reconstruction is based on three steps: the creation of a metabolic draft from a genome, the completion of this draft, and the verification of the results. We were particularly interested in the hard combinatorial optimization problem posed by the completion (gap-filling) step, which we solved using a constraint programming paradigm, Answer Set Programming (ASP). Changes to an existing method allowed us to improve both the computation time and the quality of the modeling. This entire process of metabolic network reconstruction was applied to the brown algal model Ectocarpus siliculosus, allowing us to reconstruct the first metabolic network of a brown macroalga. The reconstruction of this network improved our understanding of the metabolism of this species and the annotation of its genome.

    Handbook of Marine Model Organisms in Experimental Biology

    The importance of molecular approaches for comparative biology and the rapid development of new molecular tools are unprecedented. The extraordinary molecular progress belies the need for understanding the development and basic biology of whole organisms. Vigorous international efforts to train the next generation of experimental biologists must combine both levels: next-generation molecular approaches and traditional organismal biology. This book provides cutting-edge chapters regarding the growing list of marine model organisms. Access to and practical advice on these model organisms have become a conditio sine qua non for a modern education of advanced undergraduate students, graduate students and postdocs working on marine model systems. Model organisms are not only tools; they are also bridges between fields, from behavior, development and physiology to functional genomics.
    Key features: offers deep insights into cutting-edge model system science; provides in-depth overviews of all prominent marine model organisms; illustrates challenging experimental approaches to model system research; serves as a reference book also for next-generation functional genomics applications; fills an urgent need for students.
    Related titles: Jarret, R. L. & K. McCluskey, eds. The Biological Resources of Model Organisms (ISBN 978-1-1382-9461-5); Kim, S.-K. Healthcare Using Marine Organisms (ISBN 978-1-1382-9538-4); Mudher, A. & T. Newman, eds. Drosophila: A Toolbox for the Study of Neurodegenerative Disease (ISBN 978-0-4154-1185-1); Green, S. L. The Laboratory Xenopus sp. (ISBN 978-1-4200-9109-0).

    Comparative analysis of plant genomes through data integration

    When we started our research in 2008, several online resources for genomics existed, each with a different focus. TAIR (The Arabidopsis Information Resource) focused on the plant model species Arabidopsis thaliana, with (at that time) little or no support for evolutionary or comparative genomics. Ensembl provided some basic tools and functioned as a data warehouse, but it would only start incorporating plant genomes in 2010. At that time, however, no online resource provided the data content and tools for plant comparative and evolutionary genomics that we required. As such, the plant community was missing an essential component to bring its research to the same level as the biomedicine-oriented research communities. We started to work on PLAZA in order to provide a data resource that could be accessed by the plant community, and which also contained the data content needed to support our research group's focus on evolutionary genomics. The platform for comparative and evolutionary genomics, which we named PLAZA, was developed from scratch (i.e. not based on an existing database schema, such as that of Ensembl). Gathering the data for all species, parsing this data into a common format and then uploading it into the database was the next step. We developed a processing pipeline, based on sequence similarity measurements, to group genes into gene families and subfamilies. Functional annotation was gathered both from the original data providers and through InterPro scans, combined with InterPro2GO. This primary data was then ready to be used in every subsequent analysis. Building such a database was good enough for research within our bioinformatics group, but the goal was to provide a comprehensive resource for all plant biologists with an interest in comparative and evolutionary genomics. Designing and creating a user-friendly, visually appealing web interface, connected to our database, was the next step.
While the most detailed information is commonly presented in data tables, aesthetically pleasing graphics, images and charts are often used to visualize trends and general statistics, and are also used in specific tools. Design and development of these tools and visualizations is thus one of the core elements of my PhD. The PLAZA platform was designed as a gene-centric data resource, which is easily navigated when a biologist wants to study a relatively small number of genes. However, using the default PLAZA website to retrieve information for dozens of genes quickly becomes very tedious. Therefore a 'gene set'-centric extra layer was developed in which user-defined gene sets could be quickly analyzed. This extra layer, called the PLAZA workbench, functions on top of the normal PLAZA website, which implies that only gene sets from species present within the PLAZA database can be directly analyzed. The PLAZA resource for comparative and evolutionary genomics was a major success, but it still had several issues. We tried to solve at least two of these problems at the same time by creating a new platform. The first issue was the building procedure of PLAZA: adding a single species, or updating the structural annotation of an existing one, requires the total re-computation of the database content. The second issue was the restrictiveness of the PLAZA workbench: through a mapping procedure, gene sets could be entered for species not present in the PLAZA database, but for species without a phylogenetically close relative this approach did not always yield satisfying results. Furthermore, the research in question might focus precisely on the difference between a species present in PLAZA and a close relative not present in PLAZA (e.g. to study adaptation to a different ecological niche). In such a case, the mapping procedure is in itself useless. With the advent of NGS transcriptome data sets for a growing number of species, it was clear that a new challenge had presented itself.
We designed and developed a new platform, named TRAPID, which could automatically process entire transcriptome data sets using a reference database. The goal was to have the processing done quickly, with the results containing both gene-family-oriented data (such as multiple sequence alignments and phylogenetic trees) and functional characterization of the transcripts. Major efforts went into designing the processing pipeline so that it would be reliable, fast and accurate.

    Extending the Metabolic Network of Ectocarpus Siliculosus using Answer Set Programming

    Metabolic network reconstruction is of great biological relevance because it offers a way to investigate the metabolic behavior of organisms. However, such a reconstruction remains a difficult task at both the biological and computational level. Building on previous work establishing an ASP-based approach to this problem, we present a report from the field resulting in the discovery of new biological knowledge. In fact, for the first time ever, we automatically reconstructed a metabolic network for a macroalga. We accomplished this by taking advantage of ASP's integrated optimization and enumeration capacities. Both tasks have been modeled in an improved ASP problem representation, incorporating the concept of reversible reactions. Interestingly, it turned out that optimization highly benefits from the usage of unsatisfiable cores available in the ASP solver unclasp. Finally, applied to Ectocarpus siliculosus, only the combination of unclasp and clasp allowed us to obtain a metabolic network able to produce all recoverable metabolites among the experimentally measured ones. Moreover, 70% of the identified reactions are supported by the existence of a homologous enzyme in Ectocarpus siliculosus, confirming the quality of the reconstructed network from a biological point of view.

    Annotation of marine eukaryotic genomes


    Barely visible but highly unique: the Ostreococcus genome unveils its secrets
