3 research outputs found

    Biochemical complex data generation and integration in genome-scale metabolic models

    Master's dissertation in Bioinformatics
    The (re-)construction of Genome-Scale Metabolic (GSM) models is highly dependent on biochemical databases. However, the biochemical data within these databases is limited and most often lacks structurally defined representations of compounds. To circumvent this limitation, compounds are frequently represented by their generic version. Lipids are paradigmatic cases: given that a multitude of lipid species can occur in nature, both their storage in databases and their integration into GSM models are hampered. Accordingly, converting one lipid version into another in GSM models can be tricky, as these compounds possess side chains that are likely to be transferred all across their biosynthetic network. Hence, converting a lipid implies that all its precursors have to be converted as well, which requires information on lipid specificity and biosynthetic context. The present work represents a strategy to tackle this issue. The pipeline of Biochemical cOmplex data Integration in Metabolic Models at Genome scale (BOIMMG) encompasses the integration and processing of biochemical data from different sources, aiming to expand current knowledge of lipid biosynthesis and to integrate it into GSM models. Generic reactions retrieved from MetaCyc were handled and transformed into reactions with structurally defined lipid species. More than 30 generic reactions were fully (and 27 partially) characterized, allowing the prediction of over 30000 new lipid structures and their biosynthetic context. The integration of BOIMMG's data into GSM models was conducted for the metabolism of electron-transfer quinones, glycerolipids, and phospholipids. Validation relied on comparing models with different versions of these metabolites.
BOIMMG's conversion modules were applied to the Escherichia coli iJR904 model [1], generating 53 more matching lipids and 38 more matching reactions with iAF1260b [2, 3], an iteration of the iJR904 model in which the conversion was performed and curated manually. To the best of our knowledge, BOIMMG's database is the only one with biosynthetic information regarding structurally defined lipids. Moreover, there is no other state-of-the-art tool capable of automatically generating complex lipid-specific networks.

The reconstruction of genome-scale metabolic (GSM) models depends heavily on the biochemical information present in databases. In fact, this information is often limited and may not contain representations of structurally defined compounds. In an attempt to circumvent this limitation, chemical compounds are frequently represented by their generic representation. Lipids are paradigmatic cases, given that a multitude of different lipid species occur in nature, which hampers their storage in databases as well as their integration into GSM models. Thus, converting lipids from a generic version to a structurally defined version is not trivial, since these compounds possess side chains that are transferred along their biosynthetic pathways. Consequently, this conversion implies that all the precursors of these lipids must be converted as well, which requires information on specific lipids and their biosynthetic relationships. The present work represents a strategy to solve this problem. The pipeline of the software developed in this work, Biochemical cOmplex data Integration in Metabolic Models at Genome scale (BOIMMG), encompasses the integration and processing of biochemical data from different sources, aiming to expand current knowledge of lipid biosynthesis as well as its integration into GSM models.
Regarding the second phase, generic reactions extracted from the MetaCyc database were processed and transformed into reactions with structurally defined lipids. More than 30 generic reactions were fully (and 27 partially) characterized, allowing the prediction of more than 30000 new lipid structures as well as their biosynthetic contexts. The integration of the data into GSM models was carried out for the metabolism of electron-transfer quinones, glycerolipids and phospholipids. Validation took into account the comparison between models with different versions of these metabolites. BOIMMG's conversion modules were applied to the Escherichia coli iJR904 model [1], generating 53 more lipids and 38 more reactions that are found in the iAF1260b model [2, 3], an iteration of the iJR904 model in which the lipid conversion was performed manually. The database generated by the BOIMMG method is the only one that contains biosynthetic information on structurally defined lipids. Additionally, BOIMMG is a unique tool that allows complex lipid networks to be generated automatically.

    Revising lipid chemical structures in genome-wide metabolic models with BOIMMG

    An important step in the reconstruction of Genome-Scale Metabolic (GSM) models is the integration of biochemical data. Such information is often incomplete or generic, lacking completely defined chemical structures for several molecules, including lipids. The innumerable combinations of fatty acids in the side chains of lipids hinder their storage in databases and their integration into GSM models. Generic representations are commonly used to circumvent this limitation. However, lipid specificity is then likely lost, and data integration problems arise, as some models contain lipids with completely defined structures and others contain their generic versions. This clash of versions is addressed by Biochemical cOmplex data Integration in Metabolic Models at Genome-scale (BOIMMG). BOIMMG is an open-source framework that accelerates the swapping of different molecular versions (mainly lipids, structurally defined or not) in GSM models. Upon integration into a Neo4j graph database (http://neo4j.com/), lipid-specific data from the LIPID MAPS Structure Database (LMSD), Swiss Lipids (SLM) and Model SEED were processed for biosynthetic contextualization within the curated pathways of MetaCyc. Several algorithms were then developed to integrate this information into GSM models. Over 30 generic reactions were fully expanded and 27 partially expanded, resulting in 557392 new reactions, of which 557252 were neither integrated nor listed in Model SEED. These reactions were inferred from the previously contextualized biosynthetic relationships between structurally defined compounds. BOIMMG's information was applied to GSM models, tackling the conflict between molecule versions. The whole glycerolipid and phospholipid metabolic network within the E. coli iJR904 model was expanded by our approach. The comparison between the altered model and one of its manually expanded published iterations (iAF1260b) showed 53 more matching lipids and 38 more matching reactions.
Besides the new biochemical set, BioISO's analysis demonstrated that biomass lipids were correctly produced, corroborating the correct expansion of the whole biosynthetic network. In conclusion, BOIMMG (available at https://boimmg.bio.di.uminho.pt/) can establish relevant relationships between complex macromolecules, within their biosynthetic context, and provide automated procedures for their integration into GSM models.
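
As an illustration of what expanding a generic reaction into structurally defined ones can involve, the following is a minimal Python sketch. It is not BOIMMG's algorithm or API: the side-chain list, the generic reaction, the slot names `R1` and `R2`, and the functions `instantiate` and `expand` are all hypothetical. It only shows why the number of specific reactions grows combinatorially and why a side chain assigned to a precursor has to carry over to its product.

```python
from itertools import product

# Hypothetical fatty-acyl side chains (illustrative only, not BOIMMG data).
SIDE_CHAINS = ["16:0", "18:1"]

# Hypothetical generic reaction. Each participant is paired with the
# side-chain slots it carries; the same slot name on both sides means the
# chain is transferred from precursor to product.
GENERIC_REACTION = {
    "reactants": [("lysophosphatidate", ("R1",)), ("acyl-CoA", ("R2",))],
    "products": [("phosphatidate", ("R1", "R2")), ("CoA", ())],
}


def instantiate(name, slots, assignment):
    """Return a structurally defined name for one participant."""
    if not slots:
        return name
    chains = "/".join(assignment[slot] for slot in slots)
    return f"{name}({chains})"


def expand(reaction, side_chains):
    """Enumerate every specific reaction obtained by filling all slots."""
    slots = sorted(
        {slot for side in reaction.values() for _, ss in side for slot in ss}
    )
    expanded = []
    for combo in product(side_chains, repeat=len(slots)):
        assignment = dict(zip(slots, combo))
        reactants = [instantiate(n, ss, assignment) for n, ss in reaction["reactants"]]
        products = [instantiate(n, ss, assignment) for n, ss in reaction["products"]]
        expanded.append(" + ".join(reactants) + " -> " + " + ".join(products))
    return expanded


for line in expand(GENERIC_REACTION, SIDE_CHAINS):
    print(line)
```

With two candidate chains and two slots the sketch yields four specific reactions; with realistic numbers of fatty acids and more slots per lipid, the count grows quickly, which is consistent with the large number of new reactions reported above.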

    Towards a multivariate analysis of genome-scale metabolic models derived from the BiGG models database

    First Online: 28 August 2021
    Genome-Scale metabolic models (GEMs) are a relevant tool in systems biology for in silico strain optimisation and drug discovery. An easier way to reconstruct a model is to use available GEMs as templates to create the initial draft, which can then be curated until a simulation-ready model is obtained. This approach is implemented in merlin's BiGG Integration Tool, which reconstructs models from existing GEMs present in the BiGG Models database. This study aims to assess draft models generated using models from BiGG as templates for three distinct organisms, namely Streptococcus thermophilus, Xylella fastidiosa and Mycobacterium tuberculosis. Several draft models were reconstructed using the BiGG Integration Tool and different templates (all, selected and random). The variability of the models was assessed using the reactions and metabolic functions associated with the models' genes. This analysis showed that, even though the models shared a significant portion of reactions and metabolic functions, models from different organisms are still differentiated. Moreover, there also seems to be variability, to a lesser extent, among the templates used to generate the draft models. This study concluded that the BiGG Integration Tool provides a fast and reliable alternative for draft reconstruction for bacteria.
    This study was supported by the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of UIDB/04469/2020 unit. A. Oliveira (DFA/BD/10205/2020), E. Cunha (DFA/BD/8076/2020), F. Cruz (SFRH/BD/139198/2018), J. Sequeira (SFRH/BD/147271/2019), and M. Sampaio (SFRH/BD/144643/2019) hold doctoral fellowships provided by the FCT. Oscar Dias acknowledges FCT for the Assistant Research contract obtained under CEEC Individual 2018.
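
One simple way to picture how draft models can be compared by their reaction content is a pairwise set similarity. The Python sketch below uses the Jaccard index, which is an illustrative stand-in and not necessarily the multivariate method used in the study. The draft names and their reaction sets are hypothetical, as are the functions `jaccard` and `pairwise_similarity`.

```python
def jaccard(a, b):
    """Jaccard index of two collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty sets are treated as identical.
    return len(a & b) / len(union) if union else 1.0


# Hypothetical draft models, each reduced to a set of reaction identifiers.
drafts = {
    "draft_A": {"PGI", "PFK", "FBA", "TPI"},
    "draft_B": {"PGI", "PFK", "FBA", "PYK"},
    "draft_C": {"PGI", "ACONT", "CS"},
}


def pairwise_similarity(models):
    """Return the Jaccard index for every unordered pair of models."""
    names = sorted(models)
    return {
        (first, second): jaccard(models[first], models[second])
        for i, first in enumerate(names)
        for second in names[i + 1:]
    }


for pair, score in pairwise_similarity(drafts).items():
    print(pair, round(score, 3))
```

In a sketch like this, drafts built for the same organism from different templates would be expected to score closer to each other than drafts built for different organisms, which mirrors the qualitative conclusion stated in the abstract.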