
    Biochemical complex data generation and integration in genome-scale metabolic models

    Master's dissertation in Bioinformatics. The (re-)construction of Genome-Scale Metabolic (GSM) models is highly dependent on biochemical databases. However, the biochemical data within these databases is limited and, most of the time, lacks structurally defined representations of compounds. To circumvent this limitation, compounds are frequently represented by their generic version. Lipids are paradigmatic cases: because a multitude of lipid species can occur in nature, both their storage in databases and their integration into GSM models are hampered. Accordingly, converting one lipid version into another in GSM models can be tricky, as these compounds possess side chains that are likely to be transferred all across their biosynthetic network. Hence, converting a lipid implies that all its precursors have to be converted as well, which requires information on lipid specificity and biosynthetic context. The present work presents a strategy to tackle this issue. The pipeline of Biochemical cOmplex data Integration in Metabolic Models at Genome scale (BOIMMG) encompasses the integration and processing of biochemical data from different sources, aiming to expand current knowledge of lipid biosynthesis and to integrate it into GSM models. Generic reactions retrieved from MetaCyc were handled and transformed into reactions with structurally defined lipid species. More than 30 generic reactions were fully (and 27 partially) characterized, allowing the prediction of over 30000 new lipid structures and their biosynthetic context. The integration of BOIMMG's data into GSM models was conducted for the metabolism of electron-transfer quinones, glycerolipids, and phospholipids. The validation relied on comparing models with different versions of these metabolites.
BOIMMG's conversion modules were applied to the Escherichia coli iJR904 model [1], generating 53 more matching lipids and 38 more matching reactions with iAF1260b [2, 3], an iteration of iJR904 in which the conversion was performed and curated manually. To the best of our knowledge, BOIMMG's database is the only one with biosynthetic information on structurally defined lipids. Moreover, there is no other state-of-the-art tool capable of automatically generating complex lipid-specific networks.

The reconstruction of genome-scale metabolic (GSM) models depends heavily on the biochemical information available in databases. In fact, this information is often limited and may lack structurally defined representations of compounds. To circumvent this limitation, chemical compounds are frequently represented by their generic representation. Lipids are paradigmatic cases: since a multitude of different lipid species occur in nature, their storage in databases is hampered, as is their integration into GSM models. Accordingly, converting lipids from a generic version into a structurally defined one is not trivial, given that these compounds possess side chains that are transferred along their biosynthetic pathways. Consequently, such a conversion implies that all precursors of those lipids must be converted as well, which requires information on specific lipids and their biosynthetic relationships. The present work presents a strategy to solve this problem. The pipeline of the software developed in this work, Biochemical cOmplex data Integration in Metabolic Models at Genome scale (BOIMMG), encompasses the integration and processing of biochemical data from different sources, aiming to expand current knowledge of lipid biosynthesis and to integrate it into GSM models.
Regarding the second phase, generic reactions extracted from the MetaCyc database were processed and transformed into reactions with structurally defined lipids. More than 30 generic reactions were fully (and 27 partially) characterized, allowing the prediction of more than 30000 new lipid structures, as well as their biosynthetic contexts. The integration of the data into GSM models was carried out for the metabolism of electron-transfer quinones, glycerolipids, and phospholipids. The validation took into account the comparison between models with different versions of these metabolites. BOIMMG's conversion modules were applied to the Escherichia coli iJR904 model [1], generating 53 more lipids and 38 more reactions that are found in the iAF1260b model [2, 3], an iteration of iJR904 in which the lipid conversion was carried out manually. The database generated by the BOIMMG method is the only one containing biosynthetic information on structurally defined lipids. Additionally, BOIMMG is a unique tool that allows complex lipid networks to be generated automatically.
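The transformation of a generic reaction into reactions with structurally defined lipids can be pictured with a minimal sketch. It is not BOIMMG's implementation: the template syntax, the function name `expand_generic_reaction`, and the two side chains are hypothetical and chosen only for illustration. It simply enumerates every side-chain combination for a generic acylation step.

```python
from itertools import product

# Hypothetical fatty acyl side chains (carbons:double bonds); not taken from BOIMMG.
SIDE_CHAINS = ["16:0", "18:1"]


def expand_generic_reaction(template, n_chains, side_chains=SIDE_CHAINS):
    """Instantiate a generic reaction template for every side-chain combination."""
    reactions = []
    for combo in product(side_chains, repeat=n_chains):
        # Map placeholders R1, R2, ... to the chosen side chains.
        names = {f"R{i + 1}": chain for i, chain in enumerate(combo)}
        reactions.append(template.format(**names))
    return reactions


# Generic step: lysophosphatidate (LPA) is acylated to phosphatidate (PA).
# The chain already on the precursor (R1) is carried over to the product.
template = "LPA({R1}) + acyl-CoA({R2}) -> PA({R1}/{R2}) + CoA"
specific = expand_generic_reaction(template, n_chains=2)
for reaction in specific:
    print(reaction)
```

With two side chains and two positions the sketch yields four reactions; the count grows multiplicatively with the number of chains and positions, which is why a single generic reaction can stand for a very large number of structurally defined ones.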

    Revising lipid chemical structures in genome-wide metabolic models with BOIMMG

    An important step in the reconstruction of Genome-Scale Metabolic (GSM) models is the integration of biochemical data. Such information is often incomplete or generic, lacking completely defined chemical structures for several molecules, including lipids. The innumerable combinations of fatty acids in the side chains of lipids hinder their storage in databases and their integration into GSM models. Generic representations are commonly used to circumvent this limitation. However, lipid specificity is likely lost, and data integration problems arise, as some models contain lipids with completely defined structures while others contain their generic versions. This clash of versions is addressed by Biochemical cOmplex data Integration in Metabolic Models at Genome-scale (BOIMMG). BOIMMG is an open-source framework that accelerates the swapping of different molecular versions (mainly lipids, structurally defined or not) in GSM models. Upon integration into a Neo4j graph database (http://neo4j.com/), lipid-specific data from the LIPID MAPS Structure Database (LMSD), Swiss Lipids (SLM) and Model SEED were processed for biosynthetic contextualization within the curated pathways of MetaCyc. Several algorithms were then developed to integrate this information into GSM models. Over 30 generic reactions were fully expanded and 27 partially expanded, resulting in 557392 new reactions, of which 557252 were neither integrated nor listed in Model SEED. These reactions were inferred from the previously contextualized biosynthetic relationships between structurally defined compounds. BOIMMG's information was applied to GSM models, tackling the conflict between molecule versions. The whole glycerolipid and phospholipid metabolic network within the E. coli iJR904 model was expanded by our approach. The comparison between the altered model and one of its manually expanded published iterations (iAF1260b) showed that 53 more matching lipids and 38 more matching reactions were found.
Besides the new biochemical set, BioISO's analysis demonstrated that biomass lipids were correctly produced, corroborating the correct expansion of the whole biosynthetic network. In conclusion, BOIMMG (available at https://boimmg.bio.di.uminho.pt/) can establish relevant relationships between complex macromolecules, within their biosynthetic context, and provide automated procedures for their integration into GSM models.
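Swapping one lipid version for another requires knowing which upstream compounds are affected, which is what the biosynthetic relationships stored in the graph database make possible. The following is a minimal sketch of that idea, assuming a plain Python dictionary in place of Neo4j; the precursor map and the function name `compounds_to_convert` are hypothetical and are not BOIMMG's data model or API.

```python
from collections import deque

# Hypothetical precursor map: compound -> its direct biosynthetic precursors.
# A simplified, illustrative slice of phospholipid biosynthesis.
PRECURSORS = {
    "phosphatidylethanolamine": ["phosphatidylserine"],
    "phosphatidylserine": ["CDP-diacylglycerol"],
    "CDP-diacylglycerol": ["phosphatidate"],
    "phosphatidate": ["1-acyl-glycerol-3-phosphate"],
    "1-acyl-glycerol-3-phosphate": [],
}


def compounds_to_convert(target, precursors=PRECURSORS):
    """Return the target and every upstream precursor, in breadth-first order."""
    seen = [target]
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in precursors.get(current, []):
            if parent not in seen:
                seen.append(parent)
                queue.append(parent)
    return seen


affected = compounds_to_convert("phosphatidylserine")
print(affected)
```

In this sketch, converting phosphatidylserine to a structurally defined version marks its three upstream precursors for conversion as well, while the downstream phosphatidylethanolamine is left out because only precursors are traversed.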