3,892 research outputs found

    Genome-scale modeling of yeast: chronology, applications and critical perspectives

    Over the last 15 years, several genome-scale metabolic models (GSMMs) have been developed for different yeast species, aiding both the elucidation of new biological processes and the shift toward a bio-based economy through the design of in silico inspired cell factories. Here, a historical perspective of the GSMMs built over time for several yeast species is presented, and the main inheritance patterns among the metabolic reconstructions are highlighted. We additionally provide a critical perspective on the overall genome-scale modeling procedure, underlining incomplete model validation and evaluation approaches and the quest for the integration of regulatory and kinetic information into yeast GSMMs. Experimentally validated model-based metabolic engineering applications of yeast species are also summarized, and the main challenges and future perspectives for the field are finally addressed. This work was supported by the Portuguese Foundation for Science and Technology (FCT) under the scope of a Ph.D. grant (PD/BD/52336/2013), of the strategic funding of UID/BIO/04469/2013 unit and COMPETE 2020 (POCI-01–0145FEDER-006684) and also in the context of the EU-funded initiative ERA-NET for Industrial Biotechnology (ERA-IB-2/0003/2013), in addition to the BioTecNorte operation (NORTE-01–0145FEDER-000004) funded by European Regional Development Fund under the scope of Norte2020 - Programa Operacional Regional do Norte.

    LGEM+: a first-order logic framework for automated improvement of metabolic network models through abduction

    Scientific discovery in biology is difficult due to the complexity of the systems involved and the expense of obtaining high-quality experimental data. Automated techniques are a promising way to make scientific discoveries at the scale and pace required to model large biological systems. A key problem for 21st century biology is to build a computational model of the eukaryotic cell. The yeast Saccharomyces cerevisiae is the best understood eukaryote, and genome-scale metabolic models (GEMs) are rich sources of background knowledge that we can use as a basis for automated inference and investigation. We present LGEM+, a system for automated abductive improvement of GEMs consisting of: a compartmentalised first-order logic framework for describing biochemical pathways (using curated GEMs as the expert knowledge source); and a two-stage hypothesis abduction procedure. We demonstrate that deductive inference on logical theories created using LGEM+, using the automated theorem prover iProver, can predict growth/no-growth of S. cerevisiae strains in minimal media. LGEM+ proposed 2094 unique candidate hypotheses for model improvement. We assess the value of the generated hypotheses using two criteria: (a) genome-wide single-gene essentiality prediction, and (b) constraint of flux-balance analysis (FBA) simulations. For (b) we developed an algorithm to integrate FBA with the logic model. We rank and filter the hypotheses using these assessments. We intend to test these hypotheses using the robot scientist Genesis, which is based around chemostat cultivation and high-throughput metabolomics. Comment: 15 pages, one figure, two tables, two algorithms.
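    To make the idea of qualitative growth/no-growth and single-gene essentiality prediction concrete, the following is a minimal sketch only. It uses plain forward chaining over a tiny invented network, not LGEM+, first-order logic, or iProver; every reaction, metabolite, and gene name in it is hypothetical.

```python
# Toy illustration only: all names below are hypothetical, and this is simple
# forward chaining over sets, not the LGEM+ framework or the iProver prover.

# Each reaction maps to (required gene, substrates, products).
REACTIONS = {
    "r1": ("geneA", {"glucose"}, {"g6p"}),
    "r2": ("geneB", {"g6p"}, {"pyruvate"}),
    "r3": ("geneC", {"pyruvate", "ammonium"}, {"alanine"}),
    "r4": ("geneD", {"g6p"}, {"pyruvate"}),  # isoenzyme route parallel to r2
}
MEDIUM = {"glucose", "ammonium"}      # compounds supplied by the medium
BIOMASS = {"pyruvate", "alanine"}     # compounds required for "growth"


def producible(medium, reactions, deleted_genes=frozenset()):
    """Return every metabolite derivable from the medium by forward chaining."""
    available = set(medium)
    changed = True
    while changed:
        changed = False
        for gene, substrates, products in reactions.values():
            if gene in deleted_genes:
                continue  # a deleted gene disables its reaction
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


def grows(deleted_genes=frozenset()):
    """Predict growth: all biomass compounds must be producible."""
    return BIOMASS <= producible(MEDIUM, REACTIONS, deleted_genes)


def essential_genes():
    """Genes whose single deletion abolishes predicted growth."""
    genes = {gene for gene, _, _ in REACTIONS.values()}
    return sorted(g for g in genes if not grows(frozenset({g})))
```

    In this toy network, geneB and geneD back each other up, so neither is essential on its own, while deleting both abolishes predicted growth. Curated GEMs contain such redundancy at a much larger scale.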

    Genome-scale modeling of yeast metabolism: retrospectives and perspectives

    Yeasts have been widely used for production of bread, beer and wine, as well as for production of bioethanol, but they have also been designed as cell factories to produce various chemicals, advanced biofuels and recombinant proteins. To systematically understand and rationally engineer yeast metabolism, genome-scale metabolic models (GEMs) have been reconstructed for the model yeast Saccharomyces cerevisiae and nonconventional yeasts. Here, we review the historical development of yeast GEMs together with their recent applications, including metabolic flux prediction, cell factory design, culture condition optimization and multi-yeast comparative analysis. Furthermore, we present an emerging effort, namely the integration of proteome constraints into yeast GEMs, resulting in models with improved performance. Finally, we discuss challenges and perspectives on the development of yeast GEMs and the integration of proteome constraints.

    Evaluating accessibility, usability and interoperability of genome-scale metabolic models for diverse yeasts species

    Metabolic network reconstructions have become an important tool for probing cellular metabolism in the field of systems biology. They are used as tools for quantitative prediction but also as scaffolds for further knowledge contextualization. The yeast Saccharomyces cerevisiae was one of the first organisms for which a genome-scale metabolic model (GEM) was reconstructed, in 2003, and since then 45 metabolic models have been developed for a wide variety of relevant yeast species. A systematic evaluation of these models revealed that, despite this long modeling history, the sequential process of tracing model files, setting them up for basic simulation purposes and comparing them across species and even different versions is still not a generalizable task. These findings call on the yeast modeling community to comply with standard practices for model development and sharing in order to make GEMs accessible and useful to a wider public.

    Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    The organization and mining of malaria genomic and post-genomic data is highly motivated by the necessity to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data of Plasmodium falciparum, but also using the millions of genomic data points from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences, 2) the high-throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journal.

    Toward systems biology in brown algae to explore acclimation and adaptation to the shore environment.

    Brown algae belong to a phylogenetic lineage distantly related to land plants and animals. They are almost exclusively found in the intertidal zone, a harsh and frequently changing environment where organisms are subjected to marine and terrestrial constraints. In relation to their unique evolutionary history and their habitat, they feature several peculiarities, including at the level of their primary and secondary metabolism. The establishment of Ectocarpus siliculosus as a model organism for brown algae has provided a framework in which several omics techniques have been developed, in particular to study the response of these organisms to abiotic stresses. With the recent publication of medium- to high-throughput profiling data, it is now possible to envision integrating observations at the cellular scale to apply systems biology approaches. As a first step, we propose a protocol focusing on integrating heterogeneous knowledge gained on brown algal metabolism. The resulting abstraction of the system will then help in understanding how brown algae cope with changes in abiotic parameters within their unique habitat, and in deciphering some of the mechanisms underlying their (1) acclimation and (2) adaptation, respectively consequences of (1) the behavior or (2) the topology of the system resulting from the integrative approach.

    Systems Biology of Protein Secretion in Human Cells: Multi-omics Analysis and Modeling of the Protein Secretion Process in Human Cells and its Application.

    Since the emergence of modern biotechnology, the production of recombinant pharmaceutical proteins has been an expanding field with high demand from industry. Pharmaceutical proteins have constituted the majority of top-selling drugs in the pharma industry during recent years. Many of these proteins require post-translational modifications and are therefore produced using mammalian cells such as Chinese Hamster Ovary cells. Despite frequent improvements in developing efficient cell factories for producing recombinant proteins, the natural complexity of the protein secretion process still poses serious challenges for the production of some proteins at the desired quantity and accepted quality. These challenges have been intensified by the growing demands of the pharma industry to produce novel products with greater structural complexity, as well as increasing expectations from regulatory authorities in the form of new quality control criteria to guarantee product safety. This thesis focuses on different aspects of the protein secretion process, including its engineering for cell factory development and analysis in diseases associated with its deregulation. A major part of this thesis involved the use of HEK293 cells as a human model cell line for investigating the protein secretion process by generating different types of omics data and developing a computational model of the human protein secretion pathway. We compared the transcriptomic profile of cell lines producing erythropoietin (EPO; as a model secretory protein) at different rates to identify key genes that potentially contributed to higher rates of protein secretion. Moreover, by performing a transcriptomic comparison of cells producing green fluorescent protein (GFP; as a model non-secretory protein) with EPO producers, we captured differences that specifically relate to secretory protein production.
We sought to further investigate the factors contributing to increased recombinant protein production by analyzing additional omic layers such as proteomics and metabolomics in cells that exhibited different rates of EPO production. Moreover, we developed a toolbox (HumanSec) to extend the reference human genome-scale metabolic model (Human1) to encompass protein-specific reactions for each secretory protein detected in our proteomics dataset. By generating cell-line specific protein secretion models and constraining the models using metabolomics data, we could predict the top host cell proteins (HCPs) that compete with EPO for metabolic and energetic resources. Finally, based on the detected patterns of changes in our multi-omics investigations combined with a protein secretion sensitivity analysis using the metabolic model, we identified a list of genes and pathways that potentially play a key role in recombinant protein production and could serve as promising candidates for targeted cell factory design. In another part of the thesis, we studied the link between the expression profiles of genes involved in the protein secretory pathway (PSP) and various hallmarks of cancer. By implementing a dual approach involving differential expression analysis and eight different machine learning algorithms, we investigated the expression changes in secretory pathway components across different cancer types to identify PSP genes whose expression was associated with tumor characteristics. We demonstrated that the machine learning and differential expression approaches are complementary and, when combined, could highlight key PSP components relevant to features of tumor pathophysiology that may constitute potential therapeutic targets.

    Yeast metabolic innovations emerged via expanded metabolic network and gene positive selection

    Yeasts are known to have versatile metabolic traits, but how these traits have evolved has not been elucidated systematically. We performed integrative evolution analysis to investigate how genomic evolution determines trait generation by reconstructing genome-scale metabolic models (GEMs) for 332 yeasts. These GEMs could comprehensively characterize trait diversity and predict enzyme functionality, thereby signifying that sequence-level evolution has shaped reaction networks towards new metabolic functions. Strikingly, using GEMs, we can mechanistically map different evolutionary events, e.g. horizontal gene transfer and gene duplication, onto relevant subpathways to explain metabolic plasticity. This demonstrates that gene family expansion and enzyme promiscuity are prominent mechanisms for metabolic trait gains, while GEM simulations reveal that additional factors, such as gene loss from distant pathways, contribute to trait losses. Furthermore, our analysis could pinpoint specific genes and pathways that have been under positive selection and are relevant for the formation of complex metabolic traits, i.e. thermotolerance and the Crabtree effect. Our findings illustrate how multidimensional evolution in both metabolic network structure and individual enzymes drives phenotypic variations.

    Application of machine learning in systems biology

    Biological systems are composed of a large number of molecular components. Understanding their behavior as a result of the interactions between the individual components is one of the aims of systems biology. Computational modelling is a powerful tool commonly used in systems biology, which relies on mathematical models that capture the properties and interactions between molecular components to simulate the behavior of the whole system. However, in many biological systems, it becomes challenging to build reliable mathematical models due to the complexity and the poor understanding of the underlying mechanisms. With the breakthrough in big data technologies in biology, data-driven machine learning (ML) approaches offer a promising complement to traditional theory-based models in systems biology. Firstly, ML can be used to model the systems in which the relationships between the components and the system are too complex to be modelled with theory-based models. Two such examples of using ML to resolve the genotype-phenotype relationships are presented in this thesis: (i) predicting yeast phenotypes using genomic features and (ii) predicting the thermal niche of microorganisms based on the proteome features. Secondly, ML naturally complements theory-based models. By applying ML, I improved the performance of the genome-scale metabolic model in describing yeast thermotolerance. In this application, ML was used to estimate the thermal parameters by using a Bayesian statistical learning approach that trains regression models and performs uncertainty quantification and reduction. The predicted bottleneck genes were further validated by experiments in improving yeast thermotolerance. In such applications, regression models are frequently used, and their performance relies on many factors, including but not limited to feature engineering and quality of response values. 
Manually engineering sufficient relevant features is particularly challenging in biology due to the lack of knowledge in certain areas. With the increasing volume of big data, deep-transfer learning enables us to learn a statistical summary of the samples from a big dataset which can be used as input to train other ML models. In the present thesis, I applied this approach to first learn a deep representation of enzyme thermal adaptation and then use it for the development of regression models for predicting enzyme optimal and protein melting temperatures. It was demonstrated that the transfer learning-based regression models outperform the classical ones trained on rationally engineered features in both cases. On the other hand, noisy response values are very common in biological datasets due to the variation in experimental measurements, and they fundamentally restrict the performance attainable with regression models. I therefore addressed this challenge by deriving a theoretical upper bound for the coefficient of determination (R²) for regression models. This theoretical upper bound depends on the noise associated with the response variable and the variance for a given dataset. It can thus be used to test whether the maximal performance has been reached on a particular dataset, or whether further model improvement is possible.
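    The abstract does not state the form of the bound. As an illustrative assumption only, one standard way to express such a limit is R²_max ≈ 1 − σ²_noise / σ²_y, where σ²_noise is the measurement-noise variance and σ²_y is the variance of the observed response values. The sketch below (function names and simulation settings are hypothetical) checks numerically that even an "oracle" model predicting the noise-free values cannot exceed this limit on noisy observations.

```python
import random
import statistics


def r2_score(observed, predicted):
    """Coefficient of determination of predictions against observed values."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def r2_upper_bound(noise_variance, observed_variance):
    """Assumed form of the bound: 1 - (noise variance / observed variance)."""
    return 1.0 - noise_variance / observed_variance


def simulate(n=20000, signal_sd=2.0, noise_sd=1.0, seed=0):
    """Compare the assumed bound with the R2 of a perfect (oracle) model."""
    rng = random.Random(seed)
    true_values = [rng.gauss(0.0, signal_sd) for _ in range(n)]
    # Observed responses are the true values plus independent Gaussian noise.
    observed = [t + rng.gauss(0.0, noise_sd) for t in true_values]
    bound = r2_upper_bound(noise_sd ** 2, statistics.pvariance(observed))
    oracle_r2 = r2_score(observed, true_values)
    return bound, oracle_r2
```

    With these settings the signal variance is 4 and the noise variance is 1, so both values returned by `simulate()` should be close to 0.8. A fitted model scoring near that value on such data would have little room left for improvement.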