Predicting and modeling genotype-phenotype associations in yeast metabolic networks
Over the last 15 years, several genome-scale metabolic models (GSMMs) of
Saccharomyces cerevisiae have been reconstructed and published. These in silico representations of
the interaction network between all system components are used to predict the
physiological behavior of a microorganism under different environmental and genetic
perturbations. However, gene knockout predictions are usually assessed and validated
using only gene essentiality data. The Saccharomyces Genome Database (SGD) [1] is a
powerful web-accessible resource that comprises structured functional information on
budding yeast genes. SGD contains information about over 180 different types of observed
phenotypes, of which nearly 10% can be predicted using GSMMs. These data can provide
an additional layer for the curation and validation of metabolic models, as well as contribute to
model improvements and insights into yeast physiology.
In this study we assessed the predictive accuracy of GSMMs for single-gene
deletions by comparing experimental data present in SGD with computational
simulations. Since the phenotypic behavior upon a gene deletion depends on the strain
background, media and other environmental conditions, we performed a thorough
characterization and (re)curation of the in vivo experiments to closely mimic these
conditions in silico. Nearly 3000 different reported phenotypic cases were evaluated using
two different constraint-based approaches (pFBA [2] and LMOMA [3]), which allow a
direct association between genetic data and metabolic fluxes. In parallel, a Jupyter
Notebook platform was developed to serve as a possible validation tool for
new yeast GSMMs, using the curated SGD-based dataset.
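For orientation, the two constraint-based approaches named above can be written as optimization problems. This is a sketch of the formulations as they are commonly stated, not necessarily the exact variant used in this study. The notation is introduced here as an assumption: S is the stoichiometric matrix, v the flux vector with bounds l and u, v_bio the biomass flux, K the set of reactions disabled by the gene deletion, and v^wt a reference wild-type flux distribution.

```latex
% pFBA, step 1: maximise growth of the mutant
\[
\begin{aligned}
\mu^{*} = \max_{v}\; & v_{\mathrm{bio}} \\
\text{s.t.}\; & S v = 0,\quad l \le v \le u,\quad v_j = 0 \;\; \forall j \in K
\end{aligned}
\]

% pFBA, step 2: among growth-optimal solutions, minimise total flux
\[
\begin{aligned}
\min_{v}\; & \sum_i |v_i| \\
\text{s.t.}\; & S v = 0,\quad l \le v \le u,\quad v_j = 0 \;\; \forall j \in K,\quad v_{\mathrm{bio}} = \mu^{*}
\end{aligned}
\]

% LMOMA: stay as close as possible (L1 distance) to the wild-type fluxes
\[
\begin{aligned}
\min_{v}\; & \sum_i \left| v_i - v_i^{\mathrm{wt}} \right| \\
\text{s.t.}\; & S v = 0,\quad l \le v \le u,\quad v_j = 0 \;\; \forall j \in K
\end{aligned}
\]
```

Both problems are linear programs once the absolute values are linearized. pFBA assumes the mutant still grows optimally, whereas LMOMA assumes it stays near the wild-type flux state.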
We observed that, despite all the recent efforts and advances in the reconstruction
and annotation of GSMMs, there are still many opportunities to improve the
models’ predictive ability. Most of the observed mismatches result from structural issues in
the network reconstructions or from the lack of regulatory information. To address these
issues, several strategies were investigated, including changes in gene-protein-reaction
associations and in the reversibility of reactions in the network, as well as the formulation of a
new biomass equation based on the experimentally determined macromolecular
composition, to which several cofactors that, surprisingly, had not been represented in the
original biomass reaction were also added. This last modification, for example, led to
significant improvements in the prediction of auxotroph-inducing mutations and lethal
knockouts, which should enable us to engineer yeast as a cell factory more effectively.
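To illustrate why gene-protein-reaction (GPR) associations matter for knockout predictions, here is a minimal sketch of how a Boolean GPR rule can be evaluated after a gene deletion. It assumes rules are written with `and` (enzyme complex) and `or` (isoenzymes), as is common in GSMMs. The function name, gene names and reaction IDs are hypothetical and do not come from the study.

```python
import ast


def reaction_active(gpr: str, deleted: set[str]) -> bool:
    """Return True if a reaction can still be catalysed after deleting genes.

    `gpr` is a Boolean rule such as "(GENE_A and GENE_B) or GENE_C".
    """
    if not gpr.strip():
        return True  # no gene association: assume the reaction is unaffected

    def evaluate(node: ast.AST) -> bool:
        if isinstance(node, ast.BoolOp):
            values = [evaluate(child) for child in node.values]
            # "and" needs every subunit, "or" needs at least one isoenzyme
            return all(values) if isinstance(node.op, ast.And) else any(values)
        if isinstance(node, ast.Name):
            return node.id not in deleted  # a gene is "on" unless deleted
        raise ValueError(f"unsupported GPR element: {ast.dump(node)}")

    return evaluate(ast.parse(gpr, mode="eval").body)


# Hypothetical rules, for illustration only
rules = {
    "R1": "GENE_A and GENE_B",              # enzyme complex
    "R2": "GENE_C or GENE_D",               # isoenzymes
    "R3": "(GENE_A and GENE_B) or GENE_E",  # complex with a backup enzyme
}
blocked = [rid for rid, gpr in rules.items()
           if not reaction_active(gpr, {"GENE_A"})]
print(blocked)
```

In this toy case only R1 is blocked by deleting GENE_A. Changing a single `and` to `or` in a rule changes which reactions a simulated knockout disables, which is why GPR curation can shift predictions.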
Accelerating metabolic engineering tasks by the in silico development of yeast cell factories
Doctoral thesis in Bioengineering
The increasing knowledge about microbial metabolism and the recent advances in genetic
engineering tools are enabling metabolic engineering of microorganisms for the
sustainable production of industrially relevant compounds. Moreover, the reconstruction
of genome-scale metabolic models (GSMMs) together with the use of computational tools
have contributed to the rational design of microbial cell factories. However, despite all
the advances and available technologies in the systems biology field, the development of
economically viable cell factories is still a costly and time-consuming process, since the
design effort has to be repeated for each new target product. Therefore, the development of
chassis cells, i.e. pre-optimized strains for the overproduction of different compounds, as
well as the improvement of model predictive accuracy are needed to accelerate the
model-driven design of efficient cell factories.
Overall, this thesis follows the design-build-test-learn cycle of the model-guided
metabolic engineering workflow, focusing on the development of chassis strains of
Saccharomyces cerevisiae optimized for the improved production of different industrially
relevant organic acids with possible applications in the food, chemical and pharmaceutical
industries, in addition to the study of genotype-phenotype associations using GSMMs.
A conceptual framework for the design of chassis strains aimed at overproducing C4-
dicarboxylic acids, namely succinic, fumaric and malic acid, was developed. This strain
design framework applies a metaheuristic approach to search for growth-coupled
production designs, building upon the fact that these compounds are derived from the
same metabolic precursors. Several chassis strains, encompassing common and non-intuitive
gene deletions and requiring minimal additional strain optimization steps
towards the overproduction of the three organic acids, were generated in a
modular fashion and analyzed in terms of their biological feasibility against literature data.
The most promising candidate solutions were further implemented in vivo and
physiologically characterized in batch cultivations regarding growth and production rates,
using high performance liquid chromatography (HPLC) and gas chromatography-mass
spectrometry (GC-MS) quantification techniques. Although the final strains developed in
this study would require additional rounds of metabolic engineering and bioprocess
development to achieve industrially relevant production levels, a proof of concept of the
chassis cell for overproducing three different dicarboxylic acids has been established, to
the best of our knowledge, for the first time. This work also demonstrates the potential
of combining metabolic modeling with adaptive laboratory evolution in the design and
optimization of yeast cell factories. After evolution, for example, the strains engineered
to produce succinate showed a 6.1-fold improvement in the specific growth rate and a
4.9-fold improvement in the succinate secretion rate when compared to the non-evolved
counterpart, with a 38-fold increase in succinate titer and 17-fold increase in productivity
in comparison to the wild-type strain.
Despite the promising results, some discrepancies were observed between the
experimentally obtained yields and the simulated ones, stressing the need for model
improvements. Therefore, to make the model validation process
more transparent and robust, phenomenaly (PHENOtypic and MEtabolic Network Analysis
at Large-scale of Yeast data) was developed. Phenomenaly is an open-source Python
package built around manually curated data sets of yeast mutant phenotypes available
on Saccharomyces Genome Database (SGD), that enables the simulation and analysis of
genotype-phenotype associations using GSMMs of yeast. This tool was further used to
perform a thorough characterization of the observed inconsistencies, guiding the
formulation of several hypotheses to address these failures, including a detailed revision
of the in silico representation of the yeast biomass composition. The applied changes
contributed to the discovery of missing or erroneously present metabolic functions in the
yeast metabolic network and broadened the network’s functionality and activity, i.e. the
number of reactions carrying flux in the wild-type simulation, as suggested by
experimental evidence. Moreover, the model’s ability to correctly predict lethal and
auxotroph-inducing genes was improved by over 18% and 25%, respectively. The
questions raised and addressed in this part of the work are expected to contribute to the more
effective engineering of yeast as a cell factory through model-guided approaches.
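As an illustration of how lethality predictions can be scored against curated phenotype data, the sketch below turns simulated growth rates into lethal/viable calls and compares them with observations. It does not use the phenomenaly API. The function names, gene labels, growth values and the 5% growth cutoff are all hypothetical assumptions made for this example.

```python
def call_lethal(growth: dict[str, float], wild_type: float,
                cutoff: float = 0.05) -> dict[str, bool]:
    """Call a deletion lethal if simulated growth is below cutoff * wild type."""
    return {gene: mu < cutoff * wild_type for gene, mu in growth.items()}


def score(observed: dict[str, bool], predicted: dict[str, bool]) -> dict[str, float]:
    """Compare lethal (True) / viable (False) calls gene by gene."""
    tp = sum(observed[g] and predicted[g] for g in observed)
    tn = sum(not observed[g] and not predicted[g] for g in observed)
    fp = sum(not observed[g] and predicted[g] for g in observed)
    fn = sum(observed[g] and not predicted[g] for g in observed)
    return {
        "accuracy": (tp + tn) / len(observed),
        # fraction of experimentally lethal deletions predicted as lethal
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # fraction of experimentally viable deletions predicted as viable
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Hypothetical data, for illustration only
observed = {"G1": True, "G2": True, "G3": False, "G4": False, "G5": False}
simulated_growth = {"G1": 0.0, "G2": 0.30, "G3": 0.31, "G4": 0.01, "G5": 0.25}

predicted = call_lethal(simulated_growth, wild_type=0.32)
metrics = score(observed, predicted)
print(metrics)
```

Reporting sensitivity and specificity separately is useful because most deletions are viable, so overall accuracy alone can hide poor prediction of lethal knockouts.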
In summary, the work developed in this thesis shows that model-predicted modular
design strategies can indeed help to accelerate metabolic engineering tasks, although
genotype/phenotype predictions with yeast metabolic networks must be improved to
yield industrially relevant cell factories.
The accumulated knowledge about microbial metabolism and the development of
advanced genetic engineering tools have enabled the engineering of
microorganisms capable of producing compounds of industrial interest in a
sustainable and environmentally friendly way. In addition, the reconstruction of genome-scale
metabolic models (GSMMs), combined with the use of computational tools, has contributed to
the rational design of microorganisms as cell factories. However, despite the
advances achieved and the technologies available in the field of systems biology, the
development of cell factories compatible with industrial requirements is still a
rather expensive and time-consuming process, since each target product still requires a
unique strategy developed from scratch, mainly owing to the lack of chassis cells
(that is, strains optimized for the production of different compounds) and of
properly validated modeling tools.
In general terms, this thesis follows the so-called “design-build-test-learn” cycle associated
with the metabolic engineering process based on genome-scale models, and focuses
on the development of chassis strains of Saccharomyces cerevisiae, pre-optimized for
the production of different organic acids of high industrial interest, as well as on the
study of genotype-phenotype associations using this type of model.
To meet the goal of creating chassis strains of S. cerevisiae optimized for
increased production of different dicarboxylic acids, namely succinic acid,
fumaric acid and malic acid, a computational tool was developed that uses
existing optimization algorithms and is based on the premise that these compounds
derive from the same metabolic precursors. Several possible solutions were generated
for this purpose, including non-obvious gene deletion targets common to the production
of this family of compounds, based on a modular design concept.
The most promising solutions were then implemented in vivo and characterized in
terms of growth and production rates, using analytical techniques such as HPLC and GC-MS.
Although further rounds of optimization of the strains and of the
bioprocesses used are needed to reach values compatible with industrial use,
the proof of concept of S. cerevisiae chassis cells optimized for the production of
different organic acids was established for the first time. In addition, the potential of
combining metabolic models with adaptive laboratory evolution strategies in the
design and optimization of yeast cell factories was also demonstrated.
After the evolutionary process, the succinate-producing strains
showed a significant improvement in specific growth rate and in the
secretion rate of this compound compared with the non-evolved strains, as well
as a significant improvement in succinate production and productivity of the evolved
strain relative to the wild-type strain.
However, some discrepancies were found between the values obtained
experimentally and the data resulting from the simulations, reinforcing the need to
improve the predictive capacity of the models. To improve the model validation
process and make it more transparent, a new computational tool
named phenomenaly (PHENOtypic and MEtabolic Network Analysis at
Large-scale of Yeast data) was created. This tool was built on data sets of
phenotypes of S. cerevisiae mutants available in the
Saccharomyces Genome Database (SGD). With phenomenaly it is possible to simulate and
analyze the phenotype resulting from the deletion of a given gene, using different
yeast GSMMs. Using this tool, several hypotheses were generated
to try to correct the discrepancies observed between experimental and
in silico data, which led to the reformulation of the biomass equation of the
most recent GSMM of the microorganism under study. The modifications made
contributed to the discovery of reactions that were missing or misrepresented in the model, as well
as to an increase in the number of active (flux-carrying) reactions, based
on experimental evidence. In addition, the model’s ability to predict lethal genes or
auxotrophies was also improved by about 18% and 25%, respectively. The
questions raised and analyzed in this part of the work are expected to contribute to
building more efficient yeast cell factories based on rational
strategies.
The work developed throughout this thesis demonstrates that modular design
strategies based on metabolic models can indeed help to accelerate the
tasks associated with metabolic engineering, although the predictive capacity of the models
must be improved to give rise to cell factories with industrial production
levels.
Fundação para a Ciência e Tecnologia and the MIT Portugal Program, for the funding awarded through the doctoral grant PD/BD/52336/2013