18 research outputs found
An Orthogonal and pH-Tunable Sensor-Selector for Muconic Acid Biosynthesis in Yeast
Microbes offer enormous potential for production of industrially relevant chemicals and therapeutics, yet the rapid identification of high-producing microbes from large genetic libraries is a major bottleneck in modern cell factory development. Here, we develop and apply a synthetic selection system in <i>Saccharomyces cerevisiae</i> that couples the concentration of muconic acid, a plastic precursor, to cell fitness by using the prokaryotic transcriptional regulator BenM to drive an antibiotic resistance gene. We show that the sensor-selector affects neither production nor fitness, and find that tuning the pH of the cultivation medium limits the rise of nonproducing cheaters. We apply the sensor-selector to selectively enrich for the best-producing variants out of a large library of muconic acid production strains, and identify an isolate that produces more than 2 g/L muconic acid in a bioreactor. We expect that this sensor-selector can aid the development of other synthetic selection systems based on allosteric transcription factors.
Design, characterization and modeling of design-build-test-learn cycle I.
(A) Outline of the stochastic sampling and test workflow for data generation. Created with Biorender.com. (B) The distribution and counts of parts from the 167 strains that were accepted as input for machine learning in the first learning phase of the first DBTL cycle. (C) The distribution of observed strictosidine titers relative to reference strain MIA-CH-A2. Below the bar plot the distribution of parts for each of the 238 analyzed strains is presented. (D) Cross-validated predictions vs average normalized strictosidine production. All values are ranked.
Learning curves and top-ranking strains designs from the iterative engineering cycles.
Learning curves from the first (A) and second (B) DBTL cycles, illustrating the mean absolute error (MAE) of the best-performing deep learning and XGBoost models used in cycles I and II, respectively, in relation to the number of data points (blue line), and the cross-validation holdout prediction MAE together with the standard deviations of the 10 models created (yellow line). The points are based on 10 models created with randomly shuffled data in partitions of 33, 67, and 100% of the available data for DBTL1 and 20, 40, 60, 80, and 100% for DBTL2, chosen to give partitions of the same size. (C) Average strictosidine production for the top 20 strains from the first and second DBTL cycles. Genotypes are shown (left) with their respective color codes (middle) and average strictosidine production (right). For the strictosidine production, the light and dark blue colors correspond to strain designs that were first found in the first and second DBTL cycle, respectively.
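The learning-curve procedure described in this caption can be sketched as follows. This is a minimal illustration and not the authors' code: the one-feature least-squares model is a stand-in for the deep learning and XGBoost models, the data are synthetic, and the function names, the 20% holdout share, and the seeds are assumptions made for the example. Only the 10 repeats, the 33/67/100% partitions, and the count of 167 strains come from the listing above.

```python
import random
import statistics


def fit_linear(xs, ys):
    """One-feature ordinary least squares; a stand-in for the real models."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def mae(model, xs, ys):
    """Mean absolute error of the fitted line on the given points."""
    slope, intercept = model
    return statistics.fmean(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))


def learning_curve(xs, ys, fractions=(0.33, 0.67, 1.0), n_models=10,
                   holdout=0.2, seed=0):
    """For each data fraction, train n_models on reshuffled data and
    summarise training MAE plus holdout MAE (mean and standard deviation)."""
    rng = random.Random(seed)
    curve = {}
    for frac in fractions:
        train_errs, holdout_errs = [], []
        for _ in range(n_models):
            idx = list(range(len(xs)))
            rng.shuffle(idx)                              # randomized shuffle
            idx = idx[: max(4, round(frac * len(idx)))]   # keep this partition
            n_hold = max(1, round(holdout * len(idx)))    # assumed holdout share
            hold, train = idx[:n_hold], idx[n_hold:]
            x_tr, y_tr = [xs[i] for i in train], [ys[i] for i in train]
            x_ho, y_ho = [xs[i] for i in hold], [ys[i] for i in hold]
            model = fit_linear(x_tr, y_tr)
            train_errs.append(mae(model, x_tr, y_tr))
            holdout_errs.append(mae(model, x_ho, y_ho))
        curve[frac] = {
            "train_mae": statistics.fmean(train_errs),
            "holdout_mae": statistics.fmean(holdout_errs),
            "holdout_sd": statistics.stdev(holdout_errs),
        }
    return curve


# Synthetic stand-in data: 167 points, matching the cycle I strain count.
data_rng = random.Random(1)
xs = [data_rng.uniform(0, 1) for _ in range(167)]
ys = [2.0 * x + data_rng.gauss(0, 0.1) for x in xs]

curve = learning_curve(xs, ys)
for frac, stats in curve.items():
    print(f"{frac:.0%}: train MAE {stats['train_mae']:.3f}, "
          f"holdout MAE {stats['holdout_mae']:.3f} ± {stats['holdout_sd']:.3f}")
```

Plotting the two MAE series against the partition size would give curves of the kind shown in panels A and B. Passing `fractions=(0.2, 0.4, 0.6, 0.8, 1.0)` corresponds to the partitions listed for the second cycle.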
Design, characterization and modeling of design-build-test-learn cycle II.
(A) The distribution and counts of parts from the strains that were accepted as input for machine learning in the second learning phase of the second cycle of DBTL. (B) The distribution of observed strictosidine titers relative to reference strain MIA-CH-A2. Below the bar plot the distribution of parts for each of the 240 analysed strains is presented. (C) Cross-validated predictions vs average normalized strictosidine production. All values are ranked.
Using literate programming to A) Simulate a gel with one line of code and B) Run the gel with the amplicons.
Design and characteristics of the constituent DNA parts used as experimental testbed for teemi.
(A) The ten-step biosynthetic pathway converting geraniol to strictosidine. The G8H step is highlighted in a dashed box [26]. (B-C) Rooted phylogenetic trees of G8H (B) and CPR (C) protein representatives. Uniprot identifiers are shown in parentheses. Catharanthus roseus (Cro), Rauvolfia serpentina (Rse), Olea europaea (Oeu), Camptotheca acuminata (Cac), Vinca minor (Vmi), Cinchona calisaya (Cca), Ophiorrhiza pumila (Opu), Swertia mussotii (Smu), Artemisia annua (Aan), Arabidopsis thaliana (Ath), Catharanthus longifolius (Clo), Amsonia hubrichtii (Ahu), and Aspergillus niger (Ani). (D-E) Temporal resolution of transcript abundances for candidate genes [34], for which promoters were chosen to control the expression of genes encoding G8H (D) and CPR (E) homologs. (F) Combinatorial assembly and genome integration strategy.
Schematic representations of A) top 25 designed strains from DBTL2 and B) top 25 predicted strains from the updated machine-learning model after DBTL2.
Conversion of natural language lab protocols for iterative design-build-test-learn cycles to literate protocols using teemi.
Natural language protocols (left, blue) comprehensible to humans are converted into computer code (right, yellow) that can be understood by both computers and humans. In teemi, each procedure in natural language protocols is connected with names of Python modules in literate protocols, thus lowering the programming entry level needed for adopting teemi. See also S1 Fig for more details. Created with Biorender.com.
Machine-learning model characteristics.
Synthetic biology dictates the data-driven engineering of biocatalysis, cellular functions, and organism behavior. Integral to synthetic biology is the aspiration to efficiently find, access, interoperate, and reuse high-quality data on genotype-phenotype relationships of native and engineered biosystems under FAIR principles, and from this facilitate forward-engineering strategies. However, biology is complex at the regulatory level and noisy at the operational level, thus necessitating systematic and diligent data handling at all levels of the design, build, and test phases in order to maximize learning in the iterative design-build-test-learn engineering cycle. To enable user-friendly simulation, organization, and guidance for the engineering of biosystems, we have developed an open-source Python-based computer-aided design and analysis platform operating under a literate programming user interface hosted on GitHub. The platform is called teemi and is fully compliant with FAIR principles. In this study we apply teemi for i) designing and simulating bioengineering, ii) integrating and analyzing multivariate datasets, and iii) machine learning for predictive engineering of metabolic pathway designs for production of a key precursor to medicinal alkaloids in yeast. The teemi platform is publicly available on PyPI and GitHub.
Comparison of maintained<sup>*</sup> open-source IT tools and their functionalities for full-stack DBTL cycle.