Machine learning-assisted directed protein evolution with combinatorial libraries
To reduce experimental effort associated with directed protein evolution and
to explore the sequence space encoded by mutating multiple positions
simultaneously, we incorporate machine learning in the directed evolution
workflow. Combinatorial sequence space can be quite expensive to sample
experimentally, but machine learning models trained on tested variants provide
a fast method for testing sequence space computationally. We validate this
approach on a large published empirical fitness landscape for human GB1 binding
protein, demonstrating that machine learning-guided directed evolution finds
variants with higher fitness than those found by other directed evolution
approaches. We then provide an example application in evolving an enzyme to
produce each of the two possible product enantiomers (stereodivergence) of a
new-to-nature carbene Si-H insertion reaction. The approach predicted libraries
enriched in functional enzymes and fixed seven mutations in two rounds of
evolution to identify variants for selective catalysis with 93% and 79% ee. By
greatly increasing throughput with in silico modeling, machine learning
enhances the quality and diversity of sequence solutions for a protein
engineering problem.
Comment: Corrected best S-selective variant sequence in Figure 4. Corrected
less R-selective variant sequences from Round II Input library in Table 2 and
Supp Table 4. Corrections may also be found on PNAS version
https://www.pnas.org/content/early/2019/12/26/192177011
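The loop described in the abstract above (measure a subset of a combinatorial library, train a model on it, rank the untested variants in silico, then test the top predictions) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the four-letter alphabet, the three mutated sites, the `toy_fitness` function standing in for a wet-lab assay, and the site-wise mean model are all hypothetical choices made to keep the example self-contained.

```python
import itertools
import random

ALPHABET = "ACDE"  # hypothetical reduced alphabet (assumption)
N_SITES = 3        # hypothetical number of simultaneously mutated positions


def toy_fitness(seq):
    """Hypothetical ground truth standing in for an experimental assay."""
    weights = {"A": 0.0, "C": 1.0, "D": 2.0, "E": 3.0}
    return sum(weights[aa] * (i + 1) for i, aa in enumerate(seq))


def fit_additive_model(measured):
    """Crude stand-in model: mean measured fitness per (site, residue) pair."""
    sums, counts = {}, {}
    for seq, y in measured.items():
        for i, aa in enumerate(seq):
            sums[(i, aa)] = sums.get((i, aa), 0.0) + y
            counts[(i, aa)] = counts.get((i, aa), 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


def predict(model, seq, default=0.0):
    """Score a variant by summing its per-site effects; unseen pairs score `default`."""
    return sum(model.get((i, aa), default) for i, aa in enumerate(seq))


def propose(model, library, measured, k):
    """Rank every untested variant in silico and return the top k."""
    untested = [s for s in library if s not in measured]
    return sorted(untested, key=lambda s: predict(model, s), reverse=True)[:k]


# Full combinatorial library: every residue combination at the chosen sites.
library = ["".join(p) for p in itertools.product(ALPHABET, repeat=N_SITES)]

# Round 1: "measure" a random subset, train, and pick variants for round 2.
rng = random.Random(0)
measured = {s: toy_fitness(s) for s in rng.sample(library, 16)}
model = fit_additive_model(measured)
next_round = propose(model, library, measured, k=8)
```

In practice the site-wise mean model would be replaced by a regression model trained on encoded sequences; the point of the sketch is only the measure, train, rank, and re-test structure of each round.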
Machine learning-guided directed evolution for protein engineering
Machine learning (ML)-guided directed evolution is a new paradigm for
biological design that enables optimization of complex functions. ML methods
use data to predict how sequence maps to function without requiring a detailed
model of the underlying physics or biological pathways. To demonstrate
ML-guided directed evolution, we introduce the steps required to build ML
sequence-function models and use them to guide engineering, making
recommendations at each stage. This review covers basic concepts relevant to
using ML for protein engineering as well as the current literature and
applications of this new engineering paradigm. ML methods accelerate directed
evolution by learning from information contained in all measured variants and
using that information to select sequences that are likely to be improved. We
then provide two case studies that demonstrate the ML-guided directed evolution
process. We also look to future opportunities where ML will enable discovery of
new protein functions and uncover the relationship between protein sequence and
function.
Comment: Made significant revisions to focus on aspects most relevant to
applying machine learning to speed up directed evolution
Computational structure-based drug design: Predicting target flexibility
The role of molecular modeling in drug design has experienced a significant revamp in the last decade. The increase in computational resources and molecular models, along with software developments, is finally introducing a competitive advantage in early phases of drug discovery. Medium and small companies with a strong focus on computational chemistry are being created, some of them having introduced important leads in drug design pipelines. An important source of this success is the extraordinary development of faster and more efficient techniques for describing flexibility in three-dimensional structural molecular modeling. At different levels, from docking techniques to atomistic molecular dynamics, conformational sampling between receptor and drug results in improved predictions, such as screening enrichment and the discovery of transient cavities. In this review article we perform an extensive analysis of these modeling techniques, dividing them into high and low throughput and emphasizing their application to drug design studies. We close the review with a section describing our Monte Carlo method, PELE, recently highlighted as an outstanding advance in an international blind competition and industrial benchmarks.
We acknowledge the BSC-CRG-IRB Joint Research Program in Computational Biology. This work was supported by a grant from the Spanish Government CTQ2016-79138-R. J.I. acknowledges support from SVP-2014-068797, awarded by the Spanish Government.
Peer Reviewed
Postprint (author's final draft)
The Nondeterministic Waiting Time Algorithm: A Review
We briefly present the Nondeterministic Waiting Time algorithm. Our technique
for the simulation of biochemical reaction networks has the ability to mimic
the Gillespie Algorithm for some networks and solutions to ordinary
differential equations for other networks, depending on the rules of the
system, the kinetic rates and numbers of molecules. We provide a full
description of the algorithm as well as specifics on its implementation. Some
results for two well-known models are reported. We have used the algorithm to
explore Fas-mediated apoptosis models in cancerous and HIV-1 infected T cells.
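The abstract above does not spell out the Nondeterministic Waiting Time algorithm itself, so the sketch below shows only the reference point it names: Gillespie's direct method for stochastic simulation of a reaction network. The reversible reaction A <-> B, the rate constants `K_FORWARD` and `K_REVERSE`, and the function name `gillespie` are assumptions chosen for illustration.

```python
import random


def gillespie(initial_state, reactions, t_end, rng):
    """Gillespie direct method.

    `reactions` is a list of (propensity_fn, change) pairs, where
    propensity_fn maps the current state to a rate and `change` maps
    species names to the integer change applied when the reaction fires.
    """
    t = 0.0
    state = dict(initial_state)
    trajectory = [(t, dict(state))]
    while True:
        propensities = [fn(state) for fn, _ in reactions]
        total = sum(propensities)
        if total <= 0.0:
            break  # no reaction can fire any more
        t += rng.expovariate(total)  # exponentially distributed waiting time
        if t > t_end:
            break
        # Pick the firing reaction with probability proportional to its propensity.
        _, change = rng.choices(reactions, weights=propensities)[0]
        for species, delta in change.items():
            state[species] += delta
        trajectory.append((t, dict(state)))
    return trajectory


# Hypothetical toy network: reversible isomerization A <-> B.
K_FORWARD = 1.0
K_REVERSE = 0.5
reactions = [
    (lambda s: K_FORWARD * s["A"], {"A": -1, "B": +1}),
    (lambda s: K_REVERSE * s["B"], {"A": +1, "B": -1}),
]
trajectory = gillespie({"A": 50, "B": 0}, reactions, t_end=5.0,
                       rng=random.Random(1))
```

Each entry of `trajectory` is a `(time, state)` pair, so molecule counts can be plotted against time or compared with the solution of the corresponding ordinary differential equations.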
Inferring stabilizing mutations from protein phylogenies: application to influenza hemagglutinin
One selection pressure shaping sequence evolution is the requirement that a protein fold with sufficient stability to perform its biological functions. We present a conceptual framework that explains how this requirement causes the probability that a particular amino acid mutation is fixed during evolution to depend on its effect on protein stability. We mathematically formalize this framework to develop a Bayesian approach for inferring the stability effects of individual mutations from homologous protein sequences of known phylogeny. This approach is able to predict published experimentally measured mutational stability effects (ΔΔG values) with an accuracy that exceeds both a state-of-the-art physicochemical modeling program and the sequence-based consensus approach. As a further test, we use our phylogenetic inference approach to predict stabilizing mutations to influenza hemagglutinin. We introduce these mutations into a temperature-sensitive influenza virus with a defect in its hemagglutinin gene and experimentally demonstrate that some of the mutations allow the virus to grow at higher temperatures. Our work therefore describes a powerful new approach for predicting stabilizing mutations that can be successfully applied even to large, complex proteins such as hemagglutinin. This approach also makes a mathematical link between phylogenetics and experimentally measurable protein properties, potentially paving the way for more accurate analyses of molecular evolution.
The genetic basis for adaptation of model-designed syntrophic co-cultures.
Understanding the fundamental characteristics of microbial communities could have far-reaching implications for human health and applied biotechnology. Despite this, much is still unknown regarding the genetic basis and evolutionary strategies underlying the formation of viable synthetic communities. By pairing auxotrophic mutants in co-culture, it has been demonstrated that viable nascent E. coli communities can be established where the mutant strains are metabolically coupled. A novel algorithm, OptAux, was constructed to design 61 unique multi-knockout E. coli auxotrophic strains that require significant metabolite uptake to grow. These predicted knockouts included a diverse set of novel non-specific auxotrophs that result from inhibition of major biosynthetic subsystems. Three OptAux-predicted non-specific auxotrophic strains, with diverse metabolic deficiencies, were co-cultured with an L-histidine auxotroph and optimized via adaptive laboratory evolution (ALE). Time-course sequencing revealed the genetic changes employed by each strain to achieve higher community growth rates and provided insight into mechanisms for adapting to the syntrophic niche. A community model of metabolism and gene expression was utilized to predict the relative community composition and fundamental characteristics of the evolved communities. This work presents new insight into the genetic strategies underlying viable nascent community formation and a cutting-edge computational method to elucidate metabolic changes that empower the creation of cooperative communities.
Biological applications of the theory of birth-and-death processes
In this review, we discuss the applications of the theory of birth-and-death
processes to problems in biology, primarily, those of evolutionary genomics.
The mathematical principles of the theory of these processes are briefly
described. Birth-and-death processes, with some straightforward additions such
as innovation, are a simple, natural formal framework for modeling a vast
variety of biological processes such as population dynamics, speciation, genome
evolution, including growth of paralogous gene families and horizontal gene
transfer, and somatic evolution of cancers. We further describe how empirical
data, e.g., distributions of paralogous gene family size, can be used to choose
the model that best reflects the actual course of evolution among different
versions of birth-death-and-innovation models. It is concluded that
birth-and-death processes, thanks to their mathematical transparency,
flexibility and relevance to fundamental biological processes, are going to be an
indispensable mathematical tool for the burgeoning field of systems biology.
Comment: 29 pages, 4 figures; submitted to "Briefings in Bioinformatics"
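The abstract above mentions the mathematical principles of birth-and-death processes only briefly. One standard result, shown here as a minimal sketch and not taken from the review itself, is that a finite birth-and-death chain on states 0..N has a stationary distribution given by detailed balance: pi[n] is proportional to the product of birth_rate[i] / death_rate[i] over the steps i = 0..n-1. The function name and the constant example rates are assumptions; reading the states as, say, gene family sizes is likewise only an illustration.

```python
def stationary_distribution(birth_rates, death_rates):
    """Stationary distribution of a finite birth-and-death chain.

    birth_rates[i] is the rate of the jump i -> i + 1 and death_rates[i]
    is the rate of the jump i + 1 -> i, for i = 0..N-1. All death rates
    must be positive.
    """
    if len(birth_rates) != len(death_rates):
        raise ValueError("need one death rate per birth rate")
    # Detailed balance: pi[i] * birth_rates[i] == pi[i + 1] * death_rates[i].
    weights = [1.0]
    for lam, mu in zip(birth_rates, death_rates):
        weights.append(weights[-1] * lam / mu)
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: constant birth rate 1 and death rate 2 on states 0..3.
pi = stationary_distribution([1.0, 1.0, 1.0], [2.0, 2.0, 2.0])
```

Comparing distributions computed this way for different choices of rate functions against an empirical size distribution is one simple way to see how data can discriminate between model versions.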