28 research outputs found

    EnzyHTP Computational Directed Evolution with Adaptive Resource Allocation

    Directed evolution facilitates enzyme engineering via iterative rounds of mutagenesis. Despite the wide applications of high-throughput screening, building “smart libraries” to effectively identify beneficial variants remains a major challenge in the community. Here, we developed a new computational directed evolution protocol based on EnzyHTP, software we previously reported to automate enzyme modeling. To enhance the throughput efficiency, we implemented an adaptive resource allocation strategy that dynamically allocates different types of computing resources (e.g., GPU/CPU) based on the specific need of an enzyme modeling sub-task in the workflow. We implemented the strategy as a Python library and tested the library using fluoroacetate dehalogenase as a model enzyme. The results show that, compared with fixed resource allocation where both CPU and GPU are on-call for use during the entire workflow, applying adaptive resource allocation can save 87% of CPU hours and 14% of GPU hours. Furthermore, we constructed a computational directed evolution protocol under the framework of adaptive resource allocation. The workflow was tested against two rounds of mutational screening in the directed evolution experiments of Kemp eliminase with a total of 184 mutants. Using folding stability and electrostatic stabilization energy as computational readouts, we reproduced three out of the four experimentally observed target variants. Enabled by the workflow, the entire computation task (i.e., 18.4 μs MD and 18,400 QM single point calculations) completes in three days of wall clock time using ~30 GPUs and ~1000 CPUs.
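
The contrast between fixed and adaptive allocation described in this abstract can be illustrated with a minimal Python sketch. It is not the EnzyHTP API: the `SubTask` class, the three-step workflow, and all durations and resource counts are hypothetical values chosen only to show how resource-hours are tallied under each policy.

```python
from dataclasses import dataclass


@dataclass
class SubTask:
    name: str
    resource: str  # "cpu" or "gpu"
    units: int     # cores or GPUs the sub-task needs
    hours: float   # wall-clock duration of the sub-task


# Hypothetical per-mutant workflow; all numbers are illustrative assumptions.
WORKFLOW = [
    SubTask("build_model", "cpu", 1, 0.5),
    SubTask("md_sampling", "gpu", 1, 4.0),
    SubTask("qm_single_points", "cpu", 8, 2.0),
]


def fixed_allocation(tasks, cpu_units, gpu_units):
    """Both resource types stay reserved for the whole workflow."""
    total_hours = sum(t.hours for t in tasks)
    return {"cpu": cpu_units * total_hours, "gpu": gpu_units * total_hours}


def adaptive_allocation(tasks):
    """Each sub-task reserves only the resource it needs, only while it runs."""
    usage = {"cpu": 0.0, "gpu": 0.0}
    for t in tasks:
        usage[t.resource] += t.units * t.hours
    return usage


def percent_saved(fixed, adaptive):
    """Percentage of resource-hours saved by the adaptive policy."""
    return {k: 100.0 * (fixed[k] - adaptive[k]) / fixed[k] for k in fixed}


fixed = fixed_allocation(WORKFLOW, cpu_units=8, gpu_units=1)
adaptive = adaptive_allocation(WORKFLOW)
savings = percent_saved(fixed, adaptive)
```

The savings in this toy case come entirely from not holding idle CPUs during the GPU step and vice versa; the figures reported in the abstract depend on the real workflow and are not reproduced by these made-up numbers.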

    EnzyHTP: A High-Throughput Computational Platform for Enzyme Modeling

    Molecular simulations, including quantum mechanics (QM), molecular mechanics (MM), and multiscale QM/MM modeling, have been extensively applied to understand the mechanism of enzyme catalysis and to design new enzymes. However, molecular simulations typically require specialized, manual operation ranging from model construction to post-analysis to complete the entire life-cycle of enzyme modeling. The dependence on manual operation makes it challenging to simulate enzymes and enzyme variants in a high-throughput fashion. In this work, we developed Python software, EnzyHTP, to automate molecular model construction, QM, MM, and QM/MM computation, and analyses of modeling data for enzyme simulations. To test EnzyHTP, we used fluoroacetate dehalogenase (FAcD) as a model system and simulated the enzyme interior electrostatics for 100 FAcD mutants with a random single amino acid substitution. For each enzyme mutant, the workflow involves structural model construction, 1 ns molecular dynamics simulations, and quantum mechanical calculations in 100 MD-sampled snapshots. The entire simulation workflow for 100 mutants was completed in 7 hours with 10 GPUs and 160 CPUs. EnzyHTP is expected to improve the efficiency and reproducibility of computational enzyme modeling, facilitate the fundamental understanding of catalytic origins across enzyme families, and accelerate the optimization of biocatalysts for non-native substrate transformation.

    Convergence in Determining Enzyme Functional Descriptors across Kemp Eliminase Variants

    Molecular simulations have been extensively employed to accelerate biocatalytic discoveries. Enzyme functional descriptors derived from molecular simulations have been leveraged to guide the search for beneficial enzyme mutants. However, the ideal active-site region size for computing the descriptors over multiple enzyme variants remains untested. Here, we conducted convergence tests for dynamics-derived and electrostatic descriptors on eighteen Kemp eliminase variants across six active-site regions with various boundary distances to the substrate. The tested descriptors include the root-mean-square deviation of the active-site region, the solvent accessible surface area ratio between the substrate and active site, and the projection of the electric field on the breaking C–H bond. All descriptors were evaluated using molecular mechanics methods. To understand the effects of electronic structure, the electric field was also evaluated using quantum mechanics/molecular mechanics methods. The descriptor values were computed for eighteen Kemp eliminase variants. Spearman correlation matrices were used to determine the region size condition under which further expansion of the region boundary does not substantially change the ranking of descriptor values. We observed that protein dynamics-derived descriptors, including RMSD_active_site and SASA_ratio, converge at a distance cutoff of 5 Å from the substrate. The electrostatic descriptor, EF_C–H, converges at 6 Å using molecular mechanics methods with truncated enzyme models and at 4 Å using quantum mechanics/molecular mechanics methods with a whole-enzyme model. This study serves as a future reference to determine descriptors for predictive modeling of enzyme engineering.
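
The rank-based convergence test described in this abstract can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' code: the descriptor values, the five-variant data set, and the 0.9 threshold are all hypothetical, and "converged" is taken here to mean that a cutoff's variant ranking keeps a Spearman correlation at or above the threshold with every larger cutoff.

```python
def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def converged_cutoff(descriptor_by_cutoff, threshold=0.9):
    """Smallest cutoff whose ranking agrees with all larger cutoffs."""
    cutoffs = sorted(descriptor_by_cutoff)
    for idx, c in enumerate(cutoffs[:-1]):
        if all(
            spearman(descriptor_by_cutoff[c], descriptor_by_cutoff[d]) >= threshold
            for d in cutoffs[idx + 1:]
        ):
            return c
    return None  # no cutoff met the criterion


# Hypothetical descriptor values for five variants at four cutoffs (in Å).
DESCRIPTOR = {
    3: [3.0, 1.0, 2.0, 5.0, 4.0],
    4: [2.0, 1.0, 3.5, 5.0, 4.0],
    5: [1.1, 2.2, 3.1, 4.3, 5.2],
    6: [1.2, 2.1, 3.3, 4.1, 5.5],
}

cutoff = converged_cutoff(DESCRIPTOR)
```

With real data, the same pairwise correlations would fill a full Spearman matrix over all six regions, and the threshold would need to be justified rather than assumed.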

    EnzyKR: A Chirality-Aware Deep Learning Model for Predicting the Outcomes of the Hydrolase-Catalyzed Kinetic Resolution

    Hydrolase-catalyzed kinetic resolution is a well-established biocatalytic process. However, the computational tools that predict the favorable enzyme scaffolds for separating racemic substrate mixtures are underdeveloped. To address this challenge, we trained a deep learning framework, EnzyKR, to automate the selection of hydrolases for stereoselective biocatalysis. EnzyKR adopts a classifier-regressor architecture that first identifies the reactive binding conformer of an enantiomer-hydrolase complex, and then predicts its activation free energy. A structure-based encoding strategy was used to depict the chiral interactions between hydrolases and enantiomers. Unlike existing models trained on protein sequence and substrate SMILES strings, EnzyKR was trained using 204 enantiomer-hydrolase complexes, which were constructed by docking based on the enzyme and substrate structures curated from IntEnzyDB. EnzyKR was tested using a held-out dataset of 20 complexes on the task of activation free energy prediction. EnzyKR achieved a Pearson correlation coefficient (R) of 0.72, a Spearman rank correlation coefficient (Spearman R) of 0.72, and a mean absolute error (MAE) of 1.54 kcal/mol in its activation free energy prediction task. Furthermore, EnzyKR was tested on the task of predicting enantiomeric excess ratios for 28 hydrolytic kinetic resolution reactions catalyzed by fluoroacetate dehalogenase RPA1163, halohydrin dehalogenase HheC, A. mediolanus epoxide hydrolase, and P. fluorescens esterase. The performance of EnzyKR was compared against a recently developed kinetic predictor, DLKcat. EnzyKR correctly predicts the favored enantiomer and outperforms DLKcat in 18 out of 28 reactions, that is, 64% of the test cases. These results demonstrate EnzyKR as a new approach for prediction of enantiomeric outcomes in hydrolase-catalyzed kinetic resolution reactions.
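
To show how a pair of predicted activation free energies relates to an enantiomeric preference, the sketch below applies the standard transition-state-theory relation k ∝ exp(−ΔG‡/RT). It is not taken from EnzyKR: the function names are hypothetical, the temperature of 298.15 K is an assumption, and both enantiomers are assumed to share the same pre-exponential factor.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def rate_ratio(dg_fast, dg_slow, temperature=298.15):
    """Ratio k_fast / k_slow from two activation free energies in kcal/mol.

    Assumes equal pre-exponential factors for both enantiomers, so the
    ratio depends only on the difference in activation free energy.
    """
    return math.exp((dg_slow - dg_fast) / (R_KCAL * temperature))


def favored_enantiomer(dg_r, dg_s):
    """Label of the enantiomer with the lower activation free energy."""
    return "R" if dg_r < dg_s else "S"


# Hypothetical predictions for the two enantiomers, in kcal/mol.
dg_r, dg_s = 15.00, 16.54
label = favored_enantiomer(dg_r, dg_s)
ratio = rate_ratio(min(dg_r, dg_s), max(dg_r, dg_s))
```

Because the relation is exponential, a free energy difference of about 1.5 kcal/mol at room temperature already corresponds to roughly an order-of-magnitude rate ratio, which gives a sense of scale for the reported MAE.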

    Investigating the Non-Electrostatic Component of Substrate Positioning Dynamics

    Substrate positioning dynamics (SPD) orients the substrate to reactive conformations in the active site, accelerating enzymatic reactions. However, it remains unknown whether SPD effects originate primarily from electrostatic perturbation inside the enzyme or can independently mediate catalysis with a significant non-electrostatic component. Here, we investigated how the non-electrostatic component of SPD affects transition state (TS) stabilization. Using high-throughput enzyme modeling, we selected Kemp eliminase variants with similar electrostatics inside the enzyme but significantly different SPD. The kinetic parameters of these selected mutants were experimentally characterized. We observed a valley-shaped, two-segment linear correlation between the TS stabilization free energy (converted from kinetic parameters) and an index used to quantify SPD. Favorable SPD was observed for a distal mutant, R154W, leading to the lowest activation free energy among the mutants tested. R154W shows an increased proportion of reactive conformations. These results indicate the contribution of the non-electrostatic component of SPD to mediating enzyme catalytic efficiency.

    Bioinspired Synthesis of (−)‐PF‐1018

    The combination of electrocyclizations and cycloadditions accounts for the formation of a range of fascinating natural products. Cascades consisting of an 8π electrocyclization, followed by a 6π electrocyclization and a cycloaddition, are relatively common. We now report the synthesis of the tetramic acid PF-1018 through an 8π electrocyclization, the product of which is immediately intercepted by a Diels–Alder cycloaddition. The success of this pericyclic cascade was critically dependent on the substitution pattern of the starting polyene and could be rationalized through DFT calculations. The completion of the synthesis required the installation of a trisubstituted double bond via radical deoxygenation. An unexpected byproduct formed through 4-exo-trig radical cyclization could be recycled through an unprecedented triflation/fragmentation.