Kinetic Characterization of 100 Glycoside Hydrolase Mutants Enables the Discovery of Structural Features Correlated with Kinetic Constants.
The use of computational modeling algorithms to guide the design of novel enzyme catalysts is a rapidly growing field. Force-field-based methods have now been used to engineer both enzyme specificity and activity. However, the proportion of designed mutants with the intended function is often less than ten percent. One potential reason for this is that current force-field-based approaches are trained on indirect measures of function rather than direct correlation to experimentally determined functional effects of mutations. We hypothesize that this is partially due to the lack of data sets for which a large panel of enzyme variants has been produced, purified, and kinetically characterized. Here we report the kcat and KM values of 100 purified mutants of a glycoside hydrolase enzyme. We demonstrate the utility of this data set by using machine learning to train a new algorithm that enables prediction of each kinetic parameter based on readily modeled structural features. The data set and analyses generated in this study not only provide insight into how this enzyme functions but also provide a clear path forward for the improvement of computational enzyme redesign algorithms.
Structure and catalyzed reaction of BglB.
<p>(A) Structure of BglB in complex with the modeled <i>p</i>-nitrophenyl-β-D-glucoside (pNPG) used for design. Alpha carbons of the mutated residues are shown as blue spheres. The image was drawn with PyMOL. [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0147596#pone.0147596.ref016" target="_blank">16</a>] (B) The BglB-catalyzed reaction on pNPG used to evaluate kinetic constants of designed mutants.</p>
Correlation between machine learning predictions and experimentally-determined kinetic constants.
<p><i>Top panels</i>: predicted versus experimentally measured values for kinetic constants <i>k</i><sub>cat</sub>/K<sub>M</sub> (A), <i>k</i><sub>cat</sub> (B), and 1/K<sub>M</sub> (C). All values are relative to the wild-type enzyme and on a log scale. The standard deviation (error bars) of each predicted value is calculated from the predictions obtained by 1000-fold cross validation for that point. The red line corresponds to linear regression and has been added for visualization purposes. <i>Bottom panels</i>: Histograms of experimentally determined values in the data set (90, 80 and 80 samples for <i>k</i><sub>cat</sub>/K<sub>M</sub>, <i>k</i><sub>cat</sub>, and K<sub>M</sub>, respectively), along with the residual errors (scatter plot) between predicted and measured kinetic values.</p>
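The "relative to wild type, on a log scale" transformation described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the numeric values are hypothetical, and base-10 logarithms are an assumption, since the caption does not state the base.

```python
import math

def log_relative_constants(kcat, km, kcat_wt, km_wt):
    """Return log10 of each kinetic constant relative to wild type.

    Units cancel in every ratio, provided the mutant and wild-type
    values are expressed in the same units.
    """
    return {
        "kcat/KM": math.log10((kcat / km) / (kcat_wt / km_wt)),
        "kcat": math.log10(kcat / kcat_wt),
        "1/KM": math.log10(km_wt / km),  # (1/KM) / (1/KM_wt) = KM_wt / KM
    }

# Hypothetical values for illustration only (not taken from the data set).
wild_type = {"kcat": 100.0, "km": 5.0}
mutant = {"kcat": 10.0, "km": 50.0}

rel = log_relative_constants(
    mutant["kcat"], mutant["km"], wild_type["kcat"], wild_type["km"]
)
for name, value in rel.items():
    print(f"{name}: {value:+.2f}")
```

On a log scale the three quantities are additive: the value for kcat/KM equals the sum of the values for kcat and 1/KM, which is one reason to plot 1/KM rather than KM.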
Active site model and conservation analysis of BglB.
<p>(A) Docked model of pNPG in the active site of BglB showing established catalytic residues (navy) and a selection of the mutated residues (gold). A multiple sequence alignment of the 1,554 family 1 glycoside hydrolases in the Pfam database was made, and sequence logos are shown for (B) selected regions around specific residues discussed in the text and (C) the entire BglB coding sequence. The height of each amino acid indicates the sequence conservation at that position.</p>
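As a rough illustration of how letter heights in a sequence logo relate to conservation, the sketch below computes per-column information content from a toy alignment. It assumes the standard Shannon-entropy definition over a 20-letter amino acid alphabet, ignores gaps, and omits the small-sample correction; the caption does not say which tool or settings were used, and the alignment shown is hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_information(column):
    """Return (information content in bits, per-residue letter heights).

    Gaps and any non-standard characters are ignored.
    """
    residues = [c for c in column if c in AMINO_ACIDS]
    if not residues:
        return 0.0, {}
    n = len(residues)
    freqs = {aa: k / n for aa, k in Counter(residues).items()}
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    info = math.log2(len(AMINO_ACIDS)) - entropy
    # Each letter's height is its frequency times the column information.
    heights = {aa: f * info for aa, f in freqs.items()}
    return info, heights

def logo_heights(alignment):
    """Apply column_information to every column of equal-length sequences."""
    return [column_information(col) for col in zip(*alignment)]

# Hypothetical toy alignment: column 0 is fully conserved, column 2 is not.
toy_alignment = ["AEG", "AEC", "AQD", "AE-"]
result = logo_heights(toy_alignment)
for position, (info, heights) in enumerate(result):
    print(position, round(info, 3), {aa: round(h, 3) for aa, h in heights.items()})
```

A fully conserved column reaches the maximum of log2(20), about 4.32 bits, while a column with many different residues approaches zero.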
Most informative structural features predicting each kinetic constant.
<p>For each mutant, 10 out of 100 models were selected based on the lowest total system energy. Fifty-nine structural features were calculated for the selected models and the most informative features were selected based on a constrained regularization technique (elastic net with bagging; see <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0147596#sec012" target="_blank">Methods</a>). The table contains features that have been assigned non-zero weights during training (9 for <i>k</i><sub>cat</sub>/K<sub>M</sub>, 8 for <i>k</i><sub>cat</sub>, 10 for K<sub>M</sub>). The weights are multiplied by a normalized form of the value (not shown), and can therefore indicate either a positive or a negative relationship. For example, a negative weight for hydrogen bonding is consistent with a positive correlation to hydrogen bonding, because a smaller value indicates that more hydrogen bonding is occurring. Conversely, a positive weight for packing would indicate a positive correlation, since a larger value indicates a system with fewer voids. The relative contribution of each feature in determining the kinetic constant is given as a normalized weight (columns 1–3). Column 4 provides a description of each feature, and columns 5 and 6 show the range of observed values in the training data set. The full feature table is available in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0147596#pone.0147596.s007" target="_blank">S2 Table</a>. <i>ns = feature not selected by the algorithm</i>.</p>
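The idea of "elastic net with bagging" can be sketched in a few lines of dependency-free Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the coordinate-descent solver, the objective parameterization, the hyperparameters (`alpha`, `l1_ratio`, `n_bags`), and the synthetic data are all hypothetical, and how the paper aggregates weights across bags is not stated here, so simple averaging is assumed.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def centre(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def fit_elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=100):
    """Coordinate descent on centred data for the objective
    (1/2n)||y - Xw||^2 + alpha*l1_ratio*||w||_1
                       + 0.5*alpha*(1 - l1_ratio)*||w||^2.
    """
    n, p = len(X), len(X[0])
    cols = [centre([row[j] for row in X]) for j in range(p)]
    residual = centre(y)  # equals y - Xw while w is all zeros
    w = [0.0] * p
    for _ in range(n_iter):
        for j, col in enumerate(cols):
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(c * (r + c * w[j]) for c, r in zip(col, residual)) / n
            z = sum(c * c for c in col) / n
            new_w = soft_threshold(rho, alpha * l1_ratio) / (z + alpha * (1 - l1_ratio))
            delta = new_w - w[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
                w[j] = new_w
    return w

def bagged_elastic_net(X, y, n_bags=25, seed=0, **kwargs):
    """Average elastic net weights over bootstrap resamples of the rows."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    totals = [0.0] * p
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]
        w = fit_elastic_net([X[i] for i in idx], [y[i] for i in idx], **kwargs)
        totals = [t + wj for t, wj in zip(totals, w)]
    return [t / n_bags for t in totals]

# Synthetic data: the target depends on features 0 and 1 but not on feature 2.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(60)]
y = [2.0 * row[0] - 1.0 * row[1] + data_rng.gauss(0, 0.1) for row in X]
weights = bagged_elastic_net(X, y, alpha=0.1, l1_ratio=0.5)
print([round(w, 3) for w in weights])
```

The L1 part of the penalty drives weights of uninformative features to zero, which is what allows a small subset of the 59 features to be reported with non-zero weights. Averaging over bootstrap resamples makes that selection less sensitive to any single mutant in the training set.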