journal article
Predicting recombinant protein expression experiments using molecular dynamics simulation
Abstract
Soluble expression of de novo-designed proteins in Escherichia coli (E. coli) remains empirical. For given experimental conditions, expression success is determined in part by the protein's primary sequence. This relationship has previously been explored, with varying success, using a variety of statistical solubility prediction tools, though without taking fold stability into account. The present study takes a new approach: the three-dimensional structures of a set of four-helix bundles are examined in molecular dynamics (MD) simulations and used to predict expression. Stability-related parameters for ten structures were determined in a thermal unfolding MD simulation and used to build statistical models with a support vector machine (SVM) classifier. The most accurate models were identified by their performance on five independent four-helix bundle sequences. The final model classified this test set accurately and was successfully applied in a model challenge with two newly designed sequences. The combination of simulation-derived parameters and an SVM classifier has potential to predict recombinant expression outcome for this set of four-helix bundles. With further development, this approach of using higher-dimensional protein structural information to predict expression could advance recombinant biotechnology through modern computational and statistical science.
- Journal Article
- Four-helix bundle
- Expression
- MD simulation
- Stability
- Support Vector Machine
- Secondary Structure Prediction
- Sequence-Based Prediction
- Escherichia-Coli
- 3-Dimensional Structures
- Structural Proteomics
- 4-Helix Bundle
- Force-Field
- Solubility
- Overexpression
- 1500 Chemical Engineering
- 1600 Chemistry
- 2209 Industrial and Manufacturing Engineering
- 2604 Applied Mathematics
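
The classification step summarised in the abstract (stability-related parameters for ten structures, an SVM classifier, evaluation on five independent sequences) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, not the study's method. The two descriptors (fraction of helical content retained and final backbone RMSD) are hypothetical stand-ins for the unspecified "stability-related parameters". All numbers and labels are synthetic. The classifier is a plain linear soft-margin SVM trained by batch subgradient descent on the hinge loss, since the abstract does not state which kernel or software was used.

```python
# Illustrative sketch only: synthetic data, hypothetical descriptors.
# Each row: (fraction of helical content retained at the end of a thermal
# unfolding run, final backbone RMSD in nm). Neither descriptor nor any
# value is taken from the study.
# Labels: +1 = soluble expression observed, -1 = not observed (synthetic).
TRAIN_X = [
    (0.90, 0.30), (0.85, 0.35), (0.80, 0.40), (0.88, 0.32), (0.82, 0.38),
    (0.40, 0.90), (0.45, 0.85), (0.50, 0.80), (0.42, 0.88), (0.48, 0.82),
]
TRAIN_Y = [1, 1, 1, 1, 1, -1, -1, -1, -1, -1]   # ten training structures

TEST_X = [(0.87, 0.33), (0.83, 0.37), (0.44, 0.86), (0.47, 0.83), (0.86, 0.36)]
TEST_Y = [1, 1, -1, -1, 1]                       # five held-out sequences


def fit_scaler(X):
    """Return per-feature mean and standard deviation of the training set."""
    n = len(X)
    columns = list(zip(*X))
    means = [sum(col) / n for col in columns]
    stds = [(sum((v - m) ** 2 for v in col) / n) ** 0.5
            for col, m in zip(columns, means)]
    return means, stds


def scale(X, means, stds):
    """Z-score each feature using statistics from the training set only."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


def train_linear_svm(X, y, lam=0.01, lr=0.1, steps=500):
    """Minimise lam/2 * |w|^2 + mean hinge loss by batch subgradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w = [lam * wj for wj in w]   # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:              # only margin violators contribute
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(X, w, b):
    """Classify by the sign of the decision function w.x + b."""
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else -1
            for xi in X]


means, stds = fit_scaler(TRAIN_X)
w, b = train_linear_svm(scale(TRAIN_X, means, stds), TRAIN_Y)
test_pred = predict(scale(TEST_X, means, stds), w, b)
accuracy = sum(p == t for p, t in zip(test_pred, TEST_Y)) / len(TEST_Y)
print("test predictions:", test_pred)
print("test accuracy:", accuracy)
```

The scaler is fitted on the training structures only, so the held-out sequences do not influence the model. With so few training examples, a held-out set of this kind is the main guard against overfitting, which matches the abstract's use of five independent sequences to select models.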