41 research outputs found
Comparison of multiple Amber force fields and development of improved protein backbone parameters
The ff94 force field that is commonly associated with the Amber simulation package is one of the most widely used parameter sets for biomolecular simulation. After a decade of extensive use and testing, limitations in this force field, such as over-stabilization of α-helices, were reported by us and other researchers. This led to a number of attempts to improve these parameters, resulting in a variety of "Amber" force fields and significant difficulty in determining which should be used for a particular application. We show that several of these continue to suffer from inadequate balance between different secondary structure elements. In addition, the approach used in most of these studies neglected to account for the existence in Amber of two sets of backbone φ/ψ dihedral terms. This led to parameter sets that provide unreasonable conformational preferences for glycine. We report here an effort to improve the φ/ψ dihedral terms in the ff99 energy function. Dihedral term parameters are based on fitting the energies of multiple conformations of glycine and alanine tetrapeptides from high-level ab initio quantum mechanical calculations. The new parameters for backbone dihedrals replace those in the existing ff99 force field. This parameter set, which we denote ff99SB, achieves a better balance of secondary structure elements as judged by improved distribution of backbone dihedrals for glycine and alanine with respect to PDB survey data. It also accomplishes improved agreement with published experimental data for conformational preferences of short alanine peptides and better accord with experimental NMR relaxation data of test protein systems.
Hidden bias in the DUD-E dataset leads to misleading performance of deep learning in structure-based virtual screening
Recently, much effort has been invested in using convolutional neural network (CNN) models trained on 3D structural images of protein-ligand complexes to distinguish binding from non-binding ligands for virtual screening. However, the dearth of reliable protein-ligand X-ray structures and binding affinity data has required the use of constructed datasets for the training and evaluation of CNN molecular recognition models. Here, we outline various sources of bias in one such widely used dataset, the Directory of Useful Decoys: Enhanced (DUD-E). We have constructed and performed tests to investigate whether CNN models developed using DUD-E are properly learning the underlying physics of molecular recognition, as intended, or are instead learning biases inherent in the dataset itself. We find that superior enrichment efficiency in CNN models can be attributed to the analogue and decoy bias hidden in the DUD-E dataset rather than successful generalization of the pattern of protein-ligand interactions. Comparing additional deep learning models trained on PDBbind datasets, we found that their enrichment performances using DUD-E are not superior to the performance of the docking program AutoDock Vina. Together, these results suggest that biases that could be present in constructed datasets should be thoroughly evaluated before applying them to machine-learning-based methodology development.
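The abstract compares models by their enrichment performance. As a minimal sketch of one common enrichment metric, the enrichment factor (EF) at a chosen top fraction of the ranked list can be computed as below. The abstract does not state which exact metric the authors used, so the function name, the default fraction, and the toy scores are assumptions for illustration only.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor at the top `fraction` of a ranked list.

    scores: predicted scores, higher meaning "more likely a binder"
    labels: 1 for an active (binder), 0 for a decoy
    EF = (active rate in the top fraction) / (active rate overall)
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal in length")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("at least one active is required")
    # Always inspect at least one compound, even for tiny fractions.
    n_top = max(1, int(round(n_total * fraction)))
    # Rank compounds from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(label for _, label in ranked[:n_top])
    return (hits_top / n_top) / (n_actives / n_total)


# Hypothetical toy data: 10 compounds, 2 of them active.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
perfect_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
poor_labels = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]

ef_perfect = enrichment_factor(toy_scores, perfect_labels, fraction=0.2)
ef_poor = enrichment_factor(toy_scores, poor_labels, fraction=0.2)
```

An EF of 1 corresponds to random ranking. A high EF alone does not show that a model learned protein-ligand interactions, which is the point the abstract makes: the same number can come from dataset bias.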
Evaluating Molecular Mechanical Potentials for Helical Peptides and Proteins
Multiple variants of the AMBER all-atom force field were quantitatively evaluated with respect to their ability to accurately characterize helix-coil equilibria in explicit solvent simulations. Using a global distributed computing network, absolute conformational convergence was achieved for large ensembles of the capped A21 and Fs helical peptides. Further assessment of these AMBER variants was conducted via simulations of a flexible 164-residue five-helix-bundle protein, apolipophorin-III, on the 100 ns timescale. Of the contemporary potentials that had not been assessed previously, the AMBER-99SB force field showed significant helix-destabilizing tendencies, with beta bridge formation occurring in helical peptides, and unfolding of apolipophorin-III occurring on the tens of nanoseconds timescale. The AMBER-03 force field, while showing adequate helical propensities for both peptides and stabilizing apolipophorin-III, (i) predicts an unexpected decrease in helicity with ALA→ARG+ substitution, (ii) lacks experimentally observed 3₁₀-helical content, and (iii) deviates strongly from average apolipophorin-III NMR structural properties. As is observed for AMBER-99SB, AMBER-03 significantly overweighs the contribution of extended and polyproline backbone configurations to the conformational equilibrium. In contrast, the AMBER-99φ force field, which was previously shown to best reproduce experimental measurements of the helix-coil transition in model helical peptides, adequately stabilizes apolipophorin-III and yields both an average gyration radius and polar solvent exposed surface area that are in excellent agreement with the NMR ensemble.
Helix movement is coupled to displacement of the second extracellular loop in rhodopsin activation
The second extracellular loop (EL2) of rhodopsin forms a cap over the binding site of its photoreactive 11-cis retinylidene chromophore. A crucial question has been whether EL2 forms a reversible gate that opens upon activation or acts as a rigid barrier. Distance measurements using solid-state ¹³C NMR spectroscopy between the retinal chromophore and the β4 strand of EL2 show that the loop is displaced from the retinal binding site upon activation, and there is a rearrangement in the hydrogen-bonding networks connecting EL2 with the extracellular ends of transmembrane helices H4, H5 and H6. NMR measurements further reveal that structural changes in EL2 are coupled to the motion of helix H5 and breaking of the ionic lock that regulates activation. These results provide a comprehensive view of how retinal isomerization triggers helix motion and activation in this prototypical G protein-coupled receptor. © 2009 Nature America, Inc. All rights reserved.
Contributions of the membrane dipole potential to the function of voltage-gated cation channels and modulation by small molecule potentiators
The dipole potential (Ψd) is a fundamental property of phospholipid bilayers, and the largest of three electric potentials existing within excitable membranes. Ψd arises in part from unfavorable alignment of phospholipid dipoles, and varies both temporally and spatially across bilayer surfaces, corresponding to the conformational states and locations of integral membrane proteins (increasing unfavorably in response to conformations promoting increased lipid packing density). Building on the work of Clarke, we propose that the transmembrane potential (ΔΨm) and Ψd serve as complementary barriers to voltage-gated cation channel activation and deactivation, respectively. Ψd serves as the energetic driving force for activation during depolarization, being opposed by ΔΨm in the resting state. Conversely, ΔΨm serves as the energetic driving force for deactivation during repolarization, being opposed by Ψd in the non-resting state. We further propose that modulation of Ψd by certain membrane-partitioned molecules alters the delicate balance between the two potentials, and thereby shifts the voltage dependence of voltage-gated channel transitions. Here, we use molecular dynamics simulations to calculate Ψd modulation by a series of hERG activators partitioned in model POPC bilayers. Our findings suggest a strong correlation between Ψd lowering and hERG current increase across the series. Our hypothesis differs from the conventional view that potentiators act via direct ion channel binding, suggesting a plausible mechanism for 1) the transduction of binding energy into alteration of ion channel transition barriers (particularly under the non-equilibrium conditions found in vivo); and 2) both activator and inhibitor modalities, as may occur within the same chemical series.
Relative Binding Free-Energy Calculations at Lipid-Exposed Sites: Deciphering Hot Spots
Relative binding free-energy (RBFE) calculations are experiencing a resurgence in the computer-aided drug design of novel small molecules due to performance gains allowed by cutting-edge molecular mechanics force fields and computer hardware. Application of RBFE to soluble proteins is becoming routine, while recent studies outline the steps necessary to successfully apply RBFE at the orthosteric site of membrane-embedded G-protein-coupled receptors (GPCRs). In this work, we apply RBFE to a congeneric series of antagonists that bind to a lipid-exposed, extra-helical site of the P2Y1 receptor. We find promising performance of RBFE, such that it may be applied in a predictive manner on drug discovery programs targeting lipid-exposed sites. Further, by the application of the microkinetic model, binding at a lipid-exposed site can be split into (1) membrane partitioning of the drug molecule followed by (2) binding at the extra-helical site. We find that RBFE can be applied to calculate the free energy of each step, allowing the uncoupling of observed binding free energy from the influence of membrane affinity. This protocol may be used to identify binding hot spots at extra-helical sites and guide drug discovery programs toward optimizing intrinsic activity at the target.
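The two-step picture in this abstract (membrane partitioning followed by binding at the extra-helical site) suggests a simple bookkeeping of free energies. The sketch below assumes the two steps are additive, so that the intrinsic contribution at the site is the observed relative free energy minus the partitioning contribution. That additivity, the function names, and the numbers are assumptions for illustration; the abstract does not give the authors' actual equations.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def intrinsic_ddg(ddg_observed, ddg_partition):
    """Relative free energy of the site-binding step (kcal/mol).

    Assumes ddG_observed = ddG_partition + ddG_site (additive two-step model).
    """
    return ddg_observed - ddg_partition


def affinity_ratio(ddg, temperature=298.15):
    """Fold change in dissociation constant implied by a relative free energy.

    Uses ddG = RT * ln(Kd_B / Kd_A); a negative ddG means ligand B binds tighter.
    """
    return math.exp(ddg / (R_KCAL * temperature))


# Hypothetical ligand pair: B is 1.5 kcal/mol better overall than A,
# of which 1.0 kcal/mol comes from stronger membrane partitioning.
ddg_site = intrinsic_ddg(ddg_observed=-1.5, ddg_partition=-1.0)
site_fold_change = affinity_ratio(ddg_site)
```

In this toy case only 0.5 kcal/mol of the improvement is attributable to the binding site itself, which is the kind of separation between membrane affinity and intrinsic activity the abstract describes.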
Benchmarking in silico Tools for Cysteine pKa Prediction
Accurate estimation of the pKas of cysteine residues in proteins could inform targeted approaches in hit discovery. The pKa of a targetable cysteine residue in a disease-related protein is an important physicochemical parameter in covalent drug discovery, as it influences the fraction of nucleophilic thiolate amenable to chemical protein modification. Traditional structure-based in silico tools are limited in their predictive accuracy of cysteine pKas relative to other titratable residues. Additionally, there are limited comprehensive benchmark assessments for cysteine pKa predictive tools. This raises the need for extensive assessment and evaluation of methods for cysteine pKa prediction. Here, we report the performance of several computational pKa methods, including single structure and ensemble-based approaches, on a diverse test set of experimental cysteine pKas retrieved from the PKAD database. The dataset consisted of 16 wildtype and 10 mutant proteins with experimentally measured cysteine pKa values. Our results highlight that these methods are varied in their overall predictive accuracies. Among the test set of wildtype proteins evaluated, the best method yielded a mean absolute error of 2.3 pK units, highlighting the need for improvement of existing pKa methods for accurate cysteine pKa estimation. Given the limited accuracy of these methods, further development is needed before these approaches can be routinely employed to drive design decisions in early drug discovery efforts.
Benchmarking tools for in silico cysteine pKa prediction
Accurate estimation of the pKa’s of cysteine residues in proteins could inform targeted approaches in hit discovery. The pKa of a targetable cysteine residue in a disease-related protein is an important physicochemical parameter in covalent drug discovery, as it influences the fraction of nucleophilic thiolate amenable to chemical protein modification. Traditional structure-based in silico tools are limited in their predictive accuracy of cysteine pKa’s relative to other titratable residues. Additionally, there are limited comprehensive benchmark assessments for cysteine pKa predictive tools. This raises the need for extensive assessment and evaluation of methods for cysteine pKa prediction. Here, we report the performance of several computational pKa methods, including structure-based and ensemble-based sampling approaches, on a diverse test set of experimental cysteine pKa’s retrieved from the PKAD database. The dataset consisted of 16 wildtype and 10 mutant proteins with experimentally measured cysteine pKa values. Our results highlight that these methods are varied in their overall predictive accuracies. Among the test set of wildtype proteins evaluated, the best method yielded a mean absolute error of 2.3 pK units, highlighting the need for improvement of existing pKa methods for accurate cysteine pKa estimation. Given the limited accuracy of these methods, further development is needed before these approaches can be routinely employed to drive design decisions in early drug discovery efforts.
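The link between cysteine pKa and the fraction of nucleophilic thiolate follows from the Henderson-Hasselbalch relation, and the benchmark metric quoted above is a mean absolute error. The sketch below shows both calculations. The pH of 7.4 and all pKa values are hypothetical illustration inputs, not data from the study.

```python
def thiolate_fraction(pka, ph=7.4):
    """Fraction of a thiol present as thiolate at a given pH.

    Henderson-Hasselbalch: [S-] / ([S-] + [SH]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def mean_absolute_error(predicted, experimental):
    """Mean absolute error between predicted and experimental pKa values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and equal in length")
    total = sum(abs(p - e) for p, e in zip(predicted, experimental))
    return total / len(predicted)


# Hypothetical values, chosen only to illustrate the calculations.
half_ionized = thiolate_fraction(pka=7.4, ph=7.4)   # pKa equal to pH
mostly_thiol = thiolate_fraction(pka=8.4, ph=7.4)   # one unit above pH
toy_mae = mean_absolute_error([8.0, 6.0], [9.0, 4.0])
```

Because the thiolate-to-thiol ratio scales as 10^(pH - pKa), an error of 2.3 pK units corresponds to roughly a 200-fold error in that ratio, which gives a sense of why the abstract calls for further method development.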