16 research outputs found
Towards crystal structure prediction of complex organic compounds - a report on the fifth blind test
Following on from the success of the previous crystal structure prediction blind tests (CSP1999, CSP2001, CSP2004 and CSP2007), a fifth such collaborative project (CSP2010) was organized at the Cambridge Crystallographic Data Centre. A range of methodologies was used by the participating groups in order to evaluate the ability of the current computational methods to predict the crystal structures of the six organic molecules chosen as targets for this blind test. The first four targets, two rigid molecules, one semi-flexible molecule and a 1:1 salt, matched the criteria for the targets from CSP2007, while the last two targets belonged to two new challenging categories - a larger, much more flexible molecule and a hydrate with more than one polymorph. Each group submitted three predictions for each target it attempted. There was at least one successful prediction for each target, and two groups were able to successfully predict the structure of the large flexible molecule as their first place submission. The results show that while not as many groups successfully predicted the structures of the three smallest molecules as in CSP2007, there is now evidence that methodologies such as dispersion-corrected density functional theory (DFT-D) are able to reliably do so. The results also highlight the many challenges posed by more complex systems and show that there are still issues to be overcome.
Quantum-mechanics-derived 13Cα chemical shift server (CheShift) for protein structure validation
A server (CheShift) has been developed to predict 13Cα chemical shifts of protein structures. It is based on the generation of 696,916 conformations as a function of the φ, ψ, ω, χ1 and χ2 torsional angles for all 20 naturally occurring amino acids. Their 13Cα chemical shifts were computed at the DFT level of theory with a small basis set and extrapolated, with an empirically-determined linear regression formula, to reproduce the values obtained with a larger basis set. Analysis of the accuracy and sensitivity of the CheShift predictions, in terms of both the correlation coefficient R and the conformational-averaged rmsd between the observed and predicted 13Cα chemical shifts, was carried out for 3 sets of conformations: (i) 36 x-ray-derived protein structures solved at 2.3 Å or better resolution, for which sets of 13Cα chemical shifts were available; (ii) 15 pairs of x-ray and NMR-derived sets of protein conformations; and (iii) a set of decoys for 3 proteins showing an rmsd with respect to the x-ray structure from which they were derived of up to 3 Å. Comparative analysis carried out with 4 popular servers, namely SHIFTS, SHIFTX, SPARTA, and PROSHIFT, for these 3 sets of conformations demonstrated that CheShift is the most sensitive server with which to detect subtle differences between protein models and, hence, to validate protein structures determined by either x-ray or NMR methods, if the observed 13Cα chemical shifts are available. CheShift is available as a web server.
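The two agreement measures named in this abstract, the correlation coefficient R and the rmsd between observed and predicted shifts, can be illustrated with a minimal sketch. This is not the CheShift implementation: the function names are hypothetical, and the slope and intercept of the linear extrapolation are placeholders, since the abstract does not give the regression formula.

```python
import math

def extrapolate(shift_small_basis, slope, intercept):
    """Linear map from a small-basis-set shift to a large-basis-set estimate.
    slope and intercept are placeholders, not the published regression values."""
    return slope * shift_small_basis + intercept

def conformational_average(shifts_per_conformer):
    """Average the predicted shift of each residue over a set of conformers.
    shifts_per_conformer: list of equal-length lists, one list per conformer."""
    n_conf = len(shifts_per_conformer)
    return [sum(column) / n_conf for column in zip(*shifts_per_conformer)]

def rmsd(observed, predicted):
    """Root-mean-square deviation between observed and predicted shifts (ppm)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def pearson_r(xs, ys):
    """Pearson correlation coefficient R between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

For an ensemble, one would first call `conformational_average` on the per-conformer predictions and then pass the averaged shifts to `rmsd` and `pearson_r` together with the observed values.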
Identifying native-like protein structures with scoring functions based on all-atom ECEPP force fields, implicit solvent models and structure relaxation
Availability of energy functions which can discriminate native-like from non-native protein conformations is crucial for theoretical protein structure prediction and refinement of low-resolution protein models. This article reports the results of benchmark tests for scoring functions based on two all-atom ECEPP force fields, that is, ECEPP/3 and ECEPP05, and two implicit solvent models for a large set of protein decoys. The following three scoring functions are considered: (i) ECEPP05 plus a solvent-accessible surface area model with the parameters optimized with a set of protein decoys (ECEPP05/SA); (ii) ECEPP/3 plus the solvent-accessible surface area model of Ooi et al. (Proc Natl Acad Sci USA 1987;84:3086–3090) (ECEPP3/OONS); and (iii) ECEPP05 plus an implicit solvent model based on a solution of the Poisson equation with an optimized Fast Adaptive Multigrid Boundary Element (FAMBEpH) method (ECEPP05/FAMBEpH). Short Monte Carlo-with-Minimization (MCM) simulations, following local energy minimization, are used as a scoring method with ECEPP05/SA and ECEPP3/OONS potentials, whereas energy calculation is used with ECEPP05/FAMBEpH. The performance of each scoring function is evaluated by examining its ability to distinguish between native-like and non-native protein structures. The results of the tests show that the new ECEPP05/SA scoring function represents a significant improvement over the earlier ECEPP3/OONS version of the force field. Thus, it is able to rank native-like structures with Cα root-mean-square-deviations below 3.5 Å as lowest-energy conformations for 76% and within the top 10 for 87% of the proteins tested, compared with 69 and 80%, respectively, for ECEPP3/OONS. The use of the FAMBEpH solvation model, which provides a more accurate description of the protein-solvent interactions, improves the discriminative ability of the scoring function to 89%.
All failed tests in which the native-like structures cannot be discriminated as those with low energy are due to omission of protein–protein interactions. The results of this study represent a benchmark in force-field development, and may be useful for evaluation of the performance of different force fields. Proteins 2009. Affiliation: Arnautova, Yelena A. Cornell University; United States. Affiliation: Vorobjev, Yury N. Russian Academy of Science; Russia. Affiliation: Vila, Jorge Alberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Luis. Instituto de Matemática Aplicada de San Luis "Prof. Ezio Marchi". Universidad Nacional de San Luis. Facultad de Ciencias Físico, Matemáticas y Naturales. Instituto de Matemática Aplicada de San Luis "Prof. Ezio Marchi"; Argentina. Affiliation: Scheraga, Harold A. Cornell University; United States.
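The success rates quoted in this abstract (native-like structure ranked lowest in energy, or within the top 10) can be computed as in the following sketch. It is an illustration under stated assumptions, not the authors' benchmark code: each decoy is assumed to be an (energy, Cα RMSD) pair, the function names are hypothetical, and the 3.5 Å cutoff is taken from the abstract.

```python
def best_native_like_rank(decoys, cutoff=3.5):
    """Return the 1-based energy rank of the best-ranked native-like decoy.

    decoys: list of (energy, ca_rmsd) tuples for one protein.
    A decoy counts as native-like when its Calpha RMSD is below `cutoff` (in Angstrom).
    Returns None if the set contains no native-like decoy.
    """
    ranked = sorted(decoys, key=lambda d: d[0])  # lowest energy first
    for rank, (_, ca_rmsd) in enumerate(ranked, start=1):
        if ca_rmsd < cutoff:
            return rank
    return None

def success_rate(decoy_sets, top_n, cutoff=3.5):
    """Fraction of proteins with a native-like decoy among the top_n lowest energies."""
    hits = 0
    for decoys in decoy_sets:
        rank = best_native_like_rank(decoys, cutoff)
        if rank is not None and rank <= top_n:
            hits += 1
    return hits / len(decoy_sets)
```

Calling `success_rate(decoy_sets, top_n=1)` corresponds to the "lowest-energy" criterion and `top_n=10` to the "top 10" criterion.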
All-Atom Internal Coordinate Mechanics (ICM) Force Field for Hexopyranoses and Glycoproteins
We present an extension of the all-atom
internal-coordinate force
field, ICMFF, that allows for simulation of heterogeneous systems
including hexopyranose saccharides and glycan chains in addition to
proteins. A library of standard glycan geometries containing α-
and β-anomers of the most common hexopyranoses, i.e., D-galactose, D-glucose, D-mannose, D-xylose, L-fucose, N-acetylglucosamine, N-acetylgalactosamine, sialic, and glucuronic acids, is created based
on the analysis of the saccharide structures reported in the Cambridge
Structural Database. The new force field parameters include molecular
electrostatic potential-derived partial atomic charges and the torsional
parameters derived from quantum mechanical data for a collection of
minimal molecular fragments and related molecules. The ϕ/ψ
torsional parameters for different types of glycosidic linkages are
developed using model compounds containing the key atoms in the full
carbohydrates, i.e., glycosidic-linked tetrahydropyran–cyclohexane
dimers. Target data for parameter optimization include two-dimensional
energy surfaces corresponding to the ϕ/ψ glycosidic dihedral
angles in the disaccharide analogues, as determined by quantum mechanical
MP2/6-31G** single-point energies on HF/6-31G** optimized structures.
To achieve better agreement with the observed geometries of glycosidic
linkages, the bond angles at the O-linkage atoms are added to the
internal variable set and the corresponding bond bending energy term
is parametrized using quantum mechanical data. The resulting force
field is validated on glycan chains of 1–12 residues from a
set of high-resolution X-ray glycoprotein structures based on heavy
atom root-mean-square deviations of the lowest-energy glycan conformations
generated by the biased probability Monte Carlo (BPMC) molecular mechanics
simulations from the native structures. The appropriate BPMC distributions
for monosaccharide–monosaccharide and protein–glycan
linkages are derived from the extensive analysis of conformational
properties of glycoprotein structures reported in the Protein Data
Bank. Use of the BPMC search leads to significant improvements in
sampling efficiency for glycan simulations. Moreover, good agreement
with the X-ray glycoprotein structures is achieved for all glycan
chain lengths. Thus, average/median RMSDs are 0.81/0.68 Å for
one-residue glycans and 1.32/1.47 Å for three-residue glycans.
RMSD from the native structure for the lowest-energy conformation
of the 12-residue glycan chain (PDB ID 3og2) is 1.53 Å. Additionally, results
obtained for free short oligosaccharides using the new force field
are in line with the available experimental data, i.e., the most populated
conformations in solution are predicted to be the lowest energy ones.
The newly developed parameters allow for the accurate modeling of
linear and branched hexopyranose glycosides in heterogeneous systems.
Protein-RNA Docking Using ICM
Protein-RNA interactions play an
important role in many biological
processes. Computational methods such as docking have been developed
to complement existing biophysical and structural biology techniques.
Computational prediction of protein-RNA complex structures includes
two steps: generating candidate structures from the individual protein
and RNA parts and scoring the generated poses to pick out the correct
one. In this work, we considered three recently developed data sets
of protein-RNA complexes to evaluate and improve the performance of
the FFT-based rigid-body docking algorithm implemented in the ICM
package. An electrostatic term describing interactions between negatively
charged phosphate groups and positively charged protein residues was
added to the energy function used during the docking step to take
into account the greater role that electrostatic interactions play
in protein-RNA complexes. Next, the docking results were used to optimize
a scoring function including van der Waals, electrostatic, and solvation
terms. This optimization yielded a much smaller weight for the solvation
term indicating that solvation energy may be less important for the
scoring of protein-RNA structures. Rescoring of the generated poses
with the new scoring function led to much higher success rates, while
pose clustering by contact fingerprints produced further improvements,
achieving a success rate of 0.66 for the top 100 structures.
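A scoring function of the kind described here, a weighted sum of van der Waals, electrostatic, and solvation terms used to rescore docking poses, might be sketched as follows. This is not the ICM implementation: the `Pose` fields, function names, and any weights passed in are hypothetical, and the optimized weights are not given in the abstract.

```python
from dataclasses import dataclass

@dataclass
class Pose:
    name: str
    e_vdw: float   # van der Waals energy term
    e_elec: float  # electrostatic energy term
    e_solv: float  # solvation energy term

def score(pose, w_vdw, w_elec, w_solv):
    """Weighted sum of the three energy terms (lower is better)."""
    return w_vdw * pose.e_vdw + w_elec * pose.e_elec + w_solv * pose.e_solv

def rescore(poses, w_vdw, w_elec, w_solv):
    """Return the poses sorted from best (lowest score) to worst."""
    return sorted(poses, key=lambda p: score(p, w_vdw, w_elec, w_solv))
```

Lowering `w_solv` relative to the other weights reduces the influence of the solvation term on the ranking, which is the direction of change the abstract reports for the optimized function.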
Are accurate computations of the 13C' shielding feasible at the DFT level of theory?
The goal of this study is twofold: first, to investigate the relative influence of the main structural factors affecting the computation of the 13C' shielding, namely, the conformation of the residue itself and the next nearest-neighbor effects; second, to determine whether calculation of the 13C' shielding at the density functional level of theory (DFT), with an accuracy similar to that of the 13Cα shielding, is feasible with the existing computational resources. The DFT calculations, carried out for a large number of possible conformations of the tripeptide Ac-GXY-NMe, with different combinations of X and Y residues, enable us to conclude that the accurate computation of the 13C' shielding for a given residue X depends on: (i) the (φ,ψ) backbone torsional angles of X; (ii) the side-chain conformation of X; (iii) the (φ,ψ) torsional angles of Y; and (iv) the identity of residue Y. Consequently, DFT-based quantum mechanical calculations of the 13C' shielding, with all these factors taken into account, are two orders of magnitude more CPU demanding than the computation, with similar accuracy, of the 13Cα shielding. Despite not considering the effect of the possible hydrogen bond interaction of the carbonyl oxygen, this work contributes to our general understanding of the main structural factors affecting the accurate computation of the 13C' shielding in proteins and may spur significant progress in efforts to develop new validation methods for protein structures. Affiliation: Vila, Jorge Alberto. Cornell University; United States. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Luis. Instituto de Matemática Aplicada de San Luis; Argentina. Affiliation: Arnautova, Yelena A. Molsoft L.L.C.; United States. Affiliation: Martín, Osvaldo Antonio. Cornell University; United States. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Luis. Instituto de Matemática Aplicada de San Luis; Argentina. Affiliation: Scheraga, Harold A. Cornell University; United States.
What can we learn by computing 13Cα chemical shifts for X-ray protein models?
The room-temperature X-ray structures of two proteins, solved at 1.8 and 1.9 Å resolution, are used to investigate whether a set of conformations, rather than a single X-ray structure, provides better agreement with both the X-ray data and the observed 13Cα chemical shifts in solution.