16 research outputs found

    Towards crystal structure prediction of complex organic compounds - a report on the fifth blind test

    Following on from the success of the previous crystal structure prediction blind tests (CSP1999, CSP2001, CSP2004 and CSP2007), a fifth such collaborative project (CSP2010) was organized at the Cambridge Crystallographic Data Centre. A range of methodologies was used by the participating groups in order to evaluate the ability of the current computational methods to predict the crystal structures of the six organic molecules chosen as targets for this blind test. The first four targets, two rigid molecules, one semi-flexible molecule and a 1:1 salt, matched the criteria for the targets from CSP2007, while the last two targets belonged to two new challenging categories: a larger, much more flexible molecule and a hydrate with more than one polymorph. Each group submitted three predictions for each target it attempted. There was at least one successful prediction for each target, and two groups were able to successfully predict the structure of the large flexible molecule as their first place submission. The results show that while not as many groups successfully predicted the structures of the three smallest molecules as in CSP2007, there is now evidence that methodologies such as dispersion-corrected density functional theory (DFT-D) are able to reliably do so. The results also highlight the many challenges posed by more complex systems and show that there are still issues to be overcome.

    Quantum-mechanics-derived 13Cα chemical shift server (CheShift) for protein structure validation

    A server (CheShift) has been developed to predict 13Cα chemical shifts of protein structures. It is based on the generation of 696,916 conformations as a function of the φ, ψ, ω, χ1 and χ2 torsional angles for all 20 naturally occurring amino acids. Their 13Cα chemical shifts were computed at the DFT level of theory with a small basis set and extrapolated, with an empirically determined linear regression formula, to reproduce the values obtained with a larger basis set. Analysis of the accuracy and sensitivity of the CheShift predictions, in terms of both the correlation coefficient R and the conformational-averaged rmsd between the observed and predicted 13Cα chemical shifts, was carried out for 3 sets of conformations: (i) 36 X-ray-derived protein structures solved at 2.3 Å or better resolution, for which sets of 13Cα chemical shifts were available; (ii) 15 pairs of X-ray and NMR-derived sets of protein conformations; and (iii) a set of decoys for 3 proteins showing an rmsd with respect to the X-ray structure from which they were derived of up to 3 Å. Comparative analysis carried out with 4 popular servers, namely SHIFTS, SHIFTX, SPARTA, and PROSHIFT, for these 3 sets of conformations demonstrated that CheShift is the most sensitive server with which to detect subtle differences between protein models and, hence, to validate protein structures determined by either X-ray or NMR methods, if the observed 13Cα chemical shifts are available. CheShift is available as a web server.
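The two agreement measures named in this abstract, the correlation coefficient R and the conformationally averaged rmsd, can be illustrated with a minimal sketch. This is not CheShift's code: the function names and the shift values below are hypothetical, and the averaging is assumed to be a plain unweighted mean over conformers.

```python
import math


def conformer_average(shifts_per_conformer):
    """Average predicted shifts residue by residue over a set of conformers.

    Assumption: an unweighted mean; the abstract does not state the weighting.
    """
    n_conf = len(shifts_per_conformer)
    return [sum(column) / n_conf for column in zip(*shifts_per_conformer)]


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmsd(observed, predicted):
    """Root-mean-square deviation between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Made-up 13C-alpha shifts (ppm) for four residues and two conformers.
observed = [56.0, 58.5, 52.0, 61.0]
predicted_by_conformer = [
    [55.0, 59.0, 53.0, 60.0],
    [56.0, 58.0, 52.0, 62.0],
]
averaged = conformer_average(predicted_by_conformer)
print("R =", pearson_r(observed, averaged))
print("rmsd (ppm) =", rmsd(observed, averaged))
```

Averaging the predictions before comparing them with experiment reflects the idea that an observed shift corresponds to an ensemble rather than a single conformation.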

    Identifying native-like protein structures with scoring functions based on all-atom ECEPP force fields, implicit solvent models and structure relaxation

    Availability of energy functions which can discriminate native-like from non-native protein conformations is crucial for theoretical protein structure prediction and refinement of low-resolution protein models. This article reports the results of benchmark tests for scoring functions based on two all-atom ECEPP force fields, that is, ECEPP/3 and ECEPP05, and two implicit solvent models for a large set of protein decoys. The following three scoring functions are considered: (i) ECEPP05 plus a solvent-accessible surface area model with the parameters optimized with a set of protein decoys (ECEPP05/SA); (ii) ECEPP/3 plus the solvent-accessible surface area model of Ooi et al. (Proc Natl Acad Sci USA 1987;84:3086–3090) (ECEPP3/OONS); and (iii) ECEPP05 plus an implicit solvent model based on a solution of the Poisson equation with an optimized Fast Adaptive Multigrid Boundary Element (FAMBEpH) method (ECEPP05/FAMBEpH). Short Monte Carlo-with-Minimization (MCM) simulations, following local energy minimization, are used as a scoring method with ECEPP05/SA and ECEPP3/OONS potentials, whereas energy calculation is used with ECEPP05/FAMBEpH. The performance of each scoring function is evaluated by examining its ability to distinguish between native-like and non-native protein structures. The results of the tests show that the new ECEPP05/SA scoring function represents a significant improvement over the earlier ECEPP3/OONS version of the force field. Thus, it is able to rank native-like structures with Cα root-mean-square deviations below 3.5 Å as lowest-energy conformations for 76% and within the top 10 for 87% of the proteins tested, compared with 69 and 80%, respectively, for ECEPP3/OONS. The use of the FAMBEpH solvation model, which provides a more accurate description of the protein-solvent interactions, improves the discriminative ability of the scoring function to 89%. All failed tests in which the native-like structures cannot be discriminated as those with low energy are due to omission of protein–protein interactions. The results of this study represent a benchmark in force-field development, and may be useful for evaluation of the performance of different force fields. Proteins 2009.
    Affiliations: Arnautova, Yelena A. (Cornell University, United States); Vorobjev, Yury N. (Russian Academy of Science, Russia); Vila, Jorge Alberto (Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico Conicet - San Luis, Instituto de Matemática Aplicada de San Luis "Prof. Ezio Marchi"; Universidad Nacional de San Luis, Facultad de Ciencias Físico, Matemáticas y Naturales, Instituto de Matemática Aplicada de San Luis "Prof. Ezio Marchi", Argentina); Scheraga, Harold A. (Cornell University, United States)
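The success measures quoted in this abstract (a native-like structure ranked lowest in energy, or within the top 10) can be sketched as follows. This is a hedged illustration only: the data layout, function names, and decoy values are hypothetical, and only the 3.5 Å cutoff and the top-10 threshold come from the abstract.

```python
def native_like_rank(decoys, rmsd_cutoff=3.5):
    """Return the 1-based energy rank of the best-ranked native-like decoy.

    decoys: list of (energy, ca_rmsd) pairs for one protein (hypothetical layout).
    A decoy counts as native-like when its C-alpha RMSD is below rmsd_cutoff.
    Returns None when the set contains no native-like decoy.
    """
    ordered = sorted(decoys, key=lambda d: d[0])  # lowest energy first
    for rank, (_energy, ca_rmsd) in enumerate(ordered, start=1):
        if ca_rmsd < rmsd_cutoff:
            return rank
    return None


def success_rates(decoy_sets, top_n=10, rmsd_cutoff=3.5):
    """Fraction of proteins with a native-like decoy at rank 1 and within top_n."""
    ranks = [native_like_rank(d, rmsd_cutoff) for d in decoy_sets.values()]
    total = len(ranks)
    top1 = sum(1 for r in ranks if r == 1) / total
    topn = sum(1 for r in ranks if r is not None and r <= top_n) / total
    return top1, topn


# Made-up decoy sets: (energy in arbitrary units, C-alpha RMSD in angstroms).
decoy_sets = {
    "protA": [(-120.0, 2.1), (-110.0, 6.0)],
    "protB": [(-95.0, 7.2), (-90.0, 3.0), (-80.0, 1.5)],
    "protC": [(-50.0, 8.0), (-40.0, 9.0)],
}
top1, top10 = success_rates(decoy_sets)
print(f"lowest-energy success: {top1:.2f}, top-10 success: {top10:.2f}")
```

Ranking by the first native-like decoy, rather than by the single lowest-RMSD decoy, matches the abstract's wording that any structure under the cutoff counts as a success.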

    All-Atom Internal Coordinate Mechanics (ICM) Force Field for Hexopyranoses and Glycoproteins

    We present an extension of the all-atom internal-coordinate force field, ICMFF, that allows for simulation of heterogeneous systems including hexopyranose saccharides and glycan chains in addition to proteins. A library of standard glycan geometries containing α- and β-anomers of the most common hexopyranoses, i.e., d-galactose, d-glucose, d-mannose, d-xylose, l-fucose, N-acetylglucosamine, N-acetylgalactosamine, sialic, and glucuronic acids, is created based on the analysis of the saccharide structures reported in the Cambridge Structural Database. The new force field parameters include molecular electrostatic potential-derived partial atomic charges and the torsional parameters derived from quantum mechanical data for a collection of minimal molecular fragments and related molecules. The ϕ/ψ torsional parameters for different types of glycosidic linkages are developed using model compounds containing the key atoms in the full carbohydrates, i.e., glycosidic-linked tetrahydropyran–cyclohexane dimers. Target data for parameter optimization include two-dimensional energy surfaces corresponding to the ϕ/ψ glycosidic dihedral angles in the disaccharide analogues, as determined by quantum mechanical MP2/6-31G** single-point energies on HF/6-31G** optimized structures. To achieve better agreement with the observed geometries of glycosidic linkages, the bond angles at the O-linkage atoms are added to the internal variable set and the corresponding bond bending energy term is parametrized using quantum mechanical data. The resulting force field is validated on glycan chains of 1–12 residues from a set of high-resolution X-ray glycoprotein structures based on heavy atom root-mean-square deviations of the lowest-energy glycan conformations generated by the biased probability Monte Carlo (BPMC) molecular mechanics simulations from the native structures. The appropriate BPMC distributions for monosaccharide–monosaccharide and protein–glycan linkages are derived from the extensive analysis of conformational properties of glycoprotein structures reported in the Protein Data Bank. Use of the BPMC search leads to significant improvements in sampling efficiency for glycan simulations. Moreover, good agreement with the X-ray glycoprotein structures is achieved for all glycan chain lengths. Thus, average/median RMSDs are 0.81/0.68 Å for one-residue glycans and 1.32/1.47 Å for three-residue glycans. RMSD from the native structure for the lowest-energy conformation of the 12-residue glycan chain (PDB ID 3og2) is 1.53 Å. Additionally, results obtained for free short oligosaccharides using the new force field are in line with the available experimental data, i.e., the most populated conformations in solution are predicted to be the lowest energy ones. The newly developed parameters allow for the accurate modeling of linear and branched hexopyranose glycosides in heterogeneous systems.
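The validation metric described in this abstract, a heavy-atom RMSD from the native structure summarized as an average and a median over many glycans, can be sketched in a few lines. The sketch assumes that model and native coordinates list the same heavy atoms in the same order and already share a reference frame (no superposition is performed here); the function names and coordinates are hypothetical.

```python
import math
import statistics


def heavy_atom_rmsd(model, native):
    """RMSD in angstroms between two matched lists of (x, y, z) heavy-atom coordinates.

    Assumption: atoms are paired by index and both structures are in the same frame.
    """
    if len(model) != len(native):
        raise ValueError("model and native must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom_m, atom_n in zip(model, native)
        for a, b in zip(atom_m, atom_n)
    )
    return math.sqrt(squared / len(model))


def summarize(rmsds):
    """Return (average, median) of a collection of RMSD values."""
    return statistics.mean(rmsds), statistics.median(rmsds)


# Made-up coordinates: the model is the native structure shifted by 1 angstrom along z.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
model = [(x, y, z + 1.0) for x, y, z in native]
print("RMSD:", heavy_atom_rmsd(model, native))

# Made-up per-glycan RMSDs, summarized the way the abstract reports them.
average, median = summarize([0.5, 1.0, 3.0])
print("average/median:", average, median)
```

Reporting both the average and the median is useful because a few poorly predicted glycans can inflate the average while leaving the median largely unchanged.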

    Protein-RNA Docking Using ICM

    Protein-RNA interactions play an important role in many biological processes. Computational methods such as docking have been developed to complement existing biophysical and structural biology techniques. Computational prediction of protein-RNA complex structures includes two steps: generating candidate structures from the individual protein and RNA parts and scoring the generated poses to pick out the correct one. In this work, we considered three recently developed data sets of protein-RNA complexes to evaluate and improve the performance of the FFT-based rigid-body docking algorithm implemented in the ICM package. An electrostatic term describing interactions between negatively charged phosphate groups and positively charged protein residues was added to the energy function used during the docking step to take into account the greater role that electrostatic interactions play in protein-RNA complexes. Next, the docking results were used to optimize a scoring function including van der Waals, electrostatic, and solvation terms. This optimization yielded a much smaller weight for the solvation term, indicating that solvation energy may be less important for the scoring of protein-RNA structures. Rescoring of the generated poses with the new scoring function led to much higher success rates, while pose clustering by contact fingerprints produced further improvements, achieving a success rate of 0.66 for the top 100 structures.

    Are accurate computations of the 13C' shielding feasible at the DFT level of theory?

    The goal of this study is twofold. First, to investigate the relative influence of the main structural factors affecting the computation of the 13C′ shielding, namely, the conformation of the residue itself and the next nearest-neighbor effects. Second, to determine whether calculation of the 13C′ shielding at the density functional level of theory (DFT), with an accuracy similar to that of the 13Cα shielding, is feasible with the existing computational resources. The DFT calculations, carried out for a large number of possible conformations of the tripeptide Ac-GXY-NMe, with different combinations of X and Y residues, enable us to conclude that the accurate computation of the 13C′ shielding for a given residue X depends on the: (i) (φ,ψ) backbone torsional angles of X; (ii) side-chain conformation of X; (iii) (φ,ψ) torsional angles of Y; and (iv) identity of residue Y. Consequently, DFT-based quantum mechanical calculations of the 13C′ shielding, with all these factors taken into account, are two orders of magnitude more CPU demanding than the computation, with similar accuracy, of the 13Cα shielding. Despite not considering the effect of the possible hydrogen bond interaction of the carbonyl oxygen, this work contributes to our general understanding of the main structural factors affecting the accurate computation of the 13C′ shielding in proteins and may spur significant progress in efforts to develop new validation methods for protein structures.
    Affiliations: Vila, Jorge Alberto (Cornell University, United States; Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico Conicet - San Luis, Instituto de Matemática Aplicada de San Luis, Argentina); Arnautova, Yelena A. (Molsoft L.L.C., United States); Martín, Osvaldo Antonio (Cornell University, United States; Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico Conicet - San Luis, Instituto de Matemática Aplicada de San Luis, Argentina); Scheraga, Harold A. (Cornell University, United States)

    What can we learn by computing 13Cα chemical shifts for X-ray protein models?

    The room-temperature X-ray structures of two proteins, solved at 1.8 and 1.9 Å resolution, are used to investigate whether a set of conformations, rather than a single X-ray structure, provides better agreement with both the X-ray data and the observed 13Cα chemical shifts in solution.