2,460 research outputs found

    Empirical Potential Function for Simplified Protein Models: Combining Contact and Local Sequence-Structure Descriptors

    An effective potential function is critical for protein structure prediction and folding simulation. Simplified protein models, such as those requiring only Cα or backbone atoms, are attractive because they enable efficient search of the conformational space. We show that residue-specific reduced discrete-state models can represent the backbone conformations of proteins with small RMSD values. However, no potential functions exist that are designed for such simplified protein models. In this study, we develop optimal potential functions by combining contact interaction descriptors and local sequence-structure descriptors. The form of the potential function is a weighted linear sum of all descriptors, and the optimal weight coefficients are obtained through optimization using both native and decoy structures. The performance of the potential function in discriminating native protein structures from decoys is evaluated using several benchmark decoy sets. Our potential function, which requires only backbone atoms or Cα atoms, has comparable or better performance than several residue-based potential functions that require additional coordinates of side chain centers or coordinates of all side chain atoms. By reducing the residue alphabet to size 5 for the local structure-sequence relationship, the performance of the potential function can be further improved. Our results also suggest that local sequence-structure correlation may play an important role in reducing the entropic cost of protein folding.
    Comment: 20 pages, 5 figures, 4 tables. In press, Protein
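The weighted linear form described in this abstract can be illustrated with a minimal sketch. It assumes each structure has already been reduced to a fixed-length descriptor vector and that lower scores are better. The perceptron-style update with a unit margin is a stand-in chosen for illustration, since the abstract does not state which optimization method was used. The names `potential` and `fit_weights` and the toy descriptor values are hypothetical.

```python
from typing import List, Sequence


def potential(weights: Sequence[float], descriptors: Sequence[float]) -> float:
    """Weighted linear sum of descriptors (lower is assumed to be better)."""
    return sum(w * d for w, d in zip(weights, descriptors))


def fit_weights(
    native: Sequence[float],
    decoys: List[Sequence[float]],
    epochs: int = 200,
    lr: float = 0.1,
    margin: float = 1.0,
) -> List[float]:
    """Illustrative perceptron-style fit (an assumption, not the paper's method).

    Adjusts the weights until the native structure scores lower than
    every decoy by at least `margin`, or until `epochs` passes are done.
    """
    weights = [0.0] * len(native)
    for _ in range(epochs):
        updated = False
        for decoy in decoys:
            # Constraint violated: native is not below this decoy by the margin.
            if potential(weights, native) + margin > potential(weights, decoy):
                weights = [
                    w + lr * (d - n) for w, d, n in zip(weights, decoy, native)
                ]
                updated = True
        if not updated:
            break  # every decoy is separated from the native structure
    return weights


# Toy descriptor vectors (hypothetical values, two descriptors per structure).
native = [1.0, 0.0]
decoys = [[0.0, 1.0], [0.5, 2.0]]
weights = fit_weights(native, decoys)
```

With the fitted weights, `potential(weights, native)` is lower than the score of either toy decoy, which is the discrimination property the benchmark decoy sets are meant to test.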

    Correcting pervasive errors in RNA crystallography through enumerative structure prediction

    Three-dimensional RNA models fitted into crystallographic density maps exhibit pervasive conformational ambiguities, geometric errors and steric clashes. To address these problems, we present enumerative real-space refinement assisted by electron density under Rosetta (ERRASER), coupled to Python-based hierarchical environment for integrated 'xtallography' (PHENIX) diffraction-based refinement. On 24 data sets, ERRASER automatically corrects the majority of MolProbity-assessed errors, improves the average Rfree factor, resolves functionally important discrepancies in noncanonical structure and refines low-resolution models to better match higher-resolution models

    Practically Useful: What the Rosetta Protein Modeling Suite Can Do for You

    The objective of this review is to enable researchers to use the software package ROSETTA for biochemical and biomedicinal studies. We provide a brief review of the six most frequent research problems tackled with ROSETTA. For each of these six tasks, we provide a tutorial that illustrates a basic ROSETTA protocol. The ROSETTA method was originally developed for de novo protein structure prediction and is regularly one of the best performers in the community-wide biennial Critical Assessment of Structure Prediction. Predictions for protein domains with fewer than 125 amino acids regularly have a backbone root-mean-square deviation of better than 5.0 Å. More impressively, there are several cases in which ROSETTA has been used to predict structures with atomic level accuracy better than 2.5 Å. In addition to de novo structure prediction, ROSETTA also has methods for molecular docking, homology modeling, determining protein structures from sparse experimental NMR or EPR data, and protein design. ROSETTA has been used to accurately design a novel protein structure, predict the structure of protein-protein complexes, design altered specificity protein-protein and protein-DNA interactions, and stabilize proteins and protein complexes. Most recently, ROSETTA has been used to solve the X-ray crystallographic phase problem. ROSETTA is a unified software package for protein structure prediction and functional design. It has been used to predic

    A restraint molecular dynamics and simulated annealing approach for protein homology modeling utilizing mean angles

    BACKGROUND: We have developed the program PERMOL for semi-automated homology modeling of proteins. It is based on restrained molecular dynamics using a simulated annealing protocol in torsion angle space. The main restraints defining the optimal local geometry of the structure are weighted mean dihedral angles and their standard deviations, which are calculated with an algorithm described earlier by Döker et al. (1999, BBRC, 257, 348–350). The overall long-range contacts are established via a small number of distance restraints between atoms involved in hydrogen bonds and backbone atoms of conserved residues. Using the restraints generated by PERMOL, three-dimensional structures are obtained with standard molecular dynamics programs such as DYANA or CNS. RESULTS: To test this modeling approach, it was used to predict the structure of the histidine-containing phosphocarrier protein HPr from E. coli and the structure of the human peroxisome proliferator activated receptor γ (Ppar γ). The divergence between the modeled HPr and the previously determined X-ray structure was comparable to the divergence between the X-ray structure and the published NMR structure. The modeled structure of Ppar γ was also very close to the previously solved X-ray structure, with an RMSD of 0.262 nm for the backbone atoms. CONCLUSION: In summary, we present a new method for homology modeling capable of producing high-quality structure models. An advantage of the method is that it can be used in combination with incomplete NMR data to obtain reasonable structure models in accordance with the experimental data.
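Several abstracts in this listing report backbone RMSD values, some in nanometres and some in ångströms. The sketch below shows the basic RMSD formula and the unit conversion. It assumes the two coordinate sets are already superimposed and list the same atoms in the same order; the optimal-superposition step that structure comparison tools normally perform first is not included. The function name `rmsd` and the toy coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]

NM_TO_ANGSTROM = 10.0  # 1 nm = 10 Å


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root-mean-square deviation between two pre-aligned coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example: two atoms, the second displaced by 2 units along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
toy_rmsd = rmsd(model, reference)

# The backbone RMSD of 0.262 nm quoted above, expressed in ångströms.
ppar_rmsd_angstrom = 0.262 * NM_TO_ANGSTROM
```

Converting to a common unit puts the 0.262 nm figure at about 2.62 Å, which makes it directly comparable to the ångström thresholds quoted in the other entries.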

    Creation, refinement, and evaluation of conformational ensembles of proteins using the Torsional Network Model

    Máster Universitario en Bioinformática y Biología Computacional (Master's Degree in Bioinformatics and Computational Biology).
    One of the main limitations of structural bioinformatics lies in the difficulty of properly accounting for the dynamical aspects of proteins, which are often critical to their functional mechanisms. Among the tools developed to deal with this issue, the Torsional Network Model (TNM) relies on internal degrees of freedom (torsion angles of the protein backbone), and can give a description of the thermal fluctuations of a protein structure, as well as generate structural ensembles. However, the TNM is a coarse-grained model that cannot ensure that the newly created conformations are free of structural defects. Therefore, the main hypothesis of this project is that the TNM assembly process can be improved. The ability to generate high-quality structural ensembles describing the dynamical properties of a protein would indeed be highly valuable in various applications. In this thesis, we create, evaluate and refine TNM ensembles from a set of reference protein structures defined experimentally (Levin et al., 2007). An approximation used in Bastolla and Dehouck (2019) is developed: the evaluation is performed by MolProbity analysis, and the refinement is done by SIDEpro. Furthermore, a new approach is taken by refining the ensembles with Energy Minimization (EM). The results show a potential improvement of the TNM ensembles when the target RMSD is adjusted to the protein studied. They also point to an enhancement when side-chain reconstruction is used, and to its combination with Energy Minimization as a way to optimize structure quality. The pros and cons of the methodology are also discussed, because the available measures and methods are oriented toward static proteins, which makes it especially important to be aware of their limitations when they are applied to the dynamics-oriented TNM.
    Exploring further target RMSD values, adjusting them to specific protein dynamics simulations, and replicating the same pipeline on different data sets are some of the proposals for future work. Furthermore, taking into account variables such as temperature, the flexibility of the protein, and the estimated optimal RMSD would be interesting for future studies.

    Evaluation of template-based modeling in CASP13.

    Performance in the template-based modeling (TBM) category of CASP13 is assessed here, using a variety of metrics. Performance of the predictor groups that participated is ranked using the primary ranking score that was developed by the assessors for CASP12. This reveals that the best results are obtained by groups that include contact predictions or inter-residue distance predictions derived from deep multiple sequence alignments. In cases where there is a good homolog in the wwPDB (TBM-easy category), the best results are obtained by modifying a template. However, for cases with poorer homologs (TBM-hard), very good results can be obtained without using an explicit template, by deep learning algorithms trained on the wwPDB. Alternative metrics are introduced, to allow testing of aspects of structural models that are not addressed by traditional CASP metrics. These include comparisons to the main-chain and side-chain torsion angles of the target, and the utility of models for solving crystal structures by the molecular replacement method. The alternative metrics are poorly correlated with the traditional metrics, and it is proposed that modeling has reached a sufficient level of maturity that the best models should be expected to satisfy this wider range of criteria

    Improving the Physical Realism and Structural Accuracy of Protein Models by a Two-Step Atomic-Level Energy Minimization

    Most protein structural prediction algorithms assemble structures as reduced models that represent amino acids by a reduced number of atoms to speed up the conformational search. Building accurate full-atom models from these reduced models is a necessary step toward a detailed function analysis. However, it is difficult to ensure that the atomic models retain the desired global topology while maintaining a sound local atomic geometry because the reduced models often have unphysical local distortions. To address this issue, we developed a new program, called ModRefiner, to construct and refine protein structures from Cα traces based on a two-step, atomic-level energy minimization. The main-chain structures are first constructed from initial Cα traces and the side-chain rotamers are then refined together with the backbone atoms with the use of a composite physics- and knowledge-based force field. We tested the method by performing an atomic structure refinement of 261 proteins with the initial models constructed from both ab initio and template-based structure assemblies. Compared with other state-of-the-art programs, ModRefiner shows improvements in both global and local structures, which have more accurate side-chain positions, better hydrogen-bonding networks, and fewer atomic overlaps. ModRefiner is freely available at http://zhanglab.ccmb.med.umich.edu/ModRefiner

    NMR refinement of under-determined loop regions of the E200K variant of the human prion protein using database-derived distance constraints

    Computational studies conducted to facilitate the understanding of the conversion of the normal cellular prion (PrP^C) to the scrapie prion (PrP^Sc) in prion diseases are usually based on structures determined by NMR. This is mainly attributed to the difficulties involved in crystallizing the prion protein. Due to insufficient experimental restraints, a biologically critical loop region in PrP^C (residues 167-171), which is the potential binding site for the hypothesized Protein X, is under-determined in most mammalian species. In this research, we show that by adding information about distance constraints derived from a database of high-resolution protein structures, this under-determined loop and some other secondary structural elements of the E200K variant of human PrP^C can be refined into more generally realistic and acceptable structures within an ensemble, with improved quality and increased accuracy. In particular, the ensemble becomes more compact after the refinement with database-derived distance constraints, and the percentage of residues in the most favorable region of the Ramachandran diagram increases to about 90% in the refined structures from the 80 to 85% range in the previously reported structures. In NMR structures, a model with 90% or more residues lying in the most favorable regions of the Ramachandran plot is considered a good quality model. Our results not only provide a significantly improved model of structures of the human prion protein, which would facilitate insights into its conversion in the spongiform encephalopathies, but also demonstrate the strong potential for using databases of known protein structures for structure determination and refinement.

    Elucidating the effects of mutation and evolutionary divergence upon protein structure quantitative stability/flexibility relationships

    The importance of flexibility and stability on protein function has been recognized for over five decades. A protein must be flexible enough to mediate a reaction pathway, yet rigid enough to achieve high fidelity in molecular recognition. To understand these relationships, the main focus of our research has been a comparative investigation of proteins' dynamics and thermodynamics across both "depth" and "breadth". Specifically, we compare stability and flexibility properties across a set of human c-type lysozyme point mutations (depth), as well as across a set of functionally related β-lactamase protein orthologs (breadth). To accomplish these tasks we employ a Distance Constraint Model (DCM), which provides a robust statistical mechanical description of proteins and the relationships therein. The DCM is based on network rigidity, which provides a mechanical mechanism for enthalpy-entropy compensation, from which Quantitative Stability/Flexibility Relationships (QSFR) can be calculated. Our results suggest that the DCM can be used for predicting the stability of proteins with an average percent error of 4.3%. In deciphering changes in flexibility, the DCM results suggest that mutations can lead to frequent, large and long-range effects in protein dynamics. Our breadth analyses indicate that QSFR and physicochemical property characterization of orthologs in a protein family parallel evolutionary relationships. Going further, we present protocols for clustering protein structures using their QSFR properties, thus paving the way for comprehensive quantitative stability/flexibility relationship analysis across protein families and superfamilies. To summarize, the results presented in this work provide a complete description of proteins that accounts for their stability, flexibility and function.
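The "average percent error" quoted in this abstract is not defined there. One common reading, assumed here, is the mean absolute percentage error between predicted and observed values. The function name and the sample numbers below are hypothetical and are not taken from the study.

```python
from typing import Sequence


def mean_percent_error(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Assumed definition: 100 * mean(|predicted - observed| / |observed|).
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(o == 0 for o in observed):
        raise ValueError("observed values must be non-zero")
    total = sum(abs(p - o) / abs(o) for p, o in zip(predicted, observed))
    return 100.0 * total / len(observed)


# Hypothetical stability values, used only to exercise the formula.
predicted_values = [104.0, 96.0]
observed_values = [100.0, 100.0]
error_percent = mean_percent_error(predicted_values, observed_values)
```

Because the error is normalized by the observed value, this measure is undefined when an observed value is zero, which is why the sketch rejects that case explicitly.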