    Saturating representation of loop conformational fragments in structure databanks

    BACKGROUND: Short fragments of proteins are fundamental starting points in various structure prediction applications, such as fragment-based loop modeling methods and various full-structure build-up procedures. The applicability and performance of these approaches depend on the availability of short fragments in structure databanks. RESULTS: We studied the representation of protein loop fragments up to 14 residues in length. All possible query fragments found in sequence databases (Sequence Space) were clustered and cross-referenced with available structural fragments in the Protein Data Bank (Structure Space). We found that the expansion of the PDB in the last few years resulted in a dense coverage of loop conformational fragments. For each loop of length 8 in the current Sequence Space there is at least one loop in Structure Space with 50% or higher sequence identity. By correlating sequence and structure clusters of loops we found that a 50% sequence identity generally guarantees structural similarity. The coverage at the 50% sequence identity cutoff drops to 96, 94, 68, 53, 33 and 13% for loops of length 9, 10, 11, 12, 13 and 14, respectively. There is not a single loop in the current Sequence Space at any length up to 14 residues that is not matched with a conformational segment that shares at least 20% sequence identity. This minimum observed identity is 40% for loops of 12 residues or shorter and is as high as 50% for loops of 10 residues or shorter. We also assessed the impact of rapidly growing sequence databanks on the estimated number of new loop conformations and found that, while the number of unique sequence segments increased about sixfold during the last five years, there are almost no unique conformational segments among these fragments of up to 12 residues. CONCLUSION: The results suggest that fragment-based prediction approaches are no longer limited by the completeness of fragments in databanks but rather by the effectiveness of the scoring and search algorithms used to locate them. The current favorable coverage and the trends observed will be further accentuated by the progress of the Protein Structure Initiative, which targets new protein folds and ultimately aims at providing an exhaustive coverage of the structure space.
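
The coverage figures quoted in this abstract can be illustrated with a minimal sketch. It assumes ungapped, position-by-position identity between loops of equal length, which may differ from the clustering procedure the authors actually used; the function names and the toy sequences are hypothetical.

```python
from typing import Iterable, List


def percent_identity(a: str, b: str) -> float:
    """Ungapped identity between two equal-length loop sequences, in percent."""
    if len(a) != len(b) or not a:
        raise ValueError("loops must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_identity(query: str, structure_loops: Iterable[str]) -> float:
    """Highest identity between a query loop and any same-length structural loop."""
    scores = [percent_identity(query, s) for s in structure_loops if len(s) == len(query)]
    return max(scores, default=0.0)


def coverage(queries: Iterable[str], structure_loops: List[str], cutoff: float = 50.0) -> float:
    """Percentage of query loops with at least one structural match at or above the cutoff."""
    queries = list(queries)
    if not queries:
        return 0.0
    covered = sum(best_identity(q, structure_loops) >= cutoff for q in queries)
    return 100.0 * covered / len(queries)


if __name__ == "__main__":
    # Toy data only; not taken from the study.
    sequence_space = ["GSGDGSGK", "PPPPPPPP"]
    structure_space = ["GSGDAAAA"]
    print(coverage(sequence_space, structure_space, cutoff=50.0))
```

In this toy case one of the two query loops shares half of its positions with the single structural loop, so it counts as covered at the 50% cutoff and the other does not.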

    A word of caution about biological inference - Revisiting cysteine covalent state predictions

    The success of methods for predicting the redox state of cysteine residues from the sequence environment seemed to validate the basic assumption that this state is mainly determined locally. However, the accuracy of predictions on randomized sequences or for non-cysteine residues remained high, suggesting that these predictions instead capture global features of proteins such as subcellular localization, which depends on composition. This illustrates that even high prediction accuracy is insufficient to validate implicit assumptions about a biological phenomenon. Correctly identifying the relevant underlying biochemical reasons for the success of a method is essential to gain proper biological insights and develop more accurate and novel bioinformatics tools. © 2014 The Authors. Published by Elsevier B.V. on behalf of the Federation of European Biochemical Societies. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/).
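
The randomized-sequence control mentioned in this abstract can be sketched as follows. This is an illustrative assumption about how such a control might be built, not the authors' protocol: shuffling preserves the amino acid composition while destroying the local sequence environment, so a predictor that stays accurate on shuffled input is likely reading global composition. The function name and the example sequence are hypothetical.

```python
import random
from collections import Counter


def shuffled_control(sequence: str, rng: random.Random) -> str:
    """Return a composition-preserving random permutation of the sequence."""
    residues = list(sequence)
    rng.shuffle(residues)  # in-place shuffle; local order is destroyed
    return "".join(residues)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the control is reproducible
    original = "MKCLLVACGSTPCEE"  # made-up sequence for illustration
    control = shuffled_control(original, rng)
    # Composition is identical even though residue order differs.
    print(Counter(original) == Counter(control))
```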

    Impairment of a model peptide by oxidative stress: Thermodynamic stabilities of asparagine diamide C(alpha)-radical foldamers

    Electron structure calculations on N-acetyl asparagine N-methylamide were performed to identify the global minimum from which radicals were formed after H-abstraction by the OH radical. It was found that the radical generated by breaking the C–H bond of the alpha-carbon was thermodynamically the most stable one in the gas- and aqueous phases. The extended ((beta)L and (beta)D) backbone conformations are the most stable, but syn–syn or inverse gamma-turn ((gamma)L) and gamma-turn ((gamma)D) have substantial stability too. The highest energy conformers are the degenerate eL and eD foldamers. Clearly, the most stable beta foldamer is the most likely intermediate for racemization

    Catalytic mechanism of alpha-phosphate attack in dUTPase is revealed by X-ray crystallographic snapshots of distinct intermediates, 31P-NMR spectroscopy and reaction path modelling.

    Enzymatic synthesis and hydrolysis of nucleoside phosphate compounds play a key role in various biological pathways, like signal transduction, DNA synthesis and metabolism. Although these processes have been studied extensively, numerous key issues regarding the chemical pathway and atomic movements remain open for many enzymatic reactions. Here, using the Mason-Pfizer monkey retrovirus dUTPase, we study the dUTPase-catalyzed hydrolysis of dUTP, an incorrect DNA building block, to elaborate the mechanistic details at high resolution. Combining mass spectrometry analysis of the dUTPase-catalyzed reaction carried out in and quantum mechanics/molecular mechanics (QM/MM) simulation, we show that the nucleophilic attack occurs at the alpha-phosphate site. Phosphorus-31 NMR spectroscopy (31P-NMR) analysis confirms the site of attack and shows the capability of dUTPase to cleave the dUTP analogue alpha,beta-imido-dUTP, containing the imido linkage usually regarded to be non-hydrolyzable. We present numerous X-ray crystal structures of distinct dUTPase and nucleoside phosphate complexes, which report on the progress of the chemical reaction along the reaction coordinate. The presently used combination of diverse structural methods reveals details of the nucleophilic attack and identifies a novel enzyme-product complex structure

    Homology modelling of protein structures


    A supersecondary structure library and search algorithm for modeling loops in protein structures

    We present a fragment-search based method for predicting loop conformations in protein models. A hierarchical and multidimensional database has been set up that currently classifies 105 950 loop fragments and loop flanking secondary structures. Besides the length of the loops and the types of bracing secondary structures, the database is organized along four internal coordinates, a distance and three types of angles, characterizing the geometry of stem regions. Candidate fragments are selected from this library by matching the length and the types of bracing secondary structures of the query and by satisfying the geometrical restraints of the stems. They are subsequently inserted in the query protein framework, where their fit is assessed by the root mean square deviation (r.m.s.d.) of stem regions and by the number of rigid body clashes with the environment. In the final step the remaining candidate loops are ranked by a Z-score that combines information on sequence similarity and the fit of predicted and observed ϕ/ψ main chain dihedral angle propensities. Confidence Z-score cut-offs were determined for each loop length to identify those predicted fragments that outperform a competitive ab initio method. A web server implements the method, regularly updates the fragment library and performs prediction. Predicted segments are returned or, optionally, completed with side chain reconstruction and subsequently annealed in the environment of the query protein by conjugate gradient minimization. The prediction method was tested on artificially prepared search datasets where all trivial sequence similarities on the SCOP superfamily level were removed. Under these conditions it is possible to predict loops of length 4, 8 and 12 with coverage of 98, 78 and 28% and with an accuracy of at least 0.22, 1.38 and 2.47 Å r.m.s.d., respectively. In a head-to-head comparison on loops extracted from freshly deposited new protein folds, the current method outperformed an earlier database search method by a ratio of ∼5:1.
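
The final ranking step described in this abstract can be sketched roughly as below. The abstract does not give the exact form of the Z-score, so this sketch assumes that each term (sequence similarity and ϕ/ψ propensity fit) is standardized across the candidate set and the two Z-scores are summed with equal weight. The `Candidate` fields, score scales and weighting are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List, Tuple


@dataclass
class Candidate:
    name: str
    seq_similarity: float  # higher is better (hypothetical scale)
    dihedral_fit: float    # higher is better (hypothetical scale)


def z_scores(values: List[float]) -> List[float]:
    """Standardize values against the mean and spread of the candidate set."""
    if not values:
        return []
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)  # all candidates identical on this term
    return [(v - mu) / sigma for v in values]


def rank_candidates(cands: List[Candidate]) -> List[Tuple[str, float]]:
    """Rank candidates by the sum of their two Z-scores, best first."""
    zs = z_scores([c.seq_similarity for c in cands])
    zd = z_scores([c.dihedral_fit for c in cands])
    scored = [(c.name, s + d) for c, s, d in zip(cands, zs, zd)]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy candidates only; not taken from the loop library.
    loops = [Candidate("A", 0.9, 0.8), Candidate("B", 0.5, 0.4), Candidate("C", 0.1, 0.0)]
    for name, score in rank_candidates(loops):
        print(name, round(score, 3))
```

A per-length confidence cut-off, as mentioned in the abstract, would then be applied to the combined score; its values are not given in the text and are left out here.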

    ArchPRED: a template based loop structure prediction server

    The ArchPRED server implements a novel fragment-search based method for predicting loop conformations. The inputs to the server are the atomic coordinates of the query protein and the position of the loop. The algorithm selects candidate loop fragments from a regularly updated loop library (Search Space) by matching the length and the types of bracing secondary structures of the query and by satisfying the geometrical restraints imposed by the stem residues. Subsequently, candidate loops are inserted in the query protein framework, where their side chains are rebuilt and their fit is assessed by the root mean square deviation (r.m.s.d.) of stem regions and by the number of rigid body clashes with the environment. In the final step the remaining candidate loops are ranked by a Z-score that combines information on sequence similarity and the fit of predicted and observed ϕ/ψ main chain dihedral angle propensities. The final loop conformation is built in the protein structure and annealed in the environment using conjugate gradient minimization. The prediction method was benchmarked on artificially prepared search datasets where all trivial sequence similarities on the SCOP superfamily level were removed. Under these conditions it was possible to predict loops of length 4, 8 and 12 with coverage of 98, 78 and 28% and with an accuracy of at least 0.22, 1.38 and 2.47 Å r.m.s.d., respectively. In a head-to-head comparison on loops extracted from freshly deposited new protein folds, the current method outperformed an earlier database search method by a ratio of ∼5:1.
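
The stem-fit assessment described in this abstract relies on the r.m.s.d. between stem atoms of the query and of an inserted candidate. The sketch below shows only the r.m.s.d. formula itself and assumes that both coordinate sets are already in the same reference frame with atoms listed in matching order; no superposition is performed, and the function name, coordinates and filtering threshold are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def stem_rmsd(query_stem: Sequence[Point], candidate_stem: Sequence[Point]) -> float:
    """Root mean square deviation between paired stem atoms, in the input units."""
    if len(query_stem) != len(candidate_stem) or not query_stem:
        raise ValueError("stems must be non-empty and contain the same number of atoms")
    squared = sum(
        (qx - cx) ** 2 + (qy - cy) ** 2 + (qz - cz) ** 2
        for (qx, qy, qz), (cx, cy, cz) in zip(query_stem, candidate_stem)
    )
    return math.sqrt(squared / len(query_stem))


def filter_by_stem_fit(
    query_stem: Sequence[Point],
    candidates: List[Tuple[str, Sequence[Point]]],
    max_rmsd: float,
) -> List[str]:
    """Keep the names of candidates whose stem r.m.s.d. is within max_rmsd."""
    return [name for name, stem in candidates if stem_rmsd(query_stem, stem) <= max_rmsd]


if __name__ == "__main__":
    # Toy coordinates in Å; not taken from any structure.
    query = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
    shifted = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0)]
    print(stem_rmsd(query, shifted))
```

A uniform 1 Å shift of every stem atom gives an r.m.s.d. of 1 Å, which is a quick sanity check on the formula.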