51 research outputs found

    CSpritz: accurate prediction of protein disorder segments with annotation for homology, secondary structure and linear motifs

    CSpritz is a web server for the prediction of intrinsic protein disorder. It combines the earlier Spritz predictor with two novel orthogonal systems developed by our group (Punch and ESpritz). Punch is based on sequence and structural templates trained with support vector machines. ESpritz is an efficient single-sequence method based on bidirectional recursive neural networks. Spritz was extended to filter predictions based on structural homologues. After extensive testing, predictions are combined by averaging their probabilities. The CSpritz website can process single or multiple predictions for either short or long disorder. The server provides a global output page for downloading all predictions and viewing their combined statistics. Links are provided to each individual protein, where the amino acid sequence and disorder prediction are displayed along with statistics for that protein. As a novel feature, CSpritz provides information about structural homologues as well as secondary structure and short functional linear motifs in each disordered segment. Benchmarking was performed on the very recent CASP9 data, where CSpritz would have ranked consistently well with a Sw measure of 49.27 and an AUC of 0.828. The server, together with help and methods pages including examples, is freely available at URL: http://protein.bio.unipd.it/cspritz/
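
The averaging step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each method returns one disorder probability per residue; the function names, the example values, and the 0.5 decision threshold are hypothetical and are not taken from the CSpritz server.

```python
from typing import List, Sequence


def combine_by_averaging(per_method_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-residue disorder probabilities across several predictors.

    Each inner sequence holds one probability per residue from one method
    (for example Spritz, Punch and ESpritz). Equal weighting is assumed.
    """
    if not per_method_probs:
        raise ValueError("at least one prediction is required")
    length = len(per_method_probs[0])
    if any(len(p) != length for p in per_method_probs):
        raise ValueError("all predictions must cover the same residues")
    n_methods = len(per_method_probs)
    # zip(*...) walks the predictions residue by residue.
    return [sum(column) / n_methods for column in zip(*per_method_probs)]


def call_disorder(probs: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Label a residue as disordered when its averaged probability reaches
    the threshold (0.5 is an assumed cut-off, not a documented one)."""
    return [p >= threshold for p in probs]


if __name__ == "__main__":
    spritz = [0.0, 0.5, 1.0]
    punch = [1.0, 0.5, 1.0]
    espritz = [0.5, 0.5, 1.0]
    consensus = combine_by_averaging([spritz, punch, espritz])
    print(consensus, call_disorder(consensus))
```

Plain averaging treats the three predictors as equally reliable; a weighted average would be a natural variation if per-method accuracy were known.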

    Correlated Mutations: A Hallmark of Phenotypic Amino Acid Substitutions

    Point mutations resulting in the substitution of a single amino acid can cause severe functional consequences, but can also be completely harmless. Understanding what determines the phenotypic impact is important both for planning targeted mutation experiments in the laboratory and for analyzing naturally occurring mutations found in patients. Common wisdom suggests using the extent of evolutionary conservation of a residue or a sequence motif as an indicator of its functional importance and thus vulnerability in case of mutation. In this work, we put forward the hypothesis that, in addition to conservation, co-evolution of residues in a protein influences the likelihood that a residue is functionally important and thus associated with disease. While the basic idea of a relation between co-evolution and functional sites has been explored before, we have conducted the first systematic and comprehensive analysis of point mutations causing disease in humans with respect to correlated mutations. We included 14,211 distinct positions with known disease-causing point mutations in 1,153 human proteins in our analysis. Our data show that (1) correlated positions are significantly more likely to be disease-associated than expected by chance, and that (2) this signal cannot be explained by conservation patterns of individual sequence positions. Although correlated residues have primarily been used to predict contact sites, our data are in agreement with previous observations that (3) many such correlations do not relate to physical contacts between amino acid residues. Access to our analysis results is provided at http://webclu.bio.wzw.tum.de/~pagel/supplements/correlated-positions/
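
The abstract does not say which measure of correlated mutations was used. As an illustration only, the sketch below scores two alignment columns with mutual information, one commonly used co-variation measure. The function name and the toy columns are hypothetical, and gap handling and background corrections are deliberately left out.

```python
from collections import Counter
from math import log2


def mutual_information(col_a: str, col_b: str) -> float:
    """Mutual information (in bits) between two alignment columns.

    Each string holds the residue found at one alignment position, with
    one character per aligned sequence. This is an assumed scoring
    scheme, not necessarily the one used in the study.
    """
    if len(col_a) != len(col_b) or not col_a:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    return sum(
        (count / n) * log2((count / n) / ((freq_a[a] / n) * (freq_b[b] / n)))
        for (a, b), count in freq_ab.items()
    )


if __name__ == "__main__":
    # Perfectly co-varying columns versus statistically independent ones.
    print(mutual_information("AACC", "GGTT"))
    print(mutual_information("AACC", "GTGT"))
```

A fully conserved column always scores zero under this measure, which is one reason a co-variation signal is not simply a restatement of conservation.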

    Analysing the origin of long-range interactions in proteins using lattice models

    Background: Long-range communication is very common in proteins but the physical basis of this phenomenon remains unclear. In order to gain insight into this problem, we decided to explore whether long-range interactions exist in lattice models of proteins. Lattice models of proteins have proven to capture some of the basic properties of real proteins and, thus, can be used for elucidating general principles of protein stability and folding. Results: Using a computational version of double-mutant cycle analysis, we show that long-range interactions emerge in lattice models even though they are not an input feature of them. The coupling energy of both short- and long-range pairwise interactions is found to become more positive (destabilizing) in a linear fashion with increasing 'contact-frequency', an entropic term that corresponds to the fraction of states in the conformational ensemble of the sequence in which the pair of residues is in contact. A mathematical derivation of the linear dependence of the coupling energy on 'contact-frequency' is provided. Conclusion: Our work shows how 'contact-frequency' should be taken into account in attempts to stabilize proteins by introducing (or stabilizing) contacts in the native state and/or through 'negative design' of non-native contacts.
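
The double-mutant cycle mentioned in the Results can be sketched as follows. The sketch assumes four folding free energies are already available (wild type, two single mutants, one double mutant). The sign convention used here is one common choice and may differ from the paper's; the function name and the numbers are hypothetical.

```python
def coupling_energy(g_wt: float, g_mut_i: float, g_mut_j: float, g_double: float) -> float:
    """Coupling energy of residues i and j from a double-mutant cycle.

    The effect of mutating j in the background of mutant i is compared
    with the effect of mutating j in the wild type:

        coupling = (G_ij - G_i) - (G_j - G_wt)

    A value of zero means the two mutations act additively, so the
    residues do not interact under this definition. The sign convention
    is an assumption.
    """
    return (g_double - g_mut_i) - (g_mut_j - g_wt)


if __name__ == "__main__":
    # Additive case: the two single-mutant effects (+2 and +3) simply add up.
    print(coupling_energy(-10.0, -8.0, -7.0, -5.0))
    # Non-additive case: the double mutant is 1 unit less stable than expected.
    print(coupling_energy(-10.0, -8.0, -7.0, -4.0))
```

In a lattice model, the four free energies would be computed from the model's energy function rather than measured, which is what makes the analysis a computational version of the experiment.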

    Trade-off between Positive and Negative Design of Protein Stability: From Lattice Models to Real Proteins

    Two different strategies for stabilizing proteins are (i) positive design in which the native state is stabilized and (ii) negative design in which competing non-native conformations are destabilized. Here, the circumstances under which one strategy might be favored over the other are explored in the case of lattice models of proteins and then generalized and discussed with regard to real proteins. The balance between positive and negative design of proteins is found to be determined by their average "contact-frequency", a property that corresponds to the fraction of states in the conformational ensemble of the sequence in which a pair of residues is in contact. Lattice model proteins with a high average contact-frequency are found to use negative design more than model proteins with a low average contact-frequency. A mathematical derivation of this result indicates that it is general and likely to hold also for real proteins. Comparison of the results of correlated mutation analysis for real proteins with typical contact-frequencies to those of proteins likely to have high contact-frequencies (such as disordered proteins and proteins that are dependent on chaperonins for their folding) indicates that the latter tend to have stronger interactions between residues that are not in contact in their native conformation. Hence, our work indicates that negative design is employed when insufficient stabilization is achieved via positive design owing to high contact-frequencies.
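
The definition of contact-frequency given in the abstract can be made concrete with a small sketch. It assumes a 2D square lattice, an unweighted ensemble of all self-avoiding conformations with no symmetry reduction, and a contact defined as two non-consecutive residues on adjacent lattice sites. The abstract does not specify the lattice or the ensemble, so these choices and all names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Iterator, List, Set, Tuple

Site = Tuple[int, int]
MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))


def self_avoiding_walks(n: int) -> Iterator[Tuple[Site, ...]]:
    """Yield every self-avoiding chain of n residues on a 2D square
    lattice, starting at the origin (rotations and mirrors included)."""

    def extend(path: List[Site], occupied: Set[Site]) -> Iterator[Tuple[Site, ...]]:
        if len(path) == n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                yield from extend(path, occupied)
                path.pop()
                occupied.remove(nxt)

    start = (0, 0)
    yield from extend([start], {start})


def contacts(conf: Tuple[Site, ...]) -> Set[Tuple[int, int]]:
    """Residue pairs (i, j) that are not chain neighbours but sit on
    adjacent lattice sites."""
    return {
        (i, j)
        for i, j in combinations(range(len(conf)), 2)
        if j - i > 1
        and abs(conf[i][0] - conf[j][0]) + abs(conf[i][1] - conf[j][1]) == 1
    }


def contact_frequencies(n: int) -> Dict[Tuple[int, int], float]:
    """Fraction of conformations in which each residue pair is in contact."""
    counts: Dict[Tuple[int, int], int] = {}
    total = 0
    for conf in self_avoiding_walks(n):
        total += 1
        for pair in contacts(conf):
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: count / total for pair, count in counts.items()}


if __name__ == "__main__":
    for pair, freq in sorted(contact_frequencies(6).items()):
        print(pair, round(freq, 4))
```

Exhaustive enumeration grows exponentially with chain length, so this approach is only practical for short chains; longer chains would need sampling or a restricted set of compact conformations.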

    Detecting selection for negative design in proteins through an improved model of the misfolded state

    Proteins that need to be structured in their native state must be stable both against the unfolded ensemble and against incorrectly folded (misfolded) conformations with low free energy. Positive design targets the first type of stability by strengthening native interactions. The second type of stability is achieved by destabilizing interactions that occur frequently in the misfolded ensemble, a strategy called negative design. Here, we investigate negative design adopting a statistical mechanical model of the misfolded ensemble, which improves the usual Gaussian approximation by taking into account the third moment of the energy distribution and contact correlations. Applying this model, we detect and quantify selection for negative design in most natural proteins, and we analytically design protein sequences that are stable both against unfolding and against misfolding. © 2013 Wiley Periodicals, Inc. Funding: Spanish Government (BFU2011-24595, BFU2012-40020 and CSD2006-00023); Comunidad Autonoma de Madrid; AMAROUTO; Deutsche Forschungsgemeinschaft (PO 1025/6). Peer reviewed.