55 research outputs found

    Towards the Design of Metamorphic Proteins using Ensemble-Based Energetic Information

    Miralles, Enric; Tagliabue, Benedetta. Close-up of part of the Parc de Diagonal-Mar. The park's landscape is visible, built with steel structures and ceramic trencadís elements. High-rise buildings can be seen in the background.

    Investigating Homology between Proteins using Energetic Profiles

    Accumulated experimental observations demonstrate that protein stability is often preserved upon conservative point mutation. In contrast, less is known about the effects of large sequence or structure changes on the stability of a particular fold. Almost completely unknown is the degree to which stability of different regions of a protein is generally preserved throughout evolution. In this work, these questions are addressed through thermodynamic analysis of a large representative sample of protein fold space based on remote, yet accepted, homology. More than 3,000 proteins were computationally analyzed using the structural-thermodynamic algorithm COREX/BEST. Estimated position-specific stability (i.e., local Gibbs free energy of folding) and its component enthalpy and entropy were quantitatively compared between all proteins in the sample according to all-vs.-all pairwise structural alignment. It was discovered that the local stabilities of homologous pairs were significantly more correlated than those of non-homologous pairs, indicating that local stability was indeed generally conserved throughout evolution. However, the position-specific enthalpy and entropy underlying stability were less correlated, suggesting that the overall regional stability of a protein was more important than the thermodynamic mechanism utilized to achieve that stability. Finally, two different types of statistically exceptional evolutionary structure-thermodynamic relationships were noted. First, many homologous proteins contained regions of similar thermodynamics despite localized structure change, suggesting a thermodynamic mechanism enabling evolutionary fold change. Second, some homologous proteins with extremely similar structures nonetheless exhibited different local stabilities, a phenomenon previously observed experimentally in this laboratory. 
These two observations, in conjunction with the principal conclusion that homologous proteins generally conserved local stability, may provide guidance for a future thermodynamically informed classification of protein homology.
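
The comparison of local stabilities across aligned positions can be illustrated with a minimal sketch. It assumes Pearson correlation as the statistic (the abstract does not say which measure was used), and the stability values and alignment below are invented for illustration; they are not COREX/BEST output.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def aligned_values(profile_a, profile_b, alignment):
    """Collect value pairs at structurally aligned positions.

    `alignment` is a list of (i, j) index pairs: position i of protein A
    is aligned with position j of protein B. Unaligned positions are skipped.
    """
    a = [profile_a[i] for i, _ in alignment]
    b = [profile_b[j] for _, j in alignment]
    return a, b

# Hypothetical position-specific stabilities (arbitrary units).
stab_a = [5.0, 6.5, 7.0, 3.0, 2.5, 6.0]
stab_b = [4.8, 6.0, 2.0, 7.2, 3.1, 2.4, 5.5]  # has one extra, unaligned residue

# Hypothetical pairwise structural alignment (index 2 of B is a gap in A).
alignment = [(0, 0), (1, 1), (2, 3), (3, 4), (4, 5), (5, 6)]

a, b = aligned_values(stab_a, stab_b, alignment)
r = pearson(a, b)
print(f"correlation of local stability over aligned positions: {r:.3f}")
```

In an all-vs.-all setting, one such coefficient would be computed per aligned pair, and the distributions for homologous and non-homologous pairs would then be compared.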

    Peptide Conformer Acidity Analysis of Protein Flexibility Monitored by Hydrogen Exchange

    The amide hydrogens that are exposed to solvent in the high-resolution X-ray structures of ubiquitin, FK506-binding protein, chymotrypsin inhibitor 2, and rubredoxin span a billion-fold range in hydroxide-catalyzed exchange rates which are predictable by continuum dielectric methods. To facilitate analysis of transiently accessible amides, the hydroxide-catalyzed rate constants for every backbone amide of ubiquitin were determined under near-physiological conditions. With the previously reported NMR-restrained molecular dynamics ensembles of ubiquitin (PDB codes 2NR2 and 2K39) used as representations of the Boltzmann-weighted conformational distribution, nearly all of the exchange rates for the highly exposed amides were more accurately predicted than by use of the high-resolution X-ray structure. More strikingly, predictions for the amide hydrogens of the NMR relaxation-restrained ensemble that become exposed to solvent in more than one but less than half of the 144 protein conformations in this ensemble were almost as accurate. In marked contrast, the exchange rates for many of the analogous amides in the residual dipolar coupling-restrained ubiquitin ensemble are substantially overestimated, as was particularly evident for the Ile 44 to Lys 48 segment which constitutes the primary interaction site for the proteasome targeting enzymes involved in polyubiquitylation. For both ensembles, “excited state” conformers in this active site region having markedly elevated peptide acidities are represented at a population level that is 10^2 to 10^3 above

    Automated Alphabet Reduction for Protein Datasets

    Background: We investigate automated and generic alphabet reduction techniques for protein structure prediction datasets. Reducing alphabet cardinality without losing key biochemical information opens the door to potentially faster machine learning, data mining and optimization applications in structural bioinformatics. Furthermore, reduced but informative alphabets often result in, e.g., more compact and human-friendly classification/clustering rules. In this paper we propose a robust and sophisticated alphabet reduction protocol based on mutual information and state-of-the-art optimization techniques. Results: We applied this protocol to the prediction of two protein structural features: contact number and relative solvent accessibility. For both features we generated alphabets of two, three, four and five letters. The five-letter alphabets gave prediction accuracies statistically similar to those obtained using the full amino acid alphabet. Moreover, the automatically designed alphabets outperformed other reduced alphabets, whether taken from the literature or human-designed. The differences between our alphabets and the alphabets taken from the literature were quantitatively analyzed. All of the above was performed using a primary sequence representation of proteins. As a final experiment, we extrapolated the obtained five-letter alphabet to reduce a much richer protein representation, based on evolutionary information, for the prediction of the same two features. Again, the performance gap between the full representation and the reduced representation was small, showing that the results of our automated alphabet reduction protocol, even though they were obtained using a simple representation, also capture the crucial information needed for state-of-the-art protein representations. Conclusion: Our automated alphabet reduction protocol generates competent reduced alphabets tailored specifically for a variety of protein datasets. This process is done without any domain knowledge, using information theory metrics instead. The reduced alphabets contain some unexpected (but sound) groups of amino acids, thus suggesting new ways of interpreting the data.
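
A minimal sketch of the mutual-information criterion mentioned in the abstract: it scores how much information a candidate reduced alphabet retains about a structural label. The grouping, residues, and labels below are invented for illustration, and the sketch only evaluates one fixed grouping; the optimization step that searches over groupings in the protocol is not reproduced here.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits between two equal-length label sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed (x, y) pairs of p(x,y) * log2( p(x,y) / (p(x) p(y)) ).
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())

def reduce_alphabet(seq, groups):
    """Map each residue to the index of the group that contains it."""
    lookup = {aa: i for i, group in enumerate(groups) for aa in group}
    return [lookup[aa] for aa in seq]

# Toy data (hypothetical): residues and a binary buried (1) / exposed (0) label.
residues = list("AVLIKDEKAVLDEK")
labels = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0]

# Hypothetical two-letter alphabet: hydrophobic vs. charged residues.
groups = ["AVLI", "KDE"]
reduced = reduce_alphabet(residues, groups)

mi_full = mutual_information(residues, labels)
mi_reduced = mutual_information(reduced, labels)
print(f"MI with full alphabet:    {mi_full:.3f} bits")
print(f"MI with reduced alphabet: {mi_reduced:.3f} bits")
```

In this toy case the label depends only on the hydrophobic/charged split, so the two-letter alphabet loses no information about it. On real data a reduction normally loses some, and a search over groupings would aim to keep that loss small.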

    Nature of protein family signatures: Insights from singular value analysis of position-specific scoring matrices

    Position-specific scoring matrices (PSSMs) are useful for detecting weak homology in protein sequence analysis, and they are thought to contain some essential signatures of the protein families. In order to elucidate what kind of ingredients constitute such family-specific signatures, we apply singular value decomposition to a set of PSSMs and examine the properties of dominant right and left singular vectors. The first right singular vectors were correlated with various amino acid indices including relative mutability, amino acid composition in protein interior, hydropathy, or turn propensity, depending on the protein. A significant correlation between the first left singular vector and a measure of site conservation was observed. It is shown that the contribution of the first singular component to the PSSMs acts to disfavor potentially but falsely functionally important residues at conserved sites. The second right singular vectors were highly correlated with hydrophobicity scales, and the corresponding left singular vectors with contact numbers of protein structures. It is suggested that sequence alignment with a PSSM is essentially equivalent to threading supplemented with functional information. The presented method may be used to separate functionally important sites from structurally important ones, and thus it may be a useful tool for predicting protein functions. Comment: 22 pages, 7 figures, 4 tables

    A horizontal alignment tool for numerical trend discovery in sequence data: application to protein hydropathy.

    PMC3794901. An algorithm is presented that returns the optimal pairwise gapped alignment of two sets of signed numerical sequence values. One distinguishing feature of this algorithm is a flexible comparison engine (based on both relative shape and absolute similarity measures) that does not rely on explicit gap penalties. Additionally, an empirical probability model is developed to estimate the significance of the returned alignment with respect to randomized data. The algorithm's utility for biological hypothesis formulation is demonstrated with test cases including database search and pairwise alignment of protein hydropathy. However, the algorithm and probability model could possibly be extended to accommodate other diverse types of protein or nucleic acid data, including positional thermodynamic stability and mRNA translation efficiency. The algorithm requires only numerical values as input and will readily compare data other than protein hydropathy. The tool is therefore expected to complement, rather than replace, existing sequence and structure based tools and may inform medical discovery, as exemplified by proposed similarity between a chlamydial ORFan protein and bacterial colicin pore-forming domain. The source code, documentation, and a basic web-server application are available. JH Libraries Open Access Fun