86 research outputs found

    Background frequencies for residue variability estimates: BLOSUM revisited

    Background: Shannon entropy applied to columns of multiple sequence alignments as a score of residue conservation has proven one of the most fruitful ideas in bioinformatics. This straightforward and intuitively appealing measure clearly shows the regions of a protein under increased evolutionary pressure, highlighting their functional importance. The inability of the column entropy to differentiate between residue types, however, limits its resolution power. Results: In this work we suggest generalizing Shannon's expression to a function with similar mathematical properties, that, at the same time, includes observed propensities of residue types to mutate to each other. To do that, we revisit the original construction of BLOSUM matrices, and re-interpret them as mutation probability matrices. These probabilities are then used as background frequencies in the revised residue conservation measure. Conclusion: We show that joint entropy with BLOSUM-proportional probabilities as a reference distribution enables detection of protein functional sites comparable in quality to a time-costly maximum-likelihood evolution simulation method (rate4site), and offers greater resolution than the Shannon entropy alone, in particular in the cases when the available sequences are of narrow evolutionary scope.
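
    As a minimal sketch of the baseline measure discussed in this abstract, the snippet below computes the plain Shannon entropy of a single alignment column. It is not the BLOSUM-based measure proposed by the authors. The function name `column_entropy` and the toy columns are illustrative assumptions, and gap handling and sequence weighting are deliberately left out. The example shows the limitation the abstract describes: a conservative substitution (I to V) and a radical one (I to D) receive the same score.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (in bits) of the residue frequencies in one alignment column.

    Hypothetical helper for illustration; assumes a non-empty column with no gaps.
    """
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Toy columns (assumed data): fully conserved, conservative change, radical change.
conserved = column_entropy("AAAA")
conservative_sub = column_entropy("IIIV")
radical_sub = column_entropy("IIID")

# Plain entropy only sees frequencies, so both substituted columns score the same.
print(conserved, conservative_sub, radical_sub)
```

    Because the score depends only on residue frequencies, any measure that should separate the last two columns needs extra information about residue similarity, which is the role the abstract assigns to the BLOSUM-derived background probabilities.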

    pFlexAna: detecting conformational changes in remotely related proteins

    The pFlexAna (protein flexibility analyzer) web server detects and displays conformational changes in remotely related proteins, without relying on sequence homology. To do so, it first applies a reliable statistical test to align core protein fragments that are structurally similar and then clusters these aligned fragment pairs into ‘super-alignments’, according to the similarity of geometric transformations that align them. The result is that the dominant conformational changes occur between the clusters, while the smaller conformational changes occur within a cluster. pFlexAna is available at http://bigbird.comp.nus.edu.sg/pfa2/

    Activation Energy in a Quantum Hall Ferromagnet and Non-Hartree-Fock Skyrmions

    The energy of Skyrmions is calculated with the help of a technique based on the excitonic representation: the basic set of one-exciton states is used for the perturbation-theory formalism instead of the basic set of one-particle states. We use an approach in which a skyrmion-type excitation (at zero Lande factor) is considered as a smooth non-uniform rotation in the 3D spin space. The result within the framework of an excitonically diagonalized part of the Coulomb Hamiltonian can be obtained for any ratio $r_C=(e^2/\epsilon l_B)/\hbar\omega_c$ [where $e^2/\epsilon l_B$ is the typical Coulomb energy ($l_B$ being the magnetic length) and $\omega_c$ is the cyclotron frequency], and the Landau-level mixing is thereby taken into account. In parallel with this, the result is also found exactly to second order in $r_C$ (assuming $r_C$ to be small) using the total Hamiltonian. When extrapolated to the region $r_C\sim 1$, our calculations show that the skyrmion gap becomes substantially reduced in comparison with the Hartree-Fock calculations. This fact brings the theory essentially closer to the available experimental data. Comment: 14 pages, 1 figure. To appear in Phys. Rev. B, Vol. 65 (Numbers ~ 19-22), 200

    The Effects of Disorder on the $\nu=1$ Quantum Hall State

    A disorder-averaged Hartree-Fock treatment is used to compute the density of single particle states for quantum Hall systems at filling factor $\nu=1$. It is found that transport and spin polarization experiments can be simultaneously explained by a model of mostly short-range effective disorder. The slope of the transport gap (due to quasiparticles) in parallel field emerges as a result of the interplay between disorder-induced broadening and exchange, and has implications for skyrmion localization. Comment: 4 pages, 3 eps figure

    Accurate Protein Structure Annotation through Competitive Diffusion of Enzymatic Functions over a Network of Local Evolutionary Similarities

    High-throughput Structural Genomics yields many new protein structures without known molecular function. This study aims to uncover these missing annotations by globally comparing select functional residues across the structural proteome. First, Evolutionary Trace Annotation, or ETA, identifies which proteins have local evolutionary and structural features in common; next, these proteins are linked together into a proteomic network of ETA similarities; then, starting from proteins with known functions, competing functional labels diffuse link-by-link over the entire network. Every node is thus assigned a likelihood z-score for every function, and the most significant one at each node wins and defines its annotation. In high-throughput controls, this competitive diffusion process recovered enzyme activity annotations with 99% and 97% accuracy at half-coverage for the third and fourth Enzyme Commission (EC) levels, respectively. This corresponds to false positive rates 4-fold lower than nearest-neighbor and 5-fold lower than sequence-based annotations. In practice, experimental validation of the predicted carboxylesterase activity in a protein from Staphylococcus aureus illustrated the effectiveness of this approach in the context of an increasingly drug-resistant microbe. This study further links molecular function to a small number of evolutionarily important residues recognizable by Evolutionary Tracing and it points to the specificity and sensitivity of functional annotation by competitive global network diffusion. A web server is at http://mammoth.bcm.tmc.edu/networks
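
    The idea of competing labels spreading link-by-link can be sketched with a generic label-propagation loop. This is not the ETA network method or its z-score computation, only a simplified stand-in: the function `diffuse_labels`, the damping parameter `alpha`, the iteration count, and the four-node toy network are all assumptions made for illustration.

```python
def diffuse_labels(edges, seeds, n_nodes, n_iter=50, alpha=0.5):
    """Spread each seed label over an undirected graph and let the top score win.

    edges: list of (a, b) node-index pairs; seeds: {node: label} for annotated nodes.
    Hypothetical sketch: scores are damped neighbor averages, not z-scores.
    """
    neighbors = {i: [] for i in range(n_nodes)}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    labels = sorted(set(seeds.values()))
    # One score vector per label: 1.0 at nodes seeded with that label, else 0.0.
    init = {lab: [1.0 if seeds.get(i) == lab else 0.0 for i in range(n_nodes)]
            for lab in labels}
    scores = {lab: list(vec) for lab, vec in init.items()}

    for _ in range(n_iter):
        for lab in labels:
            old = scores[lab]
            new = []
            for i in range(n_nodes):
                nbrs = neighbors[i]
                avg = sum(old[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
                # Keep part of the seed signal, take the rest from the neighbors.
                new.append((1 - alpha) * init[lab][i] + alpha * avg)
            scores[lab] = new

    # Competition step: each node takes the label with the highest score.
    return {i: max(labels, key=lambda lab: scores[lab][i]) for i in range(n_nodes)}


# Toy chain 0-1-2-3 with one annotated protein at each end (assumed data).
annotation = diffuse_labels(
    edges=[(0, 1), (1, 2), (2, 3)],
    seeds={0: "esterase", 3: "kinase"},
    n_nodes=4,
)
print(annotation)
```

    In this toy chain each unlabeled node ends up with the label of the nearer seed, which is the qualitative behavior the abstract describes for the most significant function at each node.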

    Identifying allosteric fluctuation transitions between different protein conformational states as applied to Cyclin Dependent Kinase 2

    BACKGROUND: The mechanisms underlying protein function and associated conformational change are dominated by a series of local entropy fluctuations affecting the global structure yet are mediated by only a few key residues. Transitional Dynamic Analysis (TDA) is a new method to detect these changes in local protein flexibility between different conformations arising from, for example, ligand binding. Additionally, Positional Impact Vertex for Entropy Transfer (PIVET) uses TDA to identify important residue contact changes that have a large impact on global fluctuation. We demonstrate the utility of these methods for Cyclin-dependent kinase 2 (CDK2), a system with crystal structures of this protein in multiple functionally relevant conformations and experimental data revealing the importance of local fluctuation changes for protein function. RESULTS: TDA and PIVET successfully identified select residues that are responsible for conformation-specific regional fluctuation in the activation cycle of CDK2. The detected local changes in protein flexibility have been experimentally confirmed to be essential for the regulation and function of the kinase. The methodologies also highlighted possible errors in previous molecular dynamics simulations that need to be resolved in order to understand this key player in cell cycle regulation. Finally, the use of entropy compensation as a possible allosteric mechanism for protein function is reported for CDK2. CONCLUSION: The methodologies embodied in TDA and PIVET provide a quick approach to identify local fluctuation changes important for protein function and the residue contacts that contribute to these changes. Further, these approaches can be used to check for possible errors in protein dynamics simulations and have the potential to facilitate a better understanding of the contribution of entropy to protein allostery and function.

    Active site prediction using evolutionary and structural information

    Motivation: The identification of catalytic residues is a key step in understanding the function of enzymes. While a variety of computational methods have been developed for this task, accuracies have remained fairly low. The best existing method exploits information from sequence and structure to achieve a precision (the fraction of predicted catalytic residues that are truly catalytic) of 18.5% at a corresponding recall (the fraction of catalytic residues identified) of 57% on a standard benchmark. Here we present a new method, Discern, which provides a significant improvement over the state-of-the-art through the use of statistical techniques to derive a model with a small set of features that are jointly predictive of enzyme active sites.
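
    The two evaluation measures defined in this abstract can be made concrete with a short sketch. The helper name `precision_recall` and the residue numbers below are invented for illustration and are not taken from the Discern benchmark.

```python
def precision_recall(predicted: set, actual: set) -> tuple:
    """Return (precision, recall) for a set of predicted catalytic residues.

    precision = fraction of predicted residues that are truly catalytic
    recall    = fraction of truly catalytic residues that were predicted
    """
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical residue positions for a single enzyme (assumed data).
predicted_sites = {10, 25, 57, 102}
catalytic_sites = {25, 57, 140}

precision, recall = precision_recall(predicted_sites, catalytic_sites)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

    Here two of the four predictions are correct and two of the three catalytic residues are found, so the two measures differ, which is why the abstract quotes a precision at a corresponding recall.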

    Joint Evolutionary Trees: A Large-Scale Method To Predict Protein Interfaces Based on Sequence Sampling

    The Joint Evolutionary Trees (JET) method detects protein interfaces, the core residues involved in the folding process, and residues susceptible to site-directed mutagenesis and relevant to molecular recognition. The approach, based on the Evolutionary Trace (ET) method, introduces a novel way to treat evolutionary information. Families of homologous sequences are analyzed through a Gibbs-like sampling of distance trees to reduce effects of erroneous multiple alignment and impacts of weakly homologous sequences on distance tree construction. The sampling method makes sequence analysis more sensitive to functional and structural importance of individual residues by avoiding effects of the overrepresentation of highly homologous sequences and improves computational efficiency. A carefully designed clustering method is parametrized on the target structure to detect and extend patches on protein surfaces into predicted interaction sites. Clustering takes into account residues' physical-chemical properties as well as conservation. Large-scale application of JET requires the system to be adjustable for different datasets and to guarantee predictions even if the signal is low. Flexibility was achieved by a careful treatment of the number of retrieved sequences, the amino acid distance between sequences, and the selective thresholds for cluster identification. An iterative version of JET (iJET) that guarantees finding the most likely interface residues is proposed as the appropriate tool for large-scale predictions. Tests are carried out on the Huang database of 62 heterodimer, homodimer, and transient complexes and on 265 interfaces belonging to signal transduction proteins, enzymes, inhibitors, antibodies, antigens, and others. A specific set of proteins chosen for their special functional and structural properties illustrates JET behavior on a large variety of interactions covering proteins, ligands, DNA, and RNA. JET is compared at a large scale to ET and to Consurf, Rate4Site, siteFiNDER|3D, and SCORECONS on specific structures. A significant improvement in performance and computational efficiency is shown.