7,591 research outputs found

    Interaction site prediction by structural similarity to neighboring clusters in protein-protein interaction networks

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Recently, revealing the function of proteins with protein-protein interaction (PPI) networks is regarded as one of important issues in bioinformatics. With the development of experimental methods such as the yeast two-hybrid method, the data of protein interaction have been increasing extremely. Many databases dealing with these data comprehensively have been constructed and applied to analyzing PPI networks. However, few research on prediction interaction sites using both PPI networks and the 3D protein structures complementarily has explored.</p> <p>Results</p> <p>We propose a method of predicting interaction sites in proteins with unknown function by using both of PPI networks and protein structures. For a protein with unknown function as a target, several clusters are extracted from the neighboring proteins based on their structural similarity. Then, interaction sites are predicted by extracting similar sites from the group of a protein cluster and the target protein. Moreover, the proposed method can improve the prediction accuracy by introducing repetitive prediction process.</p> <p>Conclusions</p> <p>The proposed method has been applied to small scale dataset, then the effectiveness of the method has been confirmed. The challenge will now be to apply the method to large-scale datasets.</p

    Mean-Field Theory of Meta-Learning

    Full text link
    We discuss here the mean-field theory for a cellular automata model of meta-learning. The meta-learning is the process of combining outcomes of individual learning procedures in order to determine the final decision with higher accuracy than any single learning method. Our method is constructed from an ensemble of interacting, learning agents, that acquire and process incoming information using various types, or different versions of machine learning algorithms. The abstract learning space, where all agents are located, is constructed here using a fully connected model that couples all agents with random strength values. The cellular automata network simulates the higher level integration of information acquired from the independent learning trials. The final classification of incoming input data is therefore defined as the stationary state of the meta-learning system using simple majority rule, yet the minority clusters that share opposite classification outcome can be observed in the system. Therefore, the probability of selecting proper class for a given input data, can be estimated even without the prior knowledge of its affiliation. The fuzzy logic can be easily introduced into the system, even if learning agents are build from simple binary classification machine learning algorithms by calculating the percentage of agreeing agents.Comment: 23 page

    The Energy Landscape, Folding Pathways and the Kinetics of a Knotted Protein

    Get PDF
    The folding pathway and rate coefficients of the folding of a knotted protein are calculated for a potential energy function with minimal energetic frustration. A kinetic transition network is constructed using the discrete path sampling approach, and the resulting potential energy surface is visualized by constructing disconnectivity graphs. Owing to topological constraints, the low-lying portion of the landscape consists of three distinct regions, corresponding to the native knotted state and to configurations where either the N- or C-terminus is not yet folded into the knot. The fastest folding pathways from denatured states exhibit early formation of the N-terminus portion of the knot and a rate-determining step where the C-terminus is incorporated. The low-lying minima with the N-terminus knotted and the C-terminus free therefore constitute an off-pathway intermediate for this model. The insertion of both the N- and C-termini into the knot occur late in the folding process, creating large energy barriers that are the rate limiting steps in the folding process. When compared to other protein folding proteins of a similar length, this system folds over six orders of magnitude more slowly.Comment: 19 page

    Accounting for epistatic interactions improves the functional analysis of protein structures

    Get PDF
    Motivation: The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. Methods and Results: We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure that we term pair-interaction Evolutionary Trace yields greater functional site overlap and better structure-based proteome-wide functional predictions. Conclusions: Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online

    Functional enrichment analyses and construction of functional similarity networks with high confidence function prediction by PFP

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>A new paradigm of biological investigation takes advantage of technologies that produce large high throughput datasets, including genome sequences, interactions of proteins, and gene expression. The ability of biologists to analyze and interpret such data relies on functional annotation of the included proteins, but even in highly characterized organisms many proteins can lack the functional evidence necessary to infer their biological relevance.</p> <p>Results</p> <p>Here we have applied high confidence function predictions from our automated prediction system, PFP, to three genome sequences, <it>Escherichia coli</it>, <it>Saccharomyces cerevisiae</it>, and <it>Plasmodium falciparum </it>(malaria). The number of annotated genes is increased by PFP to over 90% for all of the genomes. Using the large coverage of the function annotation, we introduced the functional similarity networks which represent the functional space of the proteomes. Four different functional similarity networks are constructed for each proteome, one each by considering similarity in a single Gene Ontology (GO) category, <it>i.e. </it>Biological Process, Cellular Component, and Molecular Function, and another one by considering overall similarity with the <it>funSim </it>score. The functional similarity networks are shown to have higher modularity than the protein-protein interaction network. Moreover, the <it>funSim </it>score network is distinct from the single GO-score networks by showing a higher clustering degree exponent value and thus has a higher tendency to be hierarchical. In addition, examining function assignments to the protein-protein interaction network and local regions of genomes has identified numerous cases where subnetworks or local regions have functionally coherent proteins. These results will help interpreting interactions of proteins and gene orders in a genome. Several examples of both analyses are highlighted.</p> <p>Conclusion</p> <p>The analyses demonstrate that applying high confidence predictions from PFP can have a significant impact on a researchers' ability to interpret the immense biological data that are being generated today. The newly introduced functional similarity networks of the three organisms show different network properties as compared with the protein-protein interaction networks.</p
    corecore