69 research outputs found

    A global optimization algorithm for protein surface alignment

    Background: A relevant problem in drug design is the comparison and recognition of protein binding sites. Binding-site recognition is generally based on geometry, often combined with physico-chemical properties of the site, since the conformation, size and chemical composition of the protein surface are all relevant for the interaction with a specific ligand. Several matching strategies have been designed for the recognition of protein-ligand binding sites and of protein-protein interfaces, but the problem cannot be considered solved. Results: In this paper we propose a new method for local structural alignment of protein surfaces based on continuous global optimization techniques. Given the three-dimensional structures of two proteins, the method finds the isometric transformation (rotation plus translation) that best superimposes active regions of the two structures. We draw our inspiration from the well-known Iterative Closest Point (ICP) method for three-dimensional (3D) shape registration. Our main contribution is the adoption of a controlled random search as a more efficient global optimization approach, along with a new dissimilarity measure. The reported computational experience and comparisons show the viability of the proposed approach. Conclusions: Our method performs well in detecting similarity in binding sites when it in fact exists. In the future we plan to do a more comprehensive evaluation of the method by considering large datasets of non-redundant proteins and applying a clustering technique to the results of all comparisons to classify binding sites.
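The abstract above describes searching for a rotation plus translation that best superimposes two point sets. As a rough illustration only, the sketch below evaluates the classic ICP-style objective (mean squared distance from each transformed point to its nearest neighbour in the other set) for a candidate transformation. It is an assumption-laden toy: the paper's own dissimilarity measure and its controlled random search are not specified in the abstract and are not reproduced here, the rotation is restricted to the z-axis for brevity, and the function names and coordinates are hypothetical.

```python
import math

def rotate_z(p, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def apply_rigid(points, angle, shift):
    """Apply a z-axis rotation followed by a translation to every point."""
    out = []
    for p in points:
        x, y, z = rotate_z(p, angle)
        out.append((x + shift[0], y + shift[1], z + shift[2]))
    return out

def closest_point_cost(moving, fixed):
    """Mean squared distance from each moving point to its nearest fixed point."""
    total = 0.0
    for p in moving:
        total += min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in fixed)
    return total / len(moving)

# Hypothetical coordinates standing in for surface points of a binding site.
fixed = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]

# Build the second set by displacing the first with a known transformation.
true_angle, true_shift = math.pi / 2, (0.5, -0.5, 1.0)
moving = apply_rigid(fixed, true_angle, true_shift)

# The inverse of x' = R x + t is x = R^-1 x' - R^-1 t.
inverse_shift = tuple(-v for v in rotate_z(true_shift, -true_angle))

cost_identity = closest_point_cost(moving, fixed)
cost_aligned = closest_point_cost(
    apply_rigid(moving, -true_angle, inverse_shift), fixed
)
print(cost_identity, cost_aligned)
```

A global optimizer would treat the transformation parameters as the search variables and this cost (or a more refined dissimilarity) as the function to minimize; here the correct inverse transformation drives the cost to essentially zero.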

    Rank-Similarity Measures for Comparing Gene Prioritizations: A Case Study in Autism

    We discuss the challenge of comparing three gene prioritization methods: network propagation, integer linear programming rank aggregation (RA), and statistical RA. These methods are based on different biological categories and estimate disease-gene association. Previously proposed comparison schemes are based on three measures of performance: receiver operating characteristic curve, area under the curve, and median rank ratio. Although they may capture important aspects of gene prioritization performance, they may fail to capture important differences in the rankings of individual genes. We suggest that comparison schemes could be improved by also considering recently proposed measures of similarity between gene rankings. We tested this suggestion on comparison schemes for prioritizations of genes associated with autism that were obtained using brain- and tissue-specific data. Our results show the effectiveness of our measures of similarity in clustering brain regions based on their relevance to autism.
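To make the idea of a similarity measure between rankings concrete, here is a minimal sketch using the classic Kendall tau distance, which counts the pairs of items that two rankings order differently. This is a stand-in chosen for illustration: the abstract does not name the "recently proposed" measures it uses, and the gene labels below are hypothetical placeholders.

```python
from itertools import combinations

def kendall_tau_distance(rank_a, rank_b):
    """Count item pairs that the two rankings place in opposite order."""
    pos_a = {g: i for i, g in enumerate(rank_a)}
    pos_b = {g: i for i, g in enumerate(rank_b)}
    if pos_a.keys() != pos_b.keys():
        raise ValueError("both rankings must contain the same items")
    discordant = 0
    for g, h in combinations(rank_a, 2):
        # A negative product means the pair is ordered differently.
        if (pos_a[g] - pos_a[h]) * (pos_b[g] - pos_b[h]) < 0:
            discordant += 1
    return discordant

def normalized_distance(rank_a, rank_b):
    """Scale the distance to [0, 1]; 1 means fully reversed rankings."""
    n = len(rank_a)
    return kendall_tau_distance(rank_a, rank_b) / (n * (n - 1) / 2)

# Hypothetical prioritizations of the same four genes by two methods.
method_1 = ["geneA", "geneB", "geneC", "geneD"]
method_2 = ["geneB", "geneA", "geneC", "geneD"]

one_swap = kendall_tau_distance(method_1, method_2)
reversed_norm = normalized_distance(method_1, method_1[::-1])
print(one_swap, reversed_norm)
```

Unlike a single summary score such as area under the curve, a pairwise measure of this kind is sensitive to where individual genes move between two prioritizations.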

    Permutation Classifier

    We consider permutations of a given set of n different symbols. We are given two unordered training sets, T1 and T2, of such permutations that are each assumed to contain examples of permutations of the corresponding type, t1 and t2. Our goal is to train a classifier, C(q), by computing a statistical model from T1 and T2, which, when given a candidate permutation, q, decides whether q is of type t1 or t2. We discuss two versions of this problem. The ranking version focuses on the order of the symbols. Our Separation Average Distance Matrix (SADiM) solution expands on previously proposed ranking aggregation formulations. The grouping version focuses on contiguity of symbols and hierarchical grouping. We propose and compare two solutions: (1) The Population Augmentation Ratio (PAR) solution computes a PQ-tree for each training set and uses a novel measure of distance between these and q that is based on ratios of population counts (i.e., of numbers of permutations explained by specific PQ-trees). (2) The Difference of Positions (DoP) solution is computationally less expensive than PAR and is independent of the absolute population counts. Although DoP does not have the simple statistical grounding of PAR, our experiments show that it is consistently effective
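The problem setup (two training sets T1 and T2, and a classifier C(q) for a candidate permutation q) can be illustrated with a very simple baseline. The sketch below is not SADiM, PAR, or DoP, none of which is specified in enough detail in the abstract to reproduce; it assumes a plain nearest-set rule based on the Spearman footrule distance, and all names and data are hypothetical.

```python
def footrule(p, q):
    """Spearman footrule: sum of absolute position differences per symbol."""
    pos_q = {s: i for i, s in enumerate(q)}
    return sum(abs(i - pos_q[s]) for i, s in enumerate(p))

def mean_distance(q, training_set):
    """Average footrule distance from q to every permutation in the set."""
    return sum(footrule(q, p) for p in training_set) / len(training_set)

def classify(q, t1, t2):
    """Return 't1' or 't2', whichever set is closer on average (ties go to 't1')."""
    return "t1" if mean_distance(q, t1) <= mean_distance(q, t2) else "t2"

# Hypothetical training sets over the symbols a, b, c, d.
T1 = [("a", "b", "c", "d"), ("a", "c", "b", "d")]
T2 = [("d", "c", "b", "a"), ("d", "b", "c", "a")]

candidate = ("b", "a", "c", "d")
label = classify(candidate, T1, T2)
print(label)
```

This baseline only addresses the ranking version of the problem (symbol order); the grouping version, which concerns contiguity of symbols, would need a different notion of distance.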

    Analysis of interactions between ribosomal proteins and RNA structural motifs

    Background: One important goal of structural bioinformatics is to recognize and predict the interactions between protein binding sites and RNA. Recently, a comprehensive analysis of ribosomal proteins and their interactions with rRNA has been done. Interesting results emerged from the comparison of r-proteins within the small subunit in T. thermophilus and E. coli, supporting the idea of a core made by both RNA and proteins, conserved by evolution. Recent work also showed that ribosomal RNA is modularly composed. Motifs are generally single-stranded sequences of consecutive nucleotides (ssRNA) with characteristic folding. The role of these motifs in protein-RNA interactions has so far been only sparsely investigated. Results: This work explores the role of RNA structural motifs in the interaction of proteins with ribosomal RNA (rRNA). We analyze composition, local geometries and conformation of interface regions involving motifs such as tetraloops, kink turns and single extruded nucleotides. We construct an interaction map of protein binding sites that allows us to identify the common types of shared 3-D physicochemical binding patterns for tetraloops. Furthermore, we investigate the protein binding pockets that accommodate single extruded nucleotides either involved in kink-turns or in arbitrary RNA strands. This analysis reveals a new structural motif, called tripod. It corresponds to small pockets consisting of three amino acids arranged at the vertices of an almost equilateral triangle. We developed a search procedure for the recognition of tripods, based on an empirical tripod fingerprint. Conclusion: A comparative analysis with the overall RNA surface and interfaces shows that contact surfaces involving RNA motifs have distinctive features that may be useful for the recognition and prediction of interactions.

    AlignNemo: A Local Network Alignment Method to Integrate Homology and Topology

    Local network alignment is an important component of the analysis of protein-protein interaction networks that may lead to the identification of evolutionarily related complexes. We present AlignNemo, a new algorithm that, given the networks of two organisms, uncovers subnetworks of proteins that relate in biological function and topology of interactions. The discovered conserved subnetworks have a general topology and need not correspond to specific interaction patterns, so that they more closely fit the models of functional complexes proposed in the literature. The algorithm is able to handle sparse interaction data with an expansion process that at each step explores the local topology of the networks beyond the proteins directly interacting with the current solution. To assess the performance of AlignNemo, we ran a series of benchmarks using statistical measures as well as biological knowledge. Based on reference datasets of protein complexes, AlignNemo shows better performance than other methods in terms of both precision and recall. We show our solutions to be biologically sound using the concept of semantic similarity applied to Gene Ontology vocabularies. The binaries of AlignNemo and supplementary details about the algorithms and the experiments are available at: sourceforge.net/p/alignnemo

    A Unifying Framework for Systolic Designs

    Geometric Methods for Protein Structure Comparison

    In this paper, we review some of the theoretical results on the computational complexity of the algorithms designed to obtain optimal solutions to the problem of matching sets of points using specific metrics. From a theoretical point of view, the problem has been extensively studied in the area of computational geometry, where it is often formulated as the problem of finding correspondences between sets of geometric features (for instance, points or segments). From these studies it appears that, in most practical cases, exact algorithms are too time consuming to be useful. Thus, approximate algorithms are considered that are computationally practical and at the same time are guaranteed to produce solutions that are within a certain bound from optimal. Furthermore, we discuss methods for the estimation of rigid transformations under different metrics such as the Root Mean Square Deviation (RMSD) and the Hausdorff distance. Geometric indexing techniques prove their effectiveness in searching large protein databases and they are presented in detail. Finally, graph-theoretic protein modeling is reviewed as it is useful in designing algorithms for substructure identification and comparison.
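The two metrics named in the abstract can be sketched in a few lines. The example below assumes the standard textbook definitions: RMSD over point pairs with a known index-wise correspondence, and the symmetric Hausdorff distance computed by brute force. The function names and coordinates are hypothetical, and no superposition step is performed before measuring.

```python
import math

def rmsd(a, b):
    """Root mean square deviation between index-wise corresponding points."""
    if len(a) != len(b) or not a:
        raise ValueError("point lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(squared / len(a))

def directed_hausdorff(a, b):
    """Largest distance from a point of `a` to its nearest point of `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed values."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# RMSD needs a one-to-one correspondence between the two sets.
origin_pair = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
shifted_pair = [(2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
rmsd_value = rmsd(origin_pair, shifted_pair)

# The Hausdorff distance does not: the sets may differ in size.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 3.0)]
hausdorff_value = hausdorff(set_a, set_b)
print(rmsd_value, hausdorff_value)
```

The contrast is the practical point: RMSD averages over matched pairs and so presupposes a correspondence, while the Hausdorff distance is driven by the single worst-matched point and works on unmatched sets. The brute-force version here costs time proportional to the product of the set sizes.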

    Synthesizing Non-Uniform Systolic Designs

    Parallel Algorithms for Line Detection on a Mesh
