115 research outputs found

    Graph edit distance from spectral seriation

    This paper is concerned with computing graph edit distance. One of the criticisms that can be leveled at existing methods for computing graph edit distance is that they lack some of the formality and rigor of the computation of string edit distance. Hence, our aim is to convert graphs to string sequences so that string matching techniques can be used. To do this, we use a graph spectral seriation method to convert the adjacency matrix into a string or sequence order. We show how the serial ordering can be established using the leading eigenvector of the graph adjacency matrix. We pose the problem of graph matching as a maximum a posteriori probability (MAP) alignment of the seriation sequences for pairs of graphs. This treatment leads to an expression in which the edit cost is the negative logarithm of the a posteriori sequence alignment probability. We compute the edit distance by finding the sequence of string edit operations that minimizes the cost of the path traversing the edit lattice. The edit costs are determined by the components of the leading eigenvectors of the adjacency matrix and by the edge densities of the graphs being matched. We demonstrate the utility of the edit distance on a number of graph clustering problems.
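The pipeline summarized in this abstract (seriate each graph with the leading eigenvector, then compare the resulting sequences by string edit distance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: it assumes undirected, connected graphs, orders nodes simply by the magnitude of their leading-eigenvector component (which may differ from the paper's exact seriation procedure), and uses unit edit costs on node degrees in place of the MAP-derived costs. The names `seriation_order`, `degree_string`, and `edit_distance` are hypothetical.

```python
import numpy as np


def seriation_order(adj):
    """Order nodes by decreasing leading-eigenvector component (simplified seriation)."""
    adj = np.asarray(adj, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(adj)  # adj is assumed symmetric
    leading = np.abs(eigvecs[:, np.argmax(eigvals)])
    return np.argsort(-leading, kind="stable")


def degree_string(adj):
    """Turn a graph into a sequence of node degrees read in seriation order."""
    adj = np.asarray(adj)
    degrees = adj.sum(axis=1)
    return [int(degrees[i]) for i in seriation_order(adj)]


def edit_distance(a, b):
    """Levenshtein distance with unit costs (assumption: the paper uses learned costs)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution or match
        prev = curr
    return prev[-1]


# Two small example graphs: a 4-node path and a 4-node star.
path = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
distance = edit_distance(degree_string(path), degree_string(star))
```

Comparing degree sequences is only one possible choice of symbol alphabet; any node attribute read in seriation order could be substituted.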

    Toward a multilevel representation of protein molecules: comparative approaches to the aggregation/folding propensity problem

    This paper builds upon the fundamental work of Niwa et al. [34], which provides the unique possibility to analyze the relative aggregation/folding propensity of the elements of the entire Escherichia coli (E. coli) proteome in a cell-free standardized microenvironment. The hardness of the problem comes from the superposition of the driving forces of intra- and inter-molecule interactions, and it is mirrored by the evidence of a shift from folding to aggregation phenotypes caused by single-point mutations [10]. Here we apply several state-of-the-art classification methods from the field of structural pattern recognition, with the aim of comparing different representations of the same proteins gathered from the Niwa et al. database; such representations include sequences and labeled (contact) graphs enriched with chemico-physical attributes. Through this comparison, we are also able to identify some interesting general properties of proteins. Notably, (i) we suggest a threshold around 250 residues discriminating "easily foldable" from "hardly foldable" molecules, consistent with other independent experiments, and (ii) we highlight the relevance of contact graph spectra for folding behavior discrimination and characterization of the E. coli solubility data. The soundness of the experimental results presented in this paper is supported by the statistically relevant relationships discovered between the chemico-physical description of proteins and the developed substitution cost matrix used in the various discrimination systems. Comment: 17 pages, 3 figures, 46 references

    Convex Relaxations for Permutation Problems

    Seriation seeks to reconstruct a linear order between variables using unsorted, pairwise similarity information. It has direct applications in archeology and shotgun gene sequencing, for example. We write seriation as an optimization problem by proving the equivalence between the seriation and combinatorial 2-SUM problems on similarity matrices (2-SUM is a quadratic minimization problem over permutations). The seriation problem can be solved exactly by a spectral algorithm in the noiseless case, and we derive several convex relaxations for 2-SUM to improve the robustness of seriation solutions in noisy settings. These convex relaxations also allow us to impose structural constraints on the solution, and hence to solve semi-supervised seriation problems. We derive new approximation bounds for some of these relaxations and present numerical experiments on archeological data, Markov chains and DNA assembly from shotgun gene sequencing data. Comment: Final journal version, a few typos and references fixed
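The noiseless spectral step and the 2-SUM objective mentioned in this abstract can be illustrated with a short sketch. It assumes the standard spectral approach of sorting items by the Fiedler vector (the eigenvector for the second-smallest eigenvalue of the graph Laplacian of the similarity matrix) and writes 2-SUM as the sum over all pairs of similarity times squared position difference. The function names and the toy similarity matrix `n - |i - j|` are assumptions made for illustration; the convex relaxations themselves are not shown.

```python
import numpy as np


def two_sum(sim, positions):
    """2-SUM objective: sum over i, j of sim[i, j] * (pos[i] - pos[j])**2."""
    pos = np.asarray(positions, dtype=float)
    diff = pos[:, None] - pos[None, :]
    return float((np.asarray(sim, dtype=float) * diff ** 2).sum())


def spectral_seriation(sim):
    """Return item indices sorted by the Fiedler vector of the Laplacian of sim."""
    sim = np.asarray(sim, dtype=float)
    lap = np.diag(sim.sum(axis=1)) - sim
    _, eigvecs = np.linalg.eigh(lap)   # eigenvalues come back in ascending order
    return np.argsort(eigvecs[:, 1])   # column 1 is the Fiedler vector


def positions_from_order(order):
    """Invert an ordering: positions[item] is the rank of that item."""
    positions = np.empty(len(order), dtype=int)
    positions[np.asarray(order)] = np.arange(len(order))
    return positions


# Toy noiseless data: similarity decays with distance in the hidden order.
n = 5
idx = np.arange(n)
base = n - np.abs(idx[:, None] - idx[None, :])
perm = [3, 0, 4, 1, 2]                 # hidden shuffling of the items
shuffled = base[np.ix_(perm, perm)]

order = spectral_seriation(shuffled)
recovered = [perm[i] for i in order]   # hidden labels read in the recovered order
```

The recovered order is only defined up to reversal, because the sign of an eigenvector is arbitrary.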

    An introduction to spectral distances in networks (extended version)

    Many functions have recently been defined to assess the similarity among networks as tools for quantitative comparison. They stem from very different frameworks, and they are tuned for dealing with different situations. Here we give an overview of the spectral distances, highlighting their behavior in some basic cases of static and dynamic synthetic and real networks.
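As a concrete instance of a spectral distance, the sketch below compares two networks by the Euclidean distance between their sorted Laplacian eigenvalues. This is only one simple member of the family, chosen here for illustration; it is not claimed to be a specific measure analyzed in the paper. It assumes undirected graphs with the same number of nodes, and the function names are hypothetical.

```python
import numpy as np


def laplacian_spectrum(adj):
    """Eigenvalues of the combinatorial Laplacian D - A, in ascending order."""
    adj = np.asarray(adj, dtype=float)
    lap = np.diag(adj.sum(axis=1)) - adj
    return np.linalg.eigvalsh(lap)


def spectral_distance(adj_a, adj_b):
    """Euclidean distance between Laplacian spectra (equal node counts assumed)."""
    spec_a = laplacian_spectrum(adj_a)
    spec_b = laplacian_spectrum(adj_b)
    if spec_a.shape != spec_b.shape:
        raise ValueError("graphs must have the same number of nodes")
    return float(np.linalg.norm(spec_a - spec_b))


triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]       # Laplacian spectrum 0, 3, 3
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]           # Laplacian spectrum 0, 1, 3
relabeled_path = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]  # same path, nodes renamed

d_different = spectral_distance(triangle, path)
d_same = spectral_distance(path, relabeled_path)
```

Because the spectrum is invariant under node relabeling, isomorphic graphs are at distance zero; the converse does not hold, since non-isomorphic graphs can share a spectrum.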

    Distributed Graph Isomorphism using Quantum Walks

    Because graph isomorphism is an NP problem, most systems that solve it are restricted to certain classes of graphs and do not work for all types of graphs in polynomial time. We exploit two-particle quantum walks on different classes of graphs, including strongly regular graphs, which are co-spectral in nature. We simulate two-particle quantum walks on graphs using a distributed algorithm. To show the effectiveness of the technique, we apply it to large graphs derived from images using Delaunay triangulation. The results show a remarkable speedup for large data. The two-particle quantum walk is implemented with the MapReduce programming technique, which scales the computation as the cluster grows to accommodate big data. We checked the isomorphism of graphs with up to 100 vertices in polynomial time. The system is scalable and accepts big inputs from any other domain in graph format. DOI: 10.17762/ijritcc2321-8169.15021

    A hybrid approach for categorizing images based on complex networks and neural networks

    There are several methods for categorizing images, most of which are statistical, geometric, model-based or structural. In this paper, a new method for describing images based on complex network models is presented. Each image contains a number of key points that can be identified through standard edge detection algorithms. To understand each image better, we can use these points to create a graph of the image. To make the graphs easier to use, they are generated in the form of a small-world complex network. Complex network features, such as topological and dynamic features, can be used to represent image-related features. Once generated, these features are normalized and used as suitable features for categorizing images. For this purpose, the generated information is given to a neural network. Based on these features and the use of neural networks, comparisons between new images are performed. The results of the article show that this method performs well in identifying similarities between images and finally categorizing them.