
    Anytime and Distributed Approaches for Graph Matching

    Due to the inherent genericity of graph-based representations, and thanks to the improvement of computer capacities, structural representations have become increasingly popular in the field of Pattern Recognition (PR). In a graph-based representation, vertices and their attributes describe objects (or parts of them), while edges represent the relationships between the objects. Representing objects by graphs turns the problem of object comparison into graph matching (GM), where correspondences between the vertices and edges of two graphs have to be found.
    In the domain of GM, Graph Edit Distance (GED) has received particular attention over the last decade because of its flexibility in matching many types of graphs. GED has been applied to a wide range of applications, from molecule recognition to image classification. Researchers have focused on approximate methods, which find suboptimal solutions that are hopefully close to the optimal ones, but the gap between optimal and suboptimal solutions has not yet been studied in depth. For that reason, this thesis focuses on exact GED algorithms. Unfortunately, exact GED methods have exponential complexity, so devising an exact GED algorithm that scales to the graphs involved in PR tasks is a great challenge. Two promising ways to cut computation time are search space pruning and distributed algorithms. To this end, we first propose a depth-first GED algorithm that requires less memory and search time. All possible solutions are evaluated without being explicitly enumerated: candidates are discarded using a strategy based on upper and lower bounds.
    To find a trade-off between speed and optimality, we describe how to convert the proposed depth-first GED method into an anytime one that is capable of delivering a first solution very quickly. Instead of providing one and only one solution (i.e., the optimal solution), the anytime method produces a sequence of improved solutions and, given enough time, eventually converges to the optimal solution. We analyze the properties of such methods for solving GM problems and assess their performance in terms of the accuracy of the provided solution compared to the optimal one or to the best one found by state-of-the-art methods.
    This thesis is also a first attempt to reduce the run time of exact GED methods through parallel and distributed computation. Two parallel and distributed GED approaches are put forward; both are based on the depth-first GED method. The search space is decomposed into smaller search trees, which are solved independently in a parallel or distributed manner.
    To benchmark the proposed GED methods, we propose not only assessing them in a classification context but also evaluating them at the graph level (i.e., evaluating their distance and matching accuracy). Because of the exponential complexity of exact GED algorithms, and in order to obtain this kind of information, we propose analyzing the behavior of the eight compared methods under time and memory constraints. In addition to the performance evaluation metrics, we propose a graph database repository dedicated to GED. In this repository, we add graph-level information to well-known, publicly used databases. The added information consists of the best edit distance found for each pair of graphs, together with the corresponding vertex-to-vertex and edge-to-edge mappings. This information helps in assessing the feasibility of exact and approximate GED methods.
    This thesis calls into question the common belief that exact error-tolerant GM methods cannot be used in real-world applications when matching large graphs, or even in a classification context. We argue and show that a new type of GM method, referred to as anytime methods, can be successful in a graph-level context as well as in a classification one. Anytime videos, pseudo-codes and the publications related to the thesis are publicly available at: http://www.rfai.li.univ-tours.fr/PagesPerso/zabuaisheh/home.html. The thesis is also publicly available at: http://www.rfai.li.univ-tours.fr/Documents/Articles_RFAI/PhD2016zeina.pd
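
The depth-first, bound-based search and its anytime variant described in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the function name `anytime_ged`, the unit edit costs, the graph encoding (a label list plus undirected edge pairs, no self-loops) and the trivial lower bound (the cost accumulated so far) are all assumptions made for illustration.

```python
import time

def anytime_ged(labels1, edges1, labels2, edges2, time_budget=None):
    """Yield (cost, mapping) pairs whose cost strictly decreases.

    Assumed cost model: 1 per vertex insertion/deletion, 1 per label
    substitution, 1 per edge insertion/deletion. mapping[i] is the vertex
    of graph 2 matched to vertex i of graph 1, or None if i is deleted.
    """
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    n1, n2 = len(labels1), len(labels2)
    best = [float("inf")]  # upper bound: cost of the best solution so far
    deadline = None if time_budget is None else time.monotonic() + time_budget

    def extend(i, mapping, used, cost):
        # Prune: costs are non-negative, so the partial cost is a lower bound.
        if cost >= best[0]:
            return
        if deadline is not None and time.monotonic() > deadline:
            return
        if i == n1:
            # Insert unmatched vertices of graph 2 and their incident edges.
            total = cost + (n2 - len(used)) + sum(1 for e in e2 if not e <= used)
            if total < best[0]:
                best[0] = total
                yield total, list(mapping)
            return
        for j in [v for v in range(n2) if v not in used] + [None]:
            step = 1 if j is None else int(labels1[i] != labels2[j])
            for k in range(i):  # edges between i and already-assigned vertices
                in_g1 = frozenset((k, i)) in e1
                in_g2 = (j is not None and mapping[k] is not None
                         and frozenset((mapping[k], j)) in e2)
                step += int(in_g1 != in_g2)
            mapping.append(j)
            if j is not None:
                used.add(j)
            yield from extend(i + 1, mapping, used, cost + step)
            mapping.pop()
            if j is not None:
                used.discard(j)

    yield from extend(0, [], set(), 0)

# Usage: each iteration delivers an improved solution; stop whenever needed.
triangle = [(0, 1), (1, 2), (0, 2)]
path = [(0, 1), (1, 2)]
for cost, mapping in anytime_ged(["a"] * 3, triangle, ["a"] * 3, path):
    print(cost, mapping)
```

If the generator is exhausted without hitting `time_budget`, the last pair it yielded is optimal under the assumed cost model; interrupting it earlier trades optimality for speed. A tighter lower bound would prune more of the search tree.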

    Convex Graph Invariant Relaxations For Graph Edit Distance

    The edit distance between two graphs is a widely used measure of similarity that evaluates the smallest number of vertex and edge deletions/insertions required to transform one graph into another. It is NP-hard to compute in general, and a large number of heuristics have been proposed for approximating this quantity. With few exceptions, these methods generally provide upper bounds on the edit distance between two graphs. In this paper, we propose a new family of computationally tractable convex relaxations for obtaining lower bounds on graph edit distance. These relaxations can be tailored to the structural properties of the particular graphs via convex graph invariants. Specific examples that we highlight in this paper include constraints on the graph spectrum as well as (tractable approximations of) the stability number and the maximum-cut values of graphs. We prove under suitable conditions that our relaxations are tight (i.e., exactly compute the graph edit distance) when one of the graphs consists of few eigenvalues. We also validate the utility of our framework on synthetic problems as well as real applications involving molecular structure comparison problems in chemistry. Comment: 27 pages, 7 figures
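
The difference between upper and lower bounds on the edit distance can be illustrated with a small Python sketch. These are not the paper's convex relaxations: both bounds below are deliberately trivial, the function names are hypothetical, and the sketch assumes unlabeled graphs with vertices numbered from 0, unit costs, and that a vertex can only be deleted once its incident edges have been deleted (so no edge is removed for free).

```python
def count_lower_bound(n1, edges1, n2, edges2):
    """Lower bound: every edit changes the vertex count or the edge count by one."""
    m1 = len({frozenset(e) for e in edges1})
    m2 = len({frozenset(e) for e in edges2})
    return abs(n1 - n2) + abs(m1 - m2)

def identity_upper_bound(n1, edges1, n2, edges2):
    """Upper bound: cost of the edit path that matches vertex i to vertex i."""
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    # Surplus vertices are inserted or deleted; edges present in only one
    # graph (symmetric difference) are inserted or deleted.
    return abs(n1 - n2) + len(e1 ^ e2)

# Triangle vs. path on three vertices: the bounds coincide, so the
# edit distance under the assumed cost model is known exactly.
triangle = [(0, 1), (1, 2), (0, 2)]
path = [(0, 1), (1, 2)]
print(count_lower_bound(3, triangle, 3, path),
      identity_upper_bound(3, triangle, 3, path))

# Two isomorphic paths with different numberings: the bounds differ,
# which shows how loose a single fixed alignment can be.
other_path = [(0, 2), (1, 2)]
print(count_lower_bound(3, path, 3, other_path),
      identity_upper_bound(3, path, 3, other_path))
```

Any concrete vertex alignment yields an upper bound in this way, which is why heuristics tend to overestimate; certifying a lower bound requires an argument that holds for every alignment.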

    Topology Discovery of Sparse Random Graphs With Few Participants

    We consider the task of topology discovery of sparse random graphs using end-to-end random measurements (e.g., delay) between a subset of nodes, referred to as the participants. The rest of the nodes are hidden, and do not provide any information for topology discovery. We consider topology discovery under two routing models: (a) the participants exchange messages along the shortest paths and obtain end-to-end measurements, and (b) additionally, the participants exchange messages along the second shortest path. For scenario (a), our proposed algorithm results in a sub-linear edit-distance guarantee using a sub-linear number of uniformly selected participants. For scenario (b), we obtain a much stronger result, and show that we can achieve consistent reconstruction when a sub-linear number of uniformly selected nodes participate. This implies that accurate discovery of sparse random graphs is tractable using an extremely small number of participants. We finally obtain a lower bound on the number of participants required by any algorithm to reconstruct the original random graph up to a given edit distance. We also demonstrate that while consistent discovery is tractable for sparse random graphs using a small number of participants, in general, there are graphs which cannot be discovered by any algorithm even with a significant number of participants, and with the availability of end-to-end information along all the paths between the participants. Comment: A shorter version appears in ACM SIGMETRICS 2011. This version is scheduled to appear in J. on Random Structures and Algorithms

    If the Current Clique Algorithms are Optimal, so is Valiant's Parser

    The CFG recognition problem is: given a context-free grammar $\mathcal{G}$ and a string $w$ of length $n$, decide if $w$ can be obtained from $\mathcal{G}$. This is the most basic parsing question and is a core computer science problem. Valiant's parser from 1975 solves the problem in $O(n^{\omega})$ time, where $\omega<2.373$ is the matrix multiplication exponent. Dozens of parsing algorithms have been proposed over the years, yet Valiant's upper bound remains unbeaten. The best combinatorial algorithms have mildly subcubic $O(n^3/\log^3{n})$ complexity. Lee (JACM'01) provided evidence that fast matrix multiplication is needed for CFG parsing, and that very efficient and practical algorithms might be hard or even impossible to obtain. Lee showed that any algorithm for a more general parsing problem with running time $O(|\mathcal{G}|\cdot n^{3-\varepsilon})$ can be converted into a surprising subcubic algorithm for Boolean Matrix Multiplication. Unfortunately, Lee's hardness result required that the grammar size be $|\mathcal{G}|=\Omega(n^6)$. Nothing was known for the more relevant case of constant size grammars. In this work, we prove that any improvement on Valiant's algorithm, even for constant size grammars, either in terms of runtime or by avoiding the inefficiencies of fast matrix multiplication, would imply a breakthrough algorithm for the $k$-Clique problem: given a graph on $n$ nodes, decide if there are $k$ that form a clique. Besides classifying the complexity of a fundamental problem, our reduction has led us to similar lower bounds for more modern and well-studied cubic time problems for which faster algorithms are highly desirable in practice: RNA Folding, a central problem in computational biology, and Dyck Language Edit Distance, answering an open question of Saha (FOCS'14).
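
To make the CFG recognition problem concrete, here is a minimal Python sketch of the classical CYK dynamic program, the cubic-time combinatorial baseline. It is not Valiant's matrix-multiplication-based parser. The function name `cyk_recognize`, the rule encoding, and the example grammar (nonempty balanced parentheses in Chomsky normal form) are assumptions chosen for illustration.

```python
def cyk_recognize(word, start, unary, binary):
    """Decide whether `word` is derivable from `start`.

    The grammar is assumed to be in Chomsky normal form:
    unary  is a set of (A, t)    for rules A -> t (t a terminal),
    binary is a set of (A, B, C) for rules A -> B C.
    Runs in O(|G| * n^3) time for a word of length n.
    """
    n = len(word)
    if n == 0:
        return False  # the empty word would need a separate S -> epsilon check
    # table[i][l] holds the nonterminals that derive word[i:i+l]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][1] = {a for a, t in unary if t == ch}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for a, b, c in binary:
                    if b in left and c in right:
                        table[i][length].add(a)
    return start in table[0][n]

# Hypothetical example grammar for nonempty balanced parentheses:
#   S -> S S | L R | L T,   T -> S R,   L -> '(',   R -> ')'
UNARY = {("L", "("), ("R", ")")}
BINARY = {("S", "S", "S"), ("S", "L", "R"), ("S", "L", "T"), ("T", "S", "R")}

for w in ["(())()", "(()", ")("]:
    print(w, cyk_recognize(w, "S", UNARY, BINARY))
```

The three nested loops over `length`, `i` and `split` are the source of the cubic running time; the abstract's result says that removing this cubic dependence by combinatorial means would imply faster $k$-Clique algorithms.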