18 research outputs found

    Anytime and Distributed Approaches for Graph Matching

    Due to the inherent genericity of graph-based representations, and thanks to improvements in computing capacity, structural representations have become increasingly popular in the field of Pattern Recognition (PR). In a graph-based representation, vertices and their attributes describe objects (or parts of them), while edges represent relationships between the objects. Representing objects by graphs turns the problem of object comparison into graph matching (GM), where correspondences between the vertices and edges of two graphs have to be found. In the domain of GM, over the last decade, Graph Edit Distance (GED) has received particular attention due to its flexibility in matching many types of graphs. GED has been applied to a wide range of applications, from molecule recognition to image classification. Researchers have focused on approximate methods that find suboptimal solutions, hopefully close to the optimal ones, but the gap between optimal and suboptimal solutions has not yet been studied in depth. For that reason, in this thesis, we focus on exact GED algorithms. Unfortunately, exact GED methods have exponential complexity. Thus, devising an exact GED algorithm that scales to the graphs involved in PR tasks is a great challenge. Two promising ways to cut computation time are search space pruning and distributed algorithms. To this end, we first propose a depth-first GED algorithm that requires less memory and search time. All possible solutions are evaluated without explicitly enumerating them: candidates are discarded using upper and lower bounds. To find a trade-off between speed and optimality, we describe how to convert the proposed depth-first GED method into an anytime one that is capable of delivering a first solution very quickly. 
Rather than providing one and only one solution (i.e., the optimal one), it finds a list of improved solutions and, given enough time, eventually converges to the optimal solution. To illustrate the usage of anytime GM algorithms, we convert our depth-first GED algorithm into an anytime one. We analyze the properties of such methods for solving GM problems and measure the accuracy of the solution they provide against the optimal one or the best one found by a state-of-the-art method. This thesis is also a first attempt to reduce the run time of exact GED methods using parallel and distributed approaches. Two parallel and distributed GED approaches are put forward; both are based on the depth-first GED method. The search space is decomposed into smaller search trees, which are solved independently in a parallel or distributed manner. To benchmark the proposed GED methods, we propose not only assessing GED methods in a classification context but also evaluating them in a graph-level one (i.e., evaluating their distance and matching accuracy). Due to the exponential complexity of exact GED algorithms, and in order to obtain this kind of information about the methods, we propose analyzing the behavior of the eight compared methods under time and memory constraints. In addition to the performance evaluation metrics, we propose a graph database repository dedicated to GED. In this repository, we add graph-level information to well-known and publicly used databases. The added information consists of the best edit distance found for each pair of graphs, as well as the vertex-to-vertex and edge-to-edge mappings corresponding to that distance. This information helps in assessing the feasibility of exact and approximate GED methods. 
This thesis calls into question the conventional wisdom that exact error-tolerant GM methods cannot be used in real-world applications when matching large graphs, or even in a classification context. We argue and show that a new type of GM method, referred to as anytime methods, can be successful in a graph-level context as well as a classification one. Anytime videos, pseudo-codes and the publications related to the thesis are publicly available at: http://www.rfai.li.univ-tours.fr/PagesPerso/zabuaisheh/home.html. The thesis is also publicly available at: http://www.rfai.li.univ-tours.fr/Documents/Articles_RFAI/PhD2016zeina.pd
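
The depth-first, bound-based, anytime search described in this abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: the function name `anytime_ged`, the unit edit costs, the unlabeled edges, the absence of self-loops, and the trivial lower bound (the cost of the edits already decided) are all assumptions made for illustration. The generator yields each improved solution as soon as it is found, and the last one yielded is optimal under these costs.

```python
def anytime_ged(labels1, edges1, labels2, edges2):
    """Yield (cost, mapping) pairs of strictly decreasing cost.

    mapping[i] is the g2 vertex matched to g1 vertex i, or None if i is deleted.
    Assumed costs: 1 per vertex/edge insertion or deletion, 1 per label change.
    """
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    n1, n2 = len(labels1), len(labels2)
    best = float("inf")  # upper bound: cost of the best complete solution so far

    def step_cost(mapping, i, v):
        # Cost added by deciding g1 vertex i -> v, given earlier decisions.
        cost = 1 if v is None or labels1[i] != labels2[v] else 0
        for j in range(i):
            has1 = frozenset((i, j)) in e1
            u = mapping[j]
            has2 = v is not None and u is not None and frozenset((u, v)) in e2
            cost += has1 != has2  # edge deleted or inserted
        return cost

    def completion_cost(mapping):
        # Insert every unused g2 vertex and every g2 edge touching one.
        used = {v for v in mapping if v is not None}
        return (n2 - len(used)) + sum(1 for e in e2 if not e <= used)

    def search(mapping, cost):
        nonlocal best
        i = len(mapping)
        if i == n1:
            total = cost + completion_cost(mapping)
            if total < best:
                best = total
                yield total, list(mapping)
            return
        for v in [t for t in range(n2) if t not in mapping] + [None]:
            c = cost + step_cost(mapping, i, v)
            if c < best:  # lower bound (cost so far) must beat the upper bound
                mapping.append(v)
                yield from search(mapping, c)
                mapping.pop()

    yield from search([], 0)


# Usage: stop consuming the generator whenever the time budget runs out.
solutions = list(anytime_ged(["a", "b"], [(0, 1)], ["b", "a"], [(0, 1)]))
```

A caller under a time limit would keep the most recent pair and stop iterating at the deadline; a tighter lower bound than the one assumed here would prune more of the search tree.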

    An Algorithm for Finding the Maximum Common Subgraph

    A new enumeration algorithm for the maximum common subgraph problem is proposed. We present a numerical analysis of the algorithm's performance on graphs of various classes and sizes drawn from a graph database for benchmarking algorithms that solve graph morphism problems. The algorithm's potential for solving real-world problems on graphs with up to several hundred vertices is assessed.

    Between Subgraph Isomorphism and Maximum Common Subgraph

    When a small pattern graph does not occur inside a larger target graph, we can ask how to find "as much of the pattern as possible" inside the target graph. In general, this is known as the maximum common subgraph problem, which is much more computationally challenging in practice than subgraph isomorphism. We introduce a restricted alternative, where we ask if all but k vertices from the pattern can be found in the target graph. This allows for the development of slightly weakened forms of certain invariants from subgraph isomorphism which are based upon degree and number of paths. We show that when k is small, weakening the invariants still retains much of their effectiveness. We are then able to solve this problem on the standard problem instances used to benchmark subgraph isomorphism algorithms, despite these instances being too large for current maximum common subgraph algorithms to handle. Finally, by iteratively increasing k, we obtain an algorithm which is also competitive for the maximum common subgraph.
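
The "all but k vertices" question and the iterative increase of k can be sketched with plain backtracking. This is only an illustration under stated assumptions: the names `find_all_but_k` and `largest_common` are hypothetical, the matching is non-induced, graphs are dictionaries mapping each vertex to its set of neighbours, and none of the degree- or path-based invariants from the paper are used.

```python
def find_all_but_k(pattern, target, k):
    """Map pattern vertices to distinct target vertices, skipping at most k.

    Returns a dict pattern vertex -> target vertex (None for a skipped vertex),
    or None if no such mapping exists. Every pattern edge between two matched
    vertices must map to a target edge (non-induced matching, an assumption).
    """
    # Highest-degree pattern vertices first, a common ordering heuristic.
    order = sorted(pattern, key=lambda v: -len(pattern[v]))

    def extend(i, mapping, skipped):
        if i == len(order):
            return dict(mapping)
        p = order[i]
        used = {t for t in mapping.values() if t is not None}
        for t in target:
            if t in used:
                continue
            # Already-matched neighbours of p must be adjacent to t.
            if all(mapping[q] is None or mapping[q] in target[t]
                   for q in pattern[p] if q in mapping):
                mapping[p] = t
                found = extend(i + 1, mapping, skipped)
                if found is not None:
                    return found
                del mapping[p]
        if skipped < k:  # spend one of the k allowed omissions on p
            mapping[p] = None
            found = extend(i + 1, mapping, skipped + 1)
            if found is not None:
                return found
            del mapping[p]
        return None

    return extend(0, {}, 0)


def largest_common(pattern, target):
    """Increase k until a mapping exists; k = len(pattern) always succeeds."""
    for k in range(len(pattern) + 1):
        mapping = find_all_but_k(pattern, target, k)
        if mapping is not None:
            return k, mapping


triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path = {0: {1}, 1: {0, 2}, 2: {1}}
k, mapping = largest_common(triangle, path)  # one triangle vertex is left out
```

With k = 0 the function reduces to an ordinary subgraph search, which is why small values of k stay close to the subgraph isomorphism setting.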

    Experimental Evaluation of Subgraph Isomorphism Solvers

    Subgraph Isomorphism (SI) is an NP-complete problem which is at the heart of many structural pattern recognition tasks, as it involves finding a copy of a pattern graph in a target graph. In the pattern recognition community, the most well-known SI solvers are VF2, VF3, and RI. SI is also widely studied in the constraint programming community, and many constraint-based SI solvers have been proposed since Ullmann, such as LAD and Glasgow. All these SI solvers can very quickly solve some large SI instances that involve graphs with thousands of nodes. However, McCreesh et al. have recently shown how to randomly generate SI instances whose hardness can be controlled and predicted, and they have built small instances which are computationally challenging for all solvers. They have also shown that some small instances, which are predicted to be easy and are easily solved by constraint-based solvers, appear to be challenging for VF2 and VF3. In this paper, we widen this study by considering a large test suite coming from eight benchmarks. We show that, as expected for an NP-complete problem, the solving time of an instance does not depend on its size, and that some small instances coming from real applications are not solved by any of the considered solvers. We also show that, while RI and VF3 can very quickly solve a large number of easy instances for which Glasgow or LAD need more time, they fail to solve some other instances that are quickly solved by Glasgow or LAD, and they are clearly outperformed by Glasgow on hard instances. Finally, we show that we can easily combine solvers to take advantage of their complementarity.

    Search-Tree Size Estimation for the Subgraph Isomorphism Problem

    This article addresses the problem of finding patterns in graphs. This is formally defined as the subgraph isomorphism problem and is one of the core problems in theoretical computer science. We consider the counting variation of this problem. The task is to count all instances of the pattern G occurring in a (usually larger) graph H. The vast majority of algorithms for this problem use a variation of backtracking. Most commonly they exhaustively search through the space of all possible monomorphisms between G and H. The size of the search tree depends heavily on the order in which the vertices of G are systematically assigned to the vertices of H. We use a method called heuristic sampling to estimate the size of the search tree for each ordering in advance. We use this estimate to select the vertex order of G that minimizes the expected tree size. This approach is empirically evaluated on a set of instances, showing the practical potential of the method.
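
The idea of estimating a backtracking tree's size before searching it can be sketched with Knuth's random-probe estimator, which heuristic sampling generalizes. The sketch below uses that simpler estimator, not the article's procedure; the function names, the dictionary graph representation (vertex to set of neighbours), and the sample count are assumptions. Each probe walks one random root-to-leaf path and adds up the products of the branching factors seen along the way, and the mean over probes is an unbiased estimate of the number of tree nodes.

```python
import random


def children(pattern, target, order, assignment):
    """Target vertices that extend the partial monomorphism by one vertex.

    assignment[d] is the target vertex chosen for pattern vertex order[d].
    """
    depth = len(assignment)
    if depth == len(order):
        return []
    p = order[depth]
    used = set(assignment)
    return [
        t for t in target
        if t not in used
        and all(assignment[d] in target[t]
                for d in range(depth) if order[d] in pattern[p])
    ]


def probe(pattern, target, order, rng):
    """One random descent; returns 1 + d1 + d1*d2 + ... for its branching factors."""
    assignment, weight, estimate = [], 1, 1
    while True:
        kids = children(pattern, target, order, assignment)
        if not kids:
            return estimate
        weight *= len(kids)
        estimate += weight
        assignment.append(rng.choice(kids))


def estimate_tree_size(pattern, target, order, samples=1000, seed=0):
    rng = random.Random(seed)
    return sum(probe(pattern, target, order, rng) for _ in range(samples)) / samples


def exact_tree_size(pattern, target, order, assignment=()):
    """Full enumeration, for checking the estimate on small instances."""
    return 1 + sum(exact_tree_size(pattern, target, order, assignment + (t,))
                   for t in children(pattern, target, order, assignment))


edge = {0: {1}, 1: {0}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
size = estimate_tree_size(edge, triangle, [0, 1])
```

To choose a vertex order, one would run `estimate_tree_size` for each candidate order and keep the one with the smallest estimate; how many probes are enough depends on how uneven the tree is.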

    Certifying Solvers for Clique and Maximum Common (Connected) Subgraph Problems

    An algorithm is said to be certifying if it outputs, together with a solution to the problem it solves, a proof that this solution is correct. We explain how state-of-the-art maximum clique, maximum weighted clique, maximal clique enumeration and maximum common (connected) induced subgraph algorithms can be turned into certifying solvers by using pseudo-Boolean models and cutting planes proofs, and demonstrate that this approach can also handle reductions between problems. The generality of our results suggests that this method is ready for widespread adoption in solvers for combinatorial graph problems.