482 research outputs found

    Experimental Evaluation of Subgraph Isomorphism Solvers

    Get PDF
    International audienceSubgraph Isomorphism (SI) is an NP-complete problem which is at the heart of many structural pattern recognition tasks as it involves finding a copy of a pattern graph into a target graph. In the pattern recognition community, the most well-known SI solvers are VF2, VF3, and RI. SI is also widely studied in the constraint programming community, and many constraint-based SI solvers have been proposed since Ullman, such as LAD and Glasgow, for example. All these SI solvers can solve very quickly some large SI instances, that involve graphs with thousands of nodes. However, McCreesh et al. have recently shown how to randomly generate SI instances the hardness of which can be controlled and predicted, and they have built small instances which are computationally challenging for all solvers. They have also shown that some small instances, which are predicted to be easy and are easily solved by constraint-based solvers, appear to be challenging for VF2 and VF3. In this paper, we widen this study by considering a large test suite coming from eight benchmarks. We show that, as expected for an NP-complete problem, the solving time of an instance does not depend on its size, and that some small instances coming from real applications are not solved by any of the considered solvers. We also show that, if RI and VF3 can solve very quickly a large number of easy instances, for which Glasgow or LAD need more time, they fail at solving some other instances that are quickly solved by Glasgow or LAD, and they are clearly outperformed by Glasgow on hard instances. Finally, we show that we can easily combine solvers to take benefit of their complementarity

    Solving hard subgraph problems in parallel

    Get PDF
    This thesis improves the state of the art in exact, practical algorithms for finding subgraphs. We study maximum clique, subgraph isomorphism, and maximum common subgraph problems. These are widely applicable: within computing science, subgraph problems arise in document clustering, computer vision, the design of communication protocols, model checking, compiler code generation, malware detection, cryptography, and robotics; beyond, applications occur in biochemistry, electrical engineering, mathematics, law enforcement, fraud detection, fault diagnosis, manufacturing, and sociology. We therefore consider both the ``pure'' forms of these problems, and variants with labels and other domain-specific constraints. Although subgraph-finding should theoretically be hard, the constraint-based search algorithms we discuss can easily solve real-world instances involving graphs with thousands of vertices, and millions of edges. We therefore ask: is it possible to generate ``really hard'' instances for these problems, and if so, what can we learn? By extending research into combinatorial phase transition phenomena, we develop a better understanding of branching heuristics, as well as highlighting a serious flaw in the design of graph database systems. This thesis also demonstrates how to exploit two of the kinds of parallelism offered by current computer hardware. Bit parallelism allows us to carry out operations on whole sets of vertices in a single instruction---this is largely routine. Thread parallelism, to make use of the multiple cores offered by all modern processors, is more complex. We suggest three desirable performance characteristics that we would like when introducing thread parallelism: lack of risk (parallel cannot be exponentially slower than sequential), scalability (adding more processing cores cannot make runtimes worse), and reproducibility (the same instance on the same hardware will take roughly the same time every time it is run). We then detail the difficulties in guaranteeing these characteristics when using modern algorithmic techniques. Besides ensuring that parallelism cannot make things worse, we also increase the likelihood of it making things better. We compare randomised work stealing to new tailored strategies, and perform experiments to identify the factors contributing to good speedups. We show that whilst load balancing is difficult, the primary factor influencing the results is the interaction between branching heuristics and parallelism. By using parallelism to explicitly offset the commitment made to weak early branching choices, we obtain parallel subgraph solvers which are substantially and consistently better than the best sequential algorithms

    Efficient Algorithms for Graph Optimization Problems

    Get PDF
    A doktori értekezés hatékony algoritmusokat mutat be gráfokon értelmezett nehéz kombinatorikus optimalizálási feladatok megoldására. A kutatás legfontosabb eredményét különböző megoldási módszerekhez kidolgozott javítások jelentik, amelyek magukban foglalnak új heurisztikákat, valamint gráfok és fák speciális reprezentációit is. Az elvégzett elemzések igazolták, hogy a szerző által adott leghatékonyabb algoritmusok az esetek többségében gyorsabbak, illetve jobb eredményeket adnak, mint más elérhető implementációk. A dolgozat első fele hét különböző algoritmust és számos hasznos javítást mutat be a minimális költségű folyam feladatra, amely a legtöbbet vizsgált és alkalmazott gráfoptimalizálási problémák egyike. Az implementációinkat egy átfogó tapasztalati elemzés keretében összehasonlítottuk nyolc másik megoldóprogrammal, köztük a leggyakrabban használt és legelismertebb implementációkkal. A hálózati szimplex algoritmusunk lényegesen hatékonyabbnak és robusztusabbnak bizonyult, mint a módszer más implementációi, továbbá a legtöbb tesztadaton ez az algoritmus a leggyorsabb. A bemutatott költségskálázó algoritmus szintén rendkívül hatékony; nagy méretű ritka gráfokon felülmúlja a hálózati szimplex implementációkat. Az értekezésben tárgyalt másik optimalizálási feladat a legnagyobb közös részgráf probléma. Ezt a feladatot kémiai alkalmazások szempontjából vizsgáltuk. Hatékony heurisztikákat dolgoztunk ki, amelyek jelentősen javítják két megoldási módszer pontosságát és sebességét, valamint kémiailag relevánsabb módon rendelik egymáshoz molekulagráfok atomjait és kötéseit. Az algoritmusainkat összehasonlítottuk két ismert megoldóprogrammal, amelyeknél lényegesen jobb eredményeket sikerült elérnünk. A kifejlesztett implementációk bekerültek a ChemAxon Kft. több szoftvertermékébe, melyek vezető nemzetközi gyógyszercégek használatában állnak. Ezen kívül az értekezés röviden bemutatja a LEMON nevű nyílt forrású C++ gráfoptimalizációs programkönyvtárat, amely magában foglalja a minimális költségű folyam feladatra adott algoritmusokat. Ezek az implementációk nagy mértékben hozzájárultak a programcsomag népszerűségének növekedéséhez

    The Multi-Maximum and Quasi-Maximum Common Subgraph Problem

    Get PDF
    The Maximum Common Subgraph problem has been long proven NP-hard. Nevertheless, it has countless practical applications, and researchers are still searching for exact solutions and scalable heuristic approaches. Driven by applications in molecular science and cyber-security, we concentrate on the Maximum Common Subgraph among an indefinite number of graphs. We first extend a state-of-the-art branch-and-bound procedure working on two graphs to N graphs. Then, given the high computational cost of this approach, we trade off complexity for accuracy, and we propose a set of heuristics to approximate the exact solution for N graphs. We analyze sequential, parallel multi-core, and parallel-many core (GPU-based) approaches, exploiting several leveraging techniques to decrease the contention among threads, improve the workload balance of the different tasks, reduce the computation time, and increase the final result size. We also present several sorting heuristics to order the vertices of the graphs and the graphs themselves. We compare our algorithms with a state-of-the-art method on publicly available benchmark sets. On graph pairs, we are able to speed up the exact computation by a 2× factor, pruning the search space by more than 60%. On sets of more than two graphs, all exact solutions are extremely time-consuming and of a complex application in many real cases. On the contrary, our heuristics are far less expensive (as they show a lower-bound for the speed up of 10×), have a far better asymptotic complexity (with speed ups up to several orders of magnitude in our experiments), and obtain excellent approximations of the maximal solution with 98.5% of the nodes on average

    A SAT approach to the initial mapping problem in SWAP gate insertion for commuting gates

    Full text link
    Most quantum circuits require SWAP gate insertion to run on quantum hardware with limited qubit connectivity. A promising SWAP gate insertion method for blocks of commuting two-qubit gates is a predetermined swap strategy which applies layers of SWAP gates simultaneously executable on the coupling map. A good initial mapping for the swap strategy reduces the number of required swap gates. However, even when a circuit consists of commuting gates, e.g., as in the Quantum Approximate Optimization Algorithm (QAOA) or trotterized simulations of Ising Hamiltonians, finding a good initial mapping is a hard problem. We present a SAT-based approach to find good initial mappings for circuits with commuting gates transpiled to the hardware with swap strategies. Our method achieves a 65% reduction in gate count for random three-regular graphs with 500 nodes. In addition, we present a heuristic approach that combines the SAT formulation with a clustering algorithm to reduce large problems to a manageable size. This approach reduces the number of swap layers by 25% compared to both a trivial and random initial mapping for a random three-regular graph with 1000 nodes. Good initial mappings will therefore enable the study of quantum algorithms, such as QAOA and Ising Hamiltonian simulation applied to sparse problems, on noisy quantum hardware with several hundreds of qubits.Comment: 7 page

    Sequential and parallel solution-biased search for subgraph algorithms

    Get PDF
    Funding: This work was supported by the Engineering and Physical Sciences Research Council (grant numbers EP/P026842/1, EP/M508056/1, and EP/N007565).The current state of the art in subgraph isomorphism solving involves using degree as a value-ordering heuristic to direct backtracking search. Such a search makes a heavy commitment to the first branching choice, which is often incorrect. To mitigate this, we introduce and evaluate a new approach, which we call “solution-biased search”. By combining a slightly-random value-ordering heuristic, rapid restarts, and nogood recording, we design an algorithm which instead uses degree to direct the proportion of search effort spent in different subproblems. This increases performance by two orders of magnitude on satisfiable instances, whilst not affecting performance on unsatisfiable instances. This algorithm can also be parallelised in a very simple but effective way: across both satisfiable and unsatisfiable instances, we get a further speedup of over thirty from thirty-six cores, and over one hundred from ten distributed-memory hosts. Finally, we show that solution-biased search is also suitable for optimisation problems, by using it to improve two maximum common induced subgraph algorithms.Postprin

    Partitioning algorithms for induced subgraph problems

    Get PDF
    This dissertation introduces the MCSPLIT family of algorithms for two closely-related NP-hard problems that involve finding a large induced subgraph contained by each of two input graphs: the induced subgraph isomorphism problem and the maximum common induced subgraph problem. The MCSPLIT algorithms resemble forward-checking constrant programming algorithms, but use problem-specific data structures that allow multiple, identical domains to be stored without duplication. These data structures enable fast, simple constraint propagation algorithms and very fast calculation of upper bounds. Versions of these algorithms for both sparse and dense graphs are described and implemented. The resulting algorithms are over an order of magnitude faster than the best existing algorithm for maximum common induced subgraph on unlabelled graphs, and outperform the state of the art on several classes of induced subgraph isomorphism instances. A further advantage of the MCSPLIT data structures is that variables and values are treated identically; this allows us to choose to branch on variables representing vertices of either input graph with no overhead. An extensive set of experiments shows that such two-sided branching can be particularly beneficial if the two input graphs have very different orders or densities. Finally, we turn from subgraphs to supergraphs, tackling the problem of finding a small graph that contains every member of a given family of graphs as an induced subgraph. Exact and heuristic techniques are developed for this problem, in each case using a MCSPLIT algorithm as a subroutine. These algorithms allow us to add new terms to two entries of the On-Line Encyclopedia of Integer Sequences
    corecore