2,454 research outputs found

    Between Subgraph Isomorphism and Maximum Common Subgraph

    Get PDF
    When a small pattern graph does not occur inside a larger target graph, we can ask how to find "as much of the pattern as possible" inside the target graph. In general, this is known as the maximum common subgraph problem, which is much more computationally challenging in practice than subgraph isomorphism. We introduce a restricted alternative, where we ask if all but k vertices from the pattern can be found in the target graph. This allows for the development of slightly weakened forms of certain invariants from subgraph isomorphism which are based upon degree and number of paths. We show that when k is small, weakening the invariants still retains much of their effectiveness. We are then able to solve this problem on the standard problem instances used to benchmark subgraph isomorphism algorithms, despite these instances being too large for current maximum common subgraph algorithms to handle. Finally, by iteratively increasing k, we obtain an algorithm which is also competitive for the maximum common subgraph

    A Partitioning Algorithm for Maximum Common Subgraph Problems

    Get PDF
    We introduce a new branch and bound algorithm for the maximum common subgraph and maximum common connected subgraph problems which is based around vertex labelling and partitioning. Our method in some ways resembles a traditional constraint programming approach, but uses a novel compact domain store and supporting inference algorithms which dramatically reduce the memory and computation requirements during search, and allow better dual viewpoint ordering heuristics to be calculated cheaply. Experiments show a speedup of more than an order of magnitude over the state of the art, and demonstrate that we can operate on much larger graphs without running out of memory

    Shared Memory Parallel Subgraph Enumeration

    Full text link
    The subgraph enumeration problem asks us to find all subgraphs of a target graph that are isomorphic to a given pattern graph. Determining whether even one such isomorphic subgraph exists is NP-complete---and therefore finding all such subgraphs (if they exist) is a time-consuming task. Subgraph enumeration has applications in many fields, including biochemistry and social networks, and interestingly the fastest algorithms for solving the problem for biochemical inputs are sequential. Since they depend on depth-first tree traversal, an efficient parallelization is far from trivial. Nevertheless, since important applications produce data sets with increasing difficulty, parallelism seems beneficial. We thus present here a shared-memory parallelization of the state-of-the-art subgraph enumeration algorithms RI and RI-DS (a variant of RI for dense graphs) by Bonnici et al. [BMC Bioinformatics, 2013]. Our strategy uses work stealing and our implementation demonstrates a significant speedup on real-world biochemical data---despite a highly irregular data access pattern. We also improve RI-DS by pruning the search space better; this further improves the empirical running times compared to the already highly tuned RI-DS.Comment: 18 pages, 12 figures, To appear at the 7th IEEE Workshop on Parallel / Distributed Computing and Optimization (PDCO 2017

    On the Optimality of Pseudo-polynomial Algorithms for Integer Programming

    Get PDF
    In the classic Integer Programming (IP) problem, the objective is to decide whether, for a given m×nm \times n matrix AA and an mm-vector b=(b1,…,bm)b=(b_1,\dots, b_m), there is a non-negative integer nn-vector xx such that Ax=bAx=b. Solving (IP) is an important step in numerous algorithms and it is important to obtain an understanding of the precise complexity of this problem as a function of natural parameters of the input. The classic pseudo-polynomial time algorithm of Papadimitriou [J. ACM 1981] for instances of (IP) with a constant number of constraints was only recently improved upon by Eisenbrand and Weismantel [SODA 2018] and Jansen and Rohwedder [ArXiv 2018]. We continue this line of work and show that under the Exponential Time Hypothesis (ETH), the algorithm of Jansen and Rohwedder is nearly optimal. We also show that when the matrix AA is assumed to be non-negative, a component of Papadimitriou's original algorithm is already nearly optimal under ETH. This motivates us to pick up the line of research initiated by Cunningham and Geelen [IPCO 2007] who studied the complexity of solving (IP) with non-negative matrices in which the number of constraints may be unbounded, but the branch-width of the column-matroid corresponding to the constraint matrix is a constant. We prove a lower bound on the complexity of solving (IP) for such instances and obtain optimal results with respect to a closely related parameter, path-width. Specifically, we prove matching upper and lower bounds for (IP) when the path-width of the corresponding column-matroid is a constant.Comment: 29 pages, To appear in ESA 201

    Maximum common subgraph isomorphism algorithms for the matching of chemical structures

    Get PDF
    The maximum common subgraph (MCS) problem has become increasingly important in those aspects of chemoinformatics that involve the matching of 2D or 3D chemical structures. This paper provides a classification and a review of the many MCS algorithms, both exact and approximate, that have been described in the literature, and makes recommendations regarding their applicability to typical chemoinformatics tasks
    • …
    corecore