2,454 research outputs found
Between Subgraph Isomorphism and Maximum Common Subgraph
When a small pattern graph does not occur inside a larger target graph, we can ask how to find "as much of the pattern as possible" inside the target graph. In general, this is known as the maximum common subgraph problem, which is much more computationally challenging in practice than subgraph isomorphism. We introduce a restricted alternative, where we ask if all but k vertices from the pattern can be found in the target graph. This allows for the development of slightly weakened forms of certain invariants from subgraph isomorphism which are based upon degree and number of paths. We show that when k is small, weakening the invariants still retains much of their effectiveness. We are then able to solve this problem on the standard problem instances used to benchmark subgraph isomorphism algorithms, despite these instances being too large for current maximum common subgraph algorithms to handle. Finally, by iteratively increasing k, we obtain an algorithm which is also competitive for the maximum common subgraph problem.
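To make the idea of a weakened degree invariant concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the paper's implementation: the function name `degree_domains` and the adjacency-dict graph representation are hypothetical. In plain subgraph isomorphism a pattern vertex p can only map to a target vertex t with deg(t) >= deg(p). If up to k pattern vertices may be left out, up to k neighbours of p may disappear, so the test relaxes to deg(t) >= deg(p) - k.

```python
def degree_domains(pattern, target, k=0):
    """For each pattern vertex, collect target vertices that pass the degree test.

    pattern, target: dict mapping vertex -> set of neighbours (undirected graphs).
    k: number of pattern vertices allowed to stay unmatched (assumed parameter).
    """
    target_deg = {t: len(nbrs) for t, nbrs in target.items()}
    domains = {}
    for p, nbrs in pattern.items():
        # Up to k neighbours of p may be deleted from the pattern, so only
        # deg(p) - k neighbours are guaranteed to need an image in the target.
        needed = max(len(nbrs) - k, 0)
        domains[p] = {t for t, d in target_deg.items() if d >= needed}
    return domains


# Pattern: a star with centre 0 and three leaves.
pattern = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
# Target: the path a-b-c.
target = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}

strict = degree_domains(pattern, target, k=0)   # the centre has no candidate
relaxed = degree_domains(pattern, target, k=1)  # the centre may map to "b"
```

With k = 0 the centre of the star has an empty domain, which proves that the whole pattern cannot occur. With k = 1 the filter is weaker but still restricts the centre to the single vertex of degree 2.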
A Partitioning Algorithm for Maximum Common Subgraph Problems
We introduce a new branch and bound algorithm for the maximum common subgraph and maximum common connected subgraph problems which is based around vertex labelling and partitioning. Our method in some ways resembles a traditional constraint programming approach, but uses a novel compact domain store and supporting inference algorithms which dramatically reduce the memory and computation requirements during search, and allow better dual viewpoint ordering heuristics to be calculated cheaply. Experiments show a speedup of more than an order of magnitude over the state of the art, and demonstrate that we can operate on much larger graphs without running out of memory.
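The following is a simplified sketch in the spirit of a labelling-and-partitioning branch and bound for maximum common induced subgraph. It is an assumption-laden illustration, not the authors' code: the function name, the adjacency-dict representation, and the class-selection rule are all hypothetical choices. Unmatched vertices are kept in label classes, each a pair (vertices of G, vertices of H) that share the same adjacency pattern to everything matched so far. Matching v to w splits every class by adjacency to v and w, and the sum of min(|G-side|, |H-side|) over the classes gives an upper bound.

```python
def max_common_subgraph(g, h):
    """Return a largest list of (g_vertex, h_vertex) pairs that forms a
    common induced subgraph. g, h: dict vertex -> set of neighbours."""
    best = []

    def search(mapping, classes):
        nonlocal best
        if len(mapping) > len(best):
            best = list(mapping)
        # Each class can contribute at most min(|G-side|, |H-side|) more pairs.
        bound = len(mapping) + sum(min(len(gs), len(hs)) for gs, hs in classes)
        if bound <= len(best):
            return
        # Hypothetical heuristic: branch on the class with the smallest larger side.
        idx = min(range(len(classes)),
                  key=lambda i: max(len(classes[i][0]), len(classes[i][1])))
        gs, hs = classes[idx]
        rest = classes[:idx] + classes[idx + 1:]
        v = gs[0]
        g_left = [x for x in gs if x != v]
        for w in hs:
            h_left = [y for y in hs if y != w]
            new_classes = []
            for cg, ch in [(g_left, h_left)] + rest:
                # Split by adjacency to the newly matched pair (v, w).
                for adjacent in (True, False):
                    ng = [x for x in cg if (x in g[v]) == adjacent]
                    nh = [y for y in ch if (y in h[w]) == adjacent]
                    if ng and nh:
                        new_classes.append((ng, nh))
            search(mapping + [(v, w)], new_classes)
        # Finally, try leaving v unmatched.
        search(mapping, rest + ([(g_left, hs)] if g_left else []))

    search([], [(list(g), list(h))] if g and h else [])
    return best


triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
pairs = max_common_subgraph(triangle, path)  # a single edge: two matched pairs
```

The list of classes is the only state carried through the search, which is what keeps the memory footprint small compared with storing one domain per vertex.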
Shared Memory Parallel Subgraph Enumeration
The subgraph enumeration problem asks us to find all subgraphs of a target
graph that are isomorphic to a given pattern graph. Determining whether even
one such isomorphic subgraph exists is NP-complete, and therefore finding all
such subgraphs (if they exist) is a time-consuming task. Subgraph enumeration
has applications in many fields, including biochemistry and social networks,
and interestingly the fastest algorithms for solving the problem for
biochemical inputs are sequential. Since they depend on depth-first tree
traversal, an efficient parallelization is far from trivial. Nevertheless,
since important applications produce data sets with increasing difficulty,
parallelism seems beneficial.
We thus present here a shared-memory parallelization of the state-of-the-art
subgraph enumeration algorithms RI and RI-DS (a variant of RI for dense graphs)
by Bonnici et al. [BMC Bioinformatics, 2013]. Our strategy uses work stealing
and our implementation demonstrates a significant speedup on real-world
biochemical data, despite a highly irregular data access pattern. We also
improve RI-DS by pruning the search space better; this further improves the
empirical running times compared to the already highly tuned RI-DS.
Comment: 18 pages, 12 figures. To appear at the 7th IEEE Workshop on Parallel / Distributed Computing and Optimization (PDCO 2017).
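As a rough illustration of why subgraph enumeration is a depth-first tree traversal, the sketch below lists every edge-preserving injective mapping from a pattern into a target by plain sequential backtracking. This is not RI or RI-DS and has none of their ordering or pruning rules; the function name and the adjacency-dict representation are assumptions made for the example.

```python
def enumerate_embeddings(pattern, target):
    """Yield every injective map pattern -> target that preserves pattern edges.

    pattern, target: dict mapping vertex -> set of neighbours (undirected).
    Each yielded dict is one embedding; symmetric images are listed separately.
    """
    order = list(pattern)  # static matching order (here: simply insertion order)

    def extend(mapping):
        if len(mapping) == len(order):
            yield dict(mapping)
            return
        p = order[len(mapping)]
        used = set(mapping.values())
        for t in target:
            if t in used:
                continue
            # Every already-mapped neighbour of p must land on a neighbour of t.
            if all(mapping[q] in target[t] for q in pattern[p] if q in mapping):
                mapping[p] = t
                yield from extend(mapping)
                del mapping[p]  # backtrack

    yield from extend({})


edge = {0: {1}, 1: {0}}
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
embeddings = list(enumerate_embeddings(edge, path))
```

Each recursive call is one node of the search tree. Subtrees rooted at different choices of `t` are independent, which is the kind of work a work-stealing scheduler could hand to idle threads.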
On the Optimality of Pseudo-polynomial Algorithms for Integer Programming
In the classic Integer Programming (IP) problem, the objective is to decide
whether, for a given $m \times n$ matrix $A$ and an $m$-vector $b$, there is a non-negative integer $n$-vector $x$ such that $Ax = b$. Solving
(IP) is an important step in numerous algorithms and it is important to obtain
an understanding of the precise complexity of this problem as a function of
natural parameters of the input.
The classic pseudo-polynomial time algorithm of Papadimitriou [J. ACM 1981]
for instances of (IP) with a constant number of constraints was only recently
improved upon by Eisenbrand and Weismantel [SODA 2018] and Jansen and Rohwedder
[ArXiv 2018]. We continue this line of work and show that under the Exponential
Time Hypothesis (ETH), the algorithm of Jansen and Rohwedder is nearly optimal.
We also show that when the matrix is assumed to be non-negative, a
component of Papadimitriou's original algorithm is already nearly optimal under
ETH.
This motivates us to pick up the line of research initiated by Cunningham and
Geelen [IPCO 2007] who studied the complexity of solving (IP) with non-negative
matrices in which the number of constraints may be unbounded, but the
branch-width of the column-matroid corresponding to the constraint matrix is a
constant. We prove a lower bound on the complexity of solving (IP) for such
instances and obtain optimal results with respect to a closely related
parameter, path-width. Specifically, we prove matching upper and lower bounds
for (IP) when the path-width of the corresponding column-matroid is a constant.
Comment: 29 pages, To appear in ESA 201
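For intuition about pseudo-polynomial running times in the non-negative case, here is a textbook-style dynamic programming sketch. It is not the algorithm of Papadimitriou, Eisenbrand and Weismantel, or Jansen and Rohwedder, and the function name `ip_feasible` is hypothetical. Writing the constraint matrix as A (non-negative integer entries), the right-hand side as b and the unknown as x, the sketch decides whether Ax = b has a non-negative integer solution by exploring which vectors between 0 and b can be written as sums of columns of A.

```python
def ip_feasible(A, b):
    """Decide whether A x = b has a non-negative integer solution x.

    Assumes A is a non-empty list of m equal-length rows with non-negative
    integer entries, and b is a list of m integers.
    """
    m = len(b)
    columns = [tuple(A[i][j] for i in range(m)) for j in range(len(A[0]))]
    columns = [c for c in columns if any(c)]  # zero columns never help
    goal = tuple(b)
    start = (0,) * m
    reachable = {start}
    frontier = [start]
    while frontier:
        s = frontier.pop()
        for c in columns:
            nxt = tuple(si + ci for si, ci in zip(s, c))
            # Entries only grow, so any partial sum exceeding b is a dead end.
            if nxt not in reachable and all(n <= g for n, g in zip(nxt, goal)):
                reachable.add(nxt)
                frontier.append(nxt)
    return goal in reachable


feasible = ip_feasible([[1, 2], [1, 0]], [3, 1])    # x = (1, 1) works
infeasible = ip_feasible([[1, 2], [1, 0]], [3, 2])  # parity makes this impossible
```

The number of states is at most the product of (b_i + 1) over all rows, so for a constant number of constraints the running time is polynomial in the magnitude of b rather than in its bit length.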
Maximum common subgraph isomorphism algorithms for the matching of chemical structures
The maximum common subgraph (MCS) problem has become increasingly important in those aspects of chemoinformatics that involve the matching of 2D or 3D chemical structures. This paper provides a classification and a review of the many MCS algorithms, both exact and approximate, that have been described in the literature, and makes recommendations regarding their applicability to typical chemoinformatics tasks.