290 research outputs found

    Maximum common subgraph isomorphism algorithms for the matching of chemical structures

    The maximum common subgraph (MCS) problem has become increasingly important in those aspects of chemoinformatics that involve the matching of 2D or 3D chemical structures. This paper provides a classification and a review of the many MCS algorithms, both exact and approximate, that have been described in the literature, and makes recommendations regarding their applicability to typical chemoinformatics tasks.

    RASCAL: calculation of graph similarity using maximum common edge subgraphs

    A new graph similarity calculation procedure is introduced for comparing labeled graphs. Given a minimum similarity threshold, the procedure consists of an initial screening process to determine whether it is possible for the measure of similarity between the two graphs to exceed the minimum threshold, followed by a rigorous maximum common edge subgraph (MCES) detection algorithm to compute the exact degree and composition of similarity. The proposed MCES algorithm is based on a maximum clique formulation of the problem and is a significant improvement over other published algorithms. It presents new approaches to both lower and upper bounding as well as vertex selection.

    A Much Faster Algorithm for Finding a Maximum Clique

    We present improvements to a branch-and-bound maximum-clique-finding algorithm MCS (WALCOM 2010, LNCS 5942, pp. 191–203) that was shown to be fast. First, we employ an efficient approximation algorithm for finding a maximum clique. Second, we make use of appropriate sorting of vertices only near the root of the search tree. Third, we employ a lightened approximate coloring mainly near the leaves of the search tree. A new algorithm obtained from MCS with the above improvements is named MCT. Extensive computational experiments show that MCT is much faster than MCS. In particular, MCT is shown to be faster than MCS for gen400_p0.9_75 and gen400_p0.9_65 by over 328,000 and 77,000 times, respectively.
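
    The abstract above describes a branch-and-bound search whose bound comes from approximate coloring. The following is a minimal sketch of that general idea only: a basic depth-first search pruned with a greedy coloring bound. It is not MCS or MCT, it omits the initial approximation step and the root-only vertex sorting, and the names `build_graph`, `greedy_color_order`, and `max_clique` are illustrative assumptions.

```python
from typing import Dict, Iterable, List, Set, Tuple

Graph = Dict[int, Set[int]]


def build_graph(edges: Iterable[Tuple[int, int]]) -> Graph:
    """Build an undirected adjacency-set graph from an edge list."""
    graph: Graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def greedy_color_order(graph: Graph, candidates: Set[int]) -> List[Tuple[int, int]]:
    """Greedily color the candidates; return (vertex, color) pairs sorted by color."""
    color_classes: List[List[int]] = []
    by_degree = sorted(candidates, key=lambda u: len(graph[u] & candidates), reverse=True)
    for v in by_degree:
        for cls in color_classes:
            # v may join a class only if it is adjacent to none of its members.
            if all(u not in graph[v] for u in cls):
                cls.append(v)
                break
        else:
            color_classes.append([v])
    ordered: List[Tuple[int, int]] = []
    for color, cls in enumerate(color_classes, start=1):
        ordered.extend((v, color) for v in cls)
    return ordered


def max_clique(graph: Graph) -> List[int]:
    """Return one maximum clique using a coloring bound (illustrative sketch)."""
    best: List[int] = []

    def expand(current: List[int], candidates: Set[int]) -> None:
        nonlocal best
        ordered = greedy_color_order(graph, candidates)
        # Visit the highest color first. The vertices not yet visited use at
        # most `color` colors, so they contain no clique larger than `color`.
        for v, color in reversed(ordered):
            if len(current) + color <= len(best):
                return  # bound: this branch cannot beat the incumbent
            current.append(v)
            new_candidates = candidates & graph[v]
            if new_candidates:
                expand(current, new_candidates)
            elif len(current) > len(best):
                best = current[:]
            current.pop()
            candidates = candidates - {v}

    expand([], set(graph))
    return best


if __name__ == "__main__":
    demo = build_graph([(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
                        (3, 4), (4, 5), (4, 6), (5, 6)])
    print(sorted(max_clique(demo)))
```

    The bound works because a clique needs one color per vertex, so any proper coloring of the remaining candidates caps the size of a clique they can contain.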

    Scalable Kernelization for Maximum Independent Sets

    The most efficient algorithms for finding maximum independent sets in both theory and practice use reduction rules to obtain a much smaller problem instance called a kernel. The kernel can then be solved quickly using exact or heuristic algorithms, or by repeatedly kernelizing recursively in the branch-and-reduce paradigm. It is of critical importance for these algorithms that kernelization is fast and returns a small kernel. Current algorithms are either slow but produce a small kernel, or fast and give a large kernel. We attempt to accomplish both of these goals simultaneously, by giving an efficient parallel kernelization algorithm based on graph partitioning and parallel bipartite maximum matching. We combine our parallelization techniques with two techniques to accelerate kernelization further: dependency checking that prunes reductions that cannot be applied, and reduction tracking that allows us to stop kernelization when reductions become less fruitful. Our algorithm produces kernels that are orders of magnitude smaller than the fastest kernelization methods, while having a similar execution time. Furthermore, our algorithm is able to compute kernels with size comparable to the smallest known kernels, but up to two orders of magnitude faster than previously possible. Finally, we show that our kernelization algorithm can be used to accelerate existing state-of-the-art heuristic algorithms, allowing us to find larger independent sets faster on large real-world networks and synthetic instances. Comment: Extended versio
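
    To make the notion of a reduction rule concrete, here is a small sequential sketch that applies two textbook rules, chosen here as an assumption because the abstract does not list its rules: an isolated vertex always belongs to some maximum independent set, and so does a degree-one vertex (its single neighbor is then discarded). This is not the parallel algorithm described above, and the name `kernelize` is hypothetical.

```python
from typing import Dict, Set, Tuple

Graph = Dict[int, Set[int]]


def kernelize(graph: Graph) -> Tuple[Graph, Set[int]]:
    """Apply degree-0 and degree-1 reductions until neither applies.

    Returns (kernel, forced): `forced` holds vertices that can safely be put
    into a maximum independent set; `kernel` is the graph that remains.
    """
    g: Graph = {v: set(nbrs) for v, nbrs in graph.items()}  # work on a copy
    forced: Set[int] = set()
    queue = [v for v in g if len(g[v]) <= 1]
    while queue:
        v = queue.pop()
        if v not in g or len(g[v]) > 1:
            continue  # already removed by an earlier reduction
        forced.add(v)
        removed = {v} | g[v]  # v plus its (at most one) neighbor
        touched: Set[int] = set()
        for u in removed:
            for w in g[u]:
                if w not in removed:
                    g[w].discard(u)
                    touched.add(w)
        for u in removed:
            del g[u]
        # Degrees only drop, so newly low-degree vertices may now be reducible.
        queue.extend(w for w in touched if len(g[w]) <= 1)
    return g, forced


if __name__ == "__main__":
    # A path 0-1-2-3 (fully reducible) next to a 5-cycle (irreducible here).
    demo: Graph = {
        0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2},
        10: {11, 14}, 11: {10, 12}, 12: {11, 13}, 13: {12, 14}, 14: {13, 10},
    }
    kernel, forced = kernelize(demo)
    print(len(forced), sorted(kernel))
```

    A maximum independent set of the input then has size `len(forced)` plus the size of a maximum independent set of the kernel, which is why a smaller kernel leaves less work for the exact or heuristic solver.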

    Efficient Algorithms for Finding Maximum and Maximal Cliques and Their Applications

    The problem of finding a maximum clique or enumerating all maximal cliques is very important and has been explored in several excellent survey papers. Here, we focus our attention on the step-by-step examination of a series of branch-and-bound depth-first search algorithms: Basics, MCQ, MCR, MCS, and MCT. Subsequently, using the same depth-first search approach, we present our algorithm, CLIQUES, for enumerating all maximal cliques. Finally, we describe some of the applications of the algorithms and their variants in bioinformatics, data mining, and other fields.
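
    As an illustration of depth-first maximal clique enumeration, the sketch below uses the standard Bron-Kerbosch recursion with a pivot chosen to maximize the number of candidate neighbors. It is offered as a generic example of the technique, not as a reproduction of CLIQUES, and the function name `maximal_cliques` is an assumption.

```python
from typing import Dict, Iterator, List, Set

Graph = Dict[int, Set[int]]


def maximal_cliques(graph: Graph) -> Iterator[List[int]]:
    """Yield every maximal clique once (Bron-Kerbosch with pivoting)."""

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> Iterator[List[int]]:
        # r: current clique, p: candidates that extend r, x: already explored.
        if not p and not x:
            yield sorted(r)  # nothing can extend r, so it is maximal
            return
        # Pivot with the most neighbors in p; only non-neighbors of the pivot
        # need to be branched on, which trims the search tree.
        pivot = max(p | x, key=lambda u: len(p & graph[u]))
        for v in list(p - graph[pivot]):
            yield from expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(graph), set())


if __name__ == "__main__":
    # Triangle 0-1-2 with a pendant edge 2-3.
    demo: Graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    print(sorted(maximal_cliques(demo)))  # [[0, 1, 2], [2, 3]]
```

    Note that on an empty graph this sketch yields the empty clique once.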

    Finding Near-Optimal Independent Sets at Scale

    The independent set problem is NP-hard and particularly difficult to solve in large sparse graphs. In this work, we develop an advanced evolutionary algorithm, which incorporates kernelization techniques to compute large independent sets in huge sparse networks. A recent exact algorithm has shown that large networks can be solved exactly by employing a branch-and-reduce technique that recursively kernelizes the graph and performs branching. However, one major drawback of their algorithm is that, for huge graphs, branching still can take exponential time. To avoid this problem, we recursively choose vertices that are likely to be in a large independent set (using an evolutionary approach), then further kernelize the graph. We show that identifying and removing vertices likely to be in large independent sets opens up the reduction space, which not only speeds up the computation of large independent sets drastically, but also enables us to compute high-quality independent sets on much larger instances than previously reported in the literature. Comment: 17 pages, 1 figure, 8 tables. arXiv admin note: text overlap with arXiv:1502.0168

    Detecting High Log-Densities -- an O(n^1/4) Approximation for Densest k-Subgraph

    In the Densest k-Subgraph problem, given a graph G and a parameter k, one needs to find a subgraph of G induced on k vertices that contains the largest number of edges. There is a significant gap between the best known upper and lower bounds for this problem. It is NP-hard, and does not have a PTAS unless NP has subexponential time algorithms. On the other hand, the current best known algorithm of Feige, Kortsarz and Peleg gives an approximation ratio of n^(1/3-epsilon) for some specific epsilon > 0 (estimated at around 1/60). We present an algorithm that for every epsilon > 0 approximates the Densest k-Subgraph problem within a ratio of n^(1/4+epsilon) in time n^O(1/epsilon). In particular, our algorithm achieves an approximation ratio of O(n^(1/4)) in time n^O(log n). Our algorithm is inspired by studying an average-case version of the problem where the goal is to distinguish random graphs from graphs with planted dense subgraphs. The approximation ratio we achieve for the general case matches the distinguishing ratio we obtain for this planted problem. At a high level, our algorithms involve cleverly counting appropriately defined trees of constant size in G, and using these counts to identify the vertices of the dense subgraph. Our algorithm is based on the following principle. We say that a graph G(V,E) has log-density alpha if its average degree is Theta(|V|^alpha). The algorithmic core of our result is a family of algorithms that output k-subgraphs of nontrivial density whenever the log-density of the densest k-subgraph is larger than the log-density of the host graph. Comment: 23 page
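
    The two definitions in this abstract can be made concrete with a short sketch. It assumes an exhaustive search for the densest k-subgraph, which is exponential in general and only usable on tiny graphs, and a finite-size proxy for log-density, log(average degree) / log(|V|), since the Theta(|V|^alpha) definition is asymptotic. Neither function is the approximation algorithm from the paper, and both names are hypothetical.

```python
from itertools import combinations
from math import log
from typing import Iterable, Optional, Set, Tuple


def densest_k_subgraph(
    vertices: Iterable[int], edges: Iterable[Tuple[int, int]], k: int
) -> Tuple[Optional[Set[int]], int]:
    """Exhaustively find a k-vertex subset inducing the most edges."""
    edge_set = {frozenset(e) for e in edges}
    best_set: Optional[Set[int]] = None
    best_count = -1
    for subset in combinations(sorted(vertices), k):
        # Count the edges of the subgraph induced on this subset.
        count = sum(1 for pair in combinations(subset, 2)
                    if frozenset(pair) in edge_set)
        if count > best_count:
            best_set, best_count = set(subset), count
    return best_set, best_count


def log_density(num_vertices: int, num_edges: int) -> float:
    """Finite-size proxy for log-density: log(average degree) / log(|V|)."""
    if num_vertices < 2 or num_edges < 1:
        raise ValueError("need at least two vertices and one edge")
    avg_degree = 2 * num_edges / num_vertices
    return log(avg_degree) / log(num_vertices)


if __name__ == "__main__":
    # A 4-clique on {0, 1, 2, 3} with a path 3-4-5-6 attached.
    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
             (3, 4), (4, 5), (5, 6)]
    subset, count = densest_k_subgraph(range(7), edges, k=4)
    print(subset, count)
    # Host graph: 7 vertices, 9 edges. Densest 4-subgraph: 4 vertices, 6 edges.
    print(log_density(7, len(edges)), log_density(4, count))
```

    Comparing the two printed log-density values mirrors the condition stated in the abstract: the family of algorithms applies when the densest k-subgraph has a larger log-density than the host graph.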