77 research outputs found

    Tree comparison: enumeration and application to cheminformatics

    Get PDF
    Graphs are a well-known data structure used in many application domains that rely on relationships between individual entities. Examples are social networks, where the users may be in friendship with each other, road networks, where one-way or bidirectional roads connect crossings, and work package assignments, where workers are assigned to tasks. In chem- and bioinformatics, molecules are often represented as molecular graphs, where vertices represent atoms, and bonds between them are represented by edges connecting the vertices. Since there is an ever-increasing amount of data that can be treated as graphs, fast algorithms are needed to compare such graphs. A well-researched concept to compare two graphs is the maximum common subgraph. On the one hand, this allows finding substructures that are common to both input graphs. On the other hand, we can derive a similarity score from the maximum common subgraph. A practical application is rational drug design which involves molecular similarity searches. In this thesis, we study the maximum common subgraph problem, which entails finding a largest graph, which is isomorphic to subgraphs of two input graphs. We focus on restrictions that allow polynomial-time algorithms with a low exponent. An example is the maximum common subtree of two input trees. We succeed in improving the previously best-known time bound. Additionally, we provide a lower time bound under certain assumptions. We study a generalization of the maximum common subtree problem, the block-and-bridge preserving maximum common induced subgraph problem between outerplanar graphs. This problem is motivated by the application to cheminformatics. First, the vast majority of drugs modeled as molecular graphs is outerplanar, and second, the blocks correspond to the ring structures and the bridges to atom chains or linkers. If we allow disconnected common subgraphs, the problem becomes NP-hard even for trees as input. We propose a second generalization of the maximum common subtree problem, which allows skipping vertices in the input trees while maintaining polynomial running time. Since a maximum common subgraph is not unique in general, we investigate the problem to enumerate all maximum solutions. We do this for both the maximum common subtree problem and the block-and-bridge preserving maximum common induced subgraph problem between outerplanar graphs. An arising subproblem which we analyze is the enumeration of maximum weight matchings in bipartite graphs. We support a weight function between the vertices and edges for all proposed common subgraph methods in this thesis. Thus the objective is to compute a common subgraph of maximum weight. The weights may be integral or real-valued, including negative values. A special case of using such a weight function is computing common subgraph isomorphisms between labeled graphs, where labels between mapped vertices and edges must be equal. An experimental study evaluates the practical running times and the usefulness of our block-and-bridge preserving maximum common induced subgraph algorithm against state of the art algorithms

    Connectivity Oracles for Graphs Subject to Vertex Failures

    Full text link
    We introduce new data structures for answering connectivity queries in graphs subject to batched vertex failures. A deterministic structure processes a batch of ddd\leq d_{\star} failed vertices in O~(d3)\tilde{O}(d^3) time and thereafter answers connectivity queries in O(d)O(d) time. It occupies space O(dmlogn)O(d_{\star} m\log n). We develop a randomized Monte Carlo version of our data structure with update time O~(d2)\tilde{O}(d^2), query time O(d)O(d), and space O~(m)\tilde{O}(m) for any failure bound dnd\le n. This is the first connectivity oracle for general graphs that can efficiently deal with an unbounded number of vertex failures. We also develop a more efficient Monte Carlo edge-failure connectivity oracle. Using space O(nlog2n)O(n\log^2 n), dd edge failures are processed in O(dlogdloglogn)O(d\log d\log\log n) time and thereafter, connectivity queries are answered in O(loglogn)O(\log\log n) time, which are correct w.h.p. Our data structures are based on a new decomposition theorem for an undirected graph G=(V,E)G=(V,E), which is of independent interest. It states that for any terminal set UVU\subseteq V we can remove a set BB of U/(s2)|U|/(s-2) vertices such that the remaining graph contains a Steiner forest for UBU-B with maximum degree ss

    Traversing combinatorial 0/1-polytopes via optimization

    Get PDF
    In this paper, we present a new framework that exploits combinatorial optimization for efficiently generating a large variety of combinatorial objects based on graphs, matroids, posets and polytopes. Our method relies on a simple and versatile algorithm for computing a Hamilton path on the skeleton of any 0/1-polytope \conv(X), where X\seq \{0,1\}^n. The algorithm uses as a black box any algorithm that solves a variant of the classical linear optimization problem~min{wxxX}\min\{w\cdot x\mid x\in X\}, and the resulting delay, i.e., the running time per visited vertex on the Hamilton path, is only by a factor of logn\log n larger than the running time of the optimization algorithm. When XX encodes a particular class of combinatorial objects, then traversing the skeleton of the polytope~\conv(X) along a Hamilton path corresponds to listing the combinatorial objects by local change operations, i.e., we obtain Gray code listings. As concrete results of our general framework, we obtain efficient algorithms for generating all (cc-optimal) bases and independent sets in a matroid; (cc-optimal) spanning trees, forests, matchings, maximum matchings, and cc-optimal matchings in a general graph; vertex covers, minimum vertex covers, cc-optimal vertex covers, stable sets, maximum stable sets and cc-optimal stable sets in a bipartite graph; as well as antichains, maximum antichains, cc-optimal antichains, and cc-optimal ideals of a poset. Specifically, the delay and space required by these algorithms are polynomial in the size of the matroid ground set, graph, or poset, respectively. Furthermore, all of these listings correspond to Hamilton paths on the corresponding combinatorial polytopes, namely the base polytope, matching polytope, vertex cover polytope, stable set polytope, chain polytope and order polytope, respectively. As another corollary from our framework, we obtain an \cO(t_{\upright{LP}} \log n) delay algorithm for the vertex enumeration problem on 0/1-polytopes {xRnAxb}\{x\in\mathbb{R}^n\mid Ax\leq b\}, where ARm×nA\in \mathbb{R}^{m\times n} and~bRmb\in\mathbb{R}^m, and t_{\upright{LP}} is the time needed to solve the linear program min{wxAxb}\min\{w\cdot x\mid Ax\leq b\}. This improves upon the 25-year old \cO(t_{\upright{LP}}\,n) delay algorithm due to Bussieck and L\"ubbecke

    Combinatorial and Geometric Aspects of Computational Network Construction - Algorithms and Complexity

    Get PDF
    corecore