Tree comparison: enumeration and application to cheminformatics
Graphs are a well-known data structure used in many application domains that rely on relationships between individual entities. Examples are social networks, where users may be friends with each other; road networks, where one-way or bidirectional roads connect crossings; and work package assignments, where workers are assigned to tasks. In chem- and bioinformatics, molecules are often represented as molecular graphs, where vertices represent atoms and edges represent the bonds between them. Since an ever-increasing amount of data can be treated as graphs, fast algorithms are needed to compare such graphs. A well-researched concept for comparing two graphs is the maximum common subgraph. On the one hand, it allows finding substructures that are common to both input graphs. On the other hand, a similarity score can be derived from the maximum common subgraph. A practical application is rational drug design, which involves molecular similarity searches.
In this thesis, we study the maximum common subgraph problem, which entails finding a largest graph that is isomorphic to subgraphs of two input graphs. We focus on restrictions that allow polynomial-time algorithms with a low exponent. An example is the maximum common subtree of two input trees. We succeed in improving the previously best-known time bound. Additionally, we provide a lower time bound under certain assumptions. We study a generalization of the maximum common subtree problem, the block-and-bridge preserving maximum common induced subgraph problem between outerplanar graphs. This problem is motivated by the application to cheminformatics. First, the vast majority of drugs modeled as molecular graphs are outerplanar, and second, the blocks correspond to the ring structures and the bridges to atom chains or linkers. If we allow disconnected common subgraphs, the problem becomes NP-hard even for trees as input. We propose a second generalization of the maximum common subtree problem, which allows skipping vertices in the input trees while maintaining polynomial running time.
Since a maximum common subgraph is not unique in general, we investigate the problem of enumerating all maximum solutions. We do this for both the maximum common subtree problem and the block-and-bridge preserving maximum common induced subgraph problem between outerplanar graphs. An arising subproblem that we analyze is the enumeration of maximum weight matchings in bipartite graphs. We support a weight function between the vertices and edges for all common subgraph methods proposed in this thesis. Thus, the objective is to compute a common subgraph of maximum weight. The weights may be integral or real-valued, including negative values. A special case of using such a weight function is computing common subgraph isomorphisms between labeled graphs, where the labels of mapped vertices and edges must be equal. An experimental study evaluates the practical running times and the usefulness of our block-and-bridge preserving maximum common induced subgraph algorithm against state-of-the-art algorithms.
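To make the matching subproblem mentioned in the abstract concrete, here is a minimal brute-force sketch that lists every maximum weight matching of a small bipartite graph. It is not the thesis's enumeration algorithm, only an exponential-time illustration of what is being enumerated. The function name `enumerate_max_weight_matchings` and the dictionary-based weight encoding are assumptions made for this example. Because weights may be negative, vertices are allowed to stay unmatched.

```python
from typing import Dict, List, Tuple

Matching = Tuple[Tuple[int, int], ...]


def enumerate_max_weight_matchings(
    n_left: int, n_right: int, weight: Dict[Tuple[int, int], int]
) -> Tuple[int, List[Matching]]:
    """Return the maximum matching weight and every matching attaining it.

    `weight` maps an admissible pair (u, v) to its weight; pairs that are
    absent cannot be matched. Hypothetical helper, exponential time:
    intended for tiny inputs only.
    """
    best = None  # best total weight seen so far
    found: List[Matching] = []

    def extend(u: int, used_right: set, current: list, total: int) -> None:
        nonlocal best, found
        if u == n_left:
            if best is None or total > best:
                best, found = total, [tuple(current)]
            elif total == best:
                found.append(tuple(current))
            return
        # Option 1: leave left vertex u unmatched.
        extend(u + 1, used_right, current, total)
        # Option 2: match u to any free right vertex joined by an admissible pair.
        for v in range(n_right):
            if v not in used_right and (u, v) in weight:
                current.append((u, v))
                used_right.add(v)
                extend(u + 1, used_right, current, total + weight[(u, v)])
                used_right.remove(v)
                current.pop()

    extend(0, set(), [], 0)
    return best, found
```

With unit weights on the complete bipartite graph on 2 + 2 vertices, the sketch reports both perfect matchings; with a single negative-weight pair, the empty matching is the unique optimum.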
Connectivity Oracles for Graphs Subject to Vertex Failures
We introduce new data structures for answering connectivity queries in graphs
subject to batched vertex failures. A deterministic structure processes a batch
of failed vertices in time and thereafter
answers connectivity queries in time. It occupies space . We develop a randomized Monte Carlo version of our data structure
with update time , query time , and space
for any failure bound . This is the first connectivity oracle for
general graphs that can efficiently deal with an unbounded number of vertex
failures.
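For orientation, the query semantics can be sketched with a naive baseline that uses no preprocessing: given a batch of failed vertices, run a breadth-first search that skips them. This is not the data structure from the abstract, which aims to avoid exactly this per-query graph traversal. The function name `connected_after_failures` and the adjacency-dictionary input format are assumptions made for this illustration.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, List


def connected_after_failures(
    adj: Dict[Hashable, List[Hashable]],
    failed: Iterable[Hashable],
    s: Hashable,
    t: Hashable,
) -> bool:
    """Naive baseline: is t reachable from s once the failed vertices are removed?

    `adj` is an undirected graph given as adjacency lists. Each query costs
    time linear in the graph size, which a connectivity oracle tries to avoid.
    """
    failed_set = set(failed)
    # A failed endpoint is not part of the remaining graph.
    if s in failed_set or t in failed_set:
        return False
    if s == t:
        return True
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v in failed_set or v in seen:
                continue
            if v == t:
                return True
            seen.add(v)
            queue.append(v)
    return False
```

On a 4-cycle 0-1-2-3-0, vertices 0 and 2 stay connected if vertex 1 fails, but not if both 1 and 3 fail.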
We also develop a more efficient Monte Carlo edge-failure connectivity
oracle. Using space , edge failures are processed in time and thereafter, connectivity queries are answered in
time, which are correct with high probability.
Our data structures are based on a new decomposition theorem for an
undirected graph , which is of independent interest. It states that
for any terminal set we can remove a set of
vertices such that the remaining graph contains a Steiner forest for with
maximum degree
Traversing combinatorial 0/1-polytopes via optimization
In this paper, we present a new framework that exploits combinatorial optimization for efficiently generating a large variety of combinatorial objects based on graphs, matroids, posets and polytopes.
Our method relies on a simple and versatile algorithm for computing a Hamilton path on the skeleton of any 0/1-polytope conv(X), where X ⊆ {0,1}^n.
The algorithm uses as a black box any algorithm that solves a variant of the classical linear optimization problem , and the resulting delay, i.e., the running time per visited vertex on the Hamilton path, is only by a factor of larger than the running time of the optimization algorithm.
When X encodes a particular class of combinatorial objects, then traversing the skeleton of the polytope conv(X) along a Hamilton path corresponds to listing the combinatorial objects by local change operations, i.e., we obtain Gray code listings.
As concrete results of our general framework, we obtain efficient algorithms for generating all (-optimal) bases and independent sets in a matroid; (-optimal) spanning trees, forests, matchings, maximum matchings, and -optimal matchings in a general graph; vertex covers, minimum vertex covers, -optimal vertex covers, stable sets, maximum stable sets and -optimal stable sets in a bipartite graph; as well as antichains, maximum antichains, -optimal antichains, and -optimal ideals of a poset.
Specifically, the delay and space required by these algorithms are polynomial in the size of the matroid ground set, graph, or poset, respectively.
Furthermore, all of these listings correspond to Hamilton paths on the corresponding combinatorial polytopes, namely the base polytope, matching polytope, vertex cover polytope, stable set polytope, chain polytope and order polytope, respectively.
As another corollary from our framework, we obtain an O(t_LP log n) delay algorithm for the vertex enumeration problem on 0/1-polytopes , where and , and t_LP is the time needed to solve the linear program .
This improves upon the 25-year-old O(t_LP n) delay algorithm due to Bussieck and Lübbecke.
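As a small illustration of what a Gray code listing means in the simplest setting, take X to be all of {0,1}^n. The polytope conv(X) is then the hypercube, two vertices are adjacent on its skeleton exactly when they differ in one coordinate, and the classical binary reflected Gray code is a Hamilton path on that skeleton. The sketch below is not the paper's optimization-based algorithm; the function names `reflected_gray_code` and `is_gray_listing` are chosen for this example only.

```python
from typing import List, Tuple

Vertex = Tuple[int, ...]


def reflected_gray_code(n: int) -> List[Vertex]:
    """List all 0/1-vectors of length n in binary reflected Gray code order."""
    listing = []
    for i in range(2 ** n):
        g = i ^ (i >> 1)  # standard formula for the i-th Gray code word
        # Unpack g into n bits, most significant bit first.
        listing.append(tuple((g >> k) & 1 for k in reversed(range(n))))
    return listing


def is_gray_listing(listing: List[Vertex]) -> bool:
    """Check that consecutive vectors differ in exactly one coordinate."""
    return all(
        sum(a != b for a, b in zip(x, y)) == 1
        for x, y in zip(listing, listing[1:])
    )


if __name__ == "__main__":
    for vertex in reflected_gray_code(3):
        print(vertex)
```

Every vertex of the hypercube appears exactly once and each step flips a single bit, which is the "local change" property the abstract refers to. For a general X ⊆ {0,1}^n the skeleton edges are more involved, and that general case is what the paper's framework addresses.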
- …