2,506 research outputs found

    Online Steiner Tree with Deletions

    Full text link
    In the online Steiner tree problem, the input is a set of vertices that appear one-by-one, and we have to maintain a Steiner tree on the current set of vertices. The cost of the tree is the total length of edges in the tree, and we want this cost to be close to the cost of the optimal Steiner tree at all points in time. If we are allowed to only add edges, a tight bound of Θ(logn)\Theta(\log n) on the competitiveness is known. Recently it was shown that if we can add one new edge and make one edge swap upon every vertex arrival, we can maintain a constant-competitive tree online. But what if the set of vertices sees both additions and deletions? Again, we would like to obtain a low-cost Steiner tree with as few edge changes as possible. The original paper of Imase and Waxman had also considered this model, and it gave a greedy algorithm that maintained a constant-competitive tree online, and made at most O(n3/2)O(n^{3/2}) edge changes for the first nn requests. In this paper give the following two results. Our first result is an online algorithm that maintains a Steiner tree only under deletions: we start off with a set of vertices, and at each time one of the vertices is removed from this set: our Steiner tree no longer has to span this vertex. We give an algorithm that changes only a constant number of edges upon each request, and maintains a constant-competitive tree at all times. Our algorithm uses the primal-dual framework and a global charging argument to carefully make these constant number of changes. We then study the natural greedy algorithm proposed by Imase and Waxman that maintains a constant-competitive Steiner tree in the fully-dynamic model (where each request either adds or deletes a vertex). Our second result shows that this algorithm makes only a constant number of changes per request in an amortized sense.Comment: An extended abstract appears in the SODA 2014 conferenc

    The Power of Dynamic Distance Oracles: Efficient Dynamic Algorithms for the Steiner Tree

    Get PDF
    In this paper we study the Steiner tree problem over a dynamic set of terminals. We consider the model where we are given an nn-vertex graph G=(V,E,w)G=(V,E,w) with positive real edge weights, and our goal is to maintain a tree which is a good approximation of the minimum Steiner tree spanning a terminal set SVS \subseteq V, which changes over time. The changes applied to the terminal set are either terminal additions (incremental scenario), terminal removals (decremental scenario), or both (fully dynamic scenario). Our task here is twofold. We want to support updates in sublinear o(n)o(n) time, and keep the approximation factor of the algorithm as small as possible. We show that we can maintain a (6+ε)(6+\varepsilon)-approximate Steiner tree of a general graph in O~(nlogD)\tilde{O}(\sqrt{n} \log D) time per terminal addition or removal. Here, DD denotes the stretch of the metric induced by GG. For planar graphs we achieve the same running time and the approximation ratio of (2+ε)(2+\varepsilon). Moreover, we show faster algorithms for incremental and decremental scenarios. Finally, we show that if we allow higher approximation ratio, even more efficient algorithms are possible. In particular we show a polylogarithmic time (4+ε)(4+\varepsilon)-approximate algorithm for planar graphs. One of the main building blocks of our algorithms are dynamic distance oracles for vertex-labeled graphs, which are of independent interest. We also improve and use the online algorithms for the Steiner tree problem.Comment: Full version of the paper accepted to STOC'1

    Compressing DNA sequence databases with coil

    Get PDF
    Background: Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results: We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion: coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work

    Connectivity Oracles for Graphs Subject to Vertex Failures

    Full text link
    We introduce new data structures for answering connectivity queries in graphs subject to batched vertex failures. A deterministic structure processes a batch of ddd\leq d_{\star} failed vertices in O~(d3)\tilde{O}(d^3) time and thereafter answers connectivity queries in O(d)O(d) time. It occupies space O(dmlogn)O(d_{\star} m\log n). We develop a randomized Monte Carlo version of our data structure with update time O~(d2)\tilde{O}(d^2), query time O(d)O(d), and space O~(m)\tilde{O}(m) for any failure bound dnd\le n. This is the first connectivity oracle for general graphs that can efficiently deal with an unbounded number of vertex failures. We also develop a more efficient Monte Carlo edge-failure connectivity oracle. Using space O(nlog2n)O(n\log^2 n), dd edge failures are processed in O(dlogdloglogn)O(d\log d\log\log n) time and thereafter, connectivity queries are answered in O(loglogn)O(\log\log n) time, which are correct w.h.p. Our data structures are based on a new decomposition theorem for an undirected graph G=(V,E)G=(V,E), which is of independent interest. It states that for any terminal set UVU\subseteq V we can remove a set BB of U/(s2)|U|/(s-2) vertices such that the remaining graph contains a Steiner forest for UBU-B with maximum degree ss

    Approximate Closest Community Search in Networks

    Get PDF
    Recently, there has been significant interest in the study of the community search problem in social and information networks: given one or more query nodes, find densely connected communities containing the query nodes. However, most existing studies do not address the "free rider" issue, that is, nodes far away from query nodes and irrelevant to them are included in the detected community. Some state-of-the-art models have attempted to address this issue, but not only are their formulated problems NP-hard, they do not admit any approximations without restrictive assumptions, which may not always hold in practice. In this paper, given an undirected graph G and a set of query nodes Q, we study community search using the k-truss based community model. We formulate our problem of finding a closest truss community (CTC), as finding a connected k-truss subgraph with the largest k that contains Q, and has the minimum diameter among such subgraphs. We prove this problem is NP-hard. Furthermore, it is NP-hard to approximate the problem within a factor (2ε)(2-\varepsilon), for any ε>0\varepsilon >0 . However, we develop a greedy algorithmic framework, which first finds a CTC containing Q, and then iteratively removes the furthest nodes from Q, from the graph. The method achieves 2-approximation to the optimal solution. To further improve the efficiency, we make use of a compact truss index and develop efficient algorithms for k-truss identification and maintenance as nodes get eliminated. In addition, using bulk deletion optimization and local exploration strategies, we propose two more efficient algorithms. One of them trades some approximation quality for efficiency while the other is a very efficient heuristic. Extensive experiments on 6 real-world networks show the effectiveness and efficiency of our community model and search algorithms

    Finding and counting vertex-colored subtrees

    Full text link
    The problems studied in this article originate from the Graph Motif problem introduced by Lacroix et al. in the context of biological networks. The problem is to decide if a vertex-colored graph has a connected subgraph whose colors equal a given multiset of colors MM. It is a graph pattern-matching problem variant, where the structure of the occurrence of the pattern is not of interest but the only requirement is the connectedness. Using an algebraic framework recently introduced by Koutis et al., we obtain new FPT algorithms for Graph Motif and variants, with improved running times. We also obtain results on the counting versions of this problem, proving that the counting problem is FPT if M is a set, but becomes W[1]-hard if M is a multiset with two colors. Finally, we present an experimental evaluation of this approach on real datasets, showing that its performance compares favorably with existing software.Comment: Conference version in International Symposium on Mathematical Foundations of Computer Science (MFCS), Brno : Czech Republic (2010) Journal Version in Algorithmic

    Relaxing the Irrevocability Requirement for Online Graph Algorithms

    Get PDF
    Online graph problems are considered in models where the irrevocability requirement is relaxed. Motivated by practical examples where, for example, there is a cost associated with building a facility and no extra cost associated with doing it later, we consider the Late Accept model, where a request can be accepted at a later point, but any acceptance is irrevocable. Similarly, we also consider a Late Reject model, where an accepted request can later be rejected, but any rejection is irrevocable (this is sometimes called preemption). Finally, we consider the Late Accept/Reject model, where late accepts and rejects are both allowed, but any late reject is irrevocable. For Independent Set, the Late Accept/Reject model is necessary to obtain a constant competitive ratio, but for Vertex Cover the Late Accept model is sufficient and for Minimum Spanning Forest the Late Reject model is sufficient. The Matching problem has a competitive ratio of 2, but in the Late Accept/Reject model, its competitive ratio is 3/2
    corecore