28 research outputs found

    Online Maximum k-Coverage

    Get PDF
    We study an online model for the maximum k-vertex-coverage problem, where given a graph G = (V,E) and an integer k, we ask for a subset A ⊆ V, such that |A | = k and the number of edges covered by A is maximized. In our model, at each step i, a new vertex vi is revealed, and we have to decide whether we will keep it or discard it. At any time of the process, only k vertices can be kept in memory; if at some point the current solution already contains k vertices, any inclusion of any new vertex in the solution must entail the irremediable deletion of one vertex of the current solution (a vertex not kept when revealed is irremediably deleted). We propose algorithms for several natural classes of graphs (mainly regular and bipartite), improving on an easy 1/2-competitive ratio. We next settle a set-version of the problem, called maximum k-(set)-coverage problem. For this problem we present an algorithm that improves upon former results for the same model for small and moderate values of k

    An Efficient Streaming Algorithm for the Submodular Cover Problem

    Get PDF
    We initiate the study of the classical Submodular Cover (SC) problem in the data streaming model which we refer to as the Streaming Submodular Cover (SSC). We show that any single pass streaming algorithm using sublinear memory in the size of the stream will fail to provide any non-trivial approximation guarantees for SSC. Hence, we consider a relaxed version of SSC, where we only seek to find a partial cover. We design the first Efficient bicriteria Submodular Cover Streaming (ESC-Streaming) algorithm for this problem, and provide theoretical guarantees for its performance supported by numerical evidence. Our algorithm finds solutions that are competitive with the near-optimal offline greedy algorithm despite requiring only a single pass over the data stream. In our numerical experiments, we evaluate the performance of ESC-Streaming on active set selection and large-scale graph cover problems.Comment: To appear in NIPS'1

    Semi-Streaming Set Cover

    Full text link
    This paper studies the set cover problem under the semi-streaming model. The underlying set system is formalized in terms of a hypergraph G=(V,E)G = (V, E) whose edges arrive one-by-one and the goal is to construct an edge cover FEF \subseteq E with the objective of minimizing the cardinality (or cost in the weighted case) of FF. We consider a parameterized relaxation of this problem, where given some 0ϵ<10 \leq \epsilon < 1, the goal is to construct an edge (1ϵ)(1 - \epsilon)-cover, namely, a subset of edges incident to all but an ϵ\epsilon-fraction of the vertices (or their benefit in the weighted case). The key limitation imposed on the algorithm is that its space is limited to (poly)logarithmically many bits per vertex. Our main result is an asymptotically tight trade-off between ϵ\epsilon and the approximation ratio: We design a semi-streaming algorithm that on input graph GG, constructs a succinct data structure D\mathcal{D} such that for every 0ϵ<10 \leq \epsilon < 1, an edge (1ϵ)(1 - \epsilon)-cover that approximates the optimal edge \mbox{(11-)cover} within a factor of f(ϵ,n)f(\epsilon, n) can be extracted from D\mathcal{D} (efficiently and with no additional space requirements), where f(ϵ,n)={O(1/ϵ),if ϵ>1/nO(n),otherwise. f(\epsilon, n) = \left\{ \begin{array}{ll} O (1 / \epsilon), & \text{if } \epsilon > 1 / \sqrt{n} \\ O (\sqrt{n}), & \text{otherwise} \end{array} \right. \, . In particular for the traditional set cover problem we obtain an O(n)O(\sqrt{n})-approximation. This algorithm is proved to be best possible by establishing a family (parameterized by ϵ\epsilon) of matching lower bounds.Comment: Full version of the extended abstract that will appear in Proceedings of ICALP 2014 track

    Almost Optimal Streaming Algorithms for Coverage Problems

    Full text link
    Maximum coverage and minimum set cover problems --collectively called coverage problems-- have been studied extensively in streaming models. However, previous research not only achieve sub-optimal approximation factors and space complexities, but also study a restricted set arrival model which makes an explicit or implicit assumption on oracle access to the sets, ignoring the complexity of reading and storing the whole set at once. In this paper, we address the above shortcomings, and present algorithms with improved approximation factor and improved space complexity, and prove that our results are almost tight. Moreover, unlike most of previous work, our results hold on a more general edge arrival model. More specifically, we present (almost) optimal approximation algorithms for maximum coverage and minimum set cover problems in the streaming model with an (almost) optimal space complexity of O~(n)\tilde{O}(n), i.e., the space is {\em independent of the size of the sets or the size of the ground set of elements}. These results not only improve over the best known algorithms for the set arrival model, but also are the first such algorithms for the more powerful {\em edge arrival} model. In order to achieve the above results, we introduce a new general sketching technique for coverage functions: This sketching scheme can be applied to convert an α\alpha-approximation algorithm for a coverage problem to a (1-\eps)\alpha-approximation algorithm for the same problem in streaming, or RAM models. We show the significance of our sketching technique by ruling out the possibility of solving coverage problems via accessing (as a black box) a (1 \pm \eps)-approximate oracle (e.g., a sketch function) that estimates the coverage function on any subfamily of the sets

    Incidence Geometries and the Pass Complexity of Semi-Streaming Set Cover

    Full text link
    Set cover, over a universe of size nn, may be modelled as a data-streaming problem, where the mm sets that comprise the instance are to be read one by one. A semi-streaming algorithm is allowed only O(npoly{logn,logm})O(n\, \mathrm{poly}\{\log n, \log m\}) space to process this stream. For each p1p \ge 1, we give a very simple deterministic algorithm that makes pp passes over the input stream and returns an appropriately certified (p+1)n1/(p+1)(p+1)n^{1/(p+1)}-approximation to the optimum set cover. More importantly, we proceed to show that this approximation factor is essentially tight, by showing that a factor better than 0.99n1/(p+1)/(p+1)20.99\,n^{1/(p+1)}/(p+1)^2 is unachievable for a pp-pass semi-streaming algorithm, even allowing randomisation. In particular, this implies that achieving a Θ(logn)\Theta(\log n)-approximation requires Ω(logn/loglogn)\Omega(\log n/\log\log n) passes, which is tight up to the loglogn\log\log n factor. These results extend to a relaxation of the set cover problem where we are allowed to leave an ε\varepsilon fraction of the universe uncovered: the tight bounds on the best approximation factor achievable in pp passes turn out to be Θp(min{n1/(p+1),ε1/p})\Theta_p(\min\{n^{1/(p+1)}, \varepsilon^{-1/p}\}). Our lower bounds are based on a construction of a family of high-rank incidence geometries, which may be thought of as vast generalisations of affine planes. This construction, based on algebraic techniques, appears flexible enough to find other applications and is therefore interesting in its own right.Comment: 20 page

    On Streaming and Communication Complexity of the Set Cover Problem

    Get PDF
    We develop the first streaming algorithm and the first two-party communication protocol that uses a constant number of passes/rounds and sublinear space/communication for logarithmic approximation to the classic Set Cover problem. Specifically, for n elements and m sets, our algorithm/protocol achieves a space bound of O(m ·n [superscript δ] log[superscript 2] n logm) using O(4[superscript 1/δ]) passes/rounds while achieving an approximation factor of O(4[superscript 1/δ]logn) in polynomial time (for δ = Ω(1/logn)). If we allow the algorithm/protocol to spend exponential time per pass/round, we achieve an approximation factor of O(4[superscript 1/δ]). Our approach uses randomization, which we show is necessary: no deterministic constant approximation is possible (even given exponential time) using o(m n) space. These results are some of the first on streaming algorithms and efficient two-party communication protocols for approximation algorithms. Moreover, we show that our algorithm can be applied to multi-party communication model.National Science Foundation (U.S.) (Grant CCF-1161626)National Science Foundation (U.S.) (Grant CCF-1065125)United States. Defense Advanced Research Projects Agency (United States. Air Force Office of Scientific Research Grant FA9550-12-1-0423)David & Lucile Packard FoundationSimons FoundationDanish National Research Foundation. Center for Massiave Data Algorithmics (MADALGO

    Towards Tight Bounds for the Streaming Set Cover Problem

    Full text link
    We consider the classic Set Cover problem in the data stream model. For nn elements and mm sets (mnm\geq n) we give a O(1/δ)O(1/\delta)-pass algorithm with a strongly sub-linear O~(mnδ)\tilde{O}(mn^{\delta}) space and logarithmic approximation factor. This yields a significant improvement over the earlier algorithm of Demaine et al. [DIMV14] that uses exponentially larger number of passes. We complement this result by showing that the tradeoff between the number of passes and space exhibited by our algorithm is tight, at least when the approximation factor is equal to 11. Specifically, we show that any algorithm that computes set cover exactly using (12δ1)({1 \over 2\delta}-1) passes must use Ω~(mnδ)\tilde{\Omega}(mn^{\delta}) space in the regime of m=O(n)m=O(n). Furthermore, we consider the problem in the geometric setting where the elements are points in R2\mathbb{R}^2 and sets are either discs, axis-parallel rectangles, or fat triangles in the plane, and show that our algorithm (with a slight modification) uses the optimal O~(n)\tilde{O}(n) space to find a logarithmic approximation in O(1/δ)O(1/\delta) passes. Finally, we show that any randomized one-pass algorithm that distinguishes between covers of size 2 and 3 must use a linear (i.e., Ω(mn)\Omega(mn)) amount of space. This is the first result showing that a randomized, approximate algorithm cannot achieve a space bound that is sublinear in the input size. This indicates that using multiple passes might be necessary in order to achieve sub-linear space bounds for this problem while guaranteeing small approximation factors.Comment: A preliminary version of this paper is to appear in PODS 201
    corecore