Online Maximum k-Coverage
We study an online model for the maximum k-vertex-coverage problem, where given a graph G = (V,E) and an integer k, we ask for a subset A ⊆ V such that |A| = k and the number of edges covered by A is maximized. In our model, at each step i, a new vertex v_i is revealed, and we have to decide whether to keep it or discard it. At any time during the process, only k vertices can be kept in memory; if the current solution already contains k vertices, including any new vertex in the solution entails the irrevocable deletion of one vertex of the current solution (a vertex not kept when revealed is also irrevocably deleted). We propose algorithms for several natural classes of graphs (mainly regular and bipartite), improving on an easy 1/2-competitive ratio. We next settle a set version of the problem, called the maximum k-(set)-coverage problem. For this problem we present an algorithm that improves upon former results for the same model for small and moderate values of k.
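The online model above can be illustrated with a small simulation. The sketch below is not the paper's algorithm and carries no competitive-ratio guarantee; it is a hypothetical swap heuristic (the names `covered_edges` and `online_swap` are invented here) that keeps at most k vertices and swaps a new vertex in only when doing so strictly increases the number of covered edges. As a simplifying assumption, coverage is evaluated against the full edge list, which a real online algorithm may not have access to.

```python
def covered_edges(edges, kept):
    """Count edges with at least one endpoint in `kept`."""
    kept = set(kept)
    return sum(1 for u, v in edges if u in kept or v in kept)


def online_swap(vertex_order, edges, k):
    """Process vertices one at a time, never holding more than k.

    Hypothetical heuristic for illustration only: a discarded or
    evicted vertex is never reconsidered, matching the model's
    irrevocability constraint.
    """
    kept = []
    for v in vertex_order:
        if len(kept) < k:
            kept.append(v)  # memory not full yet: keep the vertex
            continue
        best_value, best_set = covered_edges(edges, kept), kept
        # Try evicting each kept vertex in favour of the new one.
        for i in range(k):
            candidate = kept[:i] + kept[i + 1:] + [v]
            value = covered_edges(edges, candidate)
            if value > best_value:
                best_value, best_set = value, candidate
        kept = best_set  # unchanged if no swap strictly helps
    return kept


# Path graph 0-1-2-3-4 with k = 2.
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
solution = online_swap([0, 1, 2, 3, 4], path_edges, k=2)
```

On this path the heuristic ends with two vertices that together cover all four edges, but on other inputs a greedy swap rule of this kind can be far from optimal.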
An Efficient Streaming Algorithm for the Submodular Cover Problem
We initiate the study of the classical Submodular Cover (SC) problem in the
data streaming model which we refer to as the Streaming Submodular Cover (SSC).
We show that any single pass streaming algorithm using sublinear memory in the
size of the stream will fail to provide any non-trivial approximation
guarantees for SSC. Hence, we consider a relaxed version of SSC, where we only
seek to find a partial cover.
We design the first Efficient bicriteria Submodular Cover Streaming
(ESC-Streaming) algorithm for this problem, and provide theoretical guarantees
for its performance supported by numerical evidence. Our algorithm finds
solutions that are competitive with the near-optimal offline greedy algorithm
despite requiring only a single pass over the data stream. In our numerical
experiments, we evaluate the performance of ESC-Streaming on active set
selection and large-scale graph cover problems. Comment: To appear in NIPS'1
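For context, the offline greedy baseline mentioned in this abstract can be sketched for the special case of a coverage function (a standard example of a submodular function). This is a minimal illustration, not the ESC-Streaming algorithm; the function name `greedy_partial_cover` and the `fraction` parameter are assumptions introduced here to mimic the relaxed goal of finding only a partial cover.

```python
def greedy_partial_cover(sets, universe, fraction=1.0):
    """Offline greedy for (partial) set cover.

    `sets` maps a name to a set of elements, assumed to be subsets
    of `universe`. Stops once at least `fraction` of the universe
    is covered, or when no set adds anything new.
    """
    target = fraction * len(universe)
    covered, chosen = set(), []
    while len(covered) < target:
        # Pick the set with the largest marginal gain.
        best = max(sets, key=lambda name: len(sets[name] - covered))
        if not sets[best] - covered:
            break  # nothing left to gain; target unreachable
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


example_sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5}, "d": {5}}
example_universe = {1, 2, 3, 4, 5}
full_cover, _ = greedy_partial_cover(example_sets, example_universe)
partial_cover, _ = greedy_partial_cover(example_sets, example_universe, fraction=0.6)
```

The greedy rule needs random access to all sets at every step, which is exactly what a single-pass streaming algorithm cannot afford.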
Semi-Streaming Set Cover
This paper studies the set cover problem under the semi-streaming model. The
underlying set system is formalized in terms of a hypergraph whose
edges arrive one-by-one and the goal is to construct an edge cover with the objective of minimizing the cardinality (or cost in the weighted
case) of . We consider a parameterized relaxation of this problem, where
given some , the goal is to construct an edge -cover, namely, a subset of edges incident to all but an
-fraction of the vertices (or their benefit in the weighted case).
The key limitation imposed on the algorithm is that its space is limited to
(poly)logarithmically many bits per vertex.
Our main result is an asymptotically tight trade-off between and
the approximation ratio: We design a semi-streaming algorithm that on input
graph , constructs a succinct data structure such that for
every , an edge -cover that approximates
the optimal edge \mbox{(-)cover} within a factor of can be
extracted from (efficiently and with no additional space
requirements), where In particular for the traditional
set cover problem we obtain an -approximation. This algorithm is
proved to be best possible by establishing a family (parameterized by
) of matching lower bounds. Comment: Full version of the extended abstract that will appear in Proceedings
of ICALP 2014 track
Almost Optimal Streaming Algorithms for Coverage Problems
Maximum coverage and minimum set cover problems --collectively called
coverage problems-- have been studied extensively in streaming models. However,
previous research not only achieves sub-optimal approximation factors and space
complexities, but also studies a restricted set arrival model which makes an
explicit or implicit assumption on oracle access to the sets, ignoring the
complexity of reading and storing the whole set at once. In this paper, we
address the above shortcomings, and present algorithms with improved
approximation factor and improved space complexity, and prove that our results
are almost tight. Moreover, unlike most previous work, our results hold on a
more general edge arrival model. More specifically, we present (almost) optimal
approximation algorithms for maximum coverage and minimum set cover problems in
the streaming model with an (almost) optimal space complexity of
, i.e., the space is independent of the size of the sets or
the size of the ground set of elements. These results not only improve over
the best known algorithms for the set arrival model, but also are the first
such algorithms for the more powerful edge arrival model. In order to
achieve the above results, we introduce a new general sketching technique for
coverage functions: This sketching scheme can be applied to convert an
α-approximation algorithm for a coverage problem to a
(1-ε)α-approximation algorithm for the same problem in streaming, or
RAM models. We show the significance of our sketching technique by ruling out
the possibility of solving coverage problems via accessing (as a black box) a
(1 ± ε)-approximate oracle (e.g., a sketch function) that estimates the
coverage function on any subfamily of the sets.
Incidence Geometries and the Pass Complexity of Semi-Streaming Set Cover
Set cover, over a universe of size , may be modelled as a data-streaming
problem, where the sets that comprise the instance are to be read one by
one. A semi-streaming algorithm is allowed only space to process this stream. For each , we give a very
simple deterministic algorithm that makes passes over the input stream and
returns an appropriately certified -approximation to the
optimum set cover. More importantly, we proceed to show that this approximation
factor is essentially tight, by showing that a factor better than
is unachievable for a -pass semi-streaming
algorithm, even allowing randomisation. In particular, this implies that
achieving a -approximation requires
passes, which is tight up to the factor. These results extend to a
relaxation of the set cover problem where we are allowed to leave an
fraction of the universe uncovered: the tight bounds on the best
approximation factor achievable in passes turn out to be
. Our lower bounds are based
on a construction of a family of high-rank incidence geometries, which may be
thought of as vast generalisations of affine planes. This construction, based
on algebraic techniques, appears flexible enough to find other applications and
is therefore interesting in its own right. Comment: 20 page
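The abstract above does not spell out its algorithm, and its parameters did not survive in this listing. As a rough illustration of how a simple deterministic multi-pass semi-streaming algorithm for set cover can work, the sketch below uses a decreasing-threshold rule. The threshold schedule n^(1 - j/p), the function name `multipass_threshold_cover`, and the `stream_factory` callable are all assumptions made here, not details taken from the paper.

```python
def multipass_threshold_cover(stream_factory, universe, passes):
    """Multi-pass threshold greedy for set cover.

    `stream_factory()` returns a fresh iterator over (name, set)
    pairs, one call per pass. Only the uncovered elements and the
    chosen names are stored between sets. In pass j a set is taken
    if it covers at least n ** (1 - j / passes) new elements; the
    final pass has threshold 1, so any useful set is accepted.
    """
    n = len(universe)
    uncovered = set(universe)
    chosen = []
    for j in range(1, passes + 1):
        threshold = n ** (1 - j / passes)  # assumed schedule
        for name, members in stream_factory():
            gain = members & uncovered
            if gain and len(gain) >= threshold:
                chosen.append(name)
                uncovered -= gain
    return chosen, uncovered


stream = [
    ("s1", {1, 2}),
    ("s2", {1, 2, 3, 4, 5}),
    ("s3", {6, 7}),
    ("s4", {8}),
    ("s5", {5, 6, 7, 8}),
]
two_pass, left_two = multipass_threshold_cover(lambda: iter(stream), set(range(1, 9)), passes=2)
one_pass, left_one = multipass_threshold_cover(lambda: iter(stream), set(range(1, 9)), passes=1)
```

With a single pass the rule degenerates to taking every set that covers something new, which here selects four sets; with two passes the higher first-pass threshold skips the small sets and the cover shrinks to two.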
On Streaming and Communication Complexity of the Set Cover Problem
We develop the first streaming algorithm and the first two-party communication protocol that uses a constant number of passes/rounds and sublinear space/communication for logarithmic approximation to the classic Set Cover problem. Specifically, for n elements and m sets, our algorithm/protocol achieves a space bound of O(m · n^δ log^2 n log m) using O(4^{1/δ}) passes/rounds while achieving an approximation factor of O(4^{1/δ} log n) in polynomial time (for δ = Ω(1/log n)). If we allow the algorithm/protocol to spend exponential time per pass/round, we achieve an approximation factor of O(4^{1/δ}). Our approach uses randomization, which we show is necessary: no deterministic constant approximation is possible (even given exponential time) using o(mn) space. These results are some of the first on streaming algorithms and efficient two-party communication protocols for approximation algorithms. Moreover, we show that our algorithm can be applied to the multi-party communication model. National Science Foundation (U.S.) (Grant CCF-1161626); National Science Foundation (U.S.) (Grant CCF-1065125); United States. Defense Advanced Research Projects Agency (United States. Air Force Office of Scientific Research Grant FA9550-12-1-0423); David & Lucile Packard Foundation; Simons Foundation; Danish National Research Foundation. Center for Massive Data Algorithmics (MADALGO
Towards Tight Bounds for the Streaming Set Cover Problem
We consider the classic Set Cover problem in the data stream model. For
elements and sets () we give a -pass algorithm with a
strongly sub-linear space and logarithmic
approximation factor. This yields a significant improvement over the earlier
algorithm of Demaine et al. [DIMV14] that uses an exponentially larger number of
passes. We complement this result by showing that the tradeoff between the
number of passes and space exhibited by our algorithm is tight, at least when
the approximation factor is equal to . Specifically, we show that any
algorithm that computes set cover exactly using passes
must use space in the regime of .
Furthermore, we consider the problem in the geometric setting where the
elements are points in and sets are either discs, axis-parallel
rectangles, or fat triangles in the plane, and show that our algorithm (with a
slight modification) uses the optimal space to find a
logarithmic approximation in passes.
Finally, we show that any randomized one-pass algorithm that distinguishes
between covers of size 2 and 3 must use a linear (i.e., ) amount of
space. This is the first result showing that a randomized, approximate
algorithm cannot achieve a space bound that is sublinear in the input size.
This indicates that using multiple passes might be necessary in order to
achieve sub-linear space bounds for this problem while guaranteeing small
approximation factors. Comment: A preliminary version of this paper is to appear in PODS 201