Almost Optimal Streaming Algorithms for Coverage Problems
Maximum coverage and minimum set cover problems --collectively called
coverage problems-- have been studied extensively in streaming models. However,
previous research not only achieves sub-optimal approximation factors and space
complexities, but also studies a restricted set arrival model which makes an
explicit or implicit assumption on oracle access to the sets, ignoring the
complexity of reading and storing the whole set at once. In this paper, we
address the above shortcomings, and present algorithms with improved
approximation factor and improved space complexity, and prove that our results
are almost tight. Moreover, unlike most previous work, our results hold in a
more general edge arrival model. More specifically, we present (almost) optimal
approximation algorithms for maximum coverage and minimum set cover problems in
the streaming model with an (almost) optimal space complexity of
, i.e., the space is {\em independent of the size of the sets or
the size of the ground set of elements}. These results not only improve over
the best known algorithms for the set arrival model, but also are the first
such algorithms for the more powerful {\em edge arrival} model. In order to
achieve the above results, we introduce a new general sketching technique for
coverage functions: This sketching scheme can be applied to convert an
\alpha-approximation algorithm for a coverage problem to a
(1-\eps)\alpha-approximation algorithm for the same problem in the streaming or
RAM models. We show the significance of our sketching technique by ruling out
the possibility of solving coverage problems via accessing (as a black box) a
(1 \pm \eps)-approximate oracle (e.g., a sketch function) that estimates the
coverage function on any subfamily of the sets.
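To make the edge arrival model and the coverage function concrete, here is a minimal sketch in Python. It is not the sketching technique of the paper: it stores the whole input in memory and runs the classical offline greedy heuristic for maximum coverage. The function names and the toy stream are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def sets_from_edge_stream(edges):
    """Rebuild the set system from (set_id, element) pairs in arbitrary order.

    In the edge arrival model a set is never delivered as a whole; its
    membership pairs may be interleaved with those of other sets.
    """
    sets = defaultdict(set)
    for set_id, element in edges:
        sets[set_id].add(element)
    return dict(sets)


def coverage(sets, chosen):
    """Coverage function: number of distinct elements covered by `chosen`."""
    covered = set()
    for set_id in chosen:
        covered |= sets[set_id]
    return len(covered)


def greedy_max_coverage(sets, k):
    """Classical offline greedy: pick up to k sets by largest marginal gain."""
    chosen, covered = [], set()
    for _ in range(k):
        candidates = [s for s in sets if s not in chosen]
        if not candidates:
            break
        best = max(candidates, key=lambda s: len(sets[s] - covered))
        if not sets[best] - covered:
            break  # no remaining set adds a new element
        chosen.append(best)
        covered |= sets[best]
    return chosen


# Toy stream (assumed data): edges of sets A, B, C arrive interleaved.
edges = [("A", 1), ("B", 3), ("A", 2), ("C", 1), ("B", 4), ("A", 3), ("C", 5)]
sets = sets_from_edge_stream(edges)
chosen = greedy_max_coverage(sets, k=2)
print(chosen, coverage(sets, chosen))
```

The point of the abstract is that a streaming algorithm cannot afford the `sets` dictionary built here; the sketch only fixes the input format and the objective being approximated.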
Streaming Algorithms for Connectivity Augmentation
We study the -connectivity augmentation problem (-CAP) in the
single-pass streaming model. Given a -edge connected graph
that is stored in memory, and a stream of weighted edges with weights in
, the goal is to choose a minimum weight subset such that is -edge connected. We give a
-approximation algorithm for this problem that requires storing
words. Moreover, we show our result is tight: Any
algorithm with better than -approximation for the problem requires
bits of space even when . This establishes a gap between the
optimal approximation factor one can obtain in the streaming vs the offline
setting for -CAP.
We further consider a natural generalization to the fully streaming model
where both and arrive in the stream in an arbitrary order. We show that
this problem has a space lower bound that matches the best possible size of a
spanner of the same approximation ratio. Following this, we give improved
results for spanners on weighted graphs: We show a streaming algorithm that
finds a -approximate weighted spanner of size at most
for integer , whereas the best prior
streaming algorithm for spanner on weighted graphs had size depending on . Using our spanner result, we provide an optimal -approximation for
-CAP in the fully streaming model with words of space.
Finally, we apply our results to network design problems such as the Steiner tree
augmentation problem (STAP), -edge connected spanning subgraph (-ECSS),
and the general Survivable Network Design problem (SNDP). In particular, we
show a single-pass -approximation for SNDP using
words of space, where is the maximum connectivity requirement.
Robust Communication Complexity of Matching: EDCS Achieves 5/6 Approximation
We study the robust communication complexity of maximum matching. Edges of an
arbitrary -vertex graph are randomly partitioned between Alice and Bob
independently and uniformly. Alice has to send a single message to Bob such
that Bob can find an (approximate) maximum matching of the whole graph . We
specifically study the best approximation ratio achievable via protocols where
Alice communicates only bits to Bob.
There has been a growing interest in the robust communication model due to
its connections to the random-order streaming model. An algorithm of Assadi and
Behnezhad [ICALP'21] implies a -approximation for a
small constant , which remains the best-known
approximation for general graphs. For bipartite graphs, Assadi and Behnezhad
[Random'21] improved the approximation to 0.716, albeit with a computationally
inefficient (i.e., exponential time) protocol.
In this paper, we study a natural and efficient protocol implied by a
random-order streaming algorithm of Bernstein [ICALP'20] which is based on
edge-degree constrained subgraphs (EDCS) [Bernstein and Stein; ICALP'15]. The
result of Bernstein immediately implies that this protocol achieves an (almost)
-approximation in the robust communication model. We present a
new analysis, proving that it achieves a much better (almost) 5/6-approximation. This significantly improves on previous approximations both
for general and bipartite graphs. We also prove that our analysis of
Bernstein's protocol is tight.
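The abstract does not define an EDCS. Under the standard definition from Bernstein and Stein, a subgraph H of G is a (beta, beta_minus)-EDCS if every edge of H has H-degree sum at most beta and every edge of G missing from H has H-degree sum at least beta_minus. The following Python sketch only checks these two properties for given edge lists; it does not construct an EDCS or implement the communication protocol. The function name and the toy graph are hypothetical, and the graph is assumed to be simple with no self-loops.

```python
from collections import Counter


def is_edcs(graph_edges, subgraph_edges, beta, beta_minus):
    """Check the two EDCS properties for H (subgraph_edges) inside G (graph_edges).

    Assumes an undirected simple graph without self-loops; each edge is a
    pair (u, v) of hashable vertex labels.
    """
    g = {frozenset(e) for e in graph_edges}
    h = {frozenset(e) for e in subgraph_edges}
    if not h <= g:
        return False  # H must be a subgraph of G

    # Degree of every vertex inside H (vertices absent from H count as 0).
    deg_h = Counter(v for edge in h for v in edge)

    def edge_degree(edge):
        return sum(deg_h[v] for v in edge)

    # Property 1: edges in H have degree sum at most beta.
    if any(edge_degree(e) > beta for e in h):
        return False
    # Property 2: edges in G but not in H have degree sum at least beta_minus.
    if any(edge_degree(e) < beta_minus for e in g - h):
        return False
    return True


# Toy example (assumed data): a triangle with a single edge kept in H.
triangle = [(1, 2), (2, 3), (3, 1)]
print(is_edcs(triangle, [(1, 2)], beta=2, beta_minus=1))
```

Roughly speaking, the first property keeps H sparse and the second forces H to be dense around every edge it leaves out, which is what lets a small H still contain a large matching.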
AFQN: approximate Qn estimation in data streams
We present afqn (Approximate Fast Qn), a novel algorithm for approximate computation of the Qn scale estimator in a streaming setting, in the sliding window model. It is well known that computing the Qn estimator exactly may be too costly for some applications, and the problem is a fortiori exacerbated in the streaming setting, in which the time available to process incoming data stream items is short. In this paper we show how to efficiently and accurately approximate the Qn estimator. As an application, we show the use of afqn for fast detection of outliers in data streams. In particular, the outliers are detected in the sliding window model, with a simple check based on the Qn scale estimator. Extensive experimental results on synthetic and real datasets confirm the validity of our approach, showing update rates up to three times faster. Our contributions are as follows: (i) to the best of our knowledge, we present the first approximation algorithm for online computation of the Qn scale estimator in a streaming setting and in the sliding window model; (ii) we show how to take advantage of our UDDSketch algorithm for quantile estimation in order to quickly compute the Qn scale estimator; (iii) as an example of a possible application of the Qn scale estimator, we discuss how to detect outliers in an input data stream.
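For reference, the exact Qn estimator of Rousseeuw and Croux is the k-th smallest pairwise absolute difference, scaled by a constant of about 2.2219 for consistency at the Gaussian, with h = floor(n/2) + 1 and k = h(h-1)/2. The Python sketch below computes it naively in quadratic time on a single window, which is the cost afqn aims to avoid. It is not the afqn algorithm and does not use UDDSketch. The outlier rule, the threshold of 3.0, and the use of the median as the center are assumptions made for illustration, and the finite-sample correction factor is omitted.

```python
from itertools import combinations
from statistics import median

# Consistency constant for Gaussian data (Rousseeuw and Croux).
QN_GAUSSIAN_CONSTANT = 2.2219


def qn_scale(window):
    """Exact Qn of a window, computed naively over all O(n^2) pairs."""
    n = len(window)
    if n < 2:
        raise ValueError("Qn needs at least two observations")
    h = n // 2 + 1
    k = h * (h - 1) // 2  # rank of the order statistic, 1-based
    diffs = sorted(abs(a - b) for a, b in combinations(window, 2))
    return QN_GAUSSIAN_CONSTANT * diffs[k - 1]


def find_outliers(window, threshold=3.0):
    """Flag items whose distance from the median exceeds threshold * Qn.

    The threshold and the choice of the median as center are assumptions,
    not taken from the paper.
    """
    center = median(window)
    scale = qn_scale(window)
    if scale == 0:
        return []  # degenerate window: more than half the values coincide
    return [x for x in window if abs(x - center) / scale > threshold]


# Toy window (assumed data) with one obvious outlier.
print(find_outliers([1, 2, 3, 4, 5, 100]))
```

In a sliding window setting this computation would be repeated on every update, which is why an approximate, sketch-based estimate is attractive.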
Submodular Secretary Problem with Shortlists under General Constraints
In the submodular k-secretary problem, the goal is to select k items in a randomly ordered input so as to maximize the expected value of a given monotone submodular function on the set of selected items. In this paper, we introduce a relaxation of this problem, which we refer to as the submodular k-secretary problem with shortlists. In the proposed problem setting, the algorithm is allowed to choose more than k items as part of a shortlist. Then, after seeing the entire input, the algorithm can choose a subset of size k from the bigger set of items in the shortlist. We are interested in understanding to what extent this relaxation can improve the achievable competitive ratio for the submodular k-secretary problem. In particular, using an O(k) shortlist, can an online algorithm achieve a competitive ratio close to the best achievable online approximation factor for this problem? We answer this question affirmatively by giving a polynomial time algorithm that achieves a 1 - 1/e - epsilon - O(k^{-1}) competitive ratio for any constant epsilon > 0, using a shortlist of size eta_epsilon(k) = O(k). Also, for the special case of m-submodular functions, we demonstrate an algorithm that achieves a 1 - epsilon competitive ratio for any constant epsilon > 0, using an O(1) shortlist. Finally, we show that our algorithm can be implemented in the streaming setting using a memory buffer of size eta_epsilon(k) = O(k) to achieve a 1 - 1/e - epsilon - O(k^{-1}) approximation for submodular function maximization in the random order streaming model. This substantially improves upon the previously best known approximation factor of 1/2 + 8*10^{-14} [Norouzi-Fard et al. 2018] that used a memory buffer of size O(k log k).
We further generalize our results to the case of matroid constraints. We design an algorithm that achieves a 1/2 (1 - 1/e^2 - epsilon - O(1/k)) competitive ratio for any constant epsilon > 0, using a shortlist of size O(k). This is especially surprising considering that the best known competitive ratio for the matroid secretary problem is O(log log k). An important application of our algorithm is for the random order streaming of submodular functions. We show that our algorithm can be implemented in the streaming setting using O(k) memory. It achieves a 1/2 (1 - 1/e^2 - epsilon - O(1/k)) approximation. The previously best known approximation ratio for streaming submodular maximization under a matroid constraint is 0.25 (adversarial order) due to [Feldman et al.], [Chekuri et al.], and [Chakrabarti et al.]. Moreover, we generalize our results to the case of p-matchoid constraints and give a 1/(p+1) (1 - 1/e^{p+1} - epsilon - O(1/k)) approximation using O(k) memory, which asymptotically approaches the best known offline guarantee 1/(p+1) [Nemhauser et al.]. Finally, we empirically evaluate our results on real world data sets such as YouTube video and Twitter stream.