3,793 research outputs found

    Incidence Geometries and the Pass Complexity of Semi-Streaming Set Cover

    Full text link
    Set cover, over a universe of size nn, may be modelled as a data-streaming problem, where the mm sets that comprise the instance are to be read one by one. A semi-streaming algorithm is allowed only O(n poly{log⁥n,log⁥m})O(n\, \mathrm{poly}\{\log n, \log m\}) space to process this stream. For each p≄1p \ge 1, we give a very simple deterministic algorithm that makes pp passes over the input stream and returns an appropriately certified (p+1)n1/(p+1)(p+1)n^{1/(p+1)}-approximation to the optimum set cover. More importantly, we proceed to show that this approximation factor is essentially tight, by showing that a factor better than 0.99 n1/(p+1)/(p+1)20.99\,n^{1/(p+1)}/(p+1)^2 is unachievable for a pp-pass semi-streaming algorithm, even allowing randomisation. In particular, this implies that achieving a Θ(log⁥n)\Theta(\log n)-approximation requires Ω(log⁥n/log⁥log⁥n)\Omega(\log n/\log\log n) passes, which is tight up to the log⁥log⁥n\log\log n factor. These results extend to a relaxation of the set cover problem where we are allowed to leave an Δ\varepsilon fraction of the universe uncovered: the tight bounds on the best approximation factor achievable in pp passes turn out to be Θp(min⁥{n1/(p+1),Δ−1/p})\Theta_p(\min\{n^{1/(p+1)}, \varepsilon^{-1/p}\}). Our lower bounds are based on a construction of a family of high-rank incidence geometries, which may be thought of as vast generalisations of affine planes. This construction, based on algebraic techniques, appears flexible enough to find other applications and is therefore interesting in its own right.Comment: 20 page

    An Optimal Lower Bound on the Communication Complexity of Gap-Hamming-Distance

    Get PDF
    We prove an optimal Ω(n)\Omega(n) lower bound on the randomized communication complexity of the much-studied Gap-Hamming-Distance problem. As a consequence, we obtain essentially optimal multi-pass space lower bounds in the data stream model for a number of fundamental problems, including the estimation of frequency moments. The Gap-Hamming-Distance problem is a communication problem, wherein Alice and Bob receive nn-bit strings xx and yy, respectively. They are promised that the Hamming distance between xx and yy is either at least n/2+nn/2+\sqrt{n} or at most n/2−nn/2-\sqrt{n}, and their goal is to decide which of these is the case. Since the formal presentation of the problem by Indyk and Woodruff (FOCS, 2003), it had been conjectured that the naive protocol, which uses nn bits of communication, is asymptotically optimal. The conjecture was shown to be true in several special cases, e.g., when the communication is deterministic, or when the number of rounds of communication is limited. The proof of our aforementioned result, which settles this conjecture fully, is based on a new geometric statement regarding correlations in Gaussian space, related to a result of C. Borell (1985). To prove this geometric statement, we show that random projections of not-too-small sets in Gaussian space are close to a mixture of translated normal variables

    Robust lower bounds for communication and stream computation

    Get PDF
    We study the communication complexity of evaluating functions when the input data is randomly allocated (according to some known distribution) amongst two or more players, possibly with information overlap. This naturally extends previously studied variable partition models such as the best-case and worst-case partition models. We aim to understand whether the hardness of a communication problem holds for almost every allocation of the input, as opposed to holding for perhaps just a few atypical partitions. A key application is to the heavily studied data stream model. There is a strong connection between our communication lower bounds and lower bounds in the data stream model that are “robust” to the ordering of the data. That is, we prove lower bounds for when the order of the items in the stream is chosen not adversarially but rather uniformly (or near-uniformly) from the set of all permutations. This random-order data stream model has attracted recent interest, since lower bounds here give stronger evidence for the inherent hardness of streaming problems. Our results include the first random-partition communication lower bounds for problems including multi-party set disjointness and gap-Hamming-distance. Both are tight. We also extend and improve previous results for a form of pointer jumping that is relevant to the problem of selection (in particular, median finding). Collectively, these results yield lower bounds for a variety of problems in the random-order data stream model, including estimating the number of distinct elements, approximating frequency moments, and quantile estimation. A short version of this article is available in the Proceedings of the 40th Annual ACM Symposium on Theory of Computing (STOC'08), ACM, pp. 641-650. Compared to the conference presentation, this version considerably expands the detail of the discussion and in the proofs, and substantially changes some of the proof techniques

    Some Communication Complexity Results and their Applications

    Get PDF
    Communication Complexity represents one of the premier techniques for proving lower bounds in theoretical computer science. Lower bounds on communication problems can be leveraged to prove lower bounds in several different areas. In this work, we study three different communication complexity problems. The lower bounds for these problems have applications in circuit complexity, wireless sensor networks, and streaming algorithms. First, we study the multiparty pointer jumping problem. We present the first nontrivial upper bound for this problem. We also provide a suite of strong lower bounds under several restricted classes of protocols. Next, we initiate the study of several non-monotone functions in the distributed functional monitoring setting and provide several lower bounds. In particular, we give a generic adversarial technique and show that when deletions are allowed, no nontrivial protocol is possible. Finally, we study the Gap-Hamming-Distance problem and give tight lower bounds for protocols that use a constant number of messages. As a result, we take a well-known lower bound for one-pass streaming algorithms for a host of problems and extend it so it applies to streaming algorithms that use a constant number of passes

    Lower Bounds for Multi-Pass Processing of Multiple Data Streams

    Get PDF
    This paper gives a brief overview of computation models for data stream processing, and it introduces a new model for multi-pass processing of multiple streams, the so-called mp2s-automata. Two algorithms for solving the set disjointness problem wi th these automata are presented. The main technical contribution of this paper is the proof of a lower bound on the size of memory and the number of heads that are required for solvin g the set disjointness problem with mp2s-automata

    Parameterized Streaming Algorithms for Min-Ones d-SAT

    Get PDF
    In this work, we initiate the study of the Min-Ones d-SAT problem in the parameterized streaming model. An instance of the problem consists of a d-CNF formula F and an integer k, and the objective is to determine if F has a satisfying assignment which sets at most k variables to 1. In the parameterized streaming model, input is provided as a stream, just as in the usual streaming model. A key difference is that the bound on the read-write memory available to the algorithm is O(f(k) log n) (f: N -> N, a computable function) as opposed to the O(log n) bound of the usual streaming model. The other important difference is that the number of passes the algorithm makes over its input must be a (preferably small) function of k. We design a (k + 1)-pass parameterized streaming algorithm that solves Min-Ones d-SAT (d >= 2) using space O((kd^(ck) + k^d)log n) (c > 0, a constant) and a (d + 1)^k-pass algorithm that uses space O(k log n). We also design a streaming kernelization for Min-Ones 2-SAT that makes (k + 2) passes and uses space O(k^6 log n) to produce a kernel with O(k^6) clauses. To complement these positive results, we show that any k-pass algorithm for or Min-Ones d-SAT (d >= 2) requires space Omega(max{n^(1/k) / 2^k, log(n / k)}) on instances (F, k). This is achieved via a reduction from the streaming problem POT Pointer Chasing (Guha and McGregor [ICALP 2008]), which might be of independent interest. Given this, our (k + 1)-pass parameterized streaming algorithm is the best possible, inasmuch as the number of passes is concerned. In contrast to the results of Fafianie and Kratsch [MFCS 2014] and Chitnis et al. [SODA 2015], who independently showed that there are 1-pass parameterized streaming algorithms for Vertex Cover (a restriction of Min-Ones 2-SAT), we show using lower bounds from Communication Complexity that for any d >= 1, a 1-pass streaming algorithm for Min-Ones d-SAT requires space Omega(n). This excludes the possibility of a 1-pass parameterized streaming algorithm for the problem. Additionally, we show that any p-pass algorithm for the problem requires space Omega(n/p)

    Towards Tight Bounds for the Streaming Set Cover Problem

    Full text link
    We consider the classic Set Cover problem in the data stream model. For nn elements and mm sets (m≄nm\geq n) we give a O(1/ÎŽ)O(1/\delta)-pass algorithm with a strongly sub-linear O~(mnÎŽ)\tilde{O}(mn^{\delta}) space and logarithmic approximation factor. This yields a significant improvement over the earlier algorithm of Demaine et al. [DIMV14] that uses exponentially larger number of passes. We complement this result by showing that the tradeoff between the number of passes and space exhibited by our algorithm is tight, at least when the approximation factor is equal to 11. Specifically, we show that any algorithm that computes set cover exactly using (12ή−1)({1 \over 2\delta}-1) passes must use Ω~(mnÎŽ)\tilde{\Omega}(mn^{\delta}) space in the regime of m=O(n)m=O(n). Furthermore, we consider the problem in the geometric setting where the elements are points in R2\mathbb{R}^2 and sets are either discs, axis-parallel rectangles, or fat triangles in the plane, and show that our algorithm (with a slight modification) uses the optimal O~(n)\tilde{O}(n) space to find a logarithmic approximation in O(1/ÎŽ)O(1/\delta) passes. Finally, we show that any randomized one-pass algorithm that distinguishes between covers of size 2 and 3 must use a linear (i.e., Ω(mn)\Omega(mn)) amount of space. This is the first result showing that a randomized, approximate algorithm cannot achieve a space bound that is sublinear in the input size. This indicates that using multiple passes might be necessary in order to achieve sub-linear space bounds for this problem while guaranteeing small approximation factors.Comment: A preliminary version of this paper is to appear in PODS 201

    On Conceptually Simple Algorithms for Variants of Online Bipartite Matching

    Full text link
    We present a series of results regarding conceptually simple algorithms for bipartite matching in various online and related models. We first consider a deterministic adversarial model. The best approximation ratio possible for a one-pass deterministic online algorithm is 1/21/2, which is achieved by any greedy algorithm. D\"urr et al. recently presented a 22-pass algorithm called Category-Advice that achieves approximation ratio 3/53/5. We extend their algorithm to multiple passes. We prove the exact approximation ratio for the kk-pass Category-Advice algorithm for all k≄1k \ge 1, and show that the approximation ratio converges to the inverse of the golden ratio 2/(1+5)≈0.6182/(1+\sqrt{5}) \approx 0.618 as kk goes to infinity. The convergence is extremely fast --- the 55-pass Category-Advice algorithm is already within 0.01%0.01\% of the inverse of the golden ratio. We then consider a natural greedy algorithm in the online stochastic IID model---MinDegree. This algorithm is an online version of a well-known and extensively studied offline algorithm MinGreedy. We show that MinDegree cannot achieve an approximation ratio better than 1−1/e1-1/e, which is guaranteed by any consistent greedy algorithm in the known IID model. Finally, following the work in Besser and Poloczek, we depart from an adversarial or stochastic ordering and investigate a natural randomized algorithm (MinRanking) in the priority model. Although the priority model allows the algorithm to choose the input ordering in a general but well defined way, this natural algorithm cannot obtain the approximation of the Ranking algorithm in the ROM model
    • 

    corecore