3 research outputs found

    The Communication Complexity of Set Intersection and Multiple Equality Testing

    In this paper we explore fundamental problems in randomized communication complexity such as computing Set Intersection on sets of size $k$ and Equality Testing between vectors of length $k$. Sa\u{g}lam and Tardos and Brody et al. showed that for these types of problems, one can achieve optimal communication volume of $O(k)$ bits, with a randomized protocol that takes $O(\log^* k)$ rounds. Aside from rounds and communication volume, there is a \emph{third} parameter of interest, namely the \emph{error probability} $p_{\mathrm{err}}$. It is straightforward to show that protocols for Set Intersection or Equality Testing need to send $\Omega(k + \log p_{\mathrm{err}}^{-1})$ bits. Is it possible to simultaneously achieve optimality in all three parameters, namely $O(k + \log p_{\mathrm{err}}^{-1})$ communication and $O(\log^* k)$ rounds? In this paper we prove that there is no universally optimal algorithm, and complement the existing round-communication tradeoffs with a new tradeoff between rounds, communication, and probability of error. In particular: 1. Any protocol for solving Multiple Equality Testing in $r$ rounds with failure probability $2^{-E}$ has communication volume $\Omega(Ek^{1/r})$. 2. There exists a protocol for solving Multiple Equality Testing in $r + \log^*(k/E)$ rounds with $O(k + rEk^{1/r})$ communication, thereby essentially matching our lower bound and that of Sa\u{g}lam and Tardos. Our original motivation for considering $p_{\mathrm{err}}$ as an independent parameter came from the problem of enumerating triangles in distributed ($\textsf{CONGEST}$) networks having maximum degree $\Delta$. We prove that this problem can be solved in $O(\Delta/\log n + \log\log \Delta)$ time with high probability $1-1/\operatorname{poly}(n)$. Comment: 44 page
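
To get a feel for the tradeoff stated in the abstract above, the following minimal Python sketch evaluates the two expressions $Ek^{1/r}$ and $k + rEk^{1/r}$ for a few round counts. It is only an illustration: the function names are hypothetical, all hidden constants in the $\Omega(\cdot)$ and $O(\cdot)$ bounds are assumed to be 1, and the upper bound actually applies to a protocol using $r + \log^*(k/E)$ rounds rather than exactly $r$.

```python
import math


def lower_bound_bits(k: int, r: int, E: float) -> float:
    """Evaluate E * k^(1/r), the expression inside the Omega(E k^{1/r}) bound.

    Assumption: the hidden constant is taken to be 1, so this is a shape
    indicator rather than a true bit count.
    """
    return E * k ** (1.0 / r)


def upper_bound_bits(k: int, r: int, E: float) -> float:
    """Evaluate k + r * E * k^(1/r), the expression inside O(k + r E k^{1/r}).

    Assumption: hidden constant 1. The abstract states this bound for a
    protocol with r + log*(k/E) rounds, not exactly r rounds.
    """
    return k + r * E * k ** (1.0 / r)


if __name__ == "__main__":
    k, E = 2**20, 40  # illustrative values: 2^20 instances, error 2^-40
    for r in (1, 2, 4, 10, 20):
        lo = lower_bound_bits(k, r, E)
        hi = upper_bound_bits(k, r, E)
        print(f"r={r:2d}  E*k^(1/r)={lo:14.1f}  k+r*E*k^(1/r)={hi:14.1f}")
```

With these expressions, the term $Ek^{1/r}$ shrinks as $r$ grows, so for a fixed error exponent $E$ the additive $k$ term eventually dominates the upper-bound expression.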

    Efficient Algorithms for Large Scale Network Problems

    In recent years, the growing scale of data has renewed our understanding of what is an efficient algorithm and poses many essential challenges for algorithm designers. This thesis aims to improve our understanding of many algorithmic problems in this context. These include problems in communication complexity, matching theory, and approximate query processing for database systems. We first study the fundamental and well-known question of SetIntersection in communication complexity. We give a result that incorporates the error probability as an independent parameter into the classical trade-off between round complexity and communication complexity. We show that any $r$-round protocol that errs with probability $2^{-E}$ requires $\Omega(Ek^{1/r})$ bits of communication. We also give several almost matching upper bounds. In matching theory, we first study several generalizations of the ordinary matching problem, namely the $f$-matching and $f$-edge cover problems. We also consider the problem of computing a minimum weight perfect matching in a metric space with moderate expansion. We give almost linear time approximation algorithms for all these problems. Finally, we study the sample-based join problem in approximate query processing. We present a result that improves our understanding of the effectiveness and limitations of using sampling to approximate join queries and provides a guideline, from a theory perspective, for practitioners building AQP systems. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/155263/1/hdawei_1.pd