9,882 research outputs found

    Lower Bounds on Time-Space Trade-Offs for Approximate Near Neighbors

    Get PDF
    We show tight lower bounds for the entire trade-off between space and query time for the Approximate Near Neighbor search problem. Our lower bounds hold in a restricted model of computation, which captures all hashing-based approaches. In articular, our lower bound matches the upper bound recently shown in [Laarhoven 2015] for the random instance on a Euclidean sphere (which we show in fact extends to the entire space Rd\mathbb{R}^d using the techniques from [Andoni, Razenshteyn 2015]). We also show tight, unconditional cell-probe lower bounds for one and two probes, improving upon the best known bounds from [Panigrahy, Talwar, Wieder 2010]. In particular, this is the first space lower bound (for any static data structure) for two probes which is not polynomially smaller than for one probe. To show the result for two probes, we establish and exploit a connection to locally-decodable codes.Comment: 47 pages, 2 figures; v2: substantially revised introduction, lots of small corrections; subsumed by arXiv:1608.03580 [cs.DS] (along with arXiv:1511.07527 [cs.DS]

    Optimal Hashing-based Time-Space Trade-offs for Approximate Near Neighbors

    Full text link
    [See the paper for the full abstract.] We show tight upper and lower bounds for time-space trade-offs for the cc-Approximate Near Neighbor Search problem. For the dd-dimensional Euclidean space and nn-point datasets, we develop a data structure with space n1+ρu+o(1)+O(dn)n^{1 + \rho_u + o(1)} + O(dn) and query time nρq+o(1)+dno(1)n^{\rho_q + o(1)} + d n^{o(1)} for every ρu,ρq0\rho_u, \rho_q \geq 0 such that: \begin{equation} c^2 \sqrt{\rho_q} + (c^2 - 1) \sqrt{\rho_u} = \sqrt{2c^2 - 1}. \end{equation} This is the first data structure that achieves sublinear query time and near-linear space for every approximation factor c>1c > 1, improving upon [Kapralov, PODS 2015]. The data structure is a culmination of a long line of work on the problem for all space regimes; it builds on Spherical Locality-Sensitive Filtering [Becker, Ducas, Gama, Laarhoven, SODA 2016] and data-dependent hashing [Andoni, Indyk, Nguyen, Razenshteyn, SODA 2014] [Andoni, Razenshteyn, STOC 2015]. Our matching lower bounds are of two types: conditional and unconditional. First, we prove tightness of the whole above trade-off in a restricted model of computation, which captures all known hashing-based approaches. We then show unconditional cell-probe lower bounds for one and two probes that match the above trade-off for ρq=0\rho_q = 0, improving upon the best known lower bounds from [Panigrahy, Talwar, Wieder, FOCS 2010]. In particular, this is the first space lower bound (for any static data structure) for two probes which is not polynomially smaller than the one-probe bound. To show the result for two probes, we establish and exploit a connection to locally-decodable codes.Comment: 62 pages, 5 figures; a merger of arXiv:1511.07527 [cs.DS] and arXiv:1605.02701 [cs.DS], which subsumes both of the preprints. New version contains more elaborated proofs and fixed some typo

    Finding the Median (Obliviously) with Bounded Space

    Full text link
    We prove that any oblivious algorithm using space SS to find the median of a list of nn integers from {1,...,2n}\{1,...,2n\} requires time Ω(nloglogSn)\Omega(n \log\log_S n). This bound also applies to the problem of determining whether the median is odd or even. It is nearly optimal since Chan, following Munro and Raman, has shown that there is a (randomized) selection algorithm using only ss registers, each of which can store an input value or O(logn)O(\log n)-bit counter, that makes only O(loglogsn)O(\log\log_s n) passes over the input. The bound also implies a size lower bound for read-once branching programs computing the low order bit of the median and implies the analog of PNPcoNPP \ne NP \cap coNP for length o(nloglogn)o(n \log\log n) oblivious branching programs

    Convex Hull of Points Lying on Lines in o(n log n) Time after Preprocessing

    Full text link
    Motivated by the desire to cope with data imprecision, we study methods for taking advantage of preliminary information about point sets in order to speed up the computation of certain structures associated with them. In particular, we study the following problem: given a set L of n lines in the plane, we wish to preprocess L such that later, upon receiving a set P of n points, each of which lies on a distinct line of L, we can construct the convex hull of P efficiently. We show that in quadratic time and space it is possible to construct a data structure on L that enables us to compute the convex hull of any such point set P in O(n alpha(n) log* n) expected time. If we further assume that the points are "oblivious" with respect to the data structure, the running time improves to O(n alpha(n)). The analysis applies almost verbatim when L is a set of line-segments, and yields similar asymptotic bounds. We present several extensions, including a trade-off between space and query time and an output-sensitive algorithm. We also study the "dual problem" where we show how to efficiently compute the (<= k)-level of n lines in the plane, each of which lies on a distinct point (given in advance). We complement our results by Omega(n log n) lower bounds under the algebraic computation tree model for several related problems, including sorting a set of points (according to, say, their x-order), each of which lies on a given line known in advance. Therefore, the convex hull problem under our setting is easier than sorting, contrary to the "standard" convex hull and sorting problems, in which the two problems require Theta(n log n) steps in the worst case (under the algebraic computation tree model).Comment: 26 pages, 5 figures, 1 appendix; a preliminary version appeared at SoCG 201

    Data Structure Lower Bounds for Document Indexing Problems

    Get PDF
    We study data structure problems related to document indexing and pattern matching queries and our main contribution is to show that the pointer machine model of computation can be extremely useful in proving high and unconditional lower bounds that cannot be obtained in any other known model of computation with the current techniques. Often our lower bounds match the known space-query time trade-off curve and in fact for all the problems considered, there is a very good and reasonable match between the our lower bounds and the known upper bounds, at least for some choice of input parameters. The problems that we consider are set intersection queries (both the reporting variant and the semi-group counting variant), indexing a set of documents for two-pattern queries, or forbidden- pattern queries, or queries with wild-cards, and indexing an input set of gapped-patterns (or two-patterns) to find those matching a document given at the query time.Comment: Full version of the conference version that appeared at ICALP 2016, 25 page

    Deterministic Time-Space Tradeoffs for k-SUM

    Get PDF
    Given a set of numbers, the kk-SUM problem asks for a subset of kk numbers that sums to zero. When the numbers are integers, the time and space complexity of kk-SUM is generally studied in the word-RAM model; when the numbers are reals, the complexity is studied in the real-RAM model, and space is measured by the number of reals held in memory at any point. We present a time and space efficient deterministic self-reduction for the kk-SUM problem which holds for both models, and has many interesting consequences. To illustrate: * 33-SUM is in deterministic time O(n2lglg(n)/lg(n))O(n^2 \lg\lg(n)/\lg(n)) and space O(nlg(n)lglg(n))O\left(\sqrt{\frac{n \lg(n)}{\lg\lg(n)}}\right). In general, any polylogarithmic-time improvement over quadratic time for 33-SUM can be converted into an algorithm with an identical time improvement but low space complexity as well. * 33-SUM is in deterministic time O(n2)O(n^2) and space O(n)O(\sqrt n), derandomizing an algorithm of Wang. * A popular conjecture states that 3-SUM requires n2o(1)n^{2-o(1)} time on the word-RAM. We show that the 3-SUM Conjecture is in fact equivalent to the (seemingly weaker) conjecture that every O(n.51)O(n^{.51})-space algorithm for 33-SUM requires at least n2o(1)n^{2-o(1)} time on the word-RAM. * For k4k \ge 4, kk-SUM is in deterministic O(nk2+2/k)O(n^{k - 2 + 2/k}) time and O(n)O(\sqrt{n}) space

    Element Distinctness, Frequency Moments, and Sliding Windows

    Full text link
    We derive new time-space tradeoff lower bounds and algorithms for exactly computing statistics of input data, including frequency moments, element distinctness, and order statistics, that are simple to calculate for sorted data. We develop a randomized algorithm for the element distinctness problem whose time T and space S satisfy T in O (n^{3/2}/S^{1/2}), smaller than previous lower bounds for comparison-based algorithms, showing that element distinctness is strictly easier than sorting for randomized branching programs. This algorithm is based on a new time and space efficient algorithm for finding all collisions of a function f from a finite set to itself that are reachable by iterating f from a given set of starting points. We further show that our element distinctness algorithm can be extended at only a polylogarithmic factor cost to solve the element distinctness problem over sliding windows, where the task is to take an input of length 2n-1 and produce an output for each window of length n, giving n outputs in total. In contrast, we show a time-space tradeoff lower bound of T in Omega(n^2/S) for randomized branching programs to compute the number of distinct elements over sliding windows. The same lower bound holds for computing the low-order bit of F_0 and computing any frequency moment F_k, k neq 1. This shows that those frequency moments and the decision problem F_0 mod 2 are strictly harder than element distinctness. We complement this lower bound with a T in O(n^2/S) comparison-based deterministic RAM algorithm for exactly computing F_k over sliding windows, nearly matching both our lower bound for the sliding-window version and the comparison-based lower bounds for the single-window version. We further exhibit a quantum algorithm for F_0 over sliding windows with T in O(n^{3/2}/S^{1/2}). Finally, we consider the computations of order statistics over sliding windows.Comment: arXiv admin note: substantial text overlap with arXiv:1212.437
    corecore