    Towards Tight Lower Bounds for Range Reporting on the RAM

    In the orthogonal range reporting problem, we are to preprocess a set of $n$ points with integer coordinates on a $U \times U$ grid. The goal is to support reporting all $k$ points inside an axis-aligned query rectangle. This is one of the most fundamental data structure problems in databases and computational geometry. Despite the importance of the problem, its complexity remains unresolved in the word-RAM. On the upper bound side, three best tradeoffs exist: (1) Query time $O(\lg \lg n + k)$ with $O(n \lg^{\varepsilon} n)$ words of space for any constant $\varepsilon > 0$. (2) Query time $O((1 + k) \lg \lg n)$ with $O(n \lg \lg n)$ words of space. (3) Query time $O((1 + k) \lg^{\varepsilon} n)$ with optimal $O(n)$ words of space. However, the only known query time lower bound is $\Omega(\log \log n + k)$, even for linear space data structures. All three current best upper bound tradeoffs are derived by reducing range reporting to a ball-inheritance problem. Ball-inheritance is a problem that essentially encapsulates all previous attempts at solving range reporting in the word-RAM. In this paper we make progress towards closing the gap between the upper and lower bounds for range reporting by proving cell probe lower bounds for ball-inheritance. Our lower bounds are tight for a large range of parameters, excluding any further progress for range reporting using the ball-inheritance reduction.
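
    To make the problem statement concrete, here is a minimal baseline sketch in Python. It is not one of the ball-inheritance structures discussed above: the class name `SortedByX` and its interface are hypothetical, and the query cost is proportional to the number of points in the x-slab, which can be far larger than the number of reported points.

```python
from bisect import bisect_left, bisect_right


class SortedByX:
    """Baseline range reporting: sort by x, binary search the slab, filter by y.

    Hypothetical illustration only; uses linear space but its query time
    depends on the slab size, not just on the output size.
    """

    def __init__(self, points):
        # Preprocessing: keep the points sorted by x (ties broken by y).
        self.points = sorted(points)
        self.xs = [x for x, _ in self.points]

    def report(self, x1, x2, y1, y2):
        """Return all points (x, y) with x1 <= x <= x2 and y1 <= y <= y2."""
        lo = bisect_left(self.xs, x1)   # first index with x >= x1
        hi = bisect_right(self.xs, x2)  # first index with x > x2
        return [(x, y) for x, y in self.points[lo:hi] if y1 <= y <= y2]
```

    A query such as `SortedByX(points).report(2, 4, 1, 7)` returns the points of the rectangle $[2,4] \times [1,7]$ in x-sorted order.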

    Optimality of the Johnson-Lindenstrauss Lemma

    For any integers $d, n \geq 2$ and $1/({\min\{n,d\}})^{0.4999} < \varepsilon < 1$, we show the existence of a set of $n$ vectors $X \subset \mathbb{R}^d$ such that any embedding $f: X \rightarrow \mathbb{R}^m$ satisfying $\forall x,y \in X,\ (1-\varepsilon)\|x-y\|_2^2 \le \|f(x)-f(y)\|_2^2 \le (1+\varepsilon)\|x-y\|_2^2$ must have $m = \Omega(\varepsilon^{-2} \lg n)$. This lower bound matches the upper bound given by the Johnson-Lindenstrauss lemma [JL84]. Furthermore, our lower bound holds for nearly the full range of $\varepsilon$ of interest, since there is always an isometric embedding into dimension $\min\{d, n\}$ (either the identity map, or projection onto $\mathop{span}(X)$). Previously such a lower bound was only known to hold against linear maps $f$, and not for such a wide range of parameters $\varepsilon, n, d$ [LN16]. The best previously known lower bound for general $f$ was $m = \Omega(\varepsilon^{-2} \lg n / \lg(1/\varepsilon))$ [Wel74, Lev83, Alo03], which is suboptimal for any $\varepsilon = o(1)$.
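
    The distortion condition above can be checked numerically. The following Python sketch uses a Gaussian random projection, a standard way to realize the Johnson-Lindenstrauss upper bound; it is not the hard point set or the lower bound argument of the paper, and all function names are hypothetical.

```python
import math
import random


def random_projection(d, m, rng):
    """Return an m x d matrix with i.i.d. Gaussian entries of variance 1/m."""
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(m)]


def embed(matrix, x):
    """Apply the linear map given by `matrix` to the vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


def sq_dist(x, y):
    """Squared Euclidean distance between x and y."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def max_distortion(points, matrix):
    """Smallest eps such that every pair of distinct points satisfies
    (1 - eps) |x - y|^2 <= |f(x) - f(y)|^2 <= (1 + eps) |x - y|^2."""
    images = [embed(matrix, p) for p in points]
    worst = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            ratio = sq_dist(images[i], images[j]) / sq_dist(points[i], points[j])
            worst = max(worst, abs(ratio - 1.0))
    return worst
```

    Calling `max_distortion(points, random_projection(d, m, random.Random(0)))` for growing `m` gives an empirical feel for how the achievable $\varepsilon$ shrinks with the target dimension.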

    New Unconditional Hardness Results for Dynamic and Online Problems

    There has been a resurgence of interest in lower bounds whose truth rests on the conjectured hardness of well-known computational problems. These conditional lower bounds have become important and popular due to the painfully slow progress on proving strong unconditional lower bounds. Nevertheless, the long-term goal is to replace these conditional bounds with unconditional ones. In this paper we make progress in this direction by studying the cell probe complexity of two problems of particular importance that are conjectured to be hard: matrix-vector multiplication and a version of dynamic set disjointness known as Patrascu's Multiphase Problem. We give improved unconditional lower bounds for these problems and introduce new proof techniques of independent interest. These include a technique capable of proving strong threshold lower bounds of the following form: if we insist on having a very fast query time, then the update time has to be slow enough to compute a lookup table with the answer to every possible query. This is the first time a lower bound of this type has been proven.

    Optimal learning of joint alignments with a faulty oracle

    We consider the following problem, which is useful in applications such as joint image and shape alignment. The goal is to recover $n$ discrete variables $g_i \in \{0, \ldots, k-1\}$ (up to some global offset) given noisy observations of a set of their pairwise differences $\{(g_i - g_j) \bmod k\}$; specifically, with probability $\frac{1}{k} + \delta$ for some $\delta > 0$ one obtains the correct answer, and with the remaining probability one obtains a uniformly random incorrect answer. We consider a learning-based formulation where one can perform a query to observe a pairwise difference, and the goal is to perform as few queries as possible while obtaining the exact joint alignment. We provide an easy-to-implement, time-efficient algorithm that performs $O\big(\frac{n \lg n}{k \delta^2}\big)$ queries and recovers the joint alignment with high probability. We also show that our algorithm is optimal by proving a general lower bound that holds for all non-adaptive algorithms. Our work significantly improves on recent work by Chen and Candès [CC16], who view the problem as a constrained principal components analysis problem that can be solved using the power method. Specifically, our approach is simpler both in the algorithm and the analysis, and provides additional insights into the problem structure.
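
    The noise model can be simulated directly. The Python sketch below is not the algorithm of the paper: it is a naive plurality vote over repeated queries against a fixed reference variable, and it assumes that repeated queries to the same pair return independent answers, which the abstract does not state. The names `noisy_difference`, `align_by_plurality`, and `delta` (the bias above $1/k$) are hypothetical.

```python
import random
from collections import Counter


def noisy_difference(g, i, j, k, delta, rng):
    """Observe (g[i] - g[j]) mod k: correct with probability 1/k + delta,
    otherwise a uniformly random incorrect value."""
    truth = (g[i] - g[j]) % k
    if rng.random() < 1.0 / k + delta:
        return truth
    # Draw uniformly from the k - 1 values different from the truth.
    wrong = rng.randrange(k - 1)
    return wrong if wrong < truth else wrong + 1


def align_by_plurality(g, k, delta, repeats, rng):
    """Estimate every g[i] relative to g[0] by a plurality vote.

    Assumption (not from the source): each repeated query is an
    independent draw from the noise model.
    """
    estimate = [0] * len(g)
    for i in range(1, len(g)):
        votes = Counter(
            noisy_difference(g, i, 0, k, delta, rng) for _ in range(repeats)
        )
        estimate[i] = votes.most_common(1)[0][0]
    return estimate
```

    The returned list is an alignment up to the global offset `g[0]`; this naive scheme uses `(n - 1) * repeats` queries and makes no claim of matching the query bound stated above.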

    Near-optimal labeling schemes for nearest common ancestors

    We consider NCA labeling schemes: given a rooted tree $T$, label the nodes of $T$ with binary strings such that, given the labels of any two nodes, one can determine, by looking only at the labels, the label of their nearest common ancestor. For trees with $n$ nodes we present upper and lower bounds establishing that labels of size $(2 \pm \epsilon)\log n$, $\epsilon < 1$, are both sufficient and necessary. (All logarithms in this paper are in base 2.) Alstrup, Bille, and Rauhe (SIDMA'05) showed that ancestor and NCA labeling schemes have labels of size $\log n + \Omega(\log \log n)$. Our lower bound increases this to $\log n + \Omega(\log n)$ for NCA labeling schemes. Since Fraigniaud and Korman (STOC'10) established that labels in ancestor labeling schemes have size $\log n + \Theta(\log \log n)$, our new lower bound separates ancestor and NCA labeling schemes. Our upper bound improves the $10 \log n$ upper bound by Alstrup, Gavoille, Kaplan and Rauhe (TOCS'04), and our theoretical result even outperforms some recent experimental studies by Fischer (ESA'09) where variants of the same NCA labeling scheme are shown to all have labels of size approximately $8 \log n$.
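
    As an illustration of what an NCA labeling scheme must provide, the Python sketch below labels each node with its root-to-node path of child indices, so the label of the nearest common ancestor is the longest common prefix of the two labels. This is a naive scheme with potentially very long labels, not the scheme of the paper; the function names and the dictionary-based tree encoding are hypothetical.

```python
def label_tree(children, root):
    """Label each node with the tuple of child indices on its root path.

    `children` maps a node to the ordered list of its children.
    """
    labels = {root: ()}
    stack = [root]
    while stack:
        node = stack.pop()
        for index, child in enumerate(children.get(node, [])):
            labels[child] = labels[node] + (index,)
            stack.append(child)
    return labels


def nca_label(label_u, label_v):
    """Compute the NCA's label from the two labels alone:
    it is their longest common prefix."""
    prefix = []
    for a, b in zip(label_u, label_v):
        if a != b:
            break
        prefix.append(a)
    return tuple(prefix)
```

    Note that `nca_label` never touches the tree, which is the defining requirement; the contribution of the abstract above concerns how short such labels can be made.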

    Lower Bounds for Oblivious Near-Neighbor Search

    We prove an $\Omega(d \lg n / (\lg\lg n)^2)$ lower bound on the dynamic cell-probe complexity of statistically $\mathit{oblivious}$ approximate-near-neighbor search ($\mathsf{ANN}$) over the $d$-dimensional Hamming cube. For the natural setting of $d = \Theta(\log n)$, our result implies an $\tilde{\Omega}(\lg^2 n)$ lower bound, which is a quadratic improvement over the highest (non-oblivious) cell-probe lower bound for $\mathsf{ANN}$. This is the first super-logarithmic $\mathit{unconditional}$ lower bound for $\mathsf{ANN}$ against general (non black-box) data structures. We also show that any oblivious $\mathit{static}$ data structure for decomposable search problems (like $\mathsf{ANN}$) can be obliviously dynamized with $O(\log n)$ overhead in update and query time, strengthening a classic result of Bentley and Saxe (Algorithmica, 1980).
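
    For context, the classic (non-oblivious) logarithmic method of Bentley and Saxe can be sketched as follows in Python. This is not the oblivious construction of the paper: it only shows how a static structure for a decomposable problem is turned into an insertion-only dynamic one. The brute-force `StaticNearest` structure, the encoding of Hamming cube points as integers, and all names are hypothetical.

```python
def hamming(p, q):
    """Hamming distance between two points encoded as non-negative ints."""
    return bin(p ^ q).count("1")


class StaticNearest:
    """Stand-in static structure: brute-force nearest neighbor."""

    def __init__(self, points):
        self.points = tuple(points)

    def query(self, q):
        return min(self.points, key=lambda p: hamming(p, q))


class LogarithmicMethod:
    """Insertion-only dynamization: level i is empty or holds a static
    structure on exactly 2**i points, like the bits of a binary counter."""

    def __init__(self):
        self.levels = []

    def insert(self, point):
        carry = [point]
        i = 0
        while True:
            if i == len(self.levels):
                self.levels.append(None)
            if self.levels[i] is None:
                # Free slot: rebuild one static structure from the carry.
                self.levels[i] = StaticNearest(carry)
                return
            # Occupied slot: absorb its points and carry to the next level.
            carry.extend(self.levels[i].points)
            self.levels[i] = None
            i += 1

    def query(self, q):
        # Decomposability: combine the answers of the per-level structures.
        answers = [s.query(q) for s in self.levels if s is not None]
        return min(answers, key=lambda p: hamming(p, q)) if answers else None
```

    Each point takes part in at most a logarithmic number of rebuilds and each query consults at most a logarithmic number of static structures, which is the source of the logarithmic overhead in the classic result.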