99 research outputs found

### Towards Tight Lower Bounds for Range Reporting on the RAM

In the orthogonal range reporting problem, we are to preprocess a set of $n$
points with integer coordinates on a $U \times U$ grid. The goal is to support
reporting all $k$ points inside an axis-aligned query rectangle. This is one of
the most fundamental data structure problems in databases and computational
geometry. Despite the importance of the problem, its complexity remains
unresolved in the word-RAM. On the upper bound side, three best tradeoffs
exist: (1.) Query time $O(\lg \lg n + k)$ with $O(n \lg^{\varepsilon} n)$ words
of space for any constant $\varepsilon>0$. (2.) Query time $O((1 + k) \lg \lg
n)$ with $O(n \lg \lg n)$ words of space. (3.) Query time
$O((1+k)\lg^{\varepsilon} n)$ with optimal $O(n)$ words of space. However, the
only known query time lower bound is $\Omega(\log \log n +k)$, even for linear
space data structures.
All three current best upper bound tradeoffs are derived by reducing range
reporting to a ball-inheritance problem. Ball-inheritance is a problem that
essentially encapsulates all previous attempts at solving range reporting in
the word-RAM. In this paper we make progress towards closing the gap between
the upper and lower bounds for range reporting by proving cell probe lower
bounds for ball-inheritance. Our lower bounds are tight for a large range of
parameters, excluding any further progress for range reporting using the
ball-inheritance reduction.

### Optimality of the Johnson-Lindenstrauss Lemma

For any integers $d, n \geq 2$ and $1/({\min\{n,d\}})^{0.4999} <
\varepsilon<1$, we show the existence of a set of $n$ vectors $X\subset
\mathbb{R}^d$ such that any embedding $f:X\rightarrow \mathbb{R}^m$ satisfying
$\forall x,y\in X,\ (1-\varepsilon)\|x-y\|_2^2\le \|f(x)-f(y)\|_2^2 \le
(1+\varepsilon)\|x-y\|_2^2$ must have $m = \Omega(\varepsilon^{-2} \lg n).$ This lower bound matches the upper bound given by the Johnson-Lindenstrauss
lemma [JL84]. Furthermore, our lower bound holds for nearly the full range of
$\varepsilon$ of interest, since there is always an isometric embedding into
dimension $\min\{d, n\}$ (either the identity map, or projection onto
$\mathop{span}(X)$).
Previously such a lower bound was only known to hold against linear maps $f$,
and not for such a wide range of parameters $\varepsilon, n, d$ [LN16]. The
best previously known lower bound for general $f$ was $m =
\Omega(\varepsilon^{-2}\lg n/\lg(1/\varepsilon))$ [Wel74, Lev83, Alo03], which
is suboptimal for any $\varepsilon = o(1)$.

Comment: v2: simplified proof, also added reference to Lev8
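The upper bound that this lower bound matches is usually realized with a random linear map. The following is a minimal sketch of that idea, assuming a dense Gaussian projection scaled by $1/\sqrt{m}$; the function names `random_projection`, `sq_dist`, and `max_distortion` are hypothetical and chosen only for illustration. It measures the worst relative error in squared distances, which is the quantity $\varepsilon$ bounds in the statement above.

```python
import math
import random


def random_projection(points, m, seed=0):
    """Map each d-dimensional point to m dimensions using a Gaussian matrix.

    Entries are N(0, 1) scaled by 1/sqrt(m), so squared distances are
    preserved in expectation. This is an illustrative choice of map.
    """
    rng = random.Random(seed)
    d = len(points[0])
    matrix = [[rng.gauss(0.0, 1.0) / math.sqrt(m) for _ in range(d)]
              for _ in range(m)]
    return [[sum(row[j] * p[j] for j in range(d)) for row in matrix]
            for p in points]


def sq_dist(x, y):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def max_distortion(points, images):
    """Largest |ratio - 1| over all pairs, where ratio compares squared
    distances after and before the embedding. Assumes distinct points."""
    worst = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            ratio = sq_dist(images[i], images[j]) / sq_dist(points[i], points[j])
            worst = max(worst, abs(ratio - 1.0))
    return worst


if __name__ == "__main__":
    rng = random.Random(1)
    pts = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(20)]
    for m in (25, 100, 400):
        print(m, max_distortion(pts, random_projection(pts, m, seed=2)))
```

Running the script for growing `m` gives an informal view of the tradeoff between target dimension and distortion; it illustrates the upper bound only and says nothing about the lower-bound proof.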

### Towards Tight Lower Bounds for Range Reporting on the RAM

In the orthogonal range reporting problem, we are to preprocess a set of n points with integer coordinates on a UxU grid. The goal is to support reporting all k points inside an axis-aligned query rectangle. This is one of the most fundamental data structure problems in databases and computational geometry. Despite the importance of the problem, its complexity remains unresolved in the word-RAM.
On the upper bound side, three best tradeoffs exist, all derived by reducing range reporting to a ball-inheritance problem. Ball-inheritance is a problem that essentially encapsulates all previous attempts at solving range reporting in the word-RAM. In this paper we make progress towards closing the gap between the upper and lower bounds for range reporting by proving cell probe lower bounds for ball-inheritance. Our lower bounds are tight for a large range of parameters, excluding any further progress for range reporting using the ball-inheritance reduction.

### New Unconditional Hardness Results for Dynamic and Online Problems

There has been a resurgence of interest in lower bounds whose truth rests on
the conjectured hardness of well known computational problems. These
conditional lower bounds have become important and popular due to the painfully
slow progress on proving strong unconditional lower bounds. Nevertheless, the
long term goal is to replace these conditional bounds with unconditional ones.
In this paper we make progress in this direction by studying the cell probe
complexity of two conjectured-hard problems of particular importance:
matrix-vector multiplication and a version of dynamic set disjointness known as
Patrascu's Multiphase Problem. We give improved unconditional lower bounds for
these problems and introduce new proof techniques of independent
interest. These include a technique capable of proving strong threshold lower
bounds of the following form: If we insist on having a very fast query time,
then the update time has to be slow enough to compute a lookup table with the
answer to every possible query. This is the first time a lower bound of this
type has been proven.

### Optimal learning of joint alignments with a faulty oracle

We consider the following problem, which is useful in applications such as joint image and
shape alignment. The goal is to recover $n$ discrete variables $g_i \in \{0, \ldots, k-1\}$ (up to some
global offset) given noisy observations of a set of their pairwise differences $\{(g_i - g_j) \bmod k\}$;
specifically, with probability $\frac{1}{k} + \delta$ for some $\delta > 0$ one obtains the correct answer, and with
the remaining probability one obtains a uniformly random incorrect answer. We consider a
learning-based formulation where one can perform a query to observe a pairwise difference, and
the goal is to perform as few queries as possible while obtaining the exact joint alignment.
We provide an easy-to-implement, time-efficient algorithm that performs $O\big(\frac{n \lg n}{k \delta^2}\big)$ queries and
recovers the joint alignment with high probability. We also show that our algorithm is optimal
by proving a general lower bound that holds for all non-adaptive algorithms. Our work significantly
improves on recent work by Chen and Candès [CC16], who view the problem as a constrained
principal components analysis problem that can be solved using the power method. Specifically,
our approach is simpler both in the algorithm and the analysis, and provides additional insights
into the problem structure.

First author draf
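The query model described above can be simulated in a few lines. The sketch below is a hedged illustration only: `make_oracle` implements the faulty oracle as the abstract states it, while `recover_by_plurality` is a naive repeat-and-vote baseline added here for illustration, not the algorithm of the paper, and it makes no attempt to match the stated query bound. All function names and the symbol `delta` for the advantage over $1/k$ are assumptions.

```python
import random
from collections import Counter


def make_oracle(g, k, delta, seed=0):
    """Return query(i, j): the true (g[i] - g[j]) mod k with probability
    1/k + delta, otherwise a uniformly random incorrect value. Needs k >= 2."""
    rng = random.Random(seed)

    def query(i, j):
        truth = (g[i] - g[j]) % k
        if rng.random() < 1.0 / k + delta:
            return truth
        # Draw from the k - 1 values different from the truth.
        wrong = rng.randrange(k - 1)
        return wrong if wrong < truth else wrong + 1

    return query


def recover_by_plurality(n, k, query, repeats):
    """Naive baseline (not the paper's method): fix g_0 = 0 and estimate
    every other g_i by the most frequent answer to query(i, 0)."""
    estimate = [0] * n
    for i in range(1, n):
        votes = Counter(query(i, 0) for _ in range(repeats))
        estimate[i] = votes.most_common(1)[0][0]
    return estimate


def same_up_to_offset(a, b, k):
    """True if a and b differ by one global additive offset modulo k."""
    offset = (a[0] - b[0]) % k
    return all((x - y) % k == offset for x, y in zip(a, b))


if __name__ == "__main__":
    k, truth = 3, [2, 0, 1, 1, 2, 0]
    oracle = make_oracle(truth, k, delta=0.5, seed=7)
    guess = recover_by_plurality(len(truth), k, oracle, repeats=101)
    print(guess, same_up_to_offset(guess, truth, k))
```

The baseline spends `repeats` queries per variable against a single reference node, which is wasteful; it is meant only to make the noise model and the "up to a global offset" success criterion concrete.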

### Near-optimal labeling schemes for nearest common ancestors

We consider NCA labeling schemes: given a rooted tree $T$, label the nodes of
$T$ with binary strings such that, given the labels of any two nodes, one can
determine, by looking only at the labels, the label of their nearest common
ancestor.
For trees with $n$ nodes we present upper and lower bounds establishing that
labels of size $(2\pm \epsilon)\log n$, $\epsilon<1$ are both sufficient and
necessary. (All logarithms in this paper are in base 2.)
Alstrup, Bille, and Rauhe (SIDMA'05) showed that ancestor and NCA labeling
schemes have labels of size $\log n +\Omega(\log \log n)$. Our lower bound
increases this to $\log n + \Omega(\log n)$ for NCA labeling schemes. Since
Fraigniaud and Korman (STOC'10) established that labels in ancestor labeling
schemes have size $\log n +\Theta(\log \log n)$, our new lower bound separates
ancestor and NCA labeling schemes. Our upper bound improves the $10 \log n$
upper bound by Alstrup, Gavoille, Kaplan and Rauhe (TOCS'04), and our
theoretical result even outperforms some recent experimental studies by Fischer
(ESA'09) where variants of the same NCA labeling scheme are shown to all have
labels of size approximately $8 \log n$.

### Lower Bounds for Oblivious Near-Neighbor Search

We prove an $\Omega(d \lg n/ (\lg\lg n)^2)$ lower bound on the dynamic
cell-probe complexity of statistically $\mathit{oblivious}$
approximate-near-neighbor search ($\mathsf{ANN}$) over the $d$-dimensional
Hamming cube. For the natural setting of $d = \Theta(\log n)$, our result
implies an $\tilde{\Omega}(\lg^2 n)$ lower bound, which is a quadratic
improvement over the highest (non-oblivious) cell-probe lower bound for
$\mathsf{ANN}$. This is the first super-logarithmic $\mathit{unconditional}$
lower bound for $\mathsf{ANN}$ against general (non black-box) data structures.
We also show that any oblivious $\mathit{static}$ data structure for
decomposable search problems (like $\mathsf{ANN}$) can be obliviously dynamized
with $O(\log n)$ overhead in update and query time, strengthening a classic
result of Bentley and Saxe (Algorithmica, 1980).

Comment: 28 page
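For readers unfamiliar with the classic static-to-dynamic transformation that this result strengthens, here is a minimal sketch of the standard (non-oblivious) logarithmic method for insertions. It assumes the simplest decomposable search problem, membership, with a sorted list standing in for the static structure; the class name `LogarithmicMethod` and its methods are hypothetical, and nothing here reflects the oblivious construction of the paper.

```python
import bisect


class LogarithmicMethod:
    """Insert-only dynamization: level i is empty or holds a static
    structure (here a sorted list) built on exactly 2**i keys."""

    def __init__(self):
        self.levels = []

    def insert(self, key):
        # Works like binary increment: merge full levels upward until an
        # empty level is found, then rebuild one static structure there.
        carry = [key]
        i = 0
        while True:
            if i == len(self.levels):
                self.levels.append(None)
            if self.levels[i] is None:
                self.levels[i] = sorted(carry)
                return
            carry.extend(self.levels[i])
            self.levels[i] = None
            i += 1

    def contains(self, key):
        # Decomposability: the answer over the whole set is the OR of the
        # answers over the O(log n) static blocks.
        for block in self.levels:
            if block is not None:
                pos = bisect.bisect_left(block, key)
                if pos < len(block) and block[pos] == key:
                    return True
        return False

    def level_sizes(self):
        """Sizes of the levels, with None for empty ones."""
        return [None if b is None else len(b) for b in self.levels]


if __name__ == "__main__":
    ds = LogarithmicMethod()
    for x in [5, 3, 9, 1, 7]:
        ds.insert(x)
    print(ds.level_sizes(), ds.contains(9), ds.contains(4))
```

A query touches at most one structure per level, so it pays a logarithmic factor over the static query time, and each key takes part in at most a logarithmic number of rebuilds.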

- …