Using Hashing to Solve the Dictionary Problem (In External Memory)

Authors: Iacono John, Pǎtraşcu Mihai
Publication date: 01/01/2011

We consider the dictionary problem in external memory and improve the update time of the well-known buffer tree by roughly a logarithmic factor. For any $\lambda \ge \max\{\lg \lg n, \log_{M/B}(n/B)\}$, we can support updates in time $O(\lambda / B)$ and queries in sublogarithmic time, $O(\log_\lambda n)$. We also present a lower bound in the cell-probe model showing that our data structure is optimal. In the RAM, hash tables have been used to solve the dictionary problem faster than binary search for more than half a century. By contrast, our data structure is the first to beat the comparison barrier in external memory. Ours is also the first data structure to depart convincingly from the indivisibility paradigm.

arXiv.org e-Print Archive

CiteSeerX

DI-fusion

Deterministic and Probabilistic Binary Search in Graphs

Authors: Aslam J. A., Burnashev M. V., Dhagat A., Karp R. M., Laber E. S., Mozes S., Nowak R., Pedrotti A., Ulam S. M.
Publication date: 28/07/2017

We consider the following natural generalization of Binary Search: in a given undirected, positively weighted graph, one vertex is a target. The algorithm's task is to identify the target by adaptively querying vertices. In response to querying a node $q$, the algorithm learns either that $q$ is the target, or is given an edge out of $q$ that lies on a shortest path from $q$ to the target. We study this problem in a general noisy model in which each query independently receives a correct answer with probability $p > \frac{1}{2}$ (a known constant), and an (adversarial) incorrect one with probability $1-p$. Our main positive result is that when $p = 1$ (i.e., all answers are correct), $\log_2 n$ queries are always sufficient. For general $p$, we give an (almost information-theoretically optimal) algorithm that uses, in expectation, no more than $(1 - \delta)\frac{\log_2 n}{1 - H(p)} + o(\log n) + O(\log^2 (1/\delta))$ queries, and identifies the target correctly with probability at least $1-\delta$. Here, $H(p) = -(p \log p + (1-p) \log(1-p))$ denotes the entropy. The first bound is achieved by the algorithm that iteratively queries a 1-median of the nodes not ruled out yet; the second bound by careful repeated invocations of a multiplicative weights algorithm. Even for $p = 1$, we show several hardness results for the problem of determining whether a target can be found using $K$ queries. Our upper bound of $\log_2 n$ implies a quasipolynomial-time algorithm for undirected connected graphs; we show that this is best-possible under the Strong Exponential Time Hypothesis (SETH). Furthermore, for directed graphs, or for undirected graphs with non-uniform node querying costs, the problem is PSPACE-complete. For a semi-adaptive version, in which one may query $r$ nodes each in $k$ rounds, we show membership in $\Sigma_{2k-1}$ in the polynomial hierarchy, and hardness for $\Sigma_{2k-5}$.

arXiv.org e-Print Archive

Crossref
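
The noiseless strategy mentioned in the abstract above (repeatedly query a 1-median of the vertices not yet ruled out) can be sketched as follows. This is a minimal illustration and not the authors' code: the names `dijkstra`, `make_oracle`, and `find_target` are hypothetical, the graph is assumed to be connected and undirected with integer vertex labels and positive integer weights (so that distance equality tests are exact), and the sketch stops without a further query once a single candidate remains.

```python
import heapq


def dijkstra(graph, source):
    """Shortest-path distances from source; graph maps u -> {v: weight}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def make_oracle(graph, target):
    """Noiseless oracle (p = 1): None if q is the target, otherwise a
    neighbour of q that lies on a shortest path from q to the target."""
    to_target = dijkstra(graph, target)  # valid because the graph is undirected

    def oracle(q):
        if q == target:
            return None
        for u, w in sorted(graph[q].items()):
            if w + to_target[u] == to_target[q]:
                return u

    return oracle


def find_target(graph, oracle):
    """Query a 1-median of the remaining candidates until one is left."""
    dist = {u: dijkstra(graph, u) for u in graph}
    candidates = set(graph)
    queries = 0
    while len(candidates) > 1:
        # 1-median: vertex minimising the total distance to the candidates.
        q = min(sorted(graph), key=lambda u: sum(dist[u][v] for v in candidates))
        queries += 1
        step = oracle(q)
        if step is None:
            return q, queries
        # Keep only vertices v for which edge (q, step) starts a shortest q-v path.
        candidates = {
            v for v in candidates
            if v != q and dist[q][v] == graph[q][step] + dist[step][v]
        }
    return next(iter(candidates)), queries


if __name__ == "__main__":
    # Path graph 0 - 1 - ... - 7 with unit weights.
    path = {i: {} for i in range(8)}
    for i in range(7):
        path[i][i + 1] = 1
        path[i + 1][i] = 1
    print(find_target(path, make_oracle(path, 6)))
```

On a path graph this reduces to ordinary binary search, which is the special case the abstract generalizes. The noisy case ($p < 1$) relies on a multiplicative weights scheme and is not covered by this sketch.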

Optimal Joins Using Compact Data Structures

Authors: Navarro Gonzalo, Reutter Juan L., Rojas-Ledesma Javiel
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 23rd International Conference on Database Theory (ICDT 2020)
Publication date: 01/01/2020

Worst-case optimal join algorithms have gained a lot of attention in the database literature. We now have several algorithms that are optimal in the worst case, and many of them have been implemented and validated in practice. However, the implementation of these algorithms often requires an enhanced indexing structure: to achieve optimality we either need to build completely new indexes, or we must populate the database with several instantiations of indexes such as B+-trees. Either way, this means spending an extra amount of storage space that may be non-negligible. We show that optimal algorithms can be obtained directly from a representation that regards the relations as point sets in variable-dimensional grids, without the need for extra storage. Our representation is a compact quadtree for the static indexes, and a dynamic quadtree sharing subtrees (which we dub a qdag) for intermediate results. We develop a compositional algorithm to process full join queries under this representation, and show that the running time of this algorithm is worst-case optimal in data complexity. Remarkably, we can extend our framework to evaluate more expressive queries from relational algebra by introducing a lazy version of qdags (lqdags). Once again, we can show that the running time of our algorithms is worst-case optimal.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server
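
To make the idea of treating relations as point sets in a grid concrete, here is a rough sketch of the simplest case only: two binary relations over the same two attributes, stored as quadtrees and joined by intersecting the trees quadrant by quadrant. It is an illustration under stated assumptions, not the paper's qdag structure: the functions `build`, `intersect`, and `enumerate_points` are hypothetical names, the trees are plain pointer-based tuples rather than a compact encoding, and joins over differing attribute sets (which the abstract handles via variable-dimensional grids) are not covered.

```python
def build(points, x0, y0, size):
    """Quadtree over the size x size square at (x0, y0); size is a power of two.
    None = empty quadrant, True = occupied single cell, tuple = four children."""
    if not points:
        return None
    if size == 1:
        return True
    half = size // 2
    children = []
    for dx in (0, half):
        for dy in (0, half):
            sub = [
                (x, y) for x, y in points
                if x0 + dx <= x < x0 + dx + half and y0 + dy <= y < y0 + dy + half
            ]
            children.append(build(sub, x0 + dx, y0 + dy, half))
    return tuple(children)


def intersect(a, b):
    """Intersect two quadtrees built over the same grid, pruning empty quadrants."""
    if a is None or b is None:
        return None  # nothing in this quadrant can match
    if a is True:
        return True  # both trees reached the same occupied cell
    children = tuple(intersect(ca, cb) for ca, cb in zip(a, b))
    return children if any(c is not None for c in children) else None


def enumerate_points(node, x0, y0, size):
    """List the points stored in a quadtree, in traversal order."""
    if node is None:
        return []
    if node is True:
        return [(x0, y0)]
    half = size // 2
    out = []
    index = 0
    for dx in (0, half):
        for dy in (0, half):
            out += enumerate_points(node[index], x0 + dx, y0 + dy, half)
            index += 1
    return out


if __name__ == "__main__":
    r = build([(0, 1), (2, 3), (5, 5), (7, 0)], 0, 0, 8)
    s = build([(2, 3), (5, 5), (6, 6)], 0, 0, 8)
    print(sorted(enumerate_points(intersect(r, s), 0, 0, 8)))
```

The traversal descends only into quadrants that are non-empty in both inputs, so whole empty regions are skipped without examining individual tuples.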

Separating decision tree complexity from subcube partition complexity

Authors: Kothari Robin, Racicot-Desloges David, Santha Miklos
Publication date: 01/01/2015

The subcube partition model of computation is at least as powerful as decision trees but no separation between these models was known. We show that there exists a function whose deterministic subcube partition complexity is asymptotically smaller than its randomized decision tree complexity, resolving an open problem of Friedgut, Kahn, and Wigderson (2002). Our lower bound is based on the information-theoretic techniques first introduced to lower bound the randomized decision tree complexity of the recursive majority function. We also show that the public-coin partition bound, the best known lower bound method for randomized decision tree complexity subsuming other general techniques such as block sensitivity, approximate degree, randomized certificate complexity, and the classical adversary bound, also lower bounds randomized subcube partition complexity. This shows that all these lower bound techniques cannot prove optimal lower bounds for randomized decision tree complexity, which answers an open question of Jain and Klauck (2010) and Jain, Lee, and Vishnoi (2014).

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server