Search CORE

2,442 research outputs found

Marked Ancestor Problems (Preliminary Version)

Author: Alstrup Stephen
Husfeldt Thore
Rauhe Theis
Publication venue: 'Aarhus University Library'
Publication date: 07/01/1998

Consider a rooted tree whose nodes can be marked or unmarked. Given a node, we want to find its nearest marked ancestor. This generalises the well-known predecessor problem, where the tree is a path. We show tight upper and lower bounds for this problem. The lower bounds are proved in the cell probe model; the upper bounds run on a unit-cost RAM. As easy corollaries we prove (often optimal) lower bounds on a number of problems. These include planar range searching, including the existential or emptiness problem, priority search trees, static tree union-find, and several problems from dynamic computational geometry, including intersection problems, proximity problems, and ray shooting. Our upper bounds improve a number of algorithms from various fields, including dynamic dictionary matching and coloured ancestor problems.
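To make the query concrete, here is a naive baseline sketch in Python. It is not the data structure from the paper: it only stores parent pointers and walks towards the root, so a query costs time proportional to the depth of the node. The class name `MarkedTree` and its method names are hypothetical, and treating a marked node as its own nearest marked ancestor is an assumption of this sketch.

```python
class MarkedTree:
    """Naive rooted tree with marks; queries walk parent pointers (illustrative only)."""

    def __init__(self):
        self.parent = {}     # node -> parent (the root maps to None)
        self.marked = set()  # currently marked nodes

    def add_node(self, node, parent=None):
        self.parent[node] = parent

    def mark(self, node):
        self.marked.add(node)

    def unmark(self, node):
        self.marked.discard(node)

    def nearest_marked_ancestor(self, node):
        # Assumption: a node counts as an ancestor of itself.
        while node is not None:
            if node in self.marked:
                return node
            node = self.parent[node]
        return None  # no marked node on the path to the root


# Path root -> a -> b -> c, with root and a marked.
tree = MarkedTree()
tree.add_node("root")
tree.add_node("a", "root")
tree.add_node("b", "a")
tree.add_node("c", "b")
tree.mark("root")
tree.mark("a")
```

When the tree is a single path, as here, the query reduces to a predecessor search along that path, which is the special case the abstract mentions.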

Tidsskrift.dk (Det Kongelige Bibliotek)

Dynamic Integer Sets with Optimal Rank, Select, and Predecessor Search

Author: Patrascu Mihai
Thorup Mikkel
Publication date: 01/01/2014

We present a data structure representing a dynamic set S of w-bit integers on a w-bit word RAM. With |S|=n and w > log n and space O(n), we support the following standard operations in O(log n / log w) time:
- insert(x) sets S = S + {x}.
- delete(x) sets S = S - {x}.
- predecessor(x) returns max{y in S | y < x}.
- successor(x) returns min{y in S | y >= x}.
- rank(x) returns #{y in S | y < x}.
- select(i) returns y in S with rank(y)=i, if any.
Our O(log n/log w) bound is optimal for dynamic rank and select, matching a lower bound of Fredman and Saks [STOC'89]. When the word length is large, our time bound is also optimal for dynamic predecessor, matching a static lower bound of Beame and Fich [STOC'99] whenever log n/log w=O(log w/loglog w). Technically, the most interesting aspect of our data structure is that it supports all the above operations in constant time for sets of size n=w^{O(1)}. This resolves a main open problem of Ajtai, Komlos, and Fredman [FOCS'83]. Ajtai et al. presented such a data structure in Yao's abstract cell-probe model with w-bit cells/words, but pointed out that the functions used could not be implemented. As a partial solution to the problem, Fredman and Willard [STOC'90] introduced a fusion node that could handle queries in constant time, but used polynomial time on the updates. We call our small set data structure a dynamic fusion node as it does both queries and updates in constant time.

Comment: Presented with different formatting in Proceedings of the 55th IEEE Symposium on Foundations of Computer Science (FOCS), 2014, pp. 166-175. The new version fixes a bug in one of the bounds stated for predecessor search, pointed out to me by Djamal Belazzougui.
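The semantics of these operations can be pinned down with a reference sketch in Python. It uses a plain sorted list, so updates cost O(n) and queries O(log n); it illustrates only what the operations return, not the dynamic fusion node or its time bounds. The class name `SortedIntSet` is hypothetical, and the strict predecessor convention (largest y < x, mirroring the rank definition) is an assumption of this sketch.

```python
import bisect


class SortedIntSet:
    """Reference semantics only: a sorted Python list with O(n) updates."""

    def __init__(self):
        self._keys = []

    def insert(self, x):
        i = bisect.bisect_left(self._keys, x)
        if i == len(self._keys) or self._keys[i] != x:
            self._keys.insert(i, x)  # keep keys sorted and distinct

    def delete(self, x):
        i = bisect.bisect_left(self._keys, x)
        if i < len(self._keys) and self._keys[i] == x:
            self._keys.pop(i)

    def rank(self, x):
        # Number of stored keys strictly smaller than x.
        return bisect.bisect_left(self._keys, x)

    def predecessor(self, x):
        # Largest stored key strictly smaller than x, or None.
        i = self.rank(x)
        return self._keys[i - 1] if i > 0 else None

    def select(self, i):
        # Key y with rank(y) == i, or None if i is out of range.
        return self._keys[i] if 0 <= i < len(self._keys) else None
```

With this convention, `select(rank(y))` returns `y` for every stored key `y`, which is a convenient sanity check for any faster implementation.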

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System

Towards Tight Lower Bounds for Range Reporting on the RAM

Author: Grønlund Allan
Larsen Kasper Green
Publication date: 03/11/2014

In the orthogonal range reporting problem, we are to preprocess a set of $n$ points with integer coordinates on a $U \times U$ grid. The goal is to support reporting all $k$ points inside an axis-aligned query rectangle. This is one of the most fundamental data structure problems in databases and computational geometry. Despite the importance of the problem, its complexity remains unresolved in the word-RAM. On the upper bound side, three best tradeoffs exist: (1) Query time $O(\lg \lg n + k)$ with $O(n \lg^{\varepsilon} n)$ words of space for any constant $\varepsilon > 0$. (2) Query time $O((1 + k) \lg \lg n)$ with $O(n \lg \lg n)$ words of space. (3) Query time $O((1 + k) \lg^{\varepsilon} n)$ with optimal $O(n)$ words of space. However, the only known query time lower bound is $\Omega(\log \log n + k)$, even for linear space data structures. All three current best upper bound tradeoffs are derived by reducing range reporting to a ball-inheritance problem. Ball-inheritance is a problem that essentially encapsulates all previous attempts at solving range reporting in the word-RAM. In this paper we make progress towards closing the gap between the upper and lower bounds for range reporting by proving cell probe lower bounds for ball-inheritance. Our lower bounds are tight for a large range of parameters, excluding any further progress for range reporting using the ball-inheritance reduction.
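For orientation, the query itself can be stated as a brute-force Python sketch. This is a linear scan with O(n) query time and no preprocessing, nowhere near the tradeoffs listed in the abstract; the function name `range_report_naive` and the closed-rectangle convention are assumptions made for illustration.

```python
def range_report_naive(points, x1, x2, y1, y2):
    """Return all points (x, y) with x1 <= x <= x2 and y1 <= y <= y2, by linear scan."""
    return [(x, y) for (x, y) in points if x1 <= x <= x2 and y1 <= y <= y2]


points = [(1, 1), (2, 5), (4, 3), (6, 2)]
inside = range_report_naive(points, 2, 5, 2, 5)
```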

arXiv.org e-Print Archive

CiteSeerX

Linear-Space Data Structures for Range Mode Query in Arrays

Author: Durocher Stephane
Morrison Jason
Publication date: 01/01/2011

A mode of a multiset $S$ is an element $a \in S$ of maximum multiplicity; that is, $a$ occurs at least as frequently as any other element in $S$. Given a list $A[1:n]$ of $n$ items, we consider the problem of constructing a data structure that efficiently answers range mode queries on $A$. Each query consists of an input pair of indices $(i, j)$ for which a mode of $A[i:j]$ must be returned. We present an $O(n^{2-2\epsilon})$-space static data structure that supports range mode queries in $O(n^\epsilon)$ time in the worst case, for any fixed $\epsilon \in [0, 1/2]$. When $\epsilon = 1/2$, this corresponds to the first linear-space data structure to guarantee $O(\sqrt{n})$ query time. We then describe three additional linear-space data structures that provide $O(k)$, $O(m)$, and $O(|j-i|)$ query time, respectively, where $k$ denotes the number of distinct elements in $A$ and $m$ denotes the frequency of the mode of $A$. Finally, we examine generalizing our data structures to higher dimensions.

Comment: 13 pages, 2 figures
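A range mode query can be illustrated with a direct counting sketch in Python. It scans the queried range on every call, so it runs in time proportional to the range length and is not one of the data structures described in the abstract. The function name `range_mode_naive` is hypothetical, and indices are taken as 1-based and inclusive to match the $A[i:j]$ notation.

```python
from collections import Counter


def range_mode_naive(a, i, j):
    """Return (mode, frequency) of a[i..j], 1-based inclusive, by direct counting."""
    if not (1 <= i <= j <= len(a)):
        raise ValueError("expected 1 <= i <= j <= len(a)")
    counts = Counter(a[i - 1:j])
    mode, freq = counts.most_common(1)[0]  # any element of maximum multiplicity
    return mode, freq


a = [3, 1, 3, 2, 2, 2, 3]
```

When several elements tie for the maximum multiplicity, any of them is a valid mode, so callers should not rely on which one is returned.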

arXiv.org e-Print Archive

CiteSeerX

Cell-Probe Bounds for Online Edit Distance and Other Pattern Matching Problems

Author: Clifford Raphael
Jalsenius Markus
Sach Benjamin
Publication date: 24/07/2014

We give cell-probe bounds for the computation of edit distance, Hamming distance, convolution and longest common subsequence in a stream. In this model, a fixed string of $n$ symbols is given and one $\delta$-bit symbol arrives at a time in a stream. After each symbol arrives, the distance between the fixed string and a suffix of most recent symbols of the stream is reported. The cell-probe model is perhaps the strongest model of computation for showing data structure lower bounds, subsuming in particular the popular word-RAM model.
* We first give an $\Omega((\delta \log n)/(w+\log\log n))$ lower bound for the time to give each output for both online Hamming distance and convolution, where $w$ is the word size. This bound relies on a new encoding scheme and for the first time holds even when $w$ is as small as a single bit.
* We then consider the online edit distance and longest common subsequence problems in the bit-probe model ($w=1$) with a constant sized input alphabet. We give a lower bound of $\Omega(\sqrt{\log n}/(\log\log n)^{3/2})$ which applies for both problems. This second set of results relies both on our new encoding scheme as well as a carefully constructed hard distribution.
* Finally, for the online edit distance problem we show that there is an $O((\log n)^2/w)$ upper bound in the cell-probe model. This bound gives a contrast to our new lower bound and also establishes an exponential gap between the known cell-probe and RAM model complexities.

Comment: 32 pages, 4 figures
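The streaming setting can be sketched for Hamming distance with a naive Python generator. It recomputes the distance from scratch after every symbol, costing O(n) per output, and says nothing about the cell-probe bounds above. The function name `online_hamming` is hypothetical, and reporting `None` until n symbols have arrived is an assumption of this sketch.

```python
from collections import deque


def online_hamming(pattern, stream):
    """After each arriving symbol, yield the Hamming distance between `pattern`
    and the last len(pattern) stream symbols (None until enough have arrived)."""
    n = len(pattern)
    window = deque(maxlen=n)  # keeps only the n most recent symbols
    for symbol in stream:
        window.append(symbol)
        if len(window) < n:
            yield None
        else:
            yield sum(p != s for p, s in zip(pattern, window))


outputs = list(online_hamming("abc", "abcabd"))
```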

arXiv.org e-Print Archive

Explore Bristol Research

Random input helps searching predecessors

Author: Belazzougui D
Kaporis AC
Spirakis PG
Publication date: 01/01/2018

A data structure problem consists of the finite sets: D of data, Q of queries, A of query answers, associated with a function f: D × Q → A. The data structure of file X is "static" ("dynamic") if we "do not" ("do") require quick updates as X changes. An important goal is to compactly encode a file X ∈ D, such that for each query y ∈ Q, function f(X, y) requires the minimum time to compute an answer in A. This goal is trivial if the size of D is large, since for each query y ∈ Q, it was shown that f(X, y) requires O(1) time for the most important queries in the literature. Hence, this goal becomes interesting to study as a trade-off between the "storage space" and the "query time", both measured as functions of the file size n = |X|. The ideal solution would be to use linear O(n) = O(|X|) space, while retaining a constant O(1) query time. However, if f(X, y) computes the static predecessor search (find largest x ∈ X: x ≤ y), then Ajtai [Ajt88] proved a negative result: using just n^{O(1)} = |X|^{O(1)} data space, it is not possible to evaluate f(X, y) in O(1) time ∀y ∈ Q. The proof exhibited a bad distribution of data D, such that there exists a "difficult" query y∗ ∈ Q for which f(X, y∗) requires ω(1) time. Essentially [Ajt88] is an existential result, resolving the worst case scenario. But [Ajt88] left open the question: do we typically, that is, with high probability (w.h.p.), encounter such "difficult" queries y ∈ Q, when assuming reasonable distributions with respect to (w.r.t.) queries and data? Below we make reasonable assumptions w.r.t. the distribution of the queries y ∈ Q, as well as w.r.t. the distribution of data X ∈ D. In two interesting scenarios studied in the literature, we resolve the typical (w.h.p.) query time.
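The static predecessor function f(X, y) defined above (largest x ∈ X with x ≤ y) can be sketched in Python with a sorted array and binary search. This uses linear space and O(log n) query time, so it sits inside the space bound discussed above without achieving constant time; the helper names `build` and `predecessor` are hypothetical.

```python
import bisect


def build(xs):
    """Preprocess the file X into a sorted list of distinct keys (linear space)."""
    return sorted(set(xs))


def predecessor(sorted_keys, y):
    """Return the largest key x with x <= y, or None if no such key exists."""
    i = bisect.bisect_right(sorted_keys, y)  # number of keys <= y
    return sorted_keys[i - 1] if i > 0 else None


keys = build([10, 3, 7])
```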

University of Liverpool Repository