Rank, select and access in grammar-compressed strings
Given a string S of length N on a fixed alphabet of σ symbols, a grammar compressor produces a context-free grammar G of size n that generates S and only S. In this paper we describe data structures to support the following operations on a grammar-compressed string: rank_c(S, i) (return the number of occurrences of symbol c before position i in S); select_c(S, i) (return the position of the i-th occurrence of c in S); and access(S, i, j) (return the substring S[i..j]). For rank and select we describe data structures of size O(nσ log N) bits that support the two operations in O(log N) time. We propose another structure that uses O(nσ log(N/n) (log N)^{1+ε}) bits and that supports the two queries in O(log N / log log N) time, where ε > 0 is an arbitrary constant. To our knowledge, we are the first to study the asymptotic complexity of rank and select in the grammar-compressed setting, and we provide a hardness result showing that significantly improving the bounds we achieve would imply a major breakthrough on a hard graph-theoretical problem. Our main result for access is a method that requires O(n log N) bits of space and O(log N + m / log_σ N) time to extract m = j − i + 1 consecutive symbols from S. Alternatively, we can achieve O(log_τ N + m / log_σ N) query time using O(nτ log_τ N log N) bits of space, for any 2 ≤ τ ≤ log N. This matches a lower bound stated by Verbin and Yu for strings where N is polynomially related to n.
Comment: 16 pages
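For intuition, the three query types have simple semantics that can be made concrete over an uncompressed string. The sketch below is a plain reference implementation of those semantics, not the paper's compressed-space structures (which answer the same queries over the grammar without decompressing S):

```python
def rank(S, c, i):
    """Number of occurrences of symbol c in S before position i, i.e. in S[:i]."""
    return S[:i].count(c)

def select(S, c, i):
    """Position (1-based) of the i-th occurrence of c in S, or -1 if absent."""
    count = 0
    for pos, sym in enumerate(S, start=1):
        if sym == c:
            count += 1
            if count == i:
                return pos
    return -1

def access(S, i, j):
    """Substring S[i..j], 1-based and inclusive on both ends."""
    return S[i - 1:j]

S = "abracadabra"
print(rank(S, "a", 5))    # 'a' occurs twice in S[:5] = "abrac" -> 2
print(select(S, "a", 3))  # the 3rd 'a' sits at position 6
print(access(S, 4, 7))    # -> "acad"
```

Note that select is the partial inverse of rank: select_c(S, rank_c(S, i) + 1) is the first occurrence of c at or after position i.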
Point Location in Constant Time
We preprocess the input planar subdivision to facilitate point location in constant time. Previously, both the preprocessing time and the point-location query time were higher.
Comment: Sairam Chaganti is currently a senior software engineer at Southwest Airlines
Approximate Data Structures with Applications
In this paper we introduce the notion of approximate
data structures, in which a small amount of error is
tolerated in the output. Approximate data structures
trade error of approximation for faster operation, leading to theoretical and practical speedups for a wide variety of algorithms. We give approximate variants of the van Emde Boas data structure, which support the same dynamic operations as the standard van Emde Boas data structure [28, 20], except that answers to queries are approximate. The variants support all operations in constant time provided the error of approximation is 1/polylog(n), and in O(log log n) time provided the error is 1/polynomial(n), for n elements in the data structure.
We consider the tolerance of prototypical algorithms to approximate data structures. We study in particular Prim's minimum-spanning-tree algorithm, Dijkstra's single-source shortest-paths algorithm, and an on-line variant of Graham's convex hull algorithm. To obtain output which approximates the desired output with the error of approximation tending to zero, Prim's algorithm requires only linear time, Dijkstra's algorithm requires O(m log log n) time, and the on-line variant of Graham's algorithm requires constant amortized time per operation.
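The core idea of trading approximation error for speed can be sketched with a toy structure of my own (illustrative only, and deliberately simpler than the paper's van Emde Boas variants): round each positive key to the nearest power of (1 + eps), which shrinks the universe to O(log(U)/eps) buckets, so queries touch a tiny index at the cost of a (1 + eps)-factor error in the answer.

```python
import math

class ApproxPredecessor:
    """Toy approximate predecessor structure (a sketch, not the paper's
    construction).  Positive keys are rounded to powers of (1 + eps),
    collapsing the universe [1, universe] into O(log(universe)/eps)
    buckets; queries return a stored representative within a (1 + eps)
    factor of the true predecessor."""

    def __init__(self, universe, eps):
        self.eps = eps
        self.nbuckets = int(math.log(universe, 1 + eps)) + 2
        self.present = [False] * self.nbuckets

    def _bucket(self, x):
        # Keys are assumed >= 1 so the bucket index is non-negative.
        return int(math.log(x, 1 + self.eps))

    def insert(self, x):
        self.present[self._bucket(x)] = True

    def approx_predecessor(self, y):
        """A stored representative <= (1 + eps) * y, or None if no stored
        key is <= y (up to the rounding error)."""
        for b in range(self._bucket(y), -1, -1):
            if self.present[b]:
                return (1 + self.eps) ** b  # representative of bucket b
        return None

d = ApproxPredecessor(universe=2**20, eps=0.5)
d.insert(100)
print(d.approx_predecessor(150))  # a representative within factor 1.5 of 100
```

The linear scan over buckets keeps the sketch short; a real design would layer a fast successor structure (such as van Emde Boas itself) over the small bucket universe to reach the constant-time regime the abstract describes.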
Optimal rank and select queries on dictionary-compressed text
We study the problem of supporting queries on a string S of length n within a space bounded by the size γ of a string attractor for S. In the paper introducing string attractors it was shown that random access on S can be supported in optimal O(log(n/γ)/log log n) time within O(γ polylog n) space. In this paper, we extend this result to rank and select queries and provide lower bounds matching our upper bounds on alphabets of polylogarithmic size. Our solutions are given in the form of a space-time trade-off that is more general than the one previously known for grammars and that improves existing bounds on LZ77-compressed text by a log log n time factor in select queries. We also provide matching lower and upper bounds for partial sum and predecessor queries within attractor-bounded space, and extend our lower bounds to encompass navigation of dictionary-compressed tree representations.
Random input helps searching predecessors
A data structure problem consists of finite sets D of data, Q of queries, and A of query answers, associated with a function f: D × Q → A. The data structure of a file X is "static" ("dynamic") if we "do not" ("do") require quick updates as X changes. An important goal is to compactly encode a file X ∈ D such that, for each query y ∈ Q, the function f(X, y) requires the minimum time to compute an answer in A. This goal is trivial if the size of D is large, since for each query y ∈ Q it was shown that f(X, y) requires O(1) time for the most important queries in the literature. Hence, this goal becomes interesting to study as a trade-off between the "storage space" and the "query time", both measured as functions of the file size n = |X|. The ideal solution would be to use linear O(n) = O(|X|) space while retaining a constant O(1) query time. However, if f(X, y) computes the static predecessor search (find the largest x ∈ X with x ≤ y), then Ajtai [Ajt88] proved a negative result: using just n^{O(1)} = |X|^{O(1)} data space, it is not possible to evaluate f(X, y) in O(1) time for all y ∈ Q. The proof exhibited a bad distribution of data D such that there exists a "difficult" query y* ∈ Q for which f(X, y*) requires ω(1) time. Essentially, [Ajt88] is an existential result, resolving the worst-case scenario. But [Ajt88] left open the question: do we typically, that is, with high probability (w.h.p.), encounter such "difficult" queries y ∈ Q, when assuming reasonable distributions with respect to (w.r.t.) queries and data? Below we make reasonable assumptions w.r.t. the distribution of the queries y ∈ Q, as well as w.r.t. the distribution of data X ∈ D. In two interesting scenarios studied in the literature, we resolve the typical (w.h.p.) query time.
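The query studied here, f(X, y) = the largest x ∈ X with x ≤ y, has a standard O(log n)-time baseline via binary search over a sorted encoding of X. The sketch below shows that baseline (not the paper's construction); Ajtai's lower bound says this cannot be pushed to O(1) worst-case time with only polynomial space:

```python
import bisect

def predecessor(X_sorted, y):
    """Static predecessor search: largest x in X_sorted with x <= y,
    or None if every stored key exceeds y.  X_sorted must be sorted."""
    i = bisect.bisect_right(X_sorted, y)  # first index with X_sorted[i] > y
    return X_sorted[i - 1] if i > 0 else None

X = sorted([3, 9, 15, 27, 40])
print(predecessor(X, 16))  # -> 15
print(predecessor(X, 40))  # -> 40 (x <= y is inclusive)
print(predecessor(X, 2))   # -> None
```

The question the abstract raises is whether, under reasonable input distributions, the "difficult" queries forcing ω(1) time actually occur with non-negligible probability.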
- …