19 research outputs found
Range Quantile Queries: Another Virtue of Wavelet Trees
We show how to use a balanced wavelet tree as a data structure that stores a
list of numbers and supports efficient {\em range quantile queries}. A range
quantile query takes a rank and the endpoints of a sublist and returns the
number with that rank in that sublist. For example, if the rank is half the
sublist's length, then the query returns the sublist's median. We also show how
these queries can be used to support space-efficient {\em coloured range
reporting} and {\em document listing}.Comment: Added note about generalization to any constant number of dimensions
Linear-Space Data Structures for Range Mode Query in Arrays
A mode of a multiset is an element of maximum multiplicity;
that is, occurs at least as frequently as any other element in . Given a
list of items, we consider the problem of constructing a data
structure that efficiently answers range mode queries on . Each query
consists of an input pair of indices for which a mode of must
be returned. We present an -space static data structure
that supports range mode queries in time in the worst case, for
any fixed . When , this corresponds to
the first linear-space data structure to guarantee query time. We
then describe three additional linear-space data structures that provide
, , and query time, respectively, where denotes the
number of distinct elements in and denotes the frequency of the mode of
. Finally, we examine generalizing our data structures to higher dimensions.Comment: 13 pages, 2 figure
Asymptotically Optimal Encodings of Range Data Structures for Selection and Top-k Queries
Given an array A[1, n] of elements with a total order, we consider the problem of building a
data structure that solves two queries: (a) selection queries receive a range [i, j] and an integer
k and return the position of the kth largest element in A[i, j]; (b) top-k queries receive [i, j] and
k and return the positions of the k largest elements in A[i, j]. These problems can be solved in
optimal time, O(1 + lg k/ lg lg n) and O(k), respectively, using linear-space data structures.
We provide the first study of the encoding data structures for the above problems, where A
cannot be accessed at query time. Several applications are interested in the relative order of the
entries of A, and their positions, rather their actual values, and thus we do not need to keep A
at query time. In those cases, encodings save storage space: we first show that any encoding
answering such queries requires n lg k − O(n + k lg k) bits of space; then, we design encodings
using O(n lg k) bits, that is, asymptotically optimal up to constant factors, while preserving
optimal query time.Peer-reviewedPost-prin
Path Queries in Weighted Trees
Trees are fundamental structures in computer science, being widely used in modeling
and representing different types of data in numerous computer applications. In many cases,
properties of objects being modeled are stored as weights or labels on the nodes of trees.
Thus researchers have studied the preprocessing of weighted trees in which each node is
assigned a weight, in order to support various path queries, for which a certain function
over the weights of the nodes along a given query path in the tree is computed [3, 14, 22, 26].
In this thesis, we consider the problem of supporting several various path queries over
a tree on n weighted nodes, where the weights are drawn from a set of σ distinct values.
One query we support is the path median query, which asks for the median weight on a
path between two given nodes. For this and the more general path selection query, we
present a linear space data structure that answers queries in O(lg σ) time under the word
RAM model. This greatly improves previous results on the same problem, as previous data
structures achieving O(lg n) query time use O(n lg^2 n) space, and previous linear space data
structures require O(n^ε) time to answer a query for any positive constant ε [26].
We also consider the path counting query and the path reporting query, where a path
counting query asks for the number of nodes on a query path whose weights are in a
query range, and a path reporting query requires to report these nodes. Our linear space
data structure supports path counting queries with O(lg σ) query time. This matches
the result of Chazelle [14] when σ is close to n, and has better performance when σ is
significantly smaller than n. The same data structure can also support path reporting
queries in O(lg σ + occ lg σ) time, where occ is the size of output. In addition, we present
a data structure that answers path reporting queries in O(lg σ + occ lg lg σ) time, using
O(n lg lg σ) words of space. These are the first data structures that answer path reporting
queries