9 research outputs found
Dynamic Range Majority Data Structures
Given a set of coloured points on the real line, we study the problem of
answering range -majority (or "heavy hitter") queries on . More
specifically, for a query range , we want to return each colour that is
assigned to more than an -fraction of the points contained in . We
present a new data structure for answering range -majority queries on a
dynamic set of points, where . Our data structure uses O(n)
space, supports queries in time, and updates in amortized time. If the coordinates of the points are integers,
then the query time can be improved to . For constant values of , this improved query
time matches an existing lower bound, for any data structure with
polylogarithmic update time. We also generalize our data structure to handle
sets of points in d-dimensions, for , as well as dynamic arrays, in
which each entry is a colour.Comment: 16 pages, Preliminary version appeared in ISAAC 201
Memory-Adjustable Navigation Piles with Applications to Sorting and Convex Hulls
We consider space-bounded computations on a random-access machine (RAM) where
the input is given on a read-only random-access medium, the output is to be
produced to a write-only sequential-access medium, and the available workspace
allows random reads and writes but is of limited capacity. The length of the
input is elements, the length of the output is limited by the computation,
and the capacity of the workspace is bits for some predetermined
parameter . We present a state-of-the-art priority queue---called an
adjustable navigation pile---for this restricted RAM model. Under some
reasonable assumptions, our priority queue supports and
in worst-case time and in worst-case time for any . We show how to use this
data structure to sort elements and to compute the convex hull of
points in the two-dimensional Euclidean space in
worst-case time for any . Following a known lower bound for the
space-time product of any branching program for finding unique elements, both
our sorting and convex-hull algorithms are optimal. The adjustable navigation
pile has turned out to be useful when designing other space-efficient
algorithms, and we expect that it will find its way to yet other applications.Comment: 21 page
Priority queues and sorting for read-only data
Abstract. We revisit the random-access-machine model in which the input is given on a read-only random-access media, the output is to be produced to a write-only sequential-access media, and in addition there is a limited random-access workspace. The length of the input is N elements, the length of the output is limited by the computation itself, and the capacity of the workspace is O(S + w) bits, where S is a parameter specified by the user and w is the number of bits per machine word. We present a state-of-the-art priority queue-called an adjustable navigation pile-for this model. Under some reasonable assumptions, our priority queue supports minimum and insert in O(1) worst-case time and extract in O(N/S +lg S) worst-case time, where lg N ≤ S ≤ N/ lg N . We also show how to use this data structure to simplify the existing optimal O(N 2 /S + N lg S)-time sorting algorithm for this model
Space-Efficient Data Structures in the Word-RAM and Bitprobe Models
This thesis studies data structures in the word-RAM and bitprobe models, with an emphasis on space efficiency. In the word-RAM model of computation the space cost of a data structure is measured in terms of the number of w-bit words stored in memory, and the cost of answering a query is measured in terms of the number of read, write, and arithmetic operations that must be performed. In the bitprobe model, like the word-RAM model, the space cost is measured in terms of the number of bits stored in memory, but the query cost is measured solely in terms of the number of bit accesses, or probes, that are performed.
First, we examine the problem of succinctly representing a partially ordered set, or poset, in the word-RAM model with word size
Theta(lg n) bits. A succinct representation of a combinatorial object is one that occupies space matching the information theoretic lower bound to within lower order terms. We show how to represent a poset on n vertices using a data structure that occupies n^2/4 + o(n^2) bits, and can answer precedence (i.e., less-than) queries in
constant time. Since the transitive closure of a directed acyclic graph is a poset, this implies that we can support reachability
queries on an arbitrary directed graph in the same space bound. As far as we are aware, this is the first representation of an arbitrary directed graph that supports reachability queries in constant time,
and stores less than n choose 2 bits. We also consider several additional query operations.
Second, we examine the problem of supporting range queries on strings
of n characters (or, equivalently, arrays of
n elements) in the word-RAM model with word size Theta(lg n) bits. We focus on the specific problem of answering range majority queries: i.e., given a range, report the
character that is the majority among those in the range, if one exists. We show that these queries can be supported in constant time
using a linear space (in words) data structure. We generalize this
result in several directions, considering various frequency thresholds, geometric variants of the problem, and dynamism. These
results are in stark contrast to recent work on the similar range mode problem, in which the query operation asks for the mode (i.e., most frequent) character in a given range. The current best data structures for the range mode problem take soft-Oh(n^(1/2)) time per query for linear space data structures.
Third, we examine the deterministic membership (or dictionary) problem in the bitprobe model. This problem asks us to store a set of n elements drawn from a universe [1,u] such that membership queries
can be always answered in t bit probes. We present several new fully explicit results for this problem, in particular for the case
when n = 2, answering an open problem posed by Radhakrishnan, Shah, and Shannigrahi [ESA 2010]. We also present a general strategy for the membership problem that can be used to solve many related fundamental problems, such as rank, counting, and emptiness queries.
Finally, we conclude with a list of open problems and avenues for future work
Efficient Data Structures for Partial Orders, Range Modes, and Graph Cuts
This thesis considers the study of data structures from the perspective of the theoretician, with a focus on simplicity and practicality. We consider both the time complexity as well as space usage of proposed solutions. Topics discussed fall in three main categories: partial order representation, range modes, and graph cuts.
We consider two problems in partial order representation. The first is a data structure to represent a lattice. A lattice is a partial order where the set of elements larger than any two elements x and y are all larger than an element z, known as the join of x and y; a similar condition holds for elements smaller than any two elements. Our data structure is the first correct solution that can simultaneously compute joins and the inverse meet operation in sublinear time while also using subquadratic space. The second is a data structure to support queries on a dynamic set of one-dimensional ordered data; that is, essentially any operation computable on a binary search tree. We develop a data structure that is able to interpolate between binary search trees and efficient priority queues, offering more-efficient insertion times than the former when query distribution is non-uniform.
We also consider static and dynamic exact and approximate range mode. Given one-dimensional data, the range mode problem is to compute the mode of a subinterval of the data. In the dynamic range mode problem, insertions and deletions are permitted. For the approximate problem, the element returned is to have frequency no less than a factor (1+epsilon) of the true mode, for some epsilon > 0. Our results include a linear-space dynamic exact range mode data structure that simultaneously improves on best previous operation complexity and an exact dynamic range mode data structure that breaks the Theta(n^(2/3)) time per operation barrier. For approximate range mode, we develop a static succinct data structure offering a logarithmic-factor space improvement and give the first dynamic approximate range mode data structure. We also consider approximate range selection.
The final category discussed is graph and dynamic graph algorithms. We develop an optimal offline data structure for dynamic 2- and 3- edge and vertex connectivity. Here, the data structure is given the entire sequence of operations in advance, and the dynamic operations are edge insertion and removal. Finally, we give a simplification of Karger's near-linear time minimum cut algorithm, utilizing heavy-light decomposition and iteration in place of dynamic programming in the subroutine to find a minimum cut of a graph G that cuts at most two edges of a spanning tree T of G
Compressed Dynamic Range Majority Data Structures
In the range α-majority query problem, we preprocess a given sequence S[1.n] for a fixed threshold α ϵ (0, 1], such that given a query range [i.j], the symbols that occur more than α (j-i+1) times in S[i.j] can be reported efficiently. We design the first compressed solution to this problem in dynamic settings. Our data structure represents S using nHk o(nlg σ) bits for any k = o(log σ n), where σ is the alphabet size and Hk is the k-Th order empirical entropy of S. It answers range α-majority queries in O(lgn/αlg lgn) time, and supports insertions and deletions in O(lgn/α) amortized time. The best previous solution [1] has the same query and update times, but uses O(n) words