Using Hashing to Solve the Dictionary Problem (In External Memory)
We consider the dictionary problem in external memory and improve the update
time of the well-known buffer tree by roughly a logarithmic factor. For any
\lambda >= max {lg lg n, log_{M/B} (n/B)}, we can support updates in time
O(\lambda / B) and queries in sublogarithmic time, O(log_\lambda n). We also
present a lower bound in the cell-probe model showing that our data structure
is optimal.
In the RAM, hash tables have been used to solve the dictionary problem faster
than binary search for more than half a century. By contrast, our data
structure is the first to beat the comparison barrier in external memory. Ours
is also the first data structure to depart convincingly from the indivisibility
paradigm.
Cell-Probe Bounds for Online Edit Distance and Other Pattern Matching Problems
We give cell-probe bounds for the computation of edit distance, Hamming
distance, convolution and longest common subsequence in a stream. In this
model, a fixed string of symbols is given and one -bit symbol
arrives at a time in a stream. After each symbol arrives, the distance between
the fixed string and a suffix of most recent symbols of the stream is reported.
The cell-probe model is perhaps the strongest model of computation for showing
data structure lower bounds, subsuming in particular the popular word-RAM
model.
* We first give an lower bound for
the time to give each output for both online Hamming distance and convolution,
where is the word size. This bound relies on a new encoding scheme and for
the first time holds even when is as small as a single bit.
* We then consider the online edit distance and longest common subsequence
problems in the bit-probe model () with a constant sized input alphabet.
We give a lower bound of which
applies for both problems. This second set of results relies both on our new
encoding scheme as well as a carefully constructed hard distribution.
* Finally, for the online edit distance problem we show that there is an
upper bound in the cell-probe model. This bound gives a
contrast to our new lower bound and also establishes an exponential gap between
the known cell-probe and RAM model complexities.
Comment: 32 pages, 4 figures
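To make the streaming setting concrete, here is a minimal sketch of the online Hamming distance problem described above. It is a naive baseline that recomputes the distance from scratch after every arriving symbol, not a method from the paper; the function name `online_hamming` is a hypothetical choice for illustration.

```python
from collections import deque


def online_hamming(pattern, stream):
    """Yield one output per arriving symbol.

    The output is the Hamming distance between `pattern` and the most
    recent len(pattern) symbols of `stream`, or None while fewer than
    len(pattern) symbols have arrived.
    """
    n = len(pattern)
    window = deque(maxlen=n)  # automatically drops the oldest symbol
    for symbol in stream:
        window.append(symbol)
        if len(window) < n:
            yield None
        else:
            # Naive recomputation: compare every position of the window.
            yield sum(p != s for p, s in zip(pattern, window))
```

This baseline spends time linear in the pattern length on each output; the bounds in the abstract concern how much better than that any data structure can do per arriving symbol.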
Towards the Efficient Generation of Gray Codes in the Bitprobe Model
We examine the problem of representing integers modulo L so that both increment and decrement operations can be performed efficiently. This problem is studied in the bitprobe model, where the complexity of the underlying problem is measured by the number of bit operations performed on the data structure. In this thesis, we will primarily be interested in constructing space-optimal data structures. That is, we would like to use exactly n bits to represent integers modulo 2^n. Brodal et al. gave such a data structure, which requires n-1 bit reads and 3 bit writes, in the worst case, to perform increment and decrement operations. We provide several improvements to their data structure. First, we give a data structure that requires n-1 bit reads and 2 bit writes, in the worst case, to perform increment and decrement operations. Then, we refine this result to obtain a data structure that requires n-1 bit reads and a single bit write to perform both operations. This disproves the conjecture that, when a space-optimal data structure uses only 1 bit write to perform these operations, then every bit in the data structure must be inspected in the worst case.
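For orientation, the sketch below shows the textbook binary reflected Gray code, not the thesis's construction. It uses exactly n bits for integers modulo 2^n and writes a single bit per increment, but this straightforward version inspects all n bits, which is the behaviour the conjecture held to be unavoidable. The function name `gray_increment` and the read/write counters are assumptions added for illustration.

```python
def gray_increment(bits):
    """Advance a binary reflected Gray code in place by one step.

    bits[0] is the least significant bit. Returns (reads, writes),
    the number of bit reads and bit writes performed.
    """
    n = len(bits)
    reads = 0
    parity = 0
    lowest_one = None
    for i in range(n):
        reads += 1  # every bit is inspected in this simple version
        if bits[i]:
            parity ^= 1
            if lowest_one is None:
                lowest_one = i
    if parity == 0:
        flip = 0  # even parity: flip the least significant bit
    else:
        # Odd parity: flip the bit just above the lowest 1; if that 1 is
        # already the top bit, flip the top bit to wrap around to zero.
        flip = min(lowest_one + 1, n - 1)
    bits[flip] ^= 1
    return reads, 1
```

Starting from all zeros with n = 3, repeated calls visit 000, 001, 011, 010, 110, 111, 101, 100 (written most significant bit first) and then return to 000.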
On the Fastest Vickrey Algorithm
We investigate the algorithmic performance of Vickrey-Clarke-Groves mechanisms in the single item case. We provide a formal definition of a Vickrey algorithm for this framework, and give a number of examples of Vickrey algorithms. We consider three performance criteria, one corresponding to a Pareto criterion, one to worst-case analysis, and one related to first-order stochastic dominance. We show that Pareto best Vickrey algorithms do not exist and that worst-case analysis is of no use in discriminating between Vickrey algorithms. For the case of two bidders, we show that the bisection auction stochastically dominates all Vickrey algorithms. We extend our analysis to the study of weak Vickrey algorithms and winner determination algorithms. For the case of two bidders, we show that the One-Search algorithm stochastically dominates all column monotonic weak Vickrey algorithms and that a suitably adjusted version of the bisection algorithm, the WD bisection algorithm
Lower Bound Framework for Differentially Private and Oblivious Data Structures
In recent years, there has been significant work in studying data structures that provide privacy for the operations that are executed. These primitives aim to guarantee that observable access patterns to physical memory do not reveal substantial information about the queries and updates executed on the data structure. Multiple recent works, including Larsen and Nielsen [Crypto'18], Persiano and Yeo [Eurocrypt'19], Hubáček et al. [TCC'19] and Komargodski and Lin [Crypto'21], have shown that logarithmic overhead is required to support even basic RAM (array) operations for various privacy notions including obliviousness and differential privacy as well as different choices of sizes for RAM blocks and memory cells.
We continue along this line of work and present the first logarithmic lower bounds for differentially private RAMs (DPRAMs) that apply regardless of the sizes of blocks and cells. These are the first logarithmic lower bounds for DPRAMs when blocks are significantly smaller than cells. Furthermore, we present new logarithmic lower bounds for differentially private variants of classical data structure problems including sets, predecessor (successor) and disjoint sets (union-find) for which sub-logarithmic plaintext constructions are known. All our lower bounds extend to the multiple non-colluding servers setting.
We also address an unfortunate issue with this rich line of work where the lower bound techniques are difficult to use and require customization for each new result. To make the techniques more accessible, we generalize our proofs into a framework that reduces proving logarithmic lower bounds to showing that a specific problem satisfies two simple, minimal conditions. We show that our framework is easy to use, as all the lower bounds in our paper utilize it, and we hope it will spur more usage of these lower bound techniques.
Data Structuring Problems in the Bit Probe Model
We study two data structuring problems under the bit probe model: the dynamic predecessor problem and integer representation in a manner supporting basic updates in as few bit operations as possible. The model of computation considered in this paper is the bit probe model. In this model, the complexity measure counts only the bitwise accesses to the data structure. The model ignores the cost of computation. As a result, the bit probe complexity of a data structuring problem can be considered as a fundamental measure of the problem. Lower bounds derived by this model are valid as lower bounds for any realistic, sequential model of computation. Furthermore, some of the problems are more suitable for study in this model, as they can be solved using fewer bit probes than the number of bits in a computer word.
The predecessor problem is one of the fundamental problems in computer science with numerous applications and has been studied for several decades. We study the colored predecessor problem, a variation of the predecessor problem, in which each element is associated with a symbol from a finite alphabet or color. The problem is to store a subset of size from a finite universe so that to support efficient insertion, deletion and queries to determine the color of the largest value in which is not larger than for a given We present a data structure for the problem that requires bit probes for the query and bit probes for the update operations, where is the universe size and is positive constant. We also show that the results on the colored predecessor problem can be used to solve some other related problems such as existential range query, dynamic prefix sum, segment representative, connectivity problems, etc.
The second structure considered is for integer representation. We examine the problem of integer representation in a nearly minimal number of bits so that increment and decrement (and indeed addition and subtraction) can be performed using few bit inspections and fewer bit changes. In particular, we prove a new lower bound of for the increment and decrement operation, where is the minimum number of bits required to represent the number. We present several efficient data structures to represent integers that use a logarithmic number of bit inspections and a constant number of bit changes per operation.
Space-Efficient Data Structures in the Word-RAM and Bitprobe Models
This thesis studies data structures in the word-RAM and bitprobe models, with an emphasis on space efficiency. In the word-RAM model of computation the space cost of a data structure is measured in terms of the number of w-bit words stored in memory, and the cost of answering a query is measured in terms of the number of read, write, and arithmetic operations that must be performed. In the bitprobe model, like the word-RAM model, the space cost is measured in terms of the number of bits stored in memory, but the query cost is measured solely in terms of the number of bit accesses, or probes, that are performed.
First, we examine the problem of succinctly representing a partially ordered set, or poset, in the word-RAM model with word size Theta(lg n) bits. A succinct representation of a combinatorial object is one that occupies space matching the information theoretic lower bound to within lower order terms. We show how to represent a poset on n vertices using a data structure that occupies n^2/4 + o(n^2) bits, and can answer precedence (i.e., less-than) queries in constant time. Since the transitive closure of a directed acyclic graph is a poset, this implies that we can support reachability queries on an arbitrary directed graph in the same space bound. As far as we are aware, this is the first representation of an arbitrary directed graph that supports reachability queries in constant time, and stores less than n choose 2 bits. We also consider several additional query operations.
Second, we examine the problem of supporting range queries on strings of n characters (or, equivalently, arrays of n elements) in the word-RAM model with word size Theta(lg n) bits. We focus on the specific problem of answering range majority queries: i.e., given a range, report the character that is the majority among those in the range, if one exists. We show that these queries can be supported in constant time using a linear space (in words) data structure. We generalize this result in several directions, considering various frequency thresholds, geometric variants of the problem, and dynamism. These results are in stark contrast to recent work on the similar range mode problem, in which the query operation asks for the mode (i.e., most frequent) character in a given range. The current best data structures for the range mode problem take soft-Oh(n^(1/2)) time per query for linear space data structures.
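As a point of reference for the range majority query defined above, the following sketch answers a single query with no preprocessing. It uses the Boyer-Moore voting algorithm followed by a verification pass, so it takes time linear in the range length rather than the constant time achieved in the thesis. The function name `range_majority` and the inclusive index convention are assumptions made for this example.

```python
def range_majority(a, lo, hi):
    """Return the majority element of a[lo..hi] (inclusive), or None.

    A majority element occurs in more than half of the positions
    of the range.
    """
    segment = a[lo:hi + 1]
    candidate, count = None, 0
    for x in segment:  # Boyer-Moore voting pass
        if count == 0:
            candidate, count = x, 1
        elif x == candidate:
            count += 1
        else:
            count -= 1
    # The voting pass only proposes a candidate; verify it explicitly.
    occurrences = sum(1 for x in segment if x == candidate)
    return candidate if 2 * occurrences > len(segment) else None
```

For the array `list("abacaab")`, the query over the whole array reports `"a"` (4 of 7 positions), while the query over positions 1 to 3 reports `None`.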
Third, we examine the deterministic membership (or dictionary) problem in the bitprobe model. This problem asks us to store a set of n elements drawn from a universe [1,u] such that membership queries can be always answered in t bit probes. We present several new fully explicit results for this problem, in particular for the case when n = 2, answering an open problem posed by Radhakrishnan, Shah, and Shannigrahi [ESA 2010]. We also present a general strategy for the membership problem that can be used to solve many related fundamental problems, such as rank, counting, and emptiness queries.
Finally, we conclude with a list of open problems and avenues for future work.
Lower bound techniques for data structures
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 135-143).
We describe new techniques for proving lower bounds on data-structure problems, with the following broad consequences:
* the first Ω(lg n) lower bound for any dynamic problem, improving on a bound that had been standing since 1989;
* for static data structures, the first separation between linear and polynomial space. Specifically, for some problems that have constant query time when polynomial space is allowed, we can show Ω(lg n / lg lg n) bounds when the space is O(n · polylog n).
Using these techniques, we analyze a variety of central data-structure problems, and obtain improved lower bounds for the following:
* the partial-sums problem (a fundamental application of augmented binary search trees);
* the predecessor problem (which is equivalent to IP lookup in Internet routers);
* dynamic trees and dynamic connectivity;
* orthogonal range stabbing;
* orthogonal range counting, and orthogonal range reporting;
* the partial match problem (searching with wild-cards);
* (1 + ε)-approximate near neighbor on the hypercube;
* approximate nearest neighbor in the ℓ∞ metric.
Our new techniques lead to surprisingly non-technical proofs. For several problems, we obtain simpler proofs for bounds that were already known.
by Mihai Pǎtraşcu. Ph.D.
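To illustrate the partial-sums problem named in the list above, here is a minimal sketch of a Fenwick (binary indexed) tree, a textbook structure that supports point updates and prefix-sum queries in O(lg n) time each. It is an upper-bound example chosen for illustration and is not taken from the thesis; the class and method names are assumptions.

```python
class FenwickTree:
    """Maintain an array of n numbers under point updates and prefix sums."""

    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)  # 1-based internal array

    def update(self, i, delta):
        """Add `delta` to the element at 0-based index i."""
        i += 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i  # move to the next node covering index i

    def prefix_sum(self, k):
        """Return the sum of the first k elements (indices 0..k-1)."""
        total = 0
        while k > 0:
            total += self.tree[k]
            k -= k & -k  # strip the lowest set bit
        return total
```

Each operation touches at most about lg n entries of the internal array, which is the kind of cost that lower bounds for this problem are compared against.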