
    Using Hashing to Solve the Dictionary Problem (In External Memory)

    We consider the dictionary problem in external memory and improve the update time of the well-known buffer tree by roughly a logarithmic factor. For any λ ≥ max{lg lg n, log_{M/B}(n/B)}, we can support updates in time O(λ/B) and queries in sublogarithmic time, O(log_λ n). We also present a lower bound in the cell-probe model showing that our data structure is optimal. In the RAM, hash tables have been used to solve the dictionary problem faster than binary search for more than half a century. By contrast, our data structure is the first to beat the comparison barrier in external memory. Ours is also the first data structure to depart convincingly from the indivisibility paradigm.
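
    The buffer-tree idea behind such update bounds can be made concrete with a toy sketch: updates accumulate in a node's buffer and are pushed down in batches, so each update pays only about 1/B of the cost of each flush. The sketch below is a minimal illustration in Python with an invented two-child Node layout and a toy block size B; it is not the paper's data structure, which uses high fanout and further machinery to reach O(λ/B).

```python
# Toy buffer-tree sketch (invented layout, not the paper's structure):
# updates accumulate in a node's buffer and are flushed in batches of B,
# so each update is charged roughly 1/B of the work of every flush it joins.

B = 4  # toy block size

class Node:
    def __init__(self, children=None, router=None):
        self.buffer = []          # pending (key, op) updates
        self.children = children  # None => leaf
        self.router = router      # keys below router go left, others right
        self.items = set()        # materialized keys at a leaf

    def update(self, key, op):
        self.buffer.append((key, op))
        if len(self.buffer) >= B:
            self.flush()

    def flush(self):
        pending, self.buffer = self.buffer, []
        if self.children is None:  # leaf: apply the whole batch directly
            for key, op in pending:
                (self.items.add if op == "ins" else self.items.discard)(key)
        else:                      # internal node: push the batch one level down
            for key, op in pending:
                target = self.children[0] if key < self.router else self.children[1]
                target.update(key, op)

root = Node(children=[Node(), Node()], router=100)
for k in [5, 120, 7, 130]:  # the fourth update triggers a batched flush
    root.update(k, "ins")
```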

    Cell-Probe Bounds for Online Edit Distance and Other Pattern Matching Problems

    We give cell-probe bounds for the computation of edit distance, Hamming distance, convolution and longest common subsequence in a stream. In this model, a fixed string of n symbols is given and one δ-bit symbol arrives at a time in a stream. After each symbol arrives, the distance between the fixed string and a suffix of the most recent symbols of the stream is reported. The cell-probe model is perhaps the strongest model of computation for showing data structure lower bounds, subsuming in particular the popular word-RAM model.
    * We first give an Ω((δ log n)/(w + log log n)) lower bound on the time to produce each output for both online Hamming distance and convolution, where w is the word size. This bound relies on a new encoding scheme and for the first time holds even when w is as small as a single bit.
    * We then consider the online edit distance and longest common subsequence problems in the bit-probe model (w = 1) with a constant-sized input alphabet. We give a lower bound of Ω(√(log n)/(log log n)^(3/2)) which applies to both problems. This second set of results relies both on our new encoding scheme and on a carefully constructed hard distribution.
    * Finally, for the online edit distance problem we show that there is an O((log n)^2/w) upper bound in the cell-probe model. This bound contrasts with our new lower bound and also establishes an exponential gap between the known cell-probe and RAM model complexities.
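
    As a baseline for this streaming model, the sketch below computes each online Hamming distance output naively, comparing the pattern against the window of the most recent symbols after every arrival. This costs O(n) comparisons per output in the RAM; the paper's results bound how few cell probes per output are possible. The function name and interface are illustrative only.

```python
# Naive streaming baseline: after each arriving symbol, compare the fixed
# pattern against the window of the most recent len(pattern) symbols.
# One output per arrival, O(n) comparisons each; the paper's bounds ask how
# few cell probes per output any data structure can get away with.

from collections import deque

def online_hamming(pattern, stream):
    window = deque(maxlen=len(pattern))
    for symbol in stream:
        window.append(symbol)
        if len(window) == len(pattern):
            yield sum(p != s for p, s in zip(pattern, window))

print(list(online_hamming("ab", "abba")))  # [0, 1, 2]
```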

    Towards the Efficient Generation of Gray Codes in the Bitprobe Model

    We examine the problem of representing integers modulo L so that both increment and decrement operations can be performed efficiently. This problem is studied in the bitprobe model, where the complexity of the underlying problem is measured by the number of bit operations performed on the data structure. In this thesis, we are primarily interested in constructing space-optimal data structures; that is, we would like to use exactly n bits to represent integers modulo 2^n. Brodal et al. gave such a data structure, which requires n-1 bit reads and 3 bit writes, in the worst case, to perform increment and decrement operations. We provide several improvements to their data structure. First, we give a data structure that requires n-1 bit reads and 2 bit writes, in the worst case, to perform increment and decrement operations. Then, we refine this result to obtain a data structure that requires n-1 bit reads and a single bit write to perform both operations. This disproves the conjecture that, when a space-optimal data structure uses only 1 bit write to perform these operations, every bit in the data structure must be inspected in the worst case.
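
    The single-bit-write phenomenon at the heart of these results is easy to see with the standard binary-reflected Gray code, sketched below: successive codewords differ in exactly one bit, so an increment changes a single bit. This standard construction is only an illustration; the thesis's data structures additionally bound the number of bits read, which a plain Gray code does not.

```python
# Successive binary-reflected Gray codewords differ in exactly one bit, so an
# increment writes a single bit. Standard construction, shown for illustration.

def gray(i, n):
    """The i-th codeword of the n-bit binary-reflected Gray code."""
    return (i ^ (i >> 1)) & ((1 << n) - 1)

n = 4
codes = [gray(i, n) for i in range(2 ** n)]
for a, b in zip(codes, codes[1:] + codes[:1]):  # cyclic: includes wraparound
    assert bin(a ^ b).count("1") == 1           # each increment flips one bit
```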

    On the Fastest Vickrey Algorithm

    We investigate the algorithmic performance of Vickrey-Clarke-Groves mechanisms in the single item case. We provide a formal definition of a Vickrey algorithm for this framework, and give a number of examples of Vickrey algorithms. We consider three performance criteria: one corresponding to a Pareto criterion, one to worst-case analysis, and one related to first-order stochastic dominance. We show that Pareto best Vickrey algorithms do not exist and that worst-case analysis is of no use in discriminating between Vickrey algorithms. For the case of two bidders, we show that the bisection auction stochastically dominates all Vickrey algorithms. We extend our analysis to the study of weak Vickrey algorithms and winner determination algorithms. For the case of two bidders, we show that the One-Search algorithm stochastically dominates all column monotonic weak Vickrey algorithms and that a suitably adjusted version of the bisection algorithm, the WD bisection algorithm, stochastically dominates all winner determination algorithms.
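
    The bisection idea can be sketched for two bidders with private integer values in [0, 2^n): binary-search a threshold price, asking each active bidder only whether their value meets the threshold, until the winner and the Vickrey (second-highest) price are determined. The sketch below is illustrative; the function name and details are invented, not taken from the paper.

```python
# Illustrative bisection-style Vickrey auction for two bidders with private
# integer values in [0, 2**n). Bidders are only asked threshold queries
# ("is your value >= mid?"); the search ends with the winner and the Vickrey
# (second-highest) price. Names and details are invented for this sketch.

def bisection_auction(v, n):
    lo, hi = 0, 2 ** n   # invariant: every active bidder's value is in [lo, hi)
    active, winner = list(v), None
    while hi - lo > 1:
        mid = (lo + hi) // 2
        yes = [b for b in active if v[b] >= mid]  # one threshold query each
        if winner is None and len(yes) == 1:
            winner = yes[0]                       # unique top bidder found
            active = [b for b in active if b != winner]
            hi = mid                              # loser's value is in [lo, mid)
        elif yes:
            lo, active = mid, yes                 # still tied above mid: go up
        else:
            hi = mid                              # all active values below mid
    if winner is None:
        winner = active[0]                        # exact tie, broken arbitrarily
    return winner, lo                             # lo equals the Vickrey price

print(bisection_auction({"x": 11, "y": 5}, 4))    # ('x', 5)
```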

    Lower Bound Framework for Differentially Private and Oblivious Data Structures

    In recent years, there has been significant work in studying data structures that provide privacy for the operations that are executed. These primitives aim to guarantee that observable access patterns to physical memory do not reveal substantial information about the queries and updates executed on the data structure. Multiple recent works, including Larsen and Nielsen [Crypto'18], Persiano and Yeo [Eurocrypt'19], Hubáček et al. [TCC'19] and Komargodski and Lin [Crypto'21], have shown that logarithmic overhead is required to support even basic RAM (array) operations for various privacy notions, including obliviousness and differential privacy, as well as different choices of sizes for RAM blocks b and memory cells ω. We continue along this line of work and present the first logarithmic lower bounds for differentially private RAMs (DPRAMs) that apply regardless of the sizes of blocks b and cells ω. These are the first logarithmic lower bounds for DPRAMs when blocks are significantly smaller than cells, that is, b ≪ ω. Furthermore, we present new logarithmic lower bounds for differentially private variants of classical data structure problems, including sets, predecessor (successor) and disjoint sets (union-find), for which sub-logarithmic plaintext constructions are known. All our lower bounds extend to the multiple non-colluding servers setting. We also address an unfortunate issue with this rich line of work: the lower bound techniques are difficult to use and require customization for each new result. To make the techniques more accessible, we generalize our proofs into a framework that reduces proving logarithmic lower bounds to showing that a specific problem satisfies two simple, minimal conditions. The framework is easy to use: all the lower bounds in this paper are proved within it, and we hope it will spur wider adoption of these techniques.
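
    The notion of obliviousness at stake here can be illustrated by the trivial zero-leakage baseline: touch every cell on every access so the physical access pattern is identical for all queries. The sketch below is only this folklore baseline, whose Θ(n) overhead sits at the opposite extreme of the logarithmic lower bounds the paper proves; it is not a construction from the paper.

```python
# Folklore zero-leakage baseline: touch every cell on every access, so the
# physical access pattern is the same no matter which index is queried.
# Theta(n) overhead per access; the paper's lower bounds show any scheme with
# (differential) privacy must still pay a logarithmic factor.

def oblivious_read(memory, secret_index):
    value = None
    for i in range(len(memory)):   # scan all cells regardless of the query
        cell = memory[i]           # dummy touch keeps the pattern fixed
        if i == secret_index:
            value = cell
    return value
```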

    Data Structuring Problems in the Bit Probe Model

    We study two data structuring problems under the bit probe model: the dynamic predecessor problem and integer representation in a manner supporting basic updates in as few bit operations as possible. In the bit probe model, the complexity measure counts only the bitwise accesses to the data structure and ignores the cost of computation. As a result, the bit probe complexity of a data structuring problem can be considered a fundamental measure of the problem. Lower bounds derived in this model are valid for any realistic, sequential model of computation. Furthermore, some problems are particularly suited to study in this model as they can be solved using fewer than w bit probes, where w is the size of a computer word. The predecessor problem is one of the fundamental problems in computer science, with numerous applications, and has been studied for several decades. We study the colored predecessor problem, a variation of the predecessor problem in which each element is associated with a symbol from a finite alphabet, or color. The problem is to store a subset S of size n from a finite universe U so as to support efficient insertion, deletion, and queries that determine the color of the largest value in S which is not larger than x, for a given x ∈ U. We present a data structure for the problem that requires O(k · (log U / log log U)^(1/k)) bit probes for queries and O(k^2 · log U / log log U) bit probes for updates, where U is the universe size and k is a positive constant. We also show that the results on the colored predecessor problem can be used to solve related problems such as existential range queries, dynamic prefix sums, segment representatives, and connectivity problems.

    The second structure considered is for integer representation. We examine the problem of representing integers in a nearly minimal number of bits so that increment and decrement (and indeed addition and subtraction) can be performed using few bit inspections and fewer bit changes. In particular, we prove a new lower bound of Ω(√n) for the increment and decrement operations, where n is the minimum number of bits required to represent the number. We present several efficient data structures to represent integers that use a logarithmic number of bit inspections and a constant number of bit changes per operation.
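
    To fix the interface of the colored predecessor problem, the sketch below gives a word-RAM baseline: keep the set sorted and binary-search for the largest element not larger than x, returning its color (deletion omitted for brevity). The thesis instead counts individual bit probes, where each word inspected here costs on the order of log U bit reads. The class and method names are invented for illustration.

```python
# Word-RAM baseline for colored predecessor (deletion omitted): a sorted list
# plus binary search for the largest key <= x. Each word inspected here costs
# about log U bit reads in the bit probe model that the thesis works in.

import bisect

class ColoredPredecessor:
    def __init__(self):
        self.keys, self.colors = [], []

    def insert(self, key, color):
        i = bisect.bisect_left(self.keys, key)
        self.keys.insert(i, key)
        self.colors.insert(i, color)

    def color_of_predecessor(self, x):
        i = bisect.bisect_right(self.keys, x)  # index after last key <= x
        return self.colors[i - 1] if i else None

cp = ColoredPredecessor()
cp.insert(10, "red"); cp.insert(20, "blue")
print(cp.color_of_predecessor(15))  # red
```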

    Space-Efficient Data Structures in the Word-RAM and Bitprobe Models

    This thesis studies data structures in the word-RAM and bitprobe models, with an emphasis on space efficiency. In the word-RAM model of computation the space cost of a data structure is measured in terms of the number of w-bit words stored in memory, and the cost of answering a query is measured in terms of the number of read, write, and arithmetic operations that must be performed. In the bitprobe model, like the word-RAM model, the space cost is measured in terms of the number of bits stored in memory, but the query cost is measured solely in terms of the number of bit accesses, or probes, that are performed.

    First, we examine the problem of succinctly representing a partially ordered set, or poset, in the word-RAM model with word size Theta(lg n) bits. A succinct representation of a combinatorial object is one that occupies space matching the information theoretic lower bound to within lower order terms. We show how to represent a poset on n vertices using a data structure that occupies n^2/4 + o(n^2) bits, and can answer precedence (i.e., less-than) queries in constant time. Since the transitive closure of a directed acyclic graph is a poset, this implies that we can support reachability queries on an arbitrary directed graph in the same space bound. As far as we are aware, this is the first representation of an arbitrary directed graph that supports reachability queries in constant time, and stores less than n choose 2 bits. We also consider several additional query operations.

    Second, we examine the problem of supporting range queries on strings of n characters (or, equivalently, arrays of n elements) in the word-RAM model with word size Theta(lg n) bits. We focus on the specific problem of answering range majority queries: i.e., given a range, report the character that is the majority among those in the range, if one exists. We show that these queries can be supported in constant time using a linear space (in words) data structure. We generalize this result in several directions, considering various frequency thresholds, geometric variants of the problem, and dynamism. These results are in stark contrast to recent work on the similar range mode problem, in which the query operation asks for the mode (i.e., most frequent) character in a given range. The current best data structures for the range mode problem take soft-Oh(n^(1/2)) time per query for linear space data structures.

    Third, we examine the deterministic membership (or dictionary) problem in the bitprobe model. This problem asks us to store a set of n elements drawn from a universe [1,u] such that membership queries can always be answered in t bit probes. We present several new fully explicit results for this problem, in particular for the case when n = 2, answering an open problem posed by Radhakrishnan, Shah, and Shannigrahi [ESA 2010]. We also present a general strategy for the membership problem that can be used to solve many related fundamental problems, such as rank, counting, and emptiness queries.

    Finally, we conclude with a list of open problems and avenues for future work.
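
    The range majority query semantics can be made concrete with a linear-time-per-query baseline: run the Boyer-Moore majority vote over the queried range and verify the candidate with a second pass, as sketched below. The thesis achieves constant-time queries with a linear-space structure; this naive scan is only to pin down what a query returns.

```python
# Linear-time-per-query baseline for range majority: Boyer-Moore majority vote
# over s[l..r], then a verification pass. The thesis's structure answers such
# queries in constant time; this scan only fixes the query semantics.

def range_majority(s, l, r):
    candidate, count = None, 0
    for c in s[l:r + 1]:                        # vote phase
        if count == 0:
            candidate, count = c, 1
        else:
            count += 1 if c == candidate else -1
    need = (r - l + 1) // 2 + 1                 # strict majority threshold
    return candidate if s[l:r + 1].count(candidate) >= need else None

print(range_majority("abacaa", 0, 5))  # a  (4 of 6 positions)
```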

    Lower bound techniques for data structures

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 135-143).

    We describe new techniques for proving lower bounds on data-structure problems, with the following broad consequences:
    * the first Ω(lg n) lower bound for any dynamic problem, improving on a bound that had been standing since 1989;
    * for static data structures, the first separation between linear and polynomial space. Specifically, for some problems that have constant query time when polynomial space is allowed, we can show Ω(lg n / lg lg n) bounds when the space is O(n · polylog n).

    Using these techniques, we analyze a variety of central data-structure problems, and obtain improved lower bounds for the following:
    * the partial-sums problem (a fundamental application of augmented binary search trees);
    * the predecessor problem (which is equivalent to IP lookup in Internet routers);
    * dynamic trees and dynamic connectivity;
    * orthogonal range stabbing;
    * orthogonal range counting, and orthogonal range reporting;
    * the partial match problem (searching with wild-cards);
    * (1 + ε)-approximate near neighbor on the hypercube;
    * approximate nearest neighbor in the ℓ∞ metric.

    Our new techniques lead to surprisingly non-technical proofs. For several problems, we obtain simpler proofs for bounds that were already known.

    By Mihai Pǎtraşcu.
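
    For context on the partial-sums problem mentioned above, the classical O(lg n) upper bound is given by a Fenwick (binary indexed) tree, sketched below; the thesis's Ω(lg n) dynamic lower bound shows this logarithmic cost is optimal. This is the standard textbook structure, not code from the thesis.

```python
# Classical O(lg n)-per-operation solution to the partial-sums problem: a
# Fenwick (binary indexed) tree. The thesis's Omega(lg n) dynamic lower bound
# shows this logarithmic cost cannot be beaten.

class Fenwick:
    def __init__(self, n):
        self.tree = [0] * (n + 1)   # 1-indexed implicit tree

    def update(self, i, delta):     # a[i] += delta
        while i < len(self.tree):
            self.tree[i] += delta
            i += i & -i             # jump to the next covering node

    def prefix_sum(self, i):        # sum of a[1..i]
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i             # strip the lowest set bit
        return total

f = Fenwick(8)
f.update(3, 5); f.update(7, 2)
print(f.prefix_sum(7))  # 7
```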