9 research outputs found

    Dynamic Ordered Sets with Approximate Queries, Approximate Heaps and Soft Heaps

    Get PDF
    We consider word RAM data structures for maintaining ordered sets of integers whose select and rank operations are allowed to return approximate results, i.e., ranks, or items whose rank, differ by less than Delta from the exact answer, where Delta=Delta(n) is an error parameter. Related to approximate select and rank is approximate (one-dimensional) nearest-neighbor. A special case of approximate select queries are approximate min queries. Data structures that support approximate min operations are known as approximate heaps (priority queues). Related to approximate heaps are soft heaps, which are approximate heaps with a different notion of approximation. We prove the optimality of all the data structures presented, either through matching cell-probe lower bounds, or through equivalences to well studied static problems. For approximate select, rank, and nearest-neighbor operations we get matching cell-probe lower bounds. We prove an equivalence between approximate min operations, i.e., approximate heaps, and the static partitioning problem. Finally, we prove an equivalence between soft heaps and the classical sorting problem, on a smaller number of items. Our results have many interesting and unexpected consequences. It turns out that approximation greatly speeds up some of these operations, while others are almost unaffected. In particular, while select and rank have identical operation times, both in comparison-based and word RAM implementations, an interesting separation emerges between the approximate versions of these operations in the word RAM model. Approximate select is much faster than approximate rank. It also turns out that approximate min is exponentially faster than the more general approximate select. Next, we show that implementing soft heaps is harder than implementing approximate heaps. The relation between them corresponds to the relation between sorting and partitioning. Finally, as an interesting byproduct, we observe that a combination of known techniques yields a deterministic word RAM algorithm for (exactly) sorting n items in O(n log log_w n) time, where w is the word length. Even for the easier problem of finding duplicates, the best previous deterministic bound was O(min{n log log n,n log_w n}). Our new unifying bound is an improvement when w is sufficiently large compared with n

    Multiparty Selection

    Get PDF
    Given a sequence A of n numbers and an integer (target) parameter 1 ? i ? n, the (exact) selection problem is that of finding the i-th smallest element in A. An element is said to be (i,j)-mediocre if it is neither among the top i nor among the bottom j elements of S. The approximate selection problem is that of finding an (i,j)-mediocre element for some given i,j; as such, this variant allows the algorithm to return any element in a prescribed range. In the first part, we revisit the selection problem in the two-party model introduced by Andrew Yao (1979) and then extend our study of exact selection to the multiparty model. In the second part, we deduce some communication complexity benefits that arise in approximate selection. In particular, we present a deterministic protocol for finding an approximate median among k players

    Lazy Search Trees

    Get PDF

    Algorithmic and Combinatorial Results in Selection and Computational Geometry

    Get PDF
    This dissertation investigates two sets of algorithmic and combinatorial problems. Thefirst part focuses on the selection problem under the pairwise comparison model. For the classic “median of medians” scheme, contrary to the popular belief that smaller group sizes cause superlinear behavior, several new linear time algorithms that utilize small groups are introduced. Then the exact number of comparisons needed for an optimal selection algorithm is studied. In particular, the implications of a long standing conjecture known as Yao’s hypothesis are explored. For the multiparty model, we designed low communication complexity protocols for selecting an exact or an approximate median of data that is distributed among multiple players. In the second part, three computational geometry problems are studied. For the longestspanning tree with neighborhoods, approximation algorithms are provided. For the stretch factor of polygonal chains, upper bounds are proved and almost matching lower bound constructions in \mathbb{R}^2 and higher dimensions are developed. For the piercing number τ and independence number ν of a family of axis-parallel rectangles in the plane, a lower bound construction for ν = 4 that matches Wegner’s conjecture is analyzed. The previous matching construction for ν = 3, due to Wegner himself, dates back to 1968

    Efficient Data Structures for Partial Orders, Range Modes, and Graph Cuts

    Get PDF
    This thesis considers the study of data structures from the perspective of the theoretician, with a focus on simplicity and practicality. We consider both the time complexity as well as space usage of proposed solutions. Topics discussed fall in three main categories: partial order representation, range modes, and graph cuts. We consider two problems in partial order representation. The first is a data structure to represent a lattice. A lattice is a partial order where the set of elements larger than any two elements x and y are all larger than an element z, known as the join of x and y; a similar condition holds for elements smaller than any two elements. Our data structure is the first correct solution that can simultaneously compute joins and the inverse meet operation in sublinear time while also using subquadratic space. The second is a data structure to support queries on a dynamic set of one-dimensional ordered data; that is, essentially any operation computable on a binary search tree. We develop a data structure that is able to interpolate between binary search trees and efficient priority queues, offering more-efficient insertion times than the former when query distribution is non-uniform. We also consider static and dynamic exact and approximate range mode. Given one-dimensional data, the range mode problem is to compute the mode of a subinterval of the data. In the dynamic range mode problem, insertions and deletions are permitted. For the approximate problem, the element returned is to have frequency no less than a factor (1+epsilon) of the true mode, for some epsilon > 0. Our results include a linear-space dynamic exact range mode data structure that simultaneously improves on best previous operation complexity and an exact dynamic range mode data structure that breaks the Theta(n^(2/3)) time per operation barrier. For approximate range mode, we develop a static succinct data structure offering a logarithmic-factor space improvement and give the first dynamic approximate range mode data structure. We also consider approximate range selection. The final category discussed is graph and dynamic graph algorithms. We develop an optimal offline data structure for dynamic 2- and 3- edge and vertex connectivity. Here, the data structure is given the entire sequence of operations in advance, and the dynamic operations are edge insertion and removal. Finally, we give a simplification of Karger's near-linear time minimum cut algorithm, utilizing heavy-light decomposition and iteration in place of dynamic programming in the subroutine to find a minimum cut of a graph G that cuts at most two edges of a spanning tree T of G

    A Selectable Sloppy Heap

    No full text
    We study the selection problem, namely that of computing the ith order statistic of n given elements. Here we offer a data structure called selectable sloppy heap that handles a dynamic version in which upon request (i) a new element is inserted or (ii) an element of a prescribed quantile group is deleted from the data structure. Each operation is executed in constant time—and is thus independent of n (the number of elements stored in the data structure)—provided that the number of quantile groups is fixed. This is the first result of this kind accommodating both insertion and deletion in constant time. As such, our data structure outperforms the soft heap data structure of Chazelle (which only offers constant amortized complexity for a fixed error rate 0 < ε ≤ 1 / 2 ) in applications such as dynamic percentile maintenance. The design demonstrates how slowing down a certain computation can speed up the data structure. The method described here is likely to have further impact in the field of data structure design in extending asymptotic amortized upper bounds to same formula asymptotic worst-case bounds
    corecore